
Add JNI layer and core library improvements for OpenSearch integration #13

Open
model-collapse wants to merge 1 commit into opensearch-project:main from model-collapse:feature/jni-opensearch-integration

@model-collapse

Collaborator

Summary

  • Adds JNI bindings (jni/) for integrating the native sparse vector library with OpenSearch's neural-search plugin
  • Core library improvements for production-scale operation at 138M+ documents across multiple shards

Changes

JNI Layer (jni/nsparse_jni.cpp)

  • Index lifecycle: createIndex, addVectors, addVectorsWithIds, buildIndex, buildAndSaveIndex, saveIndex, loadIndex, deleteIndex
  • Search: search (SEISMIC) and searchSQ (scalar-quantized SEISMIC)
  • Memory management: release_build_memory() followed by malloc_trim() returns freed build-time memory to the OS, lowering RSS after the build
  • Page cache eviction: evictPageCache using madvise(MADV_DONTNEED) via /proc/self/maps scanning with dynamic buffer (getline) and error diagnostics
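The eviction approach described in the last bullet can be sketched as follows. This is a minimal illustration, not the code in jni/nsparse_jni.cpp: the `MapRegion` type, `parse_maps_line`, and `evict_mappings_with_suffix` are hypothetical names, and matching mappings by path suffix is an assumption about how the index file is identified. It is Linux-only. Note that `madvise(MADV_DONTNEED)` releases the calling process's resident pages for the range; it should only be applied to read-only index mappings, since private writable mappings would lose their modifications.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>
#include <sys/mman.h>
#include <sys/types.h>

// One parsed mapping from /proc/self/maps (hypothetical helper type).
struct MapRegion {
    std::uintptr_t start = 0;
    std::uintptr_t end = 0;
    std::string path;  // empty for anonymous mappings
};

// Parses a line such as
//   "7f00a000-7f00c000 r--s 00000000 08:01 123  /data/shard0.nsparse\n"
// Returns false if the line is malformed.
bool parse_maps_line(const char* line, MapRegion& out) {
    unsigned long long start = 0, end = 0;
    int consumed = 0;
    // Fields: address range, perms, offset, dev, inode. %n records where
    // the optional path begins; it stays 0 if the line is truncated.
    if (std::sscanf(line, "%llx-%llx %*s %*s %*s %*s%n",
                    &start, &end, &consumed) != 2 || consumed <= 0) {
        return false;
    }
    if (end < start) return false;
    const char* p = line + consumed;
    while (*p == ' ' || *p == '\t') ++p;
    out.start = static_cast<std::uintptr_t>(start);
    out.end = static_cast<std::uintptr_t>(end);
    out.path.assign(p, std::strcspn(p, "\n"));
    return true;
}

// Calls madvise(MADV_DONTNEED) on every mapping whose path ends with `suffix`.
// Returns the number of regions advised, or -1 if /proc/self/maps is unreadable.
int evict_mappings_with_suffix(const std::string& suffix) {
    std::FILE* maps = std::fopen("/proc/self/maps", "r");
    if (!maps) {
        std::perror("fopen /proc/self/maps");
        return -1;
    }
    char* line = nullptr;  // getline allocates and grows this buffer as needed
    std::size_t cap = 0;
    int advised = 0;
    while (getline(&line, &cap, maps) != -1) {
        MapRegion r;
        if (!parse_maps_line(line, r)) continue;
        if (suffix.empty() || r.path.size() < suffix.size()) continue;
        if (r.path.compare(r.path.size() - suffix.size(),
                           suffix.size(), suffix) != 0) continue;
        assert(r.end >= r.start);
        if (madvise(reinterpret_cast<void*>(r.start),
                    r.end - r.start, MADV_DONTNEED) == 0) {
            ++advised;
        } else {
            std::perror("madvise");  // error diagnostics, then keep scanning
        }
    }
    std::free(line);
    std::fclose(maps);
    return advised;
}
```

Using `getline` avoids a fixed-size line buffer, which matters because paths in /proc/self/maps can be arbitrarily long. A caller might invoke `evict_mappings_with_suffix(".nsparse")` after a shard is closed or between benchmark runs.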

Core Library

  • int64 offsets (offset_t): handles >2B non-zeros in CSR indptr arrays (required at 45M+ docs per shard)
  • Streaming build_and_save: writes clusters batch-by-batch to reduce peak RSS from ~104 GB to ~50 GB
  • Scalar quantizer: configurable quantization_ceiling parameter for ingest and search paths
  • IDMapIndex: add_with_ids support for external document ID mapping
  • K-means clustering: OpenMP parallelization (32-thread builds take 12 min vs 50+ hours single-threaded)
  • Distance functions: unified SIMD dispatch across AVX2/AVX512/NEON/SVE with prefetch hints
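To make the int64 offsets item concrete, here is a small sketch of why a 32-bit CSR indptr stops working at this scale. The `offset_t` name comes from the description above; its definition as `std::int64_t`, the helper functions, and the figure of 48 non-zeros per document are illustrative assumptions, not values from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Name taken from the PR text; this typedef is an assumption.
using offset_t = std::int64_t;

// Total non-zeros for `num_docs` documents averaging `avg_nnz` terms each.
constexpr offset_t total_nnz(offset_t num_docs, offset_t avg_nnz) {
    return num_docs * avg_nnz;
}

// True when a CSR indptr whose last entry is `nnz` no longer fits in int32.
constexpr bool overflows_int32(offset_t nnz) {
    return nnz > std::numeric_limits<std::int32_t>::max();
}

// Builds indptr from per-row non-zero counts, accumulating in 64 bits so the
// running sum cannot wrap even when individual counts fit in 32 bits.
std::vector<offset_t> build_indptr(const std::vector<std::uint32_t>& row_nnz) {
    std::vector<offset_t> indptr(row_nnz.size() + 1, 0);
    for (std::size_t i = 0; i < row_nnz.size(); ++i) {
        assert(indptr[i] >= 0);
        indptr[i + 1] = indptr[i] + static_cast<offset_t>(row_nnz[i]);
    }
    return indptr;
}
```

With the assumed average of 48 non-zeros per document, 45,000,000 documents give 2,160,000,000 entries, which exceeds the int32 maximum of 2,147,483,647. The last indptr entry equals the total non-zero count, so it is the first value to overflow.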

Tests

  • Extended IDMapIndex tests for add_with_ids and save/load round-trip
  • Scalar quantizer boundary condition tests
  • SEISMIC common component tests
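The kind of boundary conditions a scalar quantizer test would exercise can be illustrated with a stand-in quantizer. This is a sketch under stated assumptions, not the library's implementation: `quantize_u8` and `dequantize_u8` are hypothetical functions, and the linear mapping of [0, ceiling] onto [0, 255] with saturation is an assumed interpretation of the quantization_ceiling parameter.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical uint8 scalar quantizer: maps [0, ceiling] linearly onto
// [0, 255]. Values above the ceiling saturate at 255; negatives, zero,
// and NaN map to 0.
std::uint8_t quantize_u8(float value, float ceiling) {
    assert(ceiling > 0.0f);
    if (!(value > 0.0f)) return 0;  // false for negatives, zero, and NaN
    const float clamped = std::min(value, ceiling);
    return static_cast<std::uint8_t>(std::lround(clamped / ceiling * 255.0f));
}

// Inverse mapping for this sketch: recovers an approximate weight from a code.
float dequantize_u8(std::uint8_t code, float ceiling) {
    return static_cast<float>(code) / 255.0f * ceiling;
}
```

The cases worth checking are the two ends of the range (0 and the ceiling), values beyond either end, non-finite inputs, and the round-trip error, which for this mapping is at most half a quantization step (ceiling / 510).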

Benchmark Results (138M docs, 3 nodes × 46M docs/shard, uint8 SQ)

Force Merge (Index Build)

| Metric | Value |
|---|---|
| Build time | 12.7 min per shard (32 threads) |
| Peak RSS | 59-61 GB (on 128 GB r6i.4xlarge nodes) |
| Settled RSS | 17 GB (JVM only, native index mmap'd on demand) |
| .nsparse file size | ~33 GB per shard |

Search Latency (3903 queries, k=10, top_n=3)

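| heap_factor | p50 (ms) | p95 (ms) | p99 (ms) | QPS |
|---|---|---|---|---|
| 1.03 | 11 | 15 | 18 | 64 |
| 1.08 | 3 | 6 | 8 | 136 |
| 1.15 | 3 | 6 | 8 | 136 |
| 1.25 | 3 | 6 | 8 | 133 |
| 2.00 | 8 | 15 | 18 | 80 |
| 4.00 | 22 | 28 | 29 | 39 |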

Test plan

  • cmake -S . -B build -DNSPARSE_ENABLE_TESTS=ON && cmake --build build -j && ctest --test-dir build
  • JNI build: cmake -S . -B build -DNSPARSE_ENABLE_JNI=ON && cmake --build build --target nsparse_jni
  • Integration validated at 138M docs across 3-node OpenSearch cluster

🤖 Generated with Claude Code

Adds JNI bindings for the native sparse vector library to integrate with
OpenSearch's neural-search plugin. Key capabilities:
- Index lifecycle: create, add vectors (with IDs), build, save, load, delete
- Search: dense-scoring SEISMIC and SEISMIC-SQ (scalar quantized) search
- Memory management: streaming build_and_save, release_build_memory, malloc_trim
- Page cache eviction: madvise(MADV_DONTNEED) via /proc/self/maps scanning

Core library changes for production-scale operation (138M+ documents):
- int64 offsets (offset_t) to handle >2B non-zeros in CSR indptr
- Streaming build_and_save: writes clusters batch-by-batch to reduce peak RSS
- Scalar quantizer improvements: configurable quantization ceilings
- IDMapIndex: add_with_ids support for external document ID mapping
- K-means clustering: OpenMP parallelization for 32-thread builds
- Distance functions: unified SIMD dispatch across AVX2/AVX512/NEON/SVE

Validated at 138M docs (3 shards × 46M), uint8 SQ SEISMIC:
- Build time: 12.7 min per shard (32 threads)
- Peak RSS: 59-61 GB during force merge (on 128 GB nodes)
- Search: 3ms p50 at heap_factor=1.08, top_n=3

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>