Summary
Update the native BEIR benchmark summary to report HNSW graph/index memory separately from stored vector bytes.
Why
The current native benchmark table makes HNSW look like it costs only the full float vector payload (dim * 4 bytes/vector). That is useful as a lower-bound stored-vector column, but it underreports the actual serving footprint because HNSW also owns graph/link/search-structure memory.
This matters for the launch comparison: OrdVec/OrdinalDB two-stage paths trade a full scan over compact encodings for much lower build/storage complexity, while HNSW trades graph build/storage for sublinear traversal. The benchmark should make that trade visible rather than comparing latency with an incomplete memory column.
Proposed change
In benchmarks/beir-bench:
- Keep the existing vector-payload bytes/vector column if useful, but rename it clearly, e.g. stored_vector_bytes_per_vec.
- Add HNSW index/graph memory accounting if hnsw_rs exposes it directly.
- If direct accounting is unavailable, add a documented estimate based on configured M, level/link storage, ids, and any known per-node metadata.
- Emit both values in JSON and summary tables.
- Mark estimated fields explicitly so they are not confused with measured RSS.
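If direct accounting turns out to be unavailable, the fallback estimate could be sketched roughly as below. This is an illustration under stated assumptions, not hnsw_rs internals: it uses the common hnswlib-style capacity model (up to 2*M link slots on layer 0, up to M on each upper layer, with an expected M/(M-1) upper-layer slots per node under the standard level distribution). The names graph_bytes_per_vec, graph_bytes_source, link_bytes, and per_node_overhead_bytes are hypothetical; only stored_vector_bytes_per_vec comes from this proposal.

```rust
/// How a memory figure was obtained. Emitted next to the value so an
/// estimate is never mistaken for measured RSS.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MemSource {
    Measured,
    Estimated,
}

impl MemSource {
    pub fn as_str(self) -> &'static str {
        match self {
            MemSource::Measured => "measured",
            MemSource::Estimated => "estimated",
        }
    }
}

/// Raw f32 payload per vector: dim * 4 bytes.
pub fn stored_vector_bytes_per_vec(dim: usize) -> usize {
    dim * std::mem::size_of::<f32>()
}

/// Estimated graph bytes per vector, assuming full neighbor lists.
/// ASSUMPTION: hnswlib-style layout, not verified against hnsw_rs.
/// - layer 0 holds up to 2*M links per node
/// - P(node reaches layer l) = M^-l, so expected upper-layer link
///   slots per node are M * (1/(M-1)) = M/(M-1)
/// `link_bytes` is the size of one stored neighbor id;
/// `per_node_overhead_bytes` covers ids, level tags, and similar metadata.
pub fn estimated_graph_bytes_per_vec(
    m: usize,
    link_bytes: usize,
    per_node_overhead_bytes: usize,
) -> f64 {
    assert!(m >= 2, "M must be at least 2 for the level distribution");
    let m_f = m as f64;
    let layer0_slots = 2.0 * m_f;
    let upper_slots = m_f / (m_f - 1.0);
    (layer0_slots + upper_slots) * link_bytes as f64 + per_node_overhead_bytes as f64
}

/// The memory columns for one benchmark row (hypothetical shape).
pub struct MemoryColumns {
    pub stored_vector_bytes_per_vec: usize,
    pub graph_bytes_per_vec: f64,
    pub graph_bytes_source: MemSource,
}

impl MemoryColumns {
    /// Hand-rolled JSON so the sketch has no crate dependencies.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"stored_vector_bytes_per_vec\":{},\"graph_bytes_per_vec\":{:.1},\"graph_bytes_source\":\"{}\"}}",
            self.stored_vector_bytes_per_vec,
            self.graph_bytes_per_vec,
            self.graph_bytes_source.as_str()
        )
    }
}
```

As an illustrative case (the dimension and sizes are made up, not benchmark results): with dim = 384, M = 16, 4-byte neighbor ids, and 8 bytes of per-node metadata, the payload is 1536 bytes/vector and the estimated graph overhead is about 140 bytes/vector. If hnsw_rs stores neighbors as pointers or richer structs rather than compact ids, link_bytes should be raised accordingly, which is one more reason to label the field as estimated.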
Acceptance criteria
- HNSW rows no longer imply that graph overhead is zero.
- OrdVec rows and HNSW rows have clearly comparable storage columns.
- Summary text explains measured vs estimated memory if the graph memory value is not directly measured.
- No changes to retrieval behavior are required.
Related
sign-rq2 native benchmark row so threaded HNSW and threaded ordvec two-stage are compared under equivalent orchestration.