bench(native): report HNSW graph memory separately from stored vector bytes #239

@Fieldnote-Echo

Summary

Update the native BEIR benchmark summary to report HNSW graph/index memory separately from stored vector bytes.

Why

The current native benchmark table makes HNSW look like it costs only the full float vector payload (dim * 4 bytes/vector). That is useful as a lower-bound stored-vector column, but it underreports the actual serving footprint because HNSW also owns graph/link/search-structure memory.

This matters for the launch comparison: OrdVec/OrdinalDB two-stage paths trade a full scan over compact encodings for much lower build/storage complexity, while HNSW trades graph build/storage for sublinear traversal. The benchmark should make that trade visible rather than comparing latency with an incomplete memory column.

Proposed change

In benchmarks/beir-bench:

  1. Keep the existing vector-payload bytes/vector column if useful, but rename it clearly, e.g. stored_vector_bytes_per_vec.
  2. Add HNSW index/graph memory accounting if hnsw_rs exposes it directly.
  3. If direct accounting is unavailable, add a documented estimate based on configured M, level/link storage, ids, and any known per-node metadata.
  4. Emit both values in JSON and summary tables.
  5. Mark estimated fields explicitly so they are not confused with measured RSS.
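A minimal sketch of what the estimate in step 3 could look like, in case direct accounting turns out to be unavailable. It assumes the common HNSW convention of up to 2M links per node on layer 0 and up to M links on each upper layer, with a level multiplier of 1/ln(M), so the expected number of upper layers per node is about 1/(M - 1). The struct, the function names, `bytes_per_link`, and `per_node_overhead_bytes` are hypothetical; the real values depend on how hnsw_rs lays out neighbor lists and should be checked against its source before any number is published. The result is a capacity-based upper bound, not a measurement.

```rust
/// Hypothetical inputs for a back-of-envelope HNSW graph memory estimate.
/// None of these names come from hnsw_rs; they are illustrative only.
#[derive(Debug, Clone, Copy)]
pub struct HnswEstimateParams {
    /// Configured M (max links per node on upper layers).
    pub m: usize,
    /// Assumed bytes per stored link (id, pointer, cached distance, ...).
    pub bytes_per_link: usize,
    /// Assumed fixed per-node metadata (external id, level, list headers).
    pub per_node_overhead_bytes: usize,
}

/// Full float payload per vector: dim * 4 bytes, as in the current table.
pub fn stored_vector_bytes_per_vec(dim: usize) -> usize {
    dim * std::mem::size_of::<f32>()
}

/// Estimated graph bytes per vector, assuming every link list is full.
pub fn estimated_graph_bytes_per_vec(p: HnswEstimateParams) -> f64 {
    assert!(p.m >= 2, "M must be at least 2 for this estimate");
    let m = p.m as f64;
    // Layer 0: assumed capacity of 2 * M links per node.
    let layer0_links = 2.0 * m;
    // P(node reaches layer l) = M^(-l), so the expected number of
    // upper layers per node is sum_{l >= 1} M^(-l) = 1 / (M - 1).
    let expected_upper_layers = 1.0 / (m - 1.0);
    let upper_links = m * expected_upper_layers;
    (layer0_links + upper_links) * p.bytes_per_link as f64
        + p.per_node_overhead_bytes as f64
}
```

Keeping the payload and graph terms as separate functions mirrors the proposed columns: the first stays a measured lower bound, while the second is clearly a model output that can be swapped for direct accounting later.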

Acceptance criteria

  • HNSW rows no longer imply that graph overhead is zero.
  • OrdVec rows and HNSW rows have clearly comparable storage columns.
  • Summary text explains measured vs estimated memory if the graph memory value is not directly measured.
  • No changes to retrieval behavior are required.
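One way the criteria above could show up in the output is sketched here, using only the standard library. Apart from `stored_vector_bytes_per_vec`, which is the column name suggested in the proposal, every field and method name is a hypothetical placeholder, and the benchmark's actual JSON schema and table renderer may look quite different.

```rust
/// Hypothetical per-row memory columns for the benchmark summary.
pub struct MemoryColumns {
    /// Measured payload: dim * 4 bytes for full float vectors.
    pub stored_vector_bytes_per_vec: f64,
    /// Graph/index bytes per vector; None when the row has no graph.
    pub graph_bytes_per_vec: Option<f64>,
    /// True when the graph value is modeled rather than measured.
    pub graph_bytes_estimated: bool,
}

impl MemoryColumns {
    /// Render the columns as a flat JSON object (hand-rolled to stay
    /// dependency-free in this sketch).
    pub fn to_json(&self) -> String {
        let graph = match self.graph_bytes_per_vec {
            Some(b) => format!("{:.1}", b),
            None => "null".to_string(),
        };
        format!(
            "{{\"stored_vector_bytes_per_vec\":{:.1},\"graph_bytes_per_vec\":{},\"graph_bytes_estimated\":{}}}",
            self.stored_vector_bytes_per_vec, graph, self.graph_bytes_estimated
        )
    }

    /// Render the graph column for a summary table, flagging estimates
    /// so they are not read as measured RSS.
    pub fn summary_cell(&self) -> String {
        match self.graph_bytes_per_vec {
            Some(b) if self.graph_bytes_estimated => format!("~{:.0} (est.)", b),
            Some(b) => format!("{:.0}", b),
            None => "n/a".to_string(),
        }
    }
}
```

Carrying the estimated flag alongside the value, rather than encoding it only in the table text, lets JSON consumers filter or annotate estimated rows without parsing strings.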

Labels: perf (Performance-relevant: scan/SIMD/alloc/memory/parallelism), testing (Testing / CI / fuzz / bench)
