bench(native): add parallel sign-rq2 row matching OrdinalDB serving #238

## Summary

Add a native BEIR benchmark row for parallel `sign-rq2` orchestration that mirrors the OrdinalDB serving path.

## Why

PR #237's native benchmark harness currently gives HNSW a threaded path through `hnsw.parallel_search`, while the `sign-rq2` row uses serial sign candidate generation plus serial subset rerank inside the batch loop. OrdinalDB's serving path now drives the ordvec 0.5 caller-owned APIs from rayon workers:

- split the query batch into chunks
- each worker calls `SignBitmap::top_m_candidates_batched_serial_csr`
- each worker reuses its own `SubsetScratch`
- each worker calls `RankQuant::search_asymmetric_subset_batched_serial_into`
- workers write directly into their output score/index chunks

Without this row, native benchmark comparisons answer "serial sign-rq2 vs threaded HNSW" rather than "ordvec two-stage vs HNSW under equivalent caller-owned threading."

## Proposed benchmark row

Add a row such as `ordvec-sign-rq2-par` or `sign-rq2-par` to `benchmarks/beir-bench`:

1. Use the same corpus, query batch, `k`, candidate count, and thread pool size as the existing threaded comparison.
2. Build `SignBitmap` + `RankQuant` exactly like the current `sign-rq2` row.
3. In the timed search closure, split query chunks across the configured rayon pool.
4. Per worker, allocate/reuse one `SubsetScratch` and call the serial CSR sign probe plus serial batched subset rerank into the assigned output slice.
5. Report the same recall/latency/build/storage columns as the existing rows.
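
The orchestration in steps 3 and 4 could be sketched roughly as below. This is a minimal illustration of the threading shape only, not the ordvec API: `Scratch` and `search_chunk_serial` are hypothetical stand-ins for `SubsetScratch` and the serial CSR sign probe plus serial subset rerank, the "rerank" is a brute-force dot product, and `std::thread::scope` is swapped in for the rayon pool so the snippet needs no external crates.

```rust
use std::thread;

/// Hypothetical per-worker scratch; a stand-in for ordvec's `SubsetScratch`.
struct Scratch {
    candidates: Vec<u32>,
}

/// Hypothetical stand-in for "serial sign probe + serial subset rerank".
/// Writes `k` scores/ids per query into the caller-provided output slices.
fn search_chunk_serial(
    corpus: &[Vec<f32>],
    queries: &[Vec<f32>],
    k: usize,
    scratch: &mut Scratch,
    scores: &mut [f32],
    ids: &mut [u32],
) {
    for (qi, q) in queries.iter().enumerate() {
        // Stage 1 (placeholder): every row is a candidate; the buffer is reused.
        scratch.candidates.clear();
        scratch.candidates.extend(0..corpus.len() as u32);
        // Stage 2 (placeholder): exact dot-product rerank, best first.
        let mut scored: Vec<(f32, u32)> = scratch
            .candidates
            .iter()
            .map(|&id| {
                let dot: f32 = corpus[id as usize].iter().zip(q).map(|(a, b)| a * b).sum();
                (dot, id)
            })
            .collect();
        scored.sort_by(|a, b| b.0.total_cmp(&a.0));
        for (slot, (score, id)) in scored.into_iter().take(k).enumerate() {
            scores[qi * k + slot] = score;
            ids[qi * k + slot] = id;
        }
    }
}

/// Splits the query batch into contiguous chunks, one per worker, and lets
/// each worker write directly into its own slice of the output buffers.
fn search_parallel(
    corpus: &[Vec<f32>],
    queries: &[Vec<f32>],
    k: usize,
    threads: usize,
) -> (Vec<f32>, Vec<u32>) {
    assert!(k >= 1 && threads >= 1);
    let mut scores = vec![f32::NEG_INFINITY; queries.len() * k];
    let mut ids = vec![u32::MAX; queries.len() * k];
    // Ceiling division; at least 1 so `chunks` never sees a zero size.
    let chunk = ((queries.len() + threads - 1) / threads).max(1);

    thread::scope(|s| {
        let work = queries
            .chunks(chunk)
            .zip(scores.chunks_mut(chunk * k))
            .zip(ids.chunks_mut(chunk * k));
        for ((q_chunk, score_chunk), id_chunk) in work {
            s.spawn(move || {
                // One scratch per worker, reused across all queries in the chunk.
                let mut scratch = Scratch { candidates: Vec::new() };
                search_chunk_serial(corpus, q_chunk, k, &mut scratch, score_chunk, id_chunk);
            });
        }
    });
    (scores, ids)
}
```

Because each worker owns a disjoint output slice, no locking or post-merge step is needed, and running with `threads = 1` should produce the same output as any other thread count. In the real row, the timed closure would wrap only the equivalent of `search_parallel`, with index construction kept outside it.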

## Acceptance criteria

- The new row runs with the same thread count and batch size as the HNSW row, so the two are directly comparable.
- The existing serial `sign-rq2` row remains, so single-thread/serial behavior is still visible.
- The benchmark summary makes clear which row is serial and which row is caller-parallel.
- The row uses public ordvec APIs only; no OrdinalDB dependency in the ordvec benchmark.

## Related

- OrdinalDB 0.5 worktree now exposes `DenseSearchPlan`/`DenseSearchTimings` and uses caller-owned parallel sign→rerank orchestration.
- #234 tracks allocation-reusing sign probe APIs.
- #235 tracks subset-rerank LUT reuse through `SubsetScratch`.
- #236 tracks tiled batched sign probing without a full score matrix.
- #128 tracks native ordvec per-call execution reports.

