[codex] Pin RankQuant subset duplicate contract by Fieldnote-Echo · Pull Request #167 · Project-Navi/ordvec

Fieldnote-Echo · 2026-06-04T01:43:25Z

Summary

assert full score-descending/global-row-id tie ordering in RankQuant fuzz targets
document duplicate subset-candidate behavior at the Rust, Python, C ABI, and Go wrapper call sites
add duplicate-candidate regression tests for Python, C ABI, and Go RankQuant subset search
enforce RankQuant asymmetric public ordering after SIMD center-offset reapplication so exposed score ties are sorted by row ID

Review loop

Scouted fuzz ordering gaps and public duplicate-candidate contract surfaces before implementation.
Ran adversarial review after implementation.
Remediated review findings by splitting score monotonicity from equal-score tie assertions and adding Rust-side ABI field docs to back the generated C header comment.
Remediated CI/Gemini findings by tightening full-index fuzz ties to strict row-ID increase, regenerating the C header with CI-pinned cbindgen 0.29.2, and fixing the exposed-score ordering hole found by the two-stage fuzz target.

Validation

cargo test -p ordvec --test determinism_contract
cargo check --manifest-path fuzz/Cargo.toml --all-targets
ASAN_OPTIONS=detect_leaks=0 LSAN_OPTIONS=detect_leaks=0 cargo +nightly-2025-08-15 fuzz run signbitmap_rankquant_twostage -- -max_total_time=15 -rss_limit_mb=4096
/tmp/ordvec-cbindgen-0.29.2/bin/cbindgen ordvec-ffi --config ordvec-ffi/cbindgen.toml --output ordvec-ffi/include/ordvec.h --verify
cargo test -p ordvec-ffi full_and_subset_search_rankquant
cargo test -p ordvec-ffi
cargo build -p ordvec-ffi --release
GOCACHE=/tmp/ordvec-gocache GOMODCACHE=/tmp/ordvec-gomodcache go test -count=1 ./...
VIRTUAL_ENV=/home/ndspence/GitHub/ordvec/ordvec-python/.venv PATH=/home/ndspence/GitHub/ordvec/ordvec-python/.venv/bin:$PATH maturin develop -m ordvec-python/Cargo.toml
VIRTUAL_ENV=/home/ndspence/GitHub/ordvec/ordvec-python/.venv PATH=/home/ndspence/GitHub/ordvec/ordvec-python/.venv/bin:$PATH python -m pytest ordvec-python/tests/test_rank_quant.py
cargo clippy -p ordvec --all-targets -- -D warnings
cargo clippy -p ordvec-ffi --all-targets -- -D warnings
cargo fmt --all --check
git diff --check

gemini-code-assist

Code Review

This pull request documents and tests the behavior of subset candidate searches when duplicate or unsorted global row IDs are provided, ensuring they are scored independently and can produce duplicate hits. The feedback suggests tightening an assertion in the full index search fuzz target to require strictly ascending document IDs on equal scores, since full index search document IDs are unique.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

codecov · 2026-06-04T01:44:50Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

qodo-code-review · 2026-06-04T02:30:13Z

Review Summary by Qodo

Pin RankQuant subset duplicate contract with deterministic tie ordering

🐞 Bug fix ✨ Enhancement

Walkthroughs

Description

• Enforce deterministic RankQuant subset search ordering by score descending, then row ID ascending
• Add center-offset reapplication sort to handle score ties after SIMD offset adjustment
• Document duplicate candidate behavior across Rust, Python, C ABI, and Go wrapper interfaces
• Add regression tests for duplicate candidate handling in Python, C ABI, and Go bindings

Diagram

flowchart LR
  A["RankQuant subset search"] --> B["Score candidates"]
  B --> C["Apply center offset"]
  C --> D["Re-sort by score desc, row ID asc"]
  D --> E["Return ordered results"]
  F["Duplicate candidates"] --> A
  F --> G["Each scored independently"]
  G --> E

File Changes

1. src/quant.rs 🐞 Bug fix +33/-1

Add deterministic tie-breaking sort after center offset

src/quant.rs

2. fuzz/fuzz_targets/search_rankquant.rs 🧪 Tests +6/-1

Enhance assertions to verify score-desc and row-ID-asc ordering

fuzz/fuzz_targets/search_rankquant.rs

3. fuzz/fuzz_targets/signbitmap_rankquant_twostage.rs 🧪 Tests +19/-1

Add dedicated ordering validation function for full contract

fuzz/fuzz_targets/signbitmap_rankquant_twostage.rs

View more (7)

4. ordvec-ffi/src/lib.rs 📝 Documentation +32/-0

Document duplicate candidate behavior and add C ABI test

ordvec-ffi/src/lib.rs

5. ordvec-ffi/include/ordvec.h 📝 Documentation +4/-0

Document duplicate candidate semantics in C header

ordvec-ffi/include/ordvec.h

6. ordvec-python/src/lib.rs 📝 Documentation +5/-0

Document duplicate candidate handling in Python docstring

ordvec-python/src/lib.rs

7. ordvec-python/tests/test_rank_quant.py 🧪 Tests +14/-0

Add Python regression test for duplicate candidates

ordvec-python/tests/test_rank_quant.py

8. ordvec-go/ordvec.go 📝 Documentation +3/-0

Document duplicate candidate semantics in Go wrapper

ordvec-go/ordvec.go

9. ordvec-go/ordvec_test.go 🧪 Tests +22/-0

Add Go regression test for duplicate candidate hits

ordvec-go/ordvec_test.go

10. ordvec-go/README.md 📝 Documentation +5/-0

Document duplicate candidate behavior in Go README

ordvec-go/README.md

qodo-code-review · 2026-06-04T02:30:13Z

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX issues (0) 🔗 Cross-repo conflicts (0)

1. ~~Extra post-offset sort~~ ✓ Resolved 🐞 Bug ➹ Performance

Description

When centre_drop_used is true, RankQuant::search_asymmetric and search_asymmetric_subset now
allocate and sort an additional time via sort_by_exposed_score_then_row_id, even though
TopK::finalize_into already allocated+sorted the top-k slice. This adds per-query heap churn and
O(k log k) work on the SIMD path used by public Python/C ABI/Go entrypoints.

Code

src/quant.rs[R695-716]

+// Re-applying a query-constant center offset after TopK finalization can round
+// distinct internal scores onto the same exposed score; enforce the public
+// `(score desc, row ID asc)` order on the finalized top-k slice.
+fn sort_by_exposed_score_then_row_id(scores: &mut [f32], indices: &mut [i64]) {
+    debug_assert_eq!(scores.len(), indices.len());
+    let mut pairs: Vec<(usize, f32, i64)> = scores
+        .iter()
+        .copied()
+        .zip(indices.iter().copied())
+        .enumerate()
+        .map(|(slot, (score, row_id))| (slot, score, row_id))
+        .collect();
+    pairs.sort_unstable_by(|a, b| {
+        b.1.total_cmp(&a.1)
+            .then_with(|| a.2.cmp(&b.2))
+            .then_with(|| a.0.cmp(&b.0))
+    });
+    for (slot, (_, score, row_id)) in pairs.into_iter().enumerate() {
+        scores[slot] = score;
+        indices[slot] = row_id;
+    }
+}

Evidence

The SIMD centre-drop path calls TopK::finalize_into(...) and then (new in this PR) immediately
performs another allocation+sort over the finalized top-k slice to enforce the public ordering after
offset reapplication. TopK::finalize_into already allocates and sorts pairs, so the new helper
duplicates that work and is on the public call paths used by Python and the C ABI.

src/quant.rs[351-420]
src/quant.rs[663-691]
src/quant.rs[695-716]
src/util.rs[472-516]
ordvec-python/src/lib.rs[491-513]
ordvec-ffi/src/lib.rs[830-844]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`RankQuant` currently performs two sorts on the SIMD centre-drop path:
1) `TopK::finalize_into(...)` allocates+sorts the top-k.
2) After adding the center offset, `sort_by_exposed_score_then_row_id(...)` allocates+sorts again.

The second sort is correctness-motivated (offset reapplication can round distinct internal scores onto identical exposed scores), but doing it unconditionally adds avoidable overhead.

## Issue Context
This affects user-facing APIs because Python and the C ABI forward into `RankQuant::search_asymmetric` / `search_asymmetric_subset`.

## Fix Focus Areas
- src/quant.rs[400-420]
- src/quant.rs[663-691]
- src/quant.rs[695-716]
- src/util.rs[472-516]

## Suggested fix approach
Implement one of the following optimizations (keep correctness contract):
- **Only sort when needed**: after applying the offset, do a linear pass to check whether the slice already satisfies `(score desc, row_id asc for equal exposed scores)` (use `total_cmp` equality for the score tie check). Call `sort_by_exposed_score_then_row_id` only if a violation is detected.
- **Sort once instead of twice**: extend `TopK::finalize_into` (or add an alternative finalize) to accept a `score_adjustment` (e.g., add `offset` to each finalized score before sorting), so the final order is produced in a single allocation+sort.
- **Reduce allocations**: use a stack-backed small buffer (e.g., `SmallVec`) for common `k` values, or reuse a preallocated buffer per call if feasible.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9019815077

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "Codex (@codex) review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

gemini-code-assist Bot reviewed Jun 4, 2026

View reviewed changes

Comment thread fuzz/fuzz_targets/search_rankquant.rs

Nelson Spence (Fieldnote-Echo) force-pushed the codex/rankquant-determinism-followups branch from 4f035a0 to e06b4c9 Compare June 4, 2026 01:54

Pin RankQuant subset duplicate contract

9019815

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Nelson Spence (Fieldnote-Echo) force-pushed the codex/rankquant-determinism-followups branch from e06b4c9 to 9019815 Compare June 4, 2026 01:58

Nelson Spence (Fieldnote-Echo) marked this pull request as ready for review June 4, 2026 02:29

Nelson Spence (Fieldnote-Echo) requested a review from Navi Bot (project-navi-bot) as a code owner June 4, 2026 02:29

chatgpt-codex-connector Bot reviewed Jun 4, 2026

View reviewed changes

Comment thread src/quant.rs Outdated

Nelson Spence (Fieldnote-Echo) force-pushed the codex/rankquant-determinism-followups branch from 925a5d6 to 9db3b36 Compare June 4, 2026 02:50

Fix RankQuant exposed-score TopK ties

bc6ef84

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Nelson Spence (Fieldnote-Echo) force-pushed the codex/rankquant-determinism-followups branch from 9db3b36 to bc6ef84 Compare June 4, 2026 02:59

Merge branch 'main' into codex/rankquant-determinism-followups

57ea139

project-navi-bot approved these changes Jun 9, 2026

View reviewed changes

Navi Bot (project-navi-bot) merged commit cc89ae9 into main Jun 9, 2026
38 checks passed

Navi Bot (project-navi-bot) deleted the codex/rankquant-determinism-followups branch June 9, 2026 15:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[codex] Pin RankQuant subset duplicate contract#167

[codex] Pin RankQuant subset duplicate contract#167
Navi Bot (project-navi-bot) merged 3 commits into
mainfrom
codex/rankquant-determinism-followups

Fieldnote-Echo commented Jun 4, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

codecov Bot commented Jun 4, 2026 •

edited

Loading

Uh oh!

qodo-code-review Bot commented Jun 4, 2026

Uh oh!

qodo-code-review Bot commented Jun 4, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Fieldnote-Echo commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Review loop

Validation

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

codecov Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

qodo-code-review Bot commented Jun 4, 2026

Review Summary by Qodo

Walkthroughs

File Changes

Uh oh!

qodo-code-review Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review by Qodo

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Fieldnote-Echo commented Jun 4, 2026 •

edited

Loading

codecov Bot commented Jun 4, 2026 •

edited

Loading

qodo-code-review Bot commented Jun 4, 2026 •

edited

Loading