feat: capability-gated b=8 RankQuant (research/evidence width) (#221) by Fieldnote-Echo · Pull Request #228 · Project-Navi/ordvec

Fieldnote-Echo · 2026-06-14T20:21:09Z

Closes #221.

What

Adds b=8 as a capability-gated RankQuant width — a stable, documented evidence/refinement surface (asymmetric quant after repair flows, edge-case rerank healing), not experimental-unstable, and not a broad retrieval-quant method.

Capability matrix (the constant-composition rule `dim % 2^bits == 0` applies only to the symmetric analytical norm):

| path | b=8 |
|---|---|
| bucket-code generation, pair-evidence/contingency, asymmetric scoring | any dim |
| symmetric scoring + analytical norm | only `dim % 256 == 0` |

b=1/2/4 are unchanged — they stay the stable headline retrieval surface.

API (additive — no breaking changes)

RankQuantCapability { AsymmetricOnly, SymmetricAndAsymmetric } + capability() + symmetric_supported().
RankQuant::new(dim, 8) requires dim % 256 == 0 (full capability; fail-loud otherwise, directing to new_asymmetric). RankQuant::new_asymmetric(dim, 8) — any dim, AsymmetricOnly (auto-upgrades when 256-aligned).
search() on an AsymmetricOnly instance fails loud: "RankQuant b=8 symmetric scoring requires dim % 256 == 0; dim={dim} supports asymmetric/evidence APIs only." (documented; query symmetric_supported() first).
validate_params(dim, 8) = code-validity at any dim.
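
The b=8 capability rule above can be sketched in isolation. This is a minimal illustration, not the crate's code: the enum mirrors the `RankQuantCapability` variants named in this PR, while `b8_capability` and the free function `symmetric_supported` are hypothetical helpers introduced only for this example.

```rust
/// Mirrors the variants named in this PR; the real enum lives in the crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RankQuantCapability {
    AsymmetricOnly,
    SymmetricAndAsymmetric,
}

/// Hypothetical helper: the capability a b=8 instance would report for `dim`.
/// Symmetric scoring relies on the analytical norm, which needs every one of
/// the 256 buckets to be equally occupied, i.e. `dim % 256 == 0`.
pub fn b8_capability(dim: usize) -> RankQuantCapability {
    if dim % 256 == 0 {
        RankQuantCapability::SymmetricAndAsymmetric
    } else {
        RankQuantCapability::AsymmetricOnly
    }
}

/// Hypothetical free-function analogue of the `symmetric_supported()` query.
pub fn symmetric_supported(cap: RankQuantCapability) -> bool {
    cap == RankQuantCapability::SymmetricAndAsymmetric
}
```

Under this rule dim=768 would report full capability, while dim=384 would be asymmetric-only, matching the test matrix described in this PR.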

Kernel

b=8 asymmetric = a dim*256 float-LUT gather. AVX-512 vgatherdps kernel (runtime-dispatched, explicit tail handling), scalar LUT fallback. ~1.23× over scalar on an AVX-512 host — honest (gather/LUT-latency bound, not the ~10× of b=2/4 which touch no LUT); shipped because it reproducibly wins. Parity vs scalar within the crate's 1e-4 tolerance across dims 384/400/768/1024/1536.
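
For readers unfamiliar with the LUT-gather shape, a scalar sketch of the idea follows. It assumes a centred bucket value of `code - 127.5` and uses hypothetical function names; the crate's actual LUT builder, bucket centres, and scaling may differ.

```rust
/// Build a per-coordinate table: lut[c * 256 + code] = query[c] * centre(code).
/// ASSUMPTION: the bucket centre is `code - 127.5` (symmetric about zero).
pub fn build_b8_lut(query: &[f32]) -> Vec<f32> {
    let mut lut = Vec::with_capacity(query.len() * 256);
    for &q in query {
        for code in 0..256u32 {
            lut.push(q * (code as f32 - 127.5));
        }
    }
    lut
}

/// Scalar reference score: one table lookup per coordinate, summed.
/// A SIMD gather kernel would fetch 16 of these lookups per instruction.
pub fn score_b8_scalar(codes: &[u8], lut: &[f32]) -> f32 {
    assert_eq!(lut.len(), codes.len() * 256, "LUT must hold 256 entries per coord");
    codes
        .iter()
        .enumerate()
        .map(|(c, &code)| lut[c * 256 + code as usize])
        .sum()
}
```

Each coordinate costs one dependent memory load, which is consistent with the kernel being gather/LUT-latency bound rather than compute bound.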

Verified (locally, my own runs)

fmt / clippy -D warnings (default + experimental) / test (196 default + 206 experimental + no-default-features) / MSRV 1.89 — all green. b=1/2/4 behaviour and every prior test unchanged. Unsafe gather bounds proven by hand. Test matrix covers dim=384 (asym pass, symmetric rejects with exact message), 768/1024/1536 (full), b=4 dim=384 unchanged, capability reporting, brute-force asymmetric parity.

⚠️ Decisions for maintainer sign-off

b=8 persistence is deferred (most important for your use case) — RankQuant::write() returns io::Error(Unsupported) and writes no file for a b=8 index (the .tvrq loader admits {1,2,4} only; refusing avoids a silent broken round-trip). b=8 is in-memory only this phase. Your stated use case is "asymmetric quant storage" — if that means persisting b=8 to disk, say so and I'll add b=8 to the .tvrq format (trivial: 1 byte/coord) as a follow-up or in this PR.
Symmetric-on-asym-only = fail-loud panic (not Result) — matches the crate's fail-loud guard style and avoids a breaking search signature; the new_asymmetric name + symmetric_supported() make it non-surprising. Confirm.
Persistence/Python — Python binding keeps its {1,2,4} guard (b=8 is Rust-core-only; Python parity is the separate Add Python/CLI parity utilities for ordinal pair evidence #223).
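
The persistence refusal in the first decision can be illustrated with a small guard. This is a sketch under assumptions: `check_persistable` is a hypothetical name and the error text is invented for the example; only the `{1,2,4}` admission rule and the `Unsupported` error kind come from the description above.

```rust
use std::io;

/// Hypothetical guard: refuse to persist any width the loader would reject.
/// Returning an error before touching the filesystem avoids writing a file
/// that could never be read back.
pub fn check_persistable(bits: u8) -> io::Result<()> {
    match bits {
        // Widths the .tvrq loader admits, per the PR description.
        1 | 2 | 4 => Ok(()),
        _ => Err(io::Error::new(
            io::ErrorKind::Unsupported,
            format!("bits={bits} cannot be persisted in this format"),
        )),
    }
}
```

A caller can match on `ErrorKind::Unsupported` to distinguish "this width is in-memory only" from a genuine I/O failure.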

Coupling

Once this lands, #226 (RankQuantSpec, #220) gets a small follow-up to accept b=8 at any dim + report symmetric_supported().

Targets 0.6 (may land in 0.5.0 — maintainer decides).

b=8 is an evidence/refinement-oriented RankQuant width (asymmetric quant after repair flows, edge-case rerank healing): a stable, documented surface, NOT experimental-unstable.

Capability matrix:
- bucket-code generation / pair-evidence / asymmetric scoring: ANY dim
- symmetric scoring + analytical norm: ONLY dim % 256 == 0

API (additive, no breaking changes to existing signatures):
- RankQuantCapability { AsymmetricOnly, SymmetricAndAsymmetric } + capability() + symmetric_supported().
- RankQuant::new(dim, 8) requires dim % 256 == 0 (full capability; fail-loud else, directing to new_asymmetric). RankQuant::new_asymmetric(dim, 8): any dim, AsymmetricOnly (auto-upgrades to full when 256-aligned). b=1/2/4 unchanged.
- search() on an AsymmetricOnly instance fails loud with the exact message 'RankQuant b=8 symmetric scoring requires dim % 256 == 0; dim={dim} supports asymmetric/evidence APIs only.' (documented; check symmetric_supported() first).
- validate_params(dim, 8): code-validity any dim (no dim%256).

Primitives widened to bits<=8 (mask in u16 to avoid 1u8<<8 overflow); b=8 packs 1 byte/coord.

Kernel: b=8 asymmetric uses an AVX-512 vgatherdps kernel (dim*256 LUT gather), runtime-dispatched (avx512f + dim%16==0, explicit tail handling), scalar LUT fallback; ~1.23x over scalar on this host (gather/LUT-latency bound; honest, shipped). Parity vs scalar within 1e-4 across dims 384/400/768/1024/1536.

Persistence: b=8 write() returns io::Error(Unsupported) (no file): the .tvrq loader admits {1,2,4} only; b=8 is in-memory this phase (FLAGGED, see PR).

Verified: fmt/clippy(-D warnings, default+experimental)/test(196 default + 206 experimental + no-default-features)/MSRV 1.89 green; b=1/2/4 unchanged; unsafe gather bounds proven.

Closes #221.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

qodo-code-review · 2026-06-14T20:21:15Z

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Context used

✅ Tickets: 🎫 Expose reusable RankQuant bucket-code API for downstream evidence systems 🎫 Add Python/CLI parity utilities for ordinal pair evidence 🎫 Decide b=8 RankQuant evidence surface for OrdGraph/Ordscope use

1. rankquant_eval_search contradicts b=8 gating ✓ Resolved 📎 Requirement gap ≡ Correctness

Description

The public docs state b=8 symmetric scoring is only supported when dim % 256 == 0, but
rankquant_eval_search now explicitly supports bits=8 at non-256-aligned dims via an empirical
norm. This creates a semantic mismatch in the crate’s b=8 capability matrix and can mislead
downstream users relying on consistent b=8 symmetric behavior.

Code

src/quant.rs[R44-48]

fn check_eval_bits(bits: u8) {
-    assert!((1..=7).contains(&bits), "bits must be in 1..=7");
+    // b=8 codes still fit a u8 (0..=255); the eval norm is computed empirically
+    // (not the analytical b=8 norm), so it is valid at any dim. b=9 is the first
+    // width whose codes overflow u8.
+    assert!((1..=8).contains(&bits), "bits must be in 1..=8");

Evidence

src/lib.rs documents b=8 symmetric scoring as only valid when dim % 256 == 0, but
check_eval_bits was widened to allow bits=8 specifically noting the eval norm is empirical and
valid at any dim, and a new test asserts rankquant_eval_search(..., dim=384, bits=8, ...) works.
This demonstrates a mismatch between the documented b=8 symmetric gating and the public eval-search
behavior.

Define and document b=8 RankQuant evidence surface and ensure no bit-width semantic mismatch
src/lib.rs[16-19]
src/quant.rs[44-48]
tests/index/quant_b8.rs[451-467]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`rankquant_eval_search` accepts `bits=8` at non-`256`-aligned dims (via an empirical norm), while the crate-level documentation states `b=8` symmetric scoring is only supported when `dim % 256 == 0`. This creates an internal/public semantic mismatch for b=8.

## Issue Context
- `src/lib.rs` documents b=8 symmetric scoring as `dim % 256 == 0` only.
- `check_eval_bits` and the new integration test explicitly assert eval-search supports `bits=8` at non-256-aligned dims.
- Downstream users (incl. OrdGraph migration) may treat `rankquant_eval_search` as a reference for symmetric scoring semantics.

## Fix Focus Areas
- src/quant.rs[44-48]
- tests/index/quant_b8.rs[451-467]
- src/lib.rs[16-19]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. ~~Docs omit b=8 semantics~~ ✓ Resolved 📎 Requirement gap ⚙ Maintainability

Description

The PR adds and documents b=8 RankQuant support, but top-level docs still state RankQuant supports
only bits ∈ {1, 2, 4}. This violates the requirement to clearly document retrieval-supported vs
evidence-only widths and can mislead downstream users about b=8 availability and intended use.

Code

src/quant.rs[R3-19]

+//! Storage is `dim * bits / 8` bytes per document at `bits ∈ {1, 2, 4, 8}`
+//! (`b=8` is one byte per coordinate). Symmetric search uses a per-query,
+//! per-coord LUT; asymmetric search dispatches AVX-512 → AVX2 → scalar via
+//! the kernels in [`crate::quant_kernels`].
+//!
+//! `b=8` is an evidence/refinement-oriented width: it is supported for
+//! asymmetric scoring and code/projection generation at **any** dimension,
+//! but symmetric scoring uses the equal-bucket analytical norm and therefore
+//! requires `dim % 256 == 0`. For `b ∈ {1, 2, 4}` the existing retrieval
+//! modes remain the stable headline surface; `b=8` is an opt-in,
+//! explicitly-documented high-precision evidence/refinement surface
+//! (e.g. asymmetric quant storage after repair flows, edge-case rerank
+//! healing), not a broad retrieval-quant method. It is **not**
+//! unstable-experimental. See [`RankQuantCapability`] and
+//! [`RankQuant::new_asymmetric`]. Its asymmetric path is a per-coordinate
+//! gather against the `dim * 256` LUT: an AVX-512 `vgatherdps` kernel when
+//! available (`avx512f` + `dim % 16 == 0`), else the portable scalar LUT.

Evidence

Rule 11 requires documentation to clearly distinguish retrieval-supported widths from any
research/evidence-only widths and to align with the implementation. The PR updates src/quant.rs to
describe b=8 support and semantics, but README.md and docs/RANK_MODES.md still claim only
{1,2,4} are supported, creating an inconsistency.

Document bit-width semantics (retrieval-supported vs research/evidence-only)
src/quant.rs[3-19]
README.md[29-37]
docs/RANK_MODES.md[70-75]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
After introducing and documenting `b=8` in `src/quant.rs`, the repository documentation still describes RankQuant bit widths as `bits ∈ {1, 2, 4}` only. This creates conflicting guidance about which widths are supported and what `b=8` means (retrieval vs evidence-only surface).

## Issue Context
- `src/quant.rs` now documents `bits ∈ {1, 2, 4, 8}` and describes `b=8` as an evidence/refinement-oriented width with capability gating.
- `README.md` and `docs/RANK_MODES.md` still state RankQuant supports only `{1,2,4}`.

## Fix Focus Areas
- README.md[29-37]
- docs/RANK_MODES.md[70-75]
- src/quant.rs[3-19]

3. ~~AVX512 gather misgated~~ ✓ Resolved 🐞 Bug ☼ Reliability

Description

scan_b8_asym dispatches to the AVX-512 gather kernel when only avx512f is detected, but the
kernel widens 16 u8 doc bytes to 16 i32 lanes; on CPUs lacking the required AVX-512 byte/word subset
this can crash with an illegal-instruction at runtime. The fix is to gate dispatch (and the kernel’s
#[target_feature]) on the full feature set actually used (at least avx512bw in addition to
avx512f).

Code

src/quant_kernels.rs[R575-586]

+        if is_x86_feature_detected!("avx512f") && dim.is_multiple_of(16) {
+            // SAFETY: `avx512f` is confirmed by the runtime detection above
+            // and `dim % 16 == 0` satisfies the kernel's lane invariant;
+            // `packed.len() == n * dim` and `lut.len() == dim * 256` hold by
+            // construction (b=8 packs one byte/coord; the LUT is built just
+            // above). The explicit block is required by
+            // `#![deny(unsafe_op_in_unsafe_fn)]`.
+            unsafe {
+                scan_b8_asym_avx512_gather(packed, n, dim, &lut, scale, top);
+            }
+            return;
+        }

Evidence

The dispatch currently checks only avx512f before calling the gather kernel, while the kernel
explicitly uses a byte-to-dword widening intrinsic; elsewhere in the codebase, similar byte-widening
AVX-512 code is gated on avx512bw, indicating that subset is expected to be checked.

src/quant_kernels.rs[555-616]
src/quant_kernels.rs[678-706]
src/fastscan.rs[440-450]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The new b=8 AVX-512 gather path is dispatched when only `avx512f` is detected, but the kernel performs byte-lane widening (`_mm512_cvtepu8_epi32`) which typically requires `avx512bw`. On x86_64 hosts that have AVX-512F but not AVX-512BW, this can lead to a runtime SIGILL when `search_asymmetric` / `search_asymmetric_subset` hits the gather path.

### Issue Context
The repo already treats AVX-512 subsets carefully elsewhere (e.g., FastScan gates on `avx512f` + `avx512bw` + `avx512dq`). The b=8 gather kernel should follow the same pattern: both the `#[target_feature(...)]` annotation and the runtime `is_x86_feature_detected!` checks must cover every required ISA subset.

### Fix Focus Areas
- src/quant_kernels.rs[564-589]
- src/quant_kernels.rs[618-707]
- src/fastscan.rs[440-450]
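
A minimal sketch of the gating pattern this issue asks for: check every ISA subset the kernel uses, plus the lane invariant, before dispatching. The function names are hypothetical; only `is_x86_feature_detected!` and the `avx512f` / `avx512bw` feature strings are real.

```rust
/// True when the CPU reports both subsets the gather path is gated on.
#[cfg(target_arch = "x86_64")]
fn cpu_has_b8_gather_features() -> bool {
    std::is_x86_feature_detected!("avx512f") && std::is_x86_feature_detected!("avx512bw")
}

/// Non-x86_64 targets never take the AVX-512 path.
#[cfg(not(target_arch = "x86_64"))]
fn cpu_has_b8_gather_features() -> bool {
    false
}

/// Hypothetical dispatch predicate: the kernel processes 16 lanes at a time,
/// so `dim` must also be a multiple of 16; otherwise use the scalar LUT scan.
pub fn b8_gather_available(dim: usize) -> bool {
    dim % 16 == 0 && cpu_has_b8_gather_features()
}
```

The same feature list would also need to appear in the kernel's `#[target_feature(enable = "...")]` attribute so that the compile-time and runtime gates agree.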

4. ~~b=8 norm mis-scaled~~ ✓ Resolved 🐞 Bug ≡ Correctness

Description

For bits=8 at non-256-aligned dims, RankQuant::search_asymmetric (and
search_asymmetric_subset) normalizes scores using rankquant_norm(dim, 8), even though
rankquant_norm is documented as exact only when dim % 256 == 0. This makes b=8 asymmetric scores
for AsymmetricOnly instances systematically mis-normalized (potentially outside the intended score
scale/range).

Code

↗ src/quant.rs

            .for_each(|((q, out_scores), out_indices)| {

Evidence

validate_params explicitly allows b=8 at any dimension (no dim % 256 requirement), but the
asymmetric scoring paths still derive inv_norm from rankquant_norm, which is documented as exact
only when dim % 256 == 0 for bits==8. This creates a mismatch where the newly-supported
b=8/asymmetric-only configuration uses a normalization constant that is explicitly stated to be
inexact for that configuration.

src/quant.rs[239-300]
src/quant.rs[601-615]
src/quant.rs[905-916]
src/rank.rs[287-313]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`RankQuant::validate_params(dim, 8)` explicitly allows any `dim`, and `RankQuant::new_asymmetric(dim, 8)` enables asymmetric scoring at non-256-aligned dimensions. However, both `search_asymmetric` and `search_asymmetric_subset` still compute `inv_norm` using `rankquant_norm(dim, bits)`, whose contract (and doc) is the *symmetric analytical* closed-form norm that is only exact when bucket occupancy is equal (for b=8: `dim % 256 == 0`).

This leads to incorrect normalization/scaling of b=8 asymmetric scores for the newly-supported non-256-aligned dimensions.

### Issue Context
There is already an in-crate implementation pattern for computing an exact norm by iterating ranks and accumulating `bucket_centre(rank_to_bucket(rank))^2` (see the eval helpers). A similar approach can be applied for b=8 (or more generally) in the RankQuant asymmetric path.

### Fix Focus Areas
- src/quant.rs[601-616]
- src/quant.rs[901-917]
- src/rank.rs[287-313]

### Suggested fix
- Introduce an *exact* norm computation for `(dim, bits)` when `bits == 8 && dim % 256 != 0` (or more generally for any non-constant-composition case), e.g.:
 - `rankquant_norm_exact(dim, bits)` that loops `rank in 0..dim`, computes `b = rank_to_bucket(rank as u16, dim, bits)`, `c = bucket_centre(b, bits)`, and accumulates `c*c`.
- In `search_asymmetric` and `search_asymmetric_subset`, when `bits == 8` and `!dim.is_multiple_of(256)`, use the exact norm instead of the closed-form `rankquant_norm`.
- (Optional but robust) Cache the computed norm (or `inv_norm`) in the `RankQuant` instance for `b=8` asymmetric-only to avoid recomputation across calls.
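
The suggested fix can be sketched as follows. Everything here is an assumption made for illustration: the bucket mapping (`rank * 2^bits / dim`), the centre (`bucket - (2^bits - 1) / 2`), the closed-form variance `(4^bits - 1) / 12`, and all function names. The crate's `rank_to_bucket`, `bucket_centre`, and `rankquant_norm` may be defined differently.

```rust
/// ASSUMED mapping: equal-width rank buckets (not necessarily the crate's).
fn sketch_rank_to_bucket(rank: usize, dim: usize, bits: u32) -> usize {
    rank * (1usize << bits) / dim
}

/// ASSUMED centre: bucket index shifted so the levels are symmetric about 0.
fn sketch_bucket_centre(bucket: usize, bits: u32) -> f64 {
    bucket as f64 - ((1usize << bits) as f64 - 1.0) / 2.0
}

/// Exact norm: sum the squared centre actually realised by every rank.
pub fn norm_exact(dim: usize, bits: u32) -> f64 {
    (0..dim)
        .map(|rank| {
            let c = sketch_bucket_centre(sketch_rank_to_bucket(rank, dim, bits), bits);
            c * c
        })
        .sum::<f64>()
        .sqrt()
}

/// Closed form, valid only when every bucket holds the same number of ranks.
pub fn norm_closed_form(dim: usize, bits: u32) -> f64 {
    let levels = (1usize << bits) as f64;
    (dim as f64 * (levels * levels - 1.0) / 12.0).sqrt()
}
```

Under these assumptions the two agree when the buckets are equally occupied (for example dim=512, b=8) and diverge when they are not (for example dim=257, b=8), which is the situation the exact norm is meant to handle.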

5. ~~Eval rejects b=8~~ ✓ Resolved 🐞 Bug ≡ Correctness

Description

rankquant_eval_search still validates bits via check_eval_bits which rejects bits=8, so
calling rankquant_eval_search(..., bits=8, ...) will panic even though the PR expands the
supported widths to include 8 in RankQuant::validate_params and related APIs. This creates an
inconsistent public surface where b=8 is supported for RankQuant but not for the standalone eval
helper.

Code

src/quant.rs[R256-262]

    pub fn validate_params(dim: usize, bits: u8) -> Result<(), OrdvecError> {
-        if !matches!(bits, 1 | 2 | 4) {
+        if !matches!(bits, 1 | 2 | 4 | 8) {
            return Err(OrdvecError::InvalidParameter {
                name: "bits",
-                message: "must be 1, 2, or 4".to_string(),
+                message: "must be 1, 2, 4, or 8".to_string(),
            });
        }

Evidence

The eval path hard-rejects 8 via check_eval_bits(1..=7), while the updated validation and docs in
the same module now explicitly accept bits ∈ {1,2,4,8}, making rankquant_eval_search
unexpectedly panic for a newly-supported width.

src/quant.rs[44-50]
src/quant.rs[241-261]
src/quant.rs[1093-1101]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`rankquant_eval_search` still uses `check_eval_bits` which asserts `bits` is in `1..=7`, so `bits=8` panics. After this PR, other public entry points (e.g., `RankQuant::validate_params`) explicitly accept `bits=8`, so the eval API is now inconsistent with the supported bit-width domain.

### Issue Context
The underlying primitives (`rank_to_bucket`, `bucket_centre`) now support `bits<=8`, so `rankquant_eval_search` can safely support `bits=8` as well.

### Fix Focus Areas
- src/quant.rs[44-46]
- src/quant.rs[1093-1101]
- src/quant.rs[241-262]

chatgpt-codex-connector · 2026-06-14T20:21:17Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

qodo-code-review · 2026-06-14T20:21:55Z

PR Summary by Qodo

Add capability-gated b=8 RankQuant with asymmetric gather kernel and tests
✨ Enhancement 🧪 Tests 🕐 40+ Minutes

Walkthroughs

Description

• Add b=8 RankQuant with capability gating for symmetric vs asymmetric scoring.
• Implement AVX-512 gather-based b=8 asymmetric scan with scalar fallback.
• Add integration/unit tests that pin capability rules and scoring parity.

Diagram

graph TD
  A["Public API (lib.rs)"] --> B["RankQuant (quant.rs)"] --> C["Rank utils (rank.rs)"]
  B --> D["Kernels (quant_kernels.rs)"]
  B --> E["TVRQ IO (rank_io)"]
  F["b=8 tests (tests/index/quant_b8.rs)"] --> B

  subgraph Legend
    direction LR
    _api["API Module"] ~~~ _core["Core Logic"] ~~~ _kern["SIMD/Scalar Kernels"]
  end

High-Level Assessment

The following are alternative approaches to this PR:

1. Make symmetric search return Result for unsupported capability

➕ Avoids panics for AsymmetricOnly instances and enables error handling in libraries/services
➕ Allows clearer propagation when consumers accidentally call symmetric search
➖ Breaking API change (search signature) unless introduced as a parallel method
➖ Would diverge from current crate’s fail-loud guard style

2. Split types: RankQuantAsymmetricOnly vs RankQuantFull

➕ Capability becomes a compile-time property; impossible to call symmetric search on asymmetric-only
➕ Reduces runtime branching and docs burden
➖ More API surface and conversions; may complicate ergonomics and generics
➖ Still needs persistence/versioning decisions for b=8

3. Version/extend the .tvrq format to include b=8 immediately

➕ Unblocks the stated asymmetric quant storage use case with an end-to-end story
➕ Avoids the current 'in-memory only' limitation for b=8
➖ Requires format bump and compatibility strategy across Rust/Python loaders
➖ Increases rollout coordination and review surface beyond kernel + API

Recommendation: The capability-gated design is a reasonable way to add b=8 without weakening the correctness contract of symmetric scoring. If the project strongly prefers fail-loud invariants, keeping the panic on unsupported symmetric search is consistent; otherwise consider adding a non-breaking try_search() variant returning Result. The biggest product decision is persistence: the current choice to hard-reject write() for b=8 avoids broken round-trips, but if storage is a core use case it’s worth prioritizing a .tvrq format extension soon.
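
The non-breaking `try_search()` idea in the recommendation could look roughly like the guard below. This is a sketch only: `SymmetricUnsupported` and `try_symmetric_b8` are hypothetical names, and the real method would go on to score rather than return `()`. The message text is the one pinned in the PR description.

```rust
use std::fmt;

/// Hypothetical error type carrying the offending dimension.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SymmetricUnsupported {
    pub dim: usize,
}

impl fmt::Display for SymmetricUnsupported {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Same wording as the pinned panic message, so both paths read alike.
        write!(
            f,
            "RankQuant b=8 symmetric scoring requires dim % 256 == 0; dim={} supports asymmetric/evidence APIs only.",
            self.dim
        )
    }
}

impl std::error::Error for SymmetricUnsupported {}

/// Hypothetical guard a `try_search()` variant could run before scoring.
pub fn try_symmetric_b8(dim: usize) -> Result<(), SymmetricUnsupported> {
    if dim % 256 == 0 {
        Ok(())
    } else {
        Err(SymmetricUnsupported { dim })
    }
}
```

Keeping the panic in `search()` and adding a `Result`-returning sibling would preserve the existing signature while letting services handle the unsupported case without unwinding.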

File Changes

Enhancement (4)

lib.rs Re-export RankQuantCapability in public API +1/-1
• Extends the crate’s public exports to include the new RankQuantCapability enum alongside RankQuant, enabling downstream callers to inspect supported scoring modes.
src/lib.rs

quant.rs Add b=8 RankQuant capability gating and asymmetric-only constructor +328/-57

• Introduces RankQuantCapability and stores it in RankQuant instances. Adds 'new_asymmetric()', capability helpers, and symmetric gating that panics with a pinned message for non-256-aligned b=8. Routes b=8 symmetric scanning through a dedicated scan function and routes b=8 asymmetric scoring through a new kernel dispatch. Rejects persistence for b=8 via 'write()' returning Unsupported and documents the limitation.

src/quant.rs

quant_kernels.rs Implement b=8 scans and AVX-512 gather asymmetric kernel +450/-0
• Adds a b=8 per-coordinate LUT builder, a scalar reference scan ('scan_b8_to_topk'), and a runtime-dispatched asymmetric entry point ('scan_b8_asym'). Implements an AVX-512 'vgatherdps' kernel with explicit invariants and test coverage for parity against the scalar reference, plus an ignored micro-benchmark for kernel-level performance validation.
src/quant_kernels.rs

rank.rs Widen bucketing/packing primitives to support bits=8 +141/-28

• Extends 'rank_to_bucket', 'bucket_ranks', pack/unpack helpers, bytes-per-vec, and norm helpers to accept bits=8, including fixing mask calculation to avoid '1u8<<8' overflow. Adds documentation clarifying that 'dim % 256 == 0' is a symmetric-norm requirement (not code validity), and expands unit tests for b=8 behavior and invariants.

src/rank.rs
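
The mask fix mentioned for rank.rs comes down to doing the shift in a wider type. A minimal sketch, with `bucket_mask` as a hypothetical name rather than the crate's helper:

```rust
/// Low-`bits` mask for a bucket code. Computing `1u8 << 8` would overflow
/// (a panic in debug builds), so the shift is done in u16 and then narrowed.
pub fn bucket_mask(bits: u8) -> u8 {
    assert!((1..=8).contains(&bits), "bits must be in 1..=8");
    // For bits == 8 this is (256 - 1) = 255, which fits back into a u8.
    ((1u16 << bits) - 1) as u8
}
```

With this shape b=8 yields `0xFF`, and the narrower widths keep their existing masks.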

Tests (3)

main.rs Register new b=8 integration test module +1/-0
• Adds the 'quant_b8' integration test module to the test suite entrypoint so capability and parity tests run under the index test harness.
tests/index/main.rs

quant_b8.rs Add b=8 capability matrix and parity integration tests +449/-0
• Adds comprehensive integration tests that pin: constructor capability behavior, 'validate_params' semantics, fail-loud symmetric gating message for non-aligned dims, symmetric correctness on 256-aligned dims, and asymmetric parity vs a naive reference (including subset rerank path).
tests/index/quant_b8.rs

redteam_gamma.rs Update redteam guard expectation for bits<=8 +3/-2
• Updates the redteam test commentary and expectation to reflect that bits=8 is now a supported width and the ‘bits too large’ guard begins above 8.
tests/redteam_gamma.rs

gemini-code-assist

Code Review

This pull request introduces support for 8-bit (b=8) bucketed-rank quantization (RankQuant) as a high-precision evidence and refinement surface. It implements capability gating via the new RankQuantCapability enum, allowing asymmetric scoring and code generation at any dimension, while restricting symmetric scoring to dimensions aligned to 256 (dim % 256 == 0). The changes include a new new_asymmetric constructor, updated parameter validation, an optimized AVX-512 gather kernel for asymmetric scans with a portable scalar fallback, and comprehensive integration tests. Since there are no review comments, I have no additional feedback to provide on this pull request.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Fieldnote-Echo · 2026-06-14T20:52:27Z

/agentic_review

qodo-code-review · 2026-06-14T20:52:32Z

Code review by qodo was updated up to the latest commit 790f478

check_eval_bits capped at 1..=7, rejecting b=8 from rankquant_eval_search — but b=8 codes fit u8 and the eval norm is computed empirically (valid at any dim, no dim%256). Widen to 1..=8 (b=9 is the first u8-overflowing width). Test: eval search with b=8 at a non-256-aligned dim (384). Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

qodo flagged the b=8 gather (scan_b8_asym_avx512_gather, uses _mm512_cvtepu8_epi32) dispatching on avx512f alone. VPMOVZXBD is AVX-512F per Intel, but gating on avx512f+avx512bw matches the rest of the crate's AVX-512 kernels (which require avx512dq), keeps the byte-widening conservatively gated, and adds no real exclusion (F-without-BW CPUs like KNL/KNM are already excluded by the dq requirement). Updated both the runtime dispatch and the #[target_feature], plus the direct test/bench callers' guards. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Two rustdoc/comment sites (scan_b8_asym dispatch note; RankQuant::search_asymmetric b=8 doc) still described the gate as avx512f-only after 124b5a1 widened it to avx512f+avx512bw. Docs now match the code. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Remaining stale sites after 44f1f2d: the quant.rs module-level b=8 dispatch note, and the three SAFETY comments at the b=8 gather's test/bench call sites. All now say avx512f+avx512bw, matching the dispatch + #[target_feature]. Non-b8 kernels (bitmap vpopcntdq, b2/b4 dq, fastscan, sign) are unchanged. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

qodo: crate-level lib.rs and README still described RankQuant widths as bits ∈ {1,2,4}. Add the b=8 note (capability-gated evidence/refinement width: asymmetric + code/projection at any dim; symmetric only when dim % 256 == 0), so the headline docs match the new surface and don't mislead on b=8 scope. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Fieldnote-Echo · 2026-06-14T22:40:20Z

/agentic_review

qodo-code-review · 2026-06-14T22:40:25Z

Code review by qodo was updated up to the latest commit 9c7923a

The closed-form `rankquant_norm` (`sqrt(dim * var)`, `var = (2^bits-1... )`) assumes exactly-uniform bucket occupancy, which only holds for b in {1,2,4} and for b=8 when `dim % 256 == 0`. At a b=8 dim not divisible by 256 the buckets are unequally occupied, so the closed form mis-scales the absolute asymmetric scores. The ranking is unaffected (the norm is one global constant shared by every document), but `search_asymmetric` / `search_asymmetric_subset` report cosine-like scores that must be correctly scaled. Add `asymmetric_norm(dim, bits)`: closed form for the uniform regimes, exact empirical norm (`rankquant_eval_norm`, summing realised squared bucket centres) for b=8 at non-256 dims. Wire it into both asymmetric scoring sites. The symmetric path is untouched (it is gated to dim % 256 == 0, where the closed form is exact). Update `ref_b8_asymmetric` to compute the exact per-codes norm so the parity tests validate against the true cosine at dim=384 (previously both production and reference shared the same wrong norm, masking the mis-scale). Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

…m gating

qodo flagged an apparent contradiction: the crate docs state b=8 symmetric scoring requires dim % 256 == 0, yet `rankquant_eval_search` supports b=8 at non-256 dims. These are two distinct surfaces and there is no correctness bug; clarify the docs so the capability matrix reads consistently:
- `rankquant_eval_search` rustdoc: fix the inaccurate 'analytical norm' (it has always used the *empirical* norm) and state explicitly that the empirical norm is exact under any bucket occupancy, which is why this path is unbound by the dim % 256 gate that the analytical-norm `RankQuant::search` carries.
- lib.rs crate doc: scope the dim % 256 restriction to analytical-norm symmetric `RankQuant::search`; note the empirical eval path has no such limit.
- check_eval_bits + the eval-at-any-dim test: spell out the relationship to the gated symmetric path.

No functional change; doc-only.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Signed-off-by: Nelson Spence <nelson@projectnavi.ai> # Conflicts: # src/lib.rs # src/quant.rs

The merge of main (SubsetScratch batched rerank) brought the b=8 asymmetric routing into `search_asymmetric_subset_batched_serial_into` via the reused scratch buffers. Add a parity test through the public `search_asymmetric_subset_batched_serial` entry point covering both a non-256-aligned dim (384, empirical asymmetric norm) and an aligned dim (768), with two queries on distinct CSR candidate rows so scratch reuse across rows is exercised. Every returned score matches the naive per-doc reference. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

codecov · 2026-06-14T23:29:25Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Nelson Spence (Fieldnote-Echo) requested a review from Navi Bot (project-navi-bot) as a code owner June 14, 2026 20:21

gemini-code-assist Bot reviewed Jun 14, 2026

qodo-code-review Bot reviewed Jun 14, 2026

Comment thread src/quant_kernels.rs Outdated

qodo-code-review Bot reviewed Jun 14, 2026

Comment thread src/quant.rs Outdated

Nelson Spence (Fieldnote-Echo) added 5 commits June 14, 2026 16:07

qodo-code-review Bot reviewed Jun 14, 2026

Comment thread src/quant.rs

Nelson Spence (Fieldnote-Echo) added 4 commits June 14, 2026 18:13

Merge remote-tracking branch 'origin/main' into feat/rankquant-b8

1d2916b

Signed-off-by: Nelson Spence <nelson@projectnavi.ai> # Conflicts: # src/lib.rs # src/quant.rs

project-navi-bot approved these changes Jun 14, 2026

Merge branch 'main' into feat/rankquant-b8

d90412b

Navi Bot (project-navi-bot) merged commit 57787d9 into main Jun 15, 2026
38 checks passed

Navi Bot (project-navi-bot) deleted the feat/rankquant-b8 branch June 15, 2026 00:22

Nelson Spence (Fieldnote-Echo) mentioned this pull request Jun 15, 2026

0.8.0 staging: public API re-architecture & streamlining (breaking) #232

Open

5 tasks

qodo-code-review Bot mentioned this pull request Jun 19, 2026

[codex] Reuse scalar subset rerank LUT scratch #250

Merged

Navi Bot (project-navi-bot) merged 11 commits into main from feat/rankquant-b8
