Skip to content

feat(experimental): reusable RankQuant bucket-code API (#220)#226

Merged
Navi Bot (project-navi-bot) merged 8 commits into
mainfrom
feat/rankquant-bucket-code
Jun 15, 2026
Merged

feat(experimental): reusable RankQuant bucket-code API (#220)#226
Navi Bot (project-navi-bot) merged 8 commits into
mainfrom
feat/rankquant-bucket-code

Conversation

@Fieldnote-Echo

Copy link
Copy Markdown
Member

What

Ports ordgraph-proto/src/code.rs into ordvec as an index-free bucket-code surface (#220), experimental-gated — so downstream evidence systems (ordgraph-ordvec bridge) consume declared codes instead of forking kernel math.

ordvec::{CompositionSpec, RankQuantSpec, BucketCode, CompositionViolation}:

Rank math is not duplicated — delegates to crate::rank. Parity tests reproduce the proto's exact values, plus validation + a from_vector↔contingency cross-check. Verified locally: fmt / clippy -D warnings (experimental + default) / test (69 lib + 2 integration; default-build gating confirmed — absent without the feature). No new deps.

⚠️ API decisions for maintainer sign-off (agent did NOT finalize these)

  1. Error type — introduces a dedicated CompositionViolation enum rather than reusing the crate's flat OrdvecError (the proto's structured values — DuplicateRank{rank}, WrongBucketCount{...} — don't fit OrdvecError). A From<CompositionViolation> for OrdvecError could unify later. A second public error type is a stable commitment.
  2. bits = 8 rejected (proto accepted it) — narrowed to ordvec's {1,2,4} packed-format domain. This intersects Decide b=8 RankQuant evidence surface for OrdGraph/Ordscope use #221 (the b=8 decision) — if Decide b=8 RankQuant evidence surface for OrdGraph/Ordscope use #221 later adds b=8, RankQuantSpec extends too. Flagging so Expose reusable RankQuant bucket-code API for downstream evidence systems #220 doesn't silently pre-empt Decide b=8 RankQuant evidence surface for OrdGraph/Ordscope use #221.
  3. Type names kept verbatim from the proto — note RankQuantSpec sits next to the existing RankQuant index type (could read as similar).
  4. New variant NonFiniteValue for from_vector (fail-soft Err vs the crate's usual fail-loud panic at the vector entry point).
  5. Dropped the proto's unused MismatchedSpecs; derives Hash on all four types (additive).
  6. Experimental vs stable gating — behind experimental for this draft (same as MultiBucketBitmap). Graduation to stable is your call.
  7. Cross-PR couplingtests/bucket_code_contingency.rs inlines a reference reimplementation of Contingency::new (which lives in Promote bucket-overlap contingency/projection math: dense (pairwise) + indexed (MultiBucketBitmap) #219, unmerged) and asserts the codes feed it; the swap to the literal type is mechanical once Promote bucket-overlap contingency/projection math: dense (pairwise) + indexed (MultiBucketBitmap) #219 merges.

Refs #220. Targets 0.6 (may land in 0.5.0 — maintainer decides).

Port ordgraph-proto/src/code.rs into ordvec as an index-free bucket-code surface
so downstream evidence systems consume declared codes instead of forking kernel
math. experimental-gated.

API (ordvec::{CompositionSpec, RankQuantSpec, BucketCode, CompositionViolation}):
- CompositionSpec: fixed-composition validation (dim/buckets, expected_per_bucket,
  histogram, validate_codes).
- RankQuantSpec: dim+bits in {1,2,4}.
- BucketCode: new(spec, codes) validated; from_ranks; from_vector(dim, bits, &[f32])
  — the new ordvec-primitive path (rank_transform + rank_to_bucket), output feeds
  Contingency::new (#219); top_bitmap.
- CompositionViolation: structured error enum (Display + Error).

Rank math not duplicated (delegates to crate::rank). Parity tests reproduce the
proto's exact assertion values; +validation + from_vector<->contingency cross-check.
Verified: fmt/clippy(-D warnings, experimental+default)/test (69 lib + 2 integration,
default-build gating confirmed) green. No new deps. Refs #220.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@qodo-code-review

qodo-code-review Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (1)

Context used

Grey Divider


Action required

1. No OrdGraph migration notes ✓ Resolved 📎 Requirement gap ⚙ Maintainability
Description
The PR adds a new experimental bucket-code surface but does not provide explicit OrdGraph migration
notes describing how to replace/delete the local RankQuantSpec fork using this API. This makes
downstream migration ambiguous and risks continued semantic drift between ordgraph and ordvec.
Code

src/bucket_code.rs[R1-32]

+//! Index-free, fixed-composition ordinal **bucket codes** (issue #220).
+//!
+//! This module exposes the reusable bucket-code surface that underpins the
+//! RankQuant family, lifted out of any retrieval index. It lets a caller
+//! derive and validate the per-coordinate bucket codes of a vector (or of an
+//! already-computed rank permutation) **without building a corpus, a packed
+//! payload, or a search structure**. The output is a plain `Vec<u8>` of bucket
+//! ids in `[0, buckets)`.
+//!
+//! Three types model the contract:
+//!
+//! - [`CompositionSpec`] — a *fixed-composition* parameterisation
+//!   (`dim`, `buckets`) with `dim % buckets == 0`, so every bucket receives
+//!   exactly `dim / buckets` coordinates. It owns the code-validation rules:
+//!   length, range, and per-bucket occupancy.
+//! - [`RankQuantSpec`] — the RankQuant-shaped specialisation: `buckets`
+//!   derived as `1 << bits` for `bits ∈ {1, 2, 4}`, matching the crate's
+//!   [`crate::RankQuant`] bit-width domain.
+//! - [`BucketCode`] — a single validated code vector against a
+//!   [`CompositionSpec`], built from raw codes, from a rank permutation
+//!   ([`BucketCode::from_ranks`]), or directly from a float vector
+//!   ([`BucketCode::from_vector`]).
+//!
+//! The codes [`BucketCode::from_vector`] produces are exactly the bucket ids
+//! the crate's rank primitives ([`crate::rank::rank_transform`] +
+//! [`crate::rank::rank_to_bucket`]) assign, so they can be fed straight into
+//! the stateless dense-code contingency surface (`Contingency::new`, issue
+//! #219) without any further transform.
+//!
+//! Ported to reach behavioural parity with the `ordgraph` bucket-code
+//! prototype; the rank math is *not* re-implemented here — it delegates to the
+//! crate's shared [`crate::rank`] primitives.
Evidence
Compliance ID 4 requires clear migration notes enabling OrdGraph to delete its local fork. The added
module and re-export docs describe the API, but do not include any OrdGraph-specific migration
guidance or steps.

Provide clear OrdGraph migration notes enabling deletion of local RankQuantSpec fork
src/bucket_code.rs[1-32]
src/lib.rs[91-97]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A migration note is required so OrdGraph can remove its local `RankQuantSpec`/bucket-code fork and adopt `ordvec::bucket_code` directly.

## Issue Context
The new API is described at a high level, but there are no concrete migration steps, compatibility notes, or examples targeted at OrdGraph.

## Fix Focus Areas
- src/bucket_code.rs[1-32]
- src/lib.rs[91-97]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. No ordgraph-proto parity test 📎 Requirement gap ≡ Correctness
Description
The added tests state they reproduce values from the ordgraph prototype but they do not actually
execute or compare against ordgraph-proto/src/code.rs, so behavioral parity is not automatically
verified. This can allow silent drift between ordvec and ordgraph-proto behavior during migration.
Code

src/bucket_code.rs[R476-486]

+    // ---- ordgraph bucket-code parity gate -------------------------------
+    // Every assertion value below is reproduced verbatim from the reference
+    // `code.rs` #[cfg(test)] module.
+
+    #[test]
+    fn from_ranks_builds_uniform_bucket_code() {
+        let code = BucketCode::from_ranks(8, 4, &[0, 1, 2, 3, 4, 5, 6, 7]).unwrap();
+
+        assert_eq!(code.spec().expected_per_bucket(), 2);
+        assert_eq!(code.codes(), &[0, 0, 1, 1, 2, 2, 3, 3]);
+    }
Evidence
PR Compliance ID 10 requires automated parity tests against the prototype implementation. The added
test module explicitly states values are reproduced from the reference code.rs test module, but
the tests only assert against hardcoded expected values and do not compare against
ordgraph-proto/src/code.rs behavior.

Parity tests against ordgraph-proto/src/code.rs before removing local implementation
src/bucket_code.rs[476-486]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Rule requires automated parity tests against `ordgraph-proto/src/code.rs`, but current tests only hardcode expected values and do not compare against the prototype implementation.

## Issue Context
The tests currently say assertion values are reproduced from the reference prototype, but there is no test that runs both implementations on the same inputs and asserts equality.

## Fix Focus Areas
- src/bucket_code.rs[472-535]

## Implementation notes
- Add a test-only copy of the prototype implementation (e.g., vendored `tests/ordgraph_proto_code.rs` or `include!()` of a pinned snapshot) and write tests that compute bucket codes via both implementations and `assert_eq!` on outputs and on rejection/error cases.
- Cover at least: `from_ranks`, `validate_codes`-equivalent behavior, and `from_vector` behavior (if present in the prototype) on representative inputs.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

3. Downstream evidence use undocumented ✓ Resolved 📎 Requirement gap ⚙ Maintainability
Description
The PR introduces bucket-code derivation/validation intended for evidence systems, but does not
document intended downstream usage patterns (e.g., how an evidence system should produce/ingest
declared codes and validate them). This increases the chance downstream users will re-implement
semantics or use the API inconsistently.
Code

src/bucket_code.rs[R1-28]

+//! Index-free, fixed-composition ordinal **bucket codes** (issue #220).
+//!
+//! This module exposes the reusable bucket-code surface that underpins the
+//! RankQuant family, lifted out of any retrieval index. It lets a caller
+//! derive and validate the per-coordinate bucket codes of a vector (or of an
+//! already-computed rank permutation) **without building a corpus, a packed
+//! payload, or a search structure**. The output is a plain `Vec<u8>` of bucket
+//! ids in `[0, buckets)`.
+//!
+//! Three types model the contract:
+//!
+//! - [`CompositionSpec`] — a *fixed-composition* parameterisation
+//!   (`dim`, `buckets`) with `dim % buckets == 0`, so every bucket receives
+//!   exactly `dim / buckets` coordinates. It owns the code-validation rules:
+//!   length, range, and per-bucket occupancy.
+//! - [`RankQuantSpec`] — the RankQuant-shaped specialisation: `buckets`
+//!   derived as `1 << bits` for `bits ∈ {1, 2, 4}`, matching the crate's
+//!   [`crate::RankQuant`] bit-width domain.
+//! - [`BucketCode`] — a single validated code vector against a
+//!   [`CompositionSpec`], built from raw codes, from a rank permutation
+//!   ([`BucketCode::from_ranks`]), or directly from a float vector
+//!   ([`BucketCode::from_vector`]).
+//!
+//! The codes [`BucketCode::from_vector`] produces are exactly the bucket ids
+//! the crate's rank primitives ([`crate::rank::rank_transform`] +
+//! [`crate::rank::rank_to_bucket`]) assign, so they can be fed straight into
+//! the stateless dense-code contingency surface (`Contingency::new`, issue
+//! #219) without any further transform.
Evidence
Compliance ID 9 requires documentation describing downstream evidence-system use. The module docs
and integration test comments mention contingency consumption, but do not provide downstream usage
guidance or examples for evidence systems consuming/validating declared code artifacts.

Document intended downstream evidence-system use of bucket-code APIs (e.g., OrdGraph)
src/bucket_code.rs[1-32]
tests/bucket_code_contingency.rs[1-14]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The bucket-code API lacks documentation targeted at downstream evidence systems (like OrdGraph): recommended workflows, validation expectations, and minimal examples.

## Issue Context
Although the module doc explains what the API is, it does not explain how to use it for declared-code artifacts (produce, validate, and consume) and how it is intended to integrate with evidence computations.

## Fix Focus Areas
- src/bucket_code.rs[1-32]
- tests/bucket_code_contingency.rs[1-14]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Unbounded bucket count ✓ Resolved 🐞 Bug ≡ Correctness
Description
CompositionSpec::new permits buckets > 256 even though BucketCode stores codes as Vec<u8>,
and from_ranks truncates computed bucket IDs via as u8. This makes some specs unrepresentable
and causes misleading failures (or large histogram allocations) instead of a clear “buckets out of
representable range” error.
Code

src/bucket_code.rs[R271-275]

+            // `rank < dim` and `dim % buckets == 0`, so `rank * buckets / dim`
+            // lands in `[0, buckets)` and `buckets <= 1 << 8` for any code that
+            // can be a `u8`. (For the RankQuant domain `buckets <= 16`.)
+            codes.push((rank * buckets / dim) as u8);
+        }
Evidence
The code claims bucket ids are stored as u8, but does not enforce a corresponding upper bound on
buckets; from_ranks then truncates computed bucket ids during as u8 conversion. Additionally,
histogram allocates based on self.buckets, so permitting arbitrarily large buckets can induce
very large allocations.

src/bucket_code.rs[46-76]
src/bucket_code.rs[107-127]
src/bucket_code.rs[207-218]
src/bucket_code.rs[271-275]
src/rank.rs[74-110]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`BucketCode` stores bucket ids in `Vec<u8>`, so the representable bucket id domain is `0..=255` and the representable bucket *count* domain is `<= 256`. Currently `CompositionSpec::new(dim, buckets)` allows any `buckets >= 2`, and `BucketCode::from_ranks` computes `rank * buckets / dim` then casts to `u8` (`as u8`), which truncates for `buckets > 256`.

This yields specs that cannot be faithfully represented by the chosen storage type, and callers get secondary errors (e.g., `WrongBucketCount`) rather than a direct spec/domain violation.

### Issue Context
- `BucketCode` uses `Vec<u8>` for codes.
- `CompositionSpec::histogram` allocates `vec![0usize; self.buckets]`, so huge `buckets` also implies huge allocation risk.

### Fix Focus Areas
- src/bucket_code.rs[61-76]
- src/bucket_code.rs[107-127]
- src/bucket_code.rs[244-277]

### Suggested change
1. In `CompositionSpec::new`, reject `buckets > (u8::MAX as usize + 1)` (i.e. `> 256`). Reuse `CompositionViolation::InvalidSpec("buckets must be <= 256")` to avoid adding a new public enum variant.
2. In `BucketCode::from_ranks`, replace the `as u8` cast with a checked conversion (e.g. compute in `u64`/`u128` then `u8::try_from(...)`) under the new `buckets <= 256` invariant.
3. Add/adjust a unit test that `CompositionSpec::new(dim, 257)` (or any `>256`) returns `InvalidSpec(...)`.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. bits=8 behavior untested ✓ Resolved 📎 Requirement gap ☼ Reliability
Description
The implementation explicitly rejects bits outside {1,2,4}, but tests do not explicitly cover
the chosen bits=8 behavior. This risks regressions or ambiguity around the documented b=8 decision
during downstream migration.
Code

src/bucket_code.rs[R511-523]

+    fn rankquant_spec_rejects_unsupported_bits_and_large_dims() {
+        assert_eq!(
+            RankQuantSpec::new(8, 3).unwrap_err(),
+            CompositionViolation::InvalidBits { bits: 3 }
+        );
+        assert_eq!(
+            RankQuantSpec::new(u16::MAX as usize + 1, 2).unwrap_err(),
+            CompositionViolation::DimTooLarge {
+                dim: u16::MAX as usize + 1,
+                max: u16::MAX as usize,
+            }
+        );
+    }
Evidence
PR Compliance ID 14 requires tests to cover the b=8 decision outcomes. The added tests only
demonstrate invalid-bit rejection using bits=3 (not bits=8), so the b=8 behavior is not
explicitly pinned by tests.

Tests cover the chosen b=8 behavior and enable deletion of ordgraph local RankQuantSpec fork
src/bucket_code.rs[511-515]
src/bucket_code.rs[664-669]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Rule requires tests that explicitly cover the chosen b=8 behavior. The code rejects unsupported bits, but tests currently demonstrate rejection for `bits=3` (and `from_vector` also uses `bits=3`), not `bits=8`.

## Issue Context
This PR documents that the prototype accepted `bits=8` but ordvec rejects it; tests should pin that behavior so it cannot change silently.

## Fix Focus Areas
- src/bucket_code.rs[511-670]

## Suggested test additions
- Add `assert_eq!(RankQuantSpec::new(8, 8).unwrap_err(), CompositionViolation::InvalidBits { bits: 8 });`
- Add `assert_eq!(BucketCode::from_vector(8, 8, &v).unwrap_err(), CompositionViolation::InvalidBits { bits: 8 });`

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Informational

6. Ignored rustdoc import example 🐞 Bug ⚙ Maintainability
Description
The adoption snippet is marked rust,ignore, so it is never compiled by doctests and can silently
drift from the actual exported API. This weakens the reliability of the migration/import guidance
for downstream users of the experimental surface.
Code

src/bucket_code.rs[R36-38]

+//! ```rust,ignore
+//! use ordvec::{BucketCode, CompositionSpec, RankQuantSpec, CompositionViolation};
+//! ```
Evidence
The docs currently prevent compilation of the import example, even though the crate root exports the
referenced symbols behind the same feature gate; removing ignore would let doctests catch future
API drift.

src/bucket_code.rs[30-48]
src/lib.rs[91-97]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The rustdoc example is tagged `rust,ignore`, so it is never compiled during `cargo test --doc` and may become stale if the public re-exports change.

### Issue Context
This module is already behind `#[cfg(feature = "experimental")]`, and the crate root re-exports the referenced symbols behind the same feature, so the snippet can be safely compiled as a doctest when `experimental` is enabled.

### Fix Focus Areas
- src/bucket_code.rs[36-38]

### Suggested change
- Change the code fence from ` ```rust,ignore` to ` ```rust` (or ` ```rust,no_run`), so the import line is compile-checked when doctests are run with `--features experimental`.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

@qodo-code-review

Copy link
Copy Markdown

PR Summary by Qodo

Add experimental index-free RankQuant BucketCode API with validation
✨ Enhancement 🧪 Tests 🕐 40+ Minutes

Grey Divider

Walkthroughs

Description
• Introduce experimental BucketCode API to derive validated fixed-composition bucket ids.
• Delegate rank math to existing rank primitives; add structured CompositionViolation errors.
• Add parity and integration tests, including Contingency contract cross-check.
Diagram
graph TD
  A["Downstream caller"] --> B["ordvec::BucketCode"] --> C["crate::rank"]
  B --> D["CompositionSpec / RankQuantSpec"]
  B --> E["Contingency::new (from #219)"]
  F["Tests"] --> B
Loading
High-Level Assessment

The following are alternative approaches to this PR:

1. Reuse existing OrdvecError for bucket-code failures
  • ➕ Avoids committing to a second public error type
  • ➕ Simplifies downstream error handling (single error enum)
  • ➖ Loses structured detail (e.g., DuplicateRank{rank}, WrongBucketCount{...}) unless OrdvecError is expanded significantly
  • ➖ May blur separation between index/search errors and pure validation errors
2. Expose a fallible-from-vector API that returns raw Vec + separate validate()
  • ➕ Keeps the core output type simple (just codes)
  • ➕ Lets callers skip validation in trusted pipelines
  • ➖ Weakens the invariant that constructed values are always well-formed
  • ➖ Increases risk of accidentally passing invalid codes into Contingency/candidate scoring
3. Allow bits=8 in RankQuantSpec now (match proto) and gate packed-format support later
  • ➕ Preserves parity with the prototype surface and avoids future breaking changes
  • ➕ Keeps the API more future-proof if 8-bit support lands
  • ➖ Conflicts with the crate’s current RankQuant domain ({1,2,4}) and could imply unsupported packed paths
  • ➖ May pre-empt/complicate the separate decision-making tracked in Decide b=8 RankQuant evidence surface for OrdGraph/Ordscope use #221

Recommendation: The PR’s approach (feature-gated, index-free BucketCode that delegates rank math to crate::rank and enforces fixed-composition invariants) is directionally strong and minimizes duplication. The two key maintainer decisions to settle before stabilizing are (1) whether CompositionViolation remains a distinct public error type versus being folded into OrdvecError, and (2) whether bits=8 should be accepted now for API continuity or intentionally excluded to reflect the current packed RankQuant domain.

Grey Divider

File Changes

Enhancement (1)
bucket_code.rs Add BucketCode API with fixed-composition specs and structured validation errors +713/-0

Add BucketCode API with fixed-composition specs and structured validation errors

• Introduces CompositionSpec and RankQuantSpec to define fixed-composition constraints (dim/buckets, bits domain) and validate code balance. Adds BucketCode constructors from raw codes, explicit ranks, and directly from vectors via crate::rank primitives, plus top-bucket bitmap extraction. Defines CompositionViolation as a structured, public error enum and adds extensive unit tests for parity and validation behavior.

src/bucket_code.rs


Tests (1)
bucket_code_contingency.rs Add experimental integration test for BucketCode output vs Contingency contract +82/-0

Add experimental integration test for BucketCode output vs Contingency contract

• Adds feature-gated integration tests that assert BucketCode::from_vector produces in-range codes suitable for the upcoming Contingency::new API (#219). Includes a small reference implementation of Contingency’s histogram/diagonal agreement behavior to pin the expected contract until #219 merges.

tests/bucket_code_contingency.rs


Other (1)
lib.rs Feature-gate and re-export bucket_code module at crate root +11/-0

Feature-gate and re-export bucket_code module at crate root

• Adds the experimental bucket_code module and re-exports BucketCode, CompositionSpec, RankQuantSpec, and CompositionViolation behind the experimental feature flag. Documents the intent to keep the surface experimental pending stabilization decisions.

src/lib.rs


Grey Divider

Qodo Logo

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the bucket_code module under the experimental feature flag, which implements index-free, fixed-composition ordinal bucket codes. It includes the CompositionSpec, RankQuantSpec, and BucketCode structures, along with comprehensive unit and integration tests. The review feedback identifies two key issues: first, CompositionSpec::new should reject buckets > 256 to prevent silent truncation when casting bucket IDs to u8; second, the calculation rank * buckets in from_ranks should cast operands to u128 to prevent integer overflow.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/bucket_code.rs
Comment thread src/bucket_code.rs Outdated
@codecov

codecov Bot commented Jun 14, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Comment thread src/bucket_code.rs Outdated
@Fieldnote-Echo

Copy link
Copy Markdown
Member Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 5af8411

Comment thread src/bucket_code.rs Outdated
…226)

- CompositionSpec::new now rejects buckets>256 (InvalidSpec) — codes are u8, so
  >256 would silently truncate a computed id via 'as u8' and fail later with a
  misleading WrongBucketCount.
- from_ranks computes rank*buckets in u64 to avoid usize overflow on 32-bit/
  wasm32 for large dim (matches util.rs checked-math discipline).
- Tests: >256-buckets rejection + boundary; from_vector unsupported-bits path.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
qodo: the PR added the bucket-code surface without explicit migration steps for
deleting ordgraph-proto's local CompositionSpec/RankQuantSpec/BucketCode fork.
Add a 'Migration from ordgraph-proto' subsection: the drop-in import + delete
src/code.rs steps, the rank-math-delegation note, and the two intentional
deviations (bits=8 deferred to #221, buckets<=256).

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@Fieldnote-Echo

Copy link
Copy Markdown
Member Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit fd1efe8

@Fieldnote-Echo

Fieldnote-Echo commented Jun 14, 2026

Copy link
Copy Markdown
Member Author

On the parity-test finding — deliberate, documented deferral.

This finding asks for a test asserting parity against an external prototype implementation that is not yet a published or git-reachable dependency of this crate. We are intentionally not adding that test in this PR:

  • Pulling that implementation in here would mean vendoring or hard-coding a snapshot of it, which would rot immediately and invert the dependency directionordvec is the lower layer and must not depend on a higher-level consumer.
  • Cross-crate parity belongs downstream, in the consumer crate, once that crate depends on a published ordvec. That is where the parity test can pin real behavior against a real dependency.

The ordvec-side contract in this PR is fully covered by its own unit/integration tests. This finding is an accepted, documented deferral rather than an oversight; it is expected to remain un-crossed.

Add a focused test `rankquant_spec_rejects_bits_8` that asserts both
`RankQuantSpec::new(8, 8)` and `BucketCode::from_vector(8, 8, &v)`
return `CompositionViolation::InvalidBits { bits: 8 }`, pinning the
deliberate b=8 exclusion so the boundary cannot regress silently.

Addresses the qodo finding: "bits=8 behavior untested" (PR #226).

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…e references)

Replace the "Migration from ordgraph-proto" rustdoc section with a
self-contained "Adopting this API" adoption note describing how
external callers can replace local forks with the ordvec types.
Rename the "ordgraph bucket-code parity gate" test block comment to
"bucket-code behavioral contract", describing the invariants under
test rather than a migration relationship. No production behaviour,
test logic, or assert values changed.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@Fieldnote-Echo

Copy link
Copy Markdown
Member Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented Jun 15, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 1c8d5ab

…-code

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

# Conflicts:
#	src/lib.rs
…erge)

The cross-API integration test stood in a local re-implementation of
`Contingency::new`'s histogram + `diagonal_agreement` because the contingency
module was not yet in this branch's tree (it shipped via the sibling #219 work).
Now that the contingency surface is merged in, do the mechanical swap the test's
own doc anticipated: drop the reference re-implementation and assert directly
against the real `Contingency::new(..).diagonal_agreement()`. Same acceptance
property (#220#219): from_vector output is a valid, balanced contingency
input. Both cases pass.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@project-navi-bot Navi Bot (project-navi-bot) merged commit b46d855 into main Jun 15, 2026
38 checks passed
@project-navi-bot Navi Bot (project-navi-bot) deleted the feat/rankquant-bucket-code branch June 15, 2026 01:31
Navi Bot (project-navi-bot) added a commit that referenced this pull request Jun 15, 2026
… helpers (#222) (#227)

* feat(experimental): reusable RankQuant bucket-code API (#220)

Port ordgraph-proto/src/code.rs into ordvec as an index-free bucket-code surface
so downstream evidence systems consume declared codes instead of forking kernel
math. experimental-gated.

API (ordvec::{CompositionSpec, RankQuantSpec, BucketCode, CompositionViolation}):
- CompositionSpec: fixed-composition validation (dim/buckets, expected_per_bucket,
  histogram, validate_codes).
- RankQuantSpec: dim+bits in {1,2,4}.
- BucketCode: new(spec, codes) validated; from_ranks; from_vector(dim, bits, &[f32])
  — the new ordvec-primitive path (rank_transform + rank_to_bucket), output feeds
  Contingency::new (#219); top_bitmap.
- CompositionViolation: structured error enum (Display + Error).

Rank math not duplicated (delegates to crate::rank). Parity tests reproduce the
proto's exact assertion values; +validation + from_vector<->contingency cross-check.
Verified: fmt/clippy(-D warnings, experimental+default)/test (69 lib + 2 integration,
default-build gating confirmed) green. No new deps. Refs #220.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

* feat(experimental): constant-weight bitmap overlap + finite null-tail helpers (#222)

Port ordgraph-proto/src/bitmap.rs into ordvec (stacked on #220 BucketCode),
experimental-gated. The ordinal-bitmap overlap + finite-null surface downstream
evidence systems consume.

ordvec::{ConstantWeightBitmap, PackedConstantWeightBitmap, BitmapNull,
top_group_overlap_vector, choose}:
- ConstantWeightBitmap::from_top_bucket(&BucketCode) + overlap (reference path).
- PackedConstantWeightBitmap::from_bucket_range/from_top_group; overlap routes
  through crate::util::and_popcount (the production popcount-AND primitive, not a
  hand-rolled loop).
- BitmapNull: space_size=C(dim,w), fiber_count(o)=C(w,o)*C(dim-w,w-o),
  tail_count(threshold)=Σ_{o>=threshold} fiber_count(o); upper-tail = tail/space.
- choose: u128 binomial.

Parity: reproduces proto assertion values; fiber-partition identity
(Σ fiber_count == space_size) verified for 7 (dim,weight); overlap parity
(bool == packed == naive); choose vs known binomials + symmetry. No new deps;
zero unsafe. Verified: fmt/clippy(-D warnings, exp+default)/test(exp+default
gating)/MSRV 1.89 green. Refs #222.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

* fix(bucket_code): bound buckets<=256 + u64 bucket math (gemini/qodo) (#226)

- CompositionSpec::new now rejects buckets>256 (InvalidSpec) — codes are u8, so
  >256 would silently truncate a computed id via 'as u8' and fail later with a
  misleading WrongBucketCount.
- from_ranks computes rank*buckets in u64 to avoid usize overflow on 32-bit/
  wasm32 for large dim (matches util.rs checked-math discipline).
- Tests: >256-buckets rejection + boundary; from_vector unsupported-bits path.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

* fix(const_weight_bitmap): fail-loud overflow in choose/fiber_count (gemini/qodo) (#222)

Resolves the flagged u128-overflow decision in the safe direction (the bots and
the maintainer both wanted fail-loud before stable):
- choose(): gcd-cancellation of each (n-i)/(i+1) factor (extends the
  representable range to the full set of C(n,k) that fit u128) + checked_mul,
  so a true overflow PANICS (both debug and release) instead of silently
  wrapping to a wrong count.
- fiber_count(): checked_mul on the two binomials.
- Overflow docs updated (deliberate divergence from the proto's wrap-in-release).
- Tests: range-extension via Pascal's identity (C(128,64)); should_panic on
  C(300,150) overflow.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

* docs(bucket_code): add ordgraph-proto migration note (qodo) (#220)

qodo: the PR added the bucket-code surface without explicit migration steps for
deleting ordgraph-proto's local CompositionSpec/RankQuantSpec/BucketCode fork.
Add a 'Migration from ordgraph-proto' subsection: the drop-in import + delete
src/code.rs steps, the rank-math-delegation note, and the two intentional
deviations (bits=8 deferred to #221, buckets<=256).

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

* test: pin bits=8 rejection for RankQuantSpec and BucketCode::from_vector

Add a focused test `rankquant_spec_rejects_bits_8` that asserts both
`RankQuantSpec::new(8, 8)` and `BucketCode::from_vector(8, 8, &v)`
return `CompositionViolation::InvalidBits { bits: 8 }`, pinning the
deliberate b=8 exclusion so the boundary cannot regress silently.

Addresses the qodo finding: "bits=8 behavior untested" (PR #226).

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

* fix: harden bitmap overlap and BitmapNull against three qodo findings

- PackedConstantWeightBitmap::overlap: add assert!(dim <= u32::MAX)
  before and_popcount, which accumulates into u32 and would overflow
  silently for very large dims (Finding 1 / Correctness).
- top_group_overlap_vector: add assert_eq!(lhs.spec(), rhs.spec())
  precondition with a clear message; comparing bitmaps across specs
  is meaningless (Finding 2 / Maintainability).
- BitmapNull::tail_probability(observed): new method returning
  tail_count(observed) / space_size as f64, the exact hypergeometric
  upper-tail P(overlap >= observed); rustdoc + unit tests pin boundary
  values and a known exact result (Finding 3 / Requirement gap).

Tests added: packed_overlap_within_u32_max_does_not_panic,
top_group_overlap_vector_panics_on_mismatched_spec,
top_group_overlap_vector_passes_on_matching_spec,
tail_probability_boundary_values, tail_probability_known_value,
tail_probability_is_in_unit_interval_and_monotone.

Signed-off-by: Nelson Spence <projectnavi@pm.me>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

* docs: product-clean the bucket-code surface (remove internal-prototype references)

Replace the "Migration from ordgraph-proto" rustdoc section with a
self-contained "Adopting this API" adoption note describing how
external callers can replace local forks with the ordvec types.
Rename the "ordgraph bucket-code parity gate" test block comment to
"bucket-code behavioral contract", describing the invariants under
test rather than a migration relationship. No production behaviour,
test logic, or assert values changed.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

* docs(const_weight_bitmap): product-clean the bitmap surface (remove internal-prototype references)

Replace the module-level "ported to reach parity with..." paragraph with a
product-clean adoption note following the pattern established for bucket_code.rs
in PR #220. Replace the internal "parity gate" test comment with a behavioral
description that explains what the pinned literals assert without naming any
internal repository.

No production behavior change; no test logic or assertion values changed.

Signed-off-by: Nelson Spence <projectnavi@pm.me>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

* fix: honest exactness + gcd-reduced f64 for BitmapNull::tail_probability

Two qodo findings on the finite constant-weight null:

- (Correctness) tail_probability advertised an 'exact' probability but computed
  `tail_count as f64 / space_size as f64`, double-rounding two large u128 counts;
  for C(dim, weight) past 2^53 that can mis-round (e.g. report 1.0 when the true
  value is just under 1). Reframe: the EXACT surface is the u128 `tail_count` /
  `space_size`; tail_probability returns the nearest f64, now gcd-reducing the
  two counts first so the conversion is exact whenever the reduced pair fits an
  f64 mantissa, and short-circuiting the observed==0 (1.0) and observed>weight
  (0.0) boundaries with no division. New test pins tail_probability == nearest
  f64 of the gcd-reduced exact ratio across cases up to C(100,50) ~ 1e29.

- (Requirement gap) document that this is an in-model finite null under the
  idealized uniform constant-weight assumption — a selectivity / false-positive
  statement, NOT a real-corpus or encoder guarantee and not corpus-calibrated
  evidence strength.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

---------

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Signed-off-by: Nelson Spence <projectnavi@pm.me>
Co-authored-by: Navi Bot <267427491+project-navi-bot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants