Skip to content

feat(experimental): constant-weight bitmap overlap + finite null-tail helpers (#222)#227

Merged
Navi Bot (project-navi-bot) merged 12 commits into
mainfrom
feat/bitmap-null-helpers
Jun 15, 2026
Merged

feat(experimental): constant-weight bitmap overlap + finite null-tail helpers (#222)#227
Navi Bot (project-navi-bot) merged 12 commits into
mainfrom
feat/bitmap-null-helpers

Conversation

@Fieldnote-Echo

Copy link
Copy Markdown
Member

What

Ports ordgraph-proto/src/bitmap.rs into ordvec (#222), experimental-gated. Stacked on #226 (uses #220's BucketCode) — merge #226 first, this auto-retargets to main.

ordvec::{ConstantWeightBitmap, PackedConstantWeightBitmap, BitmapNull, top_group_overlap_vector, choose}:

  • ConstantWeightBitmapfrom_top_bucket(&BucketCode) + overlap (bool reference path).
  • PackedConstantWeightBitmapfrom_bucket_range / from_top_group; overlap routes through crate::util::and_popcount (the production popcount-AND primitive, not a hand-rolled loop).
  • BitmapNull (finite hypergeometric null) — space_size = C(dim,w), fiber_count(o) = C(w,o)·C(dim−w, w−o), tail_count(t) = Σ_{o≥t} fiber_count(o); upper-tail prob = tail/space.
  • choose — u128 binomial.

Parity: reproduces the proto's assertion values; the fiber-partition identity Σ fiber_count == space_size is verified for 7 (dim,weight) pairs (strong validation of the null math); overlap parity (bool == packed == naive); choose vs known binomials + symmetry. No new deps, zero unsafe. Verified locally: fmt / clippy -D warnings (exp+default) / test (exp+default gating) / MSRV 1.89.

⚠️ Decisions for maintainer sign-off

  1. u128 overflow (most important)choose/space_size/fiber_count/tail_count inherit the proto's behaviour: the running product panics in debug, silently wraps in release once C(dim,weight) exceeds u128::MAX (first at C(126,63)). Documented in a module # Overflow note, no guard. My recommendation: before any stable promotion, harden to fail-loud (a checked-based panic in release too, or a checked_choose -> Option) — a public combinatorial that silently returns a wrong count in release is at odds with the crate's fail-loud-primitive discipline. Kept proto-parity in this draft; say the word and I'll harden it.
  2. Error style divergence from Expose reusable RankQuant bucket-code API for downstream evidence systems #220 — these overlap/null primitives are fail-loud (assert!) on contract violation, matching ordvec's existing bitmap-kernel discipline (and_popcount etc.), NOT Expose reusable RankQuant bucket-code API for downstream evidence systems #220's fail-soft Result/CompositionViolation. Intentional split: construction validates softly, hot-path ops assert. Confirm.
  3. choose as a public crate-root free fn (ordvec::choose) vs pub(crate)/inlined into BitmapNull.
  4. and_popcount visibility coupling — packed overlap binds to the pub(crate) popcount primitive (fine while experimental).
  5. Experimental vs stable gating — behind experimental for this draft.

Refs #222. Targets 0.6 (may land in 0.5.0 — maintainer decides).

Port ordgraph-proto/src/code.rs into ordvec as an index-free bucket-code surface
so downstream evidence systems consume declared codes instead of forking kernel
math. experimental-gated.

API (ordvec::{CompositionSpec, RankQuantSpec, BucketCode, CompositionViolation}):
- CompositionSpec: fixed-composition validation (dim/buckets, expected_per_bucket,
  histogram, validate_codes).
- RankQuantSpec: dim+bits in {1,2,4}.
- BucketCode: new(spec, codes) validated; from_ranks; from_vector(dim, bits, &[f32])
  — the new ordvec-primitive path (rank_transform + rank_to_bucket), output feeds
  Contingency::new (#219); top_bitmap.
- CompositionViolation: structured error enum (Display + Error).

Rank math not duplicated (delegates to crate::rank). Parity tests reproduce the
proto's exact assertion values; +validation + from_vector<->contingency cross-check.
Verified: fmt/clippy(-D warnings, experimental+default)/test (69 lib + 2 integration,
default-build gating confirmed) green. No new deps. Refs #220.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
… helpers (#222)

Port ordgraph-proto/src/bitmap.rs into ordvec (stacked on #220 BucketCode),
experimental-gated. The ordinal-bitmap overlap + finite-null surface downstream
evidence systems consume.

ordvec::{ConstantWeightBitmap, PackedConstantWeightBitmap, BitmapNull,
top_group_overlap_vector, choose}:
- ConstantWeightBitmap::from_top_bucket(&BucketCode) + overlap (reference path).
- PackedConstantWeightBitmap::from_bucket_range/from_top_group; overlap routes
  through crate::util::and_popcount (the production popcount-AND primitive, not a
  hand-rolled loop).
- BitmapNull: space_size=C(dim,w), fiber_count(o)=C(w,o)*C(dim-w,w-o),
  tail_count(threshold)=Σ_{o>=threshold} fiber_count(o); upper-tail = tail/space.
- choose: u128 binomial.

Parity: reproduces proto assertion values; fiber-partition identity
(Σ fiber_count == space_size) verified for 7 (dim,weight); overlap parity
(bool == packed == naive); choose vs known binomials + symmetry. No new deps;
zero unsafe. Verified: fmt/clippy(-D warnings, exp+default)/test(exp+default
gating)/MSRV 1.89 green. Refs #222.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@qodo-code-review

qodo-code-review Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (1)

Context used

Grey Divider


Action required

1. tail_probability missing null boundary ✓ Resolved 📎 Requirement gap ⚙ Maintainability
Description
BitmapNull::tail_probability documents an exact probability under a uniform constant-weight null
but does not explicitly state this is an in-model finite null and not a real-corpus guarantee. This
can mislead downstream users into interpreting the probability as corpus-calibrated evidence
strength.
Code

src/const_weight_bitmap.rs[R348-355]

+    /// Exact upper-tail probability `P(overlap >= observed)` under the uniform
+    /// constant-weight null.
+    ///
+    /// Returns `tail_count(observed) / space_size` as an `f64`. This is the
+    /// fraction of all weight-`weight` bitmaps whose overlap with a fixed
+    /// weight-`weight` bitmap is at least `observed` — the exact hypergeometric
+    /// upper tail at the given threshold.
+    ///
Evidence
PR Compliance ID 9 requires explicit documentation that the uniform finite null is an in-model
construct and not a real-corpus guarantee. The tail_probability docs describe the probability
purely as an exact hypergeometric upper tail under the null, without the required boundary
statement.

Optionally expose finite-null helpers for constant-weight bitmap overlap with correct claim boundary
src/const_weight_bitmap.rs[348-355]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`BitmapNull` exposes a finite uniform null-model probability, but the docs do not explicitly state the required boundary: this is an in-model finite null, not a real-corpus guarantee.

## Issue Context
Compliance requires that if finite-null helpers are exposed, documentation must clearly prevent over-claiming (e.g., implying corpus-level guarantees).

## Fix Focus Areas
- src/const_weight_bitmap.rs[348-355]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. choose overflows too early ✓ Resolved 🐞 Bug ≡ Correctness
Description
choose() multiplies before dividing (acc = acc * (n-i) / (i+1)), so the intermediate product can
overflow u128 even when the final C(n,k) value would still fit, yielding wrong results (wrap in
release, panic in debug). This breaks BitmapNull::{space_size,fiber_count,tail_count} for
parameter ranges that should be representable and also contradicts the module docs that describe
overflow only once C(dim,weight) exceeds u128::MAX.
Code

src/const_weight_bitmap.rs[R319-327]

+pub fn choose(n: usize, k: usize) -> u128 {
+    if k > n {
+        return 0;
+    }
+    let k = k.min(n - k);
+    let mut acc = 1u128;
+    for i in 0..k {
+        acc = acc * (n - i) as u128 / (i + 1) as u128;
+    }
Evidence
The module docs claim overflow only when the binomial itself exceeds u128::MAX, but the
implementation multiplies before dividing, which can overflow earlier and thus return incorrect
coefficients that feed directly into the null’s space_size and tail/fiber counts.

src/const_weight_bitmap.rs[37-45]
src/const_weight_bitmap.rs[312-329]
src/const_weight_bitmap.rs[268-309]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`choose(n,k)` uses a multiply-then-divide recurrence in `u128`:

```rust
acc = acc * (n - i) as u128 / (i + 1) as u128;
```

This can overflow at the multiplication step even when the mathematically correct `C(n,k)` still fits in `u128` (because the intermediate equals `C(n,i+1) * (i+1)`). In release, that overflow wraps and yields incorrect binomial coefficients, which then corrupts `BitmapNull::{space_size,fiber_count,tail_count}`.

Additionally, the module-level `# Overflow` docs currently imply overflow happens only when `C(dim,weight) > u128::MAX`, which is not true for this implementation.

### Issue Context
No new dependencies are desired; keep it `no unsafe` and compatible with the crate’s “fail-loud” discipline.

### Fix Focus Areas
- src/const_weight_bitmap.rs[37-45]
- src/const_weight_bitmap.rs[312-329]
- src/const_weight_bitmap.rs[268-309]

### Suggested fix
Implement a checked/cancellation-based binomial that avoids intermediate overflow when the final result fits, e.g.:
- Reduce numerator/denominator by `gcd` before multiplying.
- Use `checked_mul` and either:
 - panic with a clear message on overflow in both debug and release, or
 - expose `checked_choose(...) -> Option<u128>` and have `choose()` call it and `expect(...)`.

Update the overflow docs to reflect the true failure condition (intermediate overflow is possible before `C(n,k)` itself exceeds `u128::MAX`).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. No ordgraph-proto parity tests 📎 Requirement gap ☼ Reliability
Description
The added tests reproduce prototype literals but do not automatically compare behavior against
ordgraph-proto/src/bitmap.rs, so changes in the prototype would not be detected. This does not
meet the requirement for automated parity testing against the reference implementation.
Code

src/const_weight_bitmap.rs[R361-415]

+    // ---- ordgraph bitmap parity gate ------------------------------------
+    // Assertion values reproduced verbatim from the reference
+    // `ordgraph-proto/src/bitmap.rs` #[cfg(test)] module. The prototype
+    // cross-checked the overlap against `Contingency::top_overlap`; here the
+    // same literal (1, and [1, 3, 8]) is asserted directly and cross-checked
+    // against the naive shared-set-bit count instead.
+
+    #[test]
+    fn top_bitmap_has_expected_constant_weight() {
+        let code = code(&[0, 0, 1, 1, 2, 2, 3, 3]);
+        let bitmap = ConstantWeightBitmap::from_top_bucket(&code);
+
+        assert_eq!(bitmap.dim(), 8);
+        assert_eq!(bitmap.weight(), 2);
+    }
+
+    #[test]
+    fn top_overlap_matches_naive_top_top_count() {
+        let query = code(&[0, 0, 1, 1, 2, 2, 3, 3]);
+        let doc = code(&[0, 1, 1, 2, 2, 3, 3, 0]);
+        let query_bitmap = ConstantWeightBitmap::from_top_bucket(&query);
+        let doc_bitmap = ConstantWeightBitmap::from_top_bucket(&doc);
+
+        // Prototype literal: top-top overlap is 1.
+        assert_eq!(query_bitmap.overlap(&doc_bitmap), 1);
+    }
+
+    #[test]
+    fn packed_top_overlap_matches_naive_top_top_count() {
+        let query = code(&[0, 0, 1, 1, 2, 2, 3, 3]);
+        let doc = code(&[0, 1, 1, 2, 2, 3, 3, 0]);
+        let query_bitmap = PackedConstantWeightBitmap::from_top_group(&query, 1);
+        let doc_bitmap = PackedConstantWeightBitmap::from_top_group(&doc, 1);
+
+        assert_eq!(query_bitmap.dim(), 8);
+        assert_eq!(query_bitmap.weight(), 2);
+        // Prototype literal: top-top overlap is 1.
+        assert_eq!(query_bitmap.overlap(&doc_bitmap), 1);
+        assert_eq!(
+            query_bitmap.overlap(&doc_bitmap),
+            naive_packed_overlap(&query_bitmap, &doc_bitmap)
+        );
+    }
+
+    #[test]
+    fn top_group_overlap_vector_uses_popcount_backed_bitmaps() {
+        let query = code(&[0, 0, 1, 1, 2, 2, 3, 3]);
+        let doc = code(&[0, 1, 1, 2, 2, 3, 3, 0]);
+
+        // Prototype literal.
+        assert_eq!(
+            top_group_overlap_vector(&query, &doc, &[1, 2, 4]),
+            [1, 3, 8]
+        );
+    }
Evidence
PR Compliance ID 10 requires automated parity tests that compare against
ordgraph-proto/src/bitmap.rs. The added tests explicitly use prototype literals (not a live
comparison), so they are not parity tests against the current reference behavior.

Add parity tests vs ordgraph-proto bitmap.rs behavior
src/const_weight_bitmap.rs[361-415]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Rule requires automated parity tests that compare `ordvec` behavior directly against the current `ordgraph-proto/src/bitmap.rs` behavior. Current tests only assert copied literal values, which is not an automated cross-implementation parity check.

## Issue Context
The tests explicitly state values are reproduced verbatim from `ordgraph-proto/src/bitmap.rs`, but they do not execute the prototype implementation as a reference.

## Fix Focus Areas
- src/const_weight_bitmap.rs[361-415]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

4. Inexact tail_probability result ✓ Resolved 🐞 Bug ≡ Correctness
Description
BitmapNull::tail_probability is documented as an exact upper-tail probability, but it computes
tail_count/space_size by casting u128 counts to f64, which can round distinct counts to the
same float for large C(dim,weight) and return incorrect probabilities (e.g., 1.0 when the true
value is < 1.0). This breaks the advertised exactness of the finite null in the
large-but-still-u128-representable regime.
Code

src/const_weight_bitmap.rs[R348-379]

+    /// Exact upper-tail probability `P(overlap >= observed)` under the uniform
+    /// constant-weight null.
+    ///
+    /// Returns `tail_count(observed) / space_size` as an `f64`. This is the
+    /// fraction of all weight-`weight` bitmaps whose overlap with a fixed
+    /// weight-`weight` bitmap is at least `observed` — the exact hypergeometric
+    /// upper tail at the given threshold.
+    ///
+    /// Returns `0.0` for `observed > weight` (impossible overlap) and `1.0`
+    /// for `observed == 0` (all bitmaps overlap in `>= 0` positions).
+    ///
+    /// # Example
+    /// ```
+    /// # #[cfg(feature = "experimental")] {
+    /// use ordvec::const_weight_bitmap::BitmapNull;
+    /// let null = BitmapNull::new(10, 3);
+    /// // All bitmaps have overlap >= 0.
+    /// assert_eq!(null.tail_probability(0), 1.0);
+    /// // No bitmap overlaps in more than weight positions.
+    /// assert_eq!(null.tail_probability(4), 0.0);
+    /// // The probability is in [0, 1].
+    /// let p = null.tail_probability(2);
+    /// assert!(p >= 0.0 && p <= 1.0);
+    /// # }
+    /// ```
+    pub fn tail_probability(&self, observed: usize) -> f64 {
+        let space = self.space_size();
+        if space == 0 {
+            return 0.0;
+        }
+        self.tail_count(observed) as f64 / space as f64
+    }
Evidence
The function is explicitly documented as “Exact upper-tail probability” but returns
tail_count(observed) as f64 / space as f64, which necessarily loses integer precision for large
u128 values. The module-level overflow note also establishes that counts are intended to range up
to the full u128 representable region, where such lossy casts become relevant.

src/const_weight_bitmap.rs[348-379]
src/const_weight_bitmap.rs[50-61]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`BitmapNull::tail_probability` is described as an **exact** probability, but the implementation converts `u128` counts to `f64` (`as f64`) and divides. For large representable combinatorial counts, `f64` cannot represent all integers exactly, so the returned probability is only approximate and can be materially wrong.

### Issue Context
The module emphasizes exact finite combinatorial counts in `u128` and fail-loud overflow behavior. Returning an `f64` is fine as a convenience API, but the doc contract should not claim exactness unless you return an exact representation.

### Fix Focus Areas
- src/const_weight_bitmap.rs[348-379]

### Suggested fix options
1. **Make the contract true (recommended):**
  - Add `tail_ratio(observed) -> (u128, u128)` (or a small struct) returning `(tail_count, space_size)` exactly, and either:
    - keep `tail_probability` as a convenience wrapper clearly documented as *approximate*, or
    - rename it to `tail_probability_f64` / `tail_probability_approx`.
2. **Doc-only fix (minimal):**
  - Update docs to state the value is an `f64` approximation derived from exact counts, and may lose precision for large `dim/weight` even when counts fit in `u128`.

(Option 1 best preserves the “exact finite null” theme while still offering a convenient float API.)

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. u32 overflow in overlap ✓ Resolved 🐞 Bug ≡ Correctness
Description
PackedConstantWeightBitmap::overlap() returns usize but delegates to util::and_popcount()
which accumulates into a u32; if the true overlap exceeds u32::MAX (~4.29B), the count will
overflow (panic in debug, wrap in release) and the returned overlap becomes incorrect. This is
reachable because CompositionSpec::new does not cap dim, so callers can construct very large
BucketCodes even though the bitmap index path caps dim to u16.
Code

src/const_weight_bitmap.rs[R198-201]

+    pub fn overlap(&self, other: &Self) -> usize {
+        assert_eq!(self.dim, other.dim, "bitmap dimensions must match");
+        and_popcount(&self.words, &other.words) as usize
+    }
Evidence
The packed overlap path uses and_popcount and casts its u32 result to usize, but
and_popcount itself sums into u32 (so overflow occurs before the cast). Separately,
CompositionSpec::new imposes no upper bound on dim, so constructing a BucketCode with very
large dim is not prohibited by the bucket-code layer.

src/const_weight_bitmap.rs[198-201]
src/util.rs[134-166]
src/util.rs[201-208]
src/bucket_code.rs[61-76]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`PackedConstantWeightBitmap::overlap()` calls `crate::util::and_popcount()`, which returns `u32` and internally sums popcounts into a `u32`. For bitmaps with overlap > `u32::MAX`, this will overflow (debug panic / release wrap), producing incorrect overlap values even though the API returns `usize`.

## Issue Context
This module is built on `BucketCode`/`CompositionSpec`, and `CompositionSpec::new` does not cap `dim`, so callers can construct codes at dimensions far above the bitmap-index `MAX_DIM`.

## Fix Focus Areas
- src/const_weight_bitmap.rs[198-201]
- src/util.rs[134-166]
- src/util.rs[201-208]
- src/bucket_code.rs[61-76]

### Suggested fix options
1) **Fail-loud guard (minimal change):** in `PackedConstantWeightBitmap::overlap` (or constructors), add `assert!(self.dim <= u32::MAX as usize, "..." )` before calling `and_popcount`, and use `usize::from(and_popcount(...))`.
2) **Correct for all dims:** introduce a new util reduction that accumulates into `u64`/`usize` (including NEON/wasm paths), and call that from `PackedConstantWeightBitmap::overlap`.
3) **Tighten domain:** if intended domain is `dim <= crate::rank_io::MAX_DIM`, assert that instead (and document it).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


6. BitmapNull missing tail probability ✓ Resolved 📎 Requirement gap ≡ Correctness
Description
BitmapNull exposes space_size, fiber_count, and tail_count but does not provide an explicit
upper-tail probability helper as required. Relying on callers to manually divide
tail_count/space_size risks inconsistent downstream usage and does not meet the stated API surface
requirement.
Code

src/const_weight_bitmap.rs[R225-309]

+/// Idealized uniform constant-weight bitmap null.
+///
+/// Models a uniform distribution over **all** weight-`weight` bitmaps in `dim`
+/// positions (there are `C(dim, weight)` of them). The overlap of a random such
+/// bitmap with a fixed weight-`weight` bitmap is hypergeometric; this type
+/// exposes the exact finite counts:
+///
+/// - [`Self::space_size`] = `C(dim, weight)` — the total number of bitmaps.
+/// - [`Self::fiber_count`] = the number of bitmaps overlapping a fixed one in
+///   exactly `overlap` positions (the hypergeometric numerator).
+/// - [`Self::tail_count`] = the upper-tail sum `Σ_{o>=threshold} fiber_count(o)`.
+///
+/// The fibers partition the space, so `Σ_{o=0..=weight} fiber_count(o) ==
+/// space_size`, and `tail_count(threshold) / space_size` is the exact upper-tail
+/// probability of seeing an overlap `>= threshold` under the null.
+#[derive(Clone, Debug, PartialEq, Eq)]
+pub struct BitmapNull {
+    dim: usize,
+    weight: usize,
+}
+
+impl BitmapNull {
+    /// Build the null over weight-`weight` bitmaps in `dim` positions.
+    ///
+    /// # Panics
+    /// Panics if `dim == 0` or `weight > dim`.
+    pub fn new(dim: usize, weight: usize) -> Self {
+        assert!(dim > 0, "dim must be > 0");
+        assert!(weight <= dim, "weight must be <= dim");
+        Self { dim, weight }
+    }
+
+    /// The number of positions.
+    pub fn dim(&self) -> usize {
+        self.dim
+    }
+
+    /// The constant bitmap weight.
+    pub fn weight(&self) -> usize {
+        self.weight
+    }
+
+    /// Total number of weight-`weight` bitmaps: `C(dim, weight)`.
+    pub fn space_size(&self) -> u128 {
+        choose(self.dim, self.weight)
+    }
+
+    /// Number of weight-`weight` bitmaps overlapping a fixed weight-`weight`
+    /// bitmap in exactly `overlap` positions.
+    ///
+    /// This is the hypergeometric numerator
+    /// `C(weight, overlap) * C(dim - weight, weight - overlap)`: choose which
+    /// `overlap` of the `weight` set bits coincide, then place the remaining
+    /// `weight - overlap` set bits among the `dim - weight` zero positions.
+    /// Returns `0` for an infeasible `overlap` (more than `weight`, or leaving
+    /// more remaining set bits than there are zero positions).
+    pub fn fiber_count(&self, overlap: usize) -> u128 {
+        if overlap > self.weight {
+            return 0;
+        }
+        let outside = self.weight - overlap;
+        if outside > self.dim - self.weight {
+            return 0;
+        }
+        choose(self.weight, overlap) * choose(self.dim - self.weight, outside)
+    }
+
+    /// Upper-tail count `Σ_{o>=threshold} fiber_count(o)`.
+    ///
+    /// `tail_count(0) == space_size` (every bitmap overlaps in `>= 0`
+    /// positions), and `tail_count(threshold) == 0` for `threshold > weight`
+    /// (no bitmap overlaps a weight-`weight` bitmap in more than `weight`
+    /// positions). Monotone non-increasing in `threshold`. Divide by
+    /// [`Self::space_size`] for the exact upper-tail probability.
+    pub fn tail_count(&self, threshold: usize) -> u128 {
+        if threshold == 0 {
+            return self.space_size();
+        }
+        if threshold > self.weight {
+            return 0;
+        }
+        (threshold..=self.weight)
+            .map(|overlap| self.fiber_count(overlap))
+            .sum()
+    }
Evidence
PR Compliance ID 9 requires finite-null helpers to include space size, fiber count, and upper-tail
count/probability. The code provides space_size, fiber_count, and tail_count, but no method
that returns the upper-tail probability directly.

Optionally expose finite-null model helpers for constant-weight bitmap overlap
src/const_weight_bitmap.rs[225-309]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Compliance requires finite-null helpers to include an upper-tail count/probability helper. The current API provides `tail_count` and instructs callers to divide by `space_size`, but does not expose a dedicated probability helper.

## Issue Context
`BitmapNull` is intended as a downstream evidence helper; an explicit probability method reduces ambiguity and downstream re-implementation.

## Fix Focus Areas
- src/const_weight_bitmap.rs[225-309]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (1)
7. Unenforced same-spec precondition ✓ Resolved 🐞 Bug ⚙ Maintainability
Description
top_group_overlap_vector() documents that both BucketCodes must share the same dimension/spec,
but it does not validate this before computing. Callers can accidentally compare codes with
different bucket counts (or dims), producing semantically meaningless output or panicking from
deeper assertions with less targeted error messages.
Code

src/const_weight_bitmap.rs[R210-223]

+pub fn top_group_overlap_vector(
+    lhs: &BucketCode,
+    rhs: &BucketCode,
+    widths: &[usize],
+) -> Vec<usize> {
+    widths
+        .iter()
+        .map(|&width| {
+            let lhs_bitmap = PackedConstantWeightBitmap::from_top_group(lhs, width);
+            let rhs_bitmap = PackedConstantWeightBitmap::from_top_group(rhs, width);
+            lhs_bitmap.overlap(&rhs_bitmap)
+        })
+        .collect()
+}
Evidence
The function explicitly states a same-dimension/spec requirement but performs no checks before
constructing per-width bitmaps; the only enforcement happens indirectly inside from_top_group /
overlap, which can yield less clear panics and allows misuse when bucket counts differ but widths
still happen to be in-range.

src/const_weight_bitmap.rs[204-223]
src/const_weight_bitmap.rs[161-168]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`top_group_overlap_vector(lhs, rhs, widths)` states both codes must share the same dimension/spec, but it doesn’t enforce this. If callers accidentally pass mismatched codes, results are undefined/meaningless, and panics (if any) come from deeper code paths.

### Issue Context
`PackedConstantWeightBitmap::from_top_group` asserts `width <= code.spec().buckets()`, and `PackedConstantWeightBitmap::overlap` asserts `dim` equality. These checks are correct locally, but `top_group_overlap_vector` should fail fast with a clear message about incompatible inputs.

### Fix Focus Areas
- src/const_weight_bitmap.rs[204-223]
- src/const_weight_bitmap.rs[161-169]

### Suggested fix
Add upfront assertions in `top_group_overlap_vector`, e.g.:
- `assert_eq!(lhs.spec().dim(), rhs.spec().dim(), "...")`
- `assert_eq!(lhs.spec().buckets(), rhs.spec().buckets(), "...")`

(or `assert_eq!(lhs.spec(), rhs.spec(), ...)` if `CompositionSpec: Eq` is intended as the full compatibility key). This produces a single, clear failure mode before the per-width loop begins.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

@qodo-code-review

Copy link
Copy Markdown

PR Summary by Qodo

Add experimental constant-weight bitmap overlap + finite null-tail helpers
✨ Enhancement 🧪 Tests 🕐 40+ Minutes

Grey Divider

Walkthroughs

Description
• Add experimental constant-weight bitmap types derived from BucketCode.
• Compute overlap via naive bool path and packed and_popcount fast path.
• Add finite constant-weight null (hypergeometric) + parity/identity tests.
Diagram
graph TD
  Caller["Caller"] --> BC["BucketCode"] --> CWB["ConstantWeightBitmap"] --> OV["overlap()"]
  BC --> PCWB["PackedConstantWeightBitmap"] --> APC["util::and_popcount"] --> OV
  Caller --> TGOV["top_group_overlap_vector"] --> OV
  OV --> BN["BitmapNull"] --> CH["choose() (u128)"]
Loading
High-Level Assessment

The following are alternative approaches to this PR:

1. Fail-loud checked combinatorics (checked_choose / checked_* counts)
  • ➕ Prevents silent wrong results in release when u128 overflows
  • ➕ Aligns with crate’s existing fail-loud/checked-allocation discipline
  • ➖ Slightly more code and branching overhead (though likely irrelevant at these sizes)
  • ➖ API design decision: panic vs Option/Result ripples into callers
2. Keep combinatorics internal (make `choose` pub(crate) or private)
  • ➕ Avoids committing to a public combinatorics API while still experimental
  • ➕ Reduces long-term API surface area to support
  • ➖ Less convenient for downstreams that want to reuse choose directly
  • ➖ If later needed publicly, still requires an API change
3. Use big integers for exact counts (e.g., `num-bigint`)
  • ➕ Correct exact counts for large (dim, weight) beyond u128::MAX
  • ➕ Avoids overflow semantics entirely
  • ➖ Introduces a new dependency and likely larger build/runtime costs
  • ➖ Overkill if intended regime is small and performance-sensitive

Recommendation: For an experimental-gated landing, the current approach is reasonable (no deps, no unsafe, parity tests, and reuse of util::and_popcount). Before promoting to stable, strongly consider making overflow fail-loud in release via checked arithmetic (panic or Option/Result) and reconsider whether choose should remain a public crate-root API or be kept internal until the overflow contract is finalized.

Grey Divider

File Changes

Enhancement (2)
const_weight_bitmap.rs Add experimental constant-weight bitmaps, overlap, and finite null-tail math +545/-0

Add experimental constant-weight bitmaps, overlap, and finite null-tail math

• Introduces 'ConstantWeightBitmap' (Vec<bool>) and 'PackedConstantWeightBitmap' (u64-packed) derived from 'BucketCode', with overlap computed via naive shared-set-bit count or 'util::and_popcount'. Adds 'BitmapNull' implementing exact hypergeometric fiber/tail counts over constant-weight bitmaps, plus a 'u128' 'choose' helper and extensive parity/identity tests; documents 'u128' overflow behavior.

src/const_weight_bitmap.rs


lib.rs Wire up experimental const-weight bitmap module and re-exports +15/-0

Wire up experimental const-weight bitmap module and re-exports

• Adds the 'const_weight_bitmap' module behind the 'experimental' feature and re-exports 'ConstantWeightBitmap', 'PackedConstantWeightBitmap', 'BitmapNull', 'top_group_overlap_vector', and 'choose' at the crate root when enabled.

src/lib.rs


Grey Divider

Qodo Logo

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces constant-weight bitmap overlap calculations and the finite constant-weight null distribution (issue #222), built on top of fixed-composition bucket codes. It adds the ConstantWeightBitmap, PackedConstantWeightBitmap, and BitmapNull structs, along with comprehensive unit tests. The review feedback highlights several critical improvement opportunities: preventing silent overflow in fiber_count by using checked_mul, optimizing the choose function to prevent intermediate overflows by simplifying fractions using the Greatest Common Divisor (GCD) at each step, and optimizing top_group_overlap_vector to perform a single pass over the codes with zero heap allocations.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/const_weight_bitmap.rs
Comment thread src/const_weight_bitmap.rs
Comment thread src/const_weight_bitmap.rs
@codecov

codecov Bot commented Jun 14, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Comment thread src/const_weight_bitmap.rs Outdated
Comment thread src/const_weight_bitmap.rs
@Fieldnote-Echo

Copy link
Copy Markdown
Member Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 4a98bdd

…226)

- CompositionSpec::new now rejects buckets>256 (InvalidSpec) — codes are u8, so
  >256 would silently truncate a computed id via 'as u8' and fail later with a
  misleading WrongBucketCount.
- from_ranks computes rank*buckets in u64 to avoid usize overflow on 32-bit/
  wasm32 for large dim (matches util.rs checked-math discipline).
- Tests: >256-buckets rejection + boundary; from_vector unsupported-bits path.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…emini/qodo) (#222)

Resolves the flagged u128-overflow decision in the safe direction (the bots and
the maintainer both wanted fail-loud before stable):
- choose(): gcd-cancellation of each (n-i)/(i+1) factor (extends the
  representable range to the full set of C(n,k) that fit u128) + checked_mul,
  so a true overflow PANICS (both debug and release) instead of silently
  wrapping to a wrong count.
- fiber_count(): checked_mul on the two binomials.
- Overflow docs updated (deliberate divergence from the proto's wrap-in-release).
- Tests: range-extension via Pascal's identity (C(128,64)); should_panic on
  C(300,150) overflow.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
qodo: the PR added the bucket-code surface without explicit migration steps for
deleting ordgraph-proto's local CompositionSpec/RankQuantSpec/BucketCode fork.
Add a 'Migration from ordgraph-proto' subsection: the drop-in import + delete
src/code.rs steps, the rank-math-delegation note, and the two intentional
deviations (bits=8 deferred to #221, buckets<=256).

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@Fieldnote-Echo

Copy link
Copy Markdown
Member Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented Jun 14, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit fb7561f

@Fieldnote-Echo

Fieldnote-Echo commented Jun 14, 2026

Copy link
Copy Markdown
Member Author

On the parity-test finding — deliberate, documented deferral.

This finding asks for a test asserting parity against an external prototype implementation that is not yet a published or git-reachable dependency of this crate. We are intentionally not adding that test in this PR:

  • Pulling that implementation in here would mean vendoring or hard-coding a snapshot of it, which would rot immediately and invert the dependency directionordvec is the lower layer and must not depend on a higher-level consumer.
  • Cross-crate parity belongs downstream, in the consumer crate, once that crate depends on a published ordvec. That is where the parity test can pin real behavior against a real dependency.

The ordvec-side contract in this PR is fully covered by its own unit/integration tests. This finding is an accepted, documented deferral rather than an oversight; it is expected to remain un-crossed.

Add a focused test `rankquant_spec_rejects_bits_8` that asserts both
`RankQuantSpec::new(8, 8)` and `BucketCode::from_vector(8, 8, &v)`
return `CompositionViolation::InvalidBits { bits: 8 }`, pinning the
deliberate b=8 exclusion so the boundary cannot regress silently.

Addresses the qodo finding: "bits=8 behavior untested" (PR #226).

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
- PackedConstantWeightBitmap::overlap: add assert!(dim <= u32::MAX)
  before and_popcount, which accumulates into u32 and would overflow
  silently for very large dims (Finding 1 / Correctness).
- top_group_overlap_vector: add assert_eq!(lhs.spec(), rhs.spec())
  precondition with a clear message; comparing bitmaps across specs
  is meaningless (Finding 2 / Maintainability).
- BitmapNull::tail_probability(observed): new method returning
  tail_count(observed) / space_size as f64, the exact hypergeometric
  upper-tail P(overlap >= observed); rustdoc + unit tests pin boundary
  values and a known exact result (Finding 3 / Requirement gap).

Tests added: packed_overlap_within_u32_max_does_not_panic,
top_group_overlap_vector_panics_on_mismatched_spec,
top_group_overlap_vector_passes_on_matching_spec,
tail_probability_boundary_values, tail_probability_known_value,
tail_probability_is_in_unit_interval_and_monotone.

Signed-off-by: Nelson Spence <projectnavi@pm.me>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…e references)

Replace the "Migration from ordgraph-proto" rustdoc section with a
self-contained "Adopting this API" adoption note describing how
external callers can replace local forks with the ordvec types.
Rename the "ordgraph bucket-code parity gate" test block comment to
"bucket-code behavioral contract", describing the invariants under
test rather than a migration relationship. No production behaviour,
test logic, or assert values changed.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…nternal-prototype references)

Replace the module-level "ported to reach parity with..." paragraph with a
product-clean adoption note following the pattern established for bucket_code.rs
in PR #220. Replace the internal "parity gate" test comment with a behavioral
description that explains what the pinned literals assert without naming any
internal repository.

No production behavior change; no test logic or assertion values changed.

Signed-off-by: Nelson Spence <projectnavi@pm.me>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@Fieldnote-Echo

Copy link
Copy Markdown
Member Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented Jun 15, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit f6d3580

Comment thread src/const_weight_bitmap.rs Outdated
Base automatically changed from feat/rankquant-bucket-code to main June 15, 2026 01:31
@project-navi-bot Navi Bot (project-navi-bot) dismissed their stale review June 15, 2026 01:31

The base branch was changed.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

# Conflicts:
#	src/lib.rs
#	tests/bucket_code_contingency.rs
Two qodo findings on the finite constant-weight null:

- (Correctness) tail_probability advertised an 'exact' probability but computed
  `tail_count as f64 / space_size as f64`, double-rounding two large u128 counts;
  for C(dim, weight) past 2^53 that can mis-round (e.g. report 1.0 when the true
  value is just under 1). Reframe: the EXACT surface is the u128 `tail_count` /
  `space_size`; tail_probability returns the nearest f64, now gcd-reducing the
  two counts first so the conversion is exact whenever the reduced pair fits an
  f64 mantissa, and short-circuiting the observed==0 (1.0) and observed>weight
  (0.0) boundaries with no division. New test pins tail_probability == nearest
  f64 of the gcd-reduced exact ratio across cases up to C(100,50) ~ 1e29.

- (Requirement gap) document that this is an in-model finite null under the
  idealized uniform constant-weight assumption — a selectivity / false-positive
  statement, NOT a real-corpus or encoder guarantee and not corpus-calibrated
  evidence strength.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@project-navi-bot Navi Bot (project-navi-bot) merged commit 0f40510 into main Jun 15, 2026
38 checks passed
@project-navi-bot Navi Bot (project-navi-bot) deleted the feat/bitmap-null-helpers branch June 15, 2026 01:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants