feat(experimental): constant-weight bitmap overlap + finite null-tail helpers (#222)#227
Conversation
Port ordgraph-proto/src/code.rs into ordvec as an index-free bucket-code surface
so downstream evidence systems consume declared codes instead of forking kernel
math. experimental-gated.
API (ordvec::{CompositionSpec, RankQuantSpec, BucketCode, CompositionViolation}):
- CompositionSpec: fixed-composition validation (dim/buckets, expected_per_bucket,
histogram, validate_codes).
- RankQuantSpec: dim+bits in {1,2,4}.
- BucketCode: new(spec, codes) validated; from_ranks; from_vector(dim, bits, &[f32])
— the new ordvec-primitive path (rank_transform + rank_to_bucket), output feeds
Contingency::new (#219); top_bitmap.
- CompositionViolation: structured error enum (Display + Error).
Rank math not duplicated (delegates to crate::rank). Parity tests reproduce the
proto's exact assertion values; +validation + from_vector<->contingency cross-check.
Verified: fmt/clippy(-D warnings, experimental+default)/test (69 lib + 2 integration,
default-build gating confirmed) green. No new deps. Refs #220.
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
… helpers (#222) Port ordgraph-proto/src/bitmap.rs into ordvec (stacked on #220 BucketCode), experimental-gated. The ordinal-bitmap overlap + finite-null surface downstream evidence systems consume. ordvec::{ConstantWeightBitmap, PackedConstantWeightBitmap, BitmapNull, top_group_overlap_vector, choose}: - ConstantWeightBitmap::from_top_bucket(&BucketCode) + overlap (reference path). - PackedConstantWeightBitmap::from_bucket_range/from_top_group; overlap routes through crate::util::and_popcount (the production popcount-AND primitive, not a hand-rolled loop). - BitmapNull: space_size=C(dim,w), fiber_count(o)=C(w,o)*C(dim-w,w-o), tail_count(threshold)=Σ_{o>=threshold} fiber_count(o); upper-tail = tail/space. - choose: u128 binomial. Parity: reproduces proto assertion values; fiber-partition identity (Σ fiber_count == space_size) verified for 7 (dim,weight); overlap parity (bool == packed == naive); choose vs known binomials + symmetry. No new deps; zero unsafe. Verified: fmt/clippy(-D warnings, exp+default)/test(exp+default gating)/MSRV 1.89 green. Refs #222. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Code Review by Qodo
Context used 1.
|
PR Summary by QodoAdd experimental constant-weight bitmap overlap + finite null-tail helpers WalkthroughsDescription• Add experimental constant-weight bitmap types derived from BucketCode. • Compute overlap via naive bool path and packed and_popcount fast path. • Add finite constant-weight null (hypergeometric) + parity/identity tests. Diagramgraph TD
Caller["Caller"] --> BC["BucketCode"] --> CWB["ConstantWeightBitmap"] --> OV["overlap()"]
BC --> PCWB["PackedConstantWeightBitmap"] --> APC["util::and_popcount"] --> OV
Caller --> TGOV["top_group_overlap_vector"] --> OV
OV --> BN["BitmapNull"] --> CH["choose() (u128)"]
High-Level AssessmentThe following are alternative approaches to this PR: 1. Fail-loud checked combinatorics (checked_choose / checked_* counts)
2. Keep combinatorics internal (make `choose` pub(crate) or private)
3. Use big integers for exact counts (e.g., `num-bigint`)
Recommendation: For an File ChangesEnhancement (2)
|
There was a problem hiding this comment.
Code Review
This pull request introduces constant-weight bitmap overlap calculations and the finite constant-weight null distribution (issue #222), built on top of fixed-composition bucket codes. It adds the ConstantWeightBitmap, PackedConstantWeightBitmap, and BitmapNull structs, along with comprehensive unit tests. The review feedback highlights several critical improvement opportunities: preventing silent overflow in fiber_count by using checked_mul, optimizing the choose function to prevent intermediate overflows by simplifying fractions using the Greatest Common Divisor (GCD) at each step, and optimizing top_group_overlap_vector to perform a single pass over the codes with zero heap allocations.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit 4a98bdd |
…226) - CompositionSpec::new now rejects buckets>256 (InvalidSpec) — codes are u8, so >256 would silently truncate a computed id via 'as u8' and fail later with a misleading WrongBucketCount. - from_ranks computes rank*buckets in u64 to avoid usize overflow on 32-bit/ wasm32 for large dim (matches util.rs checked-math discipline). - Tests: >256-buckets rejection + boundary; from_vector unsupported-bits path. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…emini/qodo) (#222) Resolves the flagged u128-overflow decision in the safe direction (the bots and the maintainer both wanted fail-loud before stable): - choose(): gcd-cancellation of each (n-i)/(i+1) factor (extends the representable range to the full set of C(n,k) that fit u128) + checked_mul, so a true overflow PANICS (both debug and release) instead of silently wrapping to a wrong count. - fiber_count(): checked_mul on the two binomials. - Overflow docs updated (deliberate divergence from the proto's wrap-in-release). - Tests: range-extension via Pascal's identity (C(128,64)); should_panic on C(300,150) overflow. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
qodo: the PR added the bucket-code surface without explicit migration steps for deleting ordgraph-proto's local CompositionSpec/RankQuantSpec/BucketCode fork. Add a 'Migration from ordgraph-proto' subsection: the drop-in import + delete src/code.rs steps, the rank-math-delegation note, and the two intentional deviations (bits=8 deferred to #221, buckets<=256). Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit fb7561f |
|
On the parity-test finding — deliberate, documented deferral. This finding asks for a test asserting parity against an external prototype implementation that is not yet a published or git-reachable dependency of this crate. We are intentionally not adding that test in this PR:
The |
Add a focused test `rankquant_spec_rejects_bits_8` that asserts both
`RankQuantSpec::new(8, 8)` and `BucketCode::from_vector(8, 8, &v)`
return `CompositionViolation::InvalidBits { bits: 8 }`, pinning the
deliberate b=8 exclusion so the boundary cannot regress silently.
Addresses the qodo finding: "bits=8 behavior untested" (PR #226).
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
- PackedConstantWeightBitmap::overlap: add assert!(dim <= u32::MAX) before and_popcount, which accumulates into u32 and would overflow silently for very large dims (Finding 1 / Correctness). - top_group_overlap_vector: add assert_eq!(lhs.spec(), rhs.spec()) precondition with a clear message; comparing bitmaps across specs is meaningless (Finding 2 / Maintainability). - BitmapNull::tail_probability(observed): new method returning tail_count(observed) / space_size as f64, the exact hypergeometric upper-tail P(overlap >= observed); rustdoc + unit tests pin boundary values and a known exact result (Finding 3 / Requirement gap). Tests added: packed_overlap_within_u32_max_does_not_panic, top_group_overlap_vector_panics_on_mismatched_spec, top_group_overlap_vector_passes_on_matching_spec, tail_probability_boundary_values, tail_probability_known_value, tail_probability_is_in_unit_interval_and_monotone. Signed-off-by: Nelson Spence <projectnavi@pm.me> Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…e references) Replace the "Migration from ordgraph-proto" rustdoc section with a self-contained "Adopting this API" adoption note describing how external callers can replace local forks with the ordvec types. Rename the "ordgraph bucket-code parity gate" test block comment to "bucket-code behavioral contract", describing the invariants under test rather than a migration relationship. No production behaviour, test logic, or assert values changed. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…nternal-prototype references) Replace the module-level "ported to reach parity with..." paragraph with a product-clean adoption note following the pattern established for bucket_code.rs in PR #220. Replace the internal "parity gate" test comment with a behavioral description that explains what the pinned literals assert without naming any internal repository. No production behavior change; no test logic or assertion values changed. Signed-off-by: Nelson Spence <projectnavi@pm.me> Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit f6d3580 |
The base branch was changed.
Signed-off-by: Nelson Spence <nelson@projectnavi.ai> # Conflicts: # src/lib.rs # tests/bucket_code_contingency.rs
Two qodo findings on the finite constant-weight null: - (Correctness) tail_probability advertised an 'exact' probability but computed `tail_count as f64 / space_size as f64`, double-rounding two large u128 counts; for C(dim, weight) past 2^53 that can mis-round (e.g. report 1.0 when the true value is just under 1). Reframe: the EXACT surface is the u128 `tail_count` / `space_size`; tail_probability returns the nearest f64, now gcd-reducing the two counts first so the conversion is exact whenever the reduced pair fits an f64 mantissa, and short-circuiting the observed==0 (1.0) and observed>weight (0.0) boundaries with no division. New test pins tail_probability == nearest f64 of the gcd-reduced exact ratio across cases up to C(100,50) ~ 1e29. - (Requirement gap) document that this is an in-model finite null under the idealized uniform constant-weight assumption — a selectivity / false-positive statement, NOT a real-corpus or encoder guarantee and not corpus-calibrated evidence strength. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
What
Ports
ordgraph-proto/src/bitmap.rsinto ordvec (#222),experimental-gated. Stacked on #226 (uses #220'sBucketCode) — merge #226 first, this auto-retargets tomain.ordvec::{ConstantWeightBitmap, PackedConstantWeightBitmap, BitmapNull, top_group_overlap_vector, choose}:ConstantWeightBitmap—from_top_bucket(&BucketCode)+overlap(bool reference path).PackedConstantWeightBitmap—from_bucket_range/from_top_group;overlaproutes throughcrate::util::and_popcount(the production popcount-AND primitive, not a hand-rolled loop).BitmapNull(finite hypergeometric null) —space_size = C(dim,w),fiber_count(o) = C(w,o)·C(dim−w, w−o),tail_count(t) = Σ_{o≥t} fiber_count(o); upper-tail prob =tail/space.choose— u128 binomial.Parity: reproduces the proto's assertion values; the fiber-partition identity
Σ fiber_count == space_sizeis verified for 7 (dim,weight) pairs (strong validation of the null math); overlap parity (bool == packed == naive);choosevs known binomials + symmetry. No new deps, zerounsafe. Verified locally: fmt / clippy-D warnings(exp+default) / test (exp+default gating) / MSRV 1.89.choose/space_size/fiber_count/tail_countinherit the proto's behaviour: the running product panics in debug, silently wraps in release onceC(dim,weight)exceedsu128::MAX(first atC(126,63)). Documented in a module# Overflownote, no guard. My recommendation: before any stable promotion, harden to fail-loud (achecked-based panic in release too, or achecked_choose -> Option) — a public combinatorial that silently returns a wrong count in release is at odds with the crate's fail-loud-primitive discipline. Kept proto-parity in this draft; say the word and I'll harden it.assert!) on contract violation, matching ordvec's existing bitmap-kernel discipline (and_popcountetc.), NOT Expose reusable RankQuant bucket-code API for downstream evidence systems #220's fail-softResult/CompositionViolation. Intentional split: construction validates softly, hot-path ops assert. Confirm.chooseas a public crate-root free fn (ordvec::choose) vspub(crate)/inlined intoBitmapNull.and_popcountvisibility coupling — packed overlap binds to thepub(crate)popcount primitive (fine while experimental).experimentalfor this draft.Refs #222. Targets 0.6 (may land in 0.5.0 — maintainer decides).