[codex] address release audit findings by Fieldnote-Echo · Pull Request #264 · Project-Navi/ordvec

Fieldnote-Echo · 2026-06-19T22:50:02Z

Summary

Expose the calibration-profile byte limit in the ordvec-manifest Python bindings: default constant, default_resource_limits(), verifier/create keyword arguments, exports, and regression coverage for calibration_profile_too_large.
Add a persisted-format registry in src/format.rs as the lockstep source of truth for format magic, IndexKind, probe coverage, manifest coverage, and C ABI load support. The registry is #[doc(hidden)] in v0.5.0 because it supports workspace crates such as ordvec-manifest; it is not a documented downstream API.
Rewire probe_index_metadata, ordvec-manifest, and ordvec-ffi through that registry. .ovfs / OVFS is now explicitly known-but-not-probeable and not manifest-covered in v1, and the C ABI reports it as UNSUPPORTED_FORMAT instead of corrupt.
Harden Python GIL-released paths by copying NumPy query/vector/candidate/doc-id inputs into Rust-owned buffers before py.detach. This closes the safe-Python mutation race class; docs and changelog call out the intentional memory tradeoff for large calls.
Add a Python snapshot regression test that mutates a query array while a batched search is running and asserts results match the pre-mutation snapshot.
Clarify the C ABI empty-candidate contract in Rust docs and the generated header: full search is (candidate_count = 0, candidates = NULL), while (0, non-NULL) is BAD_ARGUMENT.
Refresh release-facing docs for .ovfs, FastScan, b=8 support boundaries, two-approver release governance, formalization links, and the SubsetScratch::capacities_for_test test-utils gate.

Why

This PR addresses the release-surface audit before v0.5.0. The substantive fixes are:

F1: remove scattered persisted-format knowledge and make manifest/probe/FFI coverage explicit per format.
F2: remove undefined behavior reachable from safe Python by avoiding borrowed NumPy reads after releasing the GIL.
F3: report OVFS through the C ABI as a known unsupported format, not as corruption.

It also closes the earlier manifest Python parity gap and stale release/governance documentation found during the same audit pass.

Benchmark Note

The Python copy-before-detach change affects the Python binding. The public BEIR latency figures in the README are generated by the Rust benchmarks/beir-bench binary; the BEIR harness explicitly avoids importing the Python ordvec package and has a Makefile guardrail for that. No README BEIR numbers are refreshed in this PR because that benchmark hot path does not cross the Python/FFI boundary.

Validation

Local validation run during this PR:

cargo fmt --check
git diff --check
cargo test --locked
cargo test --locked --features test-utils batched_into_is_allocation_free_after_warmup
cargo test -p ordvec-manifest --locked
cargo test -p ordvec-ffi --locked
cargo check -p ordvec-python --locked
cargo check -p ordvec-manifest-python --locked
cargo test -p ordvec-manifest-python --locked
cargo clippy --workspace --all-targets --all-features --locked -- -D warnings
cargo clippy -p ordvec-manifest-python --all-targets --locked -- -D warnings
cargo doc -p ordvec --no-deps --locked
/tmp/ordvec-cbindgen-0.29.3/bin/cbindgen ordvec-ffi --config ordvec-ffi/cbindgen.toml --output ordvec-ffi/include/ordvec.h --verify
maturin build --release --manifest-path ordvec-python/Cargo.toml --out /tmp/ordvec-python-dist
uv run --with pytest --with /tmp/ordvec-python-dist/ordvec-0.5.0-cp310-abi3-manylinux_2_34_x86_64.whl python -m pytest ordvec-python/tests -q (511 passed)
maturin build --release --manifest-path ordvec-manifest-python/Cargo.toml --out /tmp/ordvec-manifest-python-dist
uv run --with pytest --with /tmp/ordvec-manifest-python-dist/ordvec_manifest-0.5.0-cp310-abi3-manylinux_2_34_x86_64.whl python -m pytest ordvec-manifest-python/tests -q
python3 tests/release_publish_invariants.py

GitHub Actions on the PR head remain the merge gate; the weekly fuzz job is expected to skip on ordinary PR pushes.

chatgpt-codex-connector · 2026-06-19T22:50:55Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

codecov · 2026-06-19T22:51:35Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

qodo-code-review · 2026-06-19T22:51:55Z

PR Summary by Qodo

Release audit fixes: Python manifest limit parity + OVFS/FastScan doc refresh
✨ Enhancement 📝 Documentation 🐞 Bug fix 🧪 Tests 🕐 40+ Minutes

Description

• Expose calibration-profile size limits across ordvec-manifest Python defaults and APIs.
• Refresh OVFS/FastScan security + provenance docs and formalization links post repo transfer.
• Tighten release/ABI/test-only surfaces (FFI empty-candidate contract, test-utils gating).

Diagram

graph TD
  U(["Python caller"]) --> I["ordvec_manifest/__init__.py"] --> B["ordvec-manifest-python/src/lib.rs"] --> C["ordvec_manifest_core"] --> L["ResourceLimits + DEFAULT_MAX_CALIBRATION_PROFILE_BYTES"] --> V["verify_* / create_* options"]
  T["ordvec-manifest-python/tests"] --> B

High-Level Assessment

The following are alternative approaches to this PR:

1. Codegen the Python export surface from Rust core constants

➕ Reduces future parity gaps between core ResourceLimits and Python exports
➕ Makes audits easier by having a single source of truth
➖ Adds build/tooling complexity (code generation and CI checks)
➖ May complicate contributor workflow for small changes

2. Expose a single structured ResourceLimits object instead of many kwargs

➕ Cleaner API surface; fewer duplicated plumbing points (inspect/verify/verify_for_load/create)
➕ Easier to extend without touching many function signatures
➖ Potentially breaking or at least ergonomically disruptive for existing callers
➖ Requires careful PyO3 type design and docs migration

Recommendation: The PR’s approach (plumb the missing limit end-to-end and test it) is the right short-term fix for an audit-driven parity gap with minimal API disruption. For long-term drift prevention, consider adding a lightweight parity check (or codegen) that ensures every core ResourceLimits field either has a Python binding or is intentionally omitted.

Files changed (19) +143 / -47

Enhancement (2) +33 / -4

__init__.pyExport DEFAULT_MAX_CALIBRATION_PROFILE_BYTES in Python package surface +2/-0

Export DEFAULT_MAX_CALIBRATION_PROFILE_BYTES in Python package surface
• Adds the calibration-profile byte limit constant to the Python import/export list so it is available as a stable module constant.
ordvec-manifest-python/python/ordvec_manifest/init.py

lib.rsPlumb calibration-profile byte limit through Python resource limits and options +31/-4

Plumb calibration-profile byte limit through Python resource limits and options
• Imports and exports DEFAULT_MAX_CALIBRATION_PROFILE_BYTES, adds 'max_calibration_profile_bytes' to resource limit builders and to verify/create/inspect/verify_for_load keyword arguments, and includes it in the PythonResourceLimits mapping.
ordvec-manifest-python/src/lib.rs

Bug fix (2) +7 / -3

ordvec.hClarify ABI v1 empty-candidate contract for full search +4/-0

Clarify ABI v1 empty-candidate contract for full search
• Documents that full search uses '(candidate_count==0 && candidate_rows==NULL)' and that '(0, non-NULL)' is a BAD_ARGUMENT case callers must avoid.
ordvec-ffi/include/ordvec.h

quant.rsGate SubsetScratch::capacities_for_test behind the test-utils feature +3/-3

Gate SubsetScratch::capacities_for_test behind the test-utils feature
• Moves the test-only capacity probe behind '#[cfg(feature = "test-utils")]' while keeping it doc-hidden, preventing it from appearing on the default public API surface.
src/quant.rs

Tests (2) +51 / -1

test_manifest_bindings.pyAdd tests for calibration-profile default and enforcement in verify_manifest +50/-0

Add tests for calibration-profile default and enforcement in verify_manifest
• Extends the default_resource_limits() assertion to include the new limit and adds an integration test that constructs a calibration profile and verifies size-limit enforcement returns the expected error code.
ordvec-manifest-python/tests/test_manifest_bindings.py

release_publish_invariants.pyUpdate release invariants for formalization URL transfer +1/-1

Update release invariants for formalization URL transfer
• Updates the expected Python package metadata invariants so the Formalization URL matches the Project-Navi repository location.
tests/release_publish_invariants.py

Documentation (12) +51 / -38

CHANGELOG.mdDocument audit-driven API and doc surface updates +13/-0

Document audit-driven API and doc surface updates

• Adds release notes for exposing the calibration-profile byte limit in Python, aligning OVFS/FastScan docs, updating formalization links, and gating a test-only helper behind 'test-utils'.

CHANGELOG.md

Cargo.tomlClarify docs.rs build intent for FastScan vs experimental APIs +3/-3

Clarify docs.rs build intent for FastScan vs experimental APIs

• Updates the docs.rs metadata comment to reflect that 'RankQuantFastscan' is part of the stable documented surface while MultiBucketBitmap remains excluded under default features.

Cargo.toml

README.mdUpdate formalization repository links to Project-Navi +5/-5

Update formalization repository links to Project-Navi
• Repoints ordvec-formalization links (including proof-spine/theorem-map/reviewer-brief) to the new GitHub org location.
README.md

SECURITY.mdInclude OVFS/FastScan format in supported/fuzzed loader discussion +6/-6

Include OVFS/FastScan format in supported/fuzzed loader discussion
• Updates the serialized index formats list to include '.ovfs'/'OVFS' and refines the fuzzing and reporting guidance accordingly.
SECURITY.md

THREAT_MODEL.mdRefresh maintenance-budget language for two-approver releases +10/-8

Refresh maintenance-budget language for two-approver releases
• Reframes maintainer capacity assumptions to a lead + approver model and documents residual supply-chain risk under protected environment approvals.
THREAT_MODEL.md

INDEX_PROVENANCE.mdExtend structural well-formedness guarantees to FastScan (.ovfs) +6/-3

Extend structural well-formedness guarantees to FastScan (.ovfs)
• Adds explicit '.ovfs' row validity requirements (FastScan nibble validity, b=2 composition, padding constraints) and updates fuzz target count to include 'load_fastscan'.
docs/INDEX_PROVENANCE.md

RANK_MODES.mdUpdate ordvec-formalization link to Project-Navi +1/-1

Update ordvec-formalization link to Project-Navi
• Repoints the formalization repository reference used in the Rank modes discussion to the new org.
docs/RANK_MODES.md

README.mdUpdate experiment doc reference to formalization PR under Project-Navi +1/-1

Update experiment doc reference to formalization PR under Project-Navi
• Updates the ordvec-formalization PR link for the CRT seam oracle result to the new org location.
experiments/ordinal-routing-research/README.md

crt_seam_oracle_results.mdRepoint formalization PR link for CRT seam oracle theorem +1/-1

Repoint formalization PR link for CRT seam oracle theorem
• Updates the ordvec-formalization#17 link to the Project-Navi repo location.
experiments/ordinal-routing-research/crt_seam_oracle_results.md

README.mdUpdate formalization link for Python package docs +1/-1

Update formalization link for Python package docs
• Repoints the formalization link in the ordvec-python README to the Project-Navi repository.
ordvec-python/README.md

lib.rsClarify bits validation rustdoc and remove duplicated candidate docs +3/-8

Clarify bits validation rustdoc and remove duplicated candidate docs
• Updates the 'bits' validator documentation to match the Python API surface expectations and removes a duplicated paragraph describing candidate duplication semantics.
ordvec-python/src/lib.rs

rank.rsFix rustdoc wording for bits upper bound in bucket_full +1/-1

Fix rustdoc wording for bits upper bound in bucket_full
• Corrects documentation to state the correct panic condition ('bits > 8') for the bucket-full path.
src/rank.rs

Other (1) +1 / -1

pyproject.tomlUpdate project.urls.Formalization to new repository location +1/-1

Update project.urls.Formalization to new repository location
• Changes the Formalization metadata URL to point at the Project-Navi ordvec-formalization repository.
ordvec-python/pyproject.toml

qodo-code-review · 2026-06-19T22:54:42Z

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX issues (0) 🔗 Cross-repo conflicts (0) 📜 Skill insights (0)

1. ~~Misleading bits guard comment~~ ✓ Resolved 🐞 Bug ⚙ Maintainability

Description

check_bits_max7’s doc comment says it enforces the Python RankQuant constructor surface, but the
function is only used by standalone bucket helper pyfunctions
(rank_to_bucket/bucket_ranks/bucket_centre) and not by the RankQuant constructor. This can
mislead maintainers/auditors about which Python API surface is actually being constrained by the
bits <= 7 rule.

Code

ordvec-python/src/lib.rs[R120-122]

+/// Reject a `bits` value outside the Python constructor surface. The Rust core
+/// has b=8 evidence/refinement helpers, but Python `RankQuant` exposes only the
+/// byte-aligned persisted widths through this constructor path.

Relevance

⭐⭐⭐ High
Team often accepts “docs must match behavior/API surface” fixes in ordvec-python (e.g., doc
alignment accepted in PRs #92, #209).
PR-#92
PR-#209

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The cited regions show (1) the comment asserting constructor-surface semantics, (2)
RankQuant.__new__ performing its own {1,2,4} validation without calling check_bits_max7, and
(3) the bucket helper pyfunctions calling check_bits_max7.
ordvec-python/src/lib.rs[120-128]
ordvec-python/src/lib.rs[579-611]
ordvec-python/src/lib.rs[1481-1534]
ordvec-python/src/lib.rs[1612-1627]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`check_bits_max7`'s doc comment claims it is rejecting `bits` values outside the Python `RankQuant` constructor surface, but the function is not used by the `RankQuant` constructor. It is only used by the standalone bucket primitive pyfunctions, while `RankQuant.__new__` enforces `bits ∈ {1,2,4}` directly.

## Issue Context
This is a documentation/maintainability issue, not a runtime bug. The goal is to make the comment accurately describe which API surface uses this guard and what bit-width domain it intends to enforce.

## Fix Focus Areas
- ordvec-python/src/lib.rs[120-128]
- ordvec-python/src/lib.rs[579-611]
- ordvec-python/src/lib.rs[1481-1534]
- ordvec-python/src/lib.rs[1612-1627]

## Suggested change
Update the `check_bits_max7` comment to describe that it is used by the bucket helper functions (and/or rename it if desired), and explicitly note the distinction between:
- `RankQuant` constructor: `bits ∈ {1,2,4}`
- Bucket helper surface (current): `bits <= 7`
- Rust core bucket helpers: `bits <= 8`

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

github-actions

Thorough review of the release-audit PR:

F1 — Format registry (src/format.rs): Well-designed. Centralizing magic, IndexKind, probe/manifest/FFI coverage into one lockstep table eliminates the scattered format knowledge that F1 set out to remove. The #[non_exhaustive] enums and explicit tracking-issue fields mean new formats must make deliberate stance decisions. ✅

F2 — Python GIL safety: All GIL-released paths now copy inputs before py.detach. Spot-checked Rank.add, RankQuant.search, RankQuant.search_asymmetric, RankQuant.search_asymmetric_subset, Bitmap.search, and Bitmap.search_subset — all use owned vectors before detach. The CandidateIds::into_owned() helper correctly handles both borrowed and already-converted paths. ✅

F3 — C ABI OVFS reporting: ordvec_index_load now correctly returns ORDVEC_STATUS_UNSUPPORTED_FORMAT for OVFS (and SignBitmap/Rank) rather than the previous CORRUPT_INDEX/UNKNOWN mishandling. The info_for_metadata path remains consistent with the registry-backed dispatch. ✅

Tests: The new test_search_asymmetric_snapshots_query_before_detach regression test is a good stress test for the copy-before-detach invariant. The manifest test_verify_manifest_enforces_calibration_profile_limit covers the new Python binding surface. ✅

Docs: Updated links from Fieldnote-Echo/ordvec-formalization to Project-Navi/ordvec-formalization are consistent across README, SECURITY, threat model, and experiments. ✅

No blocking issues found. This is a solid pre-v0.5.0 release-surface cleanup.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/Project-Navi/ordvec/actions/runs/27853967591

address release audit findings

c4600ce

Nelson Spence (Fieldnote-Echo) marked this pull request as ready for review June 19, 2026 22:50

Nelson Spence (Fieldnote-Echo) requested review from Navi Bot (project-navi-bot) and Todd Baur (toadkicker) as code owners June 19, 2026 22:50

Nelson Spence (Fieldnote-Echo) added 4 commits June 19, 2026 18:10

Fix FFI header drift

e231490

Address audit format registry and Python GIL safety

90163b3

Tighten audit PR review items

f849f2b

Add Python GIL snapshot regression test

a3ba31c

github-code-quality Bot found potential problems Jun 19, 2026

View reviewed changes

Comment thread ordvec-python/tests/test_rank_quant.py Fixed

Address Python snapshot test review nit

eeb7be2

Nelson Spence (Fieldnote-Echo) added the review-this Trigger OpenHands PR review label Jun 20, 2026

github-actions Bot reviewed Jun 20, 2026

View reviewed changes

Relax Python snapshot test coverage timeout

3fed7e9

project-navi-bot approved these changes Jun 20, 2026

View reviewed changes

Navi Bot (project-navi-bot) merged commit b013cce into main Jun 20, 2026
39 checks passed

Navi Bot (project-navi-bot) deleted the codex/release-governance-approver-gates branch June 20, 2026 00:16

Nelson Spence (Fieldnote-Echo) mentioned this pull request Jun 20, 2026

ordvec-python: optional safe_copy input isolation for GIL-releasing methods #113

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[codex] address release audit findings#264

[codex] address release audit findings#264
Navi Bot (project-navi-bot) merged 7 commits into
mainfrom
codex/release-governance-approver-gates

Fieldnote-Echo commented Jun 19, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot commented Jun 19, 2026

Uh oh!

codecov Bot commented Jun 19, 2026

Uh oh!

qodo-code-review Bot commented Jun 19, 2026

Uh oh!

qodo-code-review Bot commented Jun 19, 2026 •

edited

Loading

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Fieldnote-Echo commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Benchmark Note

Validation

Uh oh!

chatgpt-codex-connector Bot commented Jun 19, 2026

Uh oh!

codecov Bot commented Jun 19, 2026

Codecov Report

Uh oh!

qodo-code-review Bot commented Jun 19, 2026

PR Summary by Qodo

Uh oh!

qodo-code-review Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review by Qodo

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Fieldnote-Echo commented Jun 19, 2026 •

edited

Loading

qodo-code-review Bot commented Jun 19, 2026 •

edited

Loading