[codex] Tighten malformed index loader diagnostics#153
Conversation
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Code Review
This pull request refactors the index file metadata probing and loading logic in src/rank_io.rs to improve error reporting and reduce code duplication. It introduces helper functions for reading exact fields and calculating payload sizes, providing detailed context on truncated headers and payload mismatches. The test suite has been updated to verify these new error messages. Feedback on the changes highlights a potential integer overflow on 32-bit targets in rankquant_payload_bytes due to premature intermediate multiplication, and suggests refactoring the calculation using the pre-computed bytes per vector to avoid this issue.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Review Summary by QodoTighten malformed index loader diagnostics with field context
WalkthroughsDescription• Refactors malformed index loader diagnostics with contextual field labels - Adds label parameter to all header read functions for precise error context - Splits exact-length validation into distinct truncated vs trailing-byte errors • Extracts shared payload-size calculation helpers across all formats - rank_payload_bytes, rankquant_payload_bytes, bitmap_payload_bytes - Preserves existing MAX_PAYLOAD guards and allocation safety • Enhances error messages with format and field identification - Truncated headers now report which field failed (magic, version, dim, etc.) - Payload mismatches distinguish between truncation and trailing bytes • Adds comprehensive malformed payload tests for all index formats - Tests probe and public loader paths (RankQuant::load, SignBitmap::load) - Validates both truncated header and payload error contexts Diagramflowchart LR
A["Header Read Functions"] -->|add label param| B["Contextual Error Messages"]
C["Payload Validation"] -->|split logic| D["Truncated vs Trailing Errors"]
E["Duplicate Calculations"] -->|extract helpers| F["Shared Size Functions"]
B --> G["Enhanced Diagnostics"]
D --> G
F --> G
G --> H["Malformed Payload Tests"]
File Changes1. src/rank_io.rs
|
Code Review by Qodo
Context used 1. Hardcoded version in tests
|
There was a problem hiding this comment.
Pull request overview
This PR tightens persisted-index loader/prober diagnostics for malformed .tvr* formats by adding field-aware truncated-header errors and by splitting exact-length payload validation into distinct “truncated payload” vs “trailing bytes” failures. This improves operator-facing error context (notably for downstream consumers like ordgrep) while preserving existing size/allocation guards.
Changes:
- Add contextual fixed-header reads so truncated magic/version/field reads identify the format + specific field.
- Split exact-length validation into separate truncated-payload vs trailing-byte errors across TVR1/TVRQ/TVBM/TVSB.
- Add/adjust tests to pin public loader and probe error strings for malformed headers/payload lengths.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
src/rank_io.rs |
Introduces field-aware header reads and more specific payload-length mismatch diagnostics; factors payload-size computations into shared helpers. |
tests/index/loader_validation.rs |
Adds integration tests asserting stable, format-labeled error context for truncated payloads and trailing bytes across public loaders. |
tests/index/main.rs |
Updates malformed-file fuzz test comments to reflect the new exact-length validation behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
fb73040 to
5f76feb
Compare
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
Addressed Qodo's hardcoded-version test finding in 4988948. The integration malformed-payload helpers now derive empty-file headers through the public writers ( Local validation:
No unresolved inline Qodo thread was present to resolve; the finding was emitted as a PR conversation comment. |
Summary
TVR1,TVRQ,TVBM, andTVSBMAX_PAYLOADguardsRankQuant::loadandSignBitmap::loadpaths consumed by ordgrepCloses #151.
Scope notes
This is intentionally limited to the primitive persisted-index loader/prober boundary. It does not add the persisted format compatibility document/fixtures from #118, backend determinism semantics from #127, manifest schema changes, or ordgrep store policy.
Downstream check
ordgrep-rsfull-loads.ordgrep/index.ordvecwithRankQuant::loadand.ordgrep/sign.ordvecwithSignBitmap::load, then adds onlyloading <path>context. That means the primitive loader error string is the operator-facing reason for malformed binary artifacts, so this PR keeps the error context in ordvec proper.Validation
cargo fmt --checkcargo test --lib rank_io::testscargo test --test index loadercargo test --test redteam_delta delta_acargo test -p ordvec --all-featurescargo test --workspace --exclude ordvec-python --all-featurescargo check -p ordvec-python --all-featurescargo clippy --workspace --all-targets --all-features -- -D warningscargo deny checkgit diff --checkcargo test --workspace --all-featureswas also attempted and failed while linking theordvec-pythonRust test harness, with unresolved Python C API symbols such asPyObject_StrandPy_InitializeEx. The Python binding compile check listed above passed; this PR does not touch Python code.Review
Adversarial review found no functional blockers. One stale SignBitmap loader doc comment was corrected before publishing.