[codex] Document persisted format compatibility#154
Conversation
There was a problem hiding this comment.
Code Review
This pull request documents the compatibility contract and byte layouts for ordvec persisted index files in a new PERSISTED_FORMAT.md file, and links to it from existing documentation. It also introduces integration tests in tests/persistence_compat.rs to ensure the stability of the serialized formats. The review feedback suggests improving the tests by using an RAII helper to safely manage temporary files and prevent leaks during panics, and adding an explicit length check in a helper function to avoid potential out-of-bounds panics.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Review Summary by QodoDocument persisted format compatibility contract and add fixture tests
WalkthroughsDescription• Add docs/PERSISTED_FORMAT.md documenting v1 compatibility contract for persisted index formats • Define byte layouts, versioning policy, and metadata descriptor for .tvr, .tvrq, .tvbm, .tvsb artifacts • Document score semantics versioning and breaking-change criteria for future format evolution • Add standalone tests/persistence_compat.rs with committed byte fixtures for all four index types • Validate version rejection, trailing-byte rejection, and metadata probing for each format • Cross-link persisted format contract from README and INDEX_PROVENANCE.md Diagramflowchart LR
A["Persisted Format Contract"] -->|defines| B["Byte Layouts<br/>Versioning Policy<br/>Score Semantics"]
B -->|covers| C["Rank TVR1<br/>RankQuant TVRQ<br/>Bitmap TVBM<br/>SignBitmap TVSB"]
D["Fixture Tests"] -->|validate| E["Header Parsing<br/>Payload Invariants<br/>Version Rejection<br/>Trailing Bytes"]
E -->|ensure| F["Format Stability"]
A -.->|referenced by| G["README<br/>INDEX_PROVENANCE"]
File Changes1. docs/PERSISTED_FORMAT.md
|
Code Review by Qodo
Context used 1. Clock-based temp filenames
|
There was a problem hiding this comment.
Pull request overview
This PR establishes and tests a stable v1 compatibility contract for ordvec’s persisted index artifacts (.tvr, .tvrq, .tvbm, .tvsb), so downstream systems can safely cache/replicate/probe these files and know when bytes will be accepted vs rejected.
Changes:
- Adds a new
docs/PERSISTED_FORMAT.mddocument defining the v1 byte layouts, metadata contract (IndexMetadata), and compatibility/versioning policy. - Adds golden (non-roundtrip) fixture tests that assert exact persisted bytes for Rank, RankQuant, Bitmap, and SignBitmap, plus rejection of unsupported versions and trailing bytes.
- Cross-links the persisted-format contract from README and provenance documentation.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/persistence_compat.rs | Adds golden byte fixtures + rejection tests to lock in v1 persisted-format compatibility behavior. |
| README.md | Links to the persisted-format compatibility contract alongside provenance/threat-model docs. |
| docs/PERSISTED_FORMAT.md | New canonical v1 persisted-format contract: layout tables, metadata meaning, versioning and score-semantics policy. |
| docs/INDEX_PROVENANCE.md | Cross-links to the persisted-format contract for byte layout/versioning details. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6a1aaa2566
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
6a1aaa2 to
a1bc2be
Compare
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
Addressed Qodo's clock-based temp filename finding in aa7969d4e03e649f61a53d516f8c344ec560a217.
Local validation:
The Qodo finding was emitted as a PR conversation comment rather than an unresolved inline review thread. |
Summary
Closes #118.
docs/PERSISTED_FORMAT.mdas the v1 compatibility contract for.tvr,.tvrq,.tvbm, and.tvsbpersisted index artifacts.IndexMetadataas a manifest-friendly byte-shape descriptor, plus an external segment manifest example.Scope Notes
This stays on
origin/mainand does not stack on PR #153. It intentionally does not pin malformed-loader diagnostic strings; it only documents rejection behavior. It also does not absorb #127 backend determinism/tie-order policy: the new contract covers primitive per-row score interpretation for identical persisted bytes, while composed retrieval policy remains separate.I also checked
/home/ndspence/GitHub/ordgrep-rsread-only. Its current loader path usesRankQuant::loadandSignBitmap::loaddirectly and binds those files through its own store manifest/row-count checks, which matches this PR's primitive-format boundary.Validation
cargo fmt --checkcargo test --test persistence_compatcargo test -p ordvec --all-featurescargo test --workspace --exclude ordvec-python --all-featurescargo check -p ordvec-python --all-featurescargo clippy --workspace --all-targets --all-features -- -D warningscargo deny checkgit diff --checkAdversarial Review
A read-only adversarial review found one medium docs gap around score-semantics versioning. I fixed it by documenting that
format_versionalso covers primitive score interpretation of persisted bytes, while equal-score ordering/backend policy remains out of the byte-format contract. The reviewer found no blockers in byte layout or fixture bytes after that scope was addressed.