feat: ordvec on-disk format (.ov* magics) with full back-compat for legacy .tv* #230
…-compat

Files written by the crate now use the ordvec magics OVR1/OVRQ/OVBM/OVSB
(extensions .ovr/.ovrq/.ovbm/.ovsb), replacing the turbovec-era TV* magics.
The read contract is unchanged: all loaders (rank_io.rs) AND the C ABI accept
BOTH the current OV* and the legacy TV* magics, so every file the crate (or
turbovec) ever wrote still loads. Only the write path changed.

- src/rank_io.rs: OV* magic constants (written) + TV* retained read-only for
  back-compat; writers emit OV*; loaders + probe_index_metadata accept both.
- ordvec-ffi: the C ABI sniff-magic dispatch accepts both OV* and TV*; probe
  path was already format-agnostic (uses IndexKind).
- tests/persistence_compat.rs: forward fixtures now pin OV*; added tests
  proving legacy TV* files still load for all four index types.
- Parity sweep (docs / extensions only, no logic): ordvec-manifest (+ python
  bindings), ordvec-python docstrings + tests, ordvec-go test, C header, fuzz
  targets, docs/*, README format line, SECURITY/THREAT_MODEL, CONTRIBUTING
  stable-surface statement (the read contract is never broken), .gitignore.

Gate: fmt + clippy -D warnings (core/ffi/manifest) + full test suites (core
exp+default, manifest, ffi) + ordvec-python check + rustdoc -D warnings.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
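The write-one, read-both contract described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in `src/rank_io.rs`: the `IndexKind` variants, the `MAGICS` table, and the `write_magic`/`sniff_kind` helpers are hypothetical names, and only the eight magic values come from the commit message.

```rust
use std::io::{self, Read, Write};

// Hypothetical enum for illustration; the crate's real IndexKind may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IndexKind {
    Rank,
    RankQuant,
    Bitmap,
    SignBitmap,
}

/// (kind, magic written today, legacy magic still accepted on read)
const MAGICS: [(IndexKind, &[u8; 4], &[u8; 4]); 4] = [
    (IndexKind::Rank, b"OVR1", b"TVR1"),
    (IndexKind::RankQuant, b"OVRQ", b"TVRQ"),
    (IndexKind::Bitmap, b"OVBM", b"TVBM"),
    (IndexKind::SignBitmap, b"OVSB", b"TVSB"),
];

/// Write path: only the current OV* magic is ever emitted.
fn write_magic<W: Write>(w: &mut W, kind: IndexKind) -> io::Result<()> {
    let (_, current, _) = MAGICS
        .iter()
        .find(|(k, _, _)| *k == kind)
        .expect("every kind is listed in MAGICS");
    w.write_all(current.as_slice())
}

/// Read path: the first four bytes may be either the OV* or the TV* magic.
fn sniff_kind<R: Read>(r: &mut R) -> io::Result<IndexKind> {
    let mut magic = [0u8; 4];
    r.read_exact(&mut magic)?;
    MAGICS
        .iter()
        .find(|(_, current, legacy)| **current == magic || **legacy == magic)
        .map(|(kind, _, _)| *kind)
        .ok_or_else(|| {
            io::Error::new(
                io::ErrorKind::InvalidData,
                "unrecognized magic (expected OV*/TV*)",
            )
        })
}
```

Keeping both magics in one table means a new index type only needs one new row, and the writer cannot accidentally emit a legacy magic because it never reads the third column.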
Codecov Report: ✅ All modified and coverable lines are covered by tests.
PR Summary by Qodo

Rename persisted index magics to OV* with legacy TV* load compatibility

Walkthroughs

Description
• Writers now emit .ov* files with OV* magics for all index types.
• Loaders and C ABI accept both OV* and legacy TV* magics.
• Refresh fixtures/docs/bindings to .ov* and add explicit legacy-load tests.

Diagram
graph TD
  A["Index APIs"] --> B["rank_io writers"] --> C[(".ov* index files")]
  C --> D{"Magic OV* or TV*?"} --> E["rank_io loaders"]
  C --> F["ordvec-ffi magic sniff"] --> G["RankQuant/Bitmap load"]
  subgraph Legend
    direction LR
    _m["Module/API"] ~~~ _f[("On-disk file")] ~~~ _d{"Decision"}
  end
High-Level Assessment

The following are alternative approaches to this PR:
1. Keep TV* magics and only rename extensions to .ov*
2. Introduce a v2 format version with new header fields (kind + magic)
3. Extension-based dispatch in FFI (ignore magic sniffing)

Recommendation: The PR’s approach (new …

File Changes
Enhancement (6)
Tests (3)
Documentation (12)
Other (13)
Code Review
This pull request rebrands the on-disk persistence format from the legacy turbovec-era TV* magics and extensions (.tvr, .tvrq, .tvbm, .tvsb) to the new ordvec OV* formats (.ovr, .ovrq, .ovbm, .ovsb). The writers have been updated to emit the new formats, while the loaders across Rust, C FFI, Go, and Python bindings have been updated to support both the new and legacy magics to maintain backward compatibility. Relevant tests, fuzz targets, documentation, and manifest tools have also been updated. There are no review comments, and I have no additional feedback to provide.
The shorter UNSUPPORTED_FORMAT message in info_for_metadata let rustfmt collapse
`info.kind = match {...}` onto one line. Formatting only, no logic change.
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…der (qodo)

After the loaders/probes were widened to accept both `OV*` and legacy `TV*`
magics, the field-validation calls and baked-in error messages still
hard-coded `TVR1`/`TVRQ`/`TVBM`/`TVSB` labels, so a malformed `OV*` file
emitted errors referencing `TV*`, confusing readers and support tooling keyed
off the text (qodo, Observability).

Canonicalize every error label/prefix to the new `OV*` names (the format is
byte-identical; `OV*` is now primary). The `b"TV*"` magic constants and the
legacy-file test fixtures are unchanged, so back-compat acceptance is intact,
and the magic-mismatch messages keep their explicit `OV*/TV*` wording. Updated
the loader_validation assertions to match.

Regenerate `ordvec-ffi/include/ordvec.h` with cbindgen 0.29.3 (the committed
header had drifted from the loader doc-comment wording). Full suite green.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
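The labelling rule in this commit (field errors always carry the canonical `OV*` label, while magic-mismatch errors name both families) might look roughly like the sketch below. The helper names, the exact message wording, and the toy header layout (4-byte magic followed by a little-endian `u32` dimension) are all assumptions made for illustration, not the crate's actual loader.

```rust
use std::convert::TryInto;

const OVRQ_MAGIC: &[u8; 4] = b"OVRQ"; // written today
const TVRQ_MAGIC: &[u8; 4] = b"TVRQ"; // legacy, accepted on read only

/// Accept either magic; the mismatch message names both explicitly.
fn check_magic(found: &[u8; 4]) -> Result<(), String> {
    if found == OVRQ_MAGIC || found == TVRQ_MAGIC {
        Ok(())
    } else {
        Err(format!(
            "bad magic {:?}: expected OVRQ/TVRQ",
            String::from_utf8_lossy(found)
        ))
    }
}

/// Field validation takes the canonical label, whichever magic the file carried.
fn check_dim(label: &str, dim: u32) -> Result<(), String> {
    if dim == 0 {
        Err(format!("{label} dim must be non-zero"))
    } else {
        Ok(())
    }
}

/// Toy header: 4-byte magic, then a little-endian u32 dimension (assumed layout).
fn validate_header(bytes: &[u8]) -> Result<u32, String> {
    if bytes.len() < 8 {
        return Err("OVRQ header truncated".to_string());
    }
    let magic: [u8; 4] = bytes[..4].try_into().unwrap();
    check_magic(&magic)?;
    let dim = u32::from_le_bytes(bytes[4..8].try_into().unwrap());
    // Always the canonical label, even if the file started with TVRQ.
    check_dim("OVRQ", dim)?;
    Ok(dim)
}
```

Because the label is passed as a plain string rather than derived from the magic that was read, a malformed legacy file and a malformed current file produce the same error text, which keeps log-based tooling stable.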
Remediated in
Verified:
…tmap doc magic (codex)
Two stale `TV*` labels slipped past the first canonicalization pass because they
were not in the `"<MAGIC> ` quote-prefix form:
- The RankQuant probe + loader emitted `"unsupported TVRQ bits: {bits} ..."`
(TVRQ mid-string, after `unsupported `) even though the `bits` field is read
with the `OVRQ` label — now `OVRQ`.
- The SignBitmap on-disk-format doc table showed `magic = TVSB` / "shorter than
TVBM"; the written magic is `OVSB`/`OVBM` — updated for accuracy.
Remaining `TV*` references are intentional: the legacy-format module docs
("also reads TVR1"), the dual-magic acceptance code + mismatch messages
("OV*/TV*"), and the back-compat test fixtures/expectations that forge and load
legacy `TV*` files. Full suite green; ffi header unaffected.
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Follow-up in
…b doc fix

#230 (OV* on-disk format) landed on main, conflicting with this branch's
FastScan work in src/rank_io.rs. Resolution:
- kept this branch's FastScan additions: the 5th `.ovfs`/`OVFS` format (doc
  bullet, `OVFS_MAGIC`, "Five formats");
- took main's canonicalized OV* loader magic labels (`read_magic(.., "OVR1")`
  etc.), matching the surrounding `OV*` field labels #230 introduced.

Also rolled in the stop-gate follow-up that post-dates #230's merge (per
request): the SignBitmap write/load/dim-check doc comments still described the
CURRENT format as `.tvsb`; corrected to `.ovsb` (the written extension;
`.tvsb` is legacy-only, accepted on load and documented as such).

Verified: cbindgen --verify ok (no ffi drift), fmt + clippy clean, full suite
green.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
What
Renames the four on-disk index magics from the turbovec-era `TV*` to the ordvec `OV*` family, without breaking the read contract:

| Index | Extension | Written magic | Legacy magic (still read) |
|---|---|---|---|
| Rank | `.ovr` | `OVR1` | `TVR1` |
| RankQuant | `.ovrq` | `OVRQ` | `TVRQ` |
| Bitmap | `.ovbm` | `OVBM` | `TVBM` |
| SignBitmap | `.ovsb` | `OVSB` | `TVSB` |

Writers emit `OV*`; every loader accepts both `OV*` and legacy `TV*`. Any file the crate (or upstream turbovec) ever wrote still loads; only the write path changed.

Changes
- `src/rank_io.rs`: `OV*` magic constants (written) + `TV*` retained read-only; writers emit `OV*`; the four loaders and `probe_index_metadata` accept both.
- `ordvec-ffi` (C ABI): the sniff-magic load dispatch accepts both `OV*` and `TV*` (functional: new files must load through the C/Go path). Probe path was already format-agnostic.
- `tests/persistence_compat.rs`: the byte-stable fixtures now pin `OV*`; new back-compat tests prove a legacy `TV*` file still loads for all four types.
- `ordvec-manifest` (+ python bindings), `ordvec-python` docstrings + tests, `ordvec-go` test (primary fixture on `OV*`, one explicit legacy-`TV*` back-compat case), C header, fuzz targets, `docs/*`, README format line, `SECURITY.md`/`THREAT_MODEL.md`, the `CONTRIBUTING.md` stable-surface statement (reworded: the read contract is never broken), and `.gitignore` (adds `*.ov*`, keeps `*.tv*`).

Validation
`cargo fmt --all --check`; `cargo clippy -D warnings` (core / ffi / manifest); full test suites (core experimental + default, manifest 53, ffi); `cargo check -p ordvec-python`; `RUSTDOCFLAGS=-D warnings cargo doc`. The manifest crate is format-agnostic (opaque hashed artifacts), so its updates are extension-naming parity only.

Follow-ups (separate PRs)
- … `RankQuantFastscan`) + its own `OVFS`/`.ovfs` persistence in this `OV*` convention.
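
The back-compat tests listed under Changes rely on the format being byte-identical apart from the magic, so a legacy file can be forged by patching the first four bytes of a current one. The sketch below shows that idea on a toy payload (4-byte magic followed by little-endian `u32` values); the `save`, `load`, and `forge_legacy` functions and the payload layout are hypothetical and much simpler than the real formats in `src/rank_io.rs`.

```rust
use std::convert::TryInto;

const OVR1: &[u8; 4] = b"OVR1"; // written today
const TVR1: &[u8; 4] = b"TVR1"; // legacy, read-only

/// Toy writer: always emits the current magic, then the values as LE u32.
fn save(ranks: &[u32]) -> Vec<u8> {
    let mut out = OVR1.to_vec();
    for r in ranks {
        out.extend_from_slice(&r.to_le_bytes());
    }
    out
}

/// Toy loader: accepts either magic, then decodes the identical body.
fn load(bytes: &[u8]) -> Result<Vec<u32>, String> {
    if bytes.len() < 4 || (bytes.len() - 4) % 4 != 0 {
        return Err("OVR1 file has an invalid length".to_string());
    }
    let (magic, body) = bytes.split_at(4);
    if magic != &OVR1[..] && magic != &TVR1[..] {
        return Err("bad magic: expected OVR1/TVR1".to_string());
    }
    Ok(body
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes(c.try_into().unwrap()))
        .collect())
}

/// Forge a legacy file by swapping only the magic; the body is untouched.
/// Expects `current` to be at least four bytes long.
fn forge_legacy(current: &[u8]) -> Vec<u8> {
    let mut legacy = current.to_vec();
    legacy[..4].copy_from_slice(TVR1);
    legacy
}
```

A test in this style would save an index, forge the legacy variant, and assert that both load to the same value, which checks the read contract without keeping binary fixtures from the turbovec era in the repository.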