Skip to content

[codex] Add params-only Debug and SearchResults serde#248

Merged
Navi Bot (project-navi-bot) merged 3 commits into
mainfrom
codex/prerelease-api-debug-serde
Jun 19, 2026
Merged

[codex] Add params-only Debug and SearchResults serde#248
Navi Bot (project-navi-bot) merged 3 commits into
mainfrom
codex/prerelease-api-debug-serde

Conversation

@Fieldnote-Echo

Copy link
Copy Markdown
Member

Summary

  • add params-only Debug impls for public result/index types without dumping packed buffers
  • add a default-off serde feature for SearchResults serialization/deserialization
  • document the new feature in the release-facing MSRV/features matrix

Closes #81.
Closes #82.

Validation

  • cargo fmt --check
  • cargo test --test debug_serde
  • cargo test --features serde --test debug_serde
  • cargo check --no-default-features
  • cargo check --features serde
  • git diff --check

@codecov

codecov Bot commented Jun 19, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements custom Debug formatting for several core types (Bitmap, RankQuantFastscan, SearchResults, RankQuant, Rank, and SignBitmap) to output structural parameters rather than internal storage buffers. It also introduces optional serde serialization/deserialization support for SearchResults under a new serde feature, accompanied by comprehensive integration tests. The review feedback recommends a minor improvement to reuse the existing bytes_per_vec() helper method in the Debug implementations of Bitmap and SignBitmap instead of manually recalculating the vector byte size.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/bitmap.rs Outdated
Comment thread src/sign_bitmap.rs Outdated
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@qodo-code-review

Copy link
Copy Markdown

PR Summary by Qodo

Add params-only Debug impls and optional serde for SearchResults
✨ Enhancement 🧪 Tests 📝 Documentation ⚙️ Configuration changes 🕐 40+ Minutes

Grey Divider

Description

• Implement params-only Debug for packed/public types to avoid dumping buffers.
• Add default-off serde feature to (de)serialize SearchResults.
• Add coverage/tests and document the new feature in the MSRV/features matrix.
Diagram

graph TD
  C["Cargo.toml"] --> F["serde feature"] --> SR["SearchResults"] --> API["Public API"]
  IDX["Packed index types"] --> API
  F --> SER[("serde crate")]
  T["tests/debug_serde.rs"] --> IDX --> API
  T --> SR --> API
  D["docs/msrv-and-features.md"] --> F
Loading
High-Level Assessment

The chosen approach is the most idiomatic for Rust library ergonomics: feature-gated cfg_attr(..., derive(Serialize, Deserialize)) keeps serde off the default surface, while hand-written Debug implementations prevent accidental logging of large packed buffers. Alternatives like introducing a separate serializable DTO type or using a third-party "redacted Debug" derive add API surface or dependencies without clear benefit here.

Files changed (10) +137 / -4

Enhancement (6) +66 / -2
bitmap.rsImplement params-only Debug for Bitmap +12/-0

Implement params-only Debug for Bitmap

• Adds a custom 'Debug' implementation that prints shape/metadata fields (dim, n_top, n_vectors) plus computed bytes-per-vector. Intentionally avoids printing internal packed bitmap storage.

src/bitmap.rs

fastscan.rsImplement params-only Debug for RankQuantFastscan +9/-0

Implement params-only Debug for RankQuantFastscan

• Adds a custom 'Debug' implementation for 'RankQuantFastscan' that prints only dimensions and vector count. Avoids emitting the internal packed byte buffer in debug output.

src/fastscan.rs

lib.rsAdd SearchResults serde derives (feature-gated) and redacted Debug +12/-0

Add SearchResults serde derives (feature-gated) and redacted Debug

• Feature-gates 'Serialize'/'Deserialize' derives for 'SearchResults' behind the new 'serde' feature. Replaces the default Debug output with a params-only formatter that reports nq/k and vector lengths instead of full arrays.

src/lib.rs

quant.rsImplement params-only Debug for RankQuant +11/-0

Implement params-only Debug for RankQuant

• Adds a custom 'Debug' implementation for 'RankQuant' that prints core parameters (dim, bits, n_vectors, capability). Avoids dumping the packed quantized buffer.

src/quant.rs

rank.rsImplement params-only Debug for Rank +9/-0

Implement params-only Debug for Rank

• Adds a custom 'Debug' implementation for 'Rank' that reports dim and n_vectors only. Prevents debug output from including rank storage contents.

src/rank.rs

sign_bitmap.rsImplement params-only Debug for SignBitmap and update test comment +13/-2

Implement params-only Debug for SignBitmap and update test comment

• Adds a custom 'Debug' implementation that prints dim/n_vectors plus computed bytes-per-vector, without dumping packed bitmaps. Updates an existing test comment to reflect the new Debug behavior while preserving explicit error-arm matching.

src/sign_bitmap.rs

Tests (2) +67 / -1
debug_serde.rsAdd integration tests for Debug redaction and serde roundtrip +66/-0

Add integration tests for Debug redaction and serde roundtrip

• Adds a new integration test ensuring public types' Debug output contains parameters but not packed storage field contents. Adds a serde_json roundtrip test for 'SearchResults' gated behind the 'serde' feature.

tests/debug_serde.rs

quant_b8.rsMinor test cleanup: use expect_err for panic payload +1/-1

Minor test cleanup: use expect_err for panic payload

• Refactors the panic assertion to use 'expect_err' instead of manually unwrapping 'err()'. No behavior change; improves clarity.

tests/index/quant_b8.rs

Documentation (1) +1 / -1
msrv-and-features.mdDocument default-off serde feature for SearchResults +1/-1

Document default-off serde feature for SearchResults

• Updates the MSRV/features matrix to note that the crate now supports a stable default-off 'serde' feature. Clarifies that this feature derives Serialize/Deserialize for 'SearchResults' only.

docs/msrv-and-features.md

Other (1) +3 / -0
Cargo.tomlAdd optional serde feature and serde_json for tests +3/-0

Add optional serde feature and serde_json for tests

• Introduces an optional 'serde' dependency with 'derive' enabled and wires it behind a new 'serde' crate feature. Adds 'serde_json' as a dev-dependency to support roundtrip serialization tests.

Cargo.toml

@qodo-code-review

qodo-code-review Bot commented Jun 19, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0) 📜 Skill insights (0)

Context used

Grey Divider


Action required

1. Missing SearchResults serde round-trip ✓ Resolved 📎 Requirement gap ☼ Reliability
Description
The new serde coverage only checks trait bounds and custom deserialization shape validation, but
does not round-trip a SearchResults value through a serde data format and assert the restored
contents match. This leaves serialization/deserialization compatibility untested under `--features
serde`.
Code

tests/debug_serde.rs[R50-55]

+#[cfg(feature = "serde")]
+#[test]
+fn search_results_implements_serde_traits_with_serde_feature() {
+    fn assert_serde<T: serde::Serialize + for<'de> serde::Deserialize<'de>>() {}
+
+    assert_serde::<SearchResults>();
Relevance

⭐⭐⭐ High

Team often accepts adding stronger regression/contract tests; similar test-hardening accepted in PRs
#154/#159.

PR-#154
PR-#159

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
PR Compliance ID 3 requires a serde format round-trip test under the serde feature. The current
--features serde test only asserts trait implementation, and the in-crate serde tests only
validate deserialization shape/error cases without performing a serialize->deserialize round-trip
and equality assertion.

Add serde round-trip test for SearchResults under --features serde
tests/debug_serde.rs[50-56]
src/lib.rs[392-548]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A serde round-trip test for `SearchResults` is missing under `--features serde`.

## Issue Context
Compliance requires a test that serializes and deserializes a representative `SearchResults` value using a serde format (e.g., JSON/bincode) and asserts the restored value matches (field-by-field if `PartialEq` is not implemented).

## Fix Focus Areas
- tests/debug_serde.rs[50-56]
- src/lib.rs[392-548]
- Cargo.toml[66-86]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. serde_json is unconditional dev-dep ✓ Resolved 📎 Requirement gap ⚙ Maintainability
Description
serde_json is added as an unconditional dev-dependency, which pulls serde into the default
(no-feature) cargo test dependency graph. This violates the requirement to avoid introducing new
default serde dependencies when the serde feature is not enabled.
Code

Cargo.toml[74]

+serde_json = "1.0"
Relevance

⭐⭐⭐ High

Serde deps kept out of core default; serde_json previously added only in bench crate; serde
previously removed.

PR-#1
PR-#237

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
PR Compliance ID 5 requires that default (no-feature) builds/tests do not introduce new default
serde dependencies. The diff adds serde_json = "1.0" under [dev-dependencies] without any
feature gating, which pulls in serde transitively even when --features serde is not enabled.

Default build/test behavior unchanged without the serde feature
Cargo.toml[70-75]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`serde_json` is currently included as an unconditional `dev-dependency`, which introduces `serde` into the default (no-feature) test build.

## Issue Context
Compliance requires default (no-feature) build/test behavior to remain unchanged and not introduce new default serde dependencies.

## Fix Focus Areas
- Cargo.toml[70-88]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

3. Unbounded serde allocations 🐞 Bug ⛨ Security
Description
SearchResults deserialization fully allocates scores/indices via the derived
SearchResultsSerdeRepr before from_serde_repr validates shape/overflow, so attacker-controlled
payloads can force large allocations/OOM before any checks run. This contrasts with the crate’s file
loaders, which validate declared sizes before allocating payload buffers.
Code

src/lib.rs[R342-349]

+impl<'de> serde::Deserialize<'de> for SearchResults {
+    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
+    where
+        D: serde::Deserializer<'de>,
+    {
+        let repr = <SearchResultsSerdeRepr as serde::Deserialize>::deserialize(deserializer)?;
+        SearchResults::from_serde_repr(repr).map_err(serde::de::Error::custom)
+    }
Relevance

⭐⭐⭐ High

Team frequently adds pre-allocation size/overflow validation for untrusted inputs (PRs #153, #233,
#30).

PR-#153
PR-#233
PR-#30

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
SearchResults::deserialize first deserializes SearchResultsSerdeRepr (which must materialize the
Vec fields) and only afterwards calls from_serde_repr to validate sizes. The repo explicitly
documents that index loaders validate header fields and payload sizes before allocating buffers to
avoid malformed-input allocation issues, highlighting the gap in the serde path.

src/lib.rs[315-400]
src/rank_io.rs[24-41]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`SearchResults` serde deserialization currently allocates `Vec<f32>`/`Vec<i64>` before performing any bounds/shape validation, which can lead to memory-exhaustion (DoS) if deserializing untrusted input.

### Issue Context
Other ingestion surfaces in this crate (e.g. file loaders) explicitly validate size fields *before* allocating, but the current serde path cannot reject oversized inputs early because it delegates allocation to `Vec` deserialization.

### Fix Focus Areas
- Implement a custom `Deserialize` visitor for `SearchResults` (or custom deserialization for the `scores`/`indices` fields) that enforces an upper bound *during* deserialization (e.g. via `SeqAccess::size_hint()` and/or by stopping after `MAX_ELEMS`).
- Consider reusing an existing crate-wide cap concept (similar to the loader’s “validate before allocate” approach) and return a serde error when exceeded.

- src/lib.rs[315-400]
- src/rank_io.rs[24-41]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Unvalidated SearchResults Deserialize ✓ Resolved 🐞 Bug ☼ Reliability
Description
With the serde feature enabled, SearchResults derives Deserialize without enforcing
scores.len() == indices.len() == nq*k, so malformed serialized input can create instances that
panic when scores_for_query/indices_for_query slice using qi * k ranges.
Code

src/lib.rs[R308-329]

+#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct SearchResults {
    pub scores: Vec<f32>,
    pub indices: Vec<i64>,
    pub nq: usize,
    pub k: usize,
}

+impl fmt::Debug for SearchResults {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        f.debug_struct("SearchResults")
+            .field("nq", &self.nq)
+            .field("k", &self.k)
+            .field("scores_len", &self.scores.len())
+            .field("indices_len", &self.indices.len())
+            .finish()
+    }
+}
+
impl SearchResults {
    pub fn scores_for_query(&self, qi: usize) -> &[f32] {
        &self.scores[qi * self.k..(qi + 1) * self.k]
Relevance

⭐⭐⭐ High

Team consistently hardens against malformed inputs/panics; accepted many invariant/overflow guards
in core/persistence paths.

PR-#153
PR-#233
PR-#158

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
SearchResults now derives Serialize/Deserialize behind the serde feature, but its accessors
slice scores/indices using qi * k..(qi+1)*k without bounds/invariant checks, so inconsistent
deserialized lengths will panic upon access. The test suite documents the expected invariant
explicitly as scores.len() == nq*k and indices.len() == nq*k.

src/lib.rs[296-334]
tests/index/quant.rs[35-52]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
When the optional `serde` feature is enabled, `SearchResults` derives `Deserialize` but its methods assume internal invariants (`scores`/`indices` are flat buffers of length `nq * k`). Deserializing malformed input can construct an invalid `SearchResults` that later panics on `scores_for_query`/`indices_for_query`.

### Issue Context
`SearchResults` query accessors use unchecked slice indexing based on `qi` and `k`. The crate’s own tests treat `scores.len() == nq*k` and `indices.len() == nq*k` as part of the type’s contract.

### Fix Focus Areas
- src/lib.rs[296-334]

### Suggested fix approach
- Replace the `cfg_attr(..., derive(Serialize, Deserialize))` with either:
 - `derive(Serialize)` only and a manual `Deserialize` impl that validates:
   - `scores.len() == indices.len()`
   - `scores.len() == nq.saturating_mul(k)` (prefer `checked_mul` and error on overflow)
 - or manual `Serialize` + `Deserialize` if you prefer symmetry.
- On validation failure, return a serde deserialization error (`serde::de::Error::custom(...)`).
- Consider adding a small negative test under `--features serde` that attempts to deserialize a payload with inconsistent lengths and asserts it fails.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Serde uses usize counts ✓ Resolved 🐞 Bug ≡ Correctness
Description
With the serde feature enabled, SearchResults serializes/deserializes nq and k as usize,
which is pointer-width dependent (32-bit vs 64-bit) and can break cross-architecture persistence/IPC
when values don’t fit the destination usize. This makes the on-the-wire format less
stable/portable than necessary for a cache/message-queue use case.
Code

src/lib.rs[R316-323]

+#[cfg(feature = "serde")]
+#[derive(serde::Deserialize)]
+struct SearchResultsSerdeRepr {
+    scores: Vec<f32>,
+    indices: Vec<i64>,
+    nq: usize,
+    k: usize,
+}
Relevance

⭐⭐ Medium

Team cares about 32-bit/portability (accepted PRs #30, #63), but no prior evidence enforcing
fixed-width serde schema.

PR-#30
PR-#63

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The new serde-gated repr struct uses usize for nq and k, so the serialized form depends on the
target’s pointer width rather than a fixed-width schema.

src/lib.rs[296-333]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`SearchResults`’s serde representation currently uses `usize` for `nq`/`k`. `usize` is architecture-dependent, so serialized data can fail to deserialize on a different pointer width (e.g., 64-bit producer → 32-bit consumer) when values exceed the consumer’s `usize` range.

## Issue Context
- `SearchResults` is a public type; the new `serde` feature is intended for logging/caching/IPC-like use.
- The current implementation already validates `scores.len() == indices.len() == nq * k` on deserialize, so extending this to fixed-width integer conversion is straightforward.

## Fix approach
1. Change the serde-only representation to use fixed-width integers, e.g. `u64` for `nq` and `k`.
2. Replace the derived `Serialize` on `SearchResults` with a manual `Serialize` impl (behind `cfg(feature = "serde")`) that serializes via the fixed-width repr.
3. Update `from_serde_repr` to convert `u64 -> usize` with overflow checks before computing `expected_len`.
4. Update serde tests accordingly.

## Fix Focus Areas
- src/lib.rs[296-381]
- src/lib.rs[392-1001]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Comment thread Cargo.toml Outdated

Copy link
Copy Markdown
Member Author

/agentic_review

@Fieldnote-Echo Nelson Spence (Fieldnote-Echo) force-pushed the codex/prerelease-api-debug-serde branch from 5858cf9 to 8b8f1c0 Compare June 19, 2026 16:35

Copy link
Copy Markdown
Member Author

/agentic_review

Comment thread tests/debug_serde.rs
@qodo-code-review

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 8b8f1c0

@Fieldnote-Echo Nelson Spence (Fieldnote-Echo) force-pushed the codex/prerelease-api-debug-serde branch from 8b8f1c0 to a8df631 Compare June 19, 2026 16:47
@Fieldnote-Echo

Copy link
Copy Markdown
Member Author

/agentic_review

@qodo-code-review

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit a8df631

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@Fieldnote-Echo Nelson Spence (Fieldnote-Echo) force-pushed the codex/prerelease-api-debug-serde branch from a8df631 to 625cdeb Compare June 19, 2026 16:57
@Fieldnote-Echo

Copy link
Copy Markdown
Member Author

/agentic_review

@qodo-code-review

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 625cdeb

…debug-serde

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

# Conflicts:
#	tests/index/quant_b8.rs
@project-navi-bot Navi Bot (project-navi-bot) merged commit 7c29ae0 into main Jun 19, 2026
38 checks passed
@project-navi-bot Navi Bot (project-navi-bot) deleted the codex/prerelease-api-debug-serde branch June 19, 2026 17:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Optional serde feature for SearchResults Add Debug impls for the public index types (params-only)

2 participants