diff --git a/README.md b/README.md index b4e9846b..971e9063 100644 --- a/README.md +++ b/README.md @@ -156,6 +156,7 @@ The runtime dependency floor is `numpy>=2.2`. [`docs/ALTERNATIVES_CONSIDERED.md`](https://github.com/Fieldnote-Echo/ordvec/blob/main/docs/ALTERNATIVES_CONSIDERED.md) - **Index-file trust model:** [`docs/INDEX_PROVENANCE.md`](https://github.com/Fieldnote-Echo/ordvec/blob/main/docs/INDEX_PROVENANCE.md), + [`docs/determinism.md`](https://github.com/Fieldnote-Echo/ordvec/blob/main/docs/determinism.md), [`THREAT_MODEL.md`](https://github.com/Fieldnote-Echo/ordvec/blob/main/THREAT_MODEL.md) - **Repo-local manifest verifier, C ABI, and Go wrapper:** available from the full GitHub checkout. These sidecars are not part of the diff --git a/docs/FOLLOWUP_BODY_KERNEL_TIE_BREAK.md b/docs/FOLLOWUP_BODY_KERNEL_TIE_BREAK.md index 0994a25d..ac0452ff 100644 --- a/docs/FOLLOWUP_BODY_KERNEL_TIE_BREAK.md +++ b/docs/FOLLOWUP_BODY_KERNEL_TIE_BREAK.md @@ -1,17 +1,11 @@ -# Follow-up: deterministic tie-breaking for body bitmap candidate selection +# Resolved: deterministic tie-breaking for body bitmap candidate selection -`Bitmap::top_m_candidates` and `top_m_candidates_batched` -(in `src/bitmap.rs`) currently partition on -bitmap overlap score alone. Boundary ties are not rare — overlap -scores are small integers (`0..n_top`, e.g. `0..256`), so multiple -docs frequently share the cutoff score, and `select_nth_unstable_by` -may then choose different equal-scored docs at the boundary across -runs or dispatch paths. +`Bitmap::top_m_candidates` and `top_m_candidates_batched` now partition and +sort by the composite key `(score desc, doc_id asc)`. Boundary ties are not +rare because overlap scores are small integers (`0..n_top`, e.g. `0..256`), so +the candidate set at the cutoff must be fully determined by score and row ID. 
-**Fix**: add composite-key ordering `(score desc, doc_id asc)` to -both the partition predicate (`select_nth_unstable_by`) and the -post-partition sort (`sort_unstable_by`), so the candidate set at any -given M is fully determined by `(score, doc_id)`. +The fixed comparator is: ```rust let mut cmp = |&a: &u32, &b: &u32| { @@ -23,9 +17,6 @@ idx.select_nth_unstable_by(m_eff - 1, &mut cmp); idx[..m_eff].sort_unstable_by(&mut cmp); ``` -**Keep it as a standalone change.** Rolling the determinism fix into -an unrelated benchmark or kernel change would muddy attribution — if -recall/latency numbers move, it should be clear whether the kernel -changed or only the tie-break at the candidate-set boundary changed. -The fix is behaviour-preserving on score ordering and only pins the -boundary, so it is safe to land on its own. +The broader search-output policy is now tracked in +[`determinism.md`](determinism.md). Future changes to golden row IDs, tie keys, +or duplicate-candidate behavior need an explicit compatibility note. diff --git a/docs/RANK_MODES.md b/docs/RANK_MODES.md index bb47a98b..f36fed14 100644 --- a/docs/RANK_MODES.md +++ b/docs/RANK_MODES.md @@ -328,10 +328,10 @@ facts qualify this: rank-mode README recommends, and where the structural prior pays off. - **The asymmetric AVX-512 kernel is an exact packed scan, not an ANN - approximation.** It returns identical top-k to the scalar RankQuant - scorer and agrees within 1e-4 on scores (verified by - `rankquant_asymmetric_matches_reference_b{1,2,4}` in - `tests/index/quant.rs`). + approximation.** It is checked against the scalar RankQuant scorer with + score tolerances and deterministic golden tie fixtures (see + [`determinism.md`](determinism.md)); the random reference tests avoid + overfitting top-k order at near-tolerance boundaries. 
The byte-LUT scorer remains in the codebase as a labelled reference path (`ordvec::search_asymmetric_byte_lut`, @@ -435,6 +435,9 @@ single-pass b=2 fast path; it supports `add`/`search` but not bilinear bucket-overlap decomposition and is reachable only behind the `experimental` feature. +Search result ordering, backend score-equivalence expectations, tie keys, and +empty-result shapes are specified in [`determinism.md`](determinism.md). + ## Test coverage `cargo test --lib` — unit tests for the primitives in diff --git a/docs/determinism.md b/docs/determinism.md new file mode 100644 index 00000000..40a64722 --- /dev/null +++ b/docs/determinism.md @@ -0,0 +1,86 @@ +# Search Determinism Contract + +This document states the compatibility contract for ordvec search output: +scores, ordering, tie handling, backend dispatch, and empty-result shape. It +covers the primitive retrieval surface only. It does not define distributed +merge order, replication, storage manifests, or deployment policy. + +## Global Ordering Rule + +For public top-k search results, ordvec orders hits by: + +1. score descending; +2. row ID ascending when scores compare equal. + +The row ID is the internal zero-based insertion row. Subset APIs receive row +IDs from the caller and return the same global row IDs. Duplicate candidate IDs +are scored as duplicate candidate entries and may produce duplicate hits. + +`k` is clamped to the search space before result buffers are allocated. A +full-index search space is the number of indexed rows. A subset search space is +the candidate-list length. If the effective `k` is zero, or the search space is +empty, search returns an empty result shape rather than padded sentinel hits. + +## Backend Scope + +Backend selection must not change the documented ordering rule. Exact integer +popcount primitives are bit-exact across scalar, AVX-512, aarch64 NEON, and +wasm `simd128` implementations. 
Floating-point score equivalence uses an +absolute tolerance of `1e-4` and no relative tolerance (`rtol = 0`) unless a +row below explicitly states that the score is integer-exact. Some tests use +tighter tolerances for specific scalar helper comparisons, but `1e-4` is the +public cross-backend/architecture compatibility tolerance. Intentional changes +to that tolerance or to golden top-k output are compatibility-affecting and +must be called out in the PR and release notes. + +Query-level parallelism may change scheduling, but each query is scored and +finalized independently. Batched APIs must match the corresponding single-query +API for the same query rows, modulo the primitive-specific tolerance stated +below. Floating-point comparison tolerances apply only to score equivalence; +the public hit order still follows the global ordering rule above. + +## Primitive Contracts + +| Surface | Score contract | Tie key | Backend contract | +| --- | --- | --- | --- | +| `Rank::search` | Normalized Spearman-style rank cosine; floating scores are tolerance-based with absolute tolerance `1e-4`, `rtol = 0`. | Global row ID ascending. | Fixed scalar arithmetic per row; query parallelism does not affect per-query output. | +| `Rank::search_asymmetric` | Float query against stored ranks; floating scores are tolerance-based with absolute tolerance `1e-4`, `rtol = 0`. | Global row ID ascending. | Fixed scalar arithmetic per row; query parallelism does not affect per-query output. | +| `RankQuant::search` | Symmetric bucketed-rank score; floating scores are tolerance-based with absolute tolerance `1e-4`, `rtol = 0`. | Global row ID ascending. | Scalar packed-byte LUT path; query parallelism does not affect per-query output. | +| `RankQuant::search_asymmetric` | Float query against stored buckets; floating scores are tolerance-based with absolute tolerance `1e-4`, `rtol = 0`. | Global row ID ascending. 
| AVX-512, AVX2, and scalar-LUT dispatch must agree with the scalar reference within the public tolerance and preserve top-k order for the golden fixtures. | +| `RankQuant::search_asymmetric_subset` | Same score as `RankQuant::search_asymmetric`, restricted to caller-supplied candidates; floating scores use the same `1e-4` absolute tolerance and `rtol = 0`. | Global row ID ascending, not candidate-list position. Duplicate candidate IDs remain duplicate entries. | Uses the same AVX-512, AVX2, or scalar dispatch as full asymmetric search over a gathered scratch buffer. | +| `Bitmap::search` | Exact `popcount(Q AND D)` represented as `f32`; score is integer-exact, not tolerance-based. | Global row ID ascending. | Popcount scores are integer-exact across scalar and SIMD implementations. | +| `Bitmap::top_m_candidates` | Exact `popcount(Q AND D)` candidate ordering; score key is integer-exact, not tolerance-based. | Global row ID ascending. | Single-query and batched candidate APIs must return the same ordered candidates. | +| `Bitmap::search_subset` | Exact subset `popcount(Q AND D)` represented as `f32`; score is integer-exact, not tolerance-based. | Global row ID ascending. Duplicate candidate IDs remain duplicate entries. | Subset score kernels must agree bit-exactly with scalar popcount. | +| `SignBitmap::top_m_candidates` | Lowest Hamming distance, equivalently highest sign agreement; score key is integer-exact, not tolerance-based. | Global row ID ascending. | Single-query and batched candidate APIs must return the same ordered candidates. | +| `SignBitmap::score_all` | Dense sign-agreement counts aligned by row ID; scores are `u32` integer-exact, not tolerance-based. | Not a top-k API. | Popcount scores are integer-exact across scalar and SIMD implementations. | + +## FastScan + +`RankQuantFastscan` is a hidden, optional b=2 pre-ranker. 
It is deterministic +for a fixed index, query, and backend dispatch, and its scalar and AVX-512 +FastScan kernels operate on the same quantized LUT inputs. It is not +score-equivalent to exact `RankQuant::search_asymmetric`: the global 8-bit LUT +quantization is intentional and can change scores or boundary ordering. Callers +that need exact RankQuant scores should use `RankQuant::search_asymmetric` or +`RankQuant::search_asymmetric_subset`. + +## Compatibility Notes + +Intentional changes to any of these are compatibility-affecting: + +- golden top-k row IDs; +- tie keys or duplicate-candidate behavior; +- empty-result or `k` clamping shape; +- scalar/SIMD score tolerance; +- whether an API is exact or approximate; +- whether a backend is covered by this contract. + +Such changes need a compatibility note in the PR and release notes. Performance +changes that preserve the same scores, row ordering, tie keys, and empty-result +shape are not search-contract breaks. + +Compatibility note for this contract PR: `RankQuant::search_asymmetric_subset` +now breaks equal-score ties by global row ID instead of local candidate-list +position. That matches full-index search, C ABI hit ordering, Python binding +ordering, and the candidate prefilters. Duplicate candidate IDs are still +scored as duplicate entries and may still produce duplicate hits. diff --git a/ordvec-ffi/src/lib.rs b/ordvec-ffi/src/lib.rs index 763c3c0c..03f660d3 100644 --- a/ordvec-ffi/src/lib.rs +++ b/ordvec-ffi/src/lib.rs @@ -833,11 +833,9 @@ pub unsafe extern "C" fn ordvec_index_search( (LoadedIndex::RankQuant(index), Some(rows)) => { // Ask the core for every candidate score, then normalize by the // ABI's global row-id tie policy before truncating. The core - // subset helper breaks ties by candidate position before - // mapping back to global row IDs, so requesting only k could - // drop a boundary-tied lower row ID from an unsorted candidate - // list. 
Materializing all candidates preserves the ABI ordering - // contract until core exposes a global-row top-k scorer. + // subset helper uses global row IDs as score-tie keys; keeping + // the ABI normalization centralized preserves duplicate and + // boundary handling for caller-supplied candidate lists. let (scores, indices) = index.search_asymmetric_subset(validation.query, rows, rows.len()); normalize_global_order(scores, indices, validation.required_hits) diff --git a/ordvec-python/tests/test_rank_quant.py b/ordvec-python/tests/test_rank_quant.py index cc2893e3..24d0bd96 100644 --- a/ordvec-python/tests/test_rank_quant.py +++ b/ordvec-python/tests/test_rank_quant.py @@ -299,8 +299,7 @@ def test_search_asymmetric_subset_matches_full_when_candidates_eq_all(): # When the candidate set is every doc, the subset path must agree # with full `search_asymmetric` on the top-k. Both use the # asymmetric kernel; the subset path just iterates the candidate - # list instead of all N docs. (Allow set equality — ties may - # permute within the same scoring tier.) + # list instead of all N docs. 
vectors = unit_vectors(40, 128, seed=0) idx = RankQuant(dim=128, bits=2) idx.add(vectors) @@ -310,7 +309,21 @@ def test_search_asymmetric_subset_matches_full_when_candidates_eq_all(): _, subset_ids = idx.search_asymmetric_subset(query, candidates, k=10) _, full_ids = idx.search_asymmetric(query[None, :], k=10) - assert set(int(i) for i in subset_ids) == set(int(i) for i in full_ids[0]) + np.testing.assert_array_equal(subset_ids, full_ids[0]) + + +def test_search_asymmetric_subset_ties_use_global_row_ids(): + vectors = np.ones((12, 64), dtype=np.float32) + idx = RankQuant(dim=64, bits=2) + idx.add(vectors) + + candidates = np.array([9, 3, 7, 1], dtype=np.uint32) + scores, ids = idx.search_asymmetric_subset( + np.zeros(64, dtype=np.float32), candidates, k=2 + ) + + np.testing.assert_array_equal(ids, np.array([1, 3], dtype=np.int64)) + np.testing.assert_array_equal(scores, np.array([0.0, 0.0], dtype=np.float32)) def test_search_asymmetric_subset_k_caps_at_candidate_count(): diff --git a/ordvec-python/tests/test_sign_bitmap.py b/ordvec-python/tests/test_sign_bitmap.py index 000378b1..ba3d8d0c 100644 --- a/ordvec-python/tests/test_sign_bitmap.py +++ b/ordvec-python/tests/test_sign_bitmap.py @@ -80,10 +80,8 @@ def test_top_m_candidates_batched_shape(): def test_batched_matches_scalar_for_each_row(): - # The batched AVX-512 VPOPCNTDQ kernel must agree with the scalar - # path on the same query at top-1; we check the leading match for a - # small batch (boundary ties at deeper ranks may diverge — see the - # body-kernel tie-break follow-up, separate from sign-bitmap). + # The batched kernel must agree with the single-query path for the full + # ordered top-m row, including boundary ties. 
idx = SignBitmap(dim=128) idx.add(unit_vectors(60, 128, seed=0)) queries = unit_vectors(6, 128, seed=99) @@ -91,11 +89,7 @@ def test_batched_matches_scalar_for_each_row(): batched = idx.top_m_candidates_batched(queries, m=5) for i in range(6): scalar = idx.top_m_candidates(queries[i], m=5) - # Top-1 must agree exactly across both code paths. - assert int(batched[i, 0]) == int(scalar[0]), ( - f"batched vs scalar disagree on top-1 for query {i}: " - f"batched={batched[i, 0]} scalar={scalar[0]}" - ) + np.testing.assert_array_equal(batched[i], scalar) def test_empty_batch_returns_consistent_column_count(): diff --git a/src/quant.rs b/src/quant.rs index 0e4a2ffc..f7700433 100644 --- a/src/quant.rs +++ b/src/quant.rs @@ -536,7 +536,9 @@ impl RankQuant { /// subset (e.g., the top-M from a bitmap probe). Returns /// `(scores, indices)`: the top-`k` scores and their corresponding /// **global** doc IDs (the local candidate positions are mapped back - /// to global IDs before returning). + /// to global IDs before returning). Results are ordered by score + /// descending, then global row ID ascending, matching the full-index + /// search tie policy even when `candidates` is unsorted. /// /// Uses the same AVX-512 → AVX2 → scalar dispatch as /// [`Self::search_asymmetric`] and the same centre-drop math, just @@ -606,7 +608,7 @@ impl RankQuant { // never reaches a kernel that would drop its tail chunk. #[cfg_attr(not(target_arch = "x86_64"), allow(unused_variables))] let simd_tier = select_simd_tier(dim, bits); - let mut top = TopK::new(k_eff); + let mut top = TopK::new_with_tie_keys(k_eff, candidates); #[cfg_attr(not(target_arch = "x86_64"), allow(unused_mut))] let mut centre_drop_used = false; #[cfg(target_arch = "x86_64")] diff --git a/src/util.rs b/src/util.rs index 1684ae76..0229f72e 100644 --- a/src/util.rs +++ b/src/util.rs @@ -352,28 +352,31 @@ fn xor_popcount_simd128(doc: &[u64], q: &[u64]) -> u32 { /// partial sort. 
/// /// **Tie-break (deterministic across CPUs).** Ranking is by the -/// composite key `(score desc, doc_id asc)`: on equal scores the -/// LOWER doc_id wins, both for eviction and in the final order. SIMD -/// vs scalar f32 summation-order differences can flip genuine -/// near-ties between hosts; the composite key removes that -/// nondeterminism and matches the candidate-gen paths -/// (`top_m_candidates`) which already partition on `(score, doc_id)`. -/// The "worst kept" entry — the one evicted first — is therefore the -/// one with the lowest score and, among equal-score entries, the -/// HIGHEST doc_id. +/// composite key `(score desc, tie_key asc)`: on equal scores the lower +/// tie key wins, both for eviction and in the final order. Full-index +/// scans use `doc_id` as the tie key. Subset scans may emit local scratch +/// indices while supplying global row IDs as the tie keys. SIMD vs scalar +/// f32 summation-order differences can flip genuine near-ties between +/// hosts; the composite key removes exact-tie nondeterminism and matches +/// the candidate-gen paths (`top_m_candidates`) which already partition on +/// `(score, doc_id)`. The "worst kept" entry — the one evicted first — is +/// therefore the one with the lowest score and, among equal-score entries, +/// the highest tie key. pub(crate) struct TopK { k: usize, scores: Vec<f32>, indices: Vec<i64>, + tie_keys: Vec<i64>, + tie_key_by_index: Option<Vec<i64>>, filled: usize, - /// Slot holding the worst kept entry under `(score asc, doc_id + /// Slot holding the worst kept entry under `(score asc, tie_key /// desc)` — the next to be evicted. worst_pos: usize, /// Score of the worst kept entry. worst_val: f32, - /// doc_id of the worst kept entry (used to break score ties: - /// among equal scores the higher doc_id is worse to keep). - worst_idx: i64, + /// Tie key of the worst kept entry. Among equal scores, the higher + /// tie key is worse to keep.
+ worst_tie_key: i64, } impl TopK { @@ -382,13 +385,27 @@ impl TopK { k, scores: vec![f32::NEG_INFINITY; k], indices: vec![-1; k], + tie_keys: vec![i64::MAX; k], + tie_key_by_index: None, filled: 0, worst_pos: 0, worst_val: f32::INFINITY, - worst_idx: i64::MAX, + worst_tie_key: i64::MAX, } } + /// Construct a top-k collector whose emitted indices are local scan + /// positions but whose score ties are broken by caller-supplied keys. + /// + /// This is used by subset scans: SIMD kernels still emit local candidate + /// positions into the gathered scratch buffer, while ties must follow the + /// public global row-id policy. + pub(crate) fn new_with_tie_keys(k: usize, tie_key_by_index: &[u32]) -> Self { + let mut top = Self::new(k); + top.tie_key_by_index = Some(tie_key_by_index.iter().map(|&id| i64::from(id)).collect()); + top + } + #[inline] pub(crate) fn maybe_insert(&mut self, score: f32, idx: usize) { // Convert the doc_id to its i64 storage form once, up front. doc_ids @@ -401,52 +418,59 @@ impl TopK { // stays clippy-clean on 32-bit, where `idx <= i64::MAX as usize` would // be an always-true `absurd_extreme_comparison`. let id = i64::try_from(idx).expect("ordvec: doc_id exceeds i64::MAX"); + let tie_key = self + .tie_key_by_index + .as_ref() + .map(|keys| keys[idx]) + .unwrap_or(id); if self.filled < self.k { self.scores[self.filled] = score; self.indices[self.filled] = id; + self.tie_keys[self.filled] = tie_key; self.filled += 1; if self.filled == self.k { self.recompute_worst(); } } else { - // Replace the worst kept entry iff the incoming `(score, id)` is - // strictly better to keep under the `(score desc, doc_id asc)` - // order: a higher score, or an equal score with a lower doc_id. - // doc_ids are unique per scan, so this is a total order — the - // greedy eviction keeps exactly the top-k set under the composite - // key. 
- let better = score > self.worst_val || (score == self.worst_val && id < self.worst_idx); + // Replace the worst kept entry iff the incoming `(score, tie_key)` + // is strictly better to keep under the `(score desc, tie_key asc)` + // order: a higher score, or an equal score with a lower row key. + // Full-index scans use `doc_id` as the tie key. Subset scans use + // global row IDs while still emitting local scratch-buffer indices. + let better = + score > self.worst_val || (score == self.worst_val && tie_key < self.worst_tie_key); if better { self.scores[self.worst_pos] = score; self.indices[self.worst_pos] = id; + self.tie_keys[self.worst_pos] = tie_key; self.recompute_worst(); } } } - /// Locate the worst kept entry under `(score asc, doc_id desc)`: - /// lowest score, and among equal scores the highest doc_id. That - /// is the entry a strictly-better incoming candidate evicts. + /// Locate the worst kept entry under `(score asc, tie_key desc)`: + /// lowest score, and among equal scores the highest tie key. That is the + /// entry a strictly-better incoming candidate evicts. fn recompute_worst(&mut self) { let mut wv = f32::INFINITY; - let mut wi = i64::MIN; + let mut wt = i64::MIN; let mut wp = 0; for i in 0..self.filled { let s = self.scores[i]; - let id = self.indices[i]; - if s < wv || (s == wv && id > wi) { + let tie_key = self.tie_keys[i]; + if s < wv || (s == wv && tie_key > wt) { wv = s; - wi = id; + wt = tie_key; wp = i; } } self.worst_val = wv; - self.worst_idx = wi; + self.worst_tie_key = wt; self.worst_pos = wp; } /// Drain into `out_scores` / `out_indices` sorted by the composite - /// key `(score desc, doc_id asc)`. `out_scores.len()` is the + /// key `(score desc, tie_key asc)`. `out_scores.len()` is the /// user-requested `k`; positions beyond `self.filled` are left as /// sentinels. 
pub(crate) fn finalize_into(&self, out_scores: &mut [f32], out_indices: &mut [i64]) { @@ -457,26 +481,32 @@ impl TopK { for i in out_indices.iter_mut() { *i = -1; } - let mut pairs: Vec<(f32, i64)> = self + let mut pairs: Vec<(f32, i64, i64, usize)> = self .scores .iter() .zip(self.indices.iter()) + .zip(self.tie_keys.iter()) + .enumerate() .take(self.filled) - .map(|(&s, &i)| (s, i)) + .map(|(slot, ((&s, &i), &tie_key))| (s, i, tie_key, slot)) .collect(); - // Composite key: score descending, then doc_id ascending. The - // doc_id tie-break makes the final order deterministic when - // scores are equal. + // Composite key: score descending, then tie key ascending. The kept + // slot is only a final deterministic tie-break when duplicate + // candidate entries are otherwise indistinguishable. For full-index + // scans the tie key is the doc_id; for subset scans it is the global + // row id associated with the emitted local index. pairs.sort_unstable_by(|a, b| { // `total_cmp` is a true total order (IEEE-754 `totalOrder`), so the // sort stays well-defined even if a non-finite score ever slipped // past the finite-input guards — `partial_cmp(..).unwrap_or(Equal)` // is not a total order and can mis-sort around NaN. For the finite - // scores we actually have, the two agree. doc_id ascending breaks - // score ties (unchanged). - b.0.total_cmp(&a.0).then_with(|| a.1.cmp(&b.1)) + // scores we actually have, the two agree. The ascending tie key + // makes score ties deterministic. 
+ b.0.total_cmp(&a.0) + .then_with(|| a.2.cmp(&b.2)) + .then_with(|| a.3.cmp(&b.3)) }); - for (slot, (s, i)) in pairs.into_iter().enumerate() { + for (slot, (s, i, _, _)) in pairs.into_iter().enumerate() { if slot >= out_scores.len() { break; } @@ -533,6 +563,21 @@ mod tests { assert!(scores.is_empty() && indices.is_empty()); } + #[test] + fn topk_duplicate_candidate_ties_have_total_final_order() { + let mut top = TopK::new_with_tie_keys(2, &[7, 7, 7]); + top.maybe_insert(0.0, 0); + top.maybe_insert(0.0, 1); + top.maybe_insert(0.0, 2); + + let mut scores = [f32::NEG_INFINITY; 2]; + let mut indices = [-1; 2]; + top.finalize_into(&mut scores, &mut indices); + + assert_eq!(scores, [0.0, 0.0]); + assert_eq!(indices, [0, 1]); + } + #[test] fn checked_new_len_accepts_up_to_max() { use crate::rank_io::MAX_VECTORS; diff --git a/tests/determinism_contract.rs b/tests/determinism_contract.rs new file mode 100644 index 00000000..cddffd10 --- /dev/null +++ b/tests/determinism_contract.rs @@ -0,0 +1,142 @@ +use ordvec::{search_asymmetric_byte_lut, Bitmap, Rank, RankQuant, SignBitmap}; + +fn repeated_docs(n: usize, dim: usize, value: f32) -> Vec<f32> { + vec![value; n * dim] +} + +fn assert_ids(actual: &[i64], expected: &[i64]) { + assert_eq!(actual, expected, "ids {actual:?} != expected {expected:?}"); +} + +fn assert_u32_ids(actual: &[u32], expected: &[u32]) { + assert_eq!(actual, expected, "ids {actual:?} != expected {expected:?}"); +} + +#[test] +fn full_search_ties_return_lowest_row_ids() { + const DIM: usize = 64; + const N: usize = 8; + let docs = repeated_docs(N, DIM, 1.0); + let query = vec![1.0; DIM]; + let zero_query = vec![0.0; DIM]; + + let mut rank = Rank::new(DIM); + rank.add(&docs); + assert_ids(rank.search(&query, 4).indices_for_query(0), &[0, 1, 2, 3]); + let rank_asym = rank.search_asymmetric(&zero_query, 4); + assert_ids(rank_asym.indices_for_query(0), &[0, 1, 2, 3]); + assert!(rank_asym.scores_for_query(0).iter().all(|&s| s == 0.0)); + + let mut rankquant =
RankQuant::new(DIM, 2); + rankquant.add(&docs); + assert_ids( + rankquant.search(&query, 4).indices_for_query(0), + &[0, 1, 2, 3], + ); + let rq_asym = rankquant.search_asymmetric(&zero_query, 4); + assert_ids(rq_asym.indices_for_query(0), &[0, 1, 2, 3]); + assert!(rq_asym.scores_for_query(0).iter().all(|&s| s == 0.0)); + + let mut bitmap = Bitmap::new(DIM, DIM / 4); + bitmap.add(&docs); + let bitmap_hits = bitmap.search(&query, 4); + assert_ids(bitmap_hits.indices_for_query(0), &[0, 1, 2, 3]); + let bitmap_score = bitmap_hits.scores_for_query(0)[0]; + assert!(bitmap_hits + .scores_for_query(0) + .iter() + .all(|&s| s == bitmap_score)); +} + +#[test] +fn rankquant_dispatch_matches_scalar_reference_on_ordered_ties() { + for &dim in &[20usize, 64] { + let docs = repeated_docs(8, dim, 1.0); + let query = vec![0.0; dim]; + let mut index = RankQuant::new(dim, 2); + index.add(&docs); + + let production = index.search_asymmetric(&query, 6); + let scalar = search_asymmetric_byte_lut(&index, &query, 6); + + assert_ids(production.indices_for_query(0), &[0, 1, 2, 3, 4, 5]); + assert_eq!(production.indices, scalar.indices, "dim={dim}"); + assert_eq!(production.scores, scalar.scores, "dim={dim}"); + } +} + +#[test] +fn rankquant_subset_ties_use_global_row_ids() { + const DIM: usize = 64; + let docs = repeated_docs(12, DIM, 1.0); + let query = vec![0.0; DIM]; + let mut index = RankQuant::new(DIM, 2); + index.add(&docs); + + let (scores, ids) = index.search_asymmetric_subset(&query, &[9, 3, 7, 1], 2); + assert_eq!(scores, vec![0.0, 0.0]); + assert_ids(&ids, &[1, 3]); + + let (duplicate_scores, duplicate_ids) = index.search_asymmetric_subset(&query, &[7, 8, 7], 2); + assert_eq!(duplicate_scores, vec![0.0, 0.0]); + assert_ids(&duplicate_ids, &[7, 7]); +} + +#[test] +fn candidate_prefilters_preserve_order_across_single_and_batched_paths() { + const DIM: usize = 64; + const N: usize = 10; + let docs = repeated_docs(N, DIM, 1.0); + let query = vec![1.0; DIM]; + let queries = 
[query.clone(), query.clone()].concat(); + + let mut bitmap = Bitmap::new(DIM, DIM / 4); + bitmap.add(&docs); + let bitmap_expected = vec![0, 1, 2, 3, 4]; + assert_u32_ids(&bitmap.top_m_candidates(&query, 5), &bitmap_expected); + for row in bitmap.top_m_candidates_batched(&queries, 5) { + assert_u32_ids(&row, &bitmap_expected); + } + + let mut sign = SignBitmap::new(DIM); + sign.add(&docs); + let sign_expected = vec![0, 1, 2, 3, 4]; + assert_u32_ids(&sign.top_m_candidates(&query, 5), &sign_expected); + for row in sign.top_m_candidates_batched(&queries, 5) { + assert_u32_ids(&row, &sign_expected); + } +} + +#[test] +fn empty_and_zero_k_result_shapes_are_empty() { + const DIM: usize = 64; + let query = vec![1.0; DIM]; + + let rank = Rank::new(DIM); + let rank_empty = rank.search(&query, 10); + assert_eq!(rank_empty.k, 0); + assert!(rank_empty.scores.is_empty()); + assert!(rank_empty.indices.is_empty()); + + let rankquant = RankQuant::new(DIM, 2); + let rq_empty = rankquant.search_asymmetric(&query, 10); + assert_eq!(rq_empty.k, 0); + assert!(rq_empty.scores.is_empty()); + assert!(rq_empty.indices.is_empty()); + + let bitmap = Bitmap::new(DIM, DIM / 4); + let bitmap_empty = bitmap.search(&query, 10); + assert_eq!(bitmap_empty.k, 0); + assert!(bitmap_empty.scores.is_empty()); + assert!(bitmap_empty.indices.is_empty()); + + let sign = SignBitmap::new(DIM); + assert!(sign.top_m_candidates(&query, 10).is_empty()); + + let mut nonempty = RankQuant::new(DIM, 2); + nonempty.add(&repeated_docs(2, DIM, 1.0)); + let zero_k = nonempty.search_asymmetric(&query, 0); + assert_eq!(zero_k.k, 0); + assert!(zero_k.scores.is_empty()); + assert!(zero_k.indices.is_empty()); +} diff --git a/tests/index/quant.rs b/tests/index/quant.rs index bf99a50e..fc8e5450 100644 --- a/tests/index/quant.rs +++ b/tests/index/quant.rs @@ -164,8 +164,9 @@ fn rankquant_asymmetric_matches_reference(bits: u8) { ); } - // And the top-10 set must match (we allow tied scores to permute - // within ties — same 
set, possibly different order). + // This random reference check uses set equality to avoid overfitting a + // near-tolerance boundary. Exact score-tie ordering is pinned by + // tests/determinism_contract.rs. let mut ref_sorted: Vec<(usize, f32)> = ref_scores .iter() .enumerate() diff --git a/tests/redteam_beta.rs b/tests/redteam_beta.rs index a884d041..12d0e663 100644 --- a/tests/redteam_beta.rs +++ b/tests/redteam_beta.rs @@ -87,7 +87,9 @@ fn assert_asym_matches_byte_lut(dim: usize, bits: u8, seed: u64) { let prod_idx = prod.indices_for_query(0); let ref_idx = reference.indices_for_query(0); - // Top-k *set* must match (ties may permute within equal scores). + // This dispatch-grid red-team check uses set equality because random + // near-ties can sit inside the scalar/SIMD tolerance. Exact score-tie + // ordering is pinned by tests/determinism_contract.rs. let prod_set: std::collections::HashSet<i64> = prod_idx.iter().copied().collect(); let ref_set: std::collections::HashSet<i64> = ref_idx.iter().copied().collect(); assert_eq!(