feat(recall_check): hybrid retrieval — FTS5 + Vectorize + RRF fusion (CLA-109)#39
Conversation
…(CLA-109)
Hybrid search adds BM25-ranked lexical matching alongside the existing
semantic (cosine) search. Lexical catches exact-term hits cosine misses
(names, jargon, unique phrases); semantic catches conceptual hits
lexical misses. Together substantially better than either alone, and
the default architecture for production retrieval systems.
## Pieces
* **migrations/0007_fts5_hybrid_retrieval.sql** — standalone FTS5 virtual
table indexing content/summary/entity/tags. Insert/update/delete
triggers keep it in sync with memories. Backfill INSERT seeds the
index with pre-CLA-109 rows. Smoke-tested locally end-to-end: triggers
fire correctly on each mutation; partial content updates preserve
other indexed fields; column-scoped queries work.
* **src/hybrid.rs** (new, non-wasm-gated) — pure-logic module with
build_fts_query (free-text → safe FTS5 MATCH expression: tokenise,
trim non-alphanumeric edges, quote each token, OR-join — quoting
defends against FTS5 keywords like AND/OR/NOT/NEAR), rrf_fuse
(Reciprocal Rank Fusion with configurable per-retriever weights;
default k=60 per Cormack et al. 2009), and cosine_similarity
(shared with worker_mmr, single source of truth). 13 unit tests
covering tokenisation edge cases, RRF correctness, weight=0
degeneration, k constant effect.
* **src/worker_vectorize.rs** — adds get_by_ids binding + helper for
fetching stored vectors by id (no query, no score). Enables FTS-only
hits to participate in MMR diversity ranking.
* **src/worker_store.rs** — fts_search(db, match_expr, limit) executes
BM25-ranked queries against memories_fts, returns ids in rank order.
* **src/worker_mcp.rs** — tool_recall_check rewritten:
1. Vectorize query (existing) + FTS query in parallel-ish (sequential,
~5ms saved isn't worth the complexity).
2. FTS-only hits get vectors via get_by_ids, cosine computed,
threshold-filtered. Lexical hits no longer invisible to MMR.
3. RRF fuses the two rankings; pool narrows to fused top-(limit*3).
The narrowing is what gives fusion influence over the final
ranking — without it, MMR's cosine-relevance metric would
dominate and FTS contribution would be wasted.
4. CLA-108 metadata filters apply (unchanged).
5. MMR rerank within the narrowed pool (unchanged).
6. Header surfaces active fts_weight for transparency.
## Tunable knob
`ONEIRO_HYBRID_FTS_WEIGHT` env var (default 1.0) scales the lexical leg's
RRF contribution. Set to 0.0 to disable FTS entirely (degenerates to
pre-CLA-109 semantic-only behaviour) — kill switch in case hybrid
regresses on some query class. Set > 1.0 to over-weight lexical. Tunable
without redeploy.
## Risk / migration notes
The 0007 migration is non-destructive (additive: new table + triggers +
backfill INSERT). Existing memories get indexed in place. Rollback path
exists (DROP TRIGGER + DROP TABLE) but migrations are one-way in
practice — branch can be abandoned cheaply up to ship.
recall_orient is unchanged — it's fixed-recent + orientation, not
topic-driven, so hybrid retrieval doesn't apply.
154 cargo tests pass (was 141 + 13 new). cargo check wasm32 clean.
worker-build --release produces 28.1KB bundle.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdds hybrid retrieval: an FTS5 virtual table and triggers, hybrid primitives (cosine similarity, FTS query builder, RRF fusion), storage/vector batch APIs, and integration into recall_check to optionally fuse FTS and vector results controlled by ONEIRO_HYBRID_FTS_WEIGHT. ChangesHybrid FTS+Vector Retrieval
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/hybrid.rs (1)
45-60: 💤 Low valueConsider adding a debug assertion for vector length equality.
The doc comment states "equal-length vectors" but mismatched lengths silently compute over the truncated range via
zip(). Adebug_assert_eq!is zero-cost in release and catches integration bugs early.🛡️ Suggested defensive assertion
pub fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 { + debug_assert_eq!(a.len(), b.len(), "cosine_similarity requires equal-length vectors"); let mut dot = 0.0; let mut na = 0.0; let mut nb = 0.0;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hybrid.rs` around lines 45 - 60, The function cosine_similarity currently uses zip() which truncates mismatched slices; add a defensive debug assertion at the start of cosine_similarity to ensure equal-length vectors (e.g., debug_assert_eq!(a.len(), b.len());) so integration bugs are caught in debug builds without impacting release performance, leaving the rest of the function unchanged.src/worker_mcp.rs (1)
756-768: 💤 Low valueConsider clamping or rejecting negative FTS weights.
A negative
fts_weightwould invert the FTS contribution in RRF fusion (subtracting instead of adding), which is likely unintended. Consider clamping to0.0minimum or logging a warning.🛡️ Suggested fix
fn read_fts_weight(env: &Env) -> f64 { let raw = match env.var("ONEIRO_HYBRID_FTS_WEIGHT") { Ok(v) => v.to_string(), Err(_) => return 1.0, }; - raw.parse::<f64>().unwrap_or_else(|_| { + let weight = raw.parse::<f64>().unwrap_or_else(|_| { worker::console_error!( "ONEIRO_HYBRID_FTS_WEIGHT={:?} unparseable; using default 1.0", raw ); 1.0 - }) + }); + if weight < 0.0 { + worker::console_error!( + "ONEIRO_HYBRID_FTS_WEIGHT={} negative; clamping to 0.0", + weight + ); + return 0.0; + } + weight }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/worker_mcp.rs` around lines 756 - 768, The read_fts_weight function currently returns whatever parsed f64 from ONEIRO_HYBRID_FTS_WEIGHT (or default 1.0); change it to clamp negative values to 0.0 and emit a warning when a negative value is provided: in read_fts_weight, after parsing the raw string (the current parse::<f64>().unwrap_or_else branch), check the parsed value and if v < 0.0 call worker::console_warn! with the raw value and return 0.0, otherwise return v; keep the existing behavior of returning 1.0 when the env var is missing or unparseable but still warn when unparseable as already done.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/worker_store.rs`:
- Around line 333-335: Trim the match_expr value before checking emptiness and
before sending to D1 so whitespace-only input is treated as empty; replace uses
of match_expr.is_empty() with a trimmed check (e.g., let trimmed =
match_expr.trim(); if trimmed.is_empty() ...) and use trimmed (or reassign) for
subsequent logic and the D1 MATCH call, and apply the same change to the second
occurrence that currently checks match_expr at the other spot.
---
Nitpick comments:
In `@src/hybrid.rs`:
- Around line 45-60: The function cosine_similarity currently uses zip() which
truncates mismatched slices; add a defensive debug assertion at the start of
cosine_similarity to ensure equal-length vectors (e.g.,
debug_assert_eq!(a.len(), b.len());) so integration bugs are caught in debug
builds without impacting release performance, leaving the rest of the function
unchanged.
In `@src/worker_mcp.rs`:
- Around line 756-768: The read_fts_weight function currently returns whatever
parsed f64 from ONEIRO_HYBRID_FTS_WEIGHT (or default 1.0); change it to clamp
negative values to 0.0 and emit a warning when a negative value is provided: in
read_fts_weight, after parsing the raw string (the current
parse::<f64>().unwrap_or_else branch), check the parsed value and if v < 0.0
call worker::console_warn! with the raw value and return 0.0, otherwise return
v; keep the existing behavior of returning 1.0 when the env var is missing or
unparseable but still warn when unparseable as already done.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: cd4d1584-9278-45fc-ac69-e4e28f57110b
📒 Files selected for processing (8)
migrations/0007_fts5_hybrid_retrieval.sqlsrc/hybrid.rssrc/lib.rssrc/main.rssrc/worker_mcp.rssrc/worker_mmr.rssrc/worker_store.rssrc/worker_vectorize.rs
PR #39 review (CodeRabbit). The only current caller (build_fts_query) can't produce whitespace-only output, but fts_search is pub and a future caller might pre-construct a malformed expression. A whitespace-only MATCH would error on FTS5's syntax check rather than degenerating cleanly to empty. Shadow match_expr with its trimmed form so the empty-check and the bind both see the same canonical value.
The hybrid retrieval demo exposed the gap directly: surfacing memories
by topic works, but you can't tell which candidates have images
attached without calling recall_image speculatively. The "I saw the
photo in the bucket, surface it via search" workflow stalls at the
"which one has the bytes" step.
Adds a terse `| img` indicator to both display paths:
* format_memory (worker_orient.rs) — used by recall_orient and
recall_specific. Image-carrying rows now render as
`[episodic | 5d ago | str:0.50 | id:abc12345 | img]`.
* tool_recall_check inline render (worker_mcp.rs) — image-carrying
rows now render as `[sim:0.77 | str:1.00 | abc12345 | img]`.
Both formats only emit the suffix when m.image_hash.is_some() — empty
case is byte-identical to pre-change output, so nothing else has to
change. Reader sees `img` and knows recall_image will succeed against
the id without a speculative round-trip.
Summary
The 0.3 headline.
recall_checknow runs lexical (BM25 via SQLite FTS5) and semantic (cosine via Vectorize) searches in parallel, fuses their rankings with Reciprocal Rank Fusion, and returns the unified top-N. Lexical catches exact-term hits cosine misses (names, jargon, unique phrases); semantic catches conceptual hits lexical misses. Together substantially better than either alone — the default architecture for production retrieval systems.Pieces
migrations/0007_fts5_hybrid_retrieval.sql— standalone FTS5 virtual table indexing content/summary/entity/tags. Insert/update/delete triggers keep it in sync with memories. Backfill INSERT seeds the index with pre-CLA-109 rows.src/hybrid.rs(new, non-wasm-gated, native-testable) — pure logic module:build_fts_query— free-text → safe FTS5 MATCH expression (tokenise, trim non-alphanumeric edges, quote each token, OR-join). Quoting defends against FTS5 keywords (AND/OR/NOT/NEAR) being misinterpreted as operators.rrf_fuse— Reciprocal Rank Fusion with configurable per-retriever weights.k=60per Cormack et al. 2009.cosine_similarity— promoted from worker_mmr so both the MMR reranker and the new hybrid path use one source of truth.src/worker_vectorize.rs— addsget_by_idsbinding + helper. Lets FTS-only hits get their vectors so they can participate in MMR diversity ranking.src/worker_store.rs—fts_search(db, match_expr, limit)executes BM25-ranked queries against memories_fts.src/worker_mcp.rs—tool_recall_checkrewritten:get_by_ids, cosine computed, threshold-filtered. Lexical hits no longer invisible to MMR.Tunable knob
ONEIRO_HYBRID_FTS_WEIGHTenv var (default1.0) scales the lexical leg's RRF contribution.0.0→ disables FTS entirely (degenerates to pre-CLA-109 semantic-only behaviour). Kill switch in case hybrid regresses on some query class.1.0→ equal weighting (default, vector-baseline).> 1.0→ over-weight lexical.Tunable via
wrangler secret putwithout redeploy or schema change.Migration / risk
The 0007 migration is non-destructive (additive: new virtual table + triggers + backfill INSERT). Existing memories get indexed in place. Smoke-tested locally end-to-end against a fresh SQLite copy of the schema:
entity:chopper) ✓The migration is the one piece that's annoying to undo — but a rollback path exists (DROP TRIGGER + DROP TABLE). Branch can be abandoned cheaply up to ship; once
wrangler d1 migrations applyruns in prod, we're committed.recall_orientis unchanged — it's fixed-recent + orientation, not topic-driven, so hybrid doesn't apply.Test plan
cargo test— 154 pass (was 141; +13 newhybrid::testscovering tokenisation edge cases, RRF correctness, weight=0 degeneration, k constant effect)cargo check --target wasm32-unknown-unknown --libcleanworker-build --releaseproduces wasm bundle (28.1KB)wrangler secret put ONEIRO_HYBRID_FTS_WEIGHT→0→ recall_check behaves identically to pre-CLA-109? Then set back to1.0.Closes CLA-109. Unblocks CLA-110 (cross-encoder reranker on top of the hybrid pool).
Summary by CodeRabbit
New Features
Database
UI / Output
Refactor