feat(recall_check): hybrid retrieval — FTS5 + Vectorize + RRF fusion (CLA-109) by JuzzyDee · Pull Request #39 · JuzzyDee/oneiro

JuzzyDee · 2026-05-23T00:57:19Z

Summary

The 0.3 headline. recall_check now runs both a lexical (BM25 via SQLite FTS5) and a semantic (cosine via Vectorize) search for each query, fuses their rankings with Reciprocal Rank Fusion, and returns the unified top-N. Lexical catches exact-term hits cosine misses (names, jargon, unique phrases); semantic catches conceptual hits lexical misses. Together they are substantially better than either alone, and this is the default architecture for production retrieval systems.

Pieces

migrations/0007_fts5_hybrid_retrieval.sql — standalone FTS5 virtual table indexing content/summary/entity/tags. Insert/update/delete triggers keep it in sync with memories. Backfill INSERT seeds the index with pre-CLA-109 rows.
src/hybrid.rs (new, non-wasm-gated, native-testable) — pure logic module:
- build_fts_query — free-text → safe FTS5 MATCH expression (tokenise, trim non-alphanumeric edges, quote each token, OR-join). Quoting defends against FTS5 keywords (AND/OR/NOT/NEAR) being misinterpreted as operators.
- rrf_fuse — Reciprocal Rank Fusion with configurable per-retriever weights. k=60 per Cormack et al. 2009.
- cosine_similarity — promoted from worker_mmr so both the MMR reranker and the new hybrid path use one source of truth.
src/worker_vectorize.rs — adds get_by_ids binding + helper. Lets FTS-only hits get their vectors so they can participate in MMR diversity ranking.
src/worker_store.rs — fts_search(db, match_expr, limit) executes BM25-ranked queries against memories_fts.
src/worker_mcp.rs — tool_recall_check rewritten:
1. Vectorize query (existing) + FTS query.
2. FTS-only hits get vectors via get_by_ids, cosine computed, threshold-filtered. Lexical hits no longer invisible to MMR.
3. RRF fuses the two rankings; pool narrows to fused top-(limit×3) — the narrowing is what gives fusion influence over the final ranking.
4. CLA-108 metadata filters apply (unchanged).
5. MMR rerank within the narrowed pool (unchanged).
6. Header surfaces active fts_weight for transparency.
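
As an illustration of the build_fts_query step described above (tokenise, trim non-alphanumeric edges, quote each token, OR-join), here is a minimal sketch. The signature, the whitespace tokenisation, and the doubling of embedded quotes are assumptions made for the example; the actual function in src/hybrid.rs may differ.

```rust
/// Sketch only: turns free text into an FTS5 MATCH expression.
/// The signature and the exact tokenisation rules are assumptions,
/// not the code from src/hybrid.rs.
pub fn build_fts_query(input: &str) -> String {
    input
        .split_whitespace()
        // Trim punctuation from both edges of each token ("vet," -> "vet").
        .map(|tok| tok.trim_matches(|c: char| !c.is_alphanumeric()))
        // Tokens that were pure punctuation are now empty; drop them.
        .filter(|tok| !tok.is_empty())
        // Double any embedded quote, then wrap the token in quotes so that
        // FTS5 reads AND/OR/NOT/NEAR as plain strings, not operators.
        .map(|tok| format!("\"{}\"", tok.replace('"', "\"\"")))
        .collect::<Vec<_>>()
        .join(" OR ")
}
```

Under these assumptions, the input `hello, NOT world` becomes `"hello" OR "NOT" OR "world"`, and input with no usable tokens yields an empty string that the caller can treat as "skip the FTS leg".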

Tunable knob

ONEIRO_HYBRID_FTS_WEIGHT env var (default 1.0) scales the lexical leg's RRF contribution.

0.0 → disables FTS entirely (degenerates to pre-CLA-109 semantic-only behaviour). Kill switch in case hybrid regresses on some query class.
1.0 → equal weighting (default, vector-baseline).
> 1.0 → over-weight lexical.

Tunable via wrangler secret put without redeploy or schema change.
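
To make the effect of the weight concrete, here is a hedged sketch of weighted Reciprocal Rank Fusion over two ranked id lists. Each list contributes weight / (k + rank) per id, with 1-based ranks. The signature, the fixed vector weight of 1.0, the f64 type of the constant, and the tie-break by id are assumptions for illustration; only the name DEFAULT_RRF_K and k=60 come from the PR.

```rust
use std::collections::HashMap;

/// k = 60, as in the PR description. The f64 type is an assumption.
pub const DEFAULT_RRF_K: f64 = 60.0;

/// Sketch only: fuses a vector ranking and an FTS ranking.
/// The vector leg has an implicit weight of 1.0 (assumption).
pub fn rrf_fuse(
    vector_ids: &[&str],
    fts_ids: &[&str],
    fts_weight: f64,
    k: f64,
) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (i, id) in vector_ids.iter().enumerate() {
        // Ranks are 1-based, hence i + 1.
        *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + (i + 1) as f64);
    }
    // A weight of 0.0 skips the lexical leg entirely, so FTS-only ids
    // never enter the pool and the result is the vector ranking.
    if fts_weight > 0.0 {
        for (i, id) in fts_ids.iter().enumerate() {
            *scores.entry(id.to_string()).or_insert(0.0) += fts_weight / (k + (i + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id so output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

With vector ranking a, b, c and FTS ranking c, d, a weight of 1.0 puts c first because it appears in both lists, while a weight of 0.0 returns a, b, c unchanged. Narrowing to the fused top-(limit×3) would then be a truncate on the returned vector.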

Migration / risk

The 0007 migration is non-destructive (additive: new virtual table + triggers + backfill INSERT). Existing memories get indexed in place. Smoke-tested locally end-to-end against a fresh SQLite copy of the schema:

Triggers fire on insert/update/delete ✓
Partial content updates preserve other indexed fields ✓
Column-scoped queries work (entity:chopper) ✓
Multi-token OR queries return ranked merged hits ✓

The migration is the one piece that's annoying to undo — but a rollback path exists (DROP TRIGGER + DROP TABLE). Branch can be abandoned cheaply up to ship; once wrangler d1 migrations apply runs in prod, we're committed.

recall_orient is unchanged — it's fixed-recent + orientation, not topic-driven, so hybrid doesn't apply.

Test plan

cargo test — 154 pass (was 141; +13 new hybrid::tests covering tokenisation edge cases, RRF correctness, weight=0 degeneration, k constant effect)
cargo check --target wasm32-unknown-unknown --lib clean
worker-build --release produces wasm bundle (28.1KB)
Migration end-to-end test: applied against fresh local SQLite, inserted/updated/deleted test rows, verified triggers + backfill + BM25 query work
Post-merge smoke test on live worker:
- Exact-name query ("Chopper") — does the named memory now rank top?
- Jargon query ("Hebbian") — does the specific term hit?
- Conceptual query ("how do we handle errors") — no regression vs vector-only?
- Kill switch: set ONEIRO_HYBRID_FTS_WEIGHT to 0 via wrangler secret put. Does recall_check behave identically to pre-CLA-109? Then set it back to 1.0.

Closes CLA-109. Unblocks CLA-110 (cross-encoder reranker on top of the hybrid pool).

Summary by CodeRabbit

New Features
- Hybrid search combining full-text and vector retrieval with configurable FTS weight
- Reciprocal Rank Fusion merges FTS and vector results for improved ranking
- FTS-only hits are bridged into the vector pipeline to increase recall
Database
- Added full-text search index for memories and backfilled existing data
UI / Output
- Results show per-item similarity scores and a terse "| img" indicator for items with images
Refactor
- Shared cosine-similarity logic unified across retrieval and reranking

…(CLA-109)

Hybrid search adds BM25-ranked lexical matching alongside the existing semantic (cosine) search. Lexical catches exact-term hits cosine misses (names, jargon, unique phrases); semantic catches conceptual hits lexical misses. Together substantially better than either alone, and the default architecture for production retrieval systems.

## Pieces

* **migrations/0007_fts5_hybrid_retrieval.sql**: standalone FTS5 virtual table indexing content/summary/entity/tags. Insert/update/delete triggers keep it in sync with memories. Backfill INSERT seeds the index with pre-CLA-109 rows. Smoke-tested locally end-to-end: triggers fire correctly on each mutation; partial content updates preserve other indexed fields; column-scoped queries work.
* **src/hybrid.rs** (new, non-wasm-gated): pure-logic module with:
  * build_fts_query (free-text → safe FTS5 MATCH expression: tokenise, trim non-alphanumeric edges, quote each token, OR-join; quoting defends against FTS5 keywords like AND/OR/NOT/NEAR)
  * rrf_fuse (Reciprocal Rank Fusion with configurable per-retriever weights; default k=60 per Cormack et al. 2009)
  * cosine_similarity (shared with worker_mmr, single source of truth)

  13 unit tests covering tokenisation edge cases, RRF correctness, weight=0 degeneration, k constant effect.
* **src/worker_vectorize.rs**: adds get_by_ids binding + helper for fetching stored vectors by id (no query, no score). Enables FTS-only hits to participate in MMR diversity ranking.
* **src/worker_store.rs**: fts_search(db, match_expr, limit) executes BM25-ranked queries against memories_fts, returns ids in rank order.
* **src/worker_mcp.rs**: tool_recall_check rewritten:
  1. Vectorize query (existing) + FTS query in parallel-ish (sequential, ~5ms saved isn't worth the complexity).
  2. FTS-only hits get vectors via get_by_ids, cosine computed, threshold-filtered. Lexical hits no longer invisible to MMR.
  3. RRF fuses the two rankings; pool narrows to fused top-(limit*3). The narrowing is what gives fusion influence over the final ranking: without it, MMR's cosine-relevance metric would dominate and FTS contribution would be wasted.
  4. CLA-108 metadata filters apply (unchanged).
  5. MMR rerank within the narrowed pool (unchanged).
  6. Header surfaces active fts_weight for transparency.

## Tunable knob

`ONEIRO_HYBRID_FTS_WEIGHT` env var (default 1.0) scales the lexical leg's RRF contribution. Set to 0.0 to disable FTS entirely (degenerates to pre-CLA-109 semantic-only behaviour); this is the kill switch in case hybrid regresses on some query class. Set > 1.0 to over-weight lexical. Tunable without redeploy.

## Risk / migration notes

The 0007 migration is non-destructive (additive: new table + triggers + backfill INSERT). Existing memories get indexed in place. Rollback path exists (DROP TRIGGER + DROP TABLE) but migrations are one-way in practice; the branch can be abandoned cheaply up to ship.

recall_orient is unchanged: it's fixed-recent + orientation, not topic-driven, so hybrid retrieval doesn't apply.

154 cargo tests pass (was 141 + 13 new). cargo check wasm32 clean. worker-build --release produces 28.1KB bundle.

coderabbitai · 2026-05-23T00:57:29Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 24bab93f-0607-4fad-8fc6-88182dbb73cc

📥 Commits

Reviewing files that changed from the base of the PR and between 343dc81 and f0e0cda.

📒 Files selected for processing (2)

src/worker_mcp.rs
src/worker_orient.rs

📝 Walkthrough

Walkthrough

Adds hybrid retrieval: an FTS5 virtual table and triggers, hybrid primitives (cosine similarity, FTS query builder, RRF fusion), storage/vector batch APIs, and integration into recall_check to optionally fuse FTS and vector results controlled by ONEIRO_HYBRID_FTS_WEIGHT.

Changes

Hybrid FTS+Vector Retrieval

Layer / File(s)	Summary
FTS5 Database Schema & Synchronization `migrations/0007_fts5_hybrid_retrieval.sql`	SQLite migration adds `memories_fts` virtual table indexing content, summary, entity, and tags; creates INSERT/UPDATE/DELETE triggers to keep FTS table synchronized; backfills existing memories with NULL-safe coalescing.
Hybrid Retrieval Primitives `src/hybrid.rs`	Core algorithms for hybrid retrieval: `DEFAULT_RRF_K` constant, `cosine_similarity` with zero-denominator handling, `build_fts_query` for tokenizing and quoting FTS expressions, `rrf_fuse` for reciprocal rank fusion with configurable FTS weight, and unit tests validating edge cases and fusion behavior.
Module Infrastructure `src/lib.rs`, `src/main.rs`	Rust module declarations in `lib.rs` and `main.rs` enable compilation and use of the `hybrid` module throughout the codebase.
Storage & Vector Batch APIs `src/worker_store.rs`, `src/worker_vectorize.rs`	`fts_search` in `worker_store` queries the `memories_fts` virtual table with BM25 ranking and returns ids; `worker_vectorize` adds `getByIds` FFI binding, `StoredVector` struct, and `get_by_ids` function for bulk vector lookups by id.
Shared Cosine Similarity `src/worker_mmr.rs`	`worker_mmr` replaces its local `cosine_similarity` implementation with an import from `crate::hybrid`, consolidating similarity scoring logic into the shared module.
Hybrid recall_check Integration `src/worker_mcp.rs`, `src/worker_orient.rs`	`tool_recall_check` gains optional hybrid retrieval: environment variable `ONEIRO_HYBRID_FTS_WEIGHT` enables FTS leg, bridges FTS-only results via vector batch lookup and cosine filtering, fuses rankings with RRF to narrow the candidate pool, applies metadata filters and MMR reranking on the fused pool, and extends output headers with hybrid info and per-memory similarity scores; memory formatting adds an `

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

JuzzyDee/oneiro#38: Overlaps with changes to src/worker_mcp.rs's tool_recall_check retrieval and filtering pipeline; both modify candidate selection and reranking.

Poem

🐰 I hopped through tokens, vectors, and rhyme,

I quoted the query, sanitized in time,
I fused the ranks with reciprocal cheer,
Now memories leap when the rabbit draws near,
A hop, a match, a cosine-sparked rhyme.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically summarizes the main change: implementing hybrid retrieval combining FTS5, Vectorize, and RRF fusion for the recall_check feature.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

src/hybrid.rs (1)

45-60: 💤 Low value

Consider adding a debug assertion for vector length equality.

The doc comment states "equal-length vectors" but mismatched lengths silently compute over the truncated range via zip(). A debug_assert_eq! is zero-cost in release and catches integration bugs early.
🛡️ Suggested defensive assertion
 pub fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
+    debug_assert_eq!(a.len(), b.len(), "cosine_similarity requires equal-length vectors");
     let mut dot = 0.0;
     let mut na = 0.0;
     let mut nb = 0.0;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hybrid.rs` around lines 45 - 60, The function cosine_similarity currently
uses zip() which truncates mismatched slices; add a defensive debug assertion at
the start of cosine_similarity to ensure equal-length vectors (e.g.,
debug_assert_eq!(a.len(), b.len());) so integration bugs are caught in debug
builds without impacting release performance, leaving the rest of the function
unchanged.
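
For reference, a complete version of the function with the suggested assertion might look like the sketch below. Returning 0.0 when either vector has zero norm is an assumption about what "zero-denominator handling" means here; the body in src/hybrid.rs may differ.

```rust
/// Sketch only: cosine similarity over equal-length vectors.
/// Returning 0.0 for a zero-norm input is an assumption, not
/// necessarily what src/hybrid.rs does.
pub fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    // Compiled out in release builds; catches mismatched inputs in debug.
    debug_assert_eq!(a.len(), b.len(), "cosine_similarity requires equal-length vectors");
    let mut dot = 0.0;
    let mut na = 0.0;
    let mut nb = 0.0;
    // zip() stops at the shorter slice, which is why the assertion matters.
    for (x, y) in a.iter().zip(b.iter()) {
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    let denom = na.sqrt() * nb.sqrt();
    if denom == 0.0 {
        0.0
    } else {
        dot / denom
    }
}
```

Identical directions score 1.0, orthogonal vectors score 0.0, and an all-zero vector falls through the guard instead of producing NaN.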

src/worker_mcp.rs (1)

756-768: 💤 Low value

Consider clamping or rejecting negative FTS weights.

A negative fts_weight would invert the FTS contribution in RRF fusion (subtracting instead of adding), which is likely unintended. Consider clamping to 0.0 minimum or logging a warning.

🛡️ Suggested fix

 fn read_fts_weight(env: &Env) -> f64 {
     let raw = match env.var("ONEIRO_HYBRID_FTS_WEIGHT") {
         Ok(v) => v.to_string(),
         Err(_) => return 1.0,
     };
-    raw.parse::<f64>().unwrap_or_else(|_| {
+    let weight = raw.parse::<f64>().unwrap_or_else(|_| {
         worker::console_error!(
             "ONEIRO_HYBRID_FTS_WEIGHT={:?} unparseable; using default 1.0",
             raw
         );
         1.0
-    })
+    });
+    if weight < 0.0 {
+        worker::console_error!(
+            "ONEIRO_HYBRID_FTS_WEIGHT={} negative; clamping to 0.0",
+            weight
+        );
+        return 0.0;
+    }
+    weight
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/worker_mcp.rs` around lines 756 - 768, The read_fts_weight function
currently returns whatever parsed f64 from ONEIRO_HYBRID_FTS_WEIGHT (or default
1.0); change it to clamp negative values to 0.0 and emit a warning when a
negative value is provided: in read_fts_weight, after parsing the raw string
(the current parse::<f64>().unwrap_or_else branch), check the parsed value and
if v < 0.0 call worker::console_warn! with the raw value and return 0.0,
otherwise return v; keep the existing behavior of returning 1.0 when the env var
is missing or unparseable but still warn when unparseable as already done.
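
One way to make the clamping rule testable without a Worker environment is to separate the parsing from the env lookup. The helper below is hypothetical (parse_fts_weight is not a name from the PR) and omits logging. Treating NaN or infinite values like unparseable input is an extra assumption that the review comment does not mention.

```rust
/// Hypothetical pure helper, not part of the PR: `raw` is the value of
/// ONEIRO_HYBRID_FTS_WEIGHT, or None when the variable is unset.
pub fn parse_fts_weight(raw: Option<&str>) -> f64 {
    const DEFAULT_WEIGHT: f64 = 1.0;
    let raw = match raw {
        Some(v) => v,
        None => return DEFAULT_WEIGHT, // unset: use the default
    };
    match raw.parse::<f64>() {
        // "NaN" and "inf" parse successfully in Rust; falling back to the
        // default for them is an assumption added in this sketch.
        Ok(w) if !w.is_finite() => DEFAULT_WEIGHT,
        // A negative weight would subtract the FTS contribution; clamp it.
        Ok(w) if w < 0.0 => 0.0,
        Ok(w) => w,
        // Unparseable: keep the existing behaviour of defaulting to 1.0.
        Err(_) => DEFAULT_WEIGHT,
    }
}
```

A read_fts_weight wrapper could then pass the env var through this helper and emit its warnings based on the raw value.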

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/worker_store.rs`:
- Around line 333-335: Trim the match_expr value before checking emptiness and
before sending to D1 so whitespace-only input is treated as empty; replace uses
of match_expr.is_empty() with a trimmed check (e.g., let trimmed =
match_expr.trim(); if trimmed.is_empty() ...) and use trimmed (or reassign) for
subsequent logic and the D1 MATCH call, and apply the same change to the second
occurrence that currently checks match_expr at the other spot.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: cd4d1584-9278-45fc-ac69-e4e28f57110b

📥 Commits

Reviewing files that changed from the base of the PR and between c4d8423 and 6cf8387.

📒 Files selected for processing (8)

migrations/0007_fts5_hybrid_retrieval.sql
src/hybrid.rs
src/lib.rs
src/main.rs
src/worker_mcp.rs
src/worker_mmr.rs
src/worker_store.rs
src/worker_vectorize.rs

PR #39 review (CodeRabbit). The only current caller (build_fts_query) can't produce whitespace-only output, but fts_search is pub and a future caller might pre-construct a malformed expression. A whitespace-only MATCH would error on FTS5's syntax check rather than degenerating cleanly to empty. Shadow match_expr with its trimmed form so the empty-check and the bind both see the same canonical value.
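
The shadowing described above can be sketched as a small pure helper. canonical_match_expr is a hypothetical name used for illustration; in the PR the trim happens inside fts_search itself, ahead of the D1 bind.

```rust
/// Hypothetical helper, not from the PR: returns the trimmed MATCH
/// expression, or None when nothing is left to search for.
pub fn canonical_match_expr(match_expr: &str) -> Option<&str> {
    // Shadow the parameter so the empty-check and the later bind
    // both see the same trimmed value.
    let match_expr = match_expr.trim();
    if match_expr.is_empty() {
        None // caller returns an empty result instead of querying FTS5
    } else {
        Some(match_expr)
    }
}
```

A whitespace-only input then degenerates to an empty result set rather than reaching FTS5's syntax check.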

The hybrid retrieval demo exposed the gap directly: surfacing memories by topic works, but you can't tell which candidates have images attached without calling recall_image speculatively. The "I saw the photo in the bucket, surface it via search" workflow stalls at the "which one has the bytes" step.

Adds a terse `| img` indicator to both display paths:

* format_memory (worker_orient.rs): used by recall_orient and recall_specific. Image-carrying rows now render as `[episodic | 5d ago | str:0.50 | id:abc12345 | img]`.
* tool_recall_check inline render (worker_mcp.rs): image-carrying rows now render as `[sim:0.77 | str:1.00 | abc12345 | img]`.

Both formats only emit the suffix when m.image_hash.is_some(). The empty case is byte-identical to pre-change output, so nothing else has to change. Reader sees `img` and knows recall_image will succeed against the id without a speculative round-trip.
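
A minimal sketch of the conditional suffix for the recall_check render follows. The function name and parameters are hypothetical; only the output format and the image_hash.is_some() condition are taken from the commit message.

```rust
/// Hypothetical renderer, not from the PR: builds the bracketed header
/// for one recall_check row.
pub fn recall_check_header(
    sim: f64,
    strength: f64,
    short_id: &str,
    image_hash: Option<&str>,
) -> String {
    // Emit the suffix only for image-carrying rows, so rows without an
    // image render exactly as they did before the change.
    let img = if image_hash.is_some() { " | img" } else { "" };
    format!("[sim:{:.2} | str:{:.2} | {}{}]", sim, strength, short_id, img)
}
```

Keeping the suffix as an empty string in the no-image case is what makes the old output byte-identical.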

coderabbitai Bot reviewed May 23, 2026

Comment thread src/worker_store.rs

Justin Davis added 2 commits May 23, 2026 11:20

JuzzyDee merged commit 6d1ca11 into dev May 23, 2026
6 checks passed

JuzzyDee deleted the juzzydee/cla-109-hybrid-retrieval-d1-fts5-vectorize-score-fusion-for branch May 23, 2026 01:50

JuzzyDee merged 3 commits into dev from juzzydee/cla-109-hybrid-retrieval-d1-fts5-vectorize-score-fusion-for
