feat(recall_check): hybrid retrieval — FTS5 + Vectorize + RRF fusion (CLA-109)#39

Merged
JuzzyDee merged 3 commits into dev from juzzydee/cla-109-hybrid-retrieval-d1-fts5-vectorize-score-fusion-for
May 23, 2026

@JuzzyDee JuzzyDee commented May 23, 2026

Summary

The 0.3 headline. recall_check now runs both a lexical search (BM25 via SQLite FTS5) and a semantic search (cosine via Vectorize), fuses their rankings with Reciprocal Rank Fusion, and returns the unified top-N. Lexical catches exact-term hits cosine misses (names, jargon, unique phrases); semantic catches conceptual hits lexical misses. Together they are substantially better than either alone, which is why hybrid retrieval is a common default architecture for production retrieval systems.

Pieces

  • migrations/0007_fts5_hybrid_retrieval.sql — standalone FTS5 virtual table indexing content/summary/entity/tags. Insert/update/delete triggers keep it in sync with memories. Backfill INSERT seeds the index with pre-CLA-109 rows.
  • src/hybrid.rs (new, non-wasm-gated, native-testable) — pure logic module:
    • build_fts_query — free-text → safe FTS5 MATCH expression (tokenise, trim non-alphanumeric edges, quote each token, OR-join). Quoting defends against FTS5 keywords (AND/OR/NOT/NEAR) being misinterpreted as operators.
    • rrf_fuse — Reciprocal Rank Fusion with configurable per-retriever weights. k=60 per Cormack et al. 2009.
    • cosine_similarity — promoted from worker_mmr so both the MMR reranker and the new hybrid path use one source of truth.
  • src/worker_vectorize.rs — adds get_by_ids binding + helper. Lets FTS-only hits get their vectors so they can participate in MMR diversity ranking.
  • src/worker_store.rs: fts_search(db, match_expr, limit) executes BM25-ranked queries against memories_fts.
  • src/worker_mcp.rs: tool_recall_check rewritten:
    1. Vectorize query (existing) + FTS query.
    2. FTS-only hits get vectors via get_by_ids, cosine computed, threshold-filtered. Lexical hits no longer invisible to MMR.
    3. RRF fuses the two rankings; pool narrows to fused top-(limit×3) — the narrowing is what gives fusion influence over the final ranking.
    4. CLA-108 metadata filters apply (unchanged).
    5. MMR rerank within the narrowed pool (unchanged).
    6. Header surfaces active fts_weight for transparency.
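
The query-sanitising step described for build_fts_query can be sketched as below. This is a minimal illustration written from the description above, not the code in src/hybrid.rs: the signature, the whitespace tokenisation, and the quote-doubling escape are assumptions.

```rust
/// Hypothetical sketch of a free-text -> FTS5 MATCH builder. The real
/// `build_fts_query` in src/hybrid.rs may differ in signature and detail.
fn build_fts_query(input: &str) -> String {
    input
        // Tokenise on whitespace (assumed tokenisation rule).
        .split_whitespace()
        // Trim non-alphanumeric characters from both edges of each token.
        .map(|tok| tok.trim_matches(|c: char| !c.is_alphanumeric()))
        // Drop tokens that were pure punctuation.
        .filter(|tok| !tok.is_empty())
        // Quote each token so AND/OR/NOT/NEAR are matched as plain terms.
        // An embedded double quote is escaped by doubling it, as FTS5
        // string syntax allows.
        .map(|tok| format!("\"{}\"", tok.replace('"', "\"\"")))
        .collect::<Vec<_>>()
        // OR-join so any single token can produce a hit.
        .join(" OR ")
}
```

Under these assumptions, build_fts_query("Chopper, NOT the (helicopter)!") yields `"Chopper" OR "NOT" OR "the" OR "helicopter"`, and punctuation-only input yields an empty string that the caller can treat as "skip the FTS leg".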

Tunable knob

ONEIRO_HYBRID_FTS_WEIGHT env var (default 1.0) scales the lexical leg's RRF contribution.

  • 0.0 → disables FTS entirely (degenerates to pre-CLA-109 semantic-only behaviour). Kill switch in case hybrid regresses on some query class.
  • 1.0 → equal weighting (default, vector-baseline).
  • > 1.0 → over-weight lexical.

Tunable via wrangler secret put without redeploy or schema change.
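
For intuition about what the weight does, here is the weighted RRF score in its standard form. Fixing the vector leg's weight at 1 and writing the ranks as r_vec and r_fts are assumptions made for illustration; the PR only states that per-retriever weights are configurable and that k = 60.

```latex
\[
\mathrm{score}(d) \;=\; \frac{1}{k + r_{\mathrm{vec}}(d)} \;+\; \frac{w_{\mathrm{fts}}}{k + r_{\mathrm{fts}}(d)},
\qquad k = 60
\]
% A term is dropped when d does not appear in that retriever's ranking.
% Example: d is ranked 5th by the vector leg and 1st by the FTS leg.
\[
w_{\mathrm{fts}} = 0:\ \tfrac{1}{65} \approx 0.0154
\qquad
w_{\mathrm{fts}} = 1:\ \tfrac{1}{65} + \tfrac{1}{61} \approx 0.0318
\qquad
w_{\mathrm{fts}} = 2:\ \tfrac{1}{65} + \tfrac{2}{61} \approx 0.0482
\]
```

With the weight at 0 the lexical term vanishes and the ordering is determined by the vector ranks alone, which matches the kill-switch behaviour described above.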

Migration / risk

The 0007 migration is non-destructive (additive: new virtual table + triggers + backfill INSERT). Existing memories get indexed in place. Smoke-tested locally end-to-end against a fresh SQLite copy of the schema:

  • Triggers fire on insert/update/delete ✓
  • Partial content updates preserve other indexed fields ✓
  • Column-scoped queries work (entity:chopper) ✓
  • Multi-token OR queries return ranked merged hits ✓

The migration is the one piece that's annoying to undo — but a rollback path exists (DROP TRIGGER + DROP TABLE). Branch can be abandoned cheaply up to ship; once wrangler d1 migrations apply runs in prod, we're committed.

recall_orient is unchanged — it's fixed-recent + orientation, not topic-driven, so hybrid doesn't apply.

Test plan

  • cargo test — 154 pass (was 141; +13 new hybrid::tests covering tokenisation edge cases, RRF correctness, weight=0 degeneration, k constant effect)
  • cargo check --target wasm32-unknown-unknown --lib clean
  • worker-build --release produces wasm bundle (28.1KB)
  • Migration end-to-end test: applied against fresh local SQLite, inserted/updated/deleted test rows, verified triggers + backfill + BM25 query work
  • Post-merge smoke test on live worker:
    • Exact-name query ("Chopper") — does the named memory now rank top?
    • Jargon query ("Hebbian") — does the specific term hit?
    • Conceptual query ("how do we handle errors") — no regression vs vector-only?
    • Kill switch: set ONEIRO_HYBRID_FTS_WEIGHT to 0 via wrangler secret put → recall_check behaves identically to pre-CLA-109? Then set back to 1.0.

Closes CLA-109. Unblocks CLA-110 (cross-encoder reranker on top of the hybrid pool).

Summary by CodeRabbit

  • New Features

    • Hybrid search combining full-text and vector retrieval with configurable FTS weight
    • Reciprocal Rank Fusion merges FTS and vector results for improved ranking
    • FTS-only hits are bridged into the vector pipeline to increase recall
  • Database

    • Added full-text search index for memories and backfilled existing data
  • UI / Output

    • Results show per-item similarity scores and a terse "| img" indicator for items with images
  • Refactor

    • Shared cosine-similarity logic unified across retrieval and reranking

…(CLA-109)

Hybrid search adds BM25-ranked lexical matching alongside the existing
semantic (cosine) search. Lexical catches exact-term hits cosine misses
(names, jargon, unique phrases); semantic catches conceptual hits
lexical misses. Together substantially better than either alone, and
the default architecture for production retrieval systems.

## Pieces

* **migrations/0007_fts5_hybrid_retrieval.sql** — standalone FTS5 virtual
  table indexing content/summary/entity/tags. Insert/update/delete
  triggers keep it in sync with memories. Backfill INSERT seeds the
  index with pre-CLA-109 rows. Smoke-tested locally end-to-end: triggers
  fire correctly on each mutation; partial content updates preserve
  other indexed fields; column-scoped queries work.

* **src/hybrid.rs** (new, non-wasm-gated) — pure-logic module with
  build_fts_query (free-text → safe FTS5 MATCH expression: tokenise,
  trim non-alphanumeric edges, quote each token, OR-join — quoting
  defends against FTS5 keywords like AND/OR/NOT/NEAR), rrf_fuse
  (Reciprocal Rank Fusion with configurable per-retriever weights;
  default k=60 per Cormack et al. 2009), and cosine_similarity
  (shared with worker_mmr, single source of truth). 13 unit tests
  covering tokenisation edge cases, RRF correctness, weight=0
  degeneration, k constant effect.

* **src/worker_vectorize.rs** — adds get_by_ids binding + helper for
  fetching stored vectors by id (no query, no score). Enables FTS-only
  hits to participate in MMR diversity ranking.

* **src/worker_store.rs** — fts_search(db, match_expr, limit) executes
  BM25-ranked queries against memories_fts, returns ids in rank order.

* **src/worker_mcp.rs** — tool_recall_check rewritten:
  1. Vectorize query (existing) + FTS query, run sequentially (the ~5ms
     that true parallelism would save isn't worth the complexity).
  2. FTS-only hits get vectors via get_by_ids, cosine computed,
     threshold-filtered. Lexical hits no longer invisible to MMR.
  3. RRF fuses the two rankings; pool narrows to fused top-(limit*3).
     The narrowing is what gives fusion influence over the final
     ranking — without it, MMR's cosine-relevance metric would
     dominate and FTS contribution would be wasted.
  4. CLA-108 metadata filters apply (unchanged).
  5. MMR rerank within the narrowed pool (unchanged).
  6. Header surfaces active fts_weight for transparency.
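
Steps 3 and 5 above depend on the fused ranking and the narrowed pool. A rough sketch of both follows, assuming id lists ordered best-first, a vector-leg weight fixed at 1.0, and a leg with non-positive weight being skipped outright. The signatures and the narrow_pool helper are hypothetical and are not taken from src/hybrid.rs.

```rust
use std::collections::HashMap;

/// RRF constant; the PR cites k = 60 (Cormack et al. 2009).
const DEFAULT_RRF_K: f64 = 60.0;

/// Hypothetical sketch: fuse two ranked id lists (best first) with weighted
/// Reciprocal Rank Fusion. The vector leg's weight of 1.0 is an assumption.
fn rrf_fuse(vector_ids: &[&str], fts_ids: &[&str], fts_weight: f64, k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (weight, ranking) in [(1.0, vector_ids), (fts_weight, fts_ids)] {
        // Assumed policy: a leg with weight <= 0 contributes nothing at all,
        // so weight 0 degenerates to the vector-only ranking.
        if weight <= 0.0 {
            continue;
        }
        for (i, id) in ranking.iter().enumerate() {
            // Ranks are 1-based: the top hit contributes weight / (k + 1).
            *scores.entry(id.to_string()).or_insert(0.0) += weight / (k + (i as f64 + 1.0));
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

/// Keep the fused top-(limit * 3) ids as the candidate pool handed to MMR.
fn narrow_pool(fused: &[(String, f64)], limit: usize) -> Vec<String> {
    fused.iter().take(limit * 3).map(|(id, _)| id.clone()).collect()
}
```

In this sketch an id that appears in both lists accumulates two terms, so it outranks ids that only one retriever surfaced. Cutting the pool to limit * 3 before MMR is what lets that fused order affect which candidates MMR ever sees.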

## Tunable knob

`ONEIRO_HYBRID_FTS_WEIGHT` env var (default 1.0) scales the lexical leg's
RRF contribution. Set to 0.0 to disable FTS entirely (degenerates to
pre-CLA-109 semantic-only behaviour) — kill switch in case hybrid
regresses on some query class. Set > 1.0 to over-weight lexical. Tunable
without redeploy.

## Risk / migration notes

The 0007 migration is non-destructive (additive: new table + triggers +
backfill INSERT). Existing memories get indexed in place. Rollback path
exists (DROP TRIGGER + DROP TABLE) but migrations are one-way in
practice — branch can be abandoned cheaply up to ship.

recall_orient is unchanged — it's fixed-recent + orientation, not
topic-driven, so hybrid retrieval doesn't apply.

154 cargo tests pass (was 141; +13 new). cargo check wasm32 clean.
worker-build --release produces 28.1KB bundle.

coderabbitai Bot commented May 23, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 24bab93f-0607-4fad-8fc6-88182dbb73cc

📥 Commits

Reviewing files that changed from the base of the PR and between 343dc81 and f0e0cda.

📒 Files selected for processing (2)
  • src/worker_mcp.rs
  • src/worker_orient.rs

Walkthrough

Adds hybrid retrieval: an FTS5 virtual table and triggers, hybrid primitives (cosine similarity, FTS query builder, RRF fusion), storage/vector batch APIs, and integration into recall_check to optionally fuse FTS and vector results controlled by ONEIRO_HYBRID_FTS_WEIGHT.

Changes

Hybrid FTS+Vector Retrieval

| Layer | File(s) | Summary |
|---|---|---|
| FTS5 Database Schema & Synchronization | migrations/0007_fts5_hybrid_retrieval.sql | SQLite migration adds memories_fts virtual table indexing content, summary, entity, and tags; creates INSERT/UPDATE/DELETE triggers to keep FTS table synchronized; backfills existing memories with NULL-safe coalescing. |
| Hybrid Retrieval Primitives | src/hybrid.rs | Core algorithms for hybrid retrieval: DEFAULT_RRF_K constant, cosine_similarity with zero-denominator handling, build_fts_query for tokenizing and quoting FTS expressions, rrf_fuse for reciprocal rank fusion with configurable FTS weight, and unit tests validating edge cases and fusion behavior. |
| Module Infrastructure | src/lib.rs, src/main.rs | Rust module declarations in lib.rs and main.rs enable compilation and use of the hybrid module throughout the codebase. |
| Storage & Vector Batch APIs | src/worker_store.rs, src/worker_vectorize.rs | fts_search in worker_store queries the memories_fts virtual table with BM25 ranking and returns ids; worker_vectorize adds getByIds FFI binding, StoredVector struct, and get_by_ids function for bulk vector lookups by id. |
| Shared Cosine Similarity | src/worker_mmr.rs | worker_mmr replaces its local cosine_similarity implementation with an import from crate::hybrid, consolidating similarity scoring logic into the shared module. |
| Hybrid recall_check Integration | src/worker_mcp.rs, src/worker_orient.rs | tool_recall_check gains optional hybrid retrieval: environment variable ONEIRO_HYBRID_FTS_WEIGHT enables FTS leg, bridges FTS-only results via vector batch lookup and cosine filtering, fuses rankings with RRF to narrow the candidate pool, applies metadata filters and MMR reranking on the fused pool, and extends output headers with hybrid info and per-memory similarity scores; memory formatting adds an ` |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • JuzzyDee/oneiro#38: Overlaps with changes to src/worker_mcp.rs's tool_recall_check retrieval and filtering pipeline; both modify candidate selection and reranking.

Poem

🐰 I hopped through tokens, vectors, and rhyme,

I quoted the query, sanitized in time,
I fused the ranks with reciprocal cheer,
Now memories leap when the rabbit draws near,
A hop, a match, a cosine-sparked rhyme.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically summarizes the main change: implementing hybrid retrieval combining FTS5, Vectorize, and RRF fusion for the recall_check feature. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/hybrid.rs (1)

45-60: 💤 Low value

Consider adding a debug assertion for vector length equality.

The doc comment states "equal-length vectors" but mismatched lengths silently compute over the truncated range via zip(). A debug_assert_eq! is zero-cost in release and catches integration bugs early.

🛡️ Suggested defensive assertion
 pub fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
+    debug_assert_eq!(a.len(), b.len(), "cosine_similarity requires equal-length vectors");
     let mut dot = 0.0;
     let mut na = 0.0;
     let mut nb = 0.0;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hybrid.rs` around lines 45 - 60, The function cosine_similarity currently
uses zip() which truncates mismatched slices; add a defensive debug assertion at
the start of cosine_similarity to ensure equal-length vectors (e.g.,
debug_assert_eq!(a.len(), b.len());) so integration bugs are caught in debug
builds without impacting release performance, leaving the rest of the function
unchanged.
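
For context, a complete function with the suggested assertion in place might look like the sketch below. The loop body and the choice to return 0.0 on a zero denominator are assumptions based on the walkthrough's mention of zero-denominator handling, not a copy of src/hybrid.rs.

```rust
/// Sketch of cosine similarity over equal-length vectors. Returns 0.0 when
/// either vector has zero magnitude (assumed zero-denominator policy).
pub fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    // Checked in debug builds only; compiled out in release builds.
    debug_assert_eq!(a.len(), b.len(), "cosine_similarity requires equal-length vectors");
    let mut dot = 0.0_f64;
    let mut na = 0.0_f64;
    let mut nb = 0.0_f64;
    // zip() stops at the shorter slice, which is why the assertion above matters.
    for (x, y) in a.iter().zip(b.iter()) {
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    let denom = na.sqrt() * nb.sqrt();
    if denom == 0.0 {
        0.0
    } else {
        dot / denom
    }
}
```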
src/worker_mcp.rs (1)

756-768: 💤 Low value

Consider clamping or rejecting negative FTS weights.

A negative fts_weight would invert the FTS contribution in RRF fusion (subtracting instead of adding), which is likely unintended. Consider clamping to 0.0 minimum or logging a warning.

🛡️ Suggested fix
 fn read_fts_weight(env: &Env) -> f64 {
     let raw = match env.var("ONEIRO_HYBRID_FTS_WEIGHT") {
         Ok(v) => v.to_string(),
         Err(_) => return 1.0,
     };
-    raw.parse::<f64>().unwrap_or_else(|_| {
+    let weight = raw.parse::<f64>().unwrap_or_else(|_| {
         worker::console_error!(
             "ONEIRO_HYBRID_FTS_WEIGHT={:?} unparseable; using default 1.0",
             raw
         );
         1.0
-    })
+    });
+    if weight < 0.0 {
+        worker::console_error!(
+            "ONEIRO_HYBRID_FTS_WEIGHT={} negative; clamping to 0.0",
+            weight
+        );
+        return 0.0;
+    }
+    weight
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/worker_mcp.rs` around lines 756 - 768, The read_fts_weight function
currently returns whatever parsed f64 from ONEIRO_HYBRID_FTS_WEIGHT (or default
1.0); change it to clamp negative values to 0.0 and emit a warning when a
negative value is provided: in read_fts_weight, after parsing the raw string
(the current parse::<f64>().unwrap_or_else branch), check the parsed value and
if v < 0.0 call worker::console_warn! with the raw value and return 0.0,
otherwise return v; keep the existing behavior of returning 1.0 when the env var
is missing or unparseable but still warn when unparseable as already done.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/worker_store.rs`:
- Around line 333-335: Trim the match_expr value before checking emptiness and
before sending to D1 so whitespace-only input is treated as empty; replace uses
of match_expr.is_empty() with a trimmed check (e.g., let trimmed =
match_expr.trim(); if trimmed.is_empty() ...) and use trimmed (or reassign) for
subsequent logic and the D1 MATCH call, and apply the same change to the second
occurrence that currently checks match_expr at the other spot.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: cd4d1584-9278-45fc-ac69-e4e28f57110b

📥 Commits

Reviewing files that changed from the base of the PR and between c4d8423 and 6cf8387.

📒 Files selected for processing (8)
  • migrations/0007_fts5_hybrid_retrieval.sql
  • src/hybrid.rs
  • src/lib.rs
  • src/main.rs
  • src/worker_mcp.rs
  • src/worker_mmr.rs
  • src/worker_store.rs
  • src/worker_vectorize.rs

Comment thread src/worker_store.rs
Justin Davis added 2 commits May 23, 2026 11:20
PR #39 review (CodeRabbit). The only current caller (build_fts_query)
can't produce whitespace-only output, but fts_search is pub and a
future caller might pre-construct a malformed expression. A
whitespace-only MATCH would error on FTS5's syntax check rather than
degenerating cleanly to empty.

Shadow match_expr with its trimmed form so the empty-check and the
bind both see the same canonical value.
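
The shadowing described in this commit message can be illustrated in isolation. The helper name normalise_match_expr is hypothetical; in the PR the check lives inside fts_search, whose body is not shown here.

```rust
/// Hypothetical helper: normalise a MATCH expression before it is bound.
/// Returns None when there is nothing to search for, so the caller can
/// return an empty result instead of sending a malformed MATCH to FTS5.
fn normalise_match_expr(match_expr: &str) -> Option<&str> {
    // Shadow with the trimmed form so the emptiness check and the bind
    // both see the same value.
    let match_expr = match_expr.trim();
    if match_expr.is_empty() {
        None
    } else {
        Some(match_expr)
    }
}
```

A caller would bind the returned slice rather than the original argument, which keeps whitespace-only input from reaching the FTS5 syntax check.
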
The hybrid retrieval demo exposed the gap directly: surfacing memories
by topic works, but you can't tell which candidates have images
attached without calling recall_image speculatively. The "I saw the
photo in the bucket, surface it via search" workflow stalls at the
"which one has the bytes" step.

Adds a terse `| img` indicator to both display paths:

  * format_memory (worker_orient.rs) — used by recall_orient and
    recall_specific. Image-carrying rows now render as
    `[episodic | 5d ago | str:0.50 | id:abc12345 | img]`.

  * tool_recall_check inline render (worker_mcp.rs) — image-carrying
    rows now render as `[sim:0.77 | str:1.00 | abc12345 | img]`.

Both formats only emit the suffix when m.image_hash.is_some() — empty
case is byte-identical to pre-change output, so nothing else has to
change. Reader sees `img` and knows recall_image will succeed against
the id without a speculative round-trip.
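
A small sketch of the conditional suffix, using the recall_check header format quoted above. The Memory struct and the render_header function are simplified, hypothetical stand-ins; only the image_hash.is_some() condition and the output shape come from the commit message.

```rust
/// Hypothetical, simplified stand-in for a memory row; only the fields
/// needed for the header are modelled.
struct Memory {
    id: String,
    strength: f64,
    image_hash: Option<String>,
}

/// Render a recall_check-style header, appending " | img" only when an
/// image is attached, so image-less rows keep their previous output.
fn render_header(m: &Memory, similarity: f64) -> String {
    let img = if m.image_hash.is_some() { " | img" } else { "" };
    format!("[sim:{:.2} | str:{:.2} | {}{}]", similarity, m.strength, m.id, img)
}
```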
@JuzzyDee JuzzyDee merged commit 6d1ca11 into dev May 23, 2026
6 checks passed
@JuzzyDee JuzzyDee deleted the juzzydee/cla-109-hybrid-retrieval-d1-fts5-vectorize-score-fusion-for branch May 23, 2026 01:50