feat(memory): semantic memory Pillar A — embed + ask + model2vec backend (0.15.0)#32
Merged
…d command (P0)

Pillar A foundation for semantic memory (epic claude-memory-wao, bd claude-memory-21y). Dependency-free and offline by default.

tj-core:
- embed.rs: Embedder trait (embed/model_id/dim + embed_one), cosine similarity, f32<->BLOB codec, is_embeddable filter, and HashEmbedder, a deterministic feature-hashing embedder (L2-normalised bag-of-words) that needs zero deps and serves as both the test fake and the default backend. default_embedder() is the single swap point where the model2vec backend will plug in behind the `embed` feature as a quality upgrade.
- schema v008: embeddings(event_id, task_id, project_hash, tier, model, dim, vec BLOB, created_at) + events_index.memory_tier. Purely additive.
- db: events_needing_embedding (model-scoped so a model change re-embeds), upsert_embedding (idempotent), count_embeddings, embed_pending (batched orchestration shared by ingest + backfill).

tj-cli:
- `task-journal embed [--backfill]`: vectorise new events, or the whole project history. Fully offline with the hash embedder.

Tests: embed math + HashEmbedder ranking (7), embed_pending idempotency + model-scoped re-embed (1 db), CLI embed --backfill round-trip (1). All green.

The model2vec semantic backend lands next as an isolated feature. (Pre-existing WSL-only project_hash /tmp collision test failure is unrelated; passes in CI.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
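The three building blocks named for embed.rs (feature hashing, cosine similarity, and the f32<->BLOB codec) can be sketched roughly as below. This is an illustration, not the code from this PR: the function names (`hash_embed`, `cosine`, `vec_to_blob`, `blob_to_vec`), the FNV-1a hash, the tokeniser, and the little-endian byte order are all assumptions made for the example.

```rust
/// FNV-1a 64-bit hash. Assumption: any stable hash would do; the PR does not
/// say which one HashEmbedder uses.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= *b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Hypothetical feature-hashing embedder: every lowercased token increments
/// one of `dim` buckets, then the vector is L2-normalised.
pub fn hash_embed(text: &str, dim: usize) -> Vec<f32> {
    let mut v = vec![0.0f32; dim];
    if dim == 0 {
        return v;
    }
    for token in text
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
    {
        let bucket = (fnv1a(token.to_lowercase().as_bytes()) % dim as u64) as usize;
        v[bucket] += 1.0;
    }
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in &mut v {
            *x /= norm;
        }
    }
    v
}

/// Cosine similarity; returns 0.0 for mismatched lengths or zero vectors.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Encode a vector as a BLOB of little-endian f32 values (4 bytes each).
pub fn vec_to_blob(v: &[f32]) -> Vec<u8> {
    v.iter().flat_map(|x| x.to_le_bytes()).collect()
}

/// Decode a BLOB back into a vector; None if the length is not a multiple of 4.
pub fn blob_to_vec(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

Because the scheme only counts tokens, a query scores well against an event only when they share words, which is why the commit calls this backend lexical.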
Pillar A retrieval on top of the v008 embeddings (bd claude-memory-hdq).

tj-core:
- db::semantic_search: scores every stored vector for the model against the query vector by cosine, returns the top-k as ScoredHit (event, task, type, tier, text, score). Recency/tier/contradiction weighting layers on later.

tj-cli:
- `task-journal ask "<query>" [--k N]`: embeds the query, embeds any new events on the fly (embed-on-ask, so the index self-maintains), and prints the most relevant events by score. Offline.

Tests: semantic_search ranks the relevant event first + sorted (1 db); `ask` end-to-end ranking via CLI (1). Green.

Note: the default hash embedder is lexical (bag-of-words, no stemming): it ranks by term overlap, roughly FTS5-grade. True paraphrase/morphology semantics require the model2vec backend, which lands next behind the `embed` feature and is what lets retrieval actually beat keyword search.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
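The brute-force scan described for `db::semantic_search` (score every stored vector by cosine, keep the top-k) might look something like the following. This is an in-memory sketch only: `Hit` and `top_k` are hypothetical names, the real `ScoredHit` carries more fields, and the real function reads vectors from the embeddings table rather than a slice.

```rust
/// Hypothetical, trimmed-down stand-in for ScoredHit.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub event_id: String,
    pub score: f32,
}

/// Cosine similarity; returns 0.0 for mismatched lengths or zero vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Score every stored vector against the query and return the k best,
/// highest score first. Vectors whose dimension differs from the query
/// are skipped (an assumption about how mismatches would be handled).
pub fn top_k(query: &[f32], stored: &[(String, Vec<f32>)], k: usize) -> Vec<Hit> {
    let mut hits: Vec<Hit> = stored
        .iter()
        .filter(|(_, v)| v.len() == query.len())
        .map(|(id, v)| Hit {
            event_id: id.clone(),
            score: cosine(query, v),
        })
        .collect();
    // total_cmp gives a total order over f32, so sorting never panics on NaN.
    hits.sort_by(|a, b| b.score.total_cmp(&a.score));
    hits.truncate(k);
    hits
}
```

A linear scan costs one dot product per stored event, which is a reasonable trade for a per-project journal; an approximate index would only matter at much larger scales.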
The real semantic backend for Pillar A. `ask`/`embed` now use a pure-Rust model2vec static model by default, giving paraphrase- and morphology-robust recall that the lexical hash embedder can't match.

- embed.rs: Model2VecEmbedder (behind the default `embed` feature) loads minishlab/potion-multilingual-128M (multilingual: RU/EN both work, no onnxruntime), downloaded once via hf-hub and cached. default_embedder() tries it and falls back to HashEmbedder on any failure (offline, no model, or TJ_EMBED=hash) so retrieval never breaks. Model overridable via TJ_EMBED_MODEL.
- Feature plumbing: model2vec-rs is optional, pinned to fancy-regex (pure Rust, no native oniguruma). tj-cli re-exposes `embed` as a default feature; tj-mcp stays lean. `--no-default-features` builds the dependency-free lexical configuration.
- CI: the msrv (1.88) job now builds the lean configuration; the `embed` feature (newer toolchain via tokenizers/hf-hub) is covered by the stable test jobs.
- Inter-crate version reqs bumped to 0.15.0 so the release resolves.

Verified manually: query "duplicate refund payments" (no shared term) ranks the "idempotent payment ledger" decision first at 0.617, far above unrelated events; the lexical embedder scored ~0 on that paraphrase.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
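The fallback behaviour described for `default_embedder()` (try the model, drop to the hash embedder on any failure or when `TJ_EMBED=hash`) can be illustrated with a small selection function. Everything here is a sketch under assumptions: the trait is cut down to `model_id` only, `Named` and `pick_embedder` are hypothetical, and the model loader is passed in as a closure so the example runs without network access or the model2vec-rs crate.

```rust
/// Cut-down stand-in for the Embedder trait (the real one also embeds text).
pub trait Embedder {
    fn model_id(&self) -> &str;
}

/// Hypothetical embedder that only knows its own name, used for both the
/// lexical fallback and a pretend semantic model.
pub struct Named(pub &'static str);

impl Embedder for Named {
    fn model_id(&self) -> &str {
        self.0
    }
}

/// Choose a backend. `tj_embed` is the value of the TJ_EMBED variable, if set;
/// `load_model` is whatever attempts to load the semantic model.
pub fn pick_embedder<F>(tj_embed: Option<&str>, load_model: F) -> Box<dyn Embedder>
where
    F: FnOnce() -> Result<Box<dyn Embedder>, String>,
{
    // An explicit override skips the model load entirely.
    if tj_embed == Some("hash") {
        return Box::new(Named("hash"));
    }
    match load_model() {
        Ok(model) => model,
        // Any failure (offline, missing model, ...) degrades to lexical search.
        Err(_) => Box::new(Named("hash")),
    }
}
```

A caller would pass something like `std::env::var("TJ_EMBED").ok().as_deref()` for the first argument. Taking the value as a parameter rather than reading the environment inside the function keeps the selection logic deterministic under test.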
What

Pillar A of the memory-platform epic (`claude-memory-wao`): the journal can now retrieve events by meaning, not just keyword. This is the foundation for replacing claude-mem/mem0.

Three stacked commits:

- Foundation (`claude-memory-21y`): `Embedder` trait + cosine + f32↔BLOB codec + `HashEmbedder` (lexical, zero-dep), schema v008 (`embeddings` table + `memory_tier`), `db::embed_pending`, and `task-journal embed [--backfill]`.
- Retrieval (`claude-memory-hdq`): `db::semantic_search` (cosine top-k) + `task-journal ask "<query>" [--k N]` with embed-on-ask (the index self-maintains).
- model2vec backend: `minishlab/potion-multilingual-128M` (multilingual, no onnxruntime), downloaded once via hf-hub and cached. Falls back to the lexical embedder on any failure so retrieval never breaks.

Design choices

- model2vec-rs is pinned to `fancy-regex` (pure Rust), not its default `onig` (C oniguruma).
- `default_embedder()` tries model2vec, falls back to `HashEmbedder` offline / on `TJ_EMBED=hash`.
- `--no-default-features` = lean lexical build.
- The msrv (1.88) job builds the lean configuration; the `embed` feature (newer toolchain) is covered by the stable test jobs.
- Stored vectors record `model` + `dim`; a model change re-embeds cleanly (no cross-model comparison).

Verification

Manual model2vec smoke (also an `#[ignore]` test): query "duplicate refund payments", which shares no exact term with the target, ranks the "idempotent payment ledger" decision first at 0.617, far above unrelated events. The lexical embedder scores that paraphrase ~0.

Tests: embed math + HashEmbedder ranking, `embed_pending` idempotency + model-scoped re-embed, `semantic_search` ranking, CLI `embed --backfill` + `ask` (deterministic via `TJ_EMBED=hash`). `fmt`/`clippy` clean on default and `--no-default-features`. Pre-existing WSL-only `project_hash`/`migrate_project` `/tmp`-collision failures are unrelated (green in CI).

Note

The first `ask`/`embed` on a machine downloads the model (one-time, then cached). Build with `--no-default-features` to opt out entirely.

🤖 Generated with Claude Code
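For readers who want a feel for the idempotent, model-scoped behaviour of `embed_pending` mentioned in this PR, here is a minimal in-memory sketch. It borrows the function names from the commit message, but the `Store` type, the signatures, and the `HashMap` keyed by (event_id, model) are assumptions standing in for the SQLite tables; batching is left out.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the events and embeddings tables.
pub struct Store {
    events: Vec<(String, String)>,                   // (event_id, text)
    embeddings: HashMap<(String, String), Vec<f32>>, // (event_id, model) -> vector
}

impl Store {
    pub fn new(events: Vec<(String, String)>) -> Self {
        Store {
            events,
            embeddings: HashMap::new(),
        }
    }

    /// Events with no vector yet for this particular model.
    pub fn events_needing_embedding(&self, model: &str) -> Vec<(String, String)> {
        self.events
            .iter()
            .filter(|(id, _)| {
                !self
                    .embeddings
                    .contains_key(&(id.clone(), model.to_string()))
            })
            .cloned()
            .collect()
    }

    /// Insert or overwrite, so repeating the call leaves a single row.
    pub fn upsert_embedding(&mut self, event_id: &str, model: &str, vec: Vec<f32>) {
        self.embeddings
            .insert((event_id.to_string(), model.to_string()), vec);
    }

    pub fn count_embeddings(&self) -> usize {
        self.embeddings.len()
    }

    /// Embed whatever is still missing for `model`; returns how many were done.
    pub fn embed_pending<F: Fn(&str) -> Vec<f32>>(&mut self, model: &str, embed: F) -> usize {
        let pending = self.events_needing_embedding(model);
        for (id, text) in &pending {
            self.upsert_embedding(id.as_str(), model, embed(text.as_str()));
        }
        pending.len()
    }
}
```

Running `embed_pending` twice with the same model does nothing the second time, while switching the model name makes every event pending again, so vectors from different models are never compared against each other.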