feat(memory): semantic memory Pillar A — embed + ask + model2vec backend (0.15.0)#32

Merged
Shahinyanm merged 3 commits into main from feat/memory-p0-embedding-substrate
Jun 12, 2026
@Shahinyanm

What

Pillar A of the memory-platform epic (claude-memory-wao): the journal can now retrieve events by meaning, not just keyword. This is the foundation for replacing claude-mem/mem0.

Three stacked commits:

  • P0 — substrate (claude-memory-21y): Embedder trait + cosine + f32↔BLOB codec + HashEmbedder (lexical, zero-dep), schema v008 (embeddings table + memory_tier), db::embed_pending, and task-journal embed [--backfill].
  • P1 — retrieval (claude-memory-hdq): db::semantic_search (cosine top-k) + task-journal ask "<query>" [--k N] with embed-on-ask (the index self-maintains).
  • model2vec backend, default-on: pure-Rust minishlab/potion-multilingual-128M (multilingual, no onnxruntime), downloaded once via hf-hub and cached. Falls back to the lexical embedder on any failure so retrieval never breaks.
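The substrate math named in P0 (cosine plus the f32↔BLOB codec) can be sketched as below. This is a minimal illustration, not the code in embed.rs: the function names `vec_to_blob`, `blob_to_vec`, and `cosine`, the little-endian byte order, and the choice to return 0.0 for mismatched or zero-norm inputs are all assumptions.

```rust
/// Encode a vector as consecutive little-endian f32 values (4 bytes each),
/// a layout that fits a SQLite BLOB column. Byte order is an assumption.
pub fn vec_to_blob(v: &[f32]) -> Vec<u8> {
    v.iter().flat_map(|x| x.to_le_bytes()).collect()
}

/// Decode a BLOB back into f32 values.
/// Returns None when the byte length is not a multiple of 4.
pub fn blob_to_vec(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Cosine similarity of two vectors.
/// Returns 0.0 for empty, mismatched-length, or zero-norm inputs
/// (an assumed convention, chosen so callers never see NaN).
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}
```

A round trip through `vec_to_blob` and `blob_to_vec` is lossless because each f32 is stored bit-for-bit; a 128-dimensional vector would occupy 512 bytes under this layout.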

Design choices

  • Zero native deps: model2vec-rs pinned to fancy-regex (pure Rust), not its default onig (C oniguruma).
  • Always works: default_embedder() tries model2vec and falls back to HashEmbedder when offline or when TJ_EMBED=hash is set. --no-default-features gives the lean lexical build.
  • MSRV: the 1.88 job now builds the lean configuration; the embed feature (newer toolchain) is covered by the stable test jobs.
  • Model versioning: each vector is tagged with model + dim; a model change re-embeds cleanly (no cross-model comparison).
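The "always works" fallback could look roughly like the sketch below. Everything here is hypothetical apart from the TJ_EMBED=hash switch and the method names `model_id`, `dim`, and `embed_one`, which the PR mentions: the trait signatures, `LexicalStub`, `choose_embedder`, and the injected `load_semantic` closure (standing in for the model2vec loader, whose real API is not shown here) are assumptions made for illustration.

```rust
use std::env;

/// Hypothetical trait shape; the PR names these methods but not their signatures.
pub trait Embedder {
    fn model_id(&self) -> &str;
    fn dim(&self) -> usize;
    fn embed_one(&self, text: &str) -> Vec<f32>;
}

/// Stand-in for the always-available lexical backend (not the real HashEmbedder).
pub struct LexicalStub;

impl Embedder for LexicalStub {
    fn model_id(&self) -> &str {
        "hash"
    }
    fn dim(&self) -> usize {
        4
    }
    fn embed_one(&self, _text: &str) -> Vec<f32> {
        vec![0.0; 4]
    }
}

/// Pick a backend: honour an explicit `hash` request, otherwise try the
/// semantic loader and fall back to the lexical backend on any error.
pub fn choose_embedder<F>(tj_embed: Option<&str>, load_semantic: F) -> Box<dyn Embedder>
where
    F: FnOnce() -> Result<Box<dyn Embedder>, String>,
{
    if tj_embed == Some("hash") {
        return Box::new(LexicalStub);
    }
    match load_semantic() {
        Ok(embedder) => embedder,
        // Offline, missing model, or any other failure: retrieval still works.
        Err(_) => Box::new(LexicalStub),
    }
}

/// Reads TJ_EMBED from the environment. The loader here always fails because
/// this sketch does not wire up a real model.
pub fn default_embedder_sketch() -> Box<dyn Embedder> {
    let choice = env::var("TJ_EMBED").ok();
    choose_embedder(choice.as_deref(), || {
        Err("semantic model loading is not wired up in this sketch".to_string())
    })
}
```

Passing the environment value and the loader in as arguments keeps the selection logic testable without mutating process-wide state, which is one plausible reason to structure a single swap point this way.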

Verification

Manual model2vec smoke (also an #[ignore] test): query "duplicate refund payments" — which shares no exact term with the target — ranks the "idempotent payment ledger" decision first at 0.617, far above unrelated events. The lexical embedder scores that paraphrase ~0.

Tests: embed math + HashEmbedder ranking, embed_pending idempotency + model-scoped re-embed, semantic_search ranking, CLI embed --backfill + ask (deterministic via TJ_EMBED=hash). fmt/clippy clean on default and --no-default-features. Pre-existing WSL-only project_hash/migrate_project /tmp-collision failures are unrelated (green in CI).

Note

The first ask/embed on a machine downloads the model (a one-time download, cached afterwards). Build with --no-default-features to opt out entirely.

🤖 Generated with Claude Code

Shahinyanm and others added 3 commits June 12, 2026 18:08
…d command (P0)

Pillar A foundation for semantic memory (epic claude-memory-wao, bd
claude-memory-21y). Dependency-free and offline by default.

tj-core:
- embed.rs: Embedder trait (embed/model_id/dim + embed_one), cosine
  similarity, f32<->BLOB codec, is_embeddable filter, and HashEmbedder — a
  deterministic feature-hashing embedder (L2-normalised bag-of-words) that
  needs zero deps and serves as both the test fake and the default backend.
  default_embedder() is the single swap point where the model2vec backend
  will plug in behind the `embed` feature as a quality upgrade.
- schema v008: embeddings(event_id, task_id, project_hash, tier, model, dim,
  vec BLOB, created_at) + events_index.memory_tier. Purely additive.
- db: events_needing_embedding (model-scoped so a model change re-embeds),
  upsert_embedding (idempotent), count_embeddings, embed_pending (batched
  orchestration shared by ingest + backfill).
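
A deterministic feature-hashing embedder of the kind described above (L2-normalised bag-of-words) might be sketched as follows. This is not the HashEmbedder source: the tokenisation rule, the lowercasing, the FNV-1a hash, and the names `fnv1a` and `hash_embed` are assumptions.

```rust
/// FNV-1a (64-bit). A fixed, published hash keeps bucket assignments stable
/// across runs and toolchains, which std's DefaultHasher does not promise.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3); // FNV prime
    }
    h
}

/// Hash each lowercased alphanumeric token into one of `dim` buckets,
/// count occurrences, then scale the vector to unit L2 norm.
/// Text with no tokens yields the all-zero vector.
pub fn hash_embed(text: &str, dim: usize) -> Vec<f32> {
    let mut v = vec![0.0f32; dim];
    if dim == 0 {
        return v;
    }
    for token in text
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
    {
        let bucket = (fnv1a(token.to_lowercase().as_bytes()) % dim as u64) as usize;
        v[bucket] += 1.0;
    }
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in &mut v {
            *x /= norm;
        }
    }
    v
}
```

Because there is no stemming, "payment" and "payments" land in unrelated buckets (barring a hash collision), so an embedder like this ranks by exact term overlap only.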

tj-cli:
- `task-journal embed [--backfill]`: vectorise new events, or the whole
  project history. Fully offline with the hash embedder.

Tests: embed math + HashEmbedder ranking (7), embed_pending idempotency +
model-scoped re-embed (1 db), CLI embed --backfill round-trip (1). All green.
The model2vec semantic backend lands next as an isolated feature.

(Pre-existing WSL-only project_hash /tmp collision test failure is unrelated;
passes in CI.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Pillar A retrieval on top of the v008 embeddings (bd claude-memory-hdq).

tj-core:
- db::semantic_search: scores every stored vector for the model against the
  query vector by cosine, returns the top-k as ScoredHit (event, task, type,
  tier, text, score). Recency/tier/contradiction weighting layers on later.
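
The brute-force cosine top-k described above can be sketched in memory as below. It is an illustration under assumptions, not db::semantic_search: the real function reads vectors from the embeddings table and its ScoredHit carries more fields (event, task, type, tier, text, score), whereas this reduced `ScoredHit`, the `top_k` name, and the `(String, Vec<f32>)` input shape are hypothetical.

```rust
/// Reduced, hypothetical hit type: only an event id and its score.
#[derive(Debug, Clone, PartialEq)]
pub struct ScoredHit {
    pub event_id: String,
    pub score: f32,
}

/// Cosine similarity; 0.0 for empty, mismatched-length, or zero-norm inputs.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Score every stored vector against the query and keep the k best,
/// highest score first. Vectors of a different dimension are skipped,
/// mirroring the rule that vectors from different models are never compared.
pub fn top_k(query: &[f32], stored: &[(String, Vec<f32>)], k: usize) -> Vec<ScoredHit> {
    let mut hits: Vec<ScoredHit> = stored
        .iter()
        .filter(|(_, v)| v.len() == query.len())
        .map(|(id, v)| ScoredHit {
            event_id: id.clone(),
            score: cosine(query, v),
        })
        .collect();
    // total_cmp gives a total order over f32, so the sort cannot panic.
    hits.sort_by(|a, b| b.score.total_cmp(&a.score));
    hits.truncate(k);
    hits
}
```

A full scan is linear in the number of stored vectors, which is usually acceptable for a per-project journal; an approximate index would only matter at much larger scale.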

tj-cli:
- `task-journal ask "<query>" [--k N]`: embeds the query, embeds any new
  events on the fly (embed-on-ask, so the index self-maintains), and prints
  the most relevant events by score. Offline.

Tests: semantic_search ranks the relevant event first + sorted (1 db); `ask`
end-to-end ranking via CLI (1). Green.

Note: the default hash embedder is lexical (bag-of-words, no stemming) — it
ranks by term overlap, roughly FTS5-grade. True paraphrase/morphology
semantics require the model2vec backend, which lands next behind the `embed`
feature and is what lets retrieval actually beat keyword search.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The real semantic backend for Pillar A. `ask`/`embed` now use a pure-Rust
model2vec static model by default, giving paraphrase- and
morphology-robust recall that the lexical hash embedder can't match.

- embed.rs: Model2VecEmbedder (behind the default `embed` feature) loads
  minishlab/potion-multilingual-128M (multilingual — RU/EN both work, no
  onnxruntime), downloaded once via hf-hub and cached. default_embedder()
  tries it and falls back to HashEmbedder on any failure (offline, no model,
  or TJ_EMBED=hash) so retrieval never breaks. Model overridable via
  TJ_EMBED_MODEL.
- Feature plumbing: model2vec-rs is optional, pinned to fancy-regex (pure
  Rust, no native oniguruma). tj-cli re-exposes `embed` as a default feature;
  tj-mcp stays lean. `--no-default-features` builds the dependency-free
  lexical configuration.
- CI: the msrv (1.88) job now builds the lean configuration; the `embed`
  feature (newer toolchain via tokenizers/hf-hub) is covered by the stable
  test jobs.
- Inter-crate version reqs bumped to 0.15.0 so the release resolves.

Verified manually: query "duplicate refund payments" (no shared term) ranks
the "idempotent payment ledger" decision first at 0.617, far above unrelated
events — a paraphrase the lexical embedder scored ~0 on.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Shahinyanm Shahinyanm merged commit bf02d64 into main Jun 12, 2026
7 checks passed
@Shahinyanm Shahinyanm deleted the feat/memory-p0-embedding-substrate branch June 12, 2026 14:39