feat(memory): add local embedding provider via transformers.js#95

Merged
vineethkrishnan merged 1 commit into main from feat/local-embeddings on Jun 13, 2026

@vineethkrishnan

Owner

What

Makes EMBEDDING_PROVIDER=local real. PR-memory can now embed in-process with no external API key.

Why

The local enum value existed but was never wired up: the memory module hard-coded the Voyage adapter, so PR-memory always required a VOYAGE_API_KEY. This adds a true local path for self-hosted deployments that don't want an external embedding API.

How

  • LocalEmbeddingAdapter runs a sentence-transformers model through @huggingface/transformers (default Xenova/all-MiniLM-L6-v2, 384-dim). It forces the WASM backend + single thread because the Alpine runner image can't load the native onnxruntime-node binary.
  • memory.module.ts now selects the embedding adapter from EMBEDDING_PROVIDER (factory) instead of hard-wiring Voyage.
  • EMBEDDING_MODEL becomes optional with a per-adapter default (voyage-3-lite / Xenova/all-MiniLM-L6-v2).
  • Migration SyncEmbeddingDimension aligns pr_memory.embedding with EMBEDDING_DIMENSIONS. It acts only when the dimension actually changes and truncates the (now dimension-incompatible) vectors, since the column is NOT NULL. No-op for existing same-dimension deployments.
  • Persistent modelcache volume so the model downloads once.
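
For illustration, the provider selection and per-adapter default model described above could look roughly like the sketch below. The function and type names (`resolveEmbeddingConfig`, `EmbeddingProvider`, `DEFAULT_MODELS`) are hypothetical and not taken from the diff; only the env variable names and the two default model ids come from this PR. Falling back to `voyage` when `EMBEDDING_PROVIDER` is unset is an assumption, as is limiting the enum to two values.

```typescript
// Illustrative sketch only; names are hypothetical, not the actual memory.module.ts code.
type EmbeddingProvider = 'voyage' | 'local';

interface EmbeddingConfig {
  provider: EmbeddingProvider;
  model: string;
}

// Per-adapter defaults, as listed in this PR.
const DEFAULT_MODELS: Record<EmbeddingProvider, string> = {
  voyage: 'voyage-3-lite',
  local: 'Xenova/all-MiniLM-L6-v2',
};

function isProvider(value: string): value is EmbeddingProvider {
  return value === 'voyage' || value === 'local';
}

function resolveEmbeddingConfig(
  env: Record<string, string | undefined>,
): EmbeddingConfig {
  // Assumption: an unset provider falls back to Voyage (the previous hard-wired behaviour).
  const raw = (env['EMBEDDING_PROVIDER'] ?? 'voyage').trim().toLowerCase();
  if (!isProvider(raw)) {
    throw new Error(`Unsupported EMBEDDING_PROVIDER: ${raw}`);
  }
  // EMBEDDING_MODEL is optional; an empty or missing value uses the adapter default.
  const model = env['EMBEDDING_MODEL']?.trim() || DEFAULT_MODELS[raw];
  return { provider: raw, model };
}
```

In a NestJS module, a function like this would typically back a `useFactory` provider that returns the matching adapter instance.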

Config

EMBEDDING_PROVIDER=local
EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2
EMBEDDING_DIMENSIONS=384
EMBEDDING_CACHE_DIR=/app/models

Tests

4 new unit tests for the adapter (default model, model override + pipeline caching, single-vector embed, error wrapping). Full suite 174/174, tsc/eslint/nest build clean.
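
A rough sketch of the behaviours those tests cover (default model, model override, pipeline caching, error wrapping) is shown below. It is not the adapter from this PR: the class and type names are hypothetical, and the model loader is injected so the example runs without @huggingface/transformers installed. In the real adapter the loader would presumably wrap that library's feature-extraction pipeline.

```typescript
// Illustrative sketch only; not the LocalEmbeddingAdapter from this PR.
type Extractor = (text: string) => Promise<number[]>;
type ExtractorLoader = (model: string) => Promise<Extractor>;

class EmbeddingError extends Error {
  readonly original: unknown;

  constructor(message: string, original: unknown) {
    super(message);
    this.name = 'EmbeddingError';
    this.original = original;
  }
}

class LocalEmbeddingSketch {
  static readonly DEFAULT_MODEL = 'Xenova/all-MiniLM-L6-v2';

  private readonly load: ExtractorLoader;
  private readonly model: string;
  private extractor: Promise<Extractor> | undefined;

  constructor(load: ExtractorLoader, model?: string) {
    this.load = load;
    this.model = model ?? LocalEmbeddingSketch.DEFAULT_MODEL;
  }

  get modelName(): string {
    return this.model;
  }

  // Load the pipeline once and share the promise across calls.
  private getExtractor(): Promise<Extractor> {
    if (!this.extractor) {
      this.extractor = this.load(this.model).catch((err: unknown) => {
        this.extractor = undefined; // a failed load may be retried later
        throw err;
      });
    }
    return this.extractor;
  }

  // Embed a single text into one vector, wrapping any failure.
  async embed(text: string): Promise<number[]> {
    try {
      const extractor = await this.getExtractor();
      return await extractor(text);
    } catch (err) {
      throw new EmbeddingError(
        `Local embedding failed for model ${this.model}`,
        err,
      );
    }
  }
}
```

Caching the promise rather than the resolved pipeline means concurrent first calls trigger a single model load.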

Note

The advisories from npm audit (brace-expansion, esbuild/vite/vitepress) are pre-existing dev-dependency issues from eslint and vitepress, not introduced here. The WASM-on-Alpine path is verified at deploy time with an in-container smoke embed.

Wire EMBEDDING_PROVIDER=local to an in-process embedding adapter so PR-memory works without an external embedding API. LocalEmbeddingAdapter runs a sentence-transformers model (default Xenova/all-MiniLM-L6-v2, 384-dim) through @huggingface/transformers, forcing the WASM backend and a single thread because the Alpine runner image cannot load the native onnxruntime-node binary.

The memory module now selects the embedding adapter from EMBEDDING_PROVIDER instead of hard-wiring Voyage, and EMBEDDING_MODEL becomes optional with a per-adapter default. A migration aligns pr_memory.embedding with EMBEDDING_DIMENSIONS, acting only when the dimension actually changes and truncating the (model-specific, now-incompatible) vectors since the column is NOT NULL. A persistent model-cache volume keeps the downloaded model across restarts.
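
The migration's decision logic can be sketched as below. This is not the SyncEmbeddingDimension migration itself: the function name and plan shape are hypothetical, and the SQL assumes a PostgreSQL table using the pgvector `vector(n)` type, which the PR text does not state. Reading "truncating" as emptying the table is also an assumption.

```typescript
// Illustrative sketch only; assumes PostgreSQL with pgvector.
interface DimensionPlan {
  action: 'noop' | 'resize';
  statements: string[];
}

function planEmbeddingDimensionSync(
  currentDim: number,
  targetDim: number,
): DimensionPlan {
  if (!Number.isInteger(targetDim) || targetDim <= 0) {
    throw new Error(`Invalid EMBEDDING_DIMENSIONS: ${targetDim}`);
  }
  // Same dimension: existing deployments are left untouched.
  if (currentDim === targetDim) {
    return { action: 'noop', statements: [] };
  }
  // Vectors from the old model cannot be reused, and the column is NOT NULL,
  // so the rows are cleared before the column type changes.
  return {
    action: 'resize',
    statements: [
      'TRUNCATE TABLE pr_memory',
      `ALTER TABLE pr_memory ALTER COLUMN embedding TYPE vector(${targetDim})`,
    ],
  };
}
```

Returning the statements instead of executing them keeps the decision testable without a database connection.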

vineethkrishnan merged commit cedcbc2 into main on Jun 13, 2026
13 checks passed
vineethkrishnan deleted the feat/local-embeddings branch on June 13, 2026 17:47