feat(memory): add local embedding provider via transformers.js #95
Merged
Wire EMBEDDING_PROVIDER=local to an in-process embedding adapter so PR-memory works without an external embedding API. LocalEmbeddingAdapter runs a sentence-transformers model (default Xenova/all-MiniLM-L6-v2, 384-dim) through @huggingface/transformers, forcing the WASM backend and a single thread because the Alpine runner image cannot load the native onnxruntime-node binary. The memory module now selects the embedding adapter from EMBEDDING_PROVIDER instead of hard-wiring Voyage, and EMBEDDING_MODEL becomes optional with a per-adapter default. A migration aligns pr_memory.embedding with EMBEDDING_DIMENSIONS, acting only when the dimension actually changes and truncating the (model-specific, now-incompatible) vectors since the column is NOT NULL. A persistent model-cache volume keeps the downloaded model across restarts.
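The migration rule described above (act only when the dimension changes, then clear the old vectors) might be planned roughly as follows. This is a sketch, not the PR's migration: the function name `planEmbeddingDimensionSync` is hypothetical, and the SQL assumes a pgvector-style `vector(n)` column and a full-table `TRUNCATE`, neither of which is shown in the PR text.

```typescript
// Hypothetical planner: returns the SQL statements a dimension-sync
// migration might run. Assumes a pgvector-style `vector(n)` column; the
// PR text does not show the real column type or SQL.
function planEmbeddingDimensionSync(
  currentDim: number,
  targetDim: number,
): string[] {
  if (!Number.isInteger(targetDim) || targetDim <= 0) {
    throw new Error(`Invalid EMBEDDING_DIMENSIONS: ${targetDim}`);
  }

  // Same dimension: existing deployments are left untouched.
  if (currentDim === targetDim) {
    return [];
  }

  return [
    // Vectors from the old model cannot be resized, and a NOT NULL column
    // cannot hold placeholders, so the rows are removed first.
    'TRUNCATE TABLE pr_memory',
    `ALTER TABLE pr_memory ALTER COLUMN embedding TYPE vector(${targetDim})`,
  ];
}
```

Returning the statements instead of executing them keeps the "no-op when unchanged" decision easy to check without a database.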
What
Makes EMBEDDING_PROVIDER=local real. PR-memory can now embed in-process with no external API key.
Why
The local enum value existed but was never wired: the memory module hard-coded the Voyage adapter, so PR-memory always required a VOYAGE_API_KEY. This adds a true local path for self-hosted deployments that don't want an external embedding API.
How
- LocalEmbeddingAdapter runs a sentence-transformers model through @huggingface/transformers (default Xenova/all-MiniLM-L6-v2, 384-dim). It forces the WASM backend + single thread because the Alpine runner image can't load the native onnxruntime-node binary.
- memory.module.ts now selects the embedding adapter from EMBEDDING_PROVIDER (factory) instead of hard-wiring Voyage.
- EMBEDDING_MODEL becomes optional with a per-adapter default (voyage-3-lite for Voyage, Xenova/all-MiniLM-L6-v2 for local).
- SyncEmbeddingDimension aligns pr_memory.embedding with EMBEDDING_DIMENSIONS. It acts only when the dimension actually changes and truncates the (now dimension-incompatible) vectors, since the column is NOT NULL. No-op for existing same-dimension deployments.
- A persistent modelcache volume, so the model downloads once.
Config
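As an illustration of how these settings could be resolved, the sketch below picks a provider from EMBEDDING_PROVIDER and falls back to the per-adapter default when EMBEDDING_MODEL is unset. The names `resolveEmbeddingSelection` and `isProvider` are hypothetical, the enum value `voyage` is assumed (only `local` is confirmed by this PR), and rejecting an unset provider is a choice made for the sketch, not necessarily what memory.module.ts does.

```typescript
// Assumed provider values: only 'local' is confirmed by the PR text.
type EmbeddingProvider = 'voyage' | 'local';

interface EmbeddingSelection {
  provider: EmbeddingProvider;
  model: string;
}

// Per-adapter default models as listed in the PR description.
const DEFAULT_MODELS: Record<EmbeddingProvider, string> = {
  voyage: 'voyage-3-lite',
  local: 'Xenova/all-MiniLM-L6-v2',
};

function isProvider(value: string | undefined): value is EmbeddingProvider {
  return value === 'voyage' || value === 'local';
}

// Hypothetical resolver; the real factory is not shown in the PR text.
function resolveEmbeddingSelection(
  env: Record<string, string | undefined>,
): EmbeddingSelection {
  const provider = env['EMBEDDING_PROVIDER'];
  if (!isProvider(provider)) {
    throw new Error(`Unsupported EMBEDDING_PROVIDER: ${provider ?? '(unset)'}`);
  }

  // EMBEDDING_MODEL is optional: blank or missing means "use the default".
  const override = env['EMBEDDING_MODEL']?.trim();
  const model = override ? override : DEFAULT_MODELS[provider];
  return { provider, model };
}
```

For example, an environment with only `EMBEDDING_PROVIDER=local` set would resolve to the Xenova/all-MiniLM-L6-v2 default under these assumptions.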
Tests
4 new unit tests for the adapter (default model, model override + pipeline caching, single-vector embed, error wrapping). Full suite 174/174, tsc / eslint / nest build clean.
Note
The advisories from npm audit (brace-expansion, esbuild/vite/vitepress) are pre-existing dev-dependency issues from eslint and vitepress, not introduced here. The WASM-on-Alpine path is verified at deploy time with an in-container smoke embed.
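The adapter behaviours listed under Tests, together with a smoke embed of the kind mentioned in the note, could be sketched as below. This is not the PR's LocalEmbeddingAdapter: the class `LocalEmbeddingSketch`, the `ExtractorLoader` type, and `smokeEmbed` are hypothetical, and the model loader is injected as a plain function so the sketch makes no claims about the @huggingface/transformers API or its WASM settings.

```typescript
// Stand-ins for a feature-extraction pipeline and its loader. These are
// illustrative types, not the real @huggingface/transformers types.
type Extractor = (text: string) => Promise<number[]>;
type ExtractorLoader = (model: string) => Promise<Extractor>;

const DEFAULT_LOCAL_MODEL = 'Xenova/all-MiniLM-L6-v2';

class LocalEmbeddingSketch {
  private readonly load: ExtractorLoader;
  private readonly model: string;
  private extractor: Promise<Extractor> | undefined;

  constructor(load: ExtractorLoader, model: string = DEFAULT_LOCAL_MODEL) {
    this.load = load;
    this.model = model;
  }

  async embed(text: string): Promise<number[]> {
    try {
      // Load the model once; later calls reuse the same promise.
      if (this.extractor === undefined) {
        this.extractor = this.load(this.model);
      }
      const extractor = await this.extractor;
      return await extractor(text);
    } catch (cause) {
      // Forget the cached promise so the next call reloads the model.
      this.extractor = undefined;
      const reason = cause instanceof Error ? cause.message : String(cause);
      throw new Error(`Local embedding failed (${this.model}): ${reason}`);
    }
  }
}

// Smoke check: embed one string and verify the vector shape.
async function smokeEmbed(
  adapter: { embed(text: string): Promise<number[]> },
  expectedDim: number,
): Promise<void> {
  const vector = await adapter.embed('smoke test');
  if (vector.length !== expectedDim) {
    throw new Error(`Expected ${expectedDim} dimensions, got ${vector.length}`);
  }
  if (!vector.every((value) => Number.isFinite(value))) {
    throw new Error('Embedding contains non-finite values');
  }
}
```

With the default model, a deploy-time check under these assumptions would call `smokeEmbed(adapter, 384)` and fail the deploy if it rejects.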