fix: call reindexAll() when embedding config changes to preserve vector embeddings#200
Open
gkd2323c wants to merge 2 commits into
Open
fix: call reindexAll() when embedding config changes to preserve vector embeddings#200gkd2323c wants to merge 2 commits into
gkd2323c wants to merge 2 commits into
Conversation
When the embedding provider/model/dimensions config changes, initSchema() correctly detects the change via the needsReindex flag and drops the vector tables. However, tdai-core.ts ignores this flag entirely, so reindexAll() —which already exists and is fully implemented in sqlite.ts—is never called. This means all historical L1 records and L0 messages permanently lose their vector embeddings after a config change. The user must manually re-embed data or wait for new conversations to accumulate vectors. Fix: after initStores() returns, check stores.needsReindex and call reindexAll() with the embedding service to automatically re-embed all existing records. Fixes TencentCloud#164
When the embedding provider/model/dimensions config changes, initSchema() correctly detects the change via the needsReindex flag and drops the vector tables. However, tdai-core.ts ignores this flag entirely, so reindexAll() —which already exists and is fully implemented in sqlite.ts—is never called. This means all historical L1 records and L0 messages permanently lose their vector embeddings after a config change. The user must manually re-embed data or wait for new conversations to accumulate vectors. Fix: after initStores() returns, check stores.needsReindex and call reindexAll() with the embedding service to automatically re-embed all existing records. Fixes TencentCloud#164
Collaborator
|
Good catch on the missing reindexAll() call! When embedding config changes, vectors should be regenerated instead of permanently lost. We'll review it soon. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
When the embedding provider/model/dimensions config changes,
initSchema()insqlite.tscorrectly detects the change via theneedsReindexflag and drops the vector tables. However,tdai-core.tsignores this flag entirely —reindexAll()(which is already fully implemented insqlite.ts) is never called.This means all historical L1 records and L0 messages permanently lose their vector embeddings after a config change. Users must either manually re-embed data or wait for new conversations to accumulate vectors.
Root Cause
In
tdai-core.ts, theinitStores()method destructuresvectorStoreandembeddingServicefrom the store initialization result, but silently dropsneedsReindexandreindexReason:The
reindexAll()method insqlite.ts(lines 1810–1880) is complete and handles both L1 and L0 re-embedding with progress callbacks — it just was never wired up.Fix
After
initStores()returns, checkstores.needsReindexand callreindexAll()with the embedding service to automatically re-embed all existing records.Related
reindexAll()method already exists insqlite.tsand is well-tested; this PR simply connects the missing call site.