feat(indexer): implement true incremental indexing pipeline #8
Add shared file indexing helpers, replace broad incremental fallback with path-targeted operations, wire watcher metadata for fs/git-head triggers, and add coverage for targeted updates, deletes, and fallback gating. Archive OpenSpec change true-incremental-indexing and sync the new incremental-indexing spec into main specs.
Pull request overview
Implements a true incremental indexing pipeline so daemon-triggered indexing updates only the changed paths (with explicit delete handling), while keeping a narrow full-index fallback for unresolved git-head transitions.
Changes:
- Refactors shared file upsert/delete + chunk/embedding/document generation into `createFileIndexHelpers`, used by both full and incremental indexing.
- Reworks `runIncrementalIndex` to process normalized/coalesced per-path operations and to return detailed per-run counters, with git-head fallback logic.
- Updates daemon watcher wiring and adds incremental indexing integration tests + OpenSpec documentation.
Reviewed changes
Copilot reviewed 7 out of 11 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/incremental-indexing.test.ts | Adds integration tests for targeted updates, delete cleanup, and git-head fallback behavior. |
| src/daemon/main.ts | Passes richer FS/git trigger metadata into incremental indexing and forwards payload to hooks. |
| src/context/indexer/incremental.ts | Implements per-path incremental pipeline, coalescing, git diff resolution, and fallback result reporting. |
| src/context/indexer/full-index.ts | Refactors full indexing to use shared file indexing helpers. |
| src/context/indexer/file-index.ts | New shared helper module for file upsert/delete, chunking, embedding, and bm25 document generation. |
| openspec/specs/incremental-indexing/spec.md | Adds a spec documenting incremental indexing requirements and scenarios. |
| openspec/changes/archive/2026-03-04-true-incremental-indexing/tasks.md | Marks archived change tasks as completed. |
| openspec/changes/archive/2026-03-04-true-incremental-indexing/specs/incremental-indexing/spec.md | Archives the incremental-indexing spec content under the change record. |
| openspec/changes/archive/2026-03-04-true-incremental-indexing/proposal.md | Archives proposal describing motivation and impact. |
| openspec/changes/archive/2026-03-04-true-incremental-indexing/design.md | Archives design decisions, tradeoffs, and migration plan. |
| openspec/changes/archive/2026-03-04-true-incremental-indexing/.openspec.yaml | Adds OpenSpec metadata for the archived change. |
Comments suppressed due to low confidence (3)
src/context/indexer/file-index.ts:113
- Building `absolutePath` via string concatenation can produce mixed separators (and double slashes) depending on platform and incoming `relativePath` format. Using `path.join(input.repoRoot, relativePath)` (and/or normalizing) would be more robust across OSes.
const absolutePath = `${input.repoRoot}/${relativePath}`;
const content = await readFile(absolutePath, 'utf8').catch(() => null);
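The separator concern can be illustrated with a minimal sketch. It assumes Node's built-in `path` module; `resolveIndexedPath` is a hypothetical helper name used only for illustration and does not appear in this PR.

```typescript
import * as path from 'node:path';

// Hypothetical helper (not part of the PR): builds the absolute path with
// path.join, which normalizes duplicate separators.
export function resolveIndexedPath(repoRoot: string, relativePath: string): string {
  return path.join(repoRoot, relativePath);
}

// Template-string concatenation keeps whatever separators it is given,
// so a trailing slash on the root produces a double slash.
const exampleRoot = '/repo/';
const exampleRelative = 'src/a.ts';
const concatenated = `${exampleRoot}/${exampleRelative}`; // "/repo//src/a.ts"
const joined = resolveIndexedPath(exampleRoot, exampleRelative);
```

Whether normalization alone is enough depends on how `relativePath` values reach the helper; the sketch only shows the difference between the two construction styles.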
src/context/indexer/incremental.ts:176
- Incremental operations are only normalized/coalesced here; they are not filtered against the same ignore set used by `runFullIndex` (e.g. `!coverage/**`) and the FS watcher (`!node_modules/**`, `!dist/**`, `!.git/**`). This can cause incremental runs to index generated/ignored paths that full indexing would skip, leading to inconsistent index contents depending on trigger. Consider applying a shared ignore/filter step before upserting/deleting.
function coalesceOperations(
repoRoot: string,
operations: IncrementalPathOperation[]
): IncrementalPathOperation[] {
const byPath = new Map<string, IncrementalPathOperation>();
for (const operation of operations) {
const normalizedPath = normalizeRepoRelativePath(repoRoot, operation.path);
if (!normalizedPath) {
continue;
}
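One way the suggested shared filter step could look is sketched below. This is an assumption-heavy illustration: `PathOperation`, `isIgnoredPath`, and `filterIgnoredOperations` are hypothetical names, the operation shape is simplified, and the ignore list is inferred only from the patterns named in the comment above. The real project may rely on glob matching instead.

```typescript
// Hypothetical shared ignore list, inferred from the patterns mentioned in the
// review comment (coverage, node_modules, dist, .git). Not taken from the PR.
const IGNORED_TOP_LEVEL_DIRS = new Set(['node_modules', 'dist', '.git', 'coverage']);

// Simplified stand-in for the PR's IncrementalPathOperation type (assumed shape).
type PathOperation = { type: 'upsert' | 'delete'; path: string };

// Returns true when the repo-relative path starts with an ignored directory.
export function isIgnoredPath(relativePath: string): boolean {
  const firstSegment = relativePath.replace(/\\/g, '/').split('/')[0] ?? '';
  return IGNORED_TOP_LEVEL_DIRS.has(firstSegment);
}

// Drops operations on ignored paths before any upsert or delete runs.
export function filterIgnoredOperations(operations: PathOperation[]): PathOperation[] {
  return operations.filter((operation) => !isIgnoredPath(operation.path));
}
```

Sharing one predicate between full indexing, the watcher, and the incremental path would keep index contents independent of which trigger fired.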
src/context/indexer/incremental.ts:118
- The branch that treats a missing file as a delete is important behavior, but it isn't covered by the new tests (current tests cover explicit delete ops, not an upsert for a now-missing path). Adding a test where an `upsert` operation is passed for a path that has been removed on disk would lock this in.
const upserted = await fileIndexHelpers.upsertIndexedFileByPath(operation.path);
if (upserted.status === 'missing') {
const deleted = await fileIndexHelpers.deleteIndexedFileByPath(operation.path);
counters.filesDeleted += deleted.fileDeleted ? 1 : 0;
counters.chunksDeleted += deleted.chunksDeleted;
continue;
}
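The missing-file branch can be exercised without touching disk or a database, as in the sketch below. Everything here is an assumption for illustration: only the `'missing'` status and the two delete counters appear in the quoted snippet, while the `'indexed'` status, the type names, `createFakeHelpers`, and `applyUpsert` are hypothetical.

```typescript
// Assumed shapes; only status 'missing', fileDeleted, and chunksDeleted are
// visible in the quoted snippet. The rest is hypothetical.
type UpsertResult = { status: 'indexed' | 'missing' };
type DeleteResult = { fileDeleted: boolean; chunksDeleted: number };
type Counters = { filesIndexed: number; filesDeleted: number; chunksDeleted: number };

interface FileIndexHelpers {
  upsertIndexedFileByPath(path: string): Promise<UpsertResult>;
  deleteIndexedFileByPath(path: string): Promise<DeleteResult>;
}

// In-memory fake: `indexed` maps path -> chunk count, `onDisk` lists existing files.
export function createFakeHelpers(
  indexed: Map<string, number>,
  onDisk: Set<string>
): FileIndexHelpers {
  return {
    async upsertIndexedFileByPath(path) {
      return { status: onDisk.has(path) ? 'indexed' : 'missing' };
    },
    async deleteIndexedFileByPath(path) {
      const chunksDeleted = indexed.get(path) ?? 0;
      const fileDeleted = indexed.delete(path);
      return { fileDeleted, chunksDeleted };
    },
  };
}

// Mirrors the quoted branch: an upsert for a missing file becomes a delete.
export async function applyUpsert(
  helpers: FileIndexHelpers,
  path: string,
  counters: Counters
): Promise<void> {
  const upserted = await helpers.upsertIndexedFileByPath(path);
  if (upserted.status === 'missing') {
    const deleted = await helpers.deleteIndexedFileByPath(path);
    counters.filesDeleted += deleted.fileDeleted ? 1 : 0;
    counters.chunksDeleted += deleted.chunksDeleted;
    return;
  }
  counters.filesIndexed += 1;
}
```

A real test in `tests/incremental-indexing.test.ts` would presumably go through the actual helpers and storage; the fake only pins down the expected counter behavior.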
async function upsertIndexedFileByPath(relativePath: string): Promise<UpsertFileResult> {
  const absolutePath = `${input.repoRoot}/${relativePath}`;
  const content = await readFile(absolutePath, 'utf8').catch(() => null);
  if (!content) {
`readFile` returns an empty string for empty files, but the current `if (!content)` check treats that as missing and will trigger delete behavior upstream. Consider checking `content === null` (or catching only `ENOENT`) so empty files are indexed correctly.
Suggested change:
-  if (!content) {
+  if (content === null) {
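The alternative mentioned in the comment, catching only `ENOENT`, might look like the sketch below. `readFileOrNull` and `isMissing` are hypothetical names for illustration; `readFile` is the standard `node:fs/promises` function.

```typescript
import { readFile } from 'node:fs/promises';

// Hypothetical helper: returns null only when the file does not exist and
// rethrows every other error (permissions, I/O failures, and so on).
export async function readFileOrNull(absolutePath: string): Promise<string | null> {
  try {
    return await readFile(absolutePath, 'utf8');
  } catch (error) {
    if ((error as { code?: string }).code === 'ENOENT') {
      return null;
    }
    throw error;
  }
}

// An empty file yields '' (falsy but not null), so the missing check compares
// against null instead of relying on truthiness.
export function isMissing(content: string | null): boolean {
  return content === null;
}
```

Compared with a blanket `.catch(() => null)`, this keeps unexpected read failures visible instead of silently turning them into deletes.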
Summary
This PR implements true incremental indexing and removes the previous broad full-reindex fallback behavior for normal filesystem changes.
What changed
- […] `upsertIndexedFileByPath` […] `deleteIndexedFileByPath` […] (`unlink`) and treat missing files during upsert as deletes.
- […] (`add`/`change`/`unlink`) […]
- […] `true-incremental-indexing` and synced capability spec to: `openspec/specs/incremental-indexing/spec.md`
Verification
- `pnpm checks`
- `pnpm test`
- `pnpm typecheck`
- `pnpm lint`
- `pnpm build`
- […] `tests/incremental-indexing.test.ts` covering: […] `files`/`chunks`/`chunk_embeddings`/`bm25_documents` […]
Notes