fix(sync): detect HEAD-moving git operations to prevent stale index #100

Open
andreinknv wants to merge 1 commit into colbymchenry:main from andreinknv:fix/sync-detect-head-movement

@andreinknv

Summary

Before this change, sync silently left the index stale after any HEAD-moving git operation (merge, pull, checkout, rebase, reset, even a post-commit sync) because `git status --porcelain` only reports working-tree changes relative to HEAD. When the working tree was clean, sync short-circuited and reported that nothing had changed, even though the DB still held pre-operation content hashes. MCP queries then returned stale call graphs and missing or deleted symbols to AI assistants with no warning.

What changed

  • `getGitChangedFiles` now also unions `git diff --name-status <lastSyncedHead>..HEAD` into the change set, catching commits that arrived without dirtying the working tree.
  • `last_synced_head` is recorded in the existing `project_metadata` table after every successful `indexAll` and `sync`. No schema change.
  • If the recorded HEAD is unreachable (force-push, gc), the function returns `needsFullReindex: true` and the caller drops to the filesystem-scan fallback, which is correct regardless of git history state.
  • The same wiring is applied to `getChangedFiles()` so MCP staleness signals stay accurate.
  • Five regression tests cover `git merge`, `git checkout` to a branch with diverged content, a committed deletion, an unreachable last-synced HEAD (amended commit), and the no-op clean-tree sanity case.
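
For reviewers who want the shape of the new detection logic without opening the diff, here is a minimal sketch of the union described above. It is not the PR's code: `GitRunner`, `ChangeSet`, `ChangeKind`, `getGitChangedFilesSketch` and the parsing helpers are hypothetical names, git is injected as a function so the logic can be exercised without a repository, and the sketch simply treats any failure of the `git diff` call as the "needs full reindex" case, which may not match how the PR detects an unreachable HEAD.

```typescript
// Illustrative sketch only: names and shapes are assumptions, not the PR's code.
type ChangeKind = "added" | "modified" | "deleted";

interface ChangeSet {
  changes: Map<string, ChangeKind>;
  needsFullReindex: boolean;
}

// Runs `git <args>` and returns raw (untrimmed) stdout, or null on non-zero exit.
type GitRunner = (args: string[]) => string | null;

function kindFromCode(code: string): ChangeKind {
  if (code.includes("D")) return "deleted";
  if (code.includes("A") || code.includes("?")) return "added";
  return "modified";
}

// Parses `git diff --name-status` lines such as "M\tsrc/a.ts" or "R100\told.ts\tnew.ts".
function parseNameStatus(out: string, into: Map<string, ChangeKind>): void {
  for (const line of out.split("\n")) {
    const parts = line.split("\t");
    const code = parts[0] ?? "";
    const first = parts[1];
    const second = parts[2];
    if (!first) continue;
    if (code.startsWith("R") && second) {
      into.set(first, "deleted"); // old path of a rename
      into.set(second, "added"); // new path of a rename
    } else if (code.startsWith("C") && second) {
      into.set(second, "added"); // copy: only the destination is new
    } else {
      into.set(first, kindFromCode(code));
    }
  }
}

// Parses `git status --porcelain` lines such as " M src/a.ts" or "?? new.ts".
// Quoted paths are not handled in this sketch.
function parsePorcelain(out: string, into: Map<string, ChangeKind>): void {
  for (const line of out.split("\n")) {
    if (line.length < 4) continue;
    const code = line.slice(0, 2);
    const path = line.slice(3);
    const arrow = path.indexOf(" -> ");
    if (code.includes("R") && arrow !== -1) {
      into.set(path.slice(0, arrow), "deleted");
      into.set(path.slice(arrow + 4), "added");
    } else {
      into.set(path, kindFromCode(code));
    }
  }
}

function getGitChangedFilesSketch(
  git: GitRunner,
  lastSyncedHead: string | null,
): ChangeSet {
  const changes = new Map<string, ChangeKind>();
  const head = git(["rev-parse", "HEAD"])?.trim() ?? null;

  // Committed changes: only relevant when HEAD moved since the last sync.
  if (lastSyncedHead && head && lastSyncedHead !== head) {
    const diff = git(["diff", "--name-status", `${lastSyncedHead}..HEAD`]);
    // Assumption: a failing diff means the recorded commit can no longer be used.
    if (diff === null) return { changes, needsFullReindex: true };
    parseNameStatus(diff, changes);
  }

  // Working-tree changes are applied second, so they override committed state.
  const status = git(["status", "--porcelain"]);
  if (status === null) return { changes, needsFullReindex: true };
  parsePorcelain(status, changes);

  return { changes, needsFullReindex: false };
}
```

With a runner that reports a clean working tree but a HEAD different from the recorded one, the sketch still returns the files touched by the intervening commits, which is the case the old status-only check missed.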

Affected scenarios (before fix)

Anyone working on a team or across branches:

  • `git pull` from a remote: every pull leaves the index stale
  • `git checkout <branch>`: every branch switch
  • `git rebase` / `git reset --hard`
  • `git commit` followed by a post-commit-style auto-sync: also affected, since the working tree is already clean by the time the post-commit sync runs

Single-developer single-branch workflows that only edit-then-sync without git operations were unaffected.

Test plan

  • `npm test`: 385/385 pass on macOS
  • New regression tests in `__tests__/sync.test.ts` cover the four failure modes plus a clean-tree no-op sanity check
  • Pre-existing `fs.watch` flake (`should trigger sync after file change`) is unrelated and passes when the file is run in isolation

🤖 Generated with Claude Code

Sync used to rely solely on `git status --porcelain` for change detection,
which only reports working-tree dirtiness vs HEAD. After a `git merge`
(or pull, checkout, rebase, reset, post-commit), the working tree is
clean and `git status` reports nothing, so sync silently became a no-op
while the DB still held pre-operation content hashes. MCP queries then
served stale data with no warning.

Sync now records the HEAD SHA it was last synced against (in the existing
project_metadata table) and, when current HEAD differs, unions
`git diff --name-status <last>..HEAD` into the changed-file set. If the
recorded HEAD is unreachable (force-push, gc), sync falls back to the
filesystem scan path, which is correct regardless of git history state.

The same fix is applied to getChangedFiles() so MCP staleness signals
stay accurate.

Adds 5 regression tests covering merge, branch checkout, committed
deletion, unreachable last-synced HEAD, and the no-op clean-tree case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>