bug-a7d17cd2: incremental reindex propagates track_id changes from bug/feature move #84

Merged
shakestzd merged 2 commits into main from
bug/bug-a7d17cd2-reindex-track-id
May 1, 2026

@shakestzd
Owner

Motivation

Found while moving bug-92690d5b to the Yolo Mode track: after running htmlgraph bug move, the context-pack tool showed empty track membership because the SQLite index still had the old track_id. Only htmlgraph reindex --full fixed it. This blocked lineage queries for the entire incremental session window.

Root Cause

gitChangedFiles (called by runIncrementalReindex) collects changed files from two sources:

  1. git diff <lastCommit>..HEAD — committed changes since last reindex
  2. git ls-files --others — newly untracked files

Commands like bug move / feature move modify the HTML file in-place via workitem.Edit().SetTrack().Save() without committing. This means after a move, the changed .htmlgraph/*.html file is a working-tree dirty tracked file — invisible to both sources above. Incremental reindex skips it entirely, leaving track_id stale in SQLite until a --full reindex is run.

The UpsertFeature upsert SQL does include track_id = excluded.track_id in the ON CONFLICT clause, so the upsert path itself is correct — the bug is that the incremental path never reaches the upsert for dirty files.

Fix

Added two functions in cmd/htmlgraph/reindex.go:

  • appendDirtyHTMLFiles: runs git diff --name-only -- <htmlgraphDir> (unstaged) and git diff --cached --name-only -- <htmlgraphDir> (staged) to include working-tree dirty files in the incremental file list. This is the same pattern already used for committed changes, extended to the working-tree layer.
  • deduplicatePaths: prevents double-processing when the same file appears in both the commit diff and the working-tree diff (e.g., a file modified and staged in the same session).

No changes to internal/htmlparse/ or internal/db/feature_repo.go were needed — the parsing and upsert paths were already correct.
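
The deduplication step can be illustrated with a minimal sketch. It assumes order-preserving, first-occurrence-wins semantics; the actual deduplicatePaths in cmd/htmlgraph/reindex.go is not shown in this description and may differ. The file paths in main are made up for illustration.

```go
package main

import "fmt"

// deduplicatePaths returns paths with duplicates removed. It keeps the first
// occurrence of each path and preserves the input order.
// NOTE: a sketch under assumed semantics, not the code merged in this PR.
func deduplicatePaths(paths []string) []string {
	seen := make(map[string]struct{}, len(paths))
	out := make([]string, 0, len(paths))
	for _, p := range paths {
		if _, ok := seen[p]; ok {
			continue // already queued for reindexing
		}
		seen[p] = struct{}{}
		out = append(out, p)
	}
	return out
}

func main() {
	// Hypothetical file list: the same file reported by the commit diff
	// and again by the working-tree diff.
	files := []string{
		".htmlgraph/bug-aaaa.html",
		".htmlgraph/feat-bbbb.html",
		".htmlgraph/bug-aaaa.html",
	}
	fmt.Println(deduplicatePaths(files))
}
```

Keeping the input order means files are still processed in the order the git commands reported them, which makes reindex logs easier to compare between runs.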

Tests

  • TestIncrementalReindex_PropagatesTrackIDAfterMove: seeds a feature with old track, writes updated HTML with new track (simulating bug move), runs reindexFromFileLists, asserts SQLite track_id matches HTML — without needing a git commit.
  • TestDeduplicatePaths: unit-tests the deduplication helper.

Acceptance

  • Incremental reindex (no --full) now picks up track_id changes from bug move / feature move without requiring a git commit
  • --full reindex unaffected (uses the separate reindexFeatureDir path)
  • go build ./... && go vet ./... && go test ./... all pass
  • htmlgraph build succeeds

Migration Note

Existing stale rows in user DBs are not auto-migrated. A one-time htmlgraph reindex --full on next upgrade is sufficient to correct any stale track_id values.

🤖 Generated with Claude Code

…g/feature move

Commands like `bug move` modify the HTML file in-place without committing to git.
The incremental reindex only checked `git diff <lastCommit>..HEAD` (committed changes)
and `git ls-files --others` (untracked new files), so dirty (modified-but-uncommitted)
files were never picked up — leaving track_id stale in SQLite until `--full` reindex.

Fix: add `appendDirtyHTMLFiles` which runs `git diff --name-only` (unstaged) and
`git diff --cached --name-only` (staged) to include working-tree dirty .htmlgraph files
in the incremental reindex file list. `deduplicatePaths` prevents double-processing
when the same file appears in both the commit diff and the working-tree diff.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 000688c9ad

Comment thread cmd/htmlgraph/reindex.go Outdated
Comment on lines +313 to +314
{"diff", "--name-only", "--", relHg},
{"diff", "--cached", "--name-only", "--", relHg},

P2: Filter out deleted paths from dirty-file additions

Using git diff --name-only for dirty files returns deleted .html paths as well as modified ones, but this helper appends every returned path to the added list. During incremental reindex, the deleted files are then parsed as if they still exist, which produces parse errors and leaves stale feature/track rows because the paths never enter the deleted flow. This occurs whenever a tracked .htmlgraph/*.html file is removed (staged or unstaged) before commit.

git diff --name-only includes deleted files. The previous
appendDirtyHTMLFiles routed all dirty paths to the upsert list, so a
dirty-deleted .htmlgraph/*.html would either error during parse or leave
a stale SQLite row instead of being removed.

Fix: switch to git diff --name-status and route by status code:
  A, M → added (upsert)
  D    → deleted (remove from SQLite)
  R    → old path to deleted, new path to added

Caller updated to receive both lists. Added regression test
TestIncrementalReindex_DirtyDeletion: create + index → os.Remove →
incremental reindex → assert SQLite row is gone.
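
The status routing described above can be sketched as a pure function over the text that git diff --name-status prints (tab-separated, with renames shown as a score such as R100 followed by the old and new paths). The function name splitByStatus is hypothetical and the merged code may be structured differently. Status codes other than A, M, D and R are ignored in this sketch, which is an assumption rather than something the commit message states.

```go
package main

import (
	"fmt"
	"strings"
)

// splitByStatus parses `git diff --name-status` output and routes each path:
//   A, M -> added (to be upserted)
//   D    -> deleted (to be removed from the index)
//   R    -> old path to deleted, new path to added
// NOTE: hypothetical helper for illustration, not the code merged in this PR.
func splitByStatus(output string) (added, deleted []string) {
	for _, line := range strings.Split(output, "\n") {
		fields := strings.Split(strings.TrimRight(line, "\r"), "\t")
		if len(fields) < 2 || fields[0] == "" {
			continue // blank or malformed line
		}
		switch fields[0][0] { // first byte is the status letter
		case 'A', 'M':
			added = append(added, fields[1])
		case 'D':
			deleted = append(deleted, fields[1])
		case 'R':
			// Rename lines look like "R100\told\tnew".
			if len(fields) >= 3 {
				deleted = append(deleted, fields[1])
				added = append(added, fields[2])
			}
		}
	}
	return added, deleted
}

func main() {
	// Made-up sample output covering each routed status code.
	sample := "M\t.htmlgraph/a.html\nD\t.htmlgraph/b.html\nR100\t.htmlgraph/c.html\t.htmlgraph/d.html\n"
	added, deleted := splitByStatus(sample)
	fmt.Println("added:", added)
	fmt.Println("deleted:", deleted)
}
```

One caveat: without -z, git quotes paths that contain unusual characters, so a parser like this is only reliable for plain file names.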
@shakestzd merged commit 73345ac into main on May 1, 2026
2 checks passed
@shakestzd deleted the bug/bug-a7d17cd2-reindex-track-id branch on May 1, 2026 21:21