fix(merge-batch-graphs): accept batch-existing.json + fail loud on unrecognized filenames (#292)#389
Open
tirth8205 wants to merge 1 commit into
Open
Conversation
…recognized filenames (Egonex-AI#292) The /understand skill's incremental update path (Phase 2) writes the pruned existing nodes/edges as batch-existing.json, but the merge script filtered input files with `batch-(\d+)(?:-part-(\d+))?\.json` and silently dropped the file because it has no digits. Up to ~70% of the graph could be lost (every node from unchanged files) while the script exited 0. Defence-in-depth: 1. Accept `batch-existing.json` via `batch-(\d+|existing)(?:-part-(\d+))?\.json`. It sorts first (synthetic index -1) so baseline nodes are seen before freshly-analyzed batches; combined with Step 5's "keep last" dedup, re-analyzed nodes always win over their existing-graph copies — the correct incremental-update semantics. 2. Make unrecognized filenames a hard exit instead of a stderr warning that callers routinely missed. A partial merged graph is never produced when a batch file with an unknown name is present. 3. Add tests covering batch-existing.json acceptance, sort order, and the new hard-exit contract for unknown filenames. SKILL.md already instructed writing batch-existing.json — no doc change needed beyond confirming the contract now matches the implementation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The
/understandskill's incremental-update path (Phase 2 ofSKILL.md) writes the pruned existing nodes/edges asbatch-existing.json, butmerge-batch-graphs.pyfiltered input files withre.compile(r'batch-(\d+)(?:-part-(\d+))?\.json')and silently dropped the file because the name carries no digits. Up to ~70% of the graph (every node from unchanged files) could be lost while the script exited 0.This PR fixes the regex and turns silent drops into a hard failure so this kind of breakage can never reappear.
Defence-in-depth changes
batch-existing.jsonvia a new union patternbatch-(\d+|existing)(?:-part-(\d+))?\.json. The file sorts first with a synthetic index of-1, so baseline nodes are seen before any freshly-analyzed batch. Combined with Step 5's "keep last" dedup, this means a re-analyzed node always wins over its existing-graph copy — the correct incremental-update semantics.batch-existing.jsonbug for so long.tests/skill/understand/test_merge_batch_graphs.py:TestBatchExistingFilenamecovers the happy path:batch-existing.jsonalongsidebatch-0.json, alone, and the sort-order guarantee that fresh batches override the baseline copy.TestUnrecognizedBatchFilenamewas rewritten to assert the new hard-exit contract (no partial graph is written).batch-existing.json. The bug was purely in the merge script's regex, not in the doc, and there was no workaround text to remove.Closes #292
Test plan
python3 -m unittest tests.skill.understand.test_merge_batch_graphs— 71/71 pass (was 69; +3 new inTestBatchExistingFilename, -1 obsolete inTestUnrecognizedBatchFilename)tests/skill/understand/test_compute_batches.test.mjs,test_scan_project.test.mjs,test_extract_import_map.test.mjs, andunderstand-anything-plugin/src/__tests__/merge-recover-imports.test.mjsare unaffected (none of them touchbatch-existing.jsonor rely on the dropped warning behaviour).batch-existing.jsonplus every node frombatch-<N>.jsonfiles, with fresh-batch winning on collision.🤖 Generated with Claude Code