fix(merge): recognize batch-existing.json so incremental updates don't drop ~75% of nodes #403

Open
atlas-architect wants to merge 1 commit into Egonex-AI:main from atlas-architect:fix/incremental-batch-existing-not-dropped

@atlas-architect

Contributor

Problem

/understand's incremental-update path is silently lossy. During an incremental update the skill writes the pruned existing-graph payload as batch-existing.json alongside the freshly-analyzed batch-<N>.json files, then runs merge-batch-graphs.py.

But the merge step buckets files by the batch-(\d+) filename match, so batch-existing.json (no digits) falls through to the "unrecognized" branch and is dropped at load — only a stderr warning, exit 0. Net effect: every carried-over node from the previous scan is lost.
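The failure mode can be reproduced in isolation. The following is a minimal sketch, assuming the pre-fix matcher was `batch-(\d+)` with an optional `-part-<M>` suffix and that it was applied to the whole filename. The suffix, the use of `fullmatch`, and the helper name `classify` are assumptions for illustration, not code copied from merge-batch-graphs.py.

```python
import re

# Assumed pre-fix pattern: digits only, optional "-part-<M>" suffix.
OLD_PATTERN = re.compile(r"batch-(\d+)(?:-part-(\d+))?\.json")


def classify(filenames):
    """Split filenames into (recognized, unrecognized) under OLD_PATTERN."""
    recognized, unrecognized = [], []
    for name in filenames:
        if OLD_PATTERN.fullmatch(name):
            recognized.append(name)
        else:
            # Hypothetical stand-in for the "unrecognized -> warn" branch.
            unrecognized.append(name)
    return recognized, unrecognized


recognized, unrecognized = classify(
    ["batch-existing.json", "batch-0.json", "batch-1.json"]
)
print("loaded:", recognized)
print("dropped:", unrecognized)
```

Under these assumptions only the numbered batches are loaded, and batch-existing.json lands in the dropped list even though it carries every unchanged node.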

In a 305-node baseline + 10 changed files, the incremental output had 92 nodes; the 213 unchanged-file nodes were gone. (The known workaround was to manually rename batch-existing.json to batch-0.json.)

This is the bug tracked in #292.

Fix

Recognize existing as a valid logical batch in merge-batch-graphs.py:

  • the filename match becomes batch-(\d+|existing)(?:-part-(\d+))?\.json
  • batch-existing.json is bucketed/sorted as logical index -1 so it loads before the freshly-analyzed numbered batches

So batch-existing.json is now loaded and merged like any other batch instead of being dropped. Genuinely-malformed filenames (e.g. batch-fused-8-13.json) still hit the existing "unrecognized → warn" path, unchanged.
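The pattern and ordering described above can be sketched as follows. The regex is quoted from this description; the helper name `sort_key`, the use of `fullmatch`, and the default part index of 0 are assumptions for illustration and may differ from what the script actually does.

```python
import re

# Pattern as quoted in the fix description.
PATTERN = re.compile(r"batch-(\d+|existing)(?:-part-(\d+))?\.json")

EXISTING_INDEX = -1  # logical index so "existing" sorts before batch 0


def sort_key(filename):
    """Return (batch_index, part_index), or None if the name is unrecognized."""
    match = PATTERN.fullmatch(filename)
    if match is None:
        return None
    batch, part = match.groups()
    index = EXISTING_INDEX if batch == "existing" else int(batch)
    # Assumption: files without a "-part-<M>" suffix are treated as part 0.
    return (index, int(part) if part is not None else 0)


names = [
    "batch-1.json",
    "batch-fused-8-13.json",
    "batch-existing.json",
    "batch-0-part-2.json",
    "batch-0-part-1.json",
]
load_order = sorted((n for n in names if sort_key(n) is not None), key=sort_key)
still_unrecognized = [n for n in names if sort_key(n) is None]
print(load_order)
print(still_unrecognized)
```

Mapping `existing` to -1 keeps the sort key a plain tuple of integers, so the carried-over payload sorts ahead of every numbered batch without a special case in the comparison.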

Test

Adds TestIncrementalExistingBatch to tests/skill/understand/test_merge_batch_graphs.py — runs the script end-to-end on a temp intermediate dir containing batch-existing.json + batch-0.json and asserts the existing nodes survive the merge (and aren't reported as a dropped/unrecognized filename). Full suite green locally (71 tests).
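The shape of such a regression test might look like the sketch below. It is not the test added in this PR: the real test runs the script end-to-end, whereas here `merge_nodes` is a hypothetical stand-in, and the `{"nodes": [{"id": ...}]}` payload layout and the class name are assumptions made only so the example is self-contained.

```python
import json
import tempfile
import unittest
from pathlib import Path


def merge_nodes(directory):
    """Hypothetical stand-in for the merge step: collect node ids from batch files."""
    ids = []
    for path in sorted(Path(directory).glob("batch-*.json")):
        for node in json.loads(path.read_text())["nodes"]:
            ids.append(node["id"])
    return ids


class TestIncrementalExistingBatchSketch(unittest.TestCase):
    def test_existing_nodes_survive(self):
        with tempfile.TemporaryDirectory() as tmp:
            tmp_path = Path(tmp)
            # Carried-over nodes from the previous scan (assumed payload shape).
            (tmp_path / "batch-existing.json").write_text(
                json.dumps({"nodes": [{"id": "old-a"}, {"id": "old-b"}]})
            )
            # Freshly analyzed batch.
            (tmp_path / "batch-0.json").write_text(
                json.dumps({"nodes": [{"id": "new-c"}]})
            )
            merged = merge_nodes(tmp_path)
        self.assertIn("old-a", merged)
        self.assertIn("old-b", merged)
        self.assertIn("new-c", merged)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(
    TestIncrementalExistingBatchSketch
)
result = unittest.TextTestRunner().run(suite)
```

The assertion that matters is that ids originating from batch-existing.json appear in the merged output alongside the new ones.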

Surfaced during fleet-side dogfooding (Atlas Intelligence — same project that filed #292 and the emoji-folder git ls-files -z fix in #231).

Fixes #292

…t drop ~75% of nodes

During an incremental `/understand` update the skill writes the pruned
existing-graph payload as `batch-existing.json` alongside the freshly
analyzed `batch-<N>.json` files. The merge step bucketed it by the
`batch-(\d+)` filename match, so `batch-existing.json` (no digits) fell
through to "unrecognized" and was silently dropped at load — only a stderr
warning, exit 0. Net effect: incremental merges lost every carried-over
node. In a 305-node baseline + 10 changed files, the incremental output
had 92 nodes; the 213 unchanged-file nodes were gone.

Fix: recognize `existing` as a valid logical batch (sorted/bucketed as
index -1 so it loads before the numbered batches). Adds a regression test
(TestIncrementalExistingBatch) that runs the script end-to-end and asserts
batch-existing.json nodes survive the merge.

Fixes Egonex-AI#292

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Incremental: skill text instructs writing 'batch-existing.json' which merge regex silently drops