Skip to content

fix: correctness bugs (Svelte off-by-one, watcher dropped changes, traversal defaults, context clamping)#38

Open
mschreib28 wants to merge 1 commit into
mainfrom
upstream/fix/audit-correctness
Open

fix: correctness bugs (Svelte off-by-one, watcher dropped changes, traversal defaults, context clamping)#38
mschreib28 wants to merge 1 commit into
mainfrom
upstream/fix/audit-correctness

Conversation

@mschreib28
Copy link
Copy Markdown
Owner

Summary\n\nFour user-visible correctness bugs caught by an independent codebase audit. Each fix has a regression test that fails before the change and passes after.\n\n### 1. Svelte symbols were reported on the wrong line — every time\n\nsrc/extraction/svelte-extractor.ts:144\n\nThe <script> block regex captures content starting with the newline that follows >. The inner extractor sees that newline as line 1 of its (1-indexed) input and the first real code on line 2. The old code did contentStartLine = scriptTagLine + openingTagLines + 1 and added that to the inner extractor's already-1-indexed line numbers — shifting every Svelte symbol's startLine / endLine off by 1. Dropping the +1 produces correct positions. Five new tests cover single-line, multi-line opening tag, template-offset, single-line no-newline, and dual module/instance script blocks.\n\n### 2. Watcher silently dropped pending changes on sync failure\n\nsrc/sync/watcher.ts:177\n\nts\nthis.hasChanges = false; // cleared BEFORE the attempt\nthis.syncing = true;\ntry {\n await this.syncFn();\n} catch (err) {\n this.onSyncError?.(error); // no retry — batch is lost\n}\n\n\nIf the sync threw (DB locked, transient FS error) the pending batch was forgotten until a new file event arrived. Now hasChanges = true is re-set in the catch path so a transient failure schedules a retry on its own. Regression test mocks a fail-then-succeed syncFn and asserts the second call happens automatically without a new file event.\n\n### 3. Graph traversal default maxDepth was Infinity\n\nsrc/graph/traversal.ts:14, src/types.ts:301\n\nlimit: 1000 capped returned nodes, but during traversal the visited set and BFS/DFS frontier can grow far beyond limit on highly connected graphs before the cap kicks in. Default is now 10. The TraversalOptions.maxDepth JSDoc documents that callers who really need exhaustive traversal can still pass Infinity explicitly. This is a public-API behavior change — existing tests pass with the new default.\n\nAlso caps findPath's BFS queue at FIND_PATH_MAX_QUEUE = 100_000 entries (each holds a cloned path array, so on dense graphs the queue could otherwise consume gigabytes before either finding a path or exhausting the search). Cap is checked at the top of the BFS loop so it's a hard ceiling, not a near-miss.\n\n### 4. findRelevantContext did not bound caller-supplied limits\n\nsrc/context/index.ts:284\n\nsearchLimit is multiplied by 5 inside findNodesByExactName and feeds several other unbounded operations; a caller passing searchLimit: 1_000_000 would pull millions of rows. Now clamped:\n- searchLimit ∈ [1, 100]\n- maxNodes ∈ [1, 1000]\n- traversalDepth ∈ [0, 10]\n\nRegression test asserts a 1e9 input is bounded.\n\n## Files changed\n\n| File | Change |\n|---|---|\n| src/extraction/svelte-extractor.ts | Drop the +1 in contentStartLine |\n| src/sync/watcher.ts | Re-set hasChanges on sync failure |\n| src/graph/traversal.ts | Default maxDepth: 10, add findPath queue cap |\n| src/types.ts | Update TraversalOptions.maxDepth doc |\n| src/context/index.ts | Clamp searchLimit/maxNodes/traversalDepth |\n| __tests__/extraction.test.ts | 5 Svelte regression tests |\n| __tests__/watcher.test.ts | 1 retry-after-failure test |\n| __tests__/context.test.ts | 1 clamp test |\n\n## Test plan\n\n- [x] npm test (serialized): 387/387 pass (was 380, added 7)\n- [x] npx tsc --noEmit clean\n- [x] Independent reviewer pass before pushing — picked up missing dual-script and single-line test cases (now added) and tightened the findPath queue cap to be a hard ceiling\n\n🤖 Generated with Claude Code\n


Copied from colbymchenry/codegraph#98

Four user-visible bugs caught by an independent audit pass:

1. Svelte symbols reported on the wrong line
   src/extraction/svelte-extractor.ts:144

   The script-block regex captures content starting with the leading newline
   that follows `>`, so the inner extractor sees that newline as line 1 of
   its 1-indexed input and the first real code on line 2. The previous
   `contentStartLine = scriptTagLine + openingTagLines + 1` was added to
   that 1-indexed line number, shifting every Svelte symbol's startLine /
   endLine off by 1. Drops the `+1`. Five regression tests added covering
   single-line, multi-line opening tag, template-offset, single-line
   no-newline, and dual module/instance script blocks.

2. Watcher silently dropped pending changes on sync failure
   src/sync/watcher.ts:177

   `hasChanges = false` ran before the sync attempt, so a thrown sync
   (DB locked, transient FS error) left the pending batch forgotten until
   a NEW file event arrived. Re-set `hasChanges = true` in the catch path
   so a transient failure schedules a retry on its own. Regression test
   added (mocks fail-then-succeed, asserts the second call happens
   without a new file event).

3. Graph traversal default maxDepth was Infinity
   src/graph/traversal.ts:14, src/types.ts:301

   `limit: 1000` capped returned nodes, but during traversal the visited
   set and BFS/DFS frontier can grow far beyond `limit` on highly
   connected graphs before the cap kicks in. Default is now 10. Callers
   who really need exhaustive traversal can still pass `maxDepth:
   Infinity` explicitly — the JSDoc documents this. This is a
   public-API behavior change; existing tests pass.

   Also caps `findPath`'s BFS queue at 100,000 entries
   (FIND_PATH_MAX_QUEUE) and returns null if exceeded — each entry holds
   a cloned path array, so on dense graphs the queue could otherwise
   consume gigabytes before either finding a path or exhausting the
   search.

4. `findRelevantContext` did not bound caller-supplied limits
   src/context/index.ts:284

   `searchLimit` is multiplied by 5 in `findNodesByExactName` and feeds
   several other unbounded operations; a caller passing
   `searchLimit: 1_000_000` would pull millions of rows. Now clamped:
   searchLimit ∈ [1, 100], maxNodes ∈ [1, 1000], traversalDepth ∈ [0, 10].
   Regression test asserts a 1e9 input is bounded.

All 387 tests pass serialized; tsc clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants