Perf WG-5: Extension JSONL ingestion — tail contract, async prescan, sync FS audit

## Summary

Workflow 5 of 5 from the 2026-05-03 performance review of `main`. Parent: #46. Siblings: #71, #72, #73 (renderer/React), #74 (relay perf).

Includes one **real correctness bug** (CR-4: data-loss already covered by an existing regression test that the majority of callers violate). The rest are blocking-IO and sync-FS items on the JSONL ingestion hot path. Together with WG-4 (#74), this addresses the architectural drift between the three near-identical watcher paths (extension `SessionWatcher`, relay watcher, Codex watcher) — only the Codex watcher currently follows the documented contracts; the others have diverged.

## Critical

### CR-4 · `readNewFileLines` partial-line tail silently dropped by 4 of 5 callers

**This is the worst class of bug for a "relay everything" system: silent data loss.**

- `extension/src/fs-utils.ts:26-51` — definition returns `{ lines, newSize, tail }`; `tail` is the bytes after the last newline that the caller is supposed to feed back as `lastTail` next call to reassemble a JSONL line split across two writer flushes
- **Lossy callers** (omit `lastTail`, ignore returned `tail`, advance `fileSize` past the partial bytes):
  - `extension/src/session-watcher.ts:450` (extension)
  - `extension/src/subagent-watcher.ts:156`
  - `extension/src/event-source.ts:57`
  - `scripts/relay.ts:326` (this caller is owned by WG-4 / #74; coordinate the merge)
- **Working reference:** `extension/src/codex-session-watcher.ts:307` (passes `session.fileTail`, restored at line 274, persisted at line 310)
- **Contract is locked down by `extension/test/fs-utils.test.ts:70-83`** ("regression guard") which explicitly demonstrates the loss
- A split line then triggers `JSON.parse` failure inside `processTranscriptLine` and is silently dropped (`log.debug('Skipping unparseable line:')`)
- **Fix:** add `fileTail: string` to `WatchedSession`, `JsonlEventSource` state, and subagent state. Plumb through every caller. Mechanical change, working reference exists, regression test exists.

### CR-7 · `prescanExistingContent` synchronously reads + parses the full JSONL on session detect

- `extension/src/transcript-parser.ts:438-521`; called from `extension/src/session-watcher.ts:388` (and from `scripts/relay.ts:288` — that caller is WG-4)
- Helper at `extension/src/fs-utils.ts:7-16` does `fs.openSync` + `Buffer.alloc(stat.size)` + `fs.readSync` + `fs.closeSync` — `Buffer.alloc` zero-fills, which the caller immediately overwrites
- Then loops `JSON.parse(line.trim())` for every line in a tight loop
- For a typical `.sim-sessions/` transcript (500 KB / ~900 lines): 20-100 ms blocking the extension host UI thread. Real Claude transcripts can be 5-50 MB.
- `scanForActiveSessions` runs every 1 s and may attach multiple sessions per tick (extension side at `session-watcher.ts:264-353`).
- **Fix:** stream via `readline.createInterface` over `fs.createReadStream` (cleanest), OR yield via `await new Promise(setImmediate)` every 100 lines (cheapest). Switch `Buffer.alloc` → `Buffer.allocUnsafe` (caller fills from disk, zero-fill is wasted work).

## Important

### IR-16 · `scanForActiveSessions` runs every 1 s with sync FS scaling per-session

- `extension/src/session-watcher.ts:264-353`
- Per scan: `readdirSync(CLAUDE_DIR)` + per-dir `isContainedProject` (which itself does `readdirSync` + `openSync` + `readSync(8KB)` + `JSON.parse` until cwd is found, until cached) + per-jsonl `statSync` + per-subagent-dir `existsSync` + `readdirSync` + `statSync` per subagent file
- `scanForActiveSessions` body owned by extension; the relay's equivalent at `scripts/relay.ts:340-389` is owned by WG-4 (no caching at all there)
- **Fix:** rely primarily on the existing `dirWatcher` callback as the trigger; downgrade the 1 s interval to a 5-10 s heartbeat fallback. Cache negative results (see IR-IO MR-10).

### IR-17 · `scanSubagentsDir` runs on every appended-line read

- `extension/src/session-watcher.ts:461`
- `readNewLines()` is the per-write hot path (called for every `fs.watch` `change` event AND every 3 s poll tick) and it unconditionally calls `scanSubagentsDir(...)` → `fs.readdirSync(session.subagentsDir)` at the end
- The work *after* the first pass is just the readdirSync (already-watched files are skipped), but that syscall happens once per parent-file-change. Stress scenario (500 rapid writes) → 500 redundant readdirSyncs
- The `session.subagentsDirWatcher` (set up inside `scanSubagentsDir`) already gets a callback when new files appear — that's the right mechanism. The unconditional call from `readNewLines` is redundant.
- **Fix:** trigger `scanSubagentsDir` only once per session-attach; rely on the dir watcher for new subagent files.

### IR-22 · `discovery.ts:39` blocks activation up to 3 s on `execSync('command -v node')`

- `extension/src/discovery.ts:34-52`
- Synchronous shell spawn the first time `getHookCommand` is called from `migrateHttpHooks` (called from `startClaudeRuntime` at `claude-runtime.ts:79`)
- Cached after first call but blocks the activation/UI thread. Worst case: 3 s typical 50-200 ms
- Comment at line 32-34 explains why it's needed (Electron `process.execPath` is not node) — that justification stands; the issue is the synchronous call
- **Fix:** `execFile`/async, or guard behind an actual hook-write path so it only runs when needed (currently runs as part of routine startup if any project-level `settings.local.json` exists).

### IR-23 · `JsonlEventSource.processExistingContent` reads full file synchronously

- `extension/src/event-source.ts:45-54`
- `start()` calls `processExistingContent()` → `fs.readFileSync(this.filePath, 'utf-8')` on the entire file, splits, emits every line as an event
- Same shape as CR-7 but in the user-triggered "Connect to Running Agent" → "Watch JSONL File" path. Multi-hundred-ms freeze on the UI thread for large logs.
- **Fix:** stream via `fs.createReadStream` + `readline`; emit events incrementally with `setImmediate` yields.

### IR-24 · `subagent-watcher.startWatchingSubagentFile` reads full subagent JSONL on attach

- `extension/src/subagent-watcher.ts:103-126`
- For each newly-discovered subagent file: `fs.readFileSync(filePath, 'utf-8')` reads the entire file, then splits and `JSON.parse`s every line just to populate `seenToolUseIds` for dedup
- Sim subagent files are small (5-7 lines) so it's fast there, but real Claude subagent transcripts can be much larger. Hits the relay event loop on every subagent dir-watch event.
- **Fix:** stream-and-yield, same as CR-7 / IR-23.

## Minor

### MR-10 · `containedProjectCache` only caches positive matches

- `extension/src/session-watcher.ts:226-229`
- Comment at line 226-228 explains positives-only caching with a justification (a fresh dir might race) that doesn't cover the common case. Negative results re-scan every 1 s.
- **Fix:** TTL the negative cache (e.g., 30 s) or invalidate it from the `CLAUDE_DIR` watcher callback.

### MR-11 · Permission timers are re-created on every readNewLines

- `extension/src/permission-detection.ts:36-83`
- Cheap individually, but the `Array.from(pendingToolCalls.values()).some(...)` (line 50) is O(pending) per call, executed twice per readNewLines.
- **Fix:** only re-arm the timer when the pending set actually changes.

## Coverage gap to add (to bench)

The `stress` sim writes 500 events at ~5 ms intervals = ~50 KB total — far below real-world transcript sizes. Add:
- A `bigfile` scenario that pre-stages a 5 MB transcript before attaching, to exercise CR-7
- A `split-write` scenario that flushes mid-line, to exercise CR-4 directly against the regression test contract

## Parallelism

Independent of WG-1 (#71), WG-2 (#72), WG-3 (#73), WG-4 (#74). Owns `extension/src/{fs-utils,transcript-parser,session-watcher,subagent-watcher,event-source,codex-session-watcher,discovery,permission-detection}.ts`. Does **not** touch `scripts/relay.ts` (WG-4) or `extension/src/webview-provider.ts` (WG-4). The CR-4 caller fix at `relay.ts:326` is owned by WG-4 — flag this in your PR if WG-4 hasn't merged yet, since it's a one-line change in a file you don't otherwise touch.

## Test plan

- [ ] CR-4: extend `extension/test/fs-utils.test.ts` with a regression test per fixed caller; run sim with deliberate mid-line flushes; confirm zero `Skipping unparseable line` debug entries
- [ ] CR-7 / IR-23 / IR-24: connect to a 10 MB JSONL log; UI thread blocking time should drop from seconds to ~16 ms frames; events should arrive incrementally
- [ ] IR-16 / IR-17: at stress sim 200 evt/s, `strace -c -p $(pgrep ...)` should show fewer `getdents`/`stat` syscalls
- [ ] IR-22: extension activation time on a fresh workspace before/after; should be ~50-200 ms faster

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Perf WG-5: Extension JSONL ingestion — tail contract, async prescan, sync FS audit #75

Summary

Critical

CR-4 · `readNewFileLines` partial-line tail silently dropped by 4 of 5 callers

CR-7 · `prescanExistingContent` synchronously reads + parses the full JSONL on session detect

Important

IR-16 · `scanForActiveSessions` runs every 1 s with sync FS scaling per-session

IR-17 · `scanSubagentsDir` runs on every appended-line read

IR-22 · `discovery.ts:39` blocks activation up to 3 s on `execSync('command -v node')`

IR-23 · `JsonlEventSource.processExistingContent` reads full file synchronously

IR-24 · `subagent-watcher.startWatchingSubagentFile` reads full subagent JSONL on attach

Minor

MR-10 · `containedProjectCache` only caches positive matches

MR-11 · Permission timers are re-created on every readNewLines

Coverage gap to add (to bench)

Parallelism

Test plan

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Perf WG-5: Extension JSONL ingestion — tail contract, async prescan, sync FS audit #75

Description

Summary

Critical

CR-4 · readNewFileLines partial-line tail silently dropped by 4 of 5 callers

CR-7 · prescanExistingContent synchronously reads + parses the full JSONL on session detect

Important

IR-16 · scanForActiveSessions runs every 1 s with sync FS scaling per-session

IR-17 · scanSubagentsDir runs on every appended-line read

IR-22 · discovery.ts:39 blocks activation up to 3 s on execSync('command -v node')

IR-23 · JsonlEventSource.processExistingContent reads full file synchronously

IR-24 · subagent-watcher.startWatchingSubagentFile reads full subagent JSONL on attach

Minor

MR-10 · containedProjectCache only caches positive matches

MR-11 · Permission timers are re-created on every readNewLines

Coverage gap to add (to bench)

Parallelism

Test plan

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

CR-4 · `readNewFileLines` partial-line tail silently dropped by 4 of 5 callers

CR-7 · `prescanExistingContent` synchronously reads + parses the full JSONL on session detect

IR-16 · `scanForActiveSessions` runs every 1 s with sync FS scaling per-session

IR-17 · `scanSubagentsDir` runs on every appended-line read

IR-22 · `discovery.ts:39` blocks activation up to 3 s on `execSync('command -v node')`

IR-23 · `JsonlEventSource.processExistingContent` reads full file synchronously

IR-24 · `subagent-watcher.startWatchingSubagentFile` reads full subagent JSONL on attach

MR-10 · `containedProjectCache` only caches positive matches