test(watcher): close the spec 13.1 verification gap (read-cache handoff via the watcher's own flush) by clay-good · Pull Request #103 · clay-good/OpenLore

clay-good · 2026-05-31T21:29:29Z

What this is

Follow-up verification for #102 (spec 13.1, watch-mode performance). I re-read the merged work, re-ran the full test/bench/build, and audited each of the five documented root causes against the code. The mechanism is sound and the existing tests are genuine. This PR adds two regression tests that close the one real proof gap I found. No source changes.

Verification performed (all real, all re-run on this machine)

Check	Result
`tsc --noEmit`	clean
`npm run lint`	0 errors (3 pre-existing warnings in an untouched file)
CI-mirror `vitest run src examples`	2936 passed / 2 skipped / 0 failures
`npm run bench:watch`	next-call read after save 0.02 ms (cache HIT) vs 4.0 ms cold (169×); 50-file burst → 1 flush
watcher integration suite (real chokidar)	5/5 green

Root-cause → fix → test, audited:

Full llm-context.json rewrite per save → coalesced to one write per batch (Step 1). ✔ covered (G2 burst→1-flush test).
Forced cold re-parse on next tool call (mtime cache-bust) → primeContextCache handoff (Step 2). ✔ covered in isolation — the gap this PR closes (see below).
Full vector read + createTable(overwrite) → row-level updateFiles (delete(\filePath` IN …) + add`) (Step 3). ✔ covered, incl. the backtick-quoting trap.
stderr line per change → one summary line per batch (Step 6). ✔ covered (G5).
No cross-file coalescing → single pending Set + debounce + max-batch ceiling (Step 1). ✔ covered (G2/G3).

The gap I found

The merged PR proved root-cause #2 in two disconnected halves: a unit test proves primeContextCache→hit when called directly, and the integration tests prove freshness lands on disk through a real save. But nothing proved the two connect — that the watcher's own flush path (enqueue → flush → handleBatch → persistContext → primeContextCache) actually makes the next tool-call read a cache HIT instead of the cold ~2 MB re-parse that was the single biggest per-save cost. Freshness alone can't prove it (a cold re-parse also returns fresh data); the discriminator is whether a parse happened.

What this PR adds

mcp-watcher-incremental.test.ts (runs in CI): drives a flush via enqueue, spies on primeContextCache to capture the exact object the flush hands to the read cache, then asserts the next readCachedContext returns that same object reference — reference identity ⇒ no disk re-parse. Also asserts the flush primes exactly once.
mcp-watcher.integration.test.ts (real chokidar; npm run test:integration): the same proof through an actual file save + real FSWatcher, for end-to-end fidelity to the field scenario.

Finding for the owner (not changed here)

*.integration.test.ts is excluded from the default vitest config, so CI (npm run test:run) runs zero real-chokidar watcher tests — the integration suite only runs via npm run test:integration. Wiring the hermetic watcher integration file into CI would be worthwhile, but the full integration suite pulls in network/embedding-dependent files, so I left that as an owner decision rather than bundling it. The CI-covered incremental test added here gives regression protection in the meantime.

Honest limits (unchanged from #102)

This does not reproduce the original multi-second field symptom — that needs a real MCP client under load / a denser corpus than this box has. Confidence still rests on the root-cause analysis + the now-complete mechanism tests + the microbenchmark. This PR strengthens the second leg; it doesn't claim the field magnitude.

🤖 Generated with Claude Code

…cache handoff through the watcher's own flush path Follow-up verification for PR #102 (spec 13.1). The merged PR proved primeContextCache→hit in ISOLATION (the watcher calls a setter; the next read returns it), and proved freshness lands on disk through a real chokidar save — but nothing proved the two halves connect: that the watcher's OWN flush path (enqueue → flush → handleBatch → persistContext → primeContextCache) makes the next tool-call read a cache HIT rather than the cold ~2 MB re-parse that was root-cause #2 of the field regression. These two regression tests close that gap. - mcp-watcher-incremental.test.ts (runs in CI): drive a flush via enqueue, spy on primeContextCache to capture the exact object the flush hands to the read cache, then assert the next readCachedContext returns that SAME object (reference identity ⇒ no disk re-parse). Asserts the flush primes exactly once. - mcp-watcher.integration.test.ts (real chokidar, local/`test:integration`): the same proof through an actual file save and a real FSWatcher, for end-to-end fidelity to the field scenario. No source changes — the implementation was already correct; this only adds the missing proof. Verification: tsc --noEmit clean; lint 0 errors; CI-mirror (vitest run src examples) 2936 passed / 2 skipped / 0 failures; watcher integration suite 5/5 green. Note for the owner (not changed here): *.integration.test.ts is excluded from the default vitest config, so CI (`npm run test:run`) runs zero real-chokidar watcher tests — the integration suite only runs via `npm run test:integration`. Wiring the hermetic watcher integration file into CI would be worthwhile, but the full integration suite includes network/embedding-dependent files, so that's left as an owner decision rather than bundled here. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Bump version for the v2.0.6 release (spec 13.1 — O(change) watch-mode freshness). The release workflow publishes package.json's version on the v* tag, so this must match the tag. **Why this commit exists:** the first v2.0.6 publish (run 71639223540) failed with `npm error You cannot publish over the previously published versions: 2.0.5` — the v2.0.6 tag was pushed at a commit where package.json still said 2.0.5 (the bump was never made), so `npm publish` tried to republish 2.0.5. Two changes: - Bump package.json + package-lock.json 2.0.5 → 2.0.6 (the missing bump). - Add a "Verify tag matches package.json version" guard to the release workflow's validate job. A mismatched tag now fails fast and loud in `validate` with an actionable message, instead of sailing through and dying late in `npm publish` with the cryptic over-publish error. After this merges, the v2.0.6 tag must be re-created at the merge commit (it currently points at the pre-bump clay-good#103 merge) before re-running publish. Verification: tsc --noEmit clean; lint 0 errors; build clean; CI-mirror (vitest run src examples) 2936 passed / 2 skipped / 0 failures; guard logic sanity-checked (v2.0.6→pass, v2.0.5→fail). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

clay-good merged commit fdc4cd8 into main May 31, 2026
4 checks passed

clay-good mentioned this pull request Jun 1, 2026

chore(release): v2.0.6 + guard release against tag/version mismatch #106

Merged

clay-good deleted the test/spec-13.1-watch-handoff-regression branch June 2, 2026 21:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(watcher): close the spec 13.1 verification gap (read-cache handoff via the watcher's own flush)#103

test(watcher): close the spec 13.1 verification gap (read-cache handoff via the watcher's own flush)#103
clay-good merged 1 commit into
mainfrom
test/spec-13.1-watch-handoff-regression

clay-good commented May 31, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

clay-good commented May 31, 2026

What this is

Verification performed (all real, all re-run on this machine)

The gap I found

What this PR adds

Finding for the owner (not changed here)

Honest limits (unchanged from #102)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant