test(watcher): close the spec 13.1 verification gap (read-cache handoff via the watcher's own flush) #103
Merged
…cache handoff through the watcher's own flush path

Follow-up verification for PR #102 (spec 13.1). The merged PR proved primeContextCache→hit in ISOLATION (the watcher calls a setter; the next read returns it), and proved freshness lands on disk through a real chokidar save, but nothing proved the two halves connect: that the watcher's OWN flush path (enqueue → flush → handleBatch → persistContext → primeContextCache) makes the next tool-call read a cache HIT rather than the cold ~2 MB re-parse that was root-cause #2 of the field regression. These two regression tests close that gap.

- mcp-watcher-incremental.test.ts (runs in CI): drive a flush via enqueue, spy on primeContextCache to capture the exact object the flush hands to the read cache, then assert the next readCachedContext returns that SAME object (reference identity ⇒ no disk re-parse). Asserts the flush primes exactly once.
- mcp-watcher.integration.test.ts (real chokidar, local/`test:integration`): the same proof through an actual file save and a real FSWatcher, for end-to-end fidelity to the field scenario.

No source changes: the implementation was already correct; this only adds the missing proof.

Verification: tsc --noEmit clean; lint 0 errors; CI-mirror (vitest run src examples) 2936 passed / 2 skipped / 0 failures; watcher integration suite 5/5 green.

Note for the owner (not changed here): *.integration.test.ts is excluded from the default vitest config, so CI (`npm run test:run`) runs zero real-chokidar watcher tests; the integration suite only runs via `npm run test:integration`. Wiring the hermetic watcher integration file into CI would be worthwhile, but the full integration suite includes network/embedding-dependent files, so that's left as an owner decision rather than bundled here.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
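The flush-path proof described in this commit message can be illustrated with a small, dependency-free model. This is only a sketch under stated assumptions: the function names mirror the ones in the text, but every body here is a hypothetical stand-in (the real watcher, cache, and test files are not shown in this PR page), and a plain array plays the role of the spy. It shows why reference identity, rather than freshness, is the discriminator: a cold re-parse returns equal data in a different object.

```typescript
// Hypothetical stand-ins: names mirror the PR text, bodies are illustrative only.
type Context = { files: Record<string, string> };

let cached: Context | null = null;          // in-memory read cache
let parseCount = 0;                         // counts cold re-parses
let disk = JSON.stringify({ files: {} });   // stands in for the persisted context file

function primeContextCache(ctx: Context): void {
  cached = ctx;
}

function readCachedContext(): Context {
  if (cached !== null) return cached;       // cache HIT: no parse
  parseCount += 1;                          // cache MISS: cold re-parse from "disk"
  cached = JSON.parse(disk) as Context;
  return cached;
}

function persistContext(ctx: Context): void {
  disk = JSON.stringify(ctx);               // one write per batch
  primeContextCache(ctx);                   // hand the same object to the read cache
}

class Watcher {
  private pending: string[] = [];
  primed: Context[] = [];                   // records each object handed to the cache

  enqueue(path: string): void {
    if (this.pending.indexOf(path) === -1) this.pending.push(path);
  }

  flush(): void {
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.handleBatch(batch);
  }

  private handleBatch(paths: string[]): void {
    const ctx: Context = { files: {} };
    for (const p of paths) ctx.files[p] = "fresh";
    persistContext(ctx);
    this.primed.push(ctx);
  }
}

// Drive one flush through the watcher's own path, then read.
const watcher = new Watcher();
watcher.enqueue("src/a.ts");
watcher.enqueue("src/b.ts");
watcher.flush();

const handedOff = watcher.primed[0];
const nextRead = readCachedContext();       // served from the primed cache
const parsesAfterHit = parseCount;

// Contrast: drop the cache and read again. The data is just as fresh,
// but it comes from a re-parse and is a different object.
cached = null;
const coldRead = readCachedContext();
const parsesAfterCold = parseCount;
```

In this model `nextRead === handedOff` holds only when the flush primed the cache, whereas `coldRead` carries the same content while failing the identity check, which is the distinction the two regression tests rely on.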
laurentftech pushed a commit to laurentftech/OpenLore that referenced this pull request on Jun 3, 2026
Bump version for the v2.0.6 release (spec 13.1: O(change) watch-mode freshness). The release workflow publishes package.json's version on the v* tag, so this must match the tag.

**Why this commit exists:** the first v2.0.6 publish (run 71639223540) failed with `npm error You cannot publish over the previously published versions: 2.0.5`. The v2.0.6 tag was pushed at a commit where package.json still said 2.0.5 (the bump was never made), so `npm publish` tried to republish 2.0.5.

Two changes:

- Bump package.json + package-lock.json 2.0.5 → 2.0.6 (the missing bump).
- Add a "Verify tag matches package.json version" guard to the release workflow's validate job. A mismatched tag now fails fast and loud in `validate` with an actionable message, instead of sailing through and dying late in `npm publish` with the cryptic over-publish error.

After this merges, the v2.0.6 tag must be re-created at the merge commit (it currently points at the pre-bump clay-good#103 merge) before re-running publish.

Verification: tsc --noEmit clean; lint 0 errors; build clean; CI-mirror (vitest run src examples) 2936 passed / 2 skipped / 0 failures; guard logic sanity-checked (v2.0.6→pass, v2.0.5→fail).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
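The tag guard described in this commit can be sketched as a pure comparison. The workflow's actual step is not shown on this page, so the helper names below (`tagMatchesVersion`, `verifyTag`) and the error wording are hypothetical; the only assumption taken from the text is that a release tag is the package.json version prefixed with `v`.

```typescript
// Hypothetical helpers: the real guard lives in the release workflow and is not shown here.
function tagMatchesVersion(tag: string, packageVersion: string): boolean {
  // Assumption: release tags are the package.json version prefixed with "v".
  return tag === `v${packageVersion}`;
}

function verifyTag(tag: string, packageVersion: string): void {
  if (!tagMatchesVersion(tag, packageVersion)) {
    // Fail early with an actionable message instead of failing late in publish.
    throw new Error(
      `Tag ${tag} does not match package.json version ${packageVersion}; ` +
        `bump package.json or re-create the tag before publishing.`
    );
  }
}

// With package.json at 2.0.6, the cases mentioned in the commit message:
const passes = tagMatchesVersion("v2.0.6", "2.0.6"); // matching tag
const fails = tagMatchesVersion("v2.0.5", "2.0.6");  // stale tag
```

Running the check in a validation step before publishing moves the failure ahead of `npm publish`, so the mismatch is reported in terms of the tag and version rather than as an over-publish error.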
What this is
Follow-up verification for #102 (spec 13.1, watch-mode performance). I re-read the merged work, re-ran the full test/bench/build, and audited each of the five documented root causes against the code. The mechanism is sound and the existing tests are genuine. This PR adds two regression tests that close the one real proof gap I found. No source changes.
Verification performed (all real, all re-run on this machine)
- `tsc --noEmit`
- `npm run lint`
- `vitest run src examples`
- `npm run bench:watch`

Root-cause → fix → test, audited:

- `llm-context.json` rewrite per save → coalesced to one write per batch (Step 1). ✔ covered (G2 burst→1-flush test).
- `primeContextCache` handoff (Step 2). ✔ covered in isolation; this is the gap this PR closes (see below).
- `createTable(overwrite)` → row-level `updateFiles` (``delete(`filePath` IN …) + add``) (Step 3). ✔ covered, incl. the backtick-quoting trap.

The gap I found
The merged PR proved root-cause #2 in two disconnected halves: a unit test proves `primeContextCache`→hit when called directly, and the integration tests prove freshness lands on disk through a real save. But nothing proved the two connect: that the watcher's own flush path (`enqueue → flush → handleBatch → persistContext → primeContextCache`) actually makes the next tool-call read a cache HIT instead of the cold ~2 MB re-parse that was the single biggest per-save cost. Freshness alone can't prove it (a cold re-parse also returns fresh data); the discriminator is whether a parse happened.

What this PR adds
- `mcp-watcher-incremental.test.ts` (runs in CI): drives a flush via `enqueue`, spies on `primeContextCache` to capture the exact object the flush hands to the read cache, then asserts the next `readCachedContext` returns that same object reference (reference identity ⇒ no disk re-parse). Also asserts the flush primes exactly once.
- `mcp-watcher.integration.test.ts` (real chokidar; `npm run test:integration`): the same proof through an actual file save + real `FSWatcher`, for end-to-end fidelity to the field scenario.

Finding for the owner (not changed here)
`*.integration.test.ts` is excluded from the default vitest config, so CI (`npm run test:run`) runs zero real-chokidar watcher tests; the integration suite only runs via `npm run test:integration`. Wiring the hermetic watcher integration file into CI would be worthwhile, but the full integration suite pulls in network/embedding-dependent files, so I left that as an owner decision rather than bundling it. The CI-covered incremental test added here gives regression protection in the meantime.

Honest limits (unchanged from #102)
This does not reproduce the original multi-second field symptom — that needs a real MCP client under load / a denser corpus than this box has. Confidence still rests on the root-cause analysis + the now-complete mechanism tests + the microbenchmark. This PR strengthens the second leg; it doesn't claim the field magnitude.
🤖 Generated with Claude Code