Summary
oddkit_audit returns non-deterministic findings on consecutive calls against the same canon state when its resolver cache is partially warm. Default-scope (writings/) audits can return 3, 4, 5+ findings call-by-call, including false positives for URIs that resolve FOUND via standalone oddkit_resolve in the same session. This is the failure mode the v2.2 spec amendment explicitly named as the trigger to reconsider the deferred index_state/PARTIAL_INDEX mechanism.
This blocks PR-3.2 of the link-rot elimination campaign (the canon-quality workflow's hard-block flip), because hard-block on a non-deterministic checker would fail random PRs.
Reproduce
Tested 2026-04-27 ~17:50 UTC against https://oddkit.klappy.dev/mcp (oddkit_version returned 0.26.0). Two consecutive oddkit_audit calls with no input args (default scope ["writings/"]), 5 seconds apart:
=== Call #1 ===
status: FINDINGS
total_findings: 3
files_scanned: 39
findings:
writings/choosing-faith-not-fear.md:203 -> klappy://writings/four-questions-that-change-everything
writings/the-broken-wall-and-the-buried-talent.md:332 -> klappy://draft-zeros/appendix-a-the-biblical-roots
writings/the-voice-came-first.md:244 -> klappy://writings/four-questions-that-change-everything
=== Call #2 (5 seconds later, same state) ===
status: FINDINGS
total_findings: 5
files_scanned: 36
findings:
writings/agentic-software-development.md:242 -> klappy://writings/nothing-new-even-ai
writings/choosing-faith-not-fear.md:203 -> klappy://writings/four-questions-that-change-everything
writings/getting-started-with-odd-and-oddkit.md:204 -> klappy://docs/examples/project-instructions-template
writings/the-broken-wall-and-the-buried-talent.md:332 -> klappy://draft-zeros/appendix-a-the-biblical-roots
writings/the-voice-came-first.md:244 -> klappy://writings/four-questions-that-change-everything
The 2 extra findings in call #2 are URIs that resolve FOUND via standalone oddkit_resolve in the same session:
klappy://docs/examples/project-instructions-template → oddkit_resolve returns {"status": "FOUND", "resolved": {"path": "docs/examples/project-instructions-template.md", ...}}
klappy://writings/nothing-new-even-ai → file exists in writings/
When the same audit is scoped to the single file containing one of these false-positive URIs:
=== Call: oddkit_audit, scope = ["writings/getting-started-with-odd-and-oddkit.md"] ===
status: OK
total_findings: 0
files_scanned: 1
findings: []
Same audit code, same target URI, scope-dependent verdict. For the default scope, the discriminator is files_scanned: when it falls below the full 39 (incomplete warm), the audit false-flags un-warmed URIs as NOT_FOUND.
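The difference between the two default-scope calls can be isolated mechanically. The sketch below hard-codes the findings from Call #1 and Call #2 and diffs them. The `Finding` tuple and the `diff_findings` helper are illustrative assumptions, not part of the oddkit API.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    # Hypothetical shape for illustration; the real audit envelope may differ.
    location: str  # "file:line" as printed in the audit output
    uri: str


def diff_findings(first, second):
    """Return (only_in_first, only_in_second) for two audit runs."""
    a, b = set(first), set(second)
    return a - b, b - a


# Findings copied from the two calls above.
call_1 = [
    Finding("writings/choosing-faith-not-fear.md:203",
            "klappy://writings/four-questions-that-change-everything"),
    Finding("writings/the-broken-wall-and-the-buried-talent.md:332",
            "klappy://draft-zeros/appendix-a-the-biblical-roots"),
    Finding("writings/the-voice-came-first.md:244",
            "klappy://writings/four-questions-that-change-everything"),
]
call_2 = call_1 + [
    Finding("writings/agentic-software-development.md:242",
            "klappy://writings/nothing-new-even-ai"),
    Finding("writings/getting-started-with-odd-and-oddkit.md:204",
            "klappy://docs/examples/project-instructions-template"),
]

only_in_1, only_in_2 = diff_findings(call_1, call_2)

# A deterministic checker would leave both difference sets empty.
is_deterministic = not only_in_1 and not only_in_2

for finding in sorted(only_in_2):
    print(f"{finding.location} -> {finding.uri}")
```

The same comparison, pointed at two live audit responses instead of hard-coded lists, would serve as the "call twice, expect identical output" determinism check.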
Diagnosis
The audit's pre-warm-then-resolve loop has a partial-cache bug. URIs whose target docs haven't been warmed yet at the moment of the resolve call get classified as dead-reference, even though they resolve normally on a fresh standalone call (which warms the resolver index incidentally).
This is exactly the failure mode the v2.2 spec amendment named:
If a real consumer demonstrates the need to distinguish "URI is dead" from "URI couldn't be checked because the index wasn't warm yet," the deferred mechanism graduates from this section back into the Output schema.
We have that consumer now. It's the canon-quality.yml workflow merged in klappy/klappy.dev#149.
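The suspected mechanism can be modelled in a few lines. This is a toy reproduction of the diagnosis, not the worker's actual code: `ToyResolver`, its methods, and the `audit` function are all hypothetical, and treating the `four-questions` URI as genuinely dead is an assumption based on it appearing in both calls.

```python
class ToyResolver:
    """Toy model: a URI is FOUND only if its target is already in the warm index."""

    def __init__(self, canon):
        self.canon = set(canon)  # every URI that really exists
        self.warm = set()        # the subset indexed so far

    def warm_one(self, uri):
        if uri in self.canon:
            self.warm.add(uri)

    def resolve_from_index(self, uri):
        # Suspected bug: a cold-but-valid URI looks identical to a dead one.
        return "FOUND" if uri in self.warm else "NOT_FOUND"

    def resolve_standalone(self, uri):
        # A standalone resolve warms its target incidentally, then answers.
        self.warm_one(uri)
        return self.resolve_from_index(uri)


def audit(resolver, links):
    """Return every link the index cannot resolve right now."""
    return [uri for uri in links if resolver.resolve_from_index(uri) == "NOT_FOUND"]


LIVE_A = "klappy://writings/nothing-new-even-ai"
LIVE_B = "klappy://docs/examples/project-instructions-template"
DEAD = "klappy://writings/four-questions-that-change-everything"  # assumed dead

resolver = ToyResolver(canon={LIVE_A, LIVE_B})
links = [LIVE_A, LIVE_B, DEAD]

resolver.warm_one(LIVE_A)                          # warm-up only got part way
partial = audit(resolver, links)                   # LIVE_B is falsely flagged
standalone = resolver.resolve_standalone(LIVE_B)   # same URI resolves on its own
after = audit(resolver, links)                     # only the dead link remains
```

In this model the audit verdict for LIVE_B depends only on whether something happened to warm it first, which matches the scope-dependent results in the reproduction.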
Impact
- Soft-block (current): workflow comments are noisy and sometimes report false positives. PRs do not fail. Tolerable, but it degrades signal quality for the observation cycle.
- Hard-block (PR-3.2): would fail random PRs depending on which URIs happened to be cold at audit time. Deployment blocker for PR-3.2.
Proposed fix paths
Three options, ordered by KISS / Vodka adherence:
(i) Eager warm-all-targets before any resolve
Audit synchronously force-warms every klappy:// target it finds in the scoped files before issuing any resolve call. Slower first call (likely +seconds), deterministic output, no envelope schema change.
Pros: smallest spec/contract surface change. Preserves spec v2.2 envelope as-is. No new statuses, no new fields. Easy to validate determinism (call twice, expect identical output).
Cons: latency cost on first call (cache hits make subsequent calls fast). If writings/ grows to thousands of files, this might become uncomfortable.
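A minimal sketch of option (i), assuming the audit can be split into a warm phase and a resolve phase. The `warm` and `is_resolvable` callables, the URI regex, and the example file names are stand-ins invented for illustration; the real worker's warm-up and resolve calls are not shown in this issue. Findings are reduced to bare URIs here, whereas the real output reports file and line.

```python
import re

# Assumed URI shape: scheme plus path characters seen in the findings above.
URI_PATTERN = re.compile(r"klappy://[A-Za-z0-9_/-]+")


def collect_targets(files):
    """Every klappy:// URI in the scoped files, de-duplicated and sorted."""
    return sorted({uri for text in files.values() for uri in URI_PATTERN.findall(text)})


def audit_eager(files, warm, is_resolvable):
    targets = collect_targets(files)
    for uri in targets:          # phase 1: force-warm every target first
        warm(uri)
    # phase 2: resolve only after the warm phase has completed
    return [uri for uri in targets if not is_resolvable(uri)]


# Stand-in index for demonstration.
canon = {"klappy://writings/nothing-new-even-ai"}
warm_index = set()


def warm(uri):
    if uri in canon:
        warm_index.add(uri)


def is_resolvable(uri):
    return uri in warm_index


files = {
    "writings/a.md": "See klappy://writings/nothing-new-even-ai.",
    "writings/b.md": "See klappy://writings/does-not-exist for details.",
}

first = audit_eager(files, warm, is_resolvable)
second = audit_eager(files, warm, is_resolvable)
```

Because no resolve happens until every target has been warmed, the result no longer depends on what was in the cache beforehand, and two consecutive calls return the same list.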
(ii) Un-defer index_state and PARTIAL_INDEX per spec v2.0/v2.1
Worker emits index_state: {warm_count, warming_count} and uses PARTIAL_INDEX status when warm is incomplete. Consumers (workflow) treat PARTIAL_INDEX as non-blocking with retry-on-next-push.
Pros: matches the original spec design. Honest about cache state.
Cons: requires worker change + spec amendment v2.2 → v2.3 (un-defer) + canon-quality.yml change to handle PARTIAL_INDEX status (which would re-introduce the dormant code path tracked at klappy/klappy.dev#153). RV-gate dispatch needed.
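For option (ii), the envelope and the workflow-side handling might look roughly like this. Only the `index_state`, `warm_count`, `warming_count`, and `PARTIAL_INDEX` names come from the text above; the function names, the precedence of PARTIAL_INDEX over FINDINGS, and the "pass" / "fail" / "retry" return values are assumptions for the sketch.

```python
def build_envelope(findings, warm_count, warming_count):
    """Worker side: report cache state alongside the findings."""
    if warming_count > 0:
        # Assumption: an incomplete index overrides any findings,
        # since they cannot be trusted yet.
        status = "PARTIAL_INDEX"
    elif findings:
        status = "FINDINGS"
    else:
        status = "OK"
    return {
        "status": status,
        "total_findings": len(findings),
        "findings": findings,
        "index_state": {"warm_count": warm_count, "warming_count": warming_count},
    }


def gate(envelope):
    """Workflow side: decide what the hard-block check does with a response."""
    status = envelope["status"]
    if status == "PARTIAL_INDEX":
        return "retry"   # non-blocking; re-check on the next push
    return "fail" if status == "FINDINGS" else "pass"


# Illustrative counts only.
cold = build_envelope(["klappy://writings/nothing-new-even-ai"], warm_count=10, warming_count=2)
warm = build_envelope(["klappy://writings/nothing-new-even-ai"], warm_count=12, warming_count=0)
clean = build_envelope([], warm_count=12, warming_count=0)
```

The same finding blocks the PR only when the index reports itself fully warm, which is the "URI is dead" versus "URI couldn't be checked" distinction the spec amendment describes.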
(iii) Treat partial-cache misses as warning-severity findings rather than error
Audit still emits the false-positives but downgraded to warning. Hard-block gate fails only on error-severity. Comment shows both with distinct icons.
Pros: simpler than (ii). Surfaces the issue in CI without blocking.
Cons: still surfaces noise to authors. Requires the audit to know a finding came from a partial-cache miss, which means it needs index_state internally anyway — so this is (ii) minus the schema change.
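Option (iii) could be sketched as a severity tag on each finding plus a gate that fails only on errors. The `make_finding` and `hard_block_fails` helpers and the `index_complete` flag are hypothetical; the flag stands in for whatever internal index_state the audit would need.

```python
def make_finding(location, uri, index_complete):
    """Tag an unresolved URI with a severity based on cache completeness."""
    return {
        "location": location,
        "uri": uri,
        # A miss against an incomplete index cannot be trusted as an error.
        "severity": "error" if index_complete else "warning",
    }


def hard_block_fails(findings):
    """The hard-block gate fails only on error-severity findings."""
    return any(f["severity"] == "error" for f in findings)


cold_run = [
    make_finding("writings/agentic-software-development.md:242",
                 "klappy://writings/nothing-new-even-ai",
                 index_complete=False),
]
warm_run = [
    make_finding("writings/choosing-faith-not-fear.md:203",
                 "klappy://writings/four-questions-that-change-everything",
                 index_complete=True),
]
```

With a single per-audit flag like this one, a genuinely dead link found during a cold audit would also be downgraded to a warning, so a real implementation would likely need per-target cache state.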
Recommendation
(i) eager warm-all is the right starting point. Smallest contract change, fully solves the determinism problem, and the latency cost is bounded by the size of writings/. If the latency becomes a problem at scale, (ii) becomes the upgrade path.
Blocks
- klappy/klappy.dev PR-3.2 (hard-block flip) — cannot ship until audit is deterministic.
Discovered by
Operator + Claude during post-merge verification of v0.26.0 (#146) and observation of canon-quality workflow first-run output (klappy/klappy.dev#149).
See also
klappy://docs/oddkit/specs/oddkit-audit