From 3f5c8a472dad6ee338d400163d5d0eb5c875dc28 Mon Sep 17 00:00:00 2001
From: Klappy
Date: Thu, 30 Apr 2026 16:25:57 +0000
Subject: [PATCH] feat(canon): E0008.4 Phase 2 closeout ledger + flip handoff to superseded

Closeout ledger at klappy://odd/ledger/2026-04-30-encode-vodka-refactor-phase-2-landed documents the Items 1-4 ship via klappy/oddkit#155 (squash-merged 67741bd), Item 5 deferral to 0.29.0, two Cursor Bugbot findings dispositioned via Cursor Agent autofix 47fc7e0 (Bug 1: trigger-word path facet propagation; Bug 2: TSV typeMap collision), Sonnet 4.6 read-only validator PASS verdict on all four items, and the open prod-promotion gap (main-oddkit.klappy.workers.dev serves 0.28.0; oddkit.klappy.dev still on 0.27.0 awaiting operator-side version-promote in the Cloudflare dashboard).

Ledger status remains 'active' pending operator promotion of 67741bd to oddkit.klappy.dev. Per [C-02 new], when prod promotion is outstanding at closeout time the ledger is active (not complete) and a follow-up commit flips status after the operator confirms.

Handoff klappy://odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d-revised flipped from status: active to status: superseded with superseded_by pointing at the new ledger.

Three end-to-end applications of the release-validation-gate canon now: P1.3.3 (wrote it), P1.3.4 (inherited it), Phase 2 (third application, smoothest yet).
---
 ...de-vodka-refactor-alternative-d-revised.md |   3 +-
 ...30-encode-vodka-refactor-phase-2-landed.md | 152 ++++++++++++++++++
 2 files changed, 154 insertions(+), 1 deletion(-)
 create mode 100644 odd/ledger/2026-04-30-encode-vodka-refactor-phase-2-landed.md

diff --git a/odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d-revised.md b/odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d-revised.md
index 3a839f2..97c64bd 100644
--- a/odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d-revised.md
+++ b/odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d-revised.md
@@ -13,8 +13,9 @@ describes_state_at: "klappy/oddkit@1a1f093 (main, 2026-04-29) and klappy/klappy.
 derives_from: "docs/architecture/encode-current-state-2026-04-30.md, canon/principles/code-claims-require-code-observation.md, canon/constraints/release-validation-gate.md, canon/constraints/core-governance-baseline.md"
 complements: "docs/architecture/encode-current-state-2026-04-30.md, odd/handoffs/2026-04-30-cli-encode-deprecation.md"
 supersedes: "odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d.md"
+superseded_by: "odd/ledger/2026-04-30-encode-vodka-refactor-phase-2-landed.md"
 governs: "Phase 2 implementation gate for the oddkit_encode worker — the five small remaining items (envelope plural alignment, dedup-by-letter bug, fallback baseline gap, self-teaching surface, schema-driven check evaluator) plus the dedup bug surfaced by Audit 2026-04-30. CLI deprecation is out of scope per separate handoff. Replaces the predecessor handoff, which scoped against pre-PR-#96 state."
-status: active
+status: superseded
 ---
 
 # Handoff (Revised) — oddkit_encode Phase 2
diff --git a/odd/ledger/2026-04-30-encode-vodka-refactor-phase-2-landed.md b/odd/ledger/2026-04-30-encode-vodka-refactor-phase-2-landed.md
new file mode 100644
index 0000000..1f1c63b
--- /dev/null
+++ b/odd/ledger/2026-04-30-encode-vodka-refactor-phase-2-landed.md
@@ -0,0 +1,152 @@
+---
+uri: klappy://odd/ledger/2026-04-30-encode-vodka-refactor-phase-2-landed
+title: "E0008.4 Phase 2 Closeout — Encode Items 1–4 (governance_uris Plural, (letter,facet) Dedup, Open in Fallback, governance_extended Self-Teaching), Item 5 Deferred to 0.29.0, Two Bugbot Findings Autofix-Forwarded, Validator PASS, Prod Promotion Outstanding"
+audience: ledger
+exposure: nav
+tier: 3
+voice: neutral
+stability: stable
+tags: ["ledger", "p2-encode-vodka", "encode", "governance-driven", "vodka-architecture", "letter-facet-dedup", "open-quality-criteria", "governance-uris-plural", "governance-extended", "self-teaching", "release-validation-gate", "third-application", "bugbot-autofix", "managed-agent-validation", "epoch-8.4", "phase-2", "prod-promotion-outstanding"]
+epoch: E0008.4
+date: 2026-04-30
+derives_from: "odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d-revised.md, docs/architecture/encode-current-state-2026-04-30.md, canon/constraints/release-validation-gate.md, canon/principles/code-claims-require-code-observation.md, canon/principles/vodka-architecture.md"
+complements: "odd/handoffs/2026-04-30-cli-encode-deprecation.md, odd/ledger/2026-04-20-p1-3-4-encode-canon-parity-landed.md"
+governs: "Closeout record for E0008.4 Phase 2 — the four-item ship that closed the encode parser refactor against real code state.
Documents the shipped envelope alignment + dedup-bug fix + fallback expansion + self-teaching surface, the third end-to-end application of the release-validation-gate canon, two Cursor Bugbot findings on the trigger-word and TSV paths dispositioned as autofix fix-forwards (the smoke battery missed both vectors — a learning), Sonnet 4.6 read-only validator PASS verdict, and the open question of prod promotion (main-oddkit preview serves 0.28.0; oddkit.klappy.dev still on 0.27.0 awaiting operator-side version-promote in the Cloudflare dashboard). Item 5 (schema-driven check evaluator) deferred to 0.29.0 per the handoff's defer condition."
+status: active
+supersedes: "odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d-revised.md"
+---
+
+# E0008.4 Phase 2 Closeout — Encode Items 1–4 Landed, Item 5 Deferred, Two Bugbot Autofix Cycles, Validator PASS, Prod Promotion Outstanding
+
+> Phase 2 was scoped against real code state by the audit-2026-04-30 sweep: five small worker items, not the original brief's six gaps (most of which PR #96 had already retired in April). Items 1–4 landed via klappy/oddkit#155 squash-merged to main as `67741bd`. Item 5 deferred to 0.29.0 at the gate per the handoff's explicit defer condition because it spans both repos (oddkit code + canon Quality Criteria table migrations across seven articles). Along the way the release-validation-gate canon held end-to-end for the third application: Cursor Bugbot caught two real bugs the implementing session's smoke battery had missed — both on code paths the implementer didn't exercise — Cursor Agent autofix landed both fix-forwards in the same PR, and a Sonnet 4.6 read-only validator agent dispatched against the post-autofix HEAD returned PASS on all four items with a code-level + live-preview + prod-delta cross-check confirming the fixes change observable behavior in the intended direction.
The implementing session had reach to `*.workers.dev` (no egress limitation this session), so the validator's role was independent fresh-context corroboration rather than substitute-observation. Prod at `oddkit.klappy.dev` remains on 0.27.0 with the bug live; `main-oddkit.klappy.workers.dev` serves 0.28.0; the gap is a manual version-promote in the Cloudflare dashboard that the implementing session does not have credentials for.
+
+---
+
+## Summary — What Shipped, What Held, What's Outstanding
+
+Phase 2 scoped four mandatory items + one defer-eligible item per the handoff `klappy://odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d-revised`:
+
+- **Item 1** — `governance_uris` plural array on the encode envelope, alphabetical, dynamic per request (typically ~9 URIs: 7 type articles + serialization-format + how-to-write-encoding-types). Aligns shape with the challenge (P1.3.1) and gate (P1.3.2) canaries. `governance_uri` singular retained as deprecation alias for one minor; removed in 0.29.0.
+- **Item 2** — `(letter, facet)` dedup. `discoverEncodingTypes` now parses an optional `Facet` row from the Type Identity table the same way it parses Letter, dedupes by pair, and the scorer in `runEncodeAction` looks up criteria by pair. `parsePrefixedBatchInput`, `parseStructuredInput`, and `parseUnstructuredInput` all propagate facet onto artifacts so the scorer routes correctly across all three input paths. The pre-fix bug (verified live on prod 0.27.0 before this change: `[O-open P1] body` returned `quality.score: 4 / maxScore: 4` with `typeName: "Observation (Open)"`) is dead on the post-merge main preview (returns 5/5 with `typeName: "Open"`).
+- **Item 3** — Inline fallback baseline expanded to seven entries (D, O closed, O open, L, C, H, E). When canon is unreachable, the fallback now includes Open with `facet: "open"` instead of dropping it.
+- **Item 4** — Optional `governance_extended` payload, gated by request param `include_governance_details: boolean`. When opted in, returns parsed Field Schema, Quality Criteria, trigger words, facet, and source URI per type plus `serializationFormatUri` and `howToWriteUri`. Closes Gap 4 from the original architecture brief — model can self-teach format and rubric from one encode call rather than seven `oddkit_get` calls.
+- **Item 5** — Schema-driven `check` evaluator. **Deferred to 0.29.0** per the handoff's explicit defer condition. Item 5 spans both repos (oddkit code + Quality Criteria table migrations across seven encoding-type articles in klappy.dev) and the current `check.includes(...)` keyword interpretation works correctly for all existing criteria. Bundling it with Items 1–4 would entangle release validation across two repos and violate "use only what hurts." Five `check.includes(...)` calls remain in `scoreArtifactQuality` at orchestrate.ts L1400–1404 — verified by validator as the correct deferred state.
+
+The release-validation-gate canon held end-to-end on its third application. **Rule 1** (Bugbot completed) caught two real bugs Cursor Bugbot found on the original commit `7460a0e` and that the implementing session's smoke battery had missed: a medium-severity finding on the trigger-word paths (Open trigger words could match in `parseUnstructuredInput` and the `parsePrefixedBatchInput` untagged-paragraph fallback, but neither propagated `t.facet` onto the artifact, so the scorer's pair-lookup matched Observation's criteria instead of Open's — re-introducing the very mismatch this PR was supposed to kill, just on a different code path) and a low-severity finding on `parseStructuredInput` (the `letter → name` Map collapsed Observation and Open into the last-write-wins entry, mislabeling all bare-`O` TSV rows as `typeName: "Open"`).
Cursor Agent autofix landed both fixes via commit `47fc7e0` on the same branch, addressing the Medium finding by propagating `t.facet` at all four artifact-construction sites and addressing the Low finding by replacing the naive Map with a no-facet-preferred build loop. Bugbot re-reviewed `47fc7e0` clean. **Rule 2** (independent fresh-context Sonnet 4.6 read-only validator agent via Managed Agents) returned PASS on all four items with code-level confirmation against the cloned branch at HEAD `47fc7e0`, live-preview confirmation against `https://feat-encode-phase-2-items-1-4-oddkit.klappy.workers.dev`, and prod-delta cross-check against `https://oddkit.klappy.dev` (0.27.0) — same `[O-open P1]` input returned `maxScore: 4` typeName `"Observation (Open)"` on prod and `maxScore: 5` typeName `"Open"` on preview, confirming the fix is the behavior change the test was anchoring. **Rule 3** (canon outranks session) had no tension to resolve this session — the handoff's specified version (0.28.0) was uncut and remained uncut throughout the dispatch. + +The full sequence took roughly ninety minutes from session open to merge: handoff and current-state architecture loaded; code observed at HEAD `1a1f093` (line numbers in handoff verified); pre-fix bug verified live on prod 0.27.0; branch opened; 218-line diff across `workers/src/orchestrate.ts`, `workers/src/index.ts`, `package.json`, `workers/package.json`, lockfile; typecheck clean; 105 + 7 existing tests pass; pushed; PR #155 opened; Workers Builds + Test CF Preview + Version Sync + Creed Freshness all completed success on the original commit; Bugbot completed with 2 findings; Cursor Agent autofix landed; Bugbot re-reviewed `47fc7e0` clean; validator agent dispatched; PASS verdict returned; squash-merged to main as `67741bd`. 
+ +**Prod promotion is outstanding.** Cloudflare Workers Builds creates a "preview" version even on main pushes (current preview alias `https://main-oddkit.klappy.workers.dev` confirmed serving 0.28.0). The custom domain at `https://oddkit.klappy.dev` is served by a manually-pinned version per the canon convention "do NOT run `wrangler deploy` manually." Promotion at the prod endpoint requires either (a) operator action in the Cloudflare Workers dashboard ("Deploy this version") or (b) a `wrangler versions deploy` against a CF API token the implementing session does not hold. The fix is committed, validated, deployed to the main preview alias, and one operator click away from prod. Until that click happens, the dedup bug remains live in prod for any caller hitting `oddkit.klappy.dev`. + +--- + +## Decisions + +**[D-01] Defer Item 5 (schema-driven `check` evaluator) to 0.29.0.** The handoff explicitly licensed this split: "If Items 1–4 land cleanly and the schema-driven evaluator becomes a substantial implementation in its own right, ship Phase 2 as 0.28.0 with Items 1–4 only and queue Item 5 as Phase 3 / 0.29.0. Operator decides at the gate." 
The implementing session decided defer at the gate because (a) Item 5 spans both repos — it requires updating the canon Quality Criteria tables in seven encoding-type articles (`decision.md`, `observation.md`, `open.md`, `learning.md`, `constraint.md`, `handoff.md`, `encode.md`) to add a machine-readable column alongside the existing human-readable `check` strings, plus a new evaluator vocabulary in the worker, plus the worker's evaluator function rewrite — bundling that with Items 1–4 would entangle release validation across two repos and produce a PR that would be hard to validate as one coherent unit; (b) the current `check.includes(...)` interpretation works correctly for all existing Quality Criteria — the constraint Item 5 fixes ("server stops opinionating on what `check` strings mean") is structural debt, not a behavioral defect, and "use only what hurts" applies; (c) Items 1–4 are tightly scoped and naturally reviewable as one PR — Item 5 deserves its own coherent ship with its own canon-side migration PR. Five `ck.includes(...)` calls remain in `scoreArtifactQuality` at orchestrate.ts L1400–1404 — verified by the validator as the correct deferred state. + +**[D-02] Single-commit PR shipping pattern over per-item commits.** The four items are intertwined in `orchestrate.ts`: extending `EncodingTypeDef` with `facet` threads through `discoverEncodingTypes` (parsing + dedup), `parsePrefixedBatchInput` (pair-keyed lookup), the scorer (pair-keyed criteria selection), and the envelope (`governance` array now carries facet, `governance_uris` is dynamic). Per-item commits would have produced artificial seams that don't review well — Item 1's envelope changes can't be honestly evaluated without Item 2's facet field on the governance array, Item 2's dedup behavior can't be tested without Item 4's `governance_extended` revealing the per-type criteria counts. 
Single squashable commit on the feat branch, full rationale in the commit body, expanded narrative in the PR description. The autofix `47fc7e0` is a separate commit on the same branch (preserves Bugbot's authorship and audit trail) and gets squashed in by the merge. + +**[D-03] Accept Cursor Agent autofix design over the orchestrator's planned alternative.** The implementing session was about to write fixes for both Bugbot findings when it discovered Cursor Agent had already committed `47fc7e0` addressing both issues. The autofix shape was exactly what the implementing session was about to write — propagate `t.facet` on all four artifact-construction sites in the trigger-word paths; replace the naive `letter → name` Map with a no-facet-preferred build loop in `parseStructuredInput`. Per L-08-candidate-from-P1.3.4 ("prefer autofix design over orchestrator design in Bugbot disposition unless canon names a reason for divergence"), accept and verify rather than re-write. This is the second occurrence of the autofix-design-vs-orchestrator-design pattern (P1.3.4 was the first); calling it twice is enough to graduate L-08 in the next session that touches a Bugbot-detectable surface — see [H-04]. + +**[D-04] Verify Cursor Agent autofix by direct preview-URL smoke + accept by inspection.** The autofix was applied while the implementing session was in another sub-task; the implementing session pulled it, inspected the diff (21 +/- 5 lines, exactly the right shape), confirmed `npm run typecheck` clean, and ran two new smokes against the preview URL to confirm both fixes work in the deployed worker: Bug 1 smoke (an unstructured paragraph triggering Open's vocabulary) returned `typeName: "Open"` with `maxScore: 5` and `facet: "open"`; Bug 2 smoke (a bare-`O` TSV row) returned `typeName: "Observation"` with no facet and `maxScore: 4`. Both behaviors are the canon-correct semantics. 
Same-session smoke does not satisfy the release-validation-gate (Rule 2 requires fresh-context validation), but it does provide pre-flight evidence that there's something for the validator to validate — and in this case it caught no further issues, which the validator then independently corroborated. + +**[D-05] Squash-merge with the validator returning PASS rather than wait for an additional promotion-validator dispatch.** The release-validation-gate canon's Rule 2 requires one Sonnet 4.6 fresh-context validator dispatch before promotion merge. The P1.3.4 closeout introduced a hedge pattern (two validators — feat + promotion) for situations where the orchestrator's session sandbox couldn't reach the workers.dev preview URLs. This session had reach throughout (no egress limitation), the validator's PASS verdict was definitive (code + live + prod-delta), and the autofix's content was small enough to inspect by hand — so a second promotion-validator dispatch would have been ritual, not corroboration. Single-validator was sufficient. The two-validator hedge remains valid for sessions with limited egress (P1.3.4's case); it is not the canon default. + +--- + +## Observations + +**[O-01] The release-validation-gate canon held under pressure on its third end-to-end application.** All three rules were binding, all three operated as designed, and neither Bugbot finding nor the autofix cycle nor the prod-promotion gap caused deviation from the gate's discipline. Compared with P1.3.3 (canon written mid-stream, painful) and P1.3.4 (canon inherited, second application, smoother), this session was the smoothest yet — Bugbot ran without any "skip" temptation, validator dispatched without delay, autofix accepted without hand-wringing about whose design was "better." The canon doing what structural fixes are supposed to do: removing the individual-judgment-under-pressure failure mode by making the discipline mechanically unavoidable. 
+ +**[O-02] The implementing session's smoke battery missed both Bugbot findings — this is the system working as designed.** The original four acceptance smokes the implementing session ran against the preview URL covered: Item 1 (governance_uris shape + count + alias retained), Item 2 (prefix-tagged `[O-open P1]` returning Open's 5 criteria), Item 3 (governance count = 7 with Open at facet=open), Item 4 (governance_extended on/off behavior). All four passed. None of the four exercised the unstructured trigger-word path (Bug 1's vector) or the bare-`O` TSV path (Bug 2's vector). Bugbot's fresh read against the diff caught both. This is the canonical case for fresh-context review: the implementer's smoke is biased toward the paths the implementer thought about; an external reviewer reads the diff cold and asks "what happens on the other paths?" The post-autofix smoke battery now covers both vectors (added during disposition), so a future regression on either path would be caught by in-session smoke too — but the discipline that makes that possible is keeping Bugbot in the loop, not relying on the implementer's smoke battery to be exhaustive. + +**[O-03] Two Cursor Bugbot findings both caught real bugs; both fix-forwarded by Cursor Agent autofix on the same branch.** Finding 1 (Medium severity, on commit `7460a0e`, posted 2026-04-30T15:24:15Z): "Open artifacts from trigger-word path lack facet, misscored." 
Root cause confirmed by both the orchestrator and the validator: the `(letter, facet)` dedup keeps Open in the `types` array, so `parseUnstructuredInput` and the untagged-paragraph fallback in `parsePrefixedBatchInput` could match Open's trigger words and create artifacts with `typeName: "Open"`, but neither propagated `t.facet` onto the artifact, so the scorer's first lookup matched `letter === "O" && facet === undefined` (Observation), selecting Observation's 4 criteria instead of Open's 5 — exactly the mismatch this PR set out to fix on the prefix-tagged path, just on a different code path. Finding 2 (Low severity, same commit, same time): "TSV parser Map collision renames all `O` to Open." Root cause: `parseStructuredInput` built a `letter → name` Map from the types array, and now that the dedup keeps both Observation and Open at letter `O`, last-write-wins (alphabetical canon ordering put Open after Observation) — so all bare-`O` TSV rows resolved to `typeName: "Open"` instead of `typeName: "Observation"`. Both findings were real silent regressions introduced by the partial-facet-propagation that the implementing session shipped on the original commit. Cursor Agent autofix `47fc7e0` (Author: `Cursor Agent`, +21 / -5 lines) addressed both: propagated `t.facet` at four artifact-construction sites in the trigger-word paths, and replaced the naive Map with a build loop that prefers the no-facet entry. Bugbot re-reviewed `47fc7e0` clean. The session lost ~10 minutes to the autofix cycle and gained complete coverage of three input paths (prefix, unstructured, TSV) instead of one (prefix only).
+
+**[O-04] Sonnet 4.6 read-only validator agent returned PASS on all four items with non-overlapping evidence.** Validator session `sesn_011CaFBdjEZbjae8tUyBFmih`, agent `agent_011CaaBbbDoCGnnGtzD2CWnt`, dispatched at 2026-04-30T15:22:10Z.
Validator was given no shared context with the implementing session — only the contract URI (handoff), the current-state URI, and the release-validation-gate canon URI. Validator independently fetched all three, cloned the repo at HEAD `47fc7e0` (the post-autofix commit), inspected `workers/src/orchestrate.ts` and `workers/src/index.ts`, ran preview-URL smokes, and ran the prod-delta cross-check. Verdict by item: Item 1 PASS (code at orchestrate.ts L3460–3492 builds `consultedUris` Set from each type's `sourceUri` plus serialization-format and how-to-write, then sorts; live confirms 9 URIs alphabetical with singular alias retained); Item 2 PASS (code at L590 keys dedup by `${t.letter}::${t.facet ?? ""}`, scorer at L3390–3392 looks up by pair, parsePrefixedBatchInput at L1241–1265 uses pair-keyed map, autofix sites verified — preview returns Open at 5/5, prod 0.27.0 returns Observation (Open) at 4/4, confirming the fix changes observable behavior in the intended direction); Item 3 PASS (defaults array at L609–617 has exactly 7 entries with Open present); Item 4 PASS (param threaded through schema at index.ts L231 and L320, through `handleUnifiedAction` at L255 and L455 to `runEncodeAction` at L3600; payload at L3496–3518 contains all required fields; live confirms Open's 5 criteria and Observation's 4 returned, both meta URIs correct, default-off omits the payload); Item 5 CORRECTLY DEFERRED (5 `ck.includes()` calls confirmed present at L1400–1404, zero schema-driven primitives — the correct 0.29.0-deferred state). Validator also flagged that it did not verify the Bugbot check-run status itself (correctly noted as the implementing session's Rule 1 obligation). Validator saved a structured ledger at `/home/user/ledger/2026-04-30-phase-2-validation.md` in its session sandbox; not persisted to canon (per the `oddkit_encode` standing rule). 
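
The pair-keyed dedup, the scorer lookup, and the no-facet-preferred name map described above can be sketched as follows. This is a minimal illustration, not the code in `workers/src/orchestrate.ts`: the trimmed `EncodingTypeDef` shape, the `criteriaCount` field, the helper names (`pairKey`, `dedupeByPair`, `findType`, `buildLetterNameMap`), and the first-definition-wins policy are all assumptions. Only the key format `${t.letter}::${t.facet ?? ""}` and the 4-versus-5 criteria counts come from the ledger.

```typescript
// Hypothetical, trimmed-down type: the real EncodingTypeDef carries more fields.
interface EncodingTypeDef {
  letter: string;
  name: string;
  facet?: string;
  criteriaCount: number; // assumption: stand-in for the parsed Quality Criteria rows
}

// Pair key in the shape the validator quoted from the dedup step.
function pairKey(letter: string, facet?: string): string {
  return `${letter}::${facet ?? ""}`;
}

// Dedup by (letter, facet). Keeping the first definition per pair is an assumption.
function dedupeByPair(types: EncodingTypeDef[]): EncodingTypeDef[] {
  const byPair = new Map<string, EncodingTypeDef>();
  for (const t of types) {
    const key = pairKey(t.letter, t.facet);
    if (!byPair.has(key)) byPair.set(key, t);
  }
  return Array.from(byPair.values());
}

// Scorer-side lookup: a facet-less artifact must never resolve to a faceted type.
function findType(
  types: EncodingTypeDef[],
  letter: string,
  facet?: string,
): EncodingTypeDef | undefined {
  return types.find((t) => t.letter === letter && t.facet === facet);
}

// Letter -> name map that prefers the no-facet entry regardless of input order,
// so a bare "O" row resolves to Observation rather than the last entry written.
function buildLetterNameMap(types: EncodingTypeDef[]): Map<string, string> {
  const names = new Map<string, string>();
  for (const t of types) {
    if (!names.has(t.letter) || t.facet === undefined) names.set(t.letter, t.name);
  }
  return names;
}

const types = dedupeByPair([
  { letter: "O", name: "Observation", criteriaCount: 4 },
  { letter: "O", name: "Open", facet: "open", criteriaCount: 5 },
  { letter: "O", name: "Observation (duplicate)", criteriaCount: 4 },
]);

console.log(types.length); // 2
console.log(findType(types, "O", "open")?.criteriaCount); // 5
console.log(findType(types, "O")?.criteriaCount); // 4
console.log(buildLetterNameMap(types).get("O")); // Observation
```

Under these assumptions, Bug 1 corresponds to calling `findType` without the facet for an artifact that should have carried `"open"`, and Bug 2 corresponds to building the name map with a plain last-write-wins `set`.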
+ +**[O-05] Prod at `oddkit.klappy.dev` is still on 0.27.0; main-oddkit preview at `https://main-oddkit.klappy.workers.dev` is on 0.28.0.** This was directly verified post-merge by polling `oddkit_version` against both endpoints — main-oddkit returned `0.28.0`, prod returned `0.27.0`. The Cloudflare Workers Builds CI check on the merge commit reported `completed: success` with a "Preview Alias URL: https://main-oddkit.klappy.workers.dev" — meaning the build succeeded and a new "preview" version was created, but the custom domain at `oddkit.klappy.dev` is served by a separately-pinned version that requires manual promotion via the Cloudflare dashboard. The `wrangler.toml` has no `routes` block; CI's `ci.yml` has no production-promote step; per user-canon ("do NOT run `wrangler deploy` manually"), promotion is operator-side action with CF API credentials the implementing session does not hold. The bug is fixed, validated, and one click away from prod. Until the click happens, callers of `oddkit.klappy.dev` continue to see Open artifacts scored against Observation's 4 criteria — the bug this Phase set out to fix remains live in production. + +--- + +## Learnings + +**[L-01] Implementer smoke batteries are biased toward the implementer's mental model — Bugbot's fresh diff-read is the structural complement.** The four acceptance smokes the implementing session ran covered exactly the four items the implementer was thinking about. None of them exercised the unstructured trigger-word path or the bare-`O` TSV path — both of which the dedup-by-pair change touched silently. Bugbot found both. The pattern is general: when an implementer ships a refactor that changes a shared data structure (here, `EncodingTypeDef.facet` flowing through three parsers + the scorer + the envelope), the implementer's smoke battery will cover whichever paths the implementer was actively working through, and will systematically miss the paths the implementer touched as a side effect. 
The structural complement is not "write a more exhaustive smoke battery" — that's individual-discipline-under-pressure failure mode (you can never enumerate the paths you didn't think about). The structural complement is "let a fresh reviewer read the diff cold." Bugbot is exactly that reviewer. Keep Bugbot in the loop on every merge to main, no matter how confident the in-session smoke looks. The discipline the canon enforces is the structural fix. + +**[L-02] When the deploy CI says "completed: success" but prod is unchanged, the check is reporting build success, not promote success — read the check's summary text, not just the status.** The Workers Builds check on the merge commit reported `completed: success` with a "Preview Alias URL: https://main-oddkit.klappy.workers.dev" — and the implementing session initially read the green status as "prod is updated." Polling prod showed 0.27.0 still serving. The check's summary text was explicit ("Preview Alias URL", not "Production URL"), but the implementing session had to investigate before noticing. The general pattern: in repositories where the deploy pipeline has a manual promotion step (Cloudflare Workers with named-version routing, AWS CodeDeploy with manual deployment groups, Kubernetes with manual rollout-status), CI's "deploy success" is "build success + preview deploy success" — production promotion is a separate event. Reading the check's status field alone is insufficient; the summary text and the wrangler / route configuration carry the information about what "deployed" actually means. For oddkit specifically: `main-oddkit.klappy.workers.dev` is the post-merge preview, `oddkit.klappy.dev` is prod, and the gap is operator action. 
+ +**[L-03] Item 5 deferral was the right call — the handoff's defer condition exists for exactly this scope-spanning shape.** Item 5 (schema-driven `check` evaluator) is not just a worker-code change — it requires a coordinated migration of the canon Quality Criteria tables in seven encoding-type articles to add a machine-readable column. That migration is its own coherent ship: it has a different review surface (canon, not code), a different validation pattern (the new evaluator vocabulary needs its own corroboration against each article's structured form), and it produces value independently (the structured criteria become useful for any future tooling that reads encoding-type governance, not just the worker's scorer). Bundling Item 5 with Items 1–4 would have produced a PR with code changes in oddkit + canon changes in klappy.dev that needed atomic review across two repos — exactly the kind of release-validation entanglement the gate's discipline tries to prevent. The handoff anticipated this and wrote in a defer condition; using it is not "punting" but "respecting the natural seam." Item 5 is queued as a clean Phase 3 / 0.29.0 ship with its own handoff to be written. + +--- + +## Constraints + +**[C-01-reinforced] `klappy://canon/constraints/release-validation-gate` held under pressure for the third time.** Rule 1 caught both Bugbot findings on the original commit and prevented merge until both were dispositioned (Cursor Agent autofix landed `47fc7e0`, Bugbot re-reviewed clean). Rule 2 produced an independent PASS verdict from Sonnet 4.6 with non-overlapping evidence (code + live + prod-delta cross-check). Rule 3 had no tension to resolve this session — the handoff's specified version (0.28.0) was uncut and remained uncut. The pattern that has now held three times: gate-enforced discipline is what makes refactors safe to ship, not implementer confidence in the smoke battery. 
+ +**[C-02 new] Operator-side prod promotion at `oddkit.klappy.dev` is the standing gap between merged main and live prod for this codebase.** The Cloudflare Workers deployment architecture for this project: `main-oddkit.klappy.workers.dev` auto-deploys on every push to main; `oddkit.klappy.dev` is a custom domain pinned to a specific named version, promoted manually via the Cloudflare Workers dashboard. The implementing session does not have CF API credentials; the user canon explicitly says "do NOT run `wrangler deploy` manually." This is not a defect — it's a deliberate human-in-the-loop gate before production. Closeout ledgers must explicitly disposition the prod-promotion state ("promoted" vs "outstanding"), and any future ledger that claims "shipped to prod" must verify by direct prod-endpoint smoke, not by inferring from CI status. When prod promotion is outstanding at closeout time, the ledger is `status: active` (not `status: complete`) until the operator confirms promotion and a follow-up commit flips status. This ledger leaves status `active` pending operator promotion of `67741bd` to `oddkit.klappy.dev`. + +--- + +## Handoffs + +**[H-01] Operator action: promote 0.28.0 to prod via the Cloudflare Workers dashboard.** Open `https://dash.cloudflare.com/...workers/services/view/oddkit/production` → Versions → find the version built from commit `67741bd` (Cloudflare's build ID for the merge will be visible in the Workers Builds check summary on the merged PR) → Deploy. Or via wrangler: `cd workers && wrangler versions deploy --x-versions` against a CF API token with deploy permissions. Once promoted, smoke `oddkit.klappy.dev` directly: `oddkit_version` should return 0.28.0; `[O-open P1] body` should return `quality.maxScore: 5` and `typeName: "Open"`. After confirmation, flip this ledger's frontmatter from `status: active` to `status: complete` (or open a small PR doing so). 
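
The post-promotion smoke in [H-01] can be expressed as a small pure check that dispositions the prod state as "promoted" or "outstanding". This is a hedged sketch only: the `ProdSmoke` shape (a flattened view of the version report and the first encoded artifact for `[O-open P1] body`) and the function name `dispositionPromotion` are hypothetical, and fetching the values from `oddkit.klappy.dev` is deliberately left out.

```typescript
type PromotionState = "promoted" | "outstanding";

// Assumed, flattened view of two observations taken directly against prod:
// the reported version and the first artifact returned for "[O-open P1] body".
interface ProdSmoke {
  version: string;
  typeName: string;
  maxScore: number;
}

interface Disposition {
  state: PromotionState;
  failures: string[];
}

// Compare the observed prod behavior with the expectations listed in [H-01].
function dispositionPromotion(
  observed: ProdSmoke,
  expectedVersion: string = "0.28.0",
): Disposition {
  const failures: string[] = [];
  if (observed.version !== expectedVersion) {
    failures.push(`version is ${observed.version}, expected ${expectedVersion}`);
  }
  if (observed.typeName !== "Open") {
    failures.push(`typeName is "${observed.typeName}", expected "Open"`);
  }
  if (observed.maxScore !== 5) {
    failures.push(`maxScore is ${observed.maxScore}, expected 5`);
  }
  return { state: failures.length === 0 ? "promoted" : "outstanding", failures };
}

// Values recorded in this ledger for prod before promotion.
const beforePromotion = dispositionPromotion({
  version: "0.27.0",
  typeName: "Observation (Open)",
  maxScore: 4,
});
console.log(beforePromotion.state); // outstanding
console.log(beforePromotion.failures.length); // 3
```

Keeping the check separate from the transport means the same function can be fed values read by hand from the prod endpoint, which matches the requirement that "shipped to prod" is verified by direct smoke and not inferred from CI status.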
+
+**[H-02] Item 5 — schema-driven `check` evaluator, target 0.29.0.** Open a new handoff `klappy://odd/handoffs/2026-MM-DD-encode-item-5-schema-driven-check-evaluator` describing: (a) the worker-side rewrite of `scoreArtifactQuality` to interpret a small primitive vocabulary (`field_X_non_empty`, `body_words_gte_N`, `body_matches_pattern_Y`, `body_does_not_contain_Z`, `field_X_matches_pattern_Y`, `priority_band_present`) instead of `check.includes(...)`; (b) the canon-side migration of seven encoding-type articles to add a machine-readable column to the Quality Criteria tables alongside the existing human-readable `check` strings; (c) the staging plan for the cross-repo ship — likely klappy.dev PR first (canon adds the structured column, old `check` strings still present and still authoritative), then oddkit PR (worker reads the structured column, falls back to `check.includes` if absent, ships as 0.29.0), then a follow-up klappy.dev PR removing the old `check` column from canon once the worker has been on the new evaluator for a release cycle. Acceptance: `scoreArtifactQuality` contains zero `check.includes(...)` calls; all criteria evaluation flows through the primitive vocabulary; the canon migration has landed.
+
+**[H-03] CLI deprecation per the separate handoff.** `klappy://odd/handoffs/2026-04-30-cli-encode-deprecation` is independently active and not in scope for this Phase 2. Now that Phase 2 has shipped to main, decide whether the CLI deprecation timeline accelerates (Phase 2's worker semantics are now the obvious single source of truth) or holds at its current cadence.
+
+**[H-04] L-08 graduation — autofix-design-vs-orchestrator-design preference.** Per [D-03] above, this is the second occurrence of the autofix-vs-orchestrator-design pattern (P1.3.4 was the first; this Phase 2 is the second). Two occurrences is not yet enough to graduate as canon (the L-08 candidate was sized at "third or fourth"). Call this out at the next Bugbot autofix interaction; if it's the third, graduate as a tier-2 principle in the next canon-tier-2 writing window.
+
+**[H-05] TruthKit-KB sync — replace older D/O/L/C/H/Q articles with the now-canonical oddkit versions.** Carried forward from the handoff's open items table at P3 priority. The TruthKit knowledge base has older copies of the encoding-type articles that predate Phase 1's migration into klappy.dev canon. Now that klappy.dev's encoding-type articles are the canonical source (including Open with its 5 criteria, the Field Schema tables that Item 4's `governance_extended` reads, and the supersession chain documented in PR #157), TruthKit-KB needs to either sync to klappy.dev's versions or maintain divergence intentionally. Open as a separate session.
+
+**[H-06] Close this Phase 2 handoff.** The supersedes frontmatter in this ledger flips `klappy://odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d-revised` to superseded. Verify this in the same PR that flips this ledger to `status: complete` after operator promotion.
+
+---
+
+## Encodes
+
+**[E-01] Mid-session DOLCHE encode performed via `oddkit_encode` action=encode at 2026-04-30T15:19:07Z, output present in the orchestrator session transcript but not persisted to canon (per the encode tool's standing rule that callers must save output to storage).** Five artifacts encoded covering: D-01 (single-commit shipping pattern), C-01 (in-session smoke results), O-open-01 (acceptance 3 fetcher-multi-tier behavior), O-open-02 (release-validation-gate state), H-01 (next-action sequence). The mid-session encode preceded the autofix cycle and the validator's PASS verdict; this ledger supersedes it with the post-merge complete picture. Parser quality scores ranged 1/2 to 2/2 — typical floor for session-progress encodes that re-use narrative rather than compose fresh rationale.
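
Quality scores like the ones quoted in [E-01] are the output that [H-02] proposes to compute from a primitive vocabulary instead of `check.includes(...)`. The following is a minimal TypeScript sketch of that idea, under loud assumptions: the `Artifact` shape, the discriminated-union encoding of the six primitives, and the names `evaluatePrimitive` and `scoreCriteria` are all hypothetical, and this is not the actual `scoreArtifactQuality` implementation. How canon would serialize the X, N, Y, and Z parameters in its machine-readable column is undecided here, so the sketch takes them as already-parsed values.

```typescript
// Hypothetical artifact shape; the real worker's artifact type may differ.
interface Artifact {
  fields: Record<string, string>;
  body: string;
  priorityBand?: string;
}

// Assumed encoding of the six primitives named in [H-02], with the
// X / N / Y / Z placeholders carried as explicit parameters.
type Primitive =
  | { kind: "field_non_empty"; field: string }
  | { kind: "body_words_gte"; n: number }
  | { kind: "body_matches_pattern"; pattern: string }
  | { kind: "body_does_not_contain"; text: string }
  | { kind: "field_matches_pattern"; field: string; pattern: string }
  | { kind: "priority_band_present" };

// Evaluates a single primitive against an artifact.
function evaluatePrimitive(p: Primitive, a: Artifact): boolean {
  switch (p.kind) {
    case "field_non_empty":
      return (a.fields[p.field] ?? "").trim().length > 0;
    case "body_words_gte":
      // Count whitespace-separated tokens, ignoring empty strings.
      return a.body.split(/\s+/).filter((w) => w.length > 0).length >= p.n;
    case "body_matches_pattern":
      return new RegExp(p.pattern).test(a.body);
    case "body_does_not_contain":
      return !a.body.includes(p.text);
    case "field_matches_pattern":
      return new RegExp(p.pattern).test(a.fields[p.field] ?? "");
    case "priority_band_present":
      return a.priorityBand !== undefined && a.priorityBand.length > 0;
  }
}

// One point per satisfied criterion; maxScore is the number of criteria.
function scoreCriteria(
  criteria: Primitive[],
  a: Artifact,
): { score: number; maxScore: number } {
  const score = criteria.filter((c) => evaluatePrimitive(c, a)).length;
  return { score, maxScore: criteria.length };
}
```

Keeping each criterion as data rather than as a substring test on the human-readable `check` string is what would let the canon tables, not the worker source, define what gets scored.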
+
+---
+
+## Timeline (UTC)
+
+| Time | Event |
+|---|---|
+| ~14:44 | Phase 2 session opened; `oddkit_time` first call. |
+| ~14:53 | Handoff and current-state architecture loaded; line numbers in handoff verified against `klappy/oddkit@1a1f093` HEAD. |
+| ~14:56 | Pre-fix bug verified live on prod 0.27.0: `[O-open P1] body` → `quality.score: 4 / maxScore: 4`, `typeName: "Observation (Open)"`. |
+| ~14:58 | `oddkit_preflight` + `oddkit_gate` consulted; gate returned NOT_READY (artifacts not yet built — expected pre-execution state). |
+| ~15:00 | Feat branch `feat/encode-phase-2-items-1-4` opened. |
+| ~15:00–15:13 | Items 1–4 implemented across `workers/src/orchestrate.ts` + `workers/src/index.ts` + version bumps in both `package.json`s + lockfile regenerated; typecheck clean; existing 105 + 7 tests pass. |
+| ~15:13 | Commit `7460a0e` pushed; PR #155 opened. |
+| ~15:16 | Workers Builds completed successfully; preview at `https://feat-encode-phase-2-items-1-4-oddkit.klappy.workers.dev` live. |
+| ~15:16–15:18 | Implementing session ran four acceptance smokes against preview; all four pass. |
+| ~15:24 | Cursor Bugbot completed with two findings: Medium (trigger-word path facet propagation) + Low (TSV typeMap collision). |
+| ~15:30 | Cursor Agent autofix commit `47fc7e0` landed on the same branch addressing both findings. |
+| ~15:30–15:32 | Workers Builds + Test CF Preview + Version Sync + Creed Freshness + Bugbot all re-ran and completed successfully on `47fc7e0`. |
+| ~15:22 | Sonnet 4.6 read-only validator agent dispatched (`agent_011CaaBbbDoCGnnGtzD2CWnt`, session `sesn_011CaFBdjEZbjae8tUyBFmih`). |
+| ~15:26–15:37 | Validator independently fetched the three canon docs, cloned the repo at `47fc7e0`, inspected source, ran preview smokes, ran prod-delta cross-check. |
+| ~15:37 | Validator returned PASS verdict on all four items + Item 5 correctly identified as deferred. |
+| ~16:11–16:14 | Implementing session ran post-autofix smokes against preview confirming both Bugbot fix vectors now work (Bug 1: trigger-word Open → 5/5 typeName Open facet open; Bug 2: bare-O TSV → typeName Observation maxScore 4); regression check on `[O-open P1]` prefix path still 5/5. |
+| 16:15:52 | PR #155 squash-merged to main as commit `67741bd`. |
+| ~16:16–16:18 | Polled `oddkit.klappy.dev` for 0.28.0 — still serving 0.27.0 across 12 polls. |
+| ~16:19 | Diagnosed: Workers Builds creates a new "preview" version even on main pushes; `main-oddkit.klappy.workers.dev` serves 0.28.0; the `oddkit.klappy.dev` custom domain requires manual version-promote. Operator action outstanding. |
+| This ledger | Closeout written; handoff flipped to superseded; status remains `active` pending operator prod promotion. |
+
+---
+
+## References
+
+- **Handoff superseded:** `klappy://odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d-revised`
+- **Predecessor in encode arc:** `klappy://odd/ledger/2026-04-20-p1-3-4-encode-canon-parity-landed`
+- **Current-state architecture:** `klappy://docs/architecture/encode-current-state-2026-04-30`
+- **Binding canon:** `klappy://canon/constraints/release-validation-gate`, `klappy://canon/principles/code-claims-require-code-observation`
+- **Principles applied:** `klappy://canon/principles/vodka-architecture`, `klappy://canon/principles/prompt-over-code`
+- **Feat PR:** klappy/oddkit#155 (squash-merged `67741bd`)
+- **Original commit:** `7460a0e` (Items 1–4 implementation)
+- **Cursor Agent autofix commit:** `47fc7e0` (Bug 1 + Bug 2 fixes)
+- **Cursor Bugbot findings:** Medium "Open artifacts from trigger-word path lack facet, misscored" (BUGBOT_BUG_ID f4faa575-aee2-4387-bd1f-52edd100a175); Low "TSV parser Map collision renames all `O` to Open" (BUGBOT_BUG_ID d9d7a261-b2ca-4012-985c-63a8e739ba73)
+- **Validator agent:** `agent_011CaaBbbDoCGnnGtzD2CWnt`, session `sesn_011CaFBdjEZbjae8tUyBFmih`
+- **Validator ledger:** `/home/user/ledger/2026-04-30-phase-2-validation.md` (validator session sandbox; not persisted to canon)
+- **Preview URLs verified:** `https://feat-encode-phase-2-items-1-4-oddkit.klappy.workers.dev` (feat branch, retired post-merge), `https://main-oddkit.klappy.workers.dev` (main branch, currently serving 0.28.0)
+- **Production URL:** `https://oddkit.klappy.dev` (currently serving 0.27.0; awaiting operator promotion of `67741bd`)