diff --git a/docs/architecture/encode-architecture-problem-and-gaps.md b/docs/architecture/encode-architecture-problem-and-gaps.md new file mode 100644 index 00000000..22d24a61 --- /dev/null +++ b/docs/architecture/encode-architecture-problem-and-gaps.md @@ -0,0 +1,250 @@ +--- +uri: klappy://docs/architecture/encode-architecture-problem-and-gaps +title: "Encode Architecture: Problem, Gaps, and Alternatives Analysis" +audience: docs +exposure: nav +tier: 2 +voice: neutral +stability: semi_stable +tags: ["odd", "oddkit", "encode", "architecture", "vodka-architecture", "prompt-over-code", "alternative-d", "dolche", "dolcheo", "tsv", "governance", "design-brief", "epoch-8.4"] +epoch: E0008.4 +date: 2026-04-30 +derives_from: "canon/definitions/dolcheo-vocabulary.md, canon/principles/prompt-over-code.md, odd/encoding-types/how-to-write-encoding-types.md, odd/encoding-types/serialization-format.md" +complements: "odd/encoding-types/decision.md, odd/encoding-types/observation.md, odd/encoding-types/learning.md, odd/encoding-types/constraint.md, odd/encoding-types/handoff.md, odd/encoding-types/open.md, odd/encoding-types/encode.md" +governs: "Implementation brief for the oddkit_encode vodka refactor (Phase 2 of E0008.4). Names the problem, evaluates alternatives, recommends Alternative D — Governance-defined field schemas with format-agnostic serialization." +provenance: "Originated in klappy/truthkit-kb at docs/architecture/encode-architecture-problem-and-gaps.md (commit prior to 2026-04-16). Migrated verbatim to klappy.dev as the implementation brief for the oddkit encode refactor, with frontmatter added for canon discoverability. The TruthKit-KB origin is preserved in repo history; this doc is now the oddkit-canonical version." +status: active +--- + +# Encode Architecture: Problem, Gaps, and Alternatives Analysis + +> The encode tool's parser and the governance it claims to serve are completely disconnected. 
The parser recognizes four English keywords via hardcoded regex. The governance defines six extensible dimensions via canon articles. The model does all the categorization work — and the parser throws it away. The fix must make governance the source of truth for encoding behavior: the server searches canon at encode time, dynamically builds extraction patterns from what it finds, teaches the calling model how to structure input, and supports ad-hoc types that any knowledge base can define without server changes. + +--- + +## The Problem in One Sentence + +The encode tool's type detection is hardcoded in TypeScript while the governance it should implement lives in markdown — a direct violation of prompt over code. + +--- + +## Current State — What the Code Actually Does + +`detectEncodeType()` in `orchestrate.ts` (lines 256–266): + +```typescript +function detectEncodeType(input: string): string { + if (/\b(decided|decision|chose|choosing|selected|committed to|going with)\b/i.test(input)) + return "decision"; + if (/\b(learned|insight|realized|discovered|found that|turns out)\b/i.test(input)) + return "insight"; + if (/\b(boundary|limit|constraint|rule|prohibition|must not|never)\b/i.test(input)) + return "boundary"; + if (/\b(override|exception|despite|even though|notwithstanding)\b/i.test(input)) + return "override"; + return "decision"; +} +``` + +The entire encode handler then produces **one artifact** with: one type, one title, one quality score, one rationale extraction, one set of constraints. Everything the model sends — regardless of how many DOLCHE dimensions it contains — collapses into a single blob typed as "decision," "insight," "boundary," or "override." + +--- + +## Gap 1 — Parser Vocabulary ≠ Governance Vocabulary + +The parser knows four types: `decision`, `insight`, `boundary`, `override`. 
+ +The DOLCHE vocabulary doc (`docs/oddkit/proactive/dolche-vocabulary.md`) defines six dimensions: **D**ecisions, **O**bservations, **L**earnings, **C**onstraints, **H**andoffs, **E**ncodes. The vocabulary doc explicitly states: "The type field is a string, not an enum. Any knowledge base can extend DOLCHE with custom types by adding a governance document." + +These two systems share zero coordination: + +| Governance says | Parser recognizes | Alignment | +|---|---|---| +| Decision (D) | "decision" | Partial — different trigger words | +| Observation (O) | — | **Missing entirely** | +| Learning (L) | "insight" | Renamed — "insight" ≠ "learning" | +| Constraint (C) | "boundary" | Renamed — "boundary" ≠ "constraint" | +| Handoff (H) | — | **Missing entirely** | +| Encode (E) | — | Meta-level, not a content type | +| Custom (any) | — | **No extension mechanism** | +| — | "override" | **Not in DOLCHE at all** | + +The parser invented "override" (not in DOLCHE) and renamed two types ("insight" for Learning, "boundary" for Constraint). Two DOLCHE types (Observation, Handoff) have no parser representation at all. The governance vocabulary and the server vocabulary diverged silently — and no mechanism detects or prevents the drift. + +## Gap 2 — Single Blob Output for Multi-Dimensional Input + +When a model calls encode with a full DOLCHE session capture — say, 5 decisions, 3 observations, 2 learnings, 4 constraints, 3 handoffs — the handler produces **one artifact**. One title (extracted from the first sentence). One type (the first regex match). One quality score (computed against the entire input string). + +The DOLCHE from the last session (attached to this conversation) contained 20+ categorized items across 6 types. The encode parser would have collapsed all of it into a single "decision" artifact scored against the entire blob. The model did all the extraction work. The server discarded it. + +This isn't just wasteful — it's architecturally backwards. 
The model has the intelligence to categorize. The server has the structure to store per-type artifacts. But the interface between them is a single unstructured string, parsed by regex that doesn't match the vocabulary, producing one output where many are needed. + +## Gap 3 — No Discovery Mechanism for Governance + +The encode handler never searches canon. It never fetches the DOLCHE vocabulary doc. It never looks for custom type definitions. The governance articles exist and are thoroughly written — but the server code that should implement them doesn't know they exist and has no mechanism to find them. + +This is the prompt-over-code violation in its purest form. The canon says the vocabulary is extensible via governance documents. The server says the vocabulary is four hardcoded regex patterns. When the governance changes — as it already did when OLDC+H became DOLCHE — the server doesn't know and can't adapt. + +## Gap 4 — The Model Can't Learn From the Governance + +The encode tool description currently says: "Standard artifact types: Observations (O), Learnings (L), Decisions (D), Constraints (C), Handoffs (H) — OLDC+H." + +This is a static string. It was written by a human, committed to the server codebase, and will drift from the governance vocabulary the moment the governance changes. It already has — it says OLDC+H while the governance says DOLCHE. It doesn't mention custom types. It doesn't teach the model how to structure input for optimal extraction. It doesn't know about extensions a specific knowledge base might define. + +The model reads this description once and improvises from there. The result: every encode call is a negotiation between what the model guesses the tool wants and what the regex actually matches. The governance has the answers — type definitions, trigger phrases, structural guidance — but the model never sees them. 
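Taken together, Gaps 1 through 4 reproduce in a few lines. The function below is the one quoted from `orchestrate.ts`; the sample inputs are illustrative, chosen to show the silent mistyping:

```typescript
// Minimal reproduction of the drift described in Gaps 1-4.
// detectEncodeType is copied verbatim from orchestrate.ts.
function detectEncodeType(input: string): string {
  if (/\b(decided|decision|chose|choosing|selected|committed to|going with)\b/i.test(input))
    return "decision";
  if (/\b(learned|insight|realized|discovered|found that|turns out)\b/i.test(input))
    return "insight";
  if (/\b(boundary|limit|constraint|rule|prohibition|must not|never)\b/i.test(input))
    return "boundary";
  if (/\b(override|exception|despite|even though|notwithstanding)\b/i.test(input))
    return "override";
  return "decision";
}

// An Observation (O) has no pattern, so it falls through to the default:
detectEncodeType("Observed that the cache warms in 40ms on first fetch");
// → "decision"

// A Handoff (H) likewise defaults to "decision":
detectEncodeType("Next session should draft the remaining governance docs");
// → "decision"

// A Learning is caught only because "learned" happens to appear in the
// regex for the renamed type:
detectEncodeType("We learned that trigger words drift");
// → "insight", not "learning"
```

Every misclassification is invisible to the caller: the response reports a type, and nothing checks it against the vocabulary the governance defines.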
+ +## Gap 5 — Quality Scoring Is Monolithic + +The quality scorer checks for: word count ≥ 10, rationale present, constraints present, alternatives mentioned, reversibility noted. This produces one score (0–5) for the entire input. + +But a DOLCHE capture has per-type quality criteria. A Decision without rationale is weak. An Observation without evidence is speculation. A Handoff without next-actions is incomplete. A Constraint without enforcement is a suggestion. These are different quality dimensions that require different scoring criteria — criteria that the governance docs already define but the scorer doesn't read. + +## Gap 6 — Ad-Hoc Types Are Governance Fiction + +The DOLCHE vocabulary doc says: "A pastoral knowledge base might add 'P' for Prayer Requests." This is currently a governance aspiration with zero server support. There is no mechanism for: + +1. The server to discover that a KB defines custom types +2. The parser to recognize input that matches custom type patterns +3. The model to learn about custom types from the KB's governance +4. Quality scoring to apply custom criteria to custom types + +The extensibility promise is real in the governance layer and fictional in the server layer. + +--- + +## Alternatives Analysis + +### Alternative A: Expand the Hardcoded Regex + +Add regex patterns for Observation, Learning, Constraint, Handoff, and Encode to `detectEncodeType()`. Map them to DOLCHE letters. + +**Pros:** Minimal code change. Ships today. + +**Cons:** Still violates prompt over code — every vocabulary change requires a server deployment. Still produces single-blob output. Still can't handle custom types. Still doesn't teach the model. The regex grows but the architecture doesn't improve. This is the fix that works for DOLCHE and breaks for every extension. + +**Verdict:** Solves today's naming mismatch. Creates tomorrow's maintenance burden. Scales linearly with vocabulary size. 
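For concreteness, this is roughly what Alternative A looks like once expanded: the same hardcoded shape, table-driven, one entry per type. The new patterns here are a sketch of the approach, not a proposal:

```typescript
// Alternative A in practice. Patterns for the added types are illustrative.
const TYPE_PATTERNS: Array<[string, RegExp]> = [
  ["decision",    /\b(decided|decision|chose|selected|committed to)\b/i],
  ["observation", /\b(observed|noticed|saw that|the data shows)\b/i],
  ["learning",    /\b(learned|insight|realized|discovered)\b/i],
  ["constraint",  /\b(must not|never|prohibited|constraint|boundary)\b/i],
  ["handoff",     /\b(next session|hand off|follow up|remaining work)\b/i],
  // Every extension (a KB's "P" for Prayer Requests, say) means another
  // line here and another server deployment.
];

function detectEncodeType(input: string): string {
  const hit = TYPE_PATTERNS.find(([, re]) => re.test(input));
  return hit ? hit[0] : "decision";
}
```

The table is tidier than the current chain of `if` statements, but the architecture is unchanged: vocabulary still lives in TypeScript, and the governance still cannot reach it.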
+ +### Alternative B: Let the Model Do All the Work via Tool Description + +Make the tool description extremely detailed — include all type definitions, trigger words, structural guidance, quality criteria. Tell the model to produce structured JSON with per-type artifacts. The server becomes a passthrough that validates and returns whatever the model sends. + +**Pros:** Zero server changes. Pure prompt-over-code. The model is the intelligence layer. + +**Cons:** Tool descriptions have token limits. Stuffing the entire DOLCHE vocabulary + quality criteria + custom type definitions into a tool description is fragile and bloats every MCP handshake. The description becomes stale the moment governance changes. Different MCP platforms truncate descriptions at different lengths. The model still has to guess what custom types exist. + +**Verdict:** Right direction, wrong layer. The intelligence should be in the model, but the governance should be surfaced dynamically — not embedded statically. + +### Alternative C: Encode Calls Search Internally + +At encode time, the server searches its own canon for governance documents tagged with encoding vocabulary. It finds the DOLCHE vocabulary doc, any custom type definitions, and any quality criteria docs. It includes these in the encode response, teaching the calling model how to structure a re-invocation or how to structure future encode calls. + +**Pros:** Dynamic — governance changes propagate automatically. Extensible — custom types surface from KB governance, not server code. The server stays thin (search is already a capability). Mirrors how `telemetry_policy` works — governance fetched from canon at runtime. + +**Cons:** Adds latency to encode (one search round-trip). First encode call in a session won't benefit from the teaching unless the model re-invokes. Doesn't solve the single-blob problem on its own — the response format still needs to support multiple artifacts. + +**Verdict:** Strong for discovery and teaching. 
Insufficient alone for extraction. + +### Alternative D: Governance-Defined Field Schemas with Format-Agnostic Serialization (Proposed) + +The server searches canon at encode time for governance documents that define encoding types. Each type's governance doc defines field semantics: type letter, type name, field schema, quality criteria, trigger words (for fallback). A separate serialization format governance doc defines how fields are serialized (default: TSV). The model outputs structured rows — one per artifact, typed by the first field. The server parses mechanically using the serialization format. For unstructured input, the server falls back to paragraph splitting and dynamic regex classification. The encode response surfaces the governance, teaching the model for subsequent calls. Type definitions and serialization format evolve independently. + +**Architecture:** + +1. **At encode time:** Server searches canon for docs tagged as encoding-type governance, plus serialization format governance. +2. **From each type doc:** Extract type letter, field schema, quality criteria, trigger words. +3. **Parse structured input:** Using the serialization format, split into rows and fields. First field is the type letter. Remaining fields defined per-type by governance. +4. **Fallback:** If input isn't structured, split by paragraph, classify against governance-derived regex from trigger words. +5. **Score per-type:** Apply governance-defined quality criteria to the fields of each typed row. +6. **Return per-type artifacts:** Multiple artifacts in markdown stream, each with its own type, title, quality score, and gaps. +7. **Teach the model:** Response includes governance definitions — type letters, field schemas, quality criteria, serialization format, custom types. Model learns. + +**Pros:** +- **Pure prompt over code:** Adding a type means writing a governance doc, not changing server code. Changing the format means updating one format doc, not every type doc. 
+- **Self-teaching:** The model learns the vocabulary and format from governance docs surfaced in the response. +- **Extensible:** Custom types work identically to default types — governance doc in, parsing out. +- **Per-type quality:** Each type has its own field schema and quality criteria from its own governance doc. +- **Mechanical parsing:** Serialization parsing is string splitting. No regex on the primary path. Near-zero compute. +- **Minimizes model capacity requirements:** The model is already restructuring content for encode. Adding field structure is trivial for any LLM. +- **Server stays thin:** Search + format parsing is generic infrastructure. The server doesn't know what DOLCHE means — it knows how to read governance docs and parse typed rows. +- **Graceful degradation:** Unstructured input falls back to paragraph + regex classification. The response teaches the format. The model converges. +- **Independent axes:** Type semantics and serialization format are governed by separate docs. Either changes without affecting the other. + +**Cons:** +- Governance docs need a structured format for field schemas and quality criteria (adds a writing convention). +- First encode in a session may be unstructured (model hasn't learned yet). Fallback handles this. + +**Verdict:** This is the architecture that makes governance the single source of truth for encoding behavior. It scales to any vocabulary size, supports ad-hoc types without server changes, teaches the model from the canon, and keeps the server thinner than the current hardcoded approach. + +### Alternative E: LLM-in-the-Loop Encoding + +Add LLM inference to the encode handler. The server sends the input + governance docs to a model, which produces structured per-type artifacts. The server validates and returns them. + +**Pros:** Highest extraction quality. The model understands nuance, context, and ambiguity. + +**Cons:** Adds inference latency (seconds, not milliseconds). Adds cost per encode call. 
Breaks the 0ms encode characteristic. Creates a dependency on model availability. Violates the Vodka Architecture principle of thin, stateless servers. The server becomes an inference orchestrator. + +**Verdict:** Right for TruthKit (where the harness governs LLM invocation). Wrong for oddkit (where the server must stay thin and stateless). This is the graduation path, not the current path. + +--- + +## Recommendation + +**Alternative D — Governance-defined field schemas with format-agnostic serialization** — is the proposal that beats all others. The model is already restructuring content when it calls encode. Encoding-type governance docs define field semantics per type. A separate serialization format doc defines how fields are serialized (default: TSV). The server parses mechanically. No regex on the primary path. Dynamic regex from governance-defined trigger words handles unstructured fallback. The response teaches the model both the vocabulary and the format. The model converges. + +The key insight: the server doesn't need to understand DOLCHE. It needs to understand *how to read governance docs that define encoding types and parse typed rows against their field schemas*. Two independent governance layers — type semantics and serialization format — give maximum flexibility with minimum coupling. The server is the enforcer. The canon is the law. + +--- + +## Resolved Design Questions + +1. **Governance doc format:** Separate docs per type — antifragile and easier to find via BM25. The DOLCHE vocabulary doc remains as narrative reference linking to them. Each type governance doc defines: type letter, type name, field schema, quality criteria. Serialization format (TSV default) is governed by a separate doc so types and format evolve independently. + +2. **Input format:** Governed by a separate serialization format doc (default: TSV). The model is already restructuring raw conversation into encode input — it's already doing the categorization work. 
The serialization format is independent of the type definitions so either can change without affecting the other. + +3. **Collision handling:** A single encode call can contain multiple rows of different types. Each row is independently typed. Multi-typing happens at the row level — the model decides a concept warrants both a D and a C row by emitting two rows. The model uses judgment; the server parses mechanically. + +4. **Response shape:** Markdown stream with per-type sections, each with quality score, gaps, and suggestions. Governance definitions included in the response to teach the model. + +5. **Caching:** Module memory cache is already 0ms for cached articles. Governance docs for encoding types are cached identically to all other canon files. No special caching strategy needed — solved infrastructure. + +6. **Backward compatibility:** Non-issue. Consumers are LLMs. Dynamic MCP tool usage is expected to change between sessions. Models re-read tool descriptions and response shapes every invocation. + +--- + +## Refined Architecture — Governance-Defined Field Schemas with Format-Agnostic Serialization + +The model is already restructuring content when it calls encode. Two independent governance layers define the behavior: encoding-type docs define field schemas (what fields exist per type, quality criteria), and a serialization format doc defines how those fields are serialized (default: TSV). Either layer can change independently. + +**Flow:** + +1. **At encode time:** Server searches canon for docs tagged as encoding-type governance (or retrieves from 0ms module cache after first fetch). Also reads serialization format governance. +2. **From each encoding-type doc:** Extract type letter, field schema, quality criteria. +3. **Parse structured input:** Using the serialization format, split input into rows and fields. First field is the type letter. Remaining fields are defined per-type by governance. +4. 
**Per-type quality scoring:** Each type's governance doc defines its own quality criteria applied to its own fields. A Decision without rationale is weak. An Observation without evidence is speculation. A Handoff without next-actions is incomplete. Different types, different fields, different standards. +5. **Teach the model:** Encode response includes the governance definitions — type letters, field schemas, quality criteria, serialization format, and any custom types. The model learns from the canon, not from the tool description. +6. **Per-type artifacts in markdown stream:** Response contains per-type sections, each with its own title, quality score, gaps, and suggestions. +7. **Custom types work identically:** A KB adds a governance doc for "P — Prayer Requests" with its own field schema and quality criteria. Next encode call discovers it, parses P-typed rows against that schema, scores accordingly. No server change. + +**Fallback for unstructured input:** If the input isn't in the expected serialization format (models that haven't learned it yet), fall back to paragraph splitting + dynamic regex classification from governance-defined trigger words. The response teaches the format. The model converges on subsequent calls. + +**The self-teaching loop:** + +First encode call → may be unstructured → server classifies via fallback, surfaces governance with field schemas and serialization format → model learns → subsequent calls are well-structured → server parses mechanically → per-type quality feedback → extraction quality improves within the session. 
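The loop above bottoms out in two mechanical code paths. A minimal sketch, assuming governance-derived schemas have already been fetched; every name here (`EncodingType`, `parseStructuredInput`, `classifyUnstructured`) is illustrative, not the oddkit API:

```typescript
// Sketch of the primary (TSV) and fallback (paragraph + regex) paths.
// Shapes and names are assumptions for illustration only.
interface EncodingType {
  letter: string;         // e.g. "D"
  name: string;           // e.g. "Decision"
  fields: string[];       // field schema after the type letter, in order
  triggerWords: string[]; // used for fallback classification only
}

interface ParsedArtifact {
  type: string;
  fields: Record<string, string>;
}

// Primary path: pure string splitting, no regex.
function parseStructuredInput(
  input: string,
  types: Map<string, EncodingType>,
): ParsedArtifact[] | null {
  const artifacts: ParsedArtifact[] = [];
  for (const line of input.split("\n")) {
    if (line.trim() === "") continue;
    const cells = line.split("\t");
    const type = types.get(cells[0]);
    if (!type) return null; // unknown first field: not structured input
    const fields: Record<string, string> = {};
    type.fields.forEach((name, i) => {
      fields[name] = cells[i + 1] ?? "";
    });
    artifacts.push({ type: type.letter, fields });
  }
  return artifacts.length > 0 ? artifacts : null;
}

// Fallback path: paragraph split + dynamic regex from trigger words.
// Assumes trigger words are regex-safe; the default type for an
// unclassified paragraph is an open design choice ("O" assumed here).
function classifyUnstructured(
  input: string,
  types: EncodingType[],
): ParsedArtifact[] {
  return input
    .split(/\n{2,}/)
    .filter((p) => p.trim() !== "")
    .map((paragraph) => {
      const match = types.find((t) =>
        new RegExp(`\\b(${t.triggerWords.join("|")})\\b`, "i").test(paragraph),
      );
      return {
        type: match?.letter ?? "O",
        fields: { body: paragraph.trim() },
      };
    });
}
```

Note what is absent: no type names, no DOLCHE letters, no trigger words appear in the server code. All of that arrives at runtime from governance docs.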
+ +**What the server does NOT do:** + +- LLM inference +- Domain-specific logic +- Type definition (that's encoding-type governance) +- Field schema definition (that's encoding-type governance) +- Quality criteria definition (that's encoding-type governance) +- Serialization format definition (that's format governance) + +**What the server DOES do:** + +- Search canon for encoding-type governance docs and serialization format governance +- Parse input against governance-defined field schemas using governance-defined format +- Fall back to paragraph split + dynamic regex for unstructured input +- Score per-type quality against governance-defined criteria +- Surface governance in the response to teach the model +- Return per-type artifacts in markdown stream diff --git a/odd/encoding-types/constraint.md b/odd/encoding-types/constraint.md index ed525e33..2fec1535 100644 --- a/odd/encoding-types/constraint.md +++ b/odd/encoding-types/constraint.md @@ -6,23 +6,27 @@ exposure: nav tier: 2 voice: neutral stability: semi_stable -tags: ["odd", "oddkit", "encode", "dolche", "constraint", "encoding-type"] -epoch: E0008 -date: 2026-04-15 -derives_from: "docs/oddkit/proactive/dolche-vocabulary.md" +tags: ["odd", "oddkit", "encode", "dolche", "dolcheo", "constraint", "encoding-type", "tsv", "governance", "epoch-8.4"] +epoch: E0008.4 +date: 2026-04-30 +derives_from: "canon/definitions/dolcheo-vocabulary.md, docs/architecture/encode-architecture-problem-and-gaps.md, canon/principles/prompt-over-code.md" +complements: "odd/encoding-types/decision.md, odd/encoding-types/observation.md, odd/encoding-types/learning.md, odd/encoding-types/handoff.md, odd/encoding-types/open.md, odd/encoding-types/encode.md, odd/encoding-types/how-to-write-encoding-types.md, odd/encoding-types/serialization-format.md" governs: "oddkit_encode parsing and quality scoring for type C" status: active --- + # Encoding Type: Constraint (C) -> What now governs future work. 
Rules, boundaries, and non-negotiables. +> What now governs future work. Rules, boundaries, and non-negotiables that emerged from the session. Constraints bind future behavior — they are the artifacts most likely to prevent future mistakes. A constraint without enforcement is a suggestion. --- ## Summary — The Binding Layer -Constraints bind future behavior and prevent repeated mistakes. A constraint without enforcement is a suggestion. +Constraints are the artifacts that prevent future mistakes by binding future behavior. They emerge from decisions, observations, and learnings — but once established, they outlive the context that created them. A constraint from six months ago still governs today's work even if nobody remembers why. + +The key discipline: constraints define what cannot be done, not just what should be done. "Use TSV for encode input" is a decision. "The tool description must never hardcode specific keywords" is a constraint — it binds all future work regardless of context. --- @@ -32,21 +36,76 @@ Constraints bind future behavior and prevent repeated mistakes. 
A constraint wit |---|---| | Letter | C | | Name | Constraint | +| Priority | High — constraints bind future behavior and prevent repeated mistakes | --- ## Field Schema +When encoding a Constraint, the model outputs a row with the following fields (serialization format governed by `odd/encoding-types/serialization-format.md`): + +``` +C {title} {body} {origin} {scope} +``` + | Field | Recommended | Description | |---|---|---| | type | yes | Always `C` | -| title | yes | Short summary of the constraint | -| body | yes | What is bound, limited, or prohibited | +| title | yes | Short summary of the constraint (≤12 words) | +| body | yes | The constraint statement — what is bound, limited, or prohibited | +| origin | no | What decision, observation, or learning produced this constraint | +| scope | no | "permanent", "until {condition}", "this project", "this epoch", or empty | + +Example: + +``` +C Server must never hardcode encoding type keywords The encode tool description and server code must never hardcode specific type keywords because the DOLCHE vocabulary is extensible via governance and hardcoding creates drift between code and canon. 
D: Governance-defined TSV contract for encode input permanent +``` --- ## Trigger Words (Fallback Classification) +When encode input is unstructured (not TSV), these trigger words classify a paragraph as Constraint: + ``` -must, must not, shall, never, always, required, prohibited, constraint, cannot +must, must not, shall, shall not, never, always, required, prohibited, constraint, cannot, non-negotiable, boundary, rule, forbidden, mandatory ``` + +--- + +## Quality Criteria + +Each criterion adds 1 to the quality score (max 4): + +| Criterion | Check | Gap message if missing | +|---|---|---| +| Substance | Body is ≥10 words | "Constraint is too brief — expand what is bound" | +| Clarity | Body contains a clear prohibition or requirement (must/must not/never/always) | "Make the constraint explicit — what must or must not happen?" | +| Origin | Origin column is non-empty | "What produced this constraint? Link to the decision or observation" | +| Scope | Scope column is non-empty | "Is this permanent, temporary, or scoped to a specific context?" | + +Quality levels: + +| Score | Level | Status | +|---|---|---| +| 4 | strong | recorded | +| 3 | adequate | recorded | +| 2 | weak | draft | +| 0–1 | insufficient | draft | + +--- + +## What Makes a Good Constraint Encoding + +A strong Constraint answers: what is bound, why it's bound (origin), and how long it's bound (scope). The most common gap is missing scope — constraints without expiration or context become invisible governance debt. "Never do X" is stronger than "don't do X" but "never do X because Y, permanent" is strongest. + +The second most common gap is constraints that read like preferences. "We should use TypeScript" is a preference. "All server code must be TypeScript because the CI pipeline only compiles TS" is a constraint — it names the enforcement mechanism. 
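The criteria table above can be applied mechanically once a C row is parsed. A sketch, with illustrative names; the gap messages are quoted from the table, and the function is not the oddkit implementation:

```typescript
// Sketch of per-type quality scoring for a parsed C row.
// Field names mirror the schema table; everything else is illustrative.
interface ConstraintRow {
  title: string;
  body: string;
  origin: string;
  scope: string;
}

function scoreConstraint(row: ConstraintRow): { score: number; gaps: string[] } {
  const gaps: string[] = [];
  let score = 0;

  // Substance: body is ≥10 words
  if (row.body.trim().split(/\s+/).length >= 10) score += 1;
  else gaps.push("Constraint is too brief — expand what is bound");

  // Clarity: explicit prohibition or requirement
  if (/\b(must not|must|shall not|shall|never|always)\b/i.test(row.body)) score += 1;
  else gaps.push("Make the constraint explicit — what must or must not happen?");

  // Origin: non-empty
  if (row.origin.trim() !== "") score += 1;
  else gaps.push("What produced this constraint? Link to the decision or observation");

  // Scope: non-empty
  if (row.scope.trim() !== "") score += 1;
  else gaps.push("Is this permanent, temporary, or scoped to a specific context?");

  return { score, gaps };
}
```

In the proposed architecture these checks would be derived from the criteria table in this doc at encode time, not hardcoded as above; the sketch shows only the shape of the per-type scoring the server performs.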
+ +--- + +## See Also + +- [DOLCHE Vocabulary](klappy://docs/oddkit/proactive/dolche-vocabulary) — the six-dimension framework this type belongs to +- [Encoding Type: Decision](klappy://odd/encoding-types/decision) — decisions often produce constraints +- [Prompt Over Code](klappy://canon/principles/prompt-over-code) — why this governance doc exists instead of server code diff --git a/odd/encoding-types/decision.md b/odd/encoding-types/decision.md index a060c951..bc02cf90 100644 --- a/odd/encoding-types/decision.md +++ b/odd/encoding-types/decision.md @@ -6,23 +6,25 @@ exposure: nav tier: 2 voice: neutral stability: semi_stable -tags: ["odd", "oddkit", "encode", "dolche", "decision", "encoding-type"] -epoch: E0008 -date: 2026-04-15 -derives_from: "docs/oddkit/proactive/dolche-vocabulary.md" +tags: ["odd", "oddkit", "encode", "dolche", "dolcheo", "decision", "encoding-type", "tsv", "governance", "epoch-8.4"] +epoch: E0008.4 +date: 2026-04-30 +derives_from: "canon/definitions/dolcheo-vocabulary.md, docs/architecture/encode-architecture-problem-and-gaps.md, canon/principles/prompt-over-code.md" +complements: "odd/encoding-types/observation.md, odd/encoding-types/learning.md, odd/encoding-types/constraint.md, odd/encoding-types/handoff.md, odd/encoding-types/open.md, odd/encoding-types/encode.md, odd/encoding-types/how-to-write-encoding-types.md, odd/encoding-types/serialization-format.md" governs: "oddkit_encode parsing and quality scoring for type D" status: active --- + # Encoding Type: Decision (D) -> What was chosen. Explicit commitments with rationale that close options and create direction. +> What was chosen. Explicit commitments with rationale. Decisions close options and create direction. They are the highest-stakes artifacts because they constrain all subsequent work. A decision without rationale is a debt. A decision without a constraint test is untested. --- ## Summary — The Highest-Stakes Artifact Type -Decisions close options and create direction. 
They constrain all subsequent work, making them the most important artifacts to capture. +Decisions are the encoding type most likely to affect future work. A missed observation can be recovered from the transcript. A missed decision may not surface again. Decisions create direction, close options, and bind future behavior. They deserve the most rigorous quality criteria of any encoding type. --- @@ -32,21 +34,82 @@ Decisions close options and create direction. They constrain all subsequent work |---|---| | Letter | D | | Name | Decision | +| Priority | Highest — decisions constrain all subsequent work | --- ## Field Schema +When encoding a Decision, the model outputs a row with the following fields (serialization format governed by `odd/encoding-types/serialization-format.md`): + +``` +D {title} {body} {rationale} {alternatives} {reversibility} +``` + | Field | Recommended | Description | |---|---|---| | type | yes | Always `D` | -| title | yes | Short summary of the decision | -| body | yes | What was chosen and why | +| title | yes | Short summary of the decision (≤12 words) | +| body | yes | The decision statement — what was chosen and why it matters | +| rationale | yes | Why this was chosen. Starts with the reasoning, not "because" | +| alternatives | no | What else was considered. Empty string if none discussed | +| reversibility | no | "reversible", "permanent", "reversible until {condition}", or empty | + +Example: + +``` +D TSV as encode input format Governance defines strict TSV output contract for encode input. Model outputs typed rows, server parses mechanically. TSV is pure string splitting — no regex, no NLP. Model is already restructuring content. Adding column structure is trivial. 
Free-form prose with regex classification; JSON structured output; model-labeled paragraphs reversible +``` --- ## Trigger Words (Fallback Classification) +When encode input is unstructured (not TSV), these trigger words classify a paragraph as Decision: + ``` -decided, decision, chose, choosing, selected, committed to, going with +decided, decision, chose, choosing, selected, committed to, going with, will use, adopted, settled on, picked, determined, resolved to ``` + +These words are used by the server to build dynamic regex for fallback classification only. On the primary TSV path, the type letter `D` is the classifier. + +--- + +## Quality Criteria + +Each criterion adds 1 to the quality score (max 5): + +| Criterion | Check | Gap message if missing | +|---|---|---| +| Substance | Body is ≥10 words | "Decision body is too brief — expand what was chosen" | +| Rationale | Rationale column is non-empty and ≥3 words | "No rationale — add why this was chosen" | +| Alternatives | Alternatives column is non-empty | "No alternatives considered — what else was an option?" | +| Reversibility | Reversibility column is non-empty | "Note whether this is reversible or permanent" | +| Constraints | Body or rationale mentions what this decision constrains or binds | "What constraints does this decision create?" | + +Quality levels: + +| Score | Level | Status | +|---|---|---| +| 5 | strong | recorded | +| 3–4 | adequate | recorded | +| 2 | weak | draft | +| 0–1 | insufficient | draft | + +--- + +## What Makes a Good Decision Encoding + +A strong Decision encoding answers five questions: What was chosen? Why? What else was considered? Is it reversible? What does it constrain? The model doesn't need to answer all five — but quality scoring rewards completeness. + +The most common gap is missing rationale. Models often encode the what without the why. The quality feedback loop teaches the model to include rationale in future calls. 
+ +The second most common gap is missing alternatives. A decision without alternatives is an assertion, not a choice. Even "no alternatives were discussed" is better than silence — it's honest about the decision's context. + +--- + +## See Also + +- [DOLCHE Vocabulary](klappy://docs/oddkit/proactive/dolche-vocabulary) — the six-dimension framework this type belongs to +- [Prompt Over Code](klappy://canon/principles/prompt-over-code) — why this governance doc exists instead of server code +- [Encode Does Not Persist](klappy://docs/oddkit/proactive/encode-does-not-persist) — the caller must save encoded artifacts diff --git a/odd/encoding-types/handoff.md b/odd/encoding-types/handoff.md index 1ba9fa67..aa8f883d 100644 --- a/odd/encoding-types/handoff.md +++ b/odd/encoding-types/handoff.md @@ -6,23 +6,27 @@ exposure: nav tier: 2 voice: neutral stability: semi_stable -tags: ["odd", "oddkit", "encode", "dolche", "handoff", "encoding-type"] -epoch: E0008 -date: 2026-04-15 -derives_from: "docs/oddkit/proactive/dolche-vocabulary.md" +tags: ["odd", "oddkit", "encode", "dolche", "dolcheo", "handoff", "encoding-type", "tsv", "governance", "epoch-8.4"] +epoch: E0008.4 +date: 2026-04-30 +derives_from: "canon/definitions/dolcheo-vocabulary.md, docs/architecture/encode-architecture-problem-and-gaps.md, canon/principles/prompt-over-code.md" +complements: "odd/encoding-types/decision.md, odd/encoding-types/observation.md, odd/encoding-types/learning.md, odd/encoding-types/constraint.md, odd/encoding-types/open.md, odd/encoding-types/encode.md, odd/encoding-types/how-to-write-encoding-types.md, odd/encoding-types/serialization-format.md" governs: "oddkit_encode parsing and quality scoring for type H" status: active --- + # Encoding Type: Handoff (H) -> What comes next and what context the next session needs. +> What comes next and what context the next session needs. Explicit transfer of state across conversation boundaries. 
Handoffs are the artifacts most likely to be lost because they describe what hasn't happened yet. A session without handoffs forces the next session to reconstruct context from scratch. --- ## Summary — The Continuity Layer -Handoffs transfer state across session boundaries. A session without handoffs forces the next session to reconstruct context from scratch. +Handoffs are how work survives across session boundaries. Every conversation ends. The question is whether the next one starts from zero or starts from where this one left off. Handoffs answer that question by explicitly naming: what's next, what's blocked, what context is needed, and who owns it. + +The key discipline: handoffs describe the future, not the past. "We decided to use TSV" is a decision. "Next session: write the remaining four encoding-type governance docs" is a handoff — it transfers intent across a boundary. --- @@ -32,21 +36,76 @@ Handoffs transfer state across session boundaries. A session without handoffs fo |---|---| | Letter | H | | Name | Handoff | +| Priority | High — handoffs are the most frequently lost artifact type | --- ## Field Schema +When encoding a Handoff, the model outputs a row with the following fields (serialization format governed by `odd/encoding-types/serialization-format.md`): + +``` +H {title} {body} {blocked_by} {owner} +``` + | Field | Recommended | Description | |---|---|---| | type | yes | Always `H` | -| title | yes | Short summary of what comes next | -| body | yes | What needs to happen and what context is needed | +| title | yes | Short summary of what comes next (≤12 words) | +| body | yes | What needs to happen and what context is needed to do it | +| blocked_by | no | What must happen before this can proceed. 
Empty if unblocked | +| owner | no | Who owns this next action — a person, role, or "next session" | + +Example: + +``` +H Write O, L, C, H encoding-type governance docs Create the remaining four default encoding-type governance docs following the Decision template. Each needs type identity, TSV schema, trigger words, and quality criteria. next session +``` --- ## Trigger Words (Fallback Classification) +When encode input is unstructured (not TSV), these trigger words classify a paragraph as Handoff: + ``` -next session, next step, todo, follow up, blocked by, waiting on, continue, remaining, handoff +next session, next step, todo, to do, follow up, blocked by, waiting on, needs to happen, pick up, continue, remaining, outstanding, handoff, hand off, carry forward, defer ``` + +--- + +## Quality Criteria + +Each criterion adds 1 to the quality score (max 4): + +| Criterion | Check | Gap message if missing | +|---|---|---| +| Substance | Body is ≥10 words | "Handoff is too brief — what context does the next session need?" | +| Actionability | Body describes a concrete next action, not just a topic | "Make it actionable — what specifically needs to happen next?" | +| Blocker clarity | Blocked_by column is non-empty OR body explicitly states unblocked | "Is this blocked by anything? State blockers or confirm unblocked" | +| Ownership | Owner column is non-empty | "Who owns this? A person, role, or 'next session'" | + +Quality levels: + +| Score | Level | Status | +|---|---|---| +| 4 | strong | recorded | +| 3 | adequate | recorded | +| 2 | weak | draft | +| 0–1 | insufficient | draft | + +--- + +## What Makes a Good Handoff Encoding + +A strong Handoff answers: what needs to happen, what context is needed to do it, what's blocking it, and who owns it. The most common gap is handoffs that name a topic without naming an action — "encoding architecture" is a topic, "implement TSV parsing in the encode handler" is an action. 
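The `H {title} {body} {blocked_by} {owner}` schema above parses by pure string splitting on the primary path. A minimal sketch, assuming a hypothetical `parseHandoffRow` helper rather than the real parser:

```typescript
interface HandoffRow {
  type: "H";
  title: string;
  body: string;
  blockedBy: string;
  owner: string;
}

// Mechanical TSV parse: split on tabs, map positions to the governance-defined
// field order. Missing trailing fields default to empty strings, matching the
// "Empty if unblocked" convention.
function parseHandoffRow(line: string): HandoffRow {
  const [type, title = "", body = "", blockedBy = "", owner = ""] = line.split("\t");
  if (type !== "H") throw new Error(`expected type H, got "${type}"`);
  return { type: "H", title, body, blockedBy, owner };
}
```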
+ +The second most common gap is missing blockers. A handoff without blocker information forces the next session to rediscover dependencies. Even "unblocked" is valuable — it confirms the work can start immediately. + +--- + +## See Also + +- [DOLCHE Vocabulary](klappy://docs/oddkit/proactive/dolche-vocabulary) — the six-dimension framework this type belongs to +- [Proactive Session Close](klappy://docs/oddkit/proactive/proactive-session-close) — handoffs are central to session closing +- [Prompt Over Code](klappy://canon/principles/prompt-over-code) — why this governance doc exists instead of server code diff --git a/odd/encoding-types/learning.md b/odd/encoding-types/learning.md index 0dad9615..4247872e 100644 --- a/odd/encoding-types/learning.md +++ b/odd/encoding-types/learning.md @@ -6,23 +6,27 @@ exposure: nav tier: 2 voice: neutral stability: semi_stable -tags: ["odd", "oddkit", "encode", "dolche", "learning", "encoding-type"] -epoch: E0008 -date: 2026-04-15 -derives_from: "docs/oddkit/proactive/dolche-vocabulary.md" +tags: ["odd", "oddkit", "encode", "dolche", "dolcheo", "learning", "encoding-type", "tsv", "governance", "epoch-8.4"] +epoch: E0008.4 +date: 2026-04-30 +derives_from: "canon/definitions/dolcheo-vocabulary.md, docs/architecture/encode-architecture-problem-and-gaps.md, canon/principles/prompt-over-code.md" +complements: "odd/encoding-types/decision.md, odd/encoding-types/observation.md, odd/encoding-types/constraint.md, odd/encoding-types/handoff.md, odd/encoding-types/open.md, odd/encoding-types/encode.md, odd/encoding-types/how-to-write-encoding-types.md, odd/encoding-types/serialization-format.md" governs: "oddkit_encode parsing and quality scoring for type L" status: active --- + # Encoding Type: Learning (L) -> What was understood from the observations. Interpretation with evidence. +> What was understood from the observations. Interpretation with evidence. Learnings connect observations to meaning — the bridge between "what did we see?" 
and "what does it mean?" A learning without an observation is speculation. A learning with an observation is knowledge. --- ## Summary — The Interpretation Layer -Learnings connect observations to meaning. A learning without an observation is speculation. A learning with an observation is knowledge. +Learnings are where observations become understanding. They require evidence — the observation that prompted the interpretation. Without that link, a learning is just an assertion dressed up as insight. + +The key discipline: learnings explain why, not just what. "The deploy took 47 seconds" is an observation. "Cold starts dominate deploy time because the first fetch always misses cache" is a learning — it names the mechanism and connects to evidence. --- @@ -32,21 +36,74 @@ Learnings connect observations to meaning. A learning without an observation is |---|---| | Letter | L | | Name | Learning | +| Priority | Medium — learnings inform future decisions but don't constrain them directly | --- ## Field Schema +When encoding a Learning, the model outputs a row with the following fields (serialization format governed by `odd/encoding-types/serialization-format.md`): + +``` +L {title} {body} {evidence} {applicability} +``` + | Field | Recommended | Description | |---|---|---| | type | yes | Always `L` | -| title | yes | Short summary of what was learned | -| body | yes | What was understood and why it matters | +| title | yes | Short summary of what was learned (≤12 words) | +| body | yes | The learning — what was understood and why it matters | +| evidence | no | The observation(s) that support this learning. References to O-typed entries or direct evidence | +| applicability | no | "local" (this project only), "general" (applies broadly), or "provisional" (needs more evidence) | + +Example: + +``` +L Platform-agnostic identification requires URL-level mechanisms Headers are implementation details that vary by platform. URLs are universal. 
The consumer query param works everywhere because every MCP platform lets users edit URLs. O: Labeled consumers only got credit for MCP initialize events — headers weren't forwarded on tool calls general +``` --- ## Trigger Words (Fallback Classification) +When encode input is unstructured (not TSV), these trigger words classify a paragraph as Learning: + ``` -learned, realized, discovered, understood, found that, turns out, insight +learned, realized, discovered, understood, found that, turns out, the reason is, this means, implies, suggests, the pattern is, insight, takeaway, the mechanism is, because ``` + +--- + +## Quality Criteria + +Each criterion adds 1 to the quality score (max 4): + +| Criterion | Check | Gap message if missing | +|---|---|---| +| Substance | Body is ≥10 words | "Learning is too brief — expand what was understood" | +| Evidence | Evidence column is non-empty | "What observation supports this? A learning without evidence is speculation" | +| Mechanism | Body explains why or how, not just what | "Add the mechanism — why does this happen, not just that it does" | +| Applicability | Applicability column is non-empty | "Is this local to this project, generally applicable, or provisional?" | + +Quality levels: + +| Score | Level | Status | +|---|---|---| +| 4 | strong | recorded | +| 3 | adequate | recorded | +| 2 | weak | draft | +| 0–1 | insufficient | draft | + +--- + +## What Makes a Good Learning Encoding + +A strong Learning answers: what was understood, what evidence supports it, why it happens (the mechanism), and how broadly it applies. The most common gap is missing evidence — models often encode interpretations without citing the observations that prompted them. The second most common gap is asserting a pattern without explaining the mechanism. 
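The fallback classification described under Trigger Words can be built dynamically from governance vocabulary instead of hardcoded regex. A minimal sketch with a sample vocabulary; in the real design the word lists would be read from the canon articles at runtime, and the names here are illustrative:

```typescript
// Sample vocabulary standing in for trigger words read from canon.
const triggerWords: Record<string, string[]> = {
  L: ["learned", "realized", "discovered", "turns out", "the mechanism is"],
  O: ["observed", "noticed", "measured", "the output was"],
  D: ["decided", "chose", "committed to"],
};

// Build one case-insensitive, word-bounded alternation per type.
function buildClassifiers(vocab: Record<string, string[]>): Record<string, RegExp> {
  const escape = (s: string) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const out: Record<string, RegExp> = {};
  for (const [type, words] of Object.entries(vocab)) {
    out[type] = new RegExp(`\\b(${words.map(escape).join("|")})\\b`, "i");
  }
  return out;
}

function classifyParagraph(
  paragraph: string,
  classifiers: Record<string, RegExp>,
): string | null {
  for (const [type, re] of Object.entries(classifiers)) {
    if (re.test(paragraph)) return type;
  }
  return null; // no trigger matched; the caller decides the default
}
```

Because the regex is rebuilt from governance at encode time, a knowledge base can extend the vocabulary without server changes.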
+ +--- + +## See Also + +- [DOLCHE Vocabulary](klappy://docs/oddkit/proactive/dolche-vocabulary) — the six-dimension framework this type belongs to +- [Encoding Type: Observation](klappy://odd/encoding-types/observation) — the companion type that provides evidence for learnings +- [Prompt Over Code](klappy://canon/principles/prompt-over-code) — why this governance doc exists instead of server code diff --git a/odd/encoding-types/observation.md b/odd/encoding-types/observation.md index a8dadf11..11444d31 100644 --- a/odd/encoding-types/observation.md +++ b/odd/encoding-types/observation.md @@ -6,24 +6,28 @@ exposure: nav tier: 2 voice: neutral stability: semi_stable -tags: ["odd", "oddkit", "encode", "dolche", "observation", "encoding-type"] -epoch: E0008 -date: 2026-04-15 -derives_from: "docs/oddkit/proactive/dolche-vocabulary.md" +tags: ["odd", "oddkit", "encode", "dolche", "dolcheo", "observation", "encoding-type", "tsv", "governance", "epoch-8.4"] +epoch: E0008.4 +date: 2026-04-30 +derives_from: "canon/definitions/dolcheo-vocabulary.md, docs/architecture/encode-architecture-problem-and-gaps.md, canon/principles/prompt-over-code.md" +complements: "odd/encoding-types/decision.md, odd/encoding-types/learning.md, odd/encoding-types/constraint.md, odd/encoding-types/handoff.md, odd/encoding-types/open.md, odd/encoding-types/encode.md, odd/encoding-types/how-to-write-encoding-types.md, odd/encoding-types/serialization-format.md" governs: "oddkit_encode parsing and quality scoring for type O" -status: active fallback: true +status: active --- + # Encoding Type: Observation (O) -> What was seen or noticed. Raw facts without interpretation. +> What was seen or noticed. Raw facts without interpretation. Observations are the evidence layer — they describe reality as encountered, not reality as theorized. An observation that nobody recorded is an observation that never happened for the system's purposes. 
--- ## Summary — The Evidence Layer -Observations describe reality as encountered, not reality as theorized. They are the foundation that Learnings, Decisions, and Constraints build on. +Observations are the foundation that all other DOLCHE types build on. Learnings interpret observations. Decisions respond to them. Constraints emerge from them. Without recorded observations, the rest is speculation. + +The key discipline: observations describe what happened, not what it means. "The deploy took 47 seconds" is an observation. "The deploy is too slow" is a learning. Keeping these separate matters because the same observation can support different interpretations. Recording the raw fact preserves optionality. --- @@ -33,28 +37,74 @@ Observations describe reality as encountered, not reality as theorized. They are |---|---| | Letter | O | | Name | Observation | +| Priority | High — observations are the evidence that grounds all other types | --- ## Field Schema +When encoding an Observation, the model outputs a row with the following fields (serialization format governed by `odd/encoding-types/serialization-format.md`): + +``` +O {title} {body} {source} {verifiability} +``` + | Field | Recommended | Description | |---|---|---| | type | yes | Always `O` | -| title | yes | Short summary of what was observed | -| body | yes | What was seen, measured, or encountered | +| title | yes | Short summary of what was observed (≤12 words) | +| body | yes | The observation — what was seen, measured, or encountered | +| source | no | Where or how this was observed: direct measurement, log output, user report, conversation, etc. | +| verifiability | no | "verified", "reported", "inferred", or empty. Distinguishes firsthand evidence from secondhand | + +Example: + +``` +O Deno user-agent was 39% of oddkit traffic Deno/2.1.4 accounted for 2,631 of 6,753 total tool calls in the telemetry data. Access pattern was entirely Apocrypha-focused. 
telemetry_public SQL query verified +``` --- ## Trigger Words (Fallback Classification) +When encode input is unstructured (not TSV), these trigger words classify a paragraph as Observation: + ``` -observed, noticed, saw, measured, detected, appeared, showed +observed, noticed, saw, measured, detected, appeared, showed, the data shows, the log shows, the output was, encountered, surfaced, evidence, metric ``` --- -## See Also — Open Is the Forward-Pointing Sibling +## Quality Criteria + +Each criterion adds 1 to the quality score (max 4): + +| Criterion | Check | Gap message if missing | +|---|---|---| +| Substance | Body is ≥10 words | "Observation is too brief — expand what was seen" | +| Specificity | Body contains a number, name, or concrete detail | "Add specifics — numbers, names, or concrete details strengthen observations" | +| Source | Source column is non-empty | "Where was this observed? Add the source" | +| Separation | Body does not contain interpretation words (should, better, worse, means, implies) | "This reads like interpretation, not observation — separate what was seen from what it means" | + +Quality levels: + +| Score | Level | Status | +|---|---|---| +| 4 | strong | recorded | +| 3 | adequate | recorded | +| 2 | weak | draft | +| 0–1 | insufficient | draft | + +--- + +## What Makes a Good Observation Encoding + +A strong Observation answers: what was seen, where it was seen, and whether it's firsthand. It resists the urge to interpret — that's what Learnings are for. The most common gap is mixing observation with interpretation in the same entry. "The API returned a 403" is an observation. "The API is broken" is a learning. Recording both is fine — as separate typed entries. + +--- -Observation (closed) and Open (forward-pointing) share letter `O` and are distinguished by section placement and an optional `facet` field. A paragraph inside a section headed `## Open items` is an Open; elsewhere, `[O]` is a closed Observation. 
For the forward-pointing variant, see `odd/encoding-types/open.md`. For the umbrella vocabulary that defines both, see `canon/definitions/dolcheo-vocabulary.md`. +## See Also +- [DOLCHE Vocabulary](klappy://docs/oddkit/proactive/dolche-vocabulary) — the six-dimension framework this type belongs to +- [Encoding Type: Learning](klappy://odd/encoding-types/learning) — the companion type that interprets observations +- [Prompt Over Code](klappy://canon/principles/prompt-over-code) — why this governance doc exists instead of server code diff --git a/odd/encoding-types/open.md b/odd/encoding-types/open.md index 94a627be..83605812 100644 --- a/odd/encoding-types/open.md +++ b/odd/encoding-types/open.md @@ -6,18 +6,18 @@ exposure: nav tier: 2 voice: neutral stability: semi_stable -tags: ["odd", "oddkit", "encode", "dolche", "dolcheo", "open", "open-items", "priority-bands", "forward-pointing", "encoding-type", "epoch-8.3"] -epoch: E0008.3 -date: 2026-04-19 -derives_from: "canon/definitions/dolcheo-vocabulary.md, odd/ledger/2026-04-19-agent-team-pilot.md" -complements: "odd/encoding-types/observation.md, odd/encoding-types/handoff.md, odd/encoding-types/decision.md, odd/encoding-types/learning.md, odd/encoding-types/constraint.md, odd/encoding-types/encode.md, odd/encoding-types/how-to-write-encoding-types.md" +tags: ["odd", "oddkit", "encode", "dolche", "dolcheo", "open", "open-items", "priority-bands", "forward-pointing", "encoding-type", "tsv", "governance", "epoch-8.4"] +epoch: E0008.4 +date: 2026-04-30 +derives_from: "canon/definitions/dolcheo-vocabulary.md, docs/architecture/encode-architecture-problem-and-gaps.md, odd/ledger/2026-04-19-agent-team-pilot.md, canon/principles/prompt-over-code.md" +complements: "odd/encoding-types/observation.md, odd/encoding-types/handoff.md, odd/encoding-types/decision.md, odd/encoding-types/learning.md, odd/encoding-types/constraint.md, odd/encoding-types/encode.md, odd/encoding-types/how-to-write-encoding-types.md, 
odd/encoding-types/serialization-format.md" governs: "oddkit_encode parsing and quality scoring for type O, facet=open" status: active --- # Encoding Type: Open (O, forward-pointing) -> An unresolved thread the current owner is still holding. Shares letter `O` with Observation; section placement and the `-open` subtype tag disambiguate. Open items carry priority bands (P1, P2, P3…) so the list is scannable. +> An unresolved thread the current owner is still holding. Shares letter `O` with Observation; section placement and the `-open` subtype tag disambiguate. Open items carry priority bands (P1, P2, P3…) so the list is scannable. An Open closes by transitioning to a Decision when resolved, a Handoff when transferred, or a closed Observation when it completes as historical fact. --- @@ -36,25 +36,82 @@ An Open becomes closed by transitioning to another artifact: to a Decision when | Letter | O | | Facet | open | | Name | Open | +| Priority | High — Opens that aren't tracked become silent dropped work | --- ## Field Schema +When encoding an Open, the model outputs a row with the following fields (serialization format governed by `odd/encoding-types/serialization-format.md`): + +``` +O {facet} {priority} {title} {body} +``` + | Field | Recommended | Description | |---|---|---| | type | yes | Always `O` | | facet | yes | Always `open` (distinguishes from closed Observation, which has no facet or `facet: closed`) | | priority | yes | Band identifier — `P1`, `P2`, `P3`, …, with optional sub-band like `P1.1` | -| title | yes | Short summary of the unresolved thread | +| title | yes | Short summary of the unresolved thread (≤12 words) | | body | yes | What remains to be done, decided, or answered, and what would close it | +Example: + +``` +O open P1 Encode parser still hardcodes 4 types Parser detectEncodeType() recognizes only decision/insight/boundary/override via regex; DOLCHE governance defines six dimensions plus extension. 
Closes when parser reads vocabulary from canon at runtime per Alternative D. +``` + --- ## Trigger Words (Fallback Classification) +When encode input is unstructured (not TSV), these trigger words classify a paragraph as Open: + ``` -open item, still need to, haven't decided, unresolved, pending, awaiting, todo, followup, next up, P1, P2, P3, O-open +open item, still need to, haven't decided, unresolved, pending, awaiting, todo, followup, next up, P1, P2, P3, O-open, parked, holding, in flight ``` Classification preference: a paragraph inside a section whose header matches `/open items?/i` or `/forward[- ]pointing/i` is classified as Open regardless of trigger words. A paragraph prefixed `[O-open]` is classified as Open. A paragraph prefixed `[O]` outside an Open items section is classified as closed Observation. + +These words are used by the server to build dynamic regex for fallback classification only. On the primary TSV path, the type letter `O` plus `facet=open` is the classifier. + +--- + +## Quality Criteria + +Each criterion adds 1 to the quality score (max 5): + +| Criterion | Check | Gap message if missing | +|---|---|---| +| Substance | Body is ≥10 words | "Open is too brief — describe what remains and what would close it" | +| Priority assigned | Priority column is non-empty and matches `P\d+(\.\d+)*` | "Assign a priority band — P1 (must close), P2 (should close), P3 (parked)" | +| Closure path | Body names what would close this — a Decision, a Handoff, an Observation, or a specific event | "How does this close? Decision, handoff, or completion event?" | +| Specificity | Title is concrete, not a topic label | "Sharpen the title — what specifically remains unresolved?" | +| Owner clarity | Body or context makes clear who is currently holding the thread (or that ownership itself is unresolved) | "Who is holding this? 
If nobody, it should be a Handoff" | + +Quality levels: + +| Score | Level | Status | +|---|---|---| +| 5 | strong | recorded | +| 3–4 | adequate | recorded | +| 2 | weak | draft | +| 0–1 | insufficient | draft | + +--- + +## What Makes a Good Open Encoding + +A strong Open encoding answers four questions: What is unresolved? What would close it? Who is holding it? How urgent is it? The most common gap is missing closure path — Opens that don't say what would close them tend to drift indefinitely. The second most common gap is vague titles that read like topics rather than threads ("caching" instead of "decide whether governance regex is module-cached or per-request"). + +Opens are the artifact type most likely to be silently dropped. Recording them aggressively is the discipline. Closing them honestly — via transition to D, H, or closed-O — is the follow-through. + +--- + +## See Also + +- [DOLCHEO Vocabulary](klappy://canon/definitions/dolcheo-vocabulary) — the seven-dimension framework this type belongs to +- [How to Write an Encoding Type](klappy://odd/encoding-types/how-to-write-encoding-types) — the meta-governance this doc follows +- [Encoding Type: Handoff](klappy://odd/encoding-types/handoff) — the companion forward-looking type (transfers ownership) +- [Prompt Over Code](klappy://canon/principles/prompt-over-code) — why this governance doc exists instead of server code diff --git a/odd/encoding-types/serialization-format.md b/odd/encoding-types/serialization-format.md index f2a9ec83..7c8118ee 100644 --- a/odd/encoding-types/serialization-format.md +++ b/odd/encoding-types/serialization-format.md @@ -6,7 +6,7 @@ exposure: nav tier: 2 voice: neutral stability: semi_stable -tags: ["odd", "oddkit", "encode", "dolche", "tsv", "format", "serialization", "governance", "storage"] +tags: ["odd", "oddkit", "encode", "dolche", "tsv", "format", "serialization", "encoding-type", "governance", "storage"] epoch: E0008 date: 2026-04-15 derives_from: 
"docs/oddkit/proactive/dolche-vocabulary.md, odd/encoding-types/how-to-write-encoding-types.md, canon/principles/prompt-over-code.md" diff --git a/odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d.md b/odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d.md new file mode 100644 index 00000000..d0749c6e --- /dev/null +++ b/odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d.md @@ -0,0 +1,163 @@ +--- +uri: klappy://odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d +title: "Handoff — oddkit_encode Vodka Refactor (Alternative D, Governance-Driven Parser, E0008.4 Phase 2)" +audience: docs +exposure: nav +tier: 2 +voice: neutral +stability: semi_stable +tags: ["odd", "handoff", "session", "epoch-8.4", "p2-encode-vodka", "oddkit-encode", "alternative-d", "governance-driven", "tsv", "vodka-architecture", "prompt-over-code", "release-validation-gate", "managed-agent-validation"] +epoch: E0008.4 +date: 2026-04-30 +derives_from: "docs/architecture/encode-architecture-problem-and-gaps.md, odd/encoding-types/decision.md, odd/encoding-types/observation.md, odd/encoding-types/learning.md, odd/encoding-types/constraint.md, odd/encoding-types/handoff.md, odd/encoding-types/open.md, odd/encoding-types/serialization-format.md, odd/encoding-types/how-to-write-encoding-types.md, canon/definitions/dolcheo-vocabulary.md, canon/constraints/release-validation-gate.md, canon/constraints/core-governance-baseline.md, odd/ledger/2026-04-20-p1-3-4-encode-canon-parity-landed.md" +complements: "odd/handoffs/2026-04-21-p1-3-2-gate-canary.md, odd/handoffs/2026-04-20-p1-3-4-encode-canon-parity.md" +governs: "Phase 2 implementation gate for the oddkit_encode vodka refactor: the parser must read encoding-type and serialization-format governance from canon at runtime, parse TSV input against governance-defined field schemas, score per-type quality from governance criteria, return per-type artifacts in a markdown stream, and declare governance_source + governance_uris 
in the response envelope."
+status: active
+---
+
+# Handoff — oddkit_encode Vodka Refactor (Alternative D, Governance-Driven Parser, E0008.4 Phase 2)
+
+> Phase 1 landed governance into canon. The seven encoding-type articles (D, O, L, C, H, E, Open) and the serialization-format article now carry field schemas, quality criteria, trigger words, and serialization rules — everything the parser needs to be governance-driven. The architecture brief at `klappy://docs/architecture/encode-architecture-problem-and-gaps` recommends Alternative D and resolves the design questions. Phase 2 is the parser refactor that makes the canon the source of truth.
+
+---
+
+## Summary — Why Phase 2 Exists Even Though P1.3.4 "Closed the Sweep"
+
+The P1.3.4 closeout (`klappy://odd/ledger/2026-04-20-p1-3-4-encode-canon-parity-landed`) brought encode's trigger-word matcher to canon-parity with challenge and gate. That sweep was real progress, but it left the deeper architectural gap that H-01 of that ledger explicitly named: the encode parser still hardcodes recognition of four English keywords (`decision`, `insight`, `boundary`, `override`) via TypeScript regex, while the DOLCHEO governance defines seven dimensions (D, O, L, C, H, E, Open) plus an extension mechanism. The matcher was synchronized; the parser remained a regex against a private four-type vocabulary that does not match the canon. Two of the seven types — Observation and Handoff — have no parser representation. One name in the parser (`override`) does not exist in the canon at all. The model does the categorization work; the server discards it; the artifact collapses to a single blob.
+
+Phase 1 of E0008.4 (this PR) populated canon with the per-type field schemas and quality criteria the architecture needs. Phase 2 of E0008.4 (this handoff) refactors the parser to read that canon at encode time. The sweep didn't fail — it cleaned the matcher. The parser is the next layer down.
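The primary path Phase 2 ships can be sketched as mechanical TSV dispatch against governance-derived schemas. A hypothetical illustration, not the `orchestrate.ts` implementation: the schema map below is a hardcoded sample standing in for schemas read from canon at runtime, and the function names are assumptions.

```typescript
// Sample of the per-type field schemas that would be derived from the canon
// encoding-type articles at encode time. The map can grow without server changes.
const fieldSchemas: Record<string, string[]> = {
  D: ["title", "body", "rationale", "alternatives", "reversibility"],
  H: ["title", "body", "blocked_by", "owner"],
  L: ["title", "body", "evidence", "applicability"],
};

interface Artifact {
  type: string;
  fields: Record<string, string>;
}

// Primary path: first field selects the type, remaining fields are zipped
// against that type's governance-defined column order. Pure string splitting.
function parseTsvRow(line: string, schemas: Record<string, string[]>): Artifact {
  const [type, ...rest] = line.split("\t");
  const columns = schemas[type];
  if (!columns) throw new Error(`unknown encoding type "${type}" — not in governance`);
  const fields: Record<string, string> = {};
  columns.forEach((name, i) => { fields[name] = rest[i] ?? ""; });
  return { type, fields };
}

// Multi-row input yields one artifact per row, not a single blob.
const parseTsv = (input: string, schemas: Record<string, string[]>): Artifact[] =>
  input.split("\n").filter((l) => l.trim()).map((l) => parseTsvRow(l, schemas));
```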
+ +--- + +## Scope — What Phase 2 Ships + +**Implementation contract:** Alternative D from the architecture brief. The server reads encoding-type governance and serialization-format governance from canon at encode time. It parses structured TSV input mechanically against governance-defined field schemas. It falls back to paragraph-split + dynamic regex (built from governance-defined trigger words) for unstructured input. It scores per-type quality against governance-defined criteria. It returns per-type artifacts in a markdown stream. It surfaces the governance in the response so the model learns the format for subsequent calls. + +**Specifically in scope:** + +1. Replace `detectEncodeType()` in `orchestrate.ts` with a governance-driven classifier that reads the seven encoding-type docs and the serialization-format doc at runtime (cached identically to all other canon reads). +2. Parse TSV input mechanically: first field is the type letter, remaining fields are defined per-type by the article's Field Schema table. No regex on the primary path. +3. Implement the unstructured-input fallback: paragraph-split, dynamic regex from governance-defined trigger words, classify per paragraph. +4. Implement per-type quality scoring from each article's Quality Criteria table (criterion checks + gap messages, score levels). +5. Return per-type artifacts in markdown stream — one section per row, with title, type, quality score, gaps, and suggestions. +6. Declare `governance_source` (`"knowledge_base"` | `"minimal"`) and `governance_uris` (alphabetical-by-path array of the canon docs consulted) in the response envelope, matching the pattern challenge (P1.3.1) and gate (P1.3.2) established. +7. Update the encode tool description to teach the model the TSV format, name the seven types, and reference the governance for extension. +8. 
Bundle the seven encoding-type articles + serialization-format + how-to-write-encoding-types into the worker's minimal governance baseline per `canon/constraints/core-governance-baseline`, so the tool works when canon is unreachable but degrades to `governance_source: "minimal"` honestly. + +**Specifically out of scope (do not let scope drift into these):** + +- LLM-in-the-loop encoding (Alternative E in the brief — that's TruthKit territory, not oddkit). +- Schema enforcement that rejects unknown fields. The server should be permissive: extra fields are surfaced as warnings, not errors. +- Auto-supersession of the older shorter encoding-type articles. Phase 1 of this PR replaced them in place; no `superseded_by` chain is needed because the URIs are stable. +- P11 (gate mechanical enforcement of release-validation-gate) — that's a separate next-epoch capability per the P1.3.4 closeout. + +**Target version:** oddkit 0.23.0 (canon-driven minor bump per the canary precedent). Coordinate version with the parallel-release lesson from P1.3.4 — confirm 0.23.0 isn't taken before opening the PR. + +--- + +## Acceptance Criteria + +A clean Phase 2 ship demonstrates all of the following: + +1. `oddkit_encode` called with structured TSV input parses mechanically — no regex on the primary path. +2. `oddkit_encode` called with unstructured prose falls back to paragraph-split + dynamic regex from governance trigger words; classifications per paragraph reflect the governance vocabulary (seven types), not a hardcoded four. +3. Multi-row inputs return multi-artifact markdown streams. The DOLCHE-style "5 decisions, 3 observations, 2 learnings, 4 constraints, 3 handoffs" test from the architecture brief returns 17 typed artifacts, not one blob. +4. Each artifact carries a per-type quality score derived from its governance article's criteria, with explicit gap messages for missing fields. +5. Response envelope includes `governance_source` and `governance_uris` (alphabetical by path). 
+6. With canon reachable: `governance_source: "knowledge_base"`. With canon unreachable (simulated 403): `governance_source: "minimal"`, baseline behavior preserved, no silent substitution.
+7. Tool description teaches the seven types and the TSV format. Description does not enumerate trigger words — those live in canon.
+8. Smoke 126/126 × 5 consecutive runs at the preview URL before promotion (matching the gate canary precedent).
+9. Bundled minimal baseline (nine files: the seven encoding-type articles plus serialization-format and how-to-write-encoding-types) shipped with the worker.
+
+---
+
+## Release Validation Gate — Mandatory, Not Optional
+
+This refactor touches `orchestrate.ts`, the matcher implementation, governance reads, the response envelope, and action behavior. Per `klappy://canon/constraints/release-validation-gate` (tier 1), all three rules apply:
+
+1. **Cursor Bugbot must reach `completed` before promotion merge.** `in_progress` blocks promotion; it does not count as a pass. If Bugbot is still running when the feature PR is otherwise green, wait.
+2. **Independent Sonnet 4.6 read-only validator agent must be dispatched via Managed Agents before promotion merge.** This is not optional for a PR that touches all five trigger surfaces. Same-session smoke and self-calls do not satisfy this. Fresh-context validation per `klappy://canon/principles/verification-requires-fresh-context`.
+3. **Canon outranks session-scoped recommendations.** If the session ledger ends up suggesting a shortcut on either of the above, surface the conflict and follow canon. Propose amendment to canon if session judgment was actually right.
+
+The P1.3.3 process failure (skipped Bugbot, skipped validator, broke prod twice) is the cautionary tale. The P1.3.4 closeout was the second clean application of the gate. This will be the third — make it mechanical.
+
+---
+
+## Pitfalls Observed in the Sweep
+
+These are not theoretical — they bit during P1.3.1 through P1.3.4. Carry them forward.
+ +- **Stop-word filters silently destroy short trigger words.** Tokenize with stop-words disabled on both sides of the matcher when the vocabulary contains words like "to," "in," "of." See P1.3.3 prod-break post-mortem. +- **Module-memory caches that derive structure on first request lie about cache hits.** Canon reads cache; derived parses do not. Don't introduce new derivation caches; rebuild per request from cached canon. The `D9 cachedEncodingTypes` removal in P1.3.4 was for this exact reason. +- **Set-intersection is the canon-parity matcher, not regex.** Stem both sides, take the intersection of token sets, threshold. Don't reach for regex on the primary path. Regex is the unstructured fallback only. +- **Bugbot autofix design over orchestrator design.** When Bugbot proposes an autofix, prefer it unless canon names a reason for divergence. L-08 from P1.3.4 is the candidate principle graduation here. +- **Parallel-release version collisions.** Confirm the next minor isn't already cut on prod. P1.3.4 hit this; the version bump was a coordination artifact, not a content concern. + +--- + +## Phase 1 Inventory — What Already Landed in This PR + +For the validator agent and any future archaeologist: + +- `odd/encoding-types/decision.md` — replaced (1.1KB → 5.1KB) with field schema (6 fields), TSV example, expanded trigger words (13), per-type quality criteria (5 checks). +- `odd/encoding-types/observation.md` — replaced with same upgrade pattern. +- `odd/encoding-types/learning.md` — replaced with same upgrade pattern. +- `odd/encoding-types/constraint.md` — replaced with same upgrade pattern. +- `odd/encoding-types/handoff.md` — replaced with same upgrade pattern. +- `odd/encoding-types/open.md` — enriched in place: kept oddkit's Open framing (DOLCHEO seventh letter, post-2026-04-19), added field schema with `facet`/`priority` columns, added quality criteria (5 checks), added narrative "What Makes a Good Open Encoding." 
+- `odd/encoding-types/serialization-format.md` — synced one missing tag (`encoding-type`); body unchanged. +- `odd/encoding-types/encode.md` — unchanged; encode is the meta-action, not a row class. +- `odd/encoding-types/how-to-write-encoding-types.md` — unchanged; oddkit's version already carries the Open/Observation `O`-letter collision note. +- `docs/architecture/encode-architecture-problem-and-gaps.md` — new in oddkit canon, body verbatim from `klappy/truthkit-kb`, klappy.dev frontmatter added with `provenance` field documenting the migration. + +The TruthKit-KB articles dropped (not migrated): `question.md` — superseded by oddkit's `open.md` framing on 2026-04-19 when DOLCHEO promoted Open to the seventh letter. The TruthKit-KB origin remains in that repo's history; the field-schema and quality-criteria pattern from `question.md` was ported into `open.md`. + +--- + +## What the Validator Should Confirm + +When the Sonnet 4.6 read-only validator runs against the Phase 2 PR, it should answer these specifically: + +1. Does `detectEncodeType()` (or its successor) read encoding-type governance at runtime, with no regex on the primary TSV path? +2. Are the trigger words in the fallback path sourced from each encoding-type article's `## Trigger Words` section, not from a TypeScript constant? +3. Are quality criteria sourced from each encoding-type article's `## Quality Criteria` table, not from a TypeScript constant? +4. Does the response envelope declare `governance_source` and `governance_uris`? +5. Does the bundled minimal baseline include the nine files named above? +6. With canon unreachable (test it), does `governance_source` flip to `"minimal"` and does behavior degrade gracefully? +7. Does the tool description teach the model the seven types and the TSV format without enumerating trigger words inline? +8. Is multi-row TSV input returned as multi-artifact markdown stream, one section per row, each with type-specific quality feedback? 
+ +A Phase 2 PR that fails any of the above is not done. + +--- + +## Successor / Closeout Pattern + +When Phase 2 ships clean to prod: + +1. Open a closeout ledger at `odd/ledger/2026-MM-DD-encode-vodka-refactor-landed.md` mirroring the structure of `odd/ledger/2026-04-20-p1-3-4-encode-canon-parity-landed.md`. +2. Flip this handoff's frontmatter to `status: superseded`, `superseded_by: `. +3. Update the encode tool description in the README/docs site to reflect the new behavior. +4. Encode the L-08 graduation candidate (Bugbot autofix preference) as a tier-2 principle if this is the fourth occurrence of the autofix-vs-orchestrator-design tension. +5. Close E0008.4 as the encode-architecture-finally-right epoch. Confirm E0008.5 (or the next epoch) carries P11 (gate mechanical enforcement). + +--- + +## Open Items (carried forward to next session) + +| Tag | Description | Priority | +|---|---|---| +| O-open | Confirm 0.23.0 is the right minor version before opening the Phase 2 PR (parallel-release check). | P1 | +| O-open | Decide whether the bundled minimal baseline includes the architecture brief or only the eight per-type + format docs. The architecture brief is human-facing; the minimal baseline should probably stay machine-facing. | P2 | +| O-open | Decide whether `oddkit_encode` should accept a `format` hint param (e.g., `format: "tsv"` vs `format: "prose"`) or always auto-detect. Auto-detect is simpler; explicit hint is more honest. | P2 | +| O-open | TruthKit-KB next: replace its older D/O/L/C/H/Q articles with the now-canonical oddkit versions, or maintain divergence intentionally. Likely a one-direction sync — TruthKit reads from oddkit baseline + adds harness-specific overlay. 
| P3 | + +--- + +## See Also + +- [Encode Architecture: Problem, Gaps, and Alternatives Analysis](klappy://docs/architecture/encode-architecture-problem-and-gaps) — the implementation brief +- [P1.3.4 Closeout — Encode Canon-Parity](klappy://odd/ledger/2026-04-20-p1-3-4-encode-canon-parity-landed) — the prior ship and the H-01 that motivated this work +- [Core Governance Baseline](klappy://canon/constraints/core-governance-baseline) — the minimal-baseline contract +- [Release Validation Gate](klappy://canon/constraints/release-validation-gate) — the three rules for shipping +- [DOLCHEO Vocabulary](klappy://canon/definitions/dolcheo-vocabulary) — the seven-dimension umbrella diff --git a/odd/ledger/2026-04-30-e0008-4-phase-1-encode-governance-migration-landed.md b/odd/ledger/2026-04-30-e0008-4-phase-1-encode-governance-migration-landed.md new file mode 100644 index 00000000..71580feb --- /dev/null +++ b/odd/ledger/2026-04-30-e0008-4-phase-1-encode-governance-migration-landed.md @@ -0,0 +1,144 @@ +--- +uri: klappy://odd/ledger/2026-04-30-e0008-4-phase-1-encode-governance-migration-landed +title: "E0008.4 Phase 1 Closeout — Encode Governance Migration from TruthKit-KB (Canon-Only Ship, Phase 2 Handoff Laid)" +audience: docs +exposure: nav +tier: 2 +voice: neutral +stability: semi_stable +tags: ["odd", "ledger", "session", "epoch-8.4", "phase-1", "encode", "vodka-architecture", "alternative-d", "truthkit-kb", "governance-migration", "canon-only", "release-validation-gate", "dolcheo"] +epoch: E0008.4 +date: 2026-04-30 +derives_from: "docs/architecture/encode-architecture-problem-and-gaps.md, odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d.md, odd/ledger/2026-04-20-p1-3-4-encode-canon-parity-landed.md, canon/definitions/dolcheo-vocabulary.md, canon/constraints/release-validation-gate.md" +complements: "odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d.md, odd/encoding-types/decision.md, odd/encoding-types/observation.md, 
odd/encoding-types/learning.md, odd/encoding-types/constraint.md, odd/encoding-types/handoff.md, odd/encoding-types/open.md" +governs: "Closeout for Phase 1 of E0008.4 (encode governance migration from truthkit-kb into oddkit canon). Records the canon-only ship that prepares Phase 2 (parser refactor in klappy/oddkit). PR klappy/klappy.dev#157." +status: active +--- + +# E0008.4 Phase 1 Closeout — Encode Governance Migration from TruthKit-KB + +> Phase 1 of E0008.4 shipped as a canon-only PR (klappy/klappy.dev#157, 9 files, +814/-57). The seven encoding-type articles (D, O, L, C, H, E, Open) and the serialization-format article now carry field schemas, TSV serialization examples, expanded trigger words, per-type quality criteria, and narrative sections — exactly what Alternative D needs to be governance-driven. The architecture brief originally written in `klappy/truthkit-kb` is now oddkit canon at `klappy://docs/architecture/encode-architecture-problem-and-gaps`, copied verbatim with klappy.dev frontmatter and a `provenance` field documenting the migration. The Phase 2 implementation handoff is at `klappy://odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d`. No code changes; the parser refactor is Phase 2 in `klappy/oddkit`. + +--- + +## Summary — What Shipped, What Held, What Held the Door for Phase 2 + +This was a coordination ship, not an authoring ship. The encode-architecture problem analysis and the per-type field-schema/quality-criteria patterns had been authored in `klappy/truthkit-kb` (private) but were unused by the consumer that needed them most: oddkit's `oddkit_encode` parser. The P1.3.4 closeout (`klappy://odd/ledger/2026-04-20-p1-3-4-encode-canon-parity-landed`) explicitly named this gap as H-01 — the matcher was canon-parity, but the parser still hardcoded recognition of four English keywords against a vocabulary of seven dimensions. Phase 1 puts the governance where the parser will read it. Phase 2 makes the parser read it. 
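+
To make the contrast concrete, here is a sketch (not the Phase 2 implementation; every name in it is hypothetical) of a governance-driven fallback classifier that builds its patterns from trigger words read out of canon rather than from TypeScript constants:

```typescript
// Sketch only: hypothetical shape of a governance-driven fallback classifier.
// `triggers` stands in for trigger-word lists read from the encoding-type
// canon docs at runtime (words assumed regex-safe, as the real ones are).
type TriggerTable = Map<string, string[]>;

function buildClassifier(triggers: TriggerTable, fallbackType = "decision") {
  // One dynamic pattern per type, rebuilt whenever the cached canon changes.
  const patterns = Array.from(triggers.entries()).map(([type, words]) => ({
    type,
    regex: new RegExp(`\\b(${words.join("|")})\\b`, "i"),
  }));
  // Classify one paragraph of unstructured input.
  return (paragraph: string): string => {
    for (const { type, regex } of patterns) {
      if (regex.test(paragraph)) return type;
    }
    return fallbackType;
  };
}
```

The dependency direction is the point: extending the vocabulary becomes a markdown edit in canon, not a code change. Prompt over code.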
+ +The migration was mechanically simple — copy bodies verbatim, rewrite frontmatter to klappy.dev conventions (epoch tag, dates, derives_from cross-links, provenance field on the architecture doc). The intellectual work was already done; this PR is the wiring. + +The decision worth naming: split the work across two repos and two PRs. Phase 1 is canon-only and triggers no release-validation-gate surfaces — no `orchestrate.ts`, no matcher, no governance reads at runtime, no envelope changes, no action behavior changes. Phase 2 will trigger all five surfaces and demands the full gate (Bugbot to completed, Sonnet 4.6 read-only validator via Managed Agents). Splitting prevents Phase 1 from being held hostage to Phase 2's validator dispatch. + +A second observation worth recording: the session also surfaced a posture lapse. The first attempt to declare Phase 1 "done" did not include a session ledger in the same PR. The session journal had been generated and persisted to `/mnt/user-data/outputs/`, but that is download storage, not project storage. The operator caught it directly: "Why wasn't the journal included?" The fix is in the PR you are now reading. Lesson L-03 below. + +--- + +## Decisions + +**[D-01] Migrate TruthKit-KB encoding-type field schemas, quality criteria, and architecture brief into oddkit canon as the basis for the encode parser vodka refactor.** Per-type articles for D, O, L, C, H replaced verbatim from TruthKit-KB; `open.md` enriched in place to keep oddkit's DOLCHEO seventh-letter framing while gaining TruthKit's quality-criteria pattern; `question.md` NOT migrated (superseded by Open on 2026-04-19). Architecture brief copied verbatim with klappy.dev frontmatter and `provenance` field; epoch tagged E0008.4. Single PR (klappy.dev #157) for Phase 1 — canon-only, no code changes. Rationale: re-authoring would have wasted the work already done in TruthKit-KB; verbatim migration with framing reconciliation is the cheaper and more honest correction. 
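+
For reference, the migrated brief's own frontmatter shows the pattern this decision describes (a condensed excerpt of the actual header; the `derives_from` list and the `provenance` text are truncated here):

```yaml
---
uri: klappy://docs/architecture/encode-architecture-problem-and-gaps
epoch: E0008.4
date: 2026-04-30
derives_from: "canon/definitions/dolcheo-vocabulary.md, canon/principles/prompt-over-code.md, ..."
provenance: "Originated in klappy/truthkit-kb at docs/architecture/encode-architecture-problem-and-gaps.md (commit prior to 2026-04-16). Migrated verbatim to klappy.dev as the implementation brief for the oddkit encode refactor, with frontmatter added for canon discoverability."
status: active
---
```

The `provenance` field carries the origin repo, the pre-migration date, and the migration intent, which is why no supersession chain is needed for migrated content.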
+ +**[D-02] Split the encode vodka refactor into two PRs across two repos: Phase 1 (canon migration, klappy.dev #157, this session) and Phase 2 (parser refactor, klappy/oddkit, separate session).** Rationale: the canon migration triggers no release-validation-gate surfaces but the parser refactor triggers all of them. Splitting prevents Phase 1 review from being held hostage to Phase 2 validator dispatch and lets each PR carry the gate severity it actually requires. + +**[D-03] Keep oddkit's `open.md` framing (DOLCHEO seventh letter, post-2026-04-19) rather than migrate TruthKit's `question.md`.** Port the field-schema and quality-criteria pattern from `question.md` into `open.md` instead. Rationale: TruthKit-KB was last updated 2026-04-16 — three days before DOLCHEO supersession introduced "Open" as the seventh letter. The architectural pattern was portable across the framing change; the framing itself was not. Maintaining oddkit's newer framing keeps the umbrella vocabulary doc (`canon/definitions/dolcheo-vocabulary`) coherent. + +--- + +## Observations + +**[O-01]** TruthKit-KB's per-type articles (D, O, L, C, H) are roughly 4× larger than oddkit canon's were because they carry the field schemas, TSV serialization examples, expanded trigger-word lists (10–13 each vs. 7), per-type quality criteria with check rules and gap messages, and "What Makes a Good X Encoding" narrative sections that Alternative D requires. The work was already done; the consumer that needed it didn't have it. + +**[O-02]** TruthKit-KB was last updated 2026-04-16, three days before the DOLCHEO supersession (2026-04-19) introduced "Open" as the seventh letter. TruthKit's `question.md` is older framing; oddkit's `open.md` is newer framing. The architectural pattern (field schema + quality criteria) was portable across the framing change; the framing itself was not. This kind of asymmetric divergence between sibling repos is a recurring shape — the lesson is in L-01. 
+
+**[O-03]** The architecture brief recommends Alternative D — Governance-defined field schemas with format-agnostic serialization. Server reads encoding-type docs at runtime, parses TSV mechanically, falls back to dynamic regex from governance trigger words for unstructured input, scores per-type quality, returns per-type artifacts in markdown stream, declares `governance_source` + `governance_uris` envelope. Identical pattern to challenge (P1.3.1) and gate (P1.3.2) canaries — encode is the third application of the same canary architecture, not a novel one.
+
+**[O-04]** The session's `oddkit_encode` call at the end (the one that produced this ledger's content) returned `governance_source: "knowledge_base"` from the action envelope. Encode was already reading the encoding-types directory from canon to drive its own classification — but its parser still hardcodes the four English keywords. The runtime read and the parser logic are decoupled in the current implementation; Phase 2 removes that decoupling.
+
+---
+
+## Learnings
+
+**[L-01] When two related repos diverge in governance authoring, the cheapest correction is verbatim migration with framing reconciliation, not re-authoring.** TruthKit-KB was ahead of oddkit canon on encode design; instead of re-authoring in oddkit, the body was migrated wholesale and the `provenance` frontmatter field was used to preserve authorship and origin. Review burden becomes the frontmatter (correctness, conventions, cross-links), not the body (which has already been written, reviewed, and proven coherent in its origin repo). This pattern likely generalizes to other oddkit ↔ TruthKit-KB sync situations and is a candidate for graduation as a tier-2 principle if it surfaces three more times.
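+
A minimal sketch of the mechanical TSV path O-03 summarizes, under the assumption that field schemas are read from the per-type articles (the schema literals and field names below are hypothetical placeholders):

```typescript
// Sketch only: the mechanical TSV path is split-and-map, with no regex.
// `fieldSchemas` is a hypothetical stand-in for the per-type Field Schema
// tables the server would read from the encoding-type articles in canon.
const fieldSchemas: Record<string, string[]> = {
  D: ["title", "rationale", "alternatives"],
  L: ["title", "evidence", "generalization"],
};

interface EncodedArtifact {
  type: string;
  fields: Record<string, string>;
  gaps: string[]; // schema fields the row left empty, for quality feedback
}

function parseTsvRow(row: string): EncodedArtifact {
  const [letter, ...values] = row.split("\t"); // first field: the type letter
  const schema = fieldSchemas[letter] ?? [];
  const fields: Record<string, string> = {};
  schema.forEach((name, i) => {
    fields[name] = values[i] ?? ""; // positional mapping onto the schema
  });
  return { type: letter, fields, gaps: schema.filter((n) => !fields[n]) };
}
```

Empty schema fields surface as `gaps`, which is where the per-type quality feedback attaches.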
+ +**[L-02] The release-validation-gate's three rules don't apply uniformly to all PRs.** Canon-only PRs (no `orchestrate.ts` / matcher / governance reads / envelope / action behavior changes) don't require Bugbot completion or Sonnet validator dispatch. The gate's trigger surface list is the operative classifier — fetch the canon and check before deciding gate severity. This Phase 1 PR is the worked example of correct gate-skip: canon-only, standard gauntlet only. Phase 2 will be the worked example of correct full-gate application. + +**[L-03] Persist-to-project-storage means canon repo, not download storage.** The first attempt to declare Phase 1 "done" generated a session journal and saved it to `/mnt/user-data/outputs/` — which is download storage for the operator, not project storage. The operator caught this directly. The bootstrap doc says "persist to project storage at natural breakpoints"; "project storage" for an oddkit-pattern project is the canon repo, written via the same PR that contains the work the journal describes. A milestone ship without a same-PR ledger is incomplete by the same standard P1.3.x has been measured against. This ledger you are reading is the fix; the lesson is that the model must include the ledger in the same PR as the work, not as a separate post-hoc artifact, and not as a downloadable file. + +**[L-04] TruthKit-KB → oddkit migrations preserve authorship via the `provenance` frontmatter field.** Verbatim body copy with klappy.dev frontmatter added and a `provenance:` line documenting the source repo and migration date is the lightweight authorship-preservation pattern. It avoids the overhead of formal supersession chains for content that was authored elsewhere and migrated cleanly. + +--- + +## Constraints + +**[C-01] Phase 2 (parser refactor in klappy/oddkit) MUST run the full release-validation-gate.** Bugbot to completed (not `in_progress`), Sonnet 4.6 read-only validator dispatched via Managed Agents before promotion merge. 
Same-session smoke does not satisfy. The handoff document at `klappy://odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d` names this explicitly. Authority: `klappy://canon/constraints/release-validation-gate` (tier 1). + +**[C-02] Bundled minimal baseline for the worker MUST include the nine governance files** (seven encoding-types: decision, observation, learning, constraint, handoff, encode, open; plus serialization-format and how-to-write-encoding-types). Without canon reachable, the tool degrades to `governance_source: "minimal"` honestly — does not silently substitute. Authority: `klappy://canon/constraints/core-governance-baseline`. + +**[C-03] Milestone ledgers ship in the same PR as the milestone work.** Generate-and-download is not persistence to project storage. Authority: bootstrap operating contract at `klappy://canon/bootstrap/model-operating-contract` § "For Durable Records." This constraint is graduated from L-03. + +--- + +## Handoffs + +**[H-01] Next session: Phase 2 implementation in klappy/oddkit.** Read `klappy://odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d` as the implementation contract. Confirm 0.23.0 is the right minor (parallel-release check per the P1.3.4 lesson). Implement Alternative D end-to-end. Run full release-validation-gate. Open closeout ledger when shipped. Successor pattern mirrors `klappy://odd/ledger/2026-04-20-p1-3-4-encode-canon-parity-landed`. + +**[H-02] After Phase 2 lands: TruthKit-KB sync.** Replace its older D/O/L/C/H/Q articles with the now-canonical oddkit versions, or maintain divergence intentionally. Likely one-direction sync (TruthKit reads oddkit baseline + adds harness-specific overlay). Decision deferred to post-Phase-2. + +**[H-03] L-04 graduation candidate.** "Verbatim migration with `provenance` frontmatter is the cross-repo authorship-preservation pattern" is a candidate for graduation as a tier-2 principle. 
Watch for the next two cross-repo migrations (any direction); if either reuses this pattern unprompted, graduate. + +**[H-04] L-03 graduation candidate (urgent).** "Milestone ledgers ship in the same PR as the milestone work" was just promoted to a constraint (C-03) because it was load-bearing in this session. Watch for any future session-end where the ledger is omitted — that's the canary that the constraint isn't being internalized. + +--- + +## Encodes + +**[E-01]** Phase 1 of E0008.4 complete: PR klappy/klappy.dev#157 opened, 9 files changed (now 10 with this ledger), +814/-57 (now ~+1100 with this ledger), `mergeable=true`. Canon migration from TruthKit-KB to oddkit verbatim with frontmatter rewrite plus session ledger now included in same PR. + +--- + +## Files Changed (PR #157, post-ledger-add) + +| Status | File | Origin | +|---|---|---| +| modified | `odd/encoding-types/constraint.md` | truthkit-kb (verbatim body) | +| modified | `odd/encoding-types/decision.md` | truthkit-kb (verbatim body) | +| modified | `odd/encoding-types/handoff.md` | truthkit-kb (verbatim body) | +| modified | `odd/encoding-types/learning.md` | truthkit-kb (verbatim body) | +| modified | `odd/encoding-types/observation.md` | truthkit-kb (verbatim body) | +| modified | `odd/encoding-types/open.md` | enriched in place (truthkit pattern, oddkit framing) | +| modified | `odd/encoding-types/serialization-format.md` | one tag sync | +| added | `docs/architecture/encode-architecture-problem-and-gaps.md` | truthkit-kb (verbatim body, klappy.dev frontmatter) | +| added | `odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d.md` | new (Phase 2 contract) | +| added | `odd/ledger/2026-04-30-e0008-4-phase-1-encode-governance-migration-landed.md` | new (this ledger — added after operator caught its omission) | + +--- + +## Timeline (UTC) + +| Time | Event | +|---|---| +| 04:14:50 | Session start; bootstrap fetched | +| 04:15:13 | Orient on E0008 observability refactor inventory | +| 
04:34:22 | Operator gate: scope confirmed (encode H-01) and method (use truthkit-kb articles) | +| 04:43:50 | Operator gate: "both canon and handoff brief, dual purpose" → execution | +| 04:44:00 | Preflight against the canon migration scope | +| 04:46–04:50 | Build 9 files locally (5 replacements, 1 enrichment, 1 sync, 2 new) | +| 04:51:06 | Push branch + open PR #157 + encode session checkpoint to memory | +| 04:51:30 | Session journal saved to `/mnt/user-data/outputs/` (incorrect storage location) | +| 04:52:03 | Validate returned NEEDS_ARTIFACTS; classified as expected for text commits | +| 05:00:00 | Operator: "Why wasn't the journal included?" — caught the storage-location miss | +| 05:00:47 | Time call; fix executed | +| 05:0X:XX | Ledger added to PR #157 as `odd/ledger/2026-04-30-e0008-4-phase-1-encode-governance-migration-landed.md` | + +--- + +## References + +- PR: https://github.com/klappy/klappy.dev/pull/157 +- Architecture brief: `klappy://docs/architecture/encode-architecture-problem-and-gaps` +- Phase 2 handoff: `klappy://odd/handoffs/2026-04-30-encode-vodka-refactor-alternative-d` +- Prior ledger (motivating H-01): `klappy://odd/ledger/2026-04-20-p1-3-4-encode-canon-parity-landed` +- Source repo (private): `klappy/truthkit-kb` +- Release validation gate: `klappy://canon/constraints/release-validation-gate` +- Core governance baseline: `klappy://canon/constraints/core-governance-baseline` +- DOLCHEO vocabulary: `klappy://canon/definitions/dolcheo-vocabulary` +- Bootstrap operating contract: `klappy://canon/bootstrap/model-operating-contract`