From a0dba0984242db4381dcdbc347b31552fb47b937 Mon Sep 17 00:00:00 2001
From: "Claude (via klappy session)"
Date: Sun, 3 May 2026 19:21:46 +0000
Subject: [PATCH] =?UTF-8?q?docs(promotions):=20P0003-P0009=20=E2=80=94=20s?=
 =?UTF-8?q?even=20promotions=20for=20MCP=20server=20patterns?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Slate of promotions distilled from patterns observed across
klappy/PTXprint-MCP (v1.0 -> v1.2 arc, PR #30 fresh-validator review) and
klappy/agent-messaging-service (hosted /mcp planning, 2026-05-03).

Each follows docs/promotions/TEMPLATE.md. Each cites adjacent existing
canon via derives_from chains.

Risk levels: 4 Low, 3 Medium.

  P0003 (Method, Medium)      Reframe Before Trimming
  P0004 (Pattern, Low)        Docs Proxy - Canon-as-Tool
  P0005 (Principle, Low)      Async by Default for Long-Running Tools
  P0006 (Amend Vodka, Med)    Boundary Enumeration as Spec Convention
  P0007 (Amend DoD, Low)      DoD as Agent-Observable Behaviors
  P0008 (Amend Gate, Med)     Validator DOLCHEO Ledger as Deliverable
  P0009 (Amend Vocab, Low)    DOLCHEO+H Anti-Pattern Callout

Pre-push gauntlet:
- Audit 1 (link-rot): 8/8 existing canon URIs resolve, 3 new URIs
  correctly unresolved (will resolve on acceptance)
- Audit 2 (adjacent-canon): derives_from chains cite all matching
  adjacent canon (doing-less-enables-more, partial-data, vodka-arch,
  dolcheo-vocabulary, release-validation-gate, specs-lock, etc.)
- Audit 3 (frontmatter): 9/9 files type-discipline-compliant
- oddkit_challenge against P0003 (highest stakes): canon constraints
  surfaced; doing-less-enables-more cited as the structural empirical
  claim; P0003 positioned as the operational complement (method, not
  principle)

Per docs/promotions/README.md: each promotion is a proposal, not canon.
Reviewer triage expected.
---
 .../P0003-reframe-before-trimming.md          | 154 ++++++++++++++++++
 .../P0004-docs-proxy-canon-as-tool.md         | 151 +++++++++++++++++
 ...async-by-default-for-long-running-tools.md | 152 +++++++++++++++++
 ...boundary-enumeration-as-spec-convention.md | 123 ++++++++++++++
 ...P0007-dod-as-agent-observable-behaviors.md | 129 +++++++++++++++
 ...validator-dolcheo-ledger-as-deliverable.md | 137 ++++++++++++++++
 ...dolcheo-not-dolcheo-plus-h-anti-pattern.md | 125 ++++++++++++++
 7 files changed, 971 insertions(+)
 create mode 100644 docs/promotions/P0003-reframe-before-trimming.md
 create mode 100644 docs/promotions/P0004-docs-proxy-canon-as-tool.md
 create mode 100644 docs/promotions/P0005-async-by-default-for-long-running-tools.md
 create mode 100644 docs/promotions/P0006-vodka-boundary-enumeration-as-spec-convention.md
 create mode 100644 docs/promotions/P0007-dod-as-agent-observable-behaviors.md
 create mode 100644 docs/promotions/P0008-pr-validator-dolcheo-ledger-as-deliverable.md
 create mode 100644 docs/promotions/P0009-dolcheo-not-dolcheo-plus-h-anti-pattern.md

diff --git a/docs/promotions/P0003-reframe-before-trimming.md b/docs/promotions/P0003-reframe-before-trimming.md
new file mode 100644
index 0000000..630a95e
--- /dev/null
+++ b/docs/promotions/P0003-reframe-before-trimming.md
@@ -0,0 +1,154 @@
+---
+uri: klappy://docs/promotions/P0003-reframe-before-trimming
+title: "P0003: Reframe Before Trimming — When a Tool Surface Feels Bloated, Question the Frame"
+audience: docs
+exposure: nav
+tier: 3
+voice: neutral
+stability: evolving
+tags: ["promotions", "proposed", "mcp-server", "tool-surface", "refactoring", "vodka-architecture", "doing-less"]
+promotion_status: proposed
+---
+
+# P0003: Reframe Before Trimming — When a Tool Surface Feels Bloated, Question the Frame
+
+> When an MCP server's tool surface feels bloated, the bloat is usually in the frame the surface implies, not in the tool count. Reframe first; the trim follows mechanically.
+ +## Observed Pattern + +An MCP server's tool count drifts upward over time as features accrete. At some point the surface "feels wrong" — too many tools, too much overlap, too many edges to defend in PR review. The instinct is to ask "which tools can we cut?" and pick the weakest-justified ones. That move trims symptoms but preserves the conceptual model that produced the bloat. + +The pattern observed across server projects is that the right move at this moment is structurally different: **suspect the frame the surface implies, not the count.** When the frame is wrong, the tool count collapses mechanically once the frame is corrected. When the frame is right and the count still feels high, the discomfort is usually about something else — vodka-boundary enforcement, async shape, or consumer-side wiring friction — and trimming will not address it. + +- Affects: any MCP server going through scope review or v-bump refactoring +- Outcome without the move: smaller surface that still embodies the wrong model; conceptual debt persists; bloat returns at the next feature wave +- Outcome with the move: surface that fits the actual model; the trim is a side effect, not the goal + +## Evidence + +| Validation Session | Date | Outcome | Notes | +| --- | --- | --- | --- | +| `klappy/PTXprint-MCP` v1.0 → v1.1 | 2026-Q1 | Trimmed without reframing | 17 → 7 tools by cutting features; surface still modeled the server as a project filesystem | +| `klappy/PTXprint-MCP` v1.1 → v1.2 | 2026-Q2 | Reframed, then trimmed | Reframed PTXprint as a pure function `(config, sources, fonts) → PDF` with content-addressed cache; tool count collapsed to 3 mechanically | +| `klappy/agent-messaging-service` hosted /mcp planning | 2026-05-03 | Pre-emptive reframe applied | Same diagnostic move applied during planning; spec landed on 6 tools per `mcp-wrapper-conformance-for-conversational-ai` rather than drifting to 8+ via "while we're here" additions | + +**Total observations**: 3 across 2 
independent server projects +**Independent occurrences**: 2 distinct repositories +**Affected workflows**: MCP server v-bump refactoring, hosted protocol-endpoint planning + +## Current Handling + +- **Detection today**: each server project re-derives the discipline through some combination of `vodka-architecture` review, `kiss-simplicity-is-the-ceiling` pressure, and operator intuition during PR review +- **Guidance**: there is no named diagnostic move that surfaces "the count is a symptom; the frame is the cause" before the trim instinct fires +- **Closest existing canon**: `canon/principles/doing-less-enables-more.md` is the structural empirical claim about why thin substrates win, and its smell test catches additions ("while we're here, the substrate could just…"). It does not address the post-hoc case where the surface has already drifted and the bloat is observable + +This promotion fills the operational gap: when an existing surface has already drifted, what is the diagnostic move? + +## Proposed Promotion + +### Target Document + +`canon/methods/reframe-before-trimming.md` (new) + +Method, not principle. The principle space is occupied by `doing-less-enables-more` (the structural claim) and `vodka-architecture` (the discipline). This document operationalizes a specific diagnostic move during refactoring. + +### Section + +Whole document; new file. 
+ +### Proposed Language + +```markdown +--- +uri: klappy://canon/methods/reframe-before-trimming +title: "Reframe Before Trimming — When a Tool Surface Feels Bloated, Question the Frame" +audience: canon +exposure: nav +tier: 2 +voice: neutral +stability: evolving +tags: ["canon", "method", "refactoring", "tool-surface", "mcp-server", "vodka-architecture", "doing-less", "diagnosis"] +derives_from: + - klappy://canon/principles/doing-less-enables-more + - klappy://canon/principles/vodka-architecture + - klappy://canon/principles/kiss-simplicity-is-the-ceiling +complements: + - klappy://canon/methods/pivot-on-inversion +status: active +--- + +# Reframe Before Trimming + +> When an MCP server's tool surface feels bloated, the bloat is usually in the frame the surface implies, not in the tool count. Reframe first; the trim follows mechanically. + +## When This Method Applies + +A surface feels bloated. The instinct is to cut tools. This method says: pause that instinct. + +Specifically, this method applies when: + +- An existing MCP server has accrued tools across multiple versions +- Reviewers describe the surface as "too many tools" or "feels overlapping" +- A refactor is being scoped that will trim the count + +## The Method + +1. **Do not ask "which tools can we cut?"** +2. **Ask "is the frame this surface implies actually correct?"** Write down what mental model a fresh consumer would build by reading the tool list cold. Compare it against what the server actually does in the world. +3. **If the frame is wrong, fix the frame.** Restate the server's job in one sentence that matches reality. The tool count usually collapses mechanically because tools justified only by the wrong frame stop being justified. +4. 
**If the frame is right and the tool count still feels high**, the discomfort is probably elsewhere — vodka-boundary leakage (`canon/principles/vodka-architecture.md`), async shape mismatch (`canon/principles/async-by-default-for-long-running-tools.md` if proposed), or consumer-side wiring friction. Diagnose that, not the count. + +## Failure Mode — Trimming Without Reframing + +Cutting tools without reframing produces a smaller surface that still embodies the wrong model. The tool count drops; the conceptual debt stays. Bloat returns at the next feature wave because the model's gravity pulls toward the same shape. + +## Receipts + +- **PTXprint-MCP v1.0 → v1.2.** v1.0 had 17 tools modeling the server as a project filesystem. v1.1 trimmed to 7 by cutting features but kept the filesystem frame. v1.2 reframed the server as a pure function `(config, sources, fonts) → PDF` with content-addressed cache. Tool count collapsed to 3 — `submit_typeset`, `get_job_status`, `cancel_job` — without functionality loss. The trim was a side effect of the reframe, not its goal. +- *(Receipt pattern: each future application adds one row — server, before-count, after-count, the reframe in one sentence. Dense, not narrative.)* + +## Relationship to Adjacent Canon + +This method is the operational complement to `canon/principles/doing-less-enables-more`. That principle is the structural empirical claim about why thin substrates win and catches NOT-ADOPTING-new-opinions through its smell test. This method addresses the post-hoc case: the substrate already drifted; the bloat is observable; what now? + +`canon/methods/pivot-on-inversion` is adjacent but different — that method is about recovery when an iteration's gradient turns negative. This method is about diagnosis when a surface's tool count feels wrong. Pivot-on-inversion answers "should we keep going?"; reframe-before-trimming answers "what should we change first?" +``` + +### Rationale + +The principle layer is occupied. 
`doing-less-enables-more` (2026-05-02) makes the structural claim; `vodka-architecture` defines the discipline; `kiss-simplicity-is-the-ceiling` constrains surface area at construction. None of them address the post-drift diagnostic move. This is a method-shaped gap, not a principle-shaped one — operationalizing a specific decision sequence during refactoring. + +Placing this in `canon/methods/` rather than `canon/principles/` is deliberate. It sits next to `pivot-on-inversion` (also a method-shaped recovery procedure) and clearly relates to the principles via derives_from rather than competing with them. + +## Risk Assessment + +| Risk Level | Description | +| --- | --- | +| Low | Clarifies existing rule, no scope change | +| **Medium** | **Adds new method, may affect refactoring workflows** | +| High | Changes existing behavior, requires migration | + +**Risk level**: Medium + +**Mitigation**: The method is opt-in diagnostic guidance for refactoring, not a hard requirement. It does not block any workflow; it offers a sequencing rule for surface-level refactors. Adoption can be gradual — cited in PR descriptions when relevant, not enforced by gate. 
+ +## Status + +`proposed` + +## Review Notes + +(To be filled during review) + +- **Reviewer**: +- **Decision**: +- **Date**: +- **Notes**: + +## Execution Record + +(To be filled after acceptance) + +- **Commit**: +- **Canon doc updated**: `canon/methods/reframe-before-trimming.md` +- **Backlink added**: Yes / No diff --git a/docs/promotions/P0004-docs-proxy-canon-as-tool.md b/docs/promotions/P0004-docs-proxy-canon-as-tool.md new file mode 100644 index 0000000..fcf3b09 --- /dev/null +++ b/docs/promotions/P0004-docs-proxy-canon-as-tool.md @@ -0,0 +1,151 @@ +--- +uri: klappy://docs/promotions/P0004-docs-proxy-canon-as-tool +title: "P0004: Docs Proxy — Canon-as-Tool So Consumers Wire One MCP, Not Two" +audience: docs +exposure: nav +tier: 3 +voice: neutral +stability: evolving +tags: ["promotions", "proposed", "mcp-server", "docs-proxy", "canon", "vodka-architecture", "consumer-experience"] +promotion_status: proposed +--- + +# P0004: Docs Proxy — Canon-as-Tool So Consumers Wire One MCP, Not Two + +> An MCP server that depends on a sibling canon repo for its domain knowledge SHOULD expose a `docs(query, ...)` tool that proxies the canon-server (oddkit), parameterized to its own repo. Consumers get both action tools and canon retrieval through one MCP wiring. + +## Observed Pattern + +Multiple MCP servers in this program depend on a sibling canon repo for governance and domain knowledge. Without intervention, every consumer of those servers must wire two MCPs to get the full surface: the action server, and the canon server (oddkit) configured with the action server's repo as `knowledge_base_url`. + +The wire-two-MCPs tax is paid by every consumer, every time. Some consumers pay it; many do not. Those who do not pay it lose access to live canon at exactly the moment they need it most — mid-task, in-flow — and either abandon the canon-as-living-context value proposition or fall back to opening the canon repo's GitHub web UI. 
+ +The pattern observed is that one server-side decision — exposing a thin `docs` proxy tool that POSTs to the canon-server with the action server's repo URL — eliminates the tax for every current and future consumer. + +- Affects: every consumer of every MCP server with a sibling canon repo +- Outcome without the tool: per-consumer onboarding friction; reduced canon-in-the-loop usage; canon drifts from living context to web-search-of-last-resort +- Outcome with the tool: one MCP wired, both surfaces accessible, canon stays in-flow + +## Evidence + +| Validation Session | Date | Outcome | Notes | +| --- | --- | --- | --- | +| `klappy/PTXprint-MCP` v1.0 D-004 | 2026-Q1 | "No retrieval in MCP server" — original decision | Server boundary kept thin; consumers expected to wire oddkit separately | +| `klappy/PTXprint-MCP` v1.2 session 13 | 2026-04-29 | Decision reversed; `docs(query, audience?, depth?)` added as 4th tool | Reversal rationale recorded in v1.2 spec: downstream agents (e.g., BT Servant) want one MCP, not two; vodka boundary preserved (the proxy holds zero domain semantics) | +| `klappy/agent-messaging-service` hosted /mcp planning | 2026-05-03 | Same gap re-identified | A Claude Desktop user wiring AMS-MCP today must also wire oddkit-MCP with `knowledge_base_url=agent-messaging-service` to get `ams://canon/...` retrieval. 
An `ams_docs` tool would absorb the tax once, server-side | + +**Total observations**: 3 across 2 independent server projects +**Independent occurrences**: 2 distinct repositories, with the second project re-encountering the question without prior knowledge of the first project's resolution +**Affected workflows**: every consumer onboarding path for every oddkit-pattern MCP server in this program + +## Current Handling + +- **Detection today**: per-consumer friction is silent — operators notice it when they themselves try to onboard a new agent and discover they need to wire two MCPs +- **Workaround today**: consumers either accept the two-MCP tax or skip canon-in-flow entirely +- **Closest existing canon**: `canon/principles/consistency-same-pattern-every-time.md` says "the server behaves identically regardless of what knowledge base it serves" — that is the *same-server-many-knowledge-bases* axis. This promotion addresses the orthogonal axis: *many-servers-one-knowledge-base, single consumer wiring*. Both axes deserve coverage + +## Proposed Promotion + +### Target Document + +`canon/patterns/docs-proxy-canon-as-tool.md` (new) + +### Section + +Whole document; new file. 
+ +### Proposed Language + +```markdown +--- +uri: klappy://canon/patterns/docs-proxy-canon-as-tool +title: "Docs Proxy — Canon-as-Tool So Consumers Wire One MCP" +audience: canon +exposure: nav +tier: 2 +voice: neutral +stability: evolving +tags: ["canon", "pattern", "mcp-server", "docs-proxy", "consumer-experience", "vodka-architecture"] +derives_from: + - klappy://canon/principles/vodka-architecture + - klappy://canon/principles/consistency-same-pattern-every-time + - klappy://canon/principles/dry-canon-says-it-once +complements: + - klappy://canon/principles/doing-less-enables-more +status: active +--- + +# Docs Proxy — Canon-as-Tool + +> An MCP server whose action surface depends on a sibling canon repo for domain semantics SHOULD expose a `docs(query, audience?, depth?)` tool that proxies the canon-server (oddkit), parameterized to its own repo. Consumers get both surfaces through one MCP wiring. + +## The Pattern + +The action MCP server adds one tool — typically named `docs` or `_docs` — whose entire job is to forward queries to the canon-server with the action server's own repo URL pinned as the `knowledge_base_url` parameter. + +- **Inputs**: `query` (required), `audience` (optional, server-defined enum), `depth` (optional `1|2|3` — snippet, full top doc, top + next two) +- **Returns**: `{ answer, sources[], deeper[], governance_source }` +- **Failure**: graceful degradation when the canon-server is unreachable — `{ answer: null, sources: [], governance_source: "minimal", error }` rather than a hard error that blocks consumer flow + +## Vodka Check the Tool Must Pass + +The proxy tool knows exactly two URLs: this server's canon repo and the canon-server's MCP endpoint. It holds zero domain semantics. It does not parse, rank, filter, score, or reframe results. It is a pinned forwarding layer. 
+ +If the proxy ever grows a domain-flavored taxonomy or a scoring tweak, that taxonomy moves into governance documents in the canon repo (which the canon-server retrieves through the same proxy), not into the tool's implementation. Domain logic in the proxy is a vodka-boundary leak. + +## Why the Pattern Exists + +Without the pattern, every consumer pays a wire-two-MCPs tax: one MCP for actions, a second MCP for canon retrieval, configured with the action server's repo as `knowledge_base_url`. Consumers who do not pay the tax lose canon-in-flow access at the moment they need it most. The pattern absorbs the tax once, server-side, for every present and future consumer. + +This pattern is orthogonal to `canon/principles/consistency-same-pattern-every-time`, which covers the *same-server-many-knowledge-bases* axis. This pattern covers *many-servers-one-knowledge-base*. Both are real; both deserve canon coverage. + +## Failure Mode + +Without this pattern: per-consumer onboarding friction; canon drifts from living context to web-search-of-last-resort; the canon-as-living-context value proposition silently degrades because consumers never wire it. + +With the pattern: one MCP wired, both surfaces accessible; canon stays in-flow as designed. + +## Receipts + +- **PTXprint-MCP v1.2 §3 `docs` tool.** Added in session 13 (2026-04-29), reversing v1.0's "no retrieval in MCP server" decision with explicit rationale: downstream agents like BT Servant want one MCP wiring. Vodka boundary preserved. +- *(Future receipts: each server adopting the pattern adds one row — server, tool name, date adopted, link to spec section.)* +``` + +### Rationale + +The pattern's content has been independently re-derived in two server projects within ~6 weeks of each other. The cost of writing it once into canon is small; the cost of letting every future server's session re-derive it is paid every time. 
This is the `dry-canon-says-it-once` shape — say it once at canon, every future server reads it at preflight. + +The placement under `canon/patterns/` rather than `canon/principles/` is deliberate. This is a concrete server-implementation pattern (specific tool shape, specific call mechanics), not a structural claim. `canon/patterns/` does not currently exist as a directory in the canon tree; this would establish it. If reviewers prefer to keep the directory namespace tighter, the document can land under `canon/methods/` instead with no content change. + +## Risk Assessment + +| Risk Level | Description | +| --- | --- | +| **Low** | **Clarifies existing rule, no scope change** | +| Medium | Adds new requirement, may affect workflows | +| High | Changes existing behavior, requires migration | + +**Risk level**: Low + +**Mitigation**: The pattern is opt-in for each server. Servers that do not depend on a sibling canon repo (none today, but hypothetical) need not adopt it. Adoption is per-server, gradual, additive. 
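The proxy shape in the proposed language above (pinned `knowledge_base_url`, pass-through results, graceful degradation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PTXprint-MCP implementation: the `forward` callable stands in for whatever transport actually reaches the canon-server, and `KNOWLEDGE_BASE_URL`, its placeholder value, and the payload keys other than `knowledge_base_url` are hypothetical.

```python
from typing import Callable, Optional

# Hypothetical: the single canon repo this action server pins.
KNOWLEDGE_BASE_URL = "https://example.invalid/your-org/your-canon-repo"


def docs(
    query: str,
    forward: Callable[[dict], dict],
    audience: Optional[str] = None,
    depth: int = 1,
) -> dict:
    """Forward a canon query unchanged and degrade gracefully on failure.

    `forward` is an assumed stand-in for the call to the canon-server.
    The proxy only adds the pinned repo URL; it never parses, ranks,
    filters, or reframes what comes back.
    """
    payload = {
        "query": query,
        "depth": depth,
        "knowledge_base_url": KNOWLEDGE_BASE_URL,
    }
    if audience is not None:
        payload["audience"] = audience
    try:
        # Pass the canon-server's response through untouched.
        return forward(payload)
    except Exception as exc:
        # Canon-server unreachable: return the degraded shape rather
        # than raising, so the consumer's flow is not blocked.
        return {
            "answer": None,
            "sources": [],
            "governance_source": "minimal",
            "error": str(exc),
        }
```

Taking the transport as a parameter keeps the sketch testable without a network and makes the vodka check visible: the function body contains the pinned URL and a forwarding call, and nothing domain-specific.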
+
+## Status
+
+`proposed`
+
+## Review Notes
+
+(To be filled during review)
+
+- **Reviewer**:
+- **Decision**:
+- **Date**:
+- **Notes**:
+
+## Execution Record
+
+(To be filled after acceptance)
+
+- **Commit**:
+- **Canon doc updated**: `canon/patterns/docs-proxy-canon-as-tool.md` (or `canon/methods/...` if directory placement is reconsidered)
+- **Backlink added**: Yes / No
diff --git a/docs/promotions/P0005-async-by-default-for-long-running-tools.md b/docs/promotions/P0005-async-by-default-for-long-running-tools.md
new file mode 100644
index 0000000..5ec1ddf
--- /dev/null
+++ b/docs/promotions/P0005-async-by-default-for-long-running-tools.md
@@ -0,0 +1,152 @@
+---
+uri: klappy://docs/promotions/P0005-async-by-default-for-long-running-tools
+title: "P0005: Async by Default — Long-Running MCP Tools Return an Identifier, Never Block"
+audience: docs
+exposure: nav
+tier: 3
+voice: neutral
+stability: evolving
+tags: ["promotions", "proposed", "mcp-server", "async", "long-running", "job-id", "polling", "latency"]
+promotion_status: proposed
+---
+
+# P0005: Async by Default — Long-Running MCP Tools Return an Identifier, Never Block
+
+> Any MCP tool whose work could exceed ~5 seconds wall-clock returns an identifier within that budget and places the long-running work behind a separate read tool. No tool blocks for the duration of work.
+
+## Observed Pattern
+
+MCP servers in this program ship tools that wrap potentially-long-running work — typesetting jobs (30+ minutes worst case), large message sends, fanout retrieval. Without an explicit async convention, every server reasons through the same problem from scratch — platform timeouts (Workers `ctx.waitUntil`, Container `sleepAfter`), polling cadence, cancellation semantics — and arrives at the same three-tool shape: `<action>` returns an id; `get_<action>_status` polls; `cancel_<action>` requests cancellation.
+
+The pattern observed is that this is not project-specific reasoning. It is the canonical async shape for MCP work.
Codifying it once means future servers read it at preflight rather than re-deriving it under time pressure. + +- Affects: any MCP server with action tools whose work could exceed ~5 seconds +- Outcome without the convention: each server arrives at the same shape independently, sometimes with subtle inconsistencies (different field names, different cancellation semantics) +- Outcome with the convention: consistent shape across servers; consumers can pattern-match across any compliant server + +## Evidence + +| Validation Session | Date | Outcome | Notes | +| --- | --- | --- | --- | +| `klappy/PTXprint-MCP` v1.2 typesetting | 2026-Q2 | Three-tool shape adopted | `submit_typeset` returns `job_id` within seconds via `ctx.waitUntil(fetch(...))`; `get_job_status` polls; `cancel_job` flips a Durable Object flag the worker polls | +| `klappy/agent-messaging-service` hosted /mcp planning | 2026-05-03 | Same shape arrived at independently | `ams_send` returns when the wire accepts the frame, not when peers receive; `ams_recv` is the explicit poll path with a 5–10s long-poll cap per `ams://canon/constraints/mcp-wrapper-conformance-for-conversational-ai` latency budget | + +**Total observations**: 2 across 2 independent server projects +**Independent occurrences**: 2 distinct repositories, with no cross-pollination of decision rationale at the time of arrival +**Affected workflows**: every long-running MCP tool implementation + +## Current Handling + +- **Detection today**: each server's author/agent reasons through platform constraints and arrives at the shape independently +- **Closest existing canon**: `canon/principles/partial-data-with-transparency-and-background-warm.md` (2026-04-24) covers the user-blocking-path-must-not-block-on-corpus-scan case for *read* operations. Three load-bearing properties: bounded blocking path, background warm, structured disclosure. 
That principle does not address long-running *action* tools (jobs that produce side effects, jobs the consumer wants to cancel mid-flight)
+- **Gap**: no canon doc says "if it could take >5s, the answer is always job_id+poll+cancel, and here's why"
+
+## Proposed Promotion
+
+### Target Document
+
+`canon/principles/async-by-default-for-long-running-tools.md` (new)
+
+### Section
+
+Whole document; new file.
+
+### Proposed Language
+
+```markdown
+---
+uri: klappy://canon/principles/async-by-default-for-long-running-tools
+title: "Async by Default — Long-Running MCP Tools Return an Identifier, Never Block"
+audience: canon
+exposure: nav
+tier: 2
+voice: neutral
+stability: evolving
+tags: ["canon", "principle", "mcp-server", "async", "long-running", "latency", "vodka-architecture"]
+derives_from:
+  - klappy://canon/principles/partial-data-with-transparency-and-background-warm
+  - klappy://canon/principles/vodka-architecture
+  - klappy://canon/values/axioms
+complements:
+  - klappy://canon/principles/partial-data-with-transparency-and-background-warm
+status: active
+---
+
+# Async by Default — Long-Running MCP Tools Return an Identifier, Never Block
+
+> Any MCP tool whose work could exceed ~5 seconds wall-clock returns an identifier within that budget and places the long-running work behind a separate read tool. No tool blocks for the duration of work.
+
+## The Principle
+
+Three minimum-viable tools result for any long-running action:
+
+1. `<action>(...)` — submit; returns identifier within ~5 seconds
+2. `get_<action>_status(id)` — poll; returns current state, progress, and result-when-complete
+3. `cancel_<action>(id)` — request cancellation; returns ack
+
+Notification-style push (server-pushed events on supported transports) is additive. The polling tool remains the canonical floor so consumers on poll-only transports work too.
+ +## Latency Budget Recommendation + +- **Submission tool returns**: ≤ 1s median, ≤ 5s p99 +- **Status read tool returns**: ≤ 1s median (state read, never reaches the worker that does the work) +- **Notification delivery (when present)**: ≤ 1s median, ≤ 5s p99 +- **Long-poll fallback**: ≤ 5s p99 round-trip + +## Failure Mode — Blocking the Consumer + +A tool that blocks for 30 minutes ties up the consumer's MCP session, hides progress, breaks cancellation, and forces every consumer host to implement timeout/retry around it. Returning an identifier immediately keeps the wire predictable and the consumer in control of when to ask for results. + +The shape also keeps the *server* in control of how long the work continues if the consumer disconnects. With a blocking tool, the work dies on disconnect; with the async shape, the work continues, the cache populates, and the next consumer's request finds the result without re-running the work. + +## Relationship to Adjacent Canon + +`canon/principles/partial-data-with-transparency-and-background-warm` is the read-side complement: the user-blocking *read* path must not block on a corpus scan; return what's already observed, schedule the rest in the background, disclose what's missing. This principle is the action-side: the user-blocking *action* path must not block for the duration of the work; return an identifier, expose poll+cancel, let the consumer drive their own attention. + +Both principles share the underlying axiom: the consumer's blocking time is a budget the substrate must spend frugally. + +## Receipts + +- **PTXprint-MCP v1.2 typesetting.** `submit_typeset` / `get_job_status` / `cancel_job` triad. Worker → `ctx.waitUntil(fetch())` → Container → DO state. 30-minute jobs do not block the consumer's MCP session at any point. +- **AMS hosted /mcp.** `ams_send` returns on wire-accept, not peer-receive. `ams_recv` is the explicit poll path with a 5–10s long-poll cap. `ams_leave` is the cancellation path. Same shape. 
+- *(Future receipts: each compliant server adds one row — server, action tool, status tool, cancel tool, observed median submit latency.)* +``` + +### Rationale + +The shape is the same in both server projects despite no cross-pollination. That convergence is the signal that this is the canonical pattern. Without canon, every future server's planning session re-derives it. With canon, preflight surfaces it. + +The principle is distinct from `partial-data-with-transparency-and-background-warm`: that doc is about *read-side* corpus scans (return partial, warm in background); this doc is about *action-side* long-running work (return id, poll for completion). Both belong; they are complementary, not duplicative. + +## Risk Assessment + +| Risk Level | Description | +| --- | --- | +| **Low** | **Clarifies existing rule, no scope change** | +| Medium | Adds new requirement, may affect workflows | +| High | Changes existing behavior, requires migration | + +**Risk level**: Low + +**Mitigation**: The principle is implementation guidance, not enforcement. Existing tools that block ≤5s are unaffected. New tools that may exceed 5s gain a named pattern to follow. Adoption can be incremental. 
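The submit / status / cancel triad proposed above can be sketched in a few lines. This is an in-process illustration only, assuming a background thread and an in-memory dictionary as stand-ins for the Worker, Container, and Durable Object pieces named in the receipts; `JobStore` and its method names are hypothetical, not taken from either server.

```python
import threading
import uuid
from typing import Callable, Dict


class JobStore:
    """In-memory stand-in for durable job state (hypothetical).

    A real server would keep this state somewhere that outlives the
    submitting request; a dict and a thread are used here for brevity.
    """

    def __init__(self) -> None:
        self._jobs: Dict[str, dict] = {}
        self._lock = threading.Lock()

    def submit(self, work: Callable[[Callable[[], bool]], object]) -> str:
        """Start `work` in the background and return an identifier at once."""
        job_id = uuid.uuid4().hex
        cancel_flag = threading.Event()
        job = {"state": "running", "result": None, "cancel": cancel_flag}
        with self._lock:
            self._jobs[job_id] = job

        def run() -> None:
            try:
                # The worker receives a callable it can poll for cancellation.
                result = work(cancel_flag.is_set)
                with self._lock:
                    if cancel_flag.is_set():
                        job["state"] = "cancelled"
                    else:
                        job["state"], job["result"] = "succeeded", result
            except Exception as exc:
                with self._lock:
                    job["state"], job["result"] = "failed", str(exc)

        thread = threading.Thread(target=run, daemon=True)
        job["thread"] = thread
        thread.start()
        return job_id

    def get_status(self, job_id: str) -> dict:
        """Read recorded state only; never waits on the worker."""
        with self._lock:
            job = self._jobs[job_id]
            return {"job_id": job_id, "state": job["state"], "result": job["result"]}

    def cancel(self, job_id: str) -> dict:
        """Set a flag the worker polls, then acknowledge immediately."""
        with self._lock:
            self._jobs[job_id]["cancel"].set()
        return {"job_id": job_id, "cancel_requested": True}
```

Cancellation here is cooperative: `cancel` only records the request, and the work stops when it next checks the flag, which mirrors the "flips a flag the worker polls" description in the evidence table.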
+ +## Status + +`proposed` + +## Review Notes + +(To be filled during review) + +- **Reviewer**: +- **Decision**: +- **Date**: +- **Notes**: + +## Execution Record + +(To be filled after acceptance) + +- **Commit**: +- **Canon doc updated**: `canon/principles/async-by-default-for-long-running-tools.md` +- **Backlink added**: Yes / No diff --git a/docs/promotions/P0006-vodka-boundary-enumeration-as-spec-convention.md b/docs/promotions/P0006-vodka-boundary-enumeration-as-spec-convention.md new file mode 100644 index 0000000..817baf9 --- /dev/null +++ b/docs/promotions/P0006-vodka-boundary-enumeration-as-spec-convention.md @@ -0,0 +1,123 @@ +--- +uri: klappy://docs/promotions/P0006-vodka-boundary-enumeration-as-spec-convention +title: "P0006: Vodka Boundary Enumeration — Specs Must List What the Server Knows, Doesn't Know, and Is NOT" +audience: docs +exposure: nav +tier: 3 +voice: neutral +stability: evolving +tags: ["promotions", "proposed", "vodka-architecture", "spec-convention", "boundary", "non-goals", "amendment"] +promotion_status: proposed +--- + +# P0006: Vodka Boundary Enumeration — Specs Must List What the Server Knows, Doesn't Know, and Is NOT + +> Every MCP server spec written under vodka-architecture MUST include three enumerated sections: "What the server knows," "What the server does NOT know," and "What this server is NOT." Implicit boundaries get violated at PR time; enumerated boundaries get cited in PR review. + +## Observed Pattern + +`canon/principles/vodka-architecture.md` defines the discipline philosophically. Specs that follow vodka-architecture get the boundary right by author intuition plus reviewer pressure. Specs that don't, sprawl. The discipline is implicit; PR-level enforcement is by-feel. + +The pattern observed is that author intuition alone is insufficient at PR-review time. 
A boundary kept in author memory gets violated when the author rotates out, when a "small addition" PR lands, or when the spec is re-implemented by an agent that does not have the original author's mental model. A boundary written down — as enumerated bullet lists in the spec — survives those transitions because future PR authors must rebut a bulleted non-goal to expand scope, which is asymmetrically harder than arguing for a useful-sounding addition. + +- Affects: every MCP server spec written under vodka-architecture +- Outcome without enumeration: implicit boundaries drift across PRs; "small additions" cumulatively violate the original frame; the next vodka rewrite has to undo additions one by one +- Outcome with enumeration: boundary visible in the spec itself; PR reviewers can cite specific bullets; scope expansion requires explicit rebuttal of an existing non-goal + +## Evidence + +| Validation Session | Date | Outcome | Notes | +| --- | --- | --- | --- | +| `klappy/PTXprint-MCP` v1.0 (boundary implicit) | 2026-Q1 | Boundary drifted | v1.0 spec had no enumerated boundary; tools accreted across releases until v1.1 had 17 tools modeling the server as a project filesystem | +| `klappy/PTXprint-MCP` v1.2 §1 (boundary enumerated) | 2026-Q2 | Boundary held | v1.2 §1 enumerates 5 things the server knows (payload schema, dispatch, R2 presigning, DO state, content-addressed cache lookup), 4 things it does not (cfg semantics, font logic, config inheritance, override resolution), and a separate "What this server is NOT" section. Subsequent PRs cite §1 in review | +| `klappy/agent-messaging-service` `mcp-wrapper-conformance-for-conversational-ai` constraint | 2026-04+ | Philosophy stated, enumeration partial | "The wrapper does not parse `data`; it forwards opaque bytes" is one bullet. Full boundary enumeration not yet in spec form. 
Hosted /mcp planning explicitly identified the gap | + +**Total observations**: 3 across 2 independent server projects +**Independent occurrences**: 2 distinct repositories +**Affected workflows**: every spec author writing a vodka-architecture-compliant spec + +## Current Handling + +- **Detection today**: vodka-architecture (existing canon) describes the boundary philosophically with examples; specs follow it by author intuition; PR review surfaces drift case-by-case +- **Closest existing canon**: `canon/principles/vodka-architecture.md` (the philosophy) and `canon/principles/doing-less-enables-more.md` (the empirical claim about why thinness wins). Neither codifies the *spec-section convention* — what specific sections a spec must contain to prove it observes vodka discipline +- **Gap**: vodka-architecture says "the server should be thin." It does not say "specs MUST include these enumerated sections." The convention is missing one altitude down + +## Proposed Promotion + +### Target Document + +`canon/principles/vodka-architecture.md` — append a new section. + +### Section + +`## Spec Convention — The Boundary Must Be Enumerated` (new section near the end of the existing doc, before any "See Also" / footer sections) + +### Proposed Language + +```markdown +## Spec Convention — The Boundary Must Be Enumerated + +A server claiming to follow vodka-architecture MUST include three enumerated sections in its spec: + +### Boundary Section + +`## What This Server Knows` — bullet list of every state, schema, or external resource the server understands or holds. + +`## What This Server Does NOT Know` — bullet list of the domain semantics this server defers to canon or the consumer environment. **This list IS the vodka boundary, written down.** + +### Non-Goals Section + +`## What This Server Is NOT` — bullet list of the categories of responsibility this server explicitly refuses. Each item is a statement future PR authors must rebut to expand scope. 
+ +### Why Enumeration Matters + +A boundary kept in author intuition gets violated when the author rotates out. A boundary written down survives the rotation. PR authors proposing a new tool must argue against an enumerated non-property — asymmetrically harder than arguing for a useful-sounding addition. + +### Failure Mode + +Implicit boundaries drift. After a few PRs, the server has accumulated "small additions" that each individually felt fine but cumulatively violate the original vodka frame. The next vodka rewrite has to undo those additions one by one. Enumeration prevents the drift at the PR-review surface, where it is cheapest to catch. + +### Receipts + +- `klappy/PTXprint-MCP` v1.2 §1 — enumerated boundary + non-goals, with explicit attribution that the v1.0 → v1.1 sprawl happened because the v1.0 boundary was implicit. +- *(Each subsequent vodka-compliant spec adds a row pointing at its boundary section.)* +``` + +### Rationale + +The amendment sharpens vodka-architecture from philosophy to convention. It does not change *what* vodka means; it specifies *how* a spec proves it observes vodka. Same enforcement model the canon doc already uses (smell tests, design pressure), now with a concrete spec-shape requirement that reviewers can cite directly. + +Placed at the end of vodka-architecture.md (near "See Also" but above it) so the philosophy reads first and the convention lands as a concrete operationalization. + +## Risk Assessment + +| Risk Level | Description | +| --- | --- | +| Low | Clarifies existing rule, no scope change | +| **Medium** | **Adds new requirement, may affect workflows** | +| High | Changes existing behavior, requires migration | + +**Risk level**: Medium + +**Mitigation**: Existing specs without enumerated boundaries are not retroactively invalid. The convention applies to new specs and v-bump rewrites. Reviewers may cite the convention as "missing — recommended for next rewrite" rather than as a hard merge blocker. 
Adoption is gradual and per-spec. + +## Status + +`proposed` + +## Review Notes + +(To be filled during review) + +- **Reviewer**: +- **Decision**: +- **Date**: +- **Notes**: + +## Execution Record + +(To be filled after acceptance) + +- **Commit**: +- **Canon doc updated**: `canon/principles/vodka-architecture.md` +- **Backlink added**: Yes / No diff --git a/docs/promotions/P0007-dod-as-agent-observable-behaviors.md b/docs/promotions/P0007-dod-as-agent-observable-behaviors.md new file mode 100644 index 0000000..e61385d --- /dev/null +++ b/docs/promotions/P0007-dod-as-agent-observable-behaviors.md @@ -0,0 +1,129 @@ +--- +uri: klappy://docs/promotions/P0007-dod-as-agent-observable-behaviors +title: "P0007: Spec DoD Must Be 5–7 Agent-Observable Behaviors, Not Implementation Milestones" +audience: docs +exposure: nav +tier: 3 +voice: neutral +stability: evolving +tags: ["promotions", "proposed", "definition-of-done", "spec-convention", "agent-observable", "amendment"] +promotion_status: proposed +--- + +# P0007: Spec DoD Must Be 5–7 Agent-Observable Behaviors, Not Implementation Milestones + +> When a software spec ships, its Definition of Done section MUST list 5–7 things the consumer/agent can observably do once the work lands, not implementation milestones. "Tools land," "tests pass," and "code compiles" are necessary but not the DoD; the DoD is what becomes possible for the consumer as a result. + +## Observed Pattern + +`canon/constraints/definition-of-done.md` defines DoD in terms of evidence and verification. It does not specify the *shape* of the DoD section in a spec document. Spec authors fill it with whatever feels right — sometimes implementation milestones (files to create, tests to pass), sometimes user-facing behaviors (what the consumer can do), sometimes a mix. 
+ +The pattern observed across server specs is that DoD-as-implementation-milestones produces specs whose "done" condition is "the build TODO list is checked off" rather than "the consumer can do the new things." Spec readers don't know what they get from the work. Validators can't validate against behaviors. Promotion gates can't verify intent because intent was never expressed in consumer-observable terms. + +The fix is a one-line shape constraint: DoD entries are sentences of the form "`` can `` and observe ``." + +- Affects: every spec document with a DoD section +- Outcome without the constraint: DoD reads like a build TODO; "done" detaches from "consumer value"; fresh-validator reviews struggle because the contract was implementation-shaped +- Outcome with the constraint: DoD reads as the consumer contract; "done" means specific things the consumer can verifiably do; validators and reviewers have a concrete checklist + +## Evidence + +| Validation Session | Date | Outcome | Notes | +| --- | --- | --- | --- | +| `klappy/PTXprint-MCP` v1.2 §9 | 2026-Q2 | DoD as 7 agent-observable behaviors | §9 lists 7 things an agent connected to PTXprint MCP + oddkit MCP must be able to do (read project state, construct valid payload, submit, poll, cancel, get cache hit on resubmit, get clear failure_mode classification). Implementation details (DO classes, wrangler config) are §5 and §10, separate | +| `klappy/PTXprint-MCP` PR #30 fresh-validator review | 2026-04+ | Validator could verify §9 directly | Each DoD entry mapped to a PASS/FAIL with file:line evidence. Validator's job was tractable because the contract was behavior-shaped | +| `klappy/agent-messaging-service` hosted /mcp planning | 2026-05-03 | Convention identified pre-spec | Hosted /mcp DoD will need 5–7 agent-observable behaviors that prove the wrapper does its job, not 5–7 source files that exist. 
Planning explicitly chose this shape | + +**Total observations**: 3 across 2 independent server projects +**Independent occurrences**: 2 distinct repositories, with the second project pre-emptively adopting the convention from observing the first +**Affected workflows**: every spec author writing a DoD section + +## Current Handling + +- **Detection today**: `canon/constraints/definition-of-done.md` defines what completion *means* (evidence required, verification needed). It does not define what a *spec's* DoD section looks like +- **Closest adjacent canon**: `canon/constraints/definition-of-done.md` (the evidence policy), `canon/methods/self-audit.md` (the 10-area reflection), `canon/principles/specs-lock-at-implementation.md` (specs-as-contracts; this amendment specifies what the contract's DoD section contains) +- **Gap**: nothing names the consumer-observable shape as the required DoD format + +## Proposed Promotion + +### Target Document + +`canon/constraints/definition-of-done.md` — append a new section. + +### Section + +`## Spec DoD Convention — Agent-Observable Behaviors` (new section appended to the existing doc) + +### Proposed Language + +```markdown +## Spec DoD Convention — Agent-Observable Behaviors + +When a spec for an MCP server, library, or other consumer-facing surface includes a Definition of Done section, that section MUST express completion as 5–7 things the consumer can observably do once the work ships, not as implementation milestones. + +### Format + +A spec DoD entry is one sentence in the form: + +> "`` can `` and observe ``." + +### Allowed + +- "A Claude Code instance with `.mcp.json` pointing at this server can call `ams_create_conversation` and receive a magic link in the response." +- "A second instance with a different bearer can `ams_join` that link and `ams_send` a token; the first observes it as `notifications/ams/token` within 1s median." 
+ +### Disallowed (these are implementation, not DoD) + +- "SessionDO class lands at `worker/src/session.ts`." +- "wrangler.toml updated with v2 migration." +- "All tests pass." + +The implementation list belongs in a separate "Tomorrow's Execution Scope" or equivalent section. The DoD is the contract with the consumer; implementation is the contract with the codebase. + +### Failure Mode + +When a spec's DoD reads like a build TODO list, "done" becomes "the TODO list is checked off" rather than "the consumer can do the new things." Spec readers don't know what they get. Fresh-validator reviews struggle because the contract was never expressed in observable terms. Promotion gates can't verify consumer intent. + +### Receipts + +- `klappy/PTXprint-MCP` v1.2 §9 — 7 agent-observable behaviors as DoD; §5 and §10 carry the implementation specifics, kept separate. PR #30's fresh-validator review verified each §9 entry directly with file:line evidence. +``` + +### Rationale + +The amendment sharpens existing definition-of-done. Same evidence requirements; specifies the format. The constraint is small (one section), low-risk (existing specs not retroactively invalid), and directly improves fresh-validator workflows because behavior-shaped DoDs are tractable to verify. + +The convention pairs naturally with `canon/principles/specs-lock-at-implementation.md` (specs are contracts; this specifies what the contract's DoD section contains) and with the existing fresh-validator pattern in `canon/constraints/release-validation-gate.md`. + +## Risk Assessment + +| Risk Level | Description | +| --- | --- | +| **Low** | **Clarifies existing rule, no scope change** | +| Medium | Adds new requirement, may affect workflows | +| High | Changes existing behavior, requires migration | + +**Risk level**: Low + +**Mitigation**: Existing specs are not retroactively invalid. The convention applies prospectively. Reviewers can cite the convention during PR review of new spec docs. 
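A reviewer citing the convention could run a rough first pass over a DoD section before reading it closely. The sketch below is one possible heuristic, not part of the proposal: `review_dod` and `MILESTONE_HINTS` are hypothetical names, and the phrase list is an assumption drawn only from the disallowed examples quoted above.

```python
import re

# Assumed phrases that suggest an implementation milestone rather than a behavior.
MILESTONE_HINTS = re.compile(
    r"\b(lands at|updated with|tests pass|compiles)\b", re.IGNORECASE
)


def review_dod(entries):
    """Return a list of human-readable problems found in a spec's DoD entries."""
    problems = []
    # The convention asks for 5 to 7 entries.
    if not 5 <= len(entries) <= 7:
        problems.append(f"expected 5-7 entries, found {len(entries)}")
    for i, entry in enumerate(entries, start=1):
        if MILESTONE_HINTS.search(entry):
            problems.append(f"entry {i} reads like an implementation milestone")
        elif not re.search(r"\bcan\b", entry, re.IGNORECASE):
            # Behavior-shaped entries say what the consumer *can* do.
            problems.append(f"entry {i} does not say what the consumer can do")
    return problems
```

A keyword check like this only flags candidates for a human or validator to look at; whether an entry is genuinely agent-observable still needs judgment.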
+ +## Status + +`proposed` + +## Review Notes + +(To be filled during review) + +- **Reviewer**: +- **Decision**: +- **Date**: +- **Notes**: + +## Execution Record + +(To be filled after acceptance) + +- **Commit**: +- **Canon doc updated**: `canon/constraints/definition-of-done.md` +- **Backlink added**: Yes / No diff --git a/docs/promotions/P0008-pr-validator-dolcheo-ledger-as-deliverable.md b/docs/promotions/P0008-pr-validator-dolcheo-ledger-as-deliverable.md new file mode 100644 index 0000000..b722787 --- /dev/null +++ b/docs/promotions/P0008-pr-validator-dolcheo-ledger-as-deliverable.md @@ -0,0 +1,137 @@ +--- +uri: klappy://docs/promotions/P0008-pr-validator-dolcheo-ledger-as-deliverable +title: "P0008: Fresh-Validator Deliverable Is a DOLCHEO Ledger Committed to the Repo" +audience: docs +exposure: nav +tier: 3 +voice: neutral +stability: evolving +tags: ["promotions", "proposed", "release-validation-gate", "dolcheo", "validator-ledger", "fresh-context", "amendment"] +promotion_status: proposed +--- + +# P0008: Fresh-Validator Deliverable Is a DOLCHEO Ledger Committed to the Repo + +> When a PR is reviewed by a fresh-session validator under the release-validation-gate, the validator's deliverable is a DOLCHEO ledger committed to the repo at a stable path, paired with a companion review handoff doc. Comments on the PR are not the deliverable; the canon-resident ledger is. + +## Observed Pattern + +`canon/constraints/release-validation-gate.md` (2026-04-20) binds every ship in this program: PRs require fresh-context validator review before merge to main, and load-bearing PRs require independent validator dispatch via Managed Agents. The gate specifies *that* validation happens and *who* performs it (a fresh-session validator, not the orchestrator). It does not specify *what the validator produces*. + +Without a deliverable convention, validation findings end up in PR comments. 
PR comments are ephemeral — filtered, paginated, hidden when threads collapse, not searchable across repos. The same deviation gets re-discovered three PRs later. The same "platform constraint" gets re-litigated each session because no one knows whether it was already accepted as a permanent compromise or a v+1 candidate. + +The pattern observed in PTXprint-MCP PR #30 (v1.3 telemetry) is that a structured ledger committed to the repo solves the durability problem. The validator produces two artifacts: a DOLCHEO-structured ledger at `canon/encodings/pr-NN-fresh-validator-ledger.md` (verdict, per-DoD-item PASS/FAIL with file:line evidence, learnings, accepted-deviations-with-revisit-candidates, nits, open observations) and a companion handoff doc at `canon/handoffs/pr-NN-fresh-validator-review.md`. Both are committed; both are searchable via oddkit. + +- Affects: every fresh-validator review under the release-validation-gate +- Outcome without a deliverable convention: findings live in PR comments; cross-PR memory is lost; validators re-discover deviations that previous validators already accepted +- Outcome with the convention: findings live in canon; future PRs can cite specific ledger entries; "we already accepted this constraint and queued v+1 revisit" is a one-search answer + +## Evidence + +| Validation Session | Date | Outcome | Notes | +| --- | --- | --- | --- | +| `klappy/PTXprint-MCP` PR #30 v1.3 telemetry | 2026-Q2 | Structured ledger format established | Fresh-session managed agent produced `canon/encodings/pr-30-fresh-validator-ledger.md` (DOLCHEO) + `canon/handoffs/pr-30-fresh-validator-review.md` (prose review). Both committed, attributed, dated | +| `klappy/PTXprint-MCP` PR #30 re-validation addendum | 2026-Q2 | Format proved durable across re-review | When Cursor Agent landed three additional fix commits, the re-validation produced `pr-30-revalidation-addendum.md` referencing the original ledger's numbered observations. 
Cross-session continuity worked | + +**Total observations**: 2 (initial review + re-review on the same PR) +**Independent occurrences**: 2 review sessions in different contexts on the same release pipeline +**Affected workflows**: every release-validation-gate dispatch + +## Current Handling + +- **Detection today**: `canon/constraints/release-validation-gate.md` requires the validator to produce findings; `canon/definitions/dolcheo-vocabulary.md` defines the DOLCHEO seven-letter session-capture format +- **Closest adjacent canon**: `canon/methods/governance-validation-via-agents.md` (validators do the checking before merge); `canon/constraints/canon-integration-audit.md` (the three audits between authoring and merge) +- **Gap**: no canon doc says "the validator's deliverable shape is DOLCHEO at this specific path." The DOLCHEO vocabulary covers session-capture broadly; the release gate covers when validation happens; the integration audit covers what to check; nothing joins them at "what does the validator commit to the repo when done" + +## Proposed Promotion + +### Target Document + +`canon/constraints/release-validation-gate.md` — append a new section. + +### Section + +`## Validator Deliverable Convention — The PR-NN Fresh-Validator Ledger` (new section appended) + +### Proposed Language + +```markdown +## Validator Deliverable Convention — The PR-NN Fresh-Validator Ledger + +A fresh-session validator running under this gate MUST produce two artifacts as part of accepting or rejecting a PR. Both committed to the repo. + +### 1. 
The Ledger + +**Path**: `canon/encodings/pr-NN-fresh-validator-ledger.md` (or repo-equivalent) +**Structure**: DOLCHEO per `canon/definitions/dolcheo-vocabulary` + +Sections (in order): + +- **Decisions (D)** — the verdict (`SAFE TO MERGE` | `NOT SAFE`) with one-paragraph reason +- **Observations Closed (O)** — per-DoD-item PASS/FAIL with file:line evidence +- **Learnings (L)** — patterns the validator wants future readers to internalize +- **Constraints (C)** — deviations from spec accepted as platform/library constraints. Each accepted-as-constraint deviation MUST be paired with either "permanent" or "v+1 revisit candidate" — never silently accepted as permanent without explicit naming +- **Handoff (H)** — nits-grade follow-ups for the next session +- **Opens (O-open)** — still-open questions, each numbered for back-reference from future sessions + +### 2. The Companion Review + +**Path**: `canon/handoffs/pr-NN-fresh-validator-review.md` (or repo-equivalent) + +Free-form prose attribution and summary. May reference the ledger's numbered observations directly. Names the validator (model + session ID), date, scope. + +### Both committed to the repo + +In the same PR or as a follow-up commit on the validation branch. Frontmatter declares `reviewer:` and `reviews:`. Date in frontmatter `date:`. + +### Why a Canon-Resident Ledger Is the Deliverable + +GitHub PR comments are ephemeral — filtered, paginated, hidden on thread collapse, not searchable across repos. A canon-resident ledger is searchable via oddkit, indexed by the validator's repo, and forms the long-term record of what was checked, what was accepted as a platform constraint with v+1 candidate, and what nits remain. Future PRs that reopen the same ground can cite the ledger directly. + +### Failure Mode + +Without this convention: findings live in PR comments that are read once and forgotten. The same deviation gets re-discovered three PRs later. 
The same "platform constraint" gets re-litigated each session because no one knows whether it was already accepted. The ledger creates the system memory the gate's enforcement-by-convention-plus-enforcer shape requires. + +### Receipts + +- `klappy/PTXprint-MCP` PR #30 v1.3 telemetry — `canon/encodings/pr-30-fresh-validator-ledger.md` + `canon/handoffs/pr-30-fresh-validator-review.md`. Re-validation produced `pr-30-revalidation-addendum.md` referencing the original ledger's numbered observations. +``` + +### Rationale + +The release-validation-gate currently says validation must happen and must be fresh-context. This amendment adds *what shape the output takes* — same enforcement model the gate already uses, with a writeable target. The DOLCHEO vocabulary already exists; this amendment specifies its application to the PR-validator workflow. + +Joining `release-validation-gate` + `dolcheo-vocabulary` at the deliverable layer closes a real gap. Both docs exist; nothing tells a validator "produce a DOLCHEO ledger at `canon/encodings/pr-NN-...`." + +## Risk Assessment + +| Risk Level | Description | +| --- | --- | +| Low | Clarifies existing rule, no scope change | +| **Medium** | **Adds new requirement, may affect workflows** | +| High | Changes existing behavior, requires migration | + +**Risk level**: Medium + +**Mitigation**: The convention adds a deliverable requirement to fresh-validator dispatch. Existing PRs whose validators commented inline are not retroactively invalid. The convention applies prospectively. The format is well-defined (DOLCHEO is canon) and template-friendly — a validator can be primed with the structure, lowering authoring overhead. 
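Because the format is template-friendly, a validator harness could check a draft ledger before committing it. The following is a sketch under stated assumptions: the helper names are hypothetical, the section names are taken from the proposed language above, and matching them by substring is a simplification since the proposal does not fix the exact heading syntax.

```python
# Section names as listed in the proposed ledger structure.
REQUIRED_SECTIONS = [
    "Decisions",
    "Observations Closed",
    "Learnings",
    "Constraints",
    "Handoff",
    "Opens",
]

# Every accepted deviation must carry one of these dispositions.
DISPOSITIONS = ("permanent", "v+1 revisit candidate")


def ledger_path(pr_number: int) -> str:
    """Build the stable ledger path for a PR, following the proposed convention."""
    return f"canon/encodings/pr-{pr_number}-fresh-validator-ledger.md"


def missing_sections(ledger_text: str) -> list:
    """Return required section names that do not appear anywhere in the ledger."""
    return [name for name in REQUIRED_SECTIONS if name not in ledger_text]


def unlabeled_constraints(constraints: list) -> list:
    """Return constraint entries that name neither accepted disposition."""
    return [
        entry
        for entry in constraints
        if not any(label in entry.lower() for label in DISPOSITIONS)
    ]
```

The `unlabeled_constraints` check targets the rule that no deviation is silently accepted as permanent: an entry with neither label is returned so the validator can name its disposition explicitly.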
+ +## Status + +`proposed` + +## Review Notes + +(To be filled during review) + +- **Reviewer**: +- **Decision**: +- **Date**: +- **Notes**: + +## Execution Record + +(To be filled after acceptance) + +- **Commit**: +- **Canon doc updated**: `canon/constraints/release-validation-gate.md` +- **Backlink added**: Yes / No diff --git a/docs/promotions/P0009-dolcheo-not-dolcheo-plus-h-anti-pattern.md b/docs/promotions/P0009-dolcheo-not-dolcheo-plus-h-anti-pattern.md new file mode 100644 index 0000000..909335d --- /dev/null +++ b/docs/promotions/P0009-dolcheo-not-dolcheo-plus-h-anti-pattern.md @@ -0,0 +1,125 @@ +--- +uri: klappy://docs/promotions/P0009-dolcheo-not-dolcheo-plus-h-anti-pattern +title: "P0009: DOLCHEO+H Is Not the Vocabulary — Explicit Anti-Pattern Callout" +audience: docs +exposure: nav +tier: 3 +voice: neutral +stability: evolving +tags: ["promotions", "proposed", "dolcheo", "vocabulary", "anti-pattern", "amendment"] +promotion_status: proposed +--- + +# P0009: DOLCHEO+H Is Not the Vocabulary — Explicit Anti-Pattern Callout + +> The vocabulary is DOLCHEO. The H (Handoffs) is the fifth letter of the seven-letter acronym. Writing "DOLCHEO+H" is residue from the superseded OLDC+H vocabulary and doubles the H. Add an explicit anti-pattern callout to the canon definition. + +## Observed Pattern + +`canon/definitions/dolcheo-vocabulary.md` (2026-04-19) supersedes the earlier `OLDC+H` vocabulary. In OLDC+H, Handoffs were tacked on with `+H` because the original four letters did not include them. DOLCHEO absorbed Handoffs into the seven-letter acronym (D-O-L-C-**H**-E-O), eliminating the need for the suffix. + +The pattern observed across many sessions and repositories is that the `+H` suffix survives as muscle memory in agents and as residue in canon documents. It propagates because: + +1. The DOLCHEO vocabulary doc itself lists "OLDC+H" as a discoverability search term in its `## Discoverability` section, exposing the legacy form to BM25 search +2. 
The doc's `## See Also` correctly links the superseded `OLDC+H` doc, but does not explicitly say *do not write `DOLCHEO+H`* +3. At least one canon-resident artifact in a sibling repo (`klappy/PTXprint-MCP/canon/encodings/pr-30-fresh-validator-ledger.md`) contains the malformed string "DOLCHEO+H encoding of findings" +4. Agents reading that ledger as evidence echo the malformed form back when synthesizing patterns from it + +The result is a recurring hallucination across sessions: agents write "DOLCHEO+H" believing it is correct because they have seen it in canon-adjacent context. The fix is one explicit anti-pattern callout in the authoritative vocabulary doc, which makes the malformed string searchable as "do not write this." + +- Affects: every session that captures DOLCHEO artifacts; every downstream consumer of session ledgers +- Outcome without the callout: the malformed string keeps re-appearing in new artifacts; downstream agents propagate it; operators encounter the same correction across multiple sessions +- Outcome with the callout: oddkit_search for "DOLCHEO+H" surfaces the anti-pattern note; agents reading the vocabulary doc see the explicit "do not" and bounce off; the residue is killed at the source + +## Evidence + +| Validation Session | Date | Outcome | Notes | +| --- | --- | --- | --- | +| Session producing this slate | 2026-05-03 | Hallucinated "DOLCHEO+H" 8 times across 2 artifacts | Agent (Claude) propagated the form from PTXprint PR #30 ledger header into freshly-authored slate documents. Operator caught and corrected mid-session: "I have no idea where you keep making up the H." | +| `klappy/PTXprint-MCP` `canon/encodings/pr-30-fresh-validator-ledger.md` | 2026-Q2 | Canon-resident artifact contains the malformed string | Line ~12 of body: "DOLCHEO+H encoding of findings from the independent validation of PR #30." 
| +| Operator's stated experience | recurring | "Resurfacing every conversation" | Operator's framing on 2026-05-03: "It's minor but I'm frustrated at it resurfacing every conversation" — direct testimony of the recurring hallucination across multiple sessions | + +**Total observations**: 3 across multiple sessions and at least 2 repositories +**Independent occurrences**: ≥3 distinct sessions in the operator's stated experience (the hallucination has resurfaced repeatedly; the slate-authoring session is the one that surfaced the pattern explicitly) +**Affected workflows**: every DOLCHEO artifact authored by an agent that has read OLDC+H-era context + +## Current Handling + +- **Detection today**: operators correct it manually when they spot it in agent output. The DOLCHEO vocabulary doc's `## Discoverability` paragraph mentions both "DOLCHEO" and "OLDC+H" as searchable terms, but does not warn against the malformed combination +- **Closest adjacent canon**: `canon/definitions/dolcheo-vocabulary.md` (the authoritative definition). The doc currently treats `OLDC+H` as a search term and a superseded predecessor, but does not flag `DOLCHEO+H` as an anti-pattern +- **Gap**: `oddkit_search` for "DOLCHEO+H" surfaces the legacy `OLDC+H` doc and ledger uses, but no document that explicitly says "this is wrong; the vocabulary is DOLCHEO; the H is already in it" + +## Proposed Promotion + +### Target Document + +`canon/definitions/dolcheo-vocabulary.md` — append a new short section. + +### Section + +`## Anti-Pattern — Do Not Write "DOLCHEO+H"` (new section near the end, before `## See Also`) + +### Proposed Language + +```markdown +## Anti-Pattern — Do Not Write "DOLCHEO+H" + +The vocabulary is **DOLCHEO**. The seven letters are D-O-L-C-**H**-E-O — Handoffs is the fifth letter, already inside the acronym. Writing **DOLCHEO+H** is malformed: + +- It doubles the H (once inside the acronym, once as the suffix). 
+- It is residue from the superseded `OLDC+H` vocabulary (`docs/oddkit/proactive/oldc-h-vocabulary.md`), in which Handoffs were appended with `+H` because the original four letters did not include them. DOLCHEO absorbed Handoffs into the acronym; the suffix is no longer needed. +- It propagates because agents see "OLDC+H" in canon-adjacent context (this doc's See Also, ledger headers in older artifacts) and pattern-match the suffix onto the new vocabulary by mistake. + +When tagging or describing session capture, write **DOLCHEO**. The Handoff section is named with the letter `H` inside the acronym, just like Decision is `D` and Encode is `E`. + +### If you are reading an older artifact that uses "DOLCHEO+H" + +Treat it as a typo equivalent to "DOLCHEO." Do not propagate the form into new artifacts. If editing the older artifact, correct it. + +### Receipts + +- `klappy/PTXprint-MCP/canon/encodings/pr-30-fresh-validator-ledger.md` line ~12 — contains "DOLCHEO+H encoding of findings." Marked here as the propagation source for at least one downstream session's hallucination chain. +- 2026-05-03 slate-authoring session — agent propagated the form 8 times across two synthesis documents before operator correction. Direct stimulus for this anti-pattern callout. +``` + +### Rationale + +The callout is two paragraphs of net-new prose. It makes the malformed string findable through `oddkit_search` (which is the only way agents discover canon during preflight). Once the anti-pattern entry exists, an agent searching for "DOLCHEO+H" or for "DOLCHEO encoding" will hit the explicit warning and bounce off, rather than echoing the malformed form into new artifacts. + +This is the simplest, lowest-risk way to kill a recurring hallucination at the source — a single paragraph of canon weighed against an unbounded sequence of operator corrections in future sessions. 
+ +The PTXprint PR-30 ledger's malformed instance is also flagged as a receipt so a future canon-cleanup pass can correct it at the source repo. + +## Risk Assessment + +| Risk Level | Description | +| --- | --- | +| **Low** | **Clarifies existing rule, no scope change** | +| Medium | Adds new requirement, may affect workflows | +| High | Changes existing behavior, requires migration | + +**Risk level**: Low + +**Mitigation**: Pure documentation addition. No workflow change. No agent retraining required — agents read the updated doc at next preflight and absorb the warning automatically. The fix is the same mechanism that produces the bug (canon read at preflight) so it is structurally aligned with how the system already self-corrects. + +## Status + +`proposed` + +## Review Notes + +(To be filled during review) + +- **Reviewer**: +- **Decision**: +- **Date**: +- **Notes**: + +## Execution Record + +(To be filled after acceptance) + +- **Commit**: +- **Canon doc updated**: `canon/definitions/dolcheo-vocabulary.md` +- **Backlink added**: Yes / No +- **Adjacent cleanup recommended**: `klappy/PTXprint-MCP/canon/encodings/pr-30-fresh-validator-ledger.md` line ~12 — separate PR in the PTXprint-MCP repo
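
The recommended cleanup pass could locate the malformed string mechanically. Below is a small sketch, assuming the artifact text has already been read into memory; `find_malformed` and `correct` are hypothetical helper names, not part of oddkit or any existing tool.

```python
import re

# Matches "DOLCHEO+H" (optionally with spaces around the plus) but not "OLDC+H".
MALFORMED = re.compile(r"DOLCHEO\s*\+\s*H\b")


def find_malformed(text: str) -> list:
    """Return (line_number, stripped_line) pairs that contain the malformed form."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if MALFORMED.search(line):
            hits.append((number, line.strip()))
    return hits


def correct(text: str) -> str:
    """Rewrite the malformed form to the plain vocabulary name."""
    return MALFORMED.sub("DOLCHEO", text)
```

Such a scan should skip the anti-pattern callout itself, which quotes the malformed string on purpose; references to the superseded `OLDC+H` vocabulary are left untouched by the pattern.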