From 966d954e55b0a4d3710d3b5c6c6f1433f38bc048 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:18:58 +0200 Subject: [PATCH 01/29] Plan renderer golden prompt assembly frontier --- memory/PLAN.md | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/memory/PLAN.md b/memory/PLAN.md index 152c14377..57d99e544 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -69,7 +69,7 @@ context-pipeline/ ### Active -- No active frontier is selected. Scope the next item from §Next. +- `renderer-golden-coverage` (FE-1091) — active context-pipeline RENDER closure plus prompt-assembly lock. Golden the remaining model-facing context surfaces and system-prompt assembly while reshaping `src/agents/` prompts/subagents topology only as needed for that lock. ### Recently Completed @@ -83,7 +83,6 @@ context-pipeline/ - `orchestrator-tool-port` (FE-1087) — **scoped but D98-sensitive.** Port the external `brunch cook` orchestrator into the future CODE/executor tool surface rather than preserving a separate execute/orchestrator product mode. First active scope: `memory/cards/orchestrator-tool-port--plan-check-tool.md`; reconcile that scope against D98-L before build. - `elicitor-project` (FE-1085) — **design-gated.** Cross-plane derivation (requirements -> design, design -> oracles) remains undesigned under A33-L; run `ln-design` before any scope/build. -- `renderer-golden-coverage` — **active parallel coverage track.** Remaining RENDER work lives by audience: model-facing context surfaces under `agents/contexts/`, human/product text beside its app/session owner. Remaining rows need fresh scoping against `src/agents/contexts/README.md`, `src/app/README.md`, and `src/session/README.md`. - `exchange-symmetry-audit` — **earned cleanup.** Delete-oriented audit of the exchange projection/renderer split; not a capability blocker. 
### Parallel / Low-Conflict @@ -159,13 +158,16 @@ context-pipeline/ ### renderer-golden-coverage - **Name:** Adopt the D83-L context-render house style and lock remaining RENDER-stage surfaces -- **Linear:** [FE-870](https://linear.app/hash/issue/FE-870) -- **Branch:** `ln/fe-870-renderer-golden-context-tools` +- **Linear:** [FE-1091](https://linear.app/hash/issue/FE-1091/renderer-golden-coverage-and-prompt-assembly-lock) +- **Branch:** `ln/fe-1091-renderer-golden-coverage-and-prompt-assembly-lock` - **Kind:** coverage + build / hardening -- **Status:** next / active parallel. Substrate, ``, ``, graph overview/neighborhood renders, and band-filtered graph slice hardening are done. Remaining work needs a fresh `ln-scope` pass. -- **Objective:** Finish the RENDER stage: ``, `renderGraphSeed`, `exchanges/*`, `formatRelatedNodesResult` structural-leak repair, and the `brunch print` house-style-vs-status fork. -- **Acceptance:** `src/agents/contexts/README.md`, `src/app/README.md`, and `src/session/README.md` carry the audience split; required model-facing rows are built in the house style and locked with focused goldens/semantic invariants; no adapter/transport imports enter `agents/contexts/`. -- **Traceability:** D19-L, D52-L, D60-L, D62-L, D83-L. +- **Status:** active; needs fresh `ln-scope` pass. +- **Certainty:** earned — RENDER topology is now established; this frontier closes coverage, prompt assembly evidence, and stale topology ambiguity rather than proving a new seam. +- **Closes:** context-pipeline RENDER stage plus the COMPOSE full-stack real-rendered-context tripwire. +- **Locks in:** D83-L house style for model-facing context surfaces and prompt assembly as a golden/semantic-invariant surface. +- **Objective:** Finish the RENDER stage and lock system-prompt assembly as a golden surface. Remaining work lives by audience: model-facing context and prompt text under `src/agents/`, human/product text beside its app/session owner. 
Incidental prompt remodelling belongs here only when needed to make prompt assembly lockable: foreground prompts should collapse toward `elicitor` / `executor`, subagent prompt bodies should live as subagent resources, and `src/agents/` topology should make `contexts`, `prompts`, `runtime`, `shared`, `skills`, and `subagents` roles legible. +- **Acceptance:** `src/agents/contexts/README.md`, `src/agents/prompts/README.md`, `src/agents/runtime/README.md`, `src/app/README.md`, and `src/session/README.md` carry the audience/topology split; required model-facing renderer rows are built in the house style and locked with focused goldens/semantic invariants; system prompt assembly is locked with goldens/semantic invariants; no adapter/transport imports enter `agents/contexts/`; prompt topology remodel deletes obsolete role/body aliases rather than preserving compatibility shims. +- **Traceability:** D19-L, D40-L, D52-L, D58-L, D60-L, D62-L, D83-L, D98-L. ### exchange-symmetry-audit @@ -182,7 +184,11 @@ context-pipeline/ ```text frontiers: - Active: {} + Active: + renderer-golden-coverage + status: active (RENDER coverage + prompt assembly lock) + depends_on: context-pipeline PULL+PROJECT, D83-L, D52-L, D58-L, D98-L + coordinates_with: data-model-legibility (references substrate), elicitor-generate (present_candidates render already landed in house style) Next: orchestrator-tool-port @@ -194,11 +200,6 @@ frontiers: status: design-gated depends_on: elicitor-generate, D95-L, D96-L, I51-L - renderer-golden-coverage - status: active parallel coverage - depends_on: context-pipeline PULL+PROJECT, D83-L, D52-L - coordinates_with: elicitor-generate (present_candidates render already landed in house style) - exchange-symmetry-audit status: earned cleanup depends_on: exchange surface being mostly built From b9aa5d26af918f7fbaf353888a35633fe1eee243 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:36:14 +0200 Subject: [PATCH 02/29] Close context reference harvest 
ledger --- ...rer-golden-coverage--render-prompt-lock.md | 107 ++++++++++++++++++ src/agents/docs/context-reference-harvest.md | 102 ++++++++--------- 2 files changed, 153 insertions(+), 56 deletions(-) create mode 100644 memory/cards/renderer-golden-coverage--render-prompt-lock.md diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md new file mode 100644 index 000000000..8284e49c7 --- /dev/null +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -0,0 +1,107 @@ +# Renderer and prompt assembly lock ledger + +Frontier: renderer-golden-coverage +Status: active +Mode: sweep +Created: 2026-06-26 + +## Orientation + +- Containing seam: the `context-pipeline` RENDER stage plus its deferred COMPOSE tripwire. Frontier `renderer-golden-coverage` / FE-1091 is the Linear + branch boundary; this scope file is only the row ledger for that frontier. +- Handoff state: system-prompt assembly must be golden/semantically locked, and the user's `src/agents/` topology sketch (`contexts`, foreground `prompts`, `runtime`, `shared`, `skills`, `subagents`) is directional pressure, not a license for an unbounded rewrite. +- The previous `data-model-legibility` work closed generated ontology + graph-authoring references, but `src/agents/docs/context-reference-harvest.md` still carries unresolved candidate references. This sweep must either materialize, retire, or explicitly defer those candidates instead of assuming the ledger is fully worked. +- Main risk: locking stale D98-sensitive prompt/runtime-axis wording or old prompt-body topology in snapshots. Closure should delete aliases/dual homes rather than preserve compatibility shims. + +Posture: earned (inherited from `renderer-golden-coverage`). 
+ +Frontier-level cross-cutting obligations: + +- Preserve D83-L house style for model-facing context text: markdown frame, TOON for large/unbounded uniform data, fenced tree for hierarchy, top-level `` / `` / `` scope clustering where applicable. +- Preserve D52-L / D60-L dependency direction: `agents/contexts` may render already-read facts but must not import adapters, app, RPC, web, or DB. +- Preserve D97-L provenance: generated vocabulary references come from typed graph sources; authored judgment references need concrete readers; prompt resources cite rather than restate shared references. +- Preserve D98-L: strategy/lens/method vocabulary may remain only as prompt-resource/internal conduct, not user-changeable session state or foreground-agent identity. +- Use deletion as closure: obsolete role/body aliases, stale docs, and superseded reference candidates should be removed or explicitly deferred, not bridged. + +## Sweep preflight + +1. **Boundary.** In scope: model-facing renderers under `src/agents/contexts/`, foreground/background prompt assembly, prompt body/reference topology under `src/agents/` when needed to make assembly lockable, and the local topology READMEs/tests that name those homes. Out of scope: new `project` capability behavior, CODE/orchestrator tool implementation, public RPC/UI changes, and human/product renderers except for README audience-split drift. +2. **Source-of-truth inputs.** SPEC D19-L, D40-L, D52-L, D58-L, D60-L, D62-L, D83-L, D97-L, D98-L; PLAN frontier `renderer-golden-coverage`; topology READMEs under `src/agents/`, `src/app/`, and `src/session/`; current renderer/prompt tests; `src/agents/docs/context-reference-harvest.md` for unresolved reference disposition only. +3. 
**Owners and closure oracles.** Each required row below names the canonical owner and a closure oracle: Vitest file snapshots, semantic invariant tests, import-boundary checks, topology README assertions, or a row-level explicit deferral tied to a plan assumption/frontier. +4. **Class.** Buildable-now. Deferred rows are marked `○` and tripwired to A33-L / `elicitor-project` or `orchestrator-tool-port`; they are not hidden required work for this sweep. +5. **Closed inventory.** This ledger is the inventory. If build discovers more than one genuinely missing renderer/prompt sub-seam, stop and route back through `ln-plan` instead of adding rows by symmetry. + +Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is either closed or explicitly reclassified with a named owner/tripwire. + +## Ledger — prompt topology and assembly + +| Capability | Status | Req | Fill | Owner / next | Notes | +| --- | --- | --- | --- | --- | --- | +| Foreground prompt-body topology is canonical | `partial` | ● | earned | `src/agents/registry.ts`, `src/agents/prompts/README.md`, prompt body tests | Closure oracle: one foreground body home matches the D98 target vocabulary (`elicitor` / executor direction) or a narrow explicit deferral says which CODE row owns the remaining rename. No compatibility alias from old role/body names remains without an owner. | +| Background subagent body topology is canonical | `partial` | ● | earned | `src/.pi/extensions/subagents/agents.ts`, prompt/subagent topology docs | Closure oracle: background bodies are not ambiguously presented as foreground prompts; either rehome under an earned `src/agents/subagents/` surface or lock the current `prompts//SYSTEM.md` convention with README/tests that explain why no rehome is needed now. | +| Foreground prompt assembly golden lock | `partial` | ● | earned | `src/agents/runtime/compose.ts`, `src/agents/runtime/__tests__/compose.test.ts`, snapshots | Existing elicitor preview snapshots are a start. 
Closure oracle: full provider-facing assembly has reviewed goldens/semantic invariants for the foreground role(s) this frontier owns, with stale readiness-grade/runtime-axis vocabulary guarded. | +| Pi `before_agent_start` assembly path is wired to the same lock | `partial` | ● | earned | `src/.pi/extensions/agent-runtime/system-prompts/` tests | Closure oracle: adapter-level test proves Brunch body + world reads + active-tool legality feed `composeAgentPrompt`; no harness-only snapshot path that product assembly bypasses. | +| Background subagent prompt assembly golden lock | `partial` | ● | earned | `src/.pi/extensions/subagents/prompt-assembly.ts`, `src/.pi/extensions/__tests__/subagents.test.ts` | Existing assertions prove sealing/tool grants. Closure oracle: file snapshot or equivalent semantic invariant locks assembled child prompt shape, injected-world snapshot, no foreground-only sections, no ambient Pi resources. | +| Context-reference harvest closure | `built` | ● | earned | `src/agents/docs/context-reference-harvest.md`, `src/agents/contexts/references/`, skill-local `references/` | Closed: materialized graph-authoring / ontology / oracle homes stay in their current owners; checkability/shared subtype candidates are rejected; elicitation-question hints are deferred to a future scoped reader; proposal/projection candidates are deferred to A33-L/`elicitor-project`. | + +## Ledger — model-facing context renderers + +| Capability | Status | Req | Fill | Owner / next | Notes | +| --- | --- | --- | --- | --- | --- | +| Workspace context renderer | `have` | ● | earned | `src/agents/contexts/workspace/` | Snapshot coverage exists for cwd + overview context; preserve D83-L audience split. | +| Specification context renderer | `have` | ● | earned | `src/agents/contexts/specification/` | Snapshot coverage exists for selected-spec context. 
| +| Graph overview / neighborhood / related-node renderers | `have` | ● | earned | `src/agents/contexts/graph/` | Snapshot coverage exists for overview, neighborhoods, and related nodes; preserve code handles and no structural-leak assertions. | +| Session runtime frame renderer | `partial` | ● | earned | `src/agents/contexts/session/` | Existing snapshot still displays D98-sensitive strategy/lens runtime wording. Closure oracle: runtime frame wording either removes that state or frames it strictly as prompt-resource/internal conduct, then updates the golden. | +| Turn/origination seed renderers | `partial` | ● | earned | `src/agents/contexts/seeds/` | Existing tests are semantic asserts. Closure oracle: stable seed text is snapshot-locked or intentionally reduced to invariant asserts with a README note explaining why wording is not a golden contract. | +| Elicitation agenda/update text | `partial` | ● | earned | `src/agents/contexts/elicitation.ts` | No focused renderer test found. Closure oracle: agenda/update text has semantic invariant or snapshot coverage, including structural-illegal diagnostics. | +| Structured-exchange result renderers | `partial` | ● | earned | `src/agents/contexts/exchanges/` | `present_candidates` and `present_review_set` have semantic asserts; other request/present renderers need inventory. Closure oracle: every registered model-facing exchange result has snapshot/semantic coverage or is explicitly retired/unregistered. | +| Human/product render audience split | `have` | ● | earned | `src/app/README.md`, `src/session/README.md`, `src/agents/contexts/README.md` | Current READMEs name app/session human text and agents model-facing text. Preserve; update only if topology changes above create drift. 
| + +## Ledger — deferred / tripwired rows + +| Capability | Status | Req | Fill | Owner / next | Notes | +| --- | --- | --- | --- | --- | --- | +| `projection-guidance.md` shared reference | `spec` | ○ | proving | `elicitor-project` / A33-L | Wait-gated: project shape is design-gated; do not materialize a shared projection reference in this sweep unless a concrete second reader appears. | +| CODE executor tool behavior | `spec` | ○ | proving | `orchestrator-tool-port` | Out of this sweep except for prompt-body naming/topology needed to avoid locking a stale foreground body alias. Tool behavior and write-capable CODE policy stay with FE-1087. | +| New renderer family discovered during build | `new` | ○ | proving | route to `ln-plan` if more than one appears | Tripwire: adding several rows means this inventory was not closed. | + +## Row build order recommendation + +1. Close **Context-reference harvest closure** first so prompt/reference topology is not goldened against a half-dispositioned ledger. +2. Close **Foreground/background prompt-body topology** before accepting prompt assembly snapshots; snapshots should lock the final home, not a transitional shape. +3. Close foreground + adapter prompt assembly locks. +4. Close background subagent assembly lock. +5. Sweep remaining renderer partials (`session`, `seeds`, `elicitation`, `exchanges`) with file-scoped tests. + +## Expected touched paths (tentative) + +```text +memory/cards/ +└── renderer-golden-coverage--render-prompt-lock.md + +memory/PLAN.md ? +src/agents/ +├── README.md ? +├── registry.ts ? +├── __tests__/ +│ └── registry.test.ts ? +├── docs/ +│ └── context-reference-harvest.md ? +├── contexts/ +│ ├── README.md ? +│ ├── elicitation.ts ~ +│ ├── references/ ? +│ ├── seeds/ ? +│ ├── session/ ? +│ └── exchanges/ ? +├── prompts/ ? +├── runtime/ +│ ├── README.md ? +│ ├── compose.ts ? +│ ├── __tests__/compose.test.ts ? +│ └── __snapshots__/ ? +└── subagents/ ? 
+src/.pi/extensions/ +├── agent-runtime/system-prompts/ ? +└── subagents/ ? +src/app/README.md ? +src/session/README.md ? +``` diff --git a/src/agents/docs/context-reference-harvest.md b/src/agents/docs/context-reference-harvest.md index 97b28a4f8..c37aa6cc9 100644 --- a/src/agents/docs/context-reference-harvest.md +++ b/src/agents/docs/context-reference-harvest.md @@ -1,45 +1,45 @@ # Context reference harvest ledger -Status: backstage-only curation ledger. This file is not runtime prompt payload, is not copied into packaged agent assets, and is not a shared context reference. Runtime-eligible references live under `src/agents/contexts/references/`; skill-local progressive-disclosure references live under the owning skill's `references/` directory. +Status: closed backstage-only curation ledger. This file is not runtime prompt payload, is not copied into packaged agent assets, and is not a shared context reference. Runtime-eligible references live under `src/agents/contexts/references/`; skill-local progressive-disclosure references live under the owning skill's `references/` directory. -Purpose: record the row-by-row disposition of recovered or design-era data-model guidance before any authored runtime reference is created. Rows point to source material and next action; they do not restate the ontology. +Purpose: record the row-by-row disposition of recovered or design-era data-model guidance after the data-model-legibility frontier. Rows point to materialized homes, rejected carrying cost, or explicit future tripwires; they do not restate the ontology. A source may carry more than one disposition class when it has separable uses. Treat the classes as labels, not an exclusive enum. Generated-reference inputs are the exception that preserves D97-L: only typed code sources may generate reference tables; recovered/design prose may motivate which table to generate, but it is authored-reference input or backstage rationale, not the source of truth. 
## Disposition classes -| Class | Meaning | -| - | - | -| generated-reference input | Typed code source for generated content, not hand-authored prose. | -| authored-runtime-reference input | Candidate source for a shared reference under `src/agents/contexts/references/`. | -| skill-local-reference input | Candidate source for a specific skill's `references/` payload. | -| backstage-only rationale | Useful design history or validation record, but not model-facing prompt payload. | -| historical/archive candidate | Superseded or stale enough that future work should retire/archive rather than harvest directly. | -| leave-as-is | Current prompt/resource file already sits in the right home; no harvest action now. | +| Class | Meaning | +| -------------------------------- | ----------------------------------------------------------------------------------------------- | +| generated-reference input | Typed code source for generated content, not hand-authored prose. | +| authored-runtime-reference input | Candidate source for a shared reference under `src/agents/contexts/references/`. | +| skill-local-reference input | Candidate source for a specific skill's `references/` payload. | +| backstage-only rationale | Useful design history or validation record, but not model-facing prompt payload. | +| historical/archive candidate | Superseded or stale enough that future work should retire/archive rather than harvest directly. | +| leave-as-is | Current prompt/resource file already sits in the right home; no harvest action now. 
| ## Source ledger -| Source | Disposition labels | Candidate future reference | Reader / blocker | D98-sensitive notes | Next action | -| - | - | - | - | - | - | -| `/private/tmp/igs_recovered.md` (`INTENT_GRAPH_SEMANTICS`) | skill-local-reference input; backstage-only rationale; historical/archive candidate | `graph-authoring-heuristics.md` materialized; generated edge-category/detail-form table materialized from typed sources; oracle checkability guidance accepted skill-local only | Reader: capture/commit methods already cite shared graph-authoring judgment. Oracle `generate-proposal` may use progressive-checkability language when designing verification ensembles. | Contains retired subtype proposals and old edge/ontology language; do not revive stale modality/subtype claims, claim metadata, `strength`, or runtime `strategy` / `lens` / `method` session state. | Verdict complete: promotion rules already accepted in `graph-authoring-heuristics.md`; the 8-rung ladder is narrowed to skill-local oracle prompting; `strength` and claim-level `checkability` fields are rejected carrying cost; subtype/detail candidates are rejected as parallel enums except for already-shipped `detail.form` from typed sources. | -| `docs/design/ELICITATION_QUESTIONS.md` | authored-runtime-reference input | `elicitation-question-hints.md` | Reader: future elicitor question/gap guidance. Blocker: refresh against post-FE-1052 kind names, `story` / `unknown` / `entity` / `sketch`, and four-band D94-L model. | Uses older band framing and mentions strategy/lens as prompt-space terms; keep as prompt-resource vocabulary only, not runtime state. | Treat the durable thesis as: node kind is closed ontology; questions are open/projectable hints inside a kind. Rewrite examples before model-facing use. 
| -| `docs/design/ONTOLOGY_REVIEW_PROTOCOL.md` | backstage-only rationale; authored-runtime-reference input | possible `graph-authoring-heuristics.md` citations; may motivate generated edge-category/detail-form table scope but typed code remains the generated-reference input | Reader: data-model maintainers and future generated-reference authors. Blocker: live code/SPEC are authoritative; §0/§2–3/§9 are historical and `thesis → claim` did not land. | Mentions methods as validation lenses; preserve only as prompt/resource vocabulary where useful, never as user-changeable runtime axes. | Use as design-validation record for D87-L/D88-L, not as prompt payload. Pull only claims that still match current SPEC/code. | -| `docs/design/ELICITATION_LENSES.md` | authored-runtime-reference input; skill-local-reference input; historical/archive candidate | `proposal-meta-rubric.md`; `projection-guidance.md` | Reader: `generate-proposal` and future `project` capability. Blocker: D98-L retired `strategy` / `lens` / `method` as runtime state; A33-L still design-gates `project`. | Highly D98-sensitive: old lens catalogue must not reintroduce runtime lens/strategy/method axes. Fan-out/fan-in and D31 meta-rubric may survive as prompt conduct. | Harvest fan-out/fan-in, grounding-density, and meta-rubric ideas only into the relevant method/reference home after translating away runtime-axis assumptions. | -| `docs/design/BEHAVIORAL_KERNELS.md` | skill-local-reference input; backstage-only rationale; historical/archive candidate | oracle checkability phrasing in `generate-proposal/references/oracle.md`; possible future `elicitation-question-hints.md` | Reader: oracle generate guidance for weakest-sufficient verification artifacts; future elicitation/gap guidance only if a scoped reader appears. Blocker: no current runtime kernel ontology; must not create a parallel data model or prompt taxonomy without a concrete reader. 
| Kernel terminology is interviewer machinery at most, not graph state and not runtime session state. | Accepted only as skill-local oracle prompting for progressive verification artifacts; kernel labels/taxonomy remain rejected as runtime model and deferred for elicitation questions. | -| `src/agents/skills/methods/capture/SKILL.md` | leave-as-is; authored-runtime-reference input; partially materialized | `graph-authoring-heuristics.md` materialized; `checkability-ladder.md` deferred | Reader: capture now cites the shared authoring reference for declarative graph claims, low-confidence routing, contradiction routing, relation-bearing confidence, and role-named mutation grammar. FE-861 sweep sequencing, gap conduct, and commitment-gradient table remain local. | Method is a prompt-resource id, not runtime state. No D98 issue while it stays code-owned prompt-resource conduct. | Materialized shared graph-authoring guidance; defer checkability-ladder extraction until a second concrete reader needs it. | -| `src/agents/skills/methods/commit-graph/SKILL.md` | leave-as-is; authored-runtime-reference input; materialized | `graph-authoring-heuristics.md` | Reader: graph-write methods needing declarative-node, promotion, settled-commitment, confident-endpoint, and role-named mutation discipline. | Method remains prompt-resource conduct; do not make it a user-changeable runtime mode. | Materialized shared authoring reference and cite from this method; remaining direct-commit sequencing stays local. | -| `src/agents/skills/methods/generate-proposal/SKILL.md` | leave-as-is; skill-local-reference input; authored-runtime-reference input | `proposal-meta-rubric.md`; `projection-guidance.md` | Reader: current generate method and future project design. Blocker: proposal meta-rubric might belong skill-local unless `project` becomes a second reader. | Names intent/design/oracle lenses/planes as prompt conduct; keep out of runtime state and schema fields. | Leave body unchanged now. 
Revisit after `elicitor-project` design chooses whether projection folds into generate or needs a distinct surface. | -| `src/agents/skills/methods/generate-proposal/references/intent.md` | leave-as-is; skill-local-reference input | `proposal-meta-rubric.md` only if shared beyond generate | Reader: generate intent-plane fan-out. Blocker: no second reader yet. | Plane-specific prompt payload is okay; do not turn `pick` into a schema/runtime field. | Leave in skill-local home. | -| `src/agents/skills/methods/generate-proposal/references/design.md` | leave-as-is; skill-local-reference input | `proposal-meta-rubric.md`; possible `projection-guidance.md` | Reader: generate design-plane fan-out and possible future project design. Blocker: A33-L design verdict. | `synthesize` is method conduct, not schema or runtime axis. | Leave in skill-local home; use as input to `project` design only if that frontier needs it. | -| `src/agents/skills/methods/generate-proposal/references/oracle.md` | leave-as-is; skill-local-reference input; materialized | `proposal-meta-rubric.md`; skill-local progressive-checkability guidance | Reader: generate oracle-plane fan-out and fan-in. | `compose` and progressive-checkability language are method conduct, not schema fields, stored claim metadata, or runtime axes. | Materialized the accepted narrow verdict here: choose the weakest sufficient oracle artifact and name evidence breadth/blind spots without adding `checkability`, `strength`, kernel, or subtype schema. 
| - -## Candidate reference queue +| Source | Disposition labels | Candidate future reference | Reader / blocker | D98-sensitive notes | Next action | +| ------------------------------------------------------------------ | ------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `/private/tmp/igs_recovered.md` (`INTENT_GRAPH_SEMANTICS`) | skill-local-reference input; backstage-only rationale; historical/archive candidate | `graph-authoring-heuristics.md` materialized; generated edge-category/detail-form table materialized from typed sources; oracle checkability guidance accepted skill-local only | Reader: capture/commit methods already cite shared graph-authoring judgment. Oracle `generate-proposal` may use progressive-checkability language when designing verification ensembles. 
| Contains retired subtype proposals and old edge/ontology language; do not revive stale modality/subtype claims, claim metadata, `strength`, or runtime `strategy` / `lens` / `method` session state. | Verdict complete: promotion rules already accepted in `graph-authoring-heuristics.md`; the 8-rung ladder is narrowed to skill-local oracle prompting; `strength` and claim-level `checkability` fields are rejected as carrying cost; subtype/detail candidates are rejected as parallel enums except for already-shipped `detail.form` from typed sources. |
+| `docs/design/ELICITATION_QUESTIONS.md` | authored-runtime-reference input | `elicitation-question-hints.md` | Reader: future elicitor question/gap guidance. Blocker: refresh against post-FE-1052 kind names, `story` / `unknown` / `entity` / `sketch`, and four-band D94-L model. | Uses older band framing and mentions strategy/lens as prompt-space terms; keep as prompt-resource vocabulary only, not runtime state. | Treat the durable thesis as: node kind is closed ontology; questions are open/projectable hints inside a kind. Rewrite examples before model-facing use. |
+| `docs/design/ONTOLOGY_REVIEW_PROTOCOL.md` | backstage-only rationale; authored-runtime-reference input | possible `graph-authoring-heuristics.md` citations; may motivate generated edge-category/detail-form table scope but typed code remains the generated-reference input | Reader: data-model maintainers and future generated-reference authors. Blocker: live code/SPEC are authoritative; §0/§2–3/§9 are historical and `thesis → claim` did not land. | Mentions methods as validation lenses; preserve only as prompt/resource vocabulary where useful, never as user-changeable runtime axes. | Use as design-validation record for D87-L/D88-L, not as prompt payload. Pull only claims that still match current SPEC/code. |
+| `docs/design/ELICITATION_LENSES.md` | authored-runtime-reference input; skill-local-reference input; historical/archive candidate | `proposal-meta-rubric.md`; `projection-guidance.md` | Reader: `generate-proposal` and future `project` capability. Blocker: D98-L retired `strategy` / `lens` / `method` as runtime state; A33-L still design-gates `project`. | Highly D98-sensitive: old lens catalogue must not reintroduce runtime lens/strategy/method axes. Fan-out/fan-in and D31 meta-rubric may survive as prompt conduct. | Harvest fan-out/fan-in, grounding-density, and meta-rubric ideas only into the relevant method/reference home after translating away runtime-axis assumptions. |
+| `docs/design/BEHAVIORAL_KERNELS.md` | skill-local-reference input; backstage-only rationale; historical/archive candidate | oracle checkability phrasing in `generate-proposal/references/oracle.md`; possible future `elicitation-question-hints.md` | Reader: oracle generate guidance for weakest-sufficient verification artifacts; future elicitation/gap guidance only if a scoped reader appears. Blocker: no current runtime kernel ontology; must not create a parallel data model or prompt taxonomy without a concrete reader. | Kernel terminology is interviewer machinery at most, not graph state and not runtime session state. | Accepted only as skill-local oracle prompting for progressive verification artifacts; kernel labels/taxonomy remain rejected as runtime model and deferred for elicitation questions. |
+| `src/agents/skills/methods/capture/SKILL.md` | leave-as-is; authored-runtime-reference input; partially materialized | `graph-authoring-heuristics.md` materialized; `checkability-ladder.md` deferred | Reader: capture now cites the shared authoring reference for declarative graph claims, low-confidence routing, contradiction routing, relation-bearing confidence, and role-named mutation grammar. FE-861 sweep sequencing, gap conduct, and commitment-gradient table remain local. | Method is a prompt-resource id, not runtime state. No D98 issue while it stays code-owned prompt-resource conduct. | Materialized shared graph-authoring guidance; defer checkability-ladder extraction until a second concrete reader needs it. |
+| `src/agents/skills/methods/commit-graph/SKILL.md` | leave-as-is; authored-runtime-reference input; materialized | `graph-authoring-heuristics.md` | Reader: graph-write methods needing declarative-node, promotion, settled-commitment, confident-endpoint, and role-named mutation discipline. | Method remains prompt-resource conduct; do not make it a user-changeable runtime mode. | Materialized shared authoring reference and cite from this method; remaining direct-commit sequencing stays local. |
+| `src/agents/skills/methods/generate-proposal/SKILL.md` | leave-as-is; skill-local-reference input; authored-runtime-reference input | `proposal-meta-rubric.md`; `projection-guidance.md` | Reader: current generate method and future project design. Blocker: proposal meta-rubric might belong skill-local unless `project` becomes a second reader. | Names intent/design/oracle lenses/planes as prompt conduct; keep out of runtime state and schema fields. | Leave body unchanged now. Revisit after `elicitor-project` design chooses whether projection folds into generate or needs a distinct surface. |
+| `src/agents/skills/methods/generate-proposal/references/intent.md` | leave-as-is; skill-local-reference input | `proposal-meta-rubric.md` only if shared beyond generate | Reader: generate intent-plane fan-out. Blocker: no second reader yet. | Plane-specific prompt payload is okay; do not turn `pick` into a schema/runtime field. | Leave in skill-local home. |
+| `src/agents/skills/methods/generate-proposal/references/design.md` | leave-as-is; skill-local-reference input | `proposal-meta-rubric.md`; possible `projection-guidance.md` | Reader: generate design-plane fan-out and possible future project design. Blocker: A33-L design verdict.
| `synthesize` is method conduct, not schema or runtime axis. | Leave in skill-local home; use as input to `project` design only if that frontier needs it. | +| `src/agents/skills/methods/generate-proposal/references/oracle.md` | leave-as-is; skill-local-reference input; materialized | `proposal-meta-rubric.md`; skill-local progressive-checkability guidance | Reader: generate oracle-plane fan-out and fan-in. | `compose` and progressive-checkability language are method conduct, not schema fields, stored claim metadata, or runtime axes. | Materialized the accepted narrow verdict here: choose the weakest sufficient oracle artifact and name evidence breadth/blind spots without adding `checkability`, `strength`, kernel, or subtype schema. | + +## Closed reference dispositions ```pseudo -tree context-reference-candidates: +tree context-reference-dispositions: graph-authoring-heuristics.md: home: src/agents/contexts/references/ - status: materialized for capture + commit-graph shared authoring rules + status: materialized readers: - capture/SKILL.md - commit-graph/SKILL.md @@ -58,7 +58,7 @@ tree context-reference-candidates: d98_guard: method vocabulary allowed only as prompt conduct checkability-ladder.md: - status: rejected as shared runtime reference for now + status: rejected_as_shared_runtime_reference verdict: - no new context reference: only `generate-proposal/references/oracle.md` has a concrete present reader - no schema fields: do not add `checkability`, `strength`, `validTraces`, or `invalidTraces` to graph nodes @@ -70,40 +70,30 @@ tree context-reference-candidates: d98_guard: no new runtime lens/kernel state elicitation-question-hints.md: - home: src/agents/contexts/references/ only if multiple elicitor methods cite it - readers: - - capture / elicit-by-question / review-for-gaps candidates, pending scope - likely inputs: - - refreshed ELICITATION_QUESTIONS thesis and examples - - selected BEHAVIORAL_KERNELS question patterns, if still useful - blockers: - - 
refresh stale kind names and four-band model - - prevent catalog examples from becoming stored enums or hidden domain facts + status: deferred_to_future_scope + owner_tripwire: elicitation-gap-guidance or another scoped elicitor-question reader + reason: + - no current second reader needs reusable question patterns + - source examples need refresh against current kind names and D94-L bands before model-facing use + reopen_only_if: + - a scoped build names concrete readers and updates stale examples d98_guard: examples are prompt hints, not strategy/lens/method runtime state proposal-meta-rubric.md: - home: skill-local generate-proposal reference unless project creates a second reader - readers: - - generate-proposal now - - possible project capability later - likely inputs: - - ELICITATION_LENSES fan-out/fan-in and D31 meta-rubric material - - generate-proposal SKILL shared candidate constraints - blockers: - - wait for elicitor-project design verdict before making shared runtime reference + status: skill_local_unless_project_design_earns_shared_home + current_home: src/agents/skills/methods/generate-proposal/references/ + owner_tripwire: elicitor-project / A33-L + reason: + - generate-proposal is the only present reader + - project may become a second reader, but its shape is design-gated d98_guard: pick/synthesize/compose remain conduct, not schema/runtime fields projection-guidance.md: - home: unresolved; likely after elicitor-project design - readers: - - future project capability - - maybe generate-proposal design/oracle references - likely inputs: - - ELICITATION_LENSES project-requirements-from-upstream material - - ONTOLOGY_REVIEW_PROTOCOL method-as-detail/routing rationale - blockers: - - A33-L project design verdict - - decide whether projection is generate-with-upstream-input or distinct surface + status: deferred_to_elicitor-project + owner_tripwire: elicitor-project / A33-L + reason: + - projection has no current capability surface or prompt reader + - 
A33-L must decide whether projection folds into generate or becomes distinct d98_guard: no revival of project strategy/lens/method as user-changeable state ``` From fd3882450bcf09fb9f0c99e05c5a2e68a65b2d57 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:41:56 +0200 Subject: [PATCH 03/29] Plan graph-derived document outputs --- memory/PLAN.md | 7 ++++--- memory/SPEC.md | 3 +++ 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/memory/PLAN.md b/memory/PLAN.md index 57d99e544..4ebcd0f2d 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -161,12 +161,13 @@ context-pipeline/ - **Linear:** [FE-1091](https://linear.app/hash/issue/FE-1091/renderer-golden-coverage-and-prompt-assembly-lock) - **Branch:** `ln/fe-1091-renderer-golden-coverage-and-prompt-assembly-lock` - **Kind:** coverage + build / hardening -- **Status:** active; needs fresh `ln-scope` pass. +- **Status:** scoped; first sweep ledger active. - **Certainty:** earned — RENDER topology is now established; this frontier closes coverage, prompt assembly evidence, and stale topology ambiguity rather than proving a new seam. +- **Current execution pointer:** `memory/cards/renderer-golden-coverage--render-prompt-lock.md`. - **Closes:** context-pipeline RENDER stage plus the COMPOSE full-stack real-rendered-context tripwire. - **Locks in:** D83-L house style for model-facing context surfaces and prompt assembly as a golden/semantic-invariant surface. -- **Objective:** Finish the RENDER stage and lock system-prompt assembly as a golden surface. Remaining work lives by audience: model-facing context and prompt text under `src/agents/`, human/product text beside its app/session owner. 
Incidental prompt remodelling belongs here only when needed to make prompt assembly lockable: foreground prompts should collapse toward `elicitor` / `executor`, subagent prompt bodies should live as subagent resources, and `src/agents/` topology should make `contexts`, `prompts`, `runtime`, `shared`, `skills`, and `subagents` roles legible.
-- **Acceptance:** `src/agents/contexts/README.md`, `src/agents/prompts/README.md`, `src/agents/runtime/README.md`, `src/app/README.md`, and `src/session/README.md` carry the audience/topology split; required model-facing renderer rows are built in the house style and locked with focused goldens/semantic invariants; system prompt assembly is locked with goldens/semantic invariants; no adapter/transport imports enter `agents/contexts/`; prompt topology remodel deletes obsolete role/body aliases rather than preserving compatibility shims.
+- **Objective:** Finish the RENDER stage and lock system-prompt assembly as a golden surface. Remaining work lives by audience: model-facing context and prompt text under `src/agents/`, human/product text beside its app/session owner. Incidental prompt remodelling belongs here only when needed to make prompt assembly lockable: foreground prompts should collapse toward `elicitor` / `executor`, subagent prompt bodies should live as subagent resources, and `src/agents/` topology should make `contexts`, `prompts`, `runtime`, `shared`, `skills`, and `subagents` roles legible. This frontier also extends D83-L to thin graph-derived markdown document outputs for selected-spec and plan-plane material, as future web/download response sources.
+- **Acceptance:** `src/agents/contexts/README.md`, `src/agents/prompts/README.md`, `src/agents/runtime/README.md`, `src/app/README.md`, and `src/session/README.md` carry the audience/topology split; required model-facing renderer rows are built in the house style and locked with focused goldens/semantic invariants; system prompt assembly is locked with goldens/semantic invariants; selected-spec context moves from `contexts/specification/specification-context.ts` to `contexts/spec/spec-context.ts`; `contexts/spec/spec-output.ts` and `contexts/plan/plan-output.ts` use md-pen to render thin markdown-flattened outputs from graph/projection input rather than from `memory/SPEC.md` / `memory/PLAN.md`; no adapter/transport imports enter `agents/contexts/`; prompt topology remodel deletes obsolete role/body aliases rather than preserving compatibility shims.
 - **Traceability:** D19-L, D40-L, D52-L, D58-L, D60-L, D62-L, D83-L, D98-L.
 ### exchange-symmetry-audit
diff --git a/memory/SPEC.md b/memory/SPEC.md
index 138ffaf68..bce9db773 100644
--- a/memory/SPEC.md
+++ b/memory/SPEC.md
@@ -308,6 +308,7 @@ The POC's purpose is to prove three things: (a) that pi's coding-agent harness c
 - **Legibility-over-structure rule** — format is chosen by reader legibility, not by mirroring internal data shape: prose where raw structure would mislead (the anchored neighborhood projection deliberately renders relations as prose and forbids arrow/role-token vocabulary — already guarded by the no-structural-leak invariant); pseudo/graph notation is reserved for human/debug surfaces, never default LLM context.
- **Scope clustering** — the D60-L “agent context” subjects are regrouped along the `workspace → spec → session` hierarchy (D19-L): **``** (cwd scope) carries project identity, the documents tree, and the spec roster, and **carries no sessions**; **``** (selected-spec scope) carries the spec header/readiness, graph overview, anchored neighborhood, ranked elicitation gaps, the spec's **sessions** (a session binds to exactly one spec, D19-L), and future reconciliation needs; **``** (live-session scope) carries the runtime-posture frame, mentions, the world-update watermark, lifecycle, and recent transcript. The soft readiness line is computed over the selected spec's full elicitation-gap register, while the `Gaps` block renders only the ask-eligible ranked agenda; seed and `` context share the same renderer so their readiness numbers cannot diverge by caller-chosen population. - **Lexicon** — these LLM-facing context renders are distinct from the `workspace.state` product-state projection, which D60-L keeps for print/RPC/UI status; the `` context render is not `workspace.state`, and `renderWorkspaceState` stays the product-state renderer. (The earlier `` tag sketch is renamed `` to avoid the collision.) + - **Document outputs** — RENDER also owns graph-derived markdown document outputs: a selected-spec output and a plan output. These are flattened markdown documents produced from graph planes and projections, not from `memory/SPEC.md` / `memory/PLAN.md` copies; both use the md-pen wrapper for markdown generation. The selected-spec context/output home is `src/agents/contexts/spec/` (`spec-context.ts` for the `` context render, `spec-output.ts` for the document output), replacing the longer `specification/` path while preserving the rendered `` scope where appropriate. The plan document output lives at `src/agents/contexts/plan/plan-output.ts` and renders plan-plane material (`milestone`, `frontier`, `slice`) thinly at first. 
Future web/UI download routes call these renderers; they do not own a second document semantics layer. - **Dependencies** — three small leaf libraries (md-pen, `@toon-format/toon`, stringify-tree), each *retiring owned format-generation code* (the md-pen/TOON wrapper seams are already stubbed; the tree library replaces a hand-built formatter), so net owned surface decreases — the trade that justifies them under a dependencies-resist posture. - **Rollout** — incremental: ``, ``, graph, session runtime-frame, and structured-exchange result renders now live under `src/agents/contexts/`; transcript debug/report rendering lives in `src/session/transcript-markdown.ts` as a human/product debug artifact. - **Closed audit** — per-session `turnCount` is derived once while inspecting canonical session files and counts only current Pi v3 JSONL message entries (`type: "message"` with `message.role: "user" | "assistant"`); tool/custom entries are excluded, and downstream workspace/specification overview renders reuse that inspected count rather than reparsing the file. @@ -603,6 +604,8 @@ src/.pi/ | **Elicitation backlog** *(renamed)* | Former name for the elicitation-gaps register and its question-instance / `open|closed` model. Renamed and reconceived as **elicitation gap** (D65-L). | | **Unknown** *(adopted — D87-L)* | A first-class intent node kind (`kind: "unknown"`, label UNK): a *known-unknown* — a durable domain-epistemic gap currently uneconomical or impossible to answer, requiring strategic accommodation (assumptions, decisions, design/verification/planning) rather than elicitation. Carries real cross-plane edges, so it is a node kind, not a table. Distinct from `assumption` (which proceeds on a believed-but-unprovable value; an `unknown` cannot yet pick a value) and from the prospective `elicitation_gaps` register (D65-L, an unasked-but-answerable question the user could answer). Subsumes the prior prototype's `risk` framing. Implementation lands with FE-1052. 
| | **Spec kind** | The ownership relation of a spec to the codebase (`spec.kind = product | feature | function`, D89-L), a field on the spec record, **not** a graph node kind. `product` owns the whole codebase; `feature` owns a part and a cycle in a brownfield codebase; `function`/`library` captures (often formal) verification around a focused area. `feature` is spec scope, not a node — the intra-spec grouping is the `story` node. | +| **Spec output** | A graph-derived flattened markdown rendering of one selected spec, owned by `src/agents/contexts/spec/spec-output.ts` under D83-L. It is not `memory/SPEC.md` and must be produced from graph/projection input. | +| **Plan output** | A graph-derived flattened markdown rendering of plan-plane material, owned by `src/agents/contexts/plan/plan-output.ts` under D83-L. It is not `memory/PLAN.md` and starts thin over `milestone` / `frontier` / `slice` nodes until richer graph structure exists. | | **Story** | A first-class intent node kind (`kind: "story"`, `elicitation` band, D87-L): the intra-spec mid-level grouping, the Gherkin `Feature` expressed inside one spec. Reuses `composition` (story → requirement) and `witness`; adds no edge. The `kind: feature` spec vs `story` node duality is incidental (same concept at two granularities), a disambiguation not a smell. | | **Node detail form** | The `form`-discriminated payload union on the claim kinds `requirement`/`criterion`/`invariant` (`detail.form ∈ plain | gherkin | formal | given`, D88-L), the carrier for method-specific structure. **`kind` drives behavior; `form` is inert payload** — readiness band, edge legality, and source-questions key off `kind`, never `form`. One shared discriminant lets a lens query "all `formal`-form nodes" to round-trip a LEAN/Dafny file. Defaults from the active lens / `spec.kind`, overridable per-node. Axiom/given rides `context` + `form:"given"`. 
| | **Method as lens** | The closure rule (D87-L): a specification method (BDD, EDD, formal verification) is hosted on the one ontology as `spec.kind` + `detail.form` + a renderer + a heuristic-set — never its own node/edge kind. A method term that cannot map is a finding about the model, not a licence to add a kind. The heuristic-set is the method-differentiation layer (named follow-on: collate into one inlinable SoT). | From dc548a71f8ee3b970064878b1047c527dc573a30 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:42:13 +0200 Subject: [PATCH 04/29] Scope graph-derived document output rows --- .../cards/renderer-golden-coverage--render-prompt-lock.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md index 8284e49c7..a486c3d31 100644 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -48,7 +48,9 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is | Capability | Status | Req | Fill | Owner / next | Notes | | --- | --- | --- | --- | --- | --- | | Workspace context renderer | `have` | ● | earned | `src/agents/contexts/workspace/` | Snapshot coverage exists for cwd + overview context; preserve D83-L audience split. | -| Specification context renderer | `have` | ● | earned | `src/agents/contexts/specification/` | Snapshot coverage exists for selected-spec context. | +| Specification context renderer | `partial` | ● | earned | `src/agents/contexts/spec/` | Move `specification/specification-context.ts` to `spec/spec-context.ts`; closure oracle: imports, README, and snapshot tests name the short `spec/` home while the rendered tag remains ``. 
| +| Spec markdown document output | `new` | ● | earned | `src/agents/contexts/spec/spec-output.ts` | Thin graph-derived flattened markdown output using md-pen; not a copy of `memory/SPEC.md`. Future web/download routes are consumers, not owners. | +| Plan markdown document output | `new` | ● | earned | `src/agents/contexts/plan/plan-output.ts` | Thin graph-derived flattened markdown output over plan-plane nodes (`milestone`, `frontier`, `slice`) using md-pen; not a copy of `memory/PLAN.md`. | | Graph overview / neighborhood / related-node renderers | `have` | ● | earned | `src/agents/contexts/graph/` | Snapshot coverage exists for overview, neighborhoods, and related nodes; preserve code handles and no structural-leak assertions. | | Session runtime frame renderer | `partial` | ● | earned | `src/agents/contexts/session/` | Existing snapshot still displays D98-sensitive strategy/lens runtime wording. Closure oracle: runtime frame wording either removes that state or frames it strictly as prompt-resource/internal conduct, then updates the golden. | | Turn/origination seed renderers | `partial` | ● | earned | `src/agents/contexts/seeds/` | Existing tests are semantic asserts. Closure oracle: stable seed text is snapshot-locked or intentionally reduced to invariant asserts with a README note explaining why wording is not a golden contract. | @@ -89,8 +91,11 @@ src/agents/ │ ├── README.md ? │ ├── elicitation.ts ~ │ ├── references/ ? +│ ├── plan/ ? │ ├── seeds/ ? │ ├── session/ ? +│ ├── spec/ ? +│ ├── specification/ - │ └── exchanges/ ? ├── prompts/ ? 
├── runtime/ From f50c84b373a40e754d9e132213ed6a61cad9661d Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:45:32 +0200 Subject: [PATCH 05/29] Canonicalize foreground executor prompt body --- .../cards/renderer-golden-coverage--render-prompt-lock.md | 2 +- src/.pi/__tests__/architecture.test.ts | 2 +- .../extensions/__tests__/agent-runtime-runtime.test.ts | 8 ++++---- src/.pi/extensions/__tests__/registry.test.ts | 4 ++-- .../extensions/agent-runtime/orchestrator-stub/index.ts | 2 +- src/agents/__tests__/registry.test.ts | 2 +- src/agents/prompts/README.md | 2 +- src/agents/prompts/__tests__/prompt-bodies.test.ts | 4 ++-- src/agents/prompts/{orchestrator => executor}/SYSTEM.md | 4 ++-- src/agents/registry.ts | 2 +- src/agents/runtime/__tests__/state.test.ts | 4 ++-- src/agents/runtime/policy.ts | 8 ++++---- src/session/runtime-state.ts | 2 +- src/session/schema/kinds.ts | 2 +- 14 files changed, 24 insertions(+), 24 deletions(-) rename src/agents/prompts/{orchestrator => executor}/SYSTEM.md (56%) diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md index a486c3d31..a0f1bd98c 100644 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -36,7 +36,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is | Capability | Status | Req | Fill | Owner / next | Notes | | --- | --- | --- | --- | --- | --- | -| Foreground prompt-body topology is canonical | `partial` | ● | earned | `src/agents/registry.ts`, `src/agents/prompts/README.md`, prompt body tests | Closure oracle: one foreground body home matches the D98 target vocabulary (`elicitor` / executor direction) or a narrow explicit deferral says which CODE row owns the remaining rename. No compatibility alias from old role/body names remains without an owner. 
| +| Foreground prompt-body topology is canonical | `built` | ● | earned | `src/agents/registry.ts`, `src/agents/prompts/README.md`, prompt body tests | Closed: foreground prompt bodies now use D98 target ids (`elicitor` / `executor`); the old foreground `orchestrator` body/home is removed. The `orchestrator_stub` tool name remains owned by `orchestrator-tool-port`, not prompt-body topology. | | Background subagent body topology is canonical | `partial` | ● | earned | `src/.pi/extensions/subagents/agents.ts`, prompt/subagent topology docs | Closure oracle: background bodies are not ambiguously presented as foreground prompts; either rehome under an earned `src/agents/subagents/` surface or lock the current `prompts//SYSTEM.md` convention with README/tests that explain why no rehome is needed now. | | Foreground prompt assembly golden lock | `partial` | ● | earned | `src/agents/runtime/compose.ts`, `src/agents/runtime/__tests__/compose.test.ts`, snapshots | Existing elicitor preview snapshots are a start. Closure oracle: full provider-facing assembly has reviewed goldens/semantic invariants for the foreground role(s) this frontier owns, with stale readiness-grade/runtime-axis vocabulary guarded. | | Pi `before_agent_start` assembly path is wired to the same lock | `partial` | ● | earned | `src/.pi/extensions/agent-runtime/system-prompts/` tests | Closure oracle: adapter-level test proves Brunch body + world reads + active-tool legality feed `composeAgentPrompt`; no harness-only snapshot path that product assembly bypasses. 
| diff --git a/src/.pi/__tests__/architecture.test.ts b/src/.pi/__tests__/architecture.test.ts index ab555c53a..92709bfdc 100644 --- a/src/.pi/__tests__/architecture.test.ts +++ b/src/.pi/__tests__/architecture.test.ts @@ -17,7 +17,7 @@ const legacyImportNeedles = [ const runtimeRegistryExpectations = [ { file: 'src/session/schema/kinds.ts', - required: "export const AGENT_ROLE_IDS = ['elicitor', 'orchestrator'] as const;", + required: "export const AGENT_ROLE_IDS = ['elicitor', 'executor'] as const;", forbidden: ['reviewer', 'pi-coder'], }, { diff --git a/src/.pi/extensions/__tests__/agent-runtime-runtime.test.ts b/src/.pi/extensions/__tests__/agent-runtime-runtime.test.ts index 3cc470eac..e3cd1a4c9 100644 --- a/src/.pi/extensions/__tests__/agent-runtime-runtime.test.ts +++ b/src/.pi/extensions/__tests__/agent-runtime-runtime.test.ts @@ -113,20 +113,20 @@ describe('Brunch agent runtime-state projection', () => { expect(parseBrunchAgentState(executeState)).toEqual(executeState); expect(projectBrunchAgentState([runtimeEntry(executeState as BrunchAgentState)])).toMatchObject({ ...executeState, - agentRole: 'orchestrator', + agentRole: 'executor', operationalModeDefinition: { id: 'execute', foregroundAgent: { - id: 'orchestrator', + id: 'executor', kind: 'foreground', canDelegate: [], }, toolPolicy: { - id: 'execute-orchestrator', + id: 'execute-executor', }, }, agentRoleDefinition: { - id: 'orchestrator', + id: 'executor', kind: 'foreground', operationalMode: 'execute', }, diff --git a/src/.pi/extensions/__tests__/registry.test.ts b/src/.pi/extensions/__tests__/registry.test.ts index d0637c1ed..f5c35dcce 100644 --- a/src/.pi/extensions/__tests__/registry.test.ts +++ b/src/.pi/extensions/__tests__/registry.test.ts @@ -151,7 +151,7 @@ describe('Brunch explicit Pi extension registry', () => { expect(sessionStartIndexes[0]).toBeLessThan(sessionStartIndexes[1] ?? 
-1); }); - it('registers the orchestrator stub tool on the default product extension path', async () => { + it('registers the executor stub tool on the default product extension path', async () => { const registeredTools: Array<{ name: string; execute: (toolCallId: string, params: unknown) => Promise<{ content: readonly { text: string }[] }>; @@ -176,7 +176,7 @@ describe('Brunch explicit Pi extension registry', () => { const stub = registeredTools.find((tool) => tool.name === BRUNCH_ORCHESTRATOR_STUB_TOOL); expect(stub).toBeDefined(); await expect(stub!.execute('call-1', { message: 'standup' })).resolves.toMatchObject({ - content: [{ type: 'text', text: 'orchestrator stub ran: standup' }], + content: [{ type: 'text', text: 'executor stub ran: standup' }], }); }); diff --git a/src/.pi/extensions/agent-runtime/orchestrator-stub/index.ts b/src/.pi/extensions/agent-runtime/orchestrator-stub/index.ts index 6818125ad..7ee4b392a 100644 --- a/src/.pi/extensions/agent-runtime/orchestrator-stub/index.ts +++ b/src/.pi/extensions/agent-runtime/orchestrator-stub/index.ts @@ -30,7 +30,7 @@ export function createOrchestratorStubTool(): ToolDefinition< parameters: OrchestratorStubParams, async execute(_toolCallId, params, _signal, _onUpdate, _ctx) { return { - content: [{ type: 'text' as const, text: `orchestrator stub ran: ${params.message}` }], + content: [{ type: 'text' as const, text: `executor stub ran: ${params.message}` }], details: { message: params.message }, }; }, diff --git a/src/agents/__tests__/registry.test.ts b/src/agents/__tests__/registry.test.ts index 1f6acaead..84731c2f1 100644 --- a/src/agents/__tests__/registry.test.ts +++ b/src/agents/__tests__/registry.test.ts @@ -13,7 +13,7 @@ describe('agent context registry', () => { it('centralizes bundled prompt and current skill paths', () => { expect(BUNDLED_AGENT_BODY_IDS).toEqual([ 'elicitor', - 'orchestrator', + 'executor', 'explorer', 'researcher', 'projector', diff --git a/src/agents/prompts/README.md 
b/src/agents/prompts/README.md index 7a88328c9..70fe80d72 100644 --- a/src/agents/prompts/README.md +++ b/src/agents/prompts/README.md @@ -10,7 +10,7 @@ Keyed foreground and background agent body resources — the markdown persona te prompts/ ├── README.md ├── elicitor/SYSTEM.md foreground elicit-mode body -├── orchestrator/SYSTEM.md foreground execute-mode body +├── executor/SYSTEM.md foreground execute-mode body ├── explorer/SYSTEM.md background codebase recon body + frontmatter ├── researcher/SYSTEM.md background web-research body + frontmatter ├── projector/SYSTEM.md background candidate-proposal body + frontmatter diff --git a/src/agents/prompts/__tests__/prompt-bodies.test.ts b/src/agents/prompts/__tests__/prompt-bodies.test.ts index 1df91287f..c4d56e030 100644 --- a/src/agents/prompts/__tests__/prompt-bodies.test.ts +++ b/src/agents/prompts/__tests__/prompt-bodies.test.ts @@ -13,8 +13,8 @@ const agentDefinitionExpectations = [ needles: ['# Agent: elicitor', 'multi-spec discipline'], }, { - system: 'src/agents/prompts/orchestrator/SYSTEM.md', - needles: ['# Agent: orchestrator', 'execute mode'], + system: 'src/agents/prompts/executor/SYSTEM.md', + needles: ['# Agent: executor', 'execute mode'], }, { system: 'src/agents/prompts/reviewer/SYSTEM.md', diff --git a/src/agents/prompts/orchestrator/SYSTEM.md b/src/agents/prompts/executor/SYSTEM.md similarity index 56% rename from src/agents/prompts/orchestrator/SYSTEM.md rename to src/agents/prompts/executor/SYSTEM.md index 83b9f0948..d1e349842 100644 --- a/src/agents/prompts/orchestrator/SYSTEM.md +++ b/src/agents/prompts/executor/SYSTEM.md @@ -1,5 +1,5 @@ -# Agent: orchestrator +# Agent: executor -The orchestrator is the foreground Brunch session agent for execute mode. In this branch it proves the execute-mode path by calling the code-owned `orchestrator_stub` tool and reporting its deterministic output. +The executor is the foreground Brunch session agent for execute mode. 
In this branch it proves the execute-mode path by calling the code-owned `orchestrator_stub` tool and reporting its deterministic output. Stay inside the current selected spec and session context. Do not call shell or file-writing tools; execute mode blocks direct `bash`, `edit`, and `write` access. This branch has no delegated workers yet, so treat `canDelegate = []` as a hard boundary and use the stub tool directly for the standup proof. diff --git a/src/agents/registry.ts b/src/agents/registry.ts index 2f35ac6d9..c9e796072 100644 --- a/src/agents/registry.ts +++ b/src/agents/registry.ts @@ -2,7 +2,7 @@ import { fileURLToPath } from 'node:url'; export const BUNDLED_AGENT_BODY_IDS = [ 'elicitor', - 'orchestrator', + 'executor', 'explorer', 'researcher', 'projector', diff --git a/src/agents/runtime/__tests__/state.test.ts b/src/agents/runtime/__tests__/state.test.ts index 26269385c..514a16458 100644 --- a/src/agents/runtime/__tests__/state.test.ts +++ b/src/agents/runtime/__tests__/state.test.ts @@ -253,7 +253,7 @@ describe('agent posture policy', () => { ]); }); - it('activates the orchestrator stub only in execute mode', () => { + it('activates the executor stub only in execute mode', () => { const executeState = projectBrunchAgentState([ { type: 'custom', @@ -284,7 +284,7 @@ describe('agent posture policy', () => { gaps: groundingFloorGaps({ defaultCoverage: 0 }), }); - expect(executeState.agentRole).toBe('orchestrator'); + expect(executeState.agentRole).toBe('executor'); expect(delegatableAgentsForRuntimeState(executeState)).toEqual([]); expect(executeTools).toContain(BRUNCH_ORCHESTRATOR_STUB_TOOL); expect(executeTools).not.toEqual(expect.arrayContaining(['bash', 'edit', 'write'])); diff --git a/src/agents/runtime/policy.ts b/src/agents/runtime/policy.ts index 43043e8ae..e8e8acc5c 100644 --- a/src/agents/runtime/policy.ts +++ b/src/agents/runtime/policy.ts @@ -72,7 +72,7 @@ export const FOREGROUND_AGENT_ROSTER: Record Date: Fri, 26 Jun 2026 15:46:23 +0200 
Subject: [PATCH 06/29] Lock background subagent body topology
---
 memory/cards/renderer-golden-coverage--render-prompt-lock.md | 2 +-
 src/agents/prompts/README.md | 5 +++--
 src/agents/prompts/__tests__/prompt-bodies.test.ts | 2 ++
 3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md
index a0f1bd98c..566afae88 100644
--- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md
+++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md
@@ -37,7 +37,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is
 | Capability | Status | Req | Fill | Owner / next | Notes |
 | --- | --- | --- | --- | --- | --- |
 | Foreground prompt-body topology is canonical | `built` | ● | earned | `src/agents/registry.ts`, `src/agents/prompts/README.md`, prompt body tests | Closed: foreground prompt bodies now use D98 target ids (`elicitor` / `executor`); the old foreground `orchestrator` body/home is removed. The `orchestrator_stub` tool name remains owned by `orchestrator-tool-port`, not prompt-body topology. |
-| Background subagent body topology is canonical | `partial` | ● | earned | `src/.pi/extensions/subagents/agents.ts`, prompt/subagent topology docs | Closure oracle: background bodies are not ambiguously presented as foreground prompts; either rehome under an earned `src/agents/subagents/` surface or lock the current `prompts//SYSTEM.md` convention with README/tests that explain why no rehome is needed now. |
+| Background subagent body topology is canonical | `built` | ● | earned | `src/.pi/extensions/subagents/agents.ts`, prompt/subagent topology docs | Closed: background bodies intentionally stay under `src/agents/prompts//SYSTEM.md` as shared manifest body files, while `BACKGROUND_SUBAGENT_IDS` owns spawnability; README/tests now state that they are subagent resources, not foreground prompts. |
 | Foreground prompt assembly golden lock | `partial` | ● | earned | `src/agents/runtime/compose.ts`, `src/agents/runtime/__tests__/compose.test.ts`, snapshots | Existing elicitor preview snapshots are a start. Closure oracle: full provider-facing assembly has reviewed goldens/semantic invariants for the foreground role(s) this frontier owns, with stale readiness-grade/runtime-axis vocabulary guarded. |
 | Pi `before_agent_start` assembly path is wired to the same lock | `partial` | ● | earned | `src/.pi/extensions/agent-runtime/system-prompts/` tests | Closure oracle: adapter-level test proves Brunch body + world reads + active-tool legality feed `composeAgentPrompt`; no harness-only snapshot path that product assembly bypasses. |
 | Background subagent prompt assembly golden lock | `partial` | ● | earned | `src/.pi/extensions/subagents/prompt-assembly.ts`, `src/.pi/extensions/__tests__/subagents.test.ts` | Existing assertions prove sealing/tool grants. Closure oracle: file snapshot or equivalent semantic invariant locks assembled child prompt shape, injected-world snapshot, no foreground-only sections, no ambient Pi resources. |
diff --git a/src/agents/prompts/README.md b/src/agents/prompts/README.md
index 70fe80d72..e262c02c4 100644
--- a/src/agents/prompts/README.md
+++ b/src/agents/prompts/README.md
@@ -4,13 +4,13 @@ SPEC decisions: D25-L, D40-L, D58-L, D85-L, D90-L, D91-L, D93-L
 ## Owns
-Keyed foreground and background agent body resources — the markdown persona text a Brunch agent contributes to its system prompt.
+Keyed foreground and background agent body resources — the markdown persona text a Brunch agent contributes to its system prompt. Background bodies intentionally stay here instead of a parallel `src/agents/subagents/` home because foreground and background manifests share the same `AgentManifest.body` file convention; spawnability is still owned by the subagent registry, not by this directory.
```text prompts/ ├── README.md ├── elicitor/SYSTEM.md foreground elicit-mode body -├── executor/SYSTEM.md foreground execute-mode body +├── executor/SYSTEM.md foreground execute-mode body ├── explorer/SYSTEM.md background codebase recon body + frontmatter ├── researcher/SYSTEM.md background web-research body + frontmatter ├── projector/SYSTEM.md background candidate-proposal body + frontmatter @@ -23,6 +23,7 @@ This directory is markdown-only. It carries no TypeScript and registers no Pi ho ## Prompt-shape decisions - **SYSTEM.md convention is adopted:** foreground and background agent bodies use `src/agents/prompts//SYSTEM.md`. +- **Background bodies are subagent resources, not foreground prompts:** `explorer`, `researcher`, `projector`, and `reviewer` are loaded only through the explicit `BACKGROUND_SUBAGENT_IDS` registry in `src/.pi/extensions/subagents/agents.ts`; keeping their markdown beside foreground bodies is a shared body-file convention, not foreground availability. - **Background frontmatter is authoring DX:** background `SYSTEM.md` files carry `name`/`description`/`tools`/`model`/`thinking`, but the code-owned registry decides which ids exist. Unlisted directories are not spawnable. 
## Does NOT own diff --git a/src/agents/prompts/__tests__/prompt-bodies.test.ts b/src/agents/prompts/__tests__/prompt-bodies.test.ts index c4d56e030..06e8dfb3c 100644 --- a/src/agents/prompts/__tests__/prompt-bodies.test.ts +++ b/src/agents/prompts/__tests__/prompt-bodies.test.ts @@ -59,6 +59,8 @@ describe('agent prompt bodies', () => { const readme = await readFile(join(projectRoot, 'src/agents/prompts/README.md'), 'utf8'); expect(readme).toContain('SYSTEM.md convention is adopted'); + expect(readme).toContain('Background bodies are subagent resources, not foreground prompts'); + expect(readme).toContain('BACKGROUND_SUBAGENT_IDS'); expect(readme).toContain('Background frontmatter is authoring DX'); expect(readme).toContain('Unlisted directories are not spawnable'); }); From 744f706a823a5bf502e73013c5576e5def1027b9 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:47:23 +0200 Subject: [PATCH 07/29] Lock foreground prompt assembly wording --- .../renderer-golden-coverage--render-prompt-lock.md | 2 +- .../__snapshots__/elicitor--auto-floor-gaps-open.md | 11 ++++++----- .../__snapshots__/elicitor--auto-high-coverage.md | 11 ++++++----- .../__snapshots__/elicitor--pinned-strategy-lens.md | 11 ++++++----- .../runtime/__snapshots__/elicitor--pushed-context.md | 11 ++++++----- src/agents/runtime/__tests__/compose.test.ts | 5 ++++- src/agents/runtime/compose.ts | 11 ++++++----- 7 files changed, 35 insertions(+), 27 deletions(-) diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md index 566afae88..9ec87ae01 100644 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -38,7 +38,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is | --- | --- | --- | --- | --- | --- | | Foreground prompt-body topology is canonical | `built` | ● | earned | `src/agents/registry.ts`, 
`src/agents/prompts/README.md`, prompt body tests | Closed: foreground prompt bodies now use D98 target ids (`elicitor` / `executor`); the old foreground `orchestrator` body/home is removed. The `orchestrator_stub` tool name remains owned by `orchestrator-tool-port`, not prompt-body topology. | | Background subagent body topology is canonical | `built` | ● | earned | `src/.pi/extensions/subagents/agents.ts`, prompt/subagent topology docs | Closed: background bodies intentionally stay under `src/agents/prompts//SYSTEM.md` as shared manifest body files, while `BACKGROUND_SUBAGENT_IDS` owns spawnability; README/tests now state that they are subagent resources, not foreground prompts. | -| Foreground prompt assembly golden lock | `partial` | ● | earned | `src/agents/runtime/compose.ts`, `src/agents/runtime/__tests__/compose.test.ts`, snapshots | Existing elicitor preview snapshots are a start. Closure oracle: full provider-facing assembly has reviewed goldens/semantic invariants for the foreground role(s) this frontier owns, with stale readiness-grade/runtime-axis vocabulary guarded. | +| Foreground prompt assembly golden lock | `built` | ● | earned | `src/agents/runtime/compose.ts`, `src/agents/runtime/__tests__/compose.test.ts`, snapshots | Closed: elicitor provider-facing assembly has file snapshots plus semantic invariants; prompt-resource strategy/lens wording is framed as routing hints rather than user-changeable foreground identity, and readiness-grade/goal-axis regressions are guarded. | | Pi `before_agent_start` assembly path is wired to the same lock | `partial` | ● | earned | `src/.pi/extensions/agent-runtime/system-prompts/` tests | Closure oracle: adapter-level test proves Brunch body + world reads + active-tool legality feed `composeAgentPrompt`; no harness-only snapshot path that product assembly bypasses. 
| | Background subagent prompt assembly golden lock | `partial` | ● | earned | `src/.pi/extensions/subagents/prompt-assembly.ts`, `src/.pi/extensions/__tests__/subagents.test.ts` | Existing assertions prove sealing/tool grants. Closure oracle: file snapshot or equivalent semantic invariant locks assembled child prompt shape, injected-world snapshot, no foreground-only sections, no ambient Pi resources. | | Context-reference harvest closure | `built` | ● | earned | `src/agents/docs/context-reference-harvest.md`, `src/agents/contexts/references/`, skill-local `references/` | Closed: materialized graph-authoring / ontology / oracle homes stay in their current owners; checkability/shared subtype candidates are rejected; elicitation-question hints are deferred to a future scoped reader; proposal/projection candidates are deferred to A33-L/`elicitor-project`. | diff --git a/src/agents/runtime/__snapshots__/elicitor--auto-floor-gaps-open.md b/src/agents/runtime/__snapshots__/elicitor--auto-floor-gaps-open.md index b75114959..639e8c0a9 100644 --- a/src/agents/runtime/__snapshots__/elicitor--auto-floor-gaps-open.md +++ b/src/agents/runtime/__snapshots__/elicitor--auto-floor-gaps-open.md @@ -11,8 +11,8 @@ Preview role body from `src/agents/prompts/elicitor/SYSTEM.md`. [Brunch runtime state] - op_mode: elicit -- strategy: auto -- lens: auto +- prompt strategy resource: auto +- prompt lens resource: auto - spec: COMPOSE Preview Spec (#101), readiness estimate (soft; gates nothing): grounding=0.00, elicitation=0.00, projection=0.00, commitment=0.00 - workspace: /work/brunch-preview - workspace posture: certainty=proving; stakes=high; audience=internal; horizon=current-milestone; migration=free-rewrite; dependencies=resist @@ -107,7 +107,8 @@ When a skill file references a relative path, resolve it against the skill direc [Brunch prompt-resource routing] - Use only resources advertised in ; do not infer availability from the filesystem. 
-- Strategy and lens are AUTO/pinnable axes: choose at most one advertised strategy and at most one advertised lens, then read the selected resource before applying detailed behavior. +- Strategy and lens names are prompt-resource routing hints, not user-changeable session identity or stored foreground-agent roles. +- When AUTO exposes several strategy or lens resources, choose at most one advertised resource of each kind, then read the selected resource before applying detailed behavior. - Methods compose freely when advertised; read a method skill when that mechanism is relevant to the next turn. -- For pinned axes, the singleton skill of that kind is the selected resource. -- Current pins: strategy=auto; lens=auto. \ No newline at end of file +- For code-selected singleton resources, that singleton is the selected resource. +- Current prompt-resource selection: strategy=auto; lens=auto. \ No newline at end of file diff --git a/src/agents/runtime/__snapshots__/elicitor--auto-high-coverage.md b/src/agents/runtime/__snapshots__/elicitor--auto-high-coverage.md index b3ee23b73..ffa0b169e 100644 --- a/src/agents/runtime/__snapshots__/elicitor--auto-high-coverage.md +++ b/src/agents/runtime/__snapshots__/elicitor--auto-high-coverage.md @@ -11,8 +11,8 @@ Preview role body from `src/agents/prompts/elicitor/SYSTEM.md`. 
[Brunch runtime state] - op_mode: elicit -- strategy: auto -- lens: auto +- prompt strategy resource: auto +- prompt lens resource: auto - spec: COMPOSE Preview Spec (#101), readiness estimate (soft; gates nothing): grounding=1.00, elicitation=0.00, projection=0.00, commitment=0.00 - workspace: /work/brunch-preview - workspace posture: certainty=proving; stakes=high; audience=internal; horizon=current-milestone; migration=free-rewrite; dependencies=resist @@ -120,7 +120,8 @@ When a skill file references a relative path, resolve it against the skill direc [Brunch prompt-resource routing] - Use only resources advertised in ; do not infer availability from the filesystem. -- Strategy and lens are AUTO/pinnable axes: choose at most one advertised strategy and at most one advertised lens, then read the selected resource before applying detailed behavior. +- Strategy and lens names are prompt-resource routing hints, not user-changeable session identity or stored foreground-agent roles. +- When AUTO exposes several strategy or lens resources, choose at most one advertised resource of each kind, then read the selected resource before applying detailed behavior. - Methods compose freely when advertised; read a method skill when that mechanism is relevant to the next turn. -- For pinned axes, the singleton skill of that kind is the selected resource. -- Current pins: strategy=auto; lens=auto. \ No newline at end of file +- For code-selected singleton resources, that singleton is the selected resource. +- Current prompt-resource selection: strategy=auto; lens=auto. 
\ No newline at end of file diff --git a/src/agents/runtime/__snapshots__/elicitor--pinned-strategy-lens.md b/src/agents/runtime/__snapshots__/elicitor--pinned-strategy-lens.md index c7ff32d7f..9bc61e7c9 100644 --- a/src/agents/runtime/__snapshots__/elicitor--pinned-strategy-lens.md +++ b/src/agents/runtime/__snapshots__/elicitor--pinned-strategy-lens.md @@ -11,8 +11,8 @@ Preview role body from `src/agents/prompts/elicitor/SYSTEM.md`. [Brunch runtime state] - op_mode: elicit -- strategy: step-wise-disambiguate -- lens: design +- prompt strategy resource: step-wise-disambiguate +- prompt lens resource: design - spec: COMPOSE Preview Spec (#101), readiness estimate (soft; gates nothing): grounding=1.00, elicitation=0.00, projection=0.00, commitment=0.00 - workspace: /work/brunch-preview - workspace posture: certainty=proving; stakes=high; audience=internal; horizon=current-milestone; migration=free-rewrite; dependencies=resist @@ -102,7 +102,8 @@ When a skill file references a relative path, resolve it against the skill direc [Brunch prompt-resource routing] - Use only resources advertised in ; do not infer availability from the filesystem. -- Strategy and lens are AUTO/pinnable axes: choose at most one advertised strategy and at most one advertised lens, then read the selected resource before applying detailed behavior. +- Strategy and lens names are prompt-resource routing hints, not user-changeable session identity or stored foreground-agent roles. +- When AUTO exposes several strategy or lens resources, choose at most one advertised resource of each kind, then read the selected resource before applying detailed behavior. - Methods compose freely when advertised; read a method skill when that mechanism is relevant to the next turn. -- For pinned axes, the singleton skill of that kind is the selected resource. -- Current pins: strategy=step-wise-disambiguate; lens=design. 
\ No newline at end of file +- For code-selected singleton resources, that singleton is the selected resource. +- Current prompt-resource selection: strategy=step-wise-disambiguate; lens=design. \ No newline at end of file diff --git a/src/agents/runtime/__snapshots__/elicitor--pushed-context.md b/src/agents/runtime/__snapshots__/elicitor--pushed-context.md index 6d95ec63c..d64ae6db6 100644 --- a/src/agents/runtime/__snapshots__/elicitor--pushed-context.md +++ b/src/agents/runtime/__snapshots__/elicitor--pushed-context.md @@ -11,8 +11,8 @@ Preview role body from `src/agents/prompts/elicitor/SYSTEM.md`. [Brunch runtime state] - op_mode: elicit -- strategy: auto -- lens: auto +- prompt strategy resource: auto +- prompt lens resource: auto - spec: COMPOSE Preview Spec (#101), readiness estimate (soft; gates nothing): grounding=0.00, elicitation=0.00, projection=0.00, commitment=0.00 - workspace: /work/brunch-preview - workspace posture: certainty=proving; stakes=high; audience=internal; horizon=current-milestone; migration=free-rewrite; dependencies=resist @@ -112,7 +112,8 @@ When a skill file references a relative path, resolve it against the skill direc [Brunch prompt-resource routing] - Use only resources advertised in ; do not infer availability from the filesystem. -- Strategy and lens are AUTO/pinnable axes: choose at most one advertised strategy and at most one advertised lens, then read the selected resource before applying detailed behavior. +- Strategy and lens names are prompt-resource routing hints, not user-changeable session identity or stored foreground-agent roles. +- When AUTO exposes several strategy or lens resources, choose at most one advertised resource of each kind, then read the selected resource before applying detailed behavior. - Methods compose freely when advertised; read a method skill when that mechanism is relevant to the next turn. -- For pinned axes, the singleton skill of that kind is the selected resource. 
-- Current pins: strategy=auto; lens=auto. \ No newline at end of file +- For code-selected singleton resources, that singleton is the selected resource. +- Current prompt-resource selection: strategy=auto; lens=auto. \ No newline at end of file diff --git a/src/agents/runtime/__tests__/compose.test.ts b/src/agents/runtime/__tests__/compose.test.ts index a34411a6b..8f08eabe7 100644 --- a/src/agents/runtime/__tests__/compose.test.ts +++ b/src/agents/runtime/__tests__/compose.test.ts @@ -307,7 +307,7 @@ describe('composeAgentPrompt', () => { }); expect(result.prompt).not.toMatch(/- goal:/); - expect(result.prompt).toContain('- strategy: step-wise-disambiguate'); + expect(result.prompt).toContain('- prompt strategy resource: step-wise-disambiguate'); expect(Object.keys(result.manifests)).toEqual(['strategies', 'lenses', 'methods']); expect(result.manifests.strategies.map((entry) => entry.name)).toEqual(['step-wise-disambiguate']); // D86-L: commit-graph + generate-proposal are floor (graph-write is never readiness-gated); @@ -452,6 +452,9 @@ function expectPromptContracts(rendered: string): void { expect(rendered).toContain(''); expect(rendered).not.toMatch(/\bgoal=/); expect(rendered).not.toMatch(/- goal:/); + expect(rendered).not.toContain('- strategy:'); + expect(rendered).not.toContain('- lens:'); + expect(rendered).toContain('prompt-resource routing hints, not user-changeable session identity'); } describe('composeAgentPrompt previews', () => { diff --git a/src/agents/runtime/compose.ts b/src/agents/runtime/compose.ts index 918f3d8eb..7a9774148 100644 --- a/src/agents/runtime/compose.ts +++ b/src/agents/runtime/compose.ts @@ -68,8 +68,8 @@ function renderRuntimeState(input: ComposeAgentPromptInput): string { return [ '[Brunch runtime state]', `- op_mode: ${input.sessionState.operationalMode}`, - `- strategy: ${input.sessionState.agentStrategy}`, - `- lens: ${input.sessionState.agentLens}`, + `- prompt strategy resource: ${input.sessionState.agentStrategy}`, + `- 
prompt lens resource: ${input.sessionState.agentLens}`, `- spec: ${input.spec.name} (#${input.spec.id}), ${renderSoftReadinessEstimate(input.gaps)}`, `- workspace: ${input.workspace.cwd}`, `- workspace posture: ${renderPosture(input.workspace.posture)}`, @@ -122,10 +122,11 @@ function renderRouterRules(state: ResolvedBrunchAgentState): string { return [ '[Brunch prompt-resource routing]', '- Use only resources advertised in ; do not infer availability from the filesystem.', - '- Strategy and lens are AUTO/pinnable axes: choose at most one advertised strategy and at most one advertised lens, then read the selected resource before applying detailed behavior.', + '- Strategy and lens names are prompt-resource routing hints, not user-changeable session identity or stored foreground-agent roles.', + '- When AUTO exposes several strategy or lens resources, choose at most one advertised resource of each kind, then read the selected resource before applying detailed behavior.', '- Methods compose freely when advertised; read a method skill when that mechanism is relevant to the next turn.', - '- For pinned axes, the singleton skill of that kind is the selected resource.', - `- Current pins: strategy=${state.agentStrategy}; lens=${state.agentLens}.`, + '- For code-selected singleton resources, that singleton is the selected resource.', + `- Current prompt-resource selection: strategy=${state.agentStrategy}; lens=${state.agentLens}.`, ].join('\n'); } From 23cabc9c3d50b28ca3b0b0d5c444493bb4c3e4d8 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:48:22 +0200 Subject: [PATCH 08/29] Lock before-agent-start prompt assembly path --- .../renderer-golden-coverage--render-prompt-lock.md | 2 +- .../__tests__/agent-runtime-system-prompts.test.ts | 10 +++++----- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md index 
9ec87ae01..52f9c340e 100644 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -39,7 +39,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is | Foreground prompt-body topology is canonical | `built` | ● | earned | `src/agents/registry.ts`, `src/agents/prompts/README.md`, prompt body tests | Closed: foreground prompt bodies now use D98 target ids (`elicitor` / `executor`); the old foreground `orchestrator` body/home is removed. The `orchestrator_stub` tool name remains owned by `orchestrator-tool-port`, not prompt-body topology. | | Background subagent body topology is canonical | `built` | ● | earned | `src/.pi/extensions/subagents/agents.ts`, prompt/subagent topology docs | Closed: background bodies intentionally stay under `src/agents/prompts//SYSTEM.md` as shared manifest body files, while `BACKGROUND_SUBAGENT_IDS` owns spawnability; README/tests now state that they are subagent resources, not foreground prompts. | | Foreground prompt assembly golden lock | `built` | ● | earned | `src/agents/runtime/compose.ts`, `src/agents/runtime/__tests__/compose.test.ts`, snapshots | Closed: elicitor provider-facing assembly has file snapshots plus semantic invariants; prompt-resource strategy/lens wording is framed as routing hints rather than user-changeable foreground identity, and readiness-grade/goal-axis regressions are guarded. | -| Pi `before_agent_start` assembly path is wired to the same lock | `partial` | ● | earned | `src/.pi/extensions/agent-runtime/system-prompts/` tests | Closure oracle: adapter-level test proves Brunch body + world reads + active-tool legality feed `composeAgentPrompt`; no harness-only snapshot path that product assembly bypasses. 
| +| Pi `before_agent_start` assembly path is wired to the same lock | `built` | ● | earned | `src/.pi/extensions/agent-runtime/system-prompts/` tests | Closed: adapter-level tests exercise the live `before_agent_start` path through Brunch body loading, selected-world reads, runtime-state projection, active-tool filtering, and `composeAgentPrompt` wording. | | Background subagent prompt assembly golden lock | `partial` | ● | earned | `src/.pi/extensions/subagents/prompt-assembly.ts`, `src/.pi/extensions/__tests__/subagents.test.ts` | Existing assertions prove sealing/tool grants. Closure oracle: file snapshot or equivalent semantic invariant locks assembled child prompt shape, injected-world snapshot, no foreground-only sections, no ambient Pi resources. | | Context-reference harvest closure | `built` | ● | earned | `src/agents/docs/context-reference-harvest.md`, `src/agents/contexts/references/`, skill-local `references/` | Closed: materialized graph-authoring / ontology / oracle homes stay in their current owners; checkability/shared subtype candidates are rejected; elicitation-question hints are deferred to a future scoped reader; proposal/projection candidates are deferred to A33-L/`elicitor-project`. 
| diff --git a/src/.pi/extensions/__tests__/agent-runtime-system-prompts.test.ts b/src/.pi/extensions/__tests__/agent-runtime-system-prompts.test.ts index 6430d759c..ad1f1b09e 100644 --- a/src/.pi/extensions/__tests__/agent-runtime-system-prompts.test.ts +++ b/src/.pi/extensions/__tests__/agent-runtime-system-prompts.test.ts @@ -139,8 +139,8 @@ describe('Brunch prompt-pack topology', () => { expect(result.prompt).toContain('[Brunch agent control]'); expect(result.prompt).toContain('- op_mode: elicit'); expect(result.prompt).not.toMatch(/- goal:/); - expect(result.prompt).toContain('- strategy: step-wise-decision-tree'); - expect(result.prompt).toContain('- lens: intent'); + expect(result.prompt).toContain('- prompt strategy resource: step-wise-decision-tree'); + expect(result.prompt).toContain('- prompt lens resource: intent'); expect(result.prompt).not.toContain(''); expect(result.prompt).toContain(''); expect(result.prompt).toContain('strategy'); @@ -196,7 +196,7 @@ describe('Brunch prompt-pack topology', () => { systemPrompt: expect.stringContaining('[Brunch agent control]'), }); expect(result).toMatchObject({ - systemPrompt: expect.stringContaining('- strategy: step-wise-disambiguate'), + systemPrompt: expect.stringContaining('- prompt strategy resource: step-wise-disambiguate'), }); expect(result).toMatchObject({ systemPrompt: expect.stringContaining('- active tools: read, grep, present_question, request_response'), @@ -394,10 +394,10 @@ describe('Brunch prompt-pack topology', () => { elicitFloorTools, ]); expect(defaultPrompt).toMatchObject({ - systemPrompt: expect.stringContaining('- strategy: auto'), + systemPrompt: expect.stringContaining('- prompt strategy resource: auto'), }); expect(switchedPrompt).toMatchObject({ - systemPrompt: expect.stringContaining('- strategy: step-wise-decision-tree'), + systemPrompt: expect.stringContaining('- prompt strategy resource: step-wise-decision-tree'), }); expect(defaultPrompt).toMatchObject({ systemPrompt: 
expect.stringContaining( From a7dd6f90c6a68b48bdf69912dd6fdedac28496a7 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:49:12 +0200 Subject: [PATCH 09/29] Lock background subagent prompt assembly --- ...rer-golden-coverage--render-prompt-lock.md | 2 +- .../__snapshots__/subagent-explorer-prompt.md | 28 +++++++++++++++++++ .../extensions/__tests__/subagents.test.ts | 18 ++++++++++++ 3 files changed, 47 insertions(+), 1 deletion(-) create mode 100644 src/.pi/extensions/__snapshots__/subagent-explorer-prompt.md diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md index 52f9c340e..db3cfbd65 100644 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -40,7 +40,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is | Background subagent body topology is canonical | `built` | ● | earned | `src/.pi/extensions/subagents/agents.ts`, prompt/subagent topology docs | Closed: background bodies intentionally stay under `src/agents/prompts//SYSTEM.md` as shared manifest body files, while `BACKGROUND_SUBAGENT_IDS` owns spawnability; README/tests now state that they are subagent resources, not foreground prompts. | | Foreground prompt assembly golden lock | `built` | ● | earned | `src/agents/runtime/compose.ts`, `src/agents/runtime/__tests__/compose.test.ts`, snapshots | Closed: elicitor provider-facing assembly has file snapshots plus semantic invariants; prompt-resource strategy/lens wording is framed as routing hints rather than user-changeable foreground identity, and readiness-grade/goal-axis regressions are guarded. 
| | Pi `before_agent_start` assembly path is wired to the same lock | `built` | ● | earned | `src/.pi/extensions/agent-runtime/system-prompts/` tests | Closed: adapter-level tests exercise the live `before_agent_start` path through Brunch body loading, selected-world reads, runtime-state projection, active-tool filtering, and `composeAgentPrompt` wording. | -| Background subagent prompt assembly golden lock | `partial` | ● | earned | `src/.pi/extensions/subagents/prompt-assembly.ts`, `src/.pi/extensions/__tests__/subagents.test.ts` | Existing assertions prove sealing/tool grants. Closure oracle: file snapshot or equivalent semantic invariant locks assembled child prompt shape, injected-world snapshot, no foreground-only sections, no ambient Pi resources. | +| Background subagent prompt assembly golden lock | `built` | ● | earned | `src/.pi/extensions/subagents/prompt-assembly.ts`, `src/.pi/extensions/__tests__/subagents.test.ts` | Closed: explorer child prompt assembly has a file snapshot plus semantic invariants for body/control/injected-world/background routing, no foreground elicitation section, and sealed ambient Pi resources. | | Context-reference harvest closure | `built` | ● | earned | `src/agents/docs/context-reference-harvest.md`, `src/agents/contexts/references/`, skill-local `references/` | Closed: materialized graph-authoring / ontology / oracle homes stay in their current owners; checkability/shared subtype candidates are rejected; elicitation-question hints are deferred to a future scoped reader; proposal/projection candidates are deferred to A33-L/`elicitor-project`. | ## Ledger — model-facing context renderers diff --git a/src/.pi/extensions/__snapshots__/subagent-explorer-prompt.md b/src/.pi/extensions/__snapshots__/subagent-explorer-prompt.md new file mode 100644 index 000000000..689f0a539 --- /dev/null +++ b/src/.pi/extensions/__snapshots__/subagent-explorer-prompt.md @@ -0,0 +1,28 @@ +You are an explorer. 
+
+[Brunch background subagent control]
+- agent: explorer
+- host: sealed SDK child session
+- delegated task: delivered as the first user message
+- world view: explicit app-root snapshot at spawn plus granted read tools
+- ambient Pi resources: sealed out; do not infer resources from ~/.pi or project .pi discovery
+- model: default; thinking: low
+- manifest tool grant: read, grep, find, ls
+
+[Brunch injected world snapshot]
+ [Selected workspace context]
+ - cwd: /work/brunch-subagent
+ - selected spec: Parent Spec (#7); readiness estimate (soft; gates nothing): grounding=0.00, elicitation=0.00, projection=0.00, commitment=0.00
+ - selected session: Grounding (session-7)
+ - workspace posture: unrecorded
+ - ambient Pi resources: not scanned; Brunch prompt resources come only from code-owned manifests
+ - graph scope: selected spec only; no workspace-global graph fallback
+[Parent session digest]
+ - user asked for graph reconciliation
+- graph access: use granted Brunch read tools such as read_graph; the graph itself is not baked into this prompt
+
+[Brunch background routing]
+- Treat the task message as the caller authority; do not assume access to the parent conversation beyond this snapshot.
+- Use only tools listed in the manifest tool grant and actually advertised to you.
+- No Brunch prompt resources are advertised for this background agent.
+- Return findings as concise assistant text; structured details are render-only and not model context.
\ No newline at end of file diff --git a/src/.pi/extensions/__tests__/subagents.test.ts b/src/.pi/extensions/__tests__/subagents.test.ts index ecad1e6f8..d6ab2ce6f 100644 --- a/src/.pi/extensions/__tests__/subagents.test.ts +++ b/src/.pi/extensions/__tests__/subagents.test.ts @@ -38,6 +38,7 @@ import { registerBrunchSubagents, type BrunchSubagentsDeps, } from '../subagents/index.js'; +import { composeBackgroundSubagentPrompt } from '../subagents/prompt-assembly.js'; import { createSubagentToolCatalog, planSubagentTools, @@ -675,6 +676,23 @@ describe('runSubagent (sealed SDK child session over a faux provider)', () => { }; } + it('locks the assembled explorer background prompt shape', async () => { + const def = parseSubagentMarkdown(EXPLORER_MD); + const rendered = composeBackgroundSubagentPrompt({ + definition: def, + world: injectedWorld({ cwd: '/work/brunch-subagent' }).snapshot, + }).prompt; + + await expect(rendered).toMatchFileSnapshot('../__snapshots__/subagent-explorer-prompt.md'); + expect(rendered).toContain('You are an explorer.'); + expect(rendered).toContain('[Brunch background subagent control]'); + expect(rendered).toContain('[Brunch injected world snapshot]'); + expect(rendered).toContain('[Brunch background routing]'); + expect(rendered).not.toContain('[Brunch elicitation recommendation]'); + expect(rendered).not.toContain('Current prompt-resource selection'); + expect(rendered).toContain('ambient Pi resources: sealed out'); + }); + it('runs a tool-less projector, owning the system prompt and returning its output', async () => { const rig = await fauxRig('PROPOSED VARIANT'); try { From 3cc6474b52cf0f899b0676f813257caacc8173f9 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:49:56 +0200 Subject: [PATCH 10/29] Rename specification context home to spec --- memory/cards/renderer-golden-coverage--render-prompt-lock.md | 2 +- src/.pi/extensions/brunch-data/context/get-specification.ts | 2 +- src/agents/contexts/README.md | 2 +- 
src/agents/contexts/{specification => spec}/README.md | 2 +- .../__snapshots__/spec-context.md} | 0 .../__tests__/spec-context.test.ts} | 4 ++-- .../specification-context.ts => spec/spec-context.ts} | 0 src/session/specification-overview-context.ts | 2 +- 8 files changed, 7 insertions(+), 7 deletions(-) rename src/agents/contexts/{specification => spec}/README.md (80%) rename src/agents/contexts/{specification/__snapshots__/specification-context.md => spec/__snapshots__/spec-context.md} (100%) rename src/agents/contexts/{specification/__tests__/specification-context.test.ts => spec/__tests__/spec-context.test.ts} (97%) rename src/agents/contexts/{specification/specification-context.ts => spec/spec-context.ts} (100%) diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md index db3cfbd65..97132bad4 100644 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -48,7 +48,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is | Capability | Status | Req | Fill | Owner / next | Notes | | --- | --- | --- | --- | --- | --- | | Workspace context renderer | `have` | ● | earned | `src/agents/contexts/workspace/` | Snapshot coverage exists for cwd + overview context; preserve D83-L audience split. | -| Specification context renderer | `partial` | ● | earned | `src/agents/contexts/spec/` | Move `specification/specification-context.ts` to `spec/spec-context.ts`; closure oracle: imports, README, and snapshot tests name the short `spec/` home while the rendered tag remains ``. | +| Specification context renderer | `built` | ● | earned | `src/agents/contexts/spec/` | Closed: selected-spec renderer now lives at `src/agents/contexts/spec/spec-context.ts`; imports, README, and snapshot test use the short `spec/` home while rendered output remains ``. 
| | Spec markdown document output | `new` | ● | earned | `src/agents/contexts/spec/spec-output.ts` | Thin graph-derived flattened markdown output using md-pen; not a copy of `memory/SPEC.md`. Future web/download routes are consumers, not owners. | | Plan markdown document output | `new` | ● | earned | `src/agents/contexts/plan/plan-output.ts` | Thin graph-derived flattened markdown output over plan-plane nodes (`milestone`, `frontier`, `slice`) using md-pen; not a copy of `memory/PLAN.md`. | | Graph overview / neighborhood / related-node renderers | `have` | ● | earned | `src/agents/contexts/graph/` | Snapshot coverage exists for overview, neighborhoods, and related nodes; preserve code handles and no structural-leak assertions. | diff --git a/src/.pi/extensions/brunch-data/context/get-specification.ts b/src/.pi/extensions/brunch-data/context/get-specification.ts index 5196f8e87..7a5172ca8 100644 --- a/src/.pi/extensions/brunch-data/context/get-specification.ts +++ b/src/.pi/extensions/brunch-data/context/get-specification.ts @@ -1,4 +1,4 @@ -import { renderSpecificationContext } from '../../../../agents/contexts/specification/specification-context.js'; +import { renderSpecificationContext } from '../../../../agents/contexts/spec/spec-context.js'; import { inspectSpecificationOverview } from '../../../../session/specification-overview-context.js'; import { resolveWorkspaceCwd } from './get-cwd.js'; import { resolveSelectedSpecBinding, type SessionManagerLike } from './session-binding.js'; diff --git a/src/agents/contexts/README.md b/src/agents/contexts/README.md index c0d301b74..b720558cf 100644 --- a/src/agents/contexts/README.md +++ b/src/agents/contexts/README.md @@ -14,7 +14,7 @@ contexts/ ├── graph/ graph overview/neighborhood, related-node, mutation, reconciliation text ├── elicitation.ts elicitation agenda/update text ├── workspace/ context text -├── specification/ context text +├── spec/ context text ├── session/ runtime-frame and readiness text └── 
exchanges/ present_* / request_* structured-exchange result text ``` diff --git a/src/agents/contexts/specification/README.md b/src/agents/contexts/spec/README.md similarity index 80% rename from src/agents/contexts/specification/README.md rename to src/agents/contexts/spec/README.md index 1b07474f9..b47ccd462 100644 --- a/src/agents/contexts/specification/README.md +++ b/src/agents/contexts/spec/README.md @@ -1,4 +1,4 @@ -# agents/contexts/specification/ — selected-spec context text +# agents/contexts/spec/ — selected-spec context text SPEC decisions: D19-L, D60-L, D83-L diff --git a/src/agents/contexts/specification/__snapshots__/specification-context.md b/src/agents/contexts/spec/__snapshots__/spec-context.md similarity index 100% rename from src/agents/contexts/specification/__snapshots__/specification-context.md rename to src/agents/contexts/spec/__snapshots__/spec-context.md diff --git a/src/agents/contexts/specification/__tests__/specification-context.test.ts b/src/agents/contexts/spec/__tests__/spec-context.test.ts similarity index 97% rename from src/agents/contexts/specification/__tests__/specification-context.test.ts rename to src/agents/contexts/spec/__tests__/spec-context.test.ts index d13e03f80..a8866627c 100644 --- a/src/agents/contexts/specification/__tests__/specification-context.test.ts +++ b/src/agents/contexts/spec/__tests__/spec-context.test.ts @@ -10,7 +10,7 @@ import { presenceGap } from '../../../../graph/schema/elicitation-gap-fixtures.j import { seedFixture, type SeedFixture } from '../../../../graph/seed-fixtures.js'; import { createSessionBindingData } from '../../../../session/session-binding.js'; import { inspectSpecificationOverview } from '../../../../session/specification-overview-context.js'; -import { renderSpecificationContext } from '../specification-context.js'; +import { renderSpecificationContext } from '../spec-context.js'; describe('renderSpecificationContext', () => { it('renders the approved specification house style', 
async () => { @@ -26,7 +26,7 @@ describe('renderSpecificationContext', () => { const details = await inspectSpecificationOverview(cwd, seeded.specId); const rendered = renderSpecificationContext(details); - await expect(rendered).toMatchFileSnapshot('../__snapshots__/specification-context.md'); + await expect(rendered).toMatchFileSnapshot('../__snapshots__/spec-context.md'); expect(rendered).toContain(''); expect(rendered).toContain('Overview:'); expect(rendered).toContain('Graph (LSN 2): 5 nodes, 3 edges'); diff --git a/src/agents/contexts/specification/specification-context.ts b/src/agents/contexts/spec/spec-context.ts similarity index 100% rename from src/agents/contexts/specification/specification-context.ts rename to src/agents/contexts/spec/spec-context.ts diff --git a/src/session/specification-overview-context.ts b/src/session/specification-overview-context.ts index 640723b00..5b1a57702 100644 --- a/src/session/specification-overview-context.ts +++ b/src/session/specification-overview-context.ts @@ -1,6 +1,6 @@ import { resolve } from 'node:path'; -import { type SpecificationContextRenderInput } from '../agents/contexts/specification/specification-context.js'; +import { type SpecificationContextRenderInput } from '../agents/contexts/spec/spec-context.js'; import { sortElicitationGapsForAsking } from '../graph/elicitation-driver.js'; import { openWorkspaceGraphRuntime } from '../graph/index.js'; import { inspectWorkspaceOverview } from './workspace-overview-context.js'; From 4218c89a49cdb19114e63019d2156f72df088992 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:50:50 +0200 Subject: [PATCH 11/29] Add graph-derived spec markdown output --- ...rer-golden-coverage--render-prompt-lock.md | 2 +- .../spec/__snapshots__/spec-output.md | 18 +++++++ .../spec/__tests__/spec-output.test.ts | 54 +++++++++++++++++++ src/agents/contexts/spec/spec-output.ts | 50 +++++++++++++++++ 4 files changed, 123 insertions(+), 1 deletion(-) create mode 100644 
src/agents/contexts/spec/__snapshots__/spec-output.md create mode 100644 src/agents/contexts/spec/__tests__/spec-output.test.ts create mode 100644 src/agents/contexts/spec/spec-output.ts diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md index 97132bad4..31d938172 100644 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -49,7 +49,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is | --- | --- | --- | --- | --- | --- | | Workspace context renderer | `have` | ● | earned | `src/agents/contexts/workspace/` | Snapshot coverage exists for cwd + overview context; preserve D83-L audience split. | | Specification context renderer | `built` | ● | earned | `src/agents/contexts/spec/` | Closed: selected-spec renderer now lives at `src/agents/contexts/spec/spec-context.ts`; imports, README, and snapshot test use the short `spec/` home while rendered output remains ``. | -| Spec markdown document output | `new` | ● | earned | `src/agents/contexts/spec/spec-output.ts` | Thin graph-derived flattened markdown output using md-pen; not a copy of `memory/SPEC.md`. Future web/download routes are consumers, not owners. | +| Spec markdown document output | `built` | ● | earned | `src/agents/contexts/spec/spec-output.ts` | Closed: thin md-pen-backed renderer flattens non-plan graph nodes into a markdown specification document; snapshot proves this is graph-derived output, not a copy of `memory/SPEC.md`. | | Plan markdown document output | `new` | ● | earned | `src/agents/contexts/plan/plan-output.ts` | Thin graph-derived flattened markdown output over plan-plane nodes (`milestone`, `frontier`, `slice`) using md-pen; not a copy of `memory/PLAN.md`. 
| | Graph overview / neighborhood / related-node renderers | `have` | ● | earned | `src/agents/contexts/graph/` | Snapshot coverage exists for overview, neighborhoods, and related nodes; preserve code handles and no structural-leak assertions. | | Session runtime frame renderer | `partial` | ● | earned | `src/agents/contexts/session/` | Existing snapshot still displays D98-sensitive strategy/lens runtime wording. Closure oracle: runtime frame wording either removes that state or frames it strictly as prompt-resource/internal conduct, then updates the golden. | diff --git a/src/agents/contexts/spec/__snapshots__/spec-output.md b/src/agents/contexts/spec/__snapshots__/spec-output.md new file mode 100644 index 000000000..7719a5867 --- /dev/null +++ b/src/agents/contexts/spec/__snapshots__/spec-output.md @@ -0,0 +1,18 @@ +# Widget Spec + +## Intent + +### G1 Capture decisions + +The product records specification decisions as graph truth. + +- basis: explicit +- source: stakeholder + +## Design + +### MOD1 Context renderer + +Renderer code owns model-facing context text. 
+ +- basis: explicit \ No newline at end of file diff --git a/src/agents/contexts/spec/__tests__/spec-output.test.ts b/src/agents/contexts/spec/__tests__/spec-output.test.ts new file mode 100644 index 000000000..fcb7e8cc0 --- /dev/null +++ b/src/agents/contexts/spec/__tests__/spec-output.test.ts @@ -0,0 +1,54 @@ +import { describe, expect, it } from 'vitest'; + +import type { GraphNode } from '../../../../graph/schema/nodes.js'; +import { renderSpecMarkdownOutput } from '../spec-output.js'; + +const base = { + specId: 1, + basis: 'explicit', + createdAtLsn: 1, + updatedAtLsn: 1, +} as const; + +describe('renderSpecMarkdownOutput', () => { + it('flattens non-plan graph nodes into a markdown specification document', async () => { + const nodes: GraphNode[] = [ + { + ...base, + id: 3, + plane: 'plan', + kind: 'frontier', + kindOrdinal: 1, + title: 'Do not include planning nodes', + }, + { + ...base, + id: 1, + plane: 'intent', + kind: 'goal', + kindOrdinal: 1, + title: 'Capture decisions', + body: 'The product records specification decisions as graph truth.', + source: 'stakeholder', + }, + { + ...base, + id: 2, + plane: 'design', + kind: 'module', + kindOrdinal: 1, + title: 'Context renderer', + body: 'Renderer code owns model-facing context text.', + }, + ]; + + const rendered = renderSpecMarkdownOutput({ title: 'Widget Spec', nodes }); + + await expect(rendered).toMatchFileSnapshot('../__snapshots__/spec-output.md'); + expect(rendered).toContain('# Widget Spec'); + expect(rendered).toContain('## Intent'); + expect(rendered).toContain('### G1 Capture decisions'); + expect(rendered).toContain('## Design'); + expect(rendered).not.toContain('Do not include planning nodes'); + }); +}); diff --git a/src/agents/contexts/spec/spec-output.ts b/src/agents/contexts/spec/spec-output.ts new file mode 100644 index 000000000..691e1cd23 --- /dev/null +++ b/src/agents/contexts/spec/spec-output.ts @@ -0,0 +1,50 @@ +import { formatGraphNodeCode, type GraphNode } from 
'../../../graph/schema/nodes.js'; +import { joinMarkdownBlocks, markdownBullet, markdownHeading } from '../../shared/markdown.js'; + +export interface SpecMarkdownOutputInput { + readonly title: string; + readonly nodes: readonly GraphNode[]; +} + +const SPEC_PLANE_ORDER = ['intent', 'design', 'oracle'] as const; + +export function renderSpecMarkdownOutput(input: SpecMarkdownOutputInput): string { + const nodes = input.nodes.filter((node) => node.plane !== 'plan'); + return joinMarkdownBlocks( + markdownHeading(1, input.title), + ...SPEC_PLANE_ORDER.map((plane) => + renderPlane( + plane, + nodes.filter((node) => node.plane === plane), + ), + ), + ); +} + +function renderPlane(plane: (typeof SPEC_PLANE_ORDER)[number], nodes: readonly GraphNode[]): string { + if (nodes.length === 0) return ''; + return joinMarkdownBlocks( + markdownHeading(2, titleCase(plane)), + nodes.slice().sort(compareNodes).map(renderNode).join('\n\n'), + ); +} + +function renderNode(node: GraphNode): string { + const code = formatGraphNodeCode(node.kind, node.kindOrdinal); + return joinMarkdownBlocks( + markdownHeading(3, `${code} ${node.title}`), + node.body, + [ + markdownBullet(`basis: ${node.basis}`), + ...(node.source ? 
[markdownBullet(`source: ${node.source}`)] : []), + ].join('\n'), + ); +} + +function compareNodes(a: GraphNode, b: GraphNode): number { + return a.kind.localeCompare(b.kind) || a.kindOrdinal - b.kindOrdinal || a.id - b.id; +} + +function titleCase(value: string): string { + return `${value.slice(0, 1).toUpperCase()}${value.slice(1)}`; +} From 282c146570b017b6880e669791096daeb264f9a2 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:51:40 +0200 Subject: [PATCH 12/29] Add graph-derived plan markdown output --- ...rer-golden-coverage--render-prompt-lock.md | 2 +- src/agents/contexts/plan/README.md | 5 ++ .../plan/__snapshots__/plan-output.md | 24 +++++++ .../plan/__tests__/plan-output.test.ts | 63 +++++++++++++++++++ src/agents/contexts/plan/plan-output.ts | 50 +++++++++++++++ 5 files changed, 143 insertions(+), 1 deletion(-) create mode 100644 src/agents/contexts/plan/README.md create mode 100644 src/agents/contexts/plan/__snapshots__/plan-output.md create mode 100644 src/agents/contexts/plan/__tests__/plan-output.test.ts create mode 100644 src/agents/contexts/plan/plan-output.ts diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md index 31d938172..7a0ec962c 100644 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -50,7 +50,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is | Workspace context renderer | `have` | ● | earned | `src/agents/contexts/workspace/` | Snapshot coverage exists for cwd + overview context; preserve D83-L audience split. | | Specification context renderer | `built` | ● | earned | `src/agents/contexts/spec/` | Closed: selected-spec renderer now lives at `src/agents/contexts/spec/spec-context.ts`; imports, README, and snapshot test use the short `spec/` home while rendered output remains ``. 
| | Spec markdown document output | `built` | ● | earned | `src/agents/contexts/spec/spec-output.ts` | Closed: thin md-pen-backed renderer flattens non-plan graph nodes into a markdown specification document; snapshot proves this is graph-derived output, not a copy of `memory/SPEC.md`. | -| Plan markdown document output | `new` | ● | earned | `src/agents/contexts/plan/plan-output.ts` | Thin graph-derived flattened markdown output over plan-plane nodes (`milestone`, `frontier`, `slice`) using md-pen; not a copy of `memory/PLAN.md`. | +| Plan markdown document output | `built` | ● | earned | `src/agents/contexts/plan/plan-output.ts` | Closed: thin md-pen-backed renderer flattens plan-plane graph nodes (`milestone`, `frontier`, `slice`) into markdown; snapshot proves this is graph-derived output, not a copy of `memory/PLAN.md`. | | Graph overview / neighborhood / related-node renderers | `have` | ● | earned | `src/agents/contexts/graph/` | Snapshot coverage exists for overview, neighborhoods, and related nodes; preserve code handles and no structural-leak assertions. | | Session runtime frame renderer | `partial` | ● | earned | `src/agents/contexts/session/` | Existing snapshot still displays D98-sensitive strategy/lens runtime wording. Closure oracle: runtime frame wording either removes that state or frames it strictly as prompt-resource/internal conduct, then updates the golden. | | Turn/origination seed renderers | `partial` | ● | earned | `src/agents/contexts/seeds/` | Existing tests are semantic asserts. Closure oracle: stable seed text is snapshot-locked or intentionally reduced to invariant asserts with a README note explaining why wording is not a golden contract. 
| diff --git a/src/agents/contexts/plan/README.md b/src/agents/contexts/plan/README.md new file mode 100644 index 000000000..99e72ff7e --- /dev/null +++ b/src/agents/contexts/plan/README.md @@ -0,0 +1,5 @@ +# agents/contexts/plan/ — plan document output + +SPEC decisions: D60-L, D83-L + +Owns thin model-facing/document-output rendering for plan-plane graph nodes. This is graph-derived markdown output (`milestone`, `frontier`, `slice`) and is not a copy of `memory/PLAN.md`. Future web/download routes consume this renderer; they do not own plan text formatting. diff --git a/src/agents/contexts/plan/__snapshots__/plan-output.md b/src/agents/contexts/plan/__snapshots__/plan-output.md new file mode 100644 index 000000000..002478fdb --- /dev/null +++ b/src/agents/contexts/plan/__snapshots__/plan-output.md @@ -0,0 +1,24 @@ +# Widget Plan + +## Milestone + +### M1 Renderer coverage + +Close remaining renderer and prompt assembly rows. + +- basis: explicit + +## Frontier + +### F1 Golden lock + +Lock output with snapshots and semantic invariants. 
+ +- basis: explicit + +## Slice + +### S1 Spec output + +- basis: explicit +- source: renderer-golden-coverage \ No newline at end of file diff --git a/src/agents/contexts/plan/__tests__/plan-output.test.ts b/src/agents/contexts/plan/__tests__/plan-output.test.ts new file mode 100644 index 000000000..007d41e3c --- /dev/null +++ b/src/agents/contexts/plan/__tests__/plan-output.test.ts @@ -0,0 +1,63 @@ +import { describe, expect, it } from 'vitest'; + +import type { GraphNode } from '../../../../graph/schema/nodes.js'; +import { renderPlanMarkdownOutput } from '../plan-output.js'; + +const base = { + specId: 1, + basis: 'explicit', + createdAtLsn: 1, + updatedAtLsn: 1, +} as const; + +describe('renderPlanMarkdownOutput', () => { + it('flattens plan-plane nodes into a markdown plan document', async () => { + const nodes: GraphNode[] = [ + { + ...base, + id: 1, + plane: 'intent', + kind: 'goal', + kindOrdinal: 1, + title: 'Do not include spec nodes', + }, + { + ...base, + id: 2, + plane: 'plan', + kind: 'milestone', + kindOrdinal: 1, + title: 'Renderer coverage', + body: 'Close remaining renderer and prompt assembly rows.', + }, + { + ...base, + id: 3, + plane: 'plan', + kind: 'frontier', + kindOrdinal: 1, + title: 'Golden lock', + body: 'Lock output with snapshots and semantic invariants.', + }, + { + ...base, + id: 4, + plane: 'plan', + kind: 'slice', + kindOrdinal: 1, + title: 'Spec output', + source: 'renderer-golden-coverage', + }, + ]; + + const rendered = renderPlanMarkdownOutput({ title: 'Widget Plan', nodes }); + + await expect(rendered).toMatchFileSnapshot('../__snapshots__/plan-output.md'); + expect(rendered).toContain('# Widget Plan'); + expect(rendered).toContain('## Milestone'); + expect(rendered).toContain('### M1 Renderer coverage'); + expect(rendered).toContain('## Frontier'); + expect(rendered).toContain('## Slice'); + expect(rendered).not.toContain('Do not include spec nodes'); + }); +}); diff --git a/src/agents/contexts/plan/plan-output.ts 
b/src/agents/contexts/plan/plan-output.ts new file mode 100644 index 000000000..620e349b2 --- /dev/null +++ b/src/agents/contexts/plan/plan-output.ts @@ -0,0 +1,50 @@ +import { formatGraphNodeCode, type GraphNode } from '../../../graph/schema/nodes.js'; +import { joinMarkdownBlocks, markdownBullet, markdownHeading } from '../../shared/markdown.js'; + +export interface PlanMarkdownOutputInput { + readonly title: string; + readonly nodes: readonly GraphNode[]; +} + +const PLAN_KIND_ORDER = ['milestone', 'frontier', 'slice'] as const; + +export function renderPlanMarkdownOutput(input: PlanMarkdownOutputInput): string { + const planNodes = input.nodes.filter((node) => node.plane === 'plan'); + return joinMarkdownBlocks( + markdownHeading(1, input.title), + ...PLAN_KIND_ORDER.map((kind) => + renderKind( + kind, + planNodes.filter((node) => node.kind === kind), + ), + ), + ); +} + +function renderKind(kind: (typeof PLAN_KIND_ORDER)[number], nodes: readonly GraphNode[]): string { + if (nodes.length === 0) return ''; + return joinMarkdownBlocks( + markdownHeading(2, titleCase(kind)), + nodes.slice().sort(compareNodes).map(renderNode).join('\n\n'), + ); +} + +function renderNode(node: GraphNode): string { + const code = formatGraphNodeCode(node.kind, node.kindOrdinal); + return joinMarkdownBlocks( + markdownHeading(3, `${code} ${node.title}`), + node.body, + [ + markdownBullet(`basis: ${node.basis}`), + ...(node.source ? 
[markdownBullet(`source: ${node.source}`)] : []), + ].join('\n'), + ); +} + +function compareNodes(a: GraphNode, b: GraphNode): number { + return a.kindOrdinal - b.kindOrdinal || a.id - b.id; +} + +function titleCase(value: string): string { + return `${value.slice(0, 1).toUpperCase()}${value.slice(1)}`; +} From 8e7828df034ec8ec70b0256b9f9929c564e569f5 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:52:22 +0200 Subject: [PATCH 13/29] Reframe session runtime prompt resources --- .../cards/renderer-golden-coverage--render-prompt-lock.md | 2 +- .../contexts/session/__snapshots__/runtime-frame-ready.md | 2 +- src/agents/contexts/session/__tests__/runtime-frame.test.ts | 6 +++++- src/agents/contexts/session/runtime-frame.ts | 2 +- 4 files changed, 8 insertions(+), 4 deletions(-) diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md index 7a0ec962c..db873ecb0 100644 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -52,7 +52,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is | Spec markdown document output | `built` | ● | earned | `src/agents/contexts/spec/spec-output.ts` | Closed: thin md-pen-backed renderer flattens non-plan graph nodes into a markdown specification document; snapshot proves this is graph-derived output, not a copy of `memory/SPEC.md`. | | Plan markdown document output | `built` | ● | earned | `src/agents/contexts/plan/plan-output.ts` | Closed: thin md-pen-backed renderer flattens plan-plane graph nodes (`milestone`, `frontier`, `slice`) into markdown; snapshot proves this is graph-derived output, not a copy of `memory/PLAN.md`. 
| | Graph overview / neighborhood / related-node renderers | `have` | ● | earned | `src/agents/contexts/graph/` | Snapshot coverage exists for overview, neighborhoods, and related nodes; preserve code handles and no structural-leak assertions. | -| Session runtime frame renderer | `partial` | ● | earned | `src/agents/contexts/session/` | Existing snapshot still displays D98-sensitive strategy/lens runtime wording. Closure oracle: runtime frame wording either removes that state or frames it strictly as prompt-resource/internal conduct, then updates the golden. | +| Session runtime frame renderer | `built` | ● | earned | `src/agents/contexts/session/` | Closed: runtime-frame snapshot now frames strategy/lens values as prompt-resource selections, not runtime identity axes, and guards old `strategy=` / `lens=` / `goal=` wording. | | Turn/origination seed renderers | `partial` | ● | earned | `src/agents/contexts/seeds/` | Existing tests are semantic asserts. Closure oracle: stable seed text is snapshot-locked or intentionally reduced to invariant asserts with a README note explaining why wording is not a golden contract. | | Elicitation agenda/update text | `partial` | ● | earned | `src/agents/contexts/elicitation.ts` | No focused renderer test found. Closure oracle: agenda/update text has semantic invariant or snapshot coverage, including structural-illegal diagnostics. | | Structured-exchange result renderers | `partial` | ● | earned | `src/agents/contexts/exchanges/` | `present_candidates` and `present_review_set` have semantic asserts, other request/present renderers need inventory. Closure oracle: every registered model-facing exchange result has snapshot/semantic coverage or is explicitly retired/unregistered. 
| diff --git a/src/agents/contexts/session/__snapshots__/runtime-frame-ready.md b/src/agents/contexts/session/__snapshots__/runtime-frame-ready.md index f480f93ef..66386954d 100644 --- a/src/agents/contexts/session/__snapshots__/runtime-frame-ready.md +++ b/src/agents/contexts/session/__snapshots__/runtime-frame-ready.md @@ -1,7 +1,7 @@ [Selected session runtime frame] - status: ready - binding: spec #1; session session-1 -- agent: mode=elicit; role=elicitor; strategy=step-wise-disambiguate; lens=oracle +- agent: mode=elicit; role=elicitor; prompt_strategy_resource=step-wise-disambiguate; prompt_lens_resource=oracle - graph mentions: #D12 Decision seam @lsn 7 - file mentions: src/session/runtime-state.ts @git abc123 - world: graph_lsn=12; git_head=def456 diff --git a/src/agents/contexts/session/__tests__/runtime-frame.test.ts b/src/agents/contexts/session/__tests__/runtime-frame.test.ts index 590ebcb6d..9ae0c1f72 100644 --- a/src/agents/contexts/session/__tests__/runtime-frame.test.ts +++ b/src/agents/contexts/session/__tests__/runtime-frame.test.ts @@ -39,7 +39,11 @@ describe('renderRuntimeFrame', () => { await expect(rendered).toMatchFileSnapshot('../__snapshots__/runtime-frame-ready.md'); expect(rendered).toContain('#D12'); expect(rendered).not.toContain('node-1'); - expect(rendered).toContain('mode=elicit; role=elicitor; strategy=step-wise-disambiguate; lens=oracle'); + expect(rendered).toContain( + 'mode=elicit; role=elicitor; prompt_strategy_resource=step-wise-disambiguate; prompt_lens_resource=oracle', + ); + expect(rendered).not.toContain('strategy='); + expect(rendered).not.toContain('lens='); expect(rendered).not.toContain('goal='); }); diff --git a/src/agents/contexts/session/runtime-frame.ts b/src/agents/contexts/session/runtime-frame.ts index f0ca1b5d2..1b3aa5042 100644 --- a/src/agents/contexts/session/runtime-frame.ts +++ b/src/agents/contexts/session/runtime-frame.ts @@ -22,7 +22,7 @@ export function renderRuntimeFrame(input: 
SessionRuntimeFrameRenderInput): strin '[Selected session runtime frame]', '- status: ready', `- binding: spec #${input.specId}; session ${input.sessionId}`, - `- agent: mode=${input.agent.operationalMode}; role=${input.agent.role}; strategy=${input.agent.strategy}; lens=${input.agent.lens}`, + `- agent: mode=${input.agent.operationalMode}; role=${input.agent.role}; prompt_strategy_resource=${input.agent.strategy}; prompt_lens_resource=${input.agent.lens}`, `- graph mentions: ${renderGraphMentions(input.mentions.graphNodes)}`, `- file mentions: ${renderFileMentions(input.mentions.files)}`, `- world: graph_lsn=${input.world.graph.latestLsn ?? 'unknown'}; git_head=${input.world.git.head ?? 'unknown'}`, From bdd5516bb49c427773e8d41b7a0402600bffb16b Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:52:54 +0200 Subject: [PATCH 14/29] Document seed renderer invariant oracle --- memory/cards/renderer-golden-coverage--render-prompt-lock.md | 2 +- src/agents/contexts/seeds/README.md | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md index db873ecb0..81ee8adff 100644 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -53,7 +53,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is | Plan markdown document output | `built` | ● | earned | `src/agents/contexts/plan/plan-output.ts` | Closed: thin md-pen-backed renderer flattens plan-plane graph nodes (`milestone`, `frontier`, `slice`) into markdown; snapshot proves this is graph-derived output, not a copy of `memory/PLAN.md`. 
| | Graph overview / neighborhood / related-node renderers | `have` | ● | earned | `src/agents/contexts/graph/` | Snapshot coverage exists for overview, neighborhoods, and related nodes; preserve code handles and no structural-leak assertions. | | Session runtime frame renderer | `built` | ● | earned | `src/agents/contexts/session/` | Closed: runtime-frame snapshot now frames strategy/lens values as prompt-resource selections, not runtime identity axes, and guards old `strategy=` / `lens=` / `goal=` wording. | -| Turn/origination seed renderers | `partial` | ● | earned | `src/agents/contexts/seeds/` | Existing tests are semantic asserts. Closure oracle: stable seed text is snapshot-locked or intentionally reduced to invariant asserts with a README note explaining why wording is not a golden contract. | +| Turn/origination seed renderers | `built` | ● | earned | `src/agents/contexts/seeds/` | Closed: existing semantic tests remain the oracle, and the local README now states why seed wording is intentionally invariant-guarded rather than full-golden text. | | Elicitation agenda/update text | `partial` | ● | earned | `src/agents/contexts/elicitation.ts` | No focused renderer test found. Closure oracle: agenda/update text has semantic invariant or snapshot coverage, including structural-illegal diagnostics. | | Structured-exchange result renderers | `partial` | ● | earned | `src/agents/contexts/exchanges/` | `present_candidates` and `present_review_set` have semantic asserts, other request/present renderers need inventory. Closure oracle: every registered model-facing exchange result has snapshot/semantic coverage or is explicitly retired/unregistered. | | Human/product render audience split | `have` | ● | earned | `src/app/README.md`, `src/session/README.md`, `src/agents/contexts/README.md` | Current READMEs name app/session human text and agents model-facing text. Preserve; update only if topology changes above create drift. 
| diff --git a/src/agents/contexts/seeds/README.md b/src/agents/contexts/seeds/README.md index 20a2ca31e..8ff77700d 100644 --- a/src/agents/contexts/seeds/README.md +++ b/src/agents/contexts/seeds/README.md @@ -11,6 +11,8 @@ Seed text that Brunch deliberately inserts into model context: Both modules are pure over already-read data. Callers own PULL: graph reads, gap reads, workspace inspection, transcript-tail classification, and Pi/session side effects. +Seed wording is intentionally protected with semantic invariant tests rather than full goldens. These blocks are glue between already-goldened renderers and live prompt assembly; the contract is stable scope tags, selected workspace/spec/session facts, gap/readiness summaries, and graph-priority ordering, not exact prose beyond those invariants. + ## Boundary rules ```pseudo From b5b5af1c51c81088e2ac5d91b8c38f04e1da96bb Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 15:53:41 +0200 Subject: [PATCH 15/29] Cover elicitation context text --- ...rer-golden-coverage--render-prompt-lock.md | 2 +- .../contexts/__tests__/elicitation.test.ts | 30 +++++++++++++++++++ 2 files changed, 31 insertions(+), 1 deletion(-) create mode 100644 src/agents/contexts/__tests__/elicitation.test.ts diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md index 81ee8adff..6e2cf0056 100644 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md @@ -54,7 +54,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is | Graph overview / neighborhood / related-node renderers | `have` | ● | earned | `src/agents/contexts/graph/` | Snapshot coverage exists for overview, neighborhoods, and related nodes; preserve code handles and no structural-leak assertions. 
| | Session runtime frame renderer | `built` | ● | earned | `src/agents/contexts/session/` | Closed: runtime-frame snapshot now frames strategy/lens values as prompt-resource selections, not runtime identity axes, and guards old `strategy=` / `lens=` / `goal=` wording. | | Turn/origination seed renderers | `built` | ● | earned | `src/agents/contexts/seeds/` | Closed: existing semantic tests remain the oracle, and the local README now states why seed wording is intentionally invariant-guarded rather than full-golden text. | -| Elicitation agenda/update text | `partial` | ● | earned | `src/agents/contexts/elicitation.ts` | No focused renderer test found. Closure oracle: agenda/update text has semantic invariant or snapshot coverage, including structural-illegal diagnostics. | +| Elicitation agenda/update text | `built` | ● | earned | `src/agents/contexts/elicitation.ts` | Closed: focused semantic tests cover agenda/non-agenda rendering and structural-illegal update diagnostics. | | Structured-exchange result renderers | `partial` | ● | earned | `src/agents/contexts/exchanges/` | `present_candidates` and `present_review_set` have semantic asserts, other request/present renderers need inventory. Closure oracle: every registered model-facing exchange result has snapshot/semantic coverage or is explicitly retired/unregistered. | | Human/product render audience split | `have` | ● | earned | `src/app/README.md`, `src/session/README.md`, `src/agents/contexts/README.md` | Current READMEs name app/session human text and agents model-facing text. Preserve; update only if topology changes above create drift. 
|

diff --git a/src/agents/contexts/__tests__/elicitation.test.ts b/src/agents/contexts/__tests__/elicitation.test.ts
new file mode 100644
index 000000000..4f96b7933
--- /dev/null
+++ b/src/agents/contexts/__tests__/elicitation.test.ts
@@ -0,0 +1,30 @@
+import { describe, expect, it } from 'vitest';
+
+import { presenceGap } from '../../../graph/schema/elicitation-gap-fixtures.js';
+import { formatElicitationAgenda, formatElicitationUpdateResult } from '../elicitation.js';
+
+describe('elicitation context text', () => {
+  it('renders agenda and non-agenda gaps with stable semantic markers', () => {
+    const rendered = formatElicitationAgenda(
+      [presenceGap({ refersTo: 'constraint', coverage: 0.25, band: 'grounding', importance: 3 })],
+      [presenceGap({ refersTo: 'goal', coverage: 1, band: 'elicitation', answered: true })],
+    );
+
+    expect(rendered).toContain('[Elicitation agenda] 1 open question(s), ranked:');
+    expect(rendered).toContain('refers to: constraint · band: grounding · importance: 3 · coverage: 0.25');
+    expect(rendered).toContain('[Not on the agenda] 1 gap(s):');
+    expect(rendered).toContain('(answered)');
+  });
+
+  it('renders structural-illegal diagnostics for update failures', () => {
+    const rendered = formatElicitationUpdateResult(
+      {
+        status: 'structural_illegal',
+        diagnostics: [{ field: 'predicate', message: 'Unsupported predicate arm.'
}],
+      },
+      'spawn',
+    );
+
+    expect(rendered).toBe('STRUCTURAL_ILLEGAL\n- predicate: Unsupported predicate arm.');
+  });
+});
From abbdcc1f9c612b3776df96c7ef6116f54a8a51f4 Mon Sep 17 00:00:00 2001
From: Lu Nelson
Date: Fri, 26 Jun 2026 15:55:11 +0200
Subject: [PATCH 16/29] Cover exchange renderer inventory

---
 ...rer-golden-coverage--render-prompt-lock.md |  2 +-
 .../exchange-renderer-inventory.test.ts       | 96 +++++++++++++++++++
 2 files changed, 97 insertions(+), 1 deletion(-)
 create mode 100644 src/agents/contexts/exchanges/__tests__/exchange-renderer-inventory.test.ts

diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md
index 6e2cf0056..0a8778a35 100644
--- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md
+++ b/memory/cards/renderer-golden-coverage--render-prompt-lock.md
@@ -55,7 +55,7 @@ Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is
 | Session runtime frame renderer | `built` | ● | earned | `src/agents/contexts/session/` | Closed: runtime-frame snapshot now frames strategy/lens values as prompt-resource selections, not runtime identity axes, and guards old `strategy=` / `lens=` / `goal=` wording. |
 | Turn/origination seed renderers | `built` | ● | earned | `src/agents/contexts/seeds/` | Closed: existing semantic tests remain the oracle, and the local README now states why seed wording is intentionally invariant-guarded rather than full-golden text. |
 | Elicitation agenda/update text | `built` | ● | earned | `src/agents/contexts/elicitation.ts` | Closed: focused semantic tests cover agenda/non-agenda rendering and structural-illegal update diagnostics. |
-| Structured-exchange result renderers | `partial` | ● | earned | `src/agents/contexts/exchanges/` | `present_candidates` and `present_review_set` have semantic asserts, other request/present renderers need inventory.
Closure oracle: every registered model-facing exchange result has snapshot/semantic coverage or is explicitly retired/unregistered. |
+| Structured-exchange result renderers | `built` | ● | earned | `src/agents/contexts/exchanges/` | Closed: `present_candidates` / `present_review_set` keep focused tests, and the remaining present/request result renderers now have an inventory test covering normal plus unavailable/cancelled branches. |
 | Human/product render audience split | `have` | ● | earned | `src/app/README.md`, `src/session/README.md`, `src/agents/contexts/README.md` | Current READMEs name app/session human text and agents model-facing text. Preserve; update only if topology changes above create drift. |

 ## Ledger — deferred / tripwired rows
diff --git a/src/agents/contexts/exchanges/__tests__/exchange-renderer-inventory.test.ts b/src/agents/contexts/exchanges/__tests__/exchange-renderer-inventory.test.ts
new file mode 100644
index 000000000..6e2b6cdbd
--- /dev/null
+++ b/src/agents/contexts/exchanges/__tests__/exchange-renderer-inventory.test.ts
@@ -0,0 +1,96 @@
+import { describe, expect, it } from 'vitest';
+
+import { projectPresentQuestion } from '../../../../projections/exchanges/present-question.js';
+import { projectRequestAnswer } from '../../../../projections/exchanges/request-answer.js';
+import { projectRequestChoice } from '../../../../projections/exchanges/request-choice.js';
+import { projectRequestChoices } from '../../../../projections/exchanges/request-choices.js';
+import { projectRequestReview } from '../../../../projections/exchanges/request-review.js';
+import { formatPresentQuestion } from '../present-question.js';
+import { formatRequestAnswer } from '../request-answer.js';
+import { formatRequestChoice } from '../request-choice.js';
+import { formatRequestChoices } from '../request-choices.js';
+import { formatRequestResponseDiagnostic } from '../request-response.js';
+import { formatRequestReview } from '../request-review.js';
+
+describe('structured-exchange renderer inventory', () => {
+  it('covers the request/present result renderers not snapshot-locked elsewhere', () => {
+    expect(
+      formatPresentQuestion(
+        projectPresentQuestion({
+          exchangeId: 'ex-1',
+          heading: 'Choose a direction',
+          body: 'Pick one option.',
+          options: [{ id: 'a', content: 'Alpha', rationale: 'Fastest path.' }],
+        }),
+      ),
+    ).toContain('## 1. Alpha');
+
+    expect(
+      formatRequestAnswer(
+        projectRequestAnswer({ exchangeId: 'ex-1', status: 'answered', answer: 'Freeform answer' }),
+      ),
+    ).toContain('Freeform answer');
+    expect(
+      formatRequestChoice(
+        projectRequestChoice({
+          exchangeId: 'ex-1',
+          respondsToPresentTool: 'present_question',
+          status: 'answered',
+          choice: { id: 'a', label: 'Alpha', kind: 'listed' },
+          comment: 'Because.',
+        }),
+      ),
+    ).toContain('Selected: **Alpha**');
+    expect(
+      formatRequestChoices(
+        projectRequestChoices({
+          exchangeId: 'ex-1',
+          status: 'answered',
+          choices: [{ id: 'a', label: 'Alpha*', kind: 'listed' }],
+          comment: 'Both.',
+        }),
+      ),
+    ).toContain('Alpha\\*');
+    expect(formatRequestResponseDiagnostic({ message: 'Waiting for a structured response.'
})).toContain(
+      'Waiting for a structured response.',
+    );
+    expect(
+      formatRequestReview(
+        projectRequestReview({
+          exchangeId: 'ex-1',
+          status: 'answered',
+          review: 'request_changes',
+          comment: 'Tighten scope.',
+        }),
+      ),
+    ).toContain('Changes requested');
+  });
+
+  it('keeps unavailable/cancelled branches model-facing and explicit', () => {
+    expect(formatRequestAnswer(projectRequestAnswer({ exchangeId: 'ex-1', status: 'cancelled' }))).toContain(
+      'User cancelled',
+    );
+    expect(
+      formatRequestChoice(
+        projectRequestChoice({
+          exchangeId: 'ex-1',
+          respondsToPresentTool: 'present_question',
+          status: 'unavailable',
+          message: 'choice unavailable',
+        }),
+      ),
+    ).toContain('choice unavailable');
+    expect(
+      formatRequestChoices(
+        projectRequestChoices({
+          exchangeId: 'ex-1',
+          status: 'unavailable',
+          message: 'choices unavailable',
+        }),
+      ),
+    ).toContain('choices unavailable');
+    expect(formatRequestReview(projectRequestReview({ exchangeId: 'ex-1', status: 'cancelled' }))).toContain(
+      'User cancelled',
+    );
+  });
+});
From 8f8230941c8d8018effdad4c74cffc451dc49e21 Mon Sep 17 00:00:00 2001
From: Lu Nelson
Date: Fri, 26 Jun 2026 15:56:23 +0200
Subject: [PATCH 17/29] Copy executor prompt asset in build

---
 package.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/package.json b/package.json
index 4447b8fca..756d85ed7 100644
--- a/package.json
+++ b/package.json
@@ -34,7 +34,7 @@
     "build": "tsc -p tsconfig.build.json && npm run build:info && npm run build:pi-assets && npm run build:web",
     "build:info": "node scripts/write-build-info.mjs",
     "prepack": "RELEASE=true npm run build",
-    "build:pi-assets": "mkdir -p dist/.pi/components/workspace-dialog dist/.pi/extensions/subagents dist/agents/prompts dist/agents/skills dist/agents/contexts && cp -R src/.pi/components/workspace-dialog/assets dist/.pi/components/workspace-dialog/ && cp -R src/agents/prompts/elicitor src/agents/prompts/explorer src/agents/prompts/orchestrator
src/agents/prompts/pi-coder src/agents/prompts/projector src/agents/prompts/researcher src/agents/prompts/reviewer dist/agents/prompts/ && cp -R src/agents/skills/strategies src/agents/skills/lenses src/agents/skills/methods dist/agents/skills/ && cp -R src/agents/contexts/references dist/agents/contexts/ && cp src/.pi/extensions/subagents/config.json dist/.pi/extensions/subagents/",
+    "build:pi-assets": "mkdir -p dist/.pi/components/workspace-dialog dist/.pi/extensions/subagents dist/agents/prompts dist/agents/skills dist/agents/contexts && cp -R src/.pi/components/workspace-dialog/assets dist/.pi/components/workspace-dialog/ && cp -R src/agents/prompts/elicitor src/agents/prompts/explorer src/agents/prompts/executor src/agents/prompts/pi-coder src/agents/prompts/projector src/agents/prompts/researcher src/agents/prompts/reviewer dist/agents/prompts/ && cp -R src/agents/skills/strategies src/agents/skills/lenses src/agents/skills/methods dist/agents/skills/ && cp -R src/agents/contexts/references dist/agents/contexts/ && cp src/.pi/extensions/subagents/config.json dist/.pi/extensions/subagents/",
     "build:web": "vite build",
     "seed": "tsx src/graph/seed-fixtures.ts",
     "generate:ontology": "tsx src/graph/schema/generate-ontology-ref.ts",
From 7730eca8326f0a91510b7c8bc28189e5fb7132af Mon Sep 17 00:00:00 2001
From: Lu Nelson
Date: Fri, 26 Jun 2026 15:59:00 +0200
Subject: [PATCH 18/29] Reconcile renderer coverage completion

---
 memory/PLAN.md                                |  18 +--
 ...rer-golden-coverage--render-prompt-lock.md | 112 ------------------
 2 files changed, 9 insertions(+), 121 deletions(-)
 delete mode 100644 memory/cards/renderer-golden-coverage--render-prompt-lock.md

diff --git a/memory/PLAN.md b/memory/PLAN.md
index 4ebcd0f2d..13354ff02 100644
--- a/memory/PLAN.md
+++ b/memory/PLAN.md
@@ -48,7 +48,7 @@ Brunch-next has delivered the original composition spine: the host, sealed Pi pr
 - **Done-definition:** all three capabilities carry promoted real-model evidence; no capability remains a
stub or a method-less axis member. Open follow-ups (A32-L fan-in completion, the A1 anti-prompt) are tracked on their assumptions, not as arc blockers. - **Anchors:** D95-L, D96-L; A31-L–A35-L; I51-L. -### context-pipeline — ◐ active +### context-pipeline — ✓ done (2026-06-26) - **Goal:** lock the PULL → PROJECT → RENDER → COMPOSE context pipeline (D60-L). @@ -56,23 +56,24 @@ Brunch-next has delivered the original composition spine: the host, sealed Pi pr context-pipeline/ ├── PULL graph + session reads ✓ done ├── PROJECT projections/ ✓ done -├── RENDER agents/contexts + local human outputs ◐ open: renderer-golden-coverage (FE-870) -└── COMPOSE system-prompts + skills ✓ done* +├── RENDER agents/contexts + local human outputs ✓ done +└── COMPOSE system-prompts + skills ✓ done -*COMPOSE has one deferred full-stack real-rendered-context tripwire owned by RENDER. +COMPOSE's deferred full-stack real-rendered-context tripwire was discharged by `renderer-golden-coverage`. ``` -- **Done-definition:** every pipeline stage closed or owned by a live coverage frontier; the COMPOSE full-stack tripwire discharged by RENDER. `renderer-golden-coverage` is a parallel evidence/quality track, never a ship gate. +- **Done-definition:** ✓ every pipeline stage closed; the COMPOSE full-stack tripwire was discharged by RENDER. `renderer-golden-coverage` remains a parallel evidence/quality track, never a ship gate. - **Anchors:** D60-L; D83-L (RENDER house style). ## Sequencing ### Active -- `renderer-golden-coverage` (FE-1091) — active context-pipeline RENDER closure plus prompt-assembly lock. Golden the remaining model-facing context surfaces and system-prompt assembly while reshaping `src/agents/` prompts/subagents topology only as needed for that lock. +- None. 
### Recently Completed +- 2026-06-26 `renderer-golden-coverage` (FE-1091) — **context-pipeline RENDER + COMPOSE tripwire complete.** Closed the render/prompt sweep ledger: context-reference harvest disposition, foreground `elicitor`/`executor` prompt-body topology, background subagent body topology, foreground and adapter prompt assembly goldens, background subagent assembly snapshot, `specification/` → `spec/` renderer home, graph-derived spec/plan markdown outputs, session runtime frame D98 wording, seed renderer oracle note, elicitation text coverage, and structured-exchange renderer inventory coverage. `npm run verify` passed after the asset-copy fix. - 2026-06-26 `data-model-legibility` (FE-1090) — **reference substrate complete.** Generated ontology tables are materialized from typed graph sources with `check:data-model`; authored graph-authoring heuristics are cited by `capture` + `commit-graph`; the final checkability/subtype audit closed with no schema/runtime expansion: progressive checkability is accepted only as skill-local oracle conduct, `checkability`/`strength` fields are rejected carrying cost, subtype enums are rejected as parallel ontology, and `detail.form` remains inert payload plus renderer hook. - 2026-06-25 `elicitor-generate` (FE-1059) — **generate capability done through promoted A31-L fan-out evidence.** Built slices: `present_candidates` tool/projection/renderer + pick path; intent/design/oracle facets under one plane-parameterized `generate-proposal` method; progressive-disclosure references; real-boot activation check; and real-model fan-out witness harness. Promoted run `.fixtures/runs/generate-fan-out/2026-06-24T16-51-13-704Z/` passed with `openai-codex/gpt-5.5`: oracle lens pinned, `SKILL.md` and `references/oracle.md` read, `present_candidates` emitted, no pre-prompt kick, no graph delta, no `mutate_graph`, and no approved review result. A32-L fan-in completion and the A1 anti-prompt remain follow-ups, not branch debt. 
- 2026-06-24 `subagent-reconciliation` (FE-1054) — foreground/background reconciliation complete through the execute-mode readiness target (D90-L-D93-L/I49-L): shared `AgentManifest`, code-owned background discovery, semi-permeable injected-world child sessions, sovereign grants gated by code-owned `canDelegate`, return rendering, and live `execute` -> `orchestrator` mode with a product-registered stub tool. `code` -> `pi-coder` remains future work. @@ -161,9 +162,8 @@ context-pipeline/ - **Linear:** [FE-1091](https://linear.app/hash/issue/FE-1091/renderer-golden-coverage-and-prompt-assembly-lock) - **Branch:** `ln/fe-1091-renderer-golden-coverage-and-prompt-assembly-lock` - **Kind:** coverage + build / hardening -- **Status:** scoped; first sweep ledger active. -- **Certainty:** earned — RENDER topology is now established; this frontier closes coverage, prompt assembly evidence, and stale topology ambiguity rather than proving a new seam. -- **Current execution pointer:** `memory/cards/renderer-golden-coverage--render-prompt-lock.md`. +- **Status:** done. Sweep ledger exhausted and retired. Foreground prompt bodies use `elicitor` / `executor`; background bodies are locked as subagent resources under the shared body-file convention; prompt assembly paths are golden/semantically locked; selected-spec context lives under `src/agents/contexts/spec/`; graph-derived spec/plan markdown outputs are covered; remaining renderer partials have focused goldens, semantic invariants, or an explicit README oracle note. +- **Certainty:** earned — RENDER topology is now established; this frontier closed coverage, prompt assembly evidence, and stale topology ambiguity rather than proving a new seam. - **Closes:** context-pipeline RENDER stage plus the COMPOSE full-stack real-rendered-context tripwire. - **Locks in:** D83-L house style for model-facing context surfaces and prompt assembly as a golden/semantic-invariant surface. 
- **Objective:** Finish the RENDER stage and lock system-prompt assembly as a golden surface. Remaining work lives by audience: model-facing context and prompt text under `src/agents/`, human/product text beside its app/session owner. Incidental prompt remodelling belongs here only when needed to make prompt assembly lockable: foreground prompts should collapse toward `elicitor` / `executor`, subagent prompt bodies should live as subagent resources, and `src/agents/` topology should make `contexts`, `prompts`, `runtime`, `shared`, `skills`, and `subagents` roles legible. This frontier also extends D83-L to thin graph-derived markdown document outputs for selected-spec and plan-plane material, as future web/download response sources. diff --git a/memory/cards/renderer-golden-coverage--render-prompt-lock.md b/memory/cards/renderer-golden-coverage--render-prompt-lock.md deleted file mode 100644 index 0a8778a35..000000000 --- a/memory/cards/renderer-golden-coverage--render-prompt-lock.md +++ /dev/null @@ -1,112 +0,0 @@ -# Renderer and prompt assembly lock ledger - -Frontier: renderer-golden-coverage -Status: active -Mode: sweep -Created: 2026-06-26 - -## Orientation - -- Containing seam: the `context-pipeline` RENDER stage plus its deferred COMPOSE tripwire. Frontier `renderer-golden-coverage` / FE-1091 is the Linear + branch boundary; this scope file is only the row ledger for that frontier. -- Handoff state: system-prompt assembly must be golden/semantically locked, and the user's `src/agents/` topology sketch (`contexts`, foreground `prompts`, `runtime`, `shared`, `skills`, `subagents`) is directional pressure, not a license for an unbounded rewrite. -- The previous `data-model-legibility` work closed generated ontology + graph-authoring references, but `src/agents/docs/context-reference-harvest.md` still carries unresolved candidate references. 
This sweep must either materialize, retire, or explicitly defer those candidates instead of assuming the ledger is fully worked. -- Main risk: locking stale D98-sensitive prompt/runtime-axis wording or old prompt-body topology in snapshots. Closure should delete aliases/dual homes rather than preserve compatibility shims. - -Posture: earned (inherited from `renderer-golden-coverage`). - -Frontier-level cross-cutting obligations: - -- Preserve D83-L house style for model-facing context text: markdown frame, TOON for large/unbounded uniform data, fenced tree for hierarchy, top-level `` / `` / `` scope clustering where applicable. -- Preserve D52-L / D60-L dependency direction: `agents/contexts` may render already-read facts but must not import adapters, app, RPC, web, or DB. -- Preserve D97-L provenance: generated vocabulary references come from typed graph sources; authored judgment references need concrete readers; prompt resources cite rather than restate shared references. -- Preserve D98-L: strategy/lens/method vocabulary may remain only as prompt-resource/internal conduct, not user-changeable session state or foreground-agent identity. -- Use deletion as closure: obsolete role/body aliases, stale docs, and superseded reference candidates should be removed or explicitly deferred, not bridged. - -## Sweep preflight - -1. **Boundary.** In scope: model-facing renderers under `src/agents/contexts/`, foreground/background prompt assembly, prompt body/reference topology under `src/agents/` when needed to make assembly lockable, and the local topology READMEs/tests that name those homes. Out of scope: new `project` capability behavior, CODE/orchestrator tool implementation, public RPC/UI changes, and human/product renderers except for README audience-split drift. -2. 
**Source-of-truth inputs.** SPEC D19-L, D40-L, D52-L, D58-L, D60-L, D62-L, D83-L, D97-L, D98-L; PLAN frontier `renderer-golden-coverage`; topology READMEs under `src/agents/`, `src/app/`, and `src/session/`; current renderer/prompt tests; `src/agents/docs/context-reference-harvest.md` for unresolved reference disposition only. -3. **Owners and closure oracles.** Each required row below names the canonical owner and a closure oracle: Vitest file snapshots, semantic invariant tests, import-boundary checks, topology README assertions, or a row-level explicit deferral tied to a plan assumption/frontier. -4. **Class.** Buildable-now. Deferred rows are marked `○` and tripwired to A33-L / `elicitor-project` or `orchestrator-tool-port`; they are not hidden required work for this sweep. -5. **Closed inventory.** This ledger is the inventory. If build discovers more than one genuinely missing renderer/prompt sub-seam, stop and route back through `ln-plan` instead of adding rows by symmetry. - -Aggregate DoD: every `●` row is `have` or `built`, and every `partial` row is either closed or explicitly reclassified with a named owner/tripwire. - -## Ledger — prompt topology and assembly - -| Capability | Status | Req | Fill | Owner / next | Notes | -| --- | --- | --- | --- | --- | --- | -| Foreground prompt-body topology is canonical | `built` | ● | earned | `src/agents/registry.ts`, `src/agents/prompts/README.md`, prompt body tests | Closed: foreground prompt bodies now use D98 target ids (`elicitor` / `executor`); the old foreground `orchestrator` body/home is removed. The `orchestrator_stub` tool name remains owned by `orchestrator-tool-port`, not prompt-body topology. 
| -| Background subagent body topology is canonical | `built` | ● | earned | `src/.pi/extensions/subagents/agents.ts`, prompt/subagent topology docs | Closed: background bodies intentionally stay under `src/agents/prompts//SYSTEM.md` as shared manifest body files, while `BACKGROUND_SUBAGENT_IDS` owns spawnability; README/tests now state that they are subagent resources, not foreground prompts. | -| Foreground prompt assembly golden lock | `built` | ● | earned | `src/agents/runtime/compose.ts`, `src/agents/runtime/__tests__/compose.test.ts`, snapshots | Closed: elicitor provider-facing assembly has file snapshots plus semantic invariants; prompt-resource strategy/lens wording is framed as routing hints rather than user-changeable foreground identity, and readiness-grade/goal-axis regressions are guarded. | -| Pi `before_agent_start` assembly path is wired to the same lock | `built` | ● | earned | `src/.pi/extensions/agent-runtime/system-prompts/` tests | Closed: adapter-level tests exercise the live `before_agent_start` path through Brunch body loading, selected-world reads, runtime-state projection, active-tool filtering, and `composeAgentPrompt` wording. | -| Background subagent prompt assembly golden lock | `built` | ● | earned | `src/.pi/extensions/subagents/prompt-assembly.ts`, `src/.pi/extensions/__tests__/subagents.test.ts` | Closed: explorer child prompt assembly has a file snapshot plus semantic invariants for body/control/injected-world/background routing, no foreground elicitation section, and sealed ambient Pi resources. 
| -| Context-reference harvest closure | `built` | ● | earned | `src/agents/docs/context-reference-harvest.md`, `src/agents/contexts/references/`, skill-local `references/` | Closed: materialized graph-authoring / ontology / oracle homes stay in their current owners; checkability/shared subtype candidates are rejected; elicitation-question hints are deferred to a future scoped reader; proposal/projection candidates are deferred to A33-L/`elicitor-project`. | - -## Ledger — model-facing context renderers - -| Capability | Status | Req | Fill | Owner / next | Notes | -| --- | --- | --- | --- | --- | --- | -| Workspace context renderer | `have` | ● | earned | `src/agents/contexts/workspace/` | Snapshot coverage exists for cwd + overview context; preserve D83-L audience split. | -| Specification context renderer | `built` | ● | earned | `src/agents/contexts/spec/` | Closed: selected-spec renderer now lives at `src/agents/contexts/spec/spec-context.ts`; imports, README, and snapshot test use the short `spec/` home while rendered output remains ``. | -| Spec markdown document output | `built` | ● | earned | `src/agents/contexts/spec/spec-output.ts` | Closed: thin md-pen-backed renderer flattens non-plan graph nodes into a markdown specification document; snapshot proves this is graph-derived output, not a copy of `memory/SPEC.md`. | -| Plan markdown document output | `built` | ● | earned | `src/agents/contexts/plan/plan-output.ts` | Closed: thin md-pen-backed renderer flattens plan-plane graph nodes (`milestone`, `frontier`, `slice`) into markdown; snapshot proves this is graph-derived output, not a copy of `memory/PLAN.md`. | -| Graph overview / neighborhood / related-node renderers | `have` | ● | earned | `src/agents/contexts/graph/` | Snapshot coverage exists for overview, neighborhoods, and related nodes; preserve code handles and no structural-leak assertions. 
| -| Session runtime frame renderer | `built` | ● | earned | `src/agents/contexts/session/` | Closed: runtime-frame snapshot now frames strategy/lens values as prompt-resource selections, not runtime identity axes, and guards old `strategy=` / `lens=` / `goal=` wording. | -| Turn/origination seed renderers | `built` | ● | earned | `src/agents/contexts/seeds/` | Closed: existing semantic tests remain the oracle, and the local README now states why seed wording is intentionally invariant-guarded rather than full-golden text. | -| Elicitation agenda/update text | `built` | ● | earned | `src/agents/contexts/elicitation.ts` | Closed: focused semantic tests cover agenda/non-agenda rendering and structural-illegal update diagnostics. | -| Structured-exchange result renderers | `built` | ● | earned | `src/agents/contexts/exchanges/` | Closed: `present_candidates` / `present_review_set` keep focused tests, and the remaining present/request result renderers now have an inventory test covering normal plus unavailable/cancelled branches. | -| Human/product render audience split | `have` | ● | earned | `src/app/README.md`, `src/session/README.md`, `src/agents/contexts/README.md` | Current READMEs name app/session human text and agents model-facing text. Preserve; update only if topology changes above create drift. | - -## Ledger — deferred / tripwired rows - -| Capability | Status | Req | Fill | Owner / next | Notes | -| --- | --- | --- | --- | --- | --- | -| `projection-guidance.md` shared reference | `spec` | ○ | proving | `elicitor-project` / A33-L | Wait-gated: project shape is design-gated; do not materialize a shared projection reference in this sweep unless a concrete second reader appears. | -| CODE executor tool behavior | `spec` | ○ | proving | `orchestrator-tool-port` | Out of this sweep except for prompt-body naming/topology needed to avoid locking a stale foreground body alias. Tool behavior and write-capable CODE policy stay with FE-1087. 
| -| New renderer family discovered during build | `new` | ○ | proving | route to `ln-plan` if more than one appears | Tripwire: adding several rows means this inventory was not closed. | - -## Row build order recommendation - -1. Close **Context-reference harvest closure** first so prompt/reference topology is not goldened against a half-dispositioned ledger. -2. Close **Foreground/background prompt-body topology** before accepting prompt assembly snapshots; snapshots should lock the final home, not a transitional shape. -3. Close foreground + adapter prompt assembly locks. -4. Close background subagent assembly lock. -5. Sweep remaining renderer partials (`session`, `seeds`, `elicitation`, `exchanges`) with file-scoped tests. - -## Expected touched paths (tentative) - -```text -memory/cards/ -└── renderer-golden-coverage--render-prompt-lock.md + -memory/PLAN.md ? -src/agents/ -├── README.md ? -├── registry.ts ? -├── __tests__/ -│ └── registry.test.ts ? -├── docs/ -│ └── context-reference-harvest.md ? -├── contexts/ -│ ├── README.md ? -│ ├── elicitation.ts ~ -│ ├── references/ ? -│ ├── plan/ ? -│ ├── seeds/ ? -│ ├── session/ ? -│ ├── spec/ ? -│ ├── specification/ - -│ └── exchanges/ ? -├── prompts/ ? -├── runtime/ -│ ├── README.md ? -│ ├── compose.ts ? -│ ├── __tests__/compose.test.ts ? -│ └── __snapshots__/ ? -└── subagents/ ? -src/.pi/extensions/ -├── agent-runtime/system-prompts/ ? -└── subagents/ ? -src/app/README.md ? -src/session/README.md ? 
-``` From 2314dc10eaeb163118cc9fa1cffbe3f06b756908 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 16:10:53 +0200 Subject: [PATCH 19/29] Reopen prompt topology flattening scope --- memory/PLAN.md | 22 ++-- memory/SPEC.md | 43 ++++--- ...lden-coverage--prompt-subagent-topology.md | 111 ++++++++++++++++++ 3 files changed, 146 insertions(+), 30 deletions(-) create mode 100644 memory/cards/renderer-golden-coverage--prompt-subagent-topology.md diff --git a/memory/PLAN.md b/memory/PLAN.md index 13354ff02..06acb9ec7 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -15,7 +15,7 @@ Brunch-next has delivered the original composition spine: the host, sealed Pi profile, transcript substrate, SQLite graph plane, public RPC, TUI/web observer shape, generalized capture, review-set commitment path, and public-entry ship gate all have evidence. The live plan is no longer organized around the old delivery cut. Active work is now the elicitor capability spine and the remaining hardening frontiers that build on that substrate. -**Active arcs.** Work is organized into multi-frontier **initiatives (arcs)** — see [§Initiatives](#initiatives) for through-lines, member frontiers, and done-definitions: the completed **skill-substrate** arc (populate / weed / lock), the active **elicitor-capability-spine** arc (`capture` / `generate` done, `project` next), and the active **context-pipeline** arc (PULL / PROJECT / COMPOSE locked, RENDER open). +**Active arcs.** Work is organized into multi-frontier **initiatives (arcs)** — see [§Initiatives](#initiatives) for through-lines, member frontiers, and done-definitions: the completed **skill-substrate** arc (populate / weed / lock), the active **elicitor-capability-spine** arc (`capture` / `generate` done, `project` next), and the active **context-pipeline** arc (PULL / PROJECT / COMPOSE locked, RENDER still open for final prompt/subagent topology closure). 
**Topology and evidence discipline.** Directory `README.md` files under `src/**` own current topology state. `memory/SPEC.md` owns product contract and architectural decisions; `memory/PLAN.md` owns only rolling frontier state. Scratch probe artifacts under `.fixtures/scratch/` are not durable evidence until reviewed and promoted to `.fixtures/runs/`. @@ -48,7 +48,7 @@ Brunch-next has delivered the original composition spine: the host, sealed Pi pr - **Done-definition:** all three capabilities carry promoted real-model evidence; no capability remains a stub or a method-less axis member. Open follow-ups (A32-L fan-in completion, the A1 anti-prompt) are tracked on their assumptions, not as arc blockers. - **Anchors:** D95-L, D96-L; A31-L–A35-L; I51-L. -### context-pipeline — ✓ done (2026-06-26) +### context-pipeline — ◐ active - **Goal:** lock the PULL → PROJECT → RENDER → COMPOSE context pipeline (D60-L). @@ -56,24 +56,23 @@ Brunch-next has delivered the original composition spine: the host, sealed Pi pr context-pipeline/ ├── PULL graph + session reads ✓ done ├── PROJECT projections/ ✓ done -├── RENDER agents/contexts + local human outputs ✓ done -└── COMPOSE system-prompts + skills ✓ done +├── RENDER agents/contexts + local human outputs ◐ open: prompt/subagent topology flattening +└── COMPOSE system-prompts + skills ✓ done* -COMPOSE's deferred full-stack real-rendered-context tripwire was discharged by `renderer-golden-coverage`. +*COMPOSE's deferred full-stack real-rendered-context tripwire is discharged, but RENDER remains open until prompt/subagent resource topology matches the accepted `src/agents/` shape. ``` -- **Done-definition:** ✓ every pipeline stage closed; the COMPOSE full-stack tripwire was discharged by RENDER. `renderer-golden-coverage` remains a parallel evidence/quality track, never a ship gate. 
+- **Done-definition:** every pipeline stage closed; COMPOSE's full-stack tripwire discharged by RENDER; foreground prompt bodies flattened under `src/agents/prompts/{elicitor,executor}.md`; background subagent bodies flattened under `src/agents/subagents/{explorer,researcher,projector,reviewer}.md`; no stale `prompts//SYSTEM.md` convention remains in docs, tests, or packaged asset copying.
 - **Anchors:** D60-L; D83-L (RENDER house style).

 ## Sequencing

 ### Active

-- None.
+- `renderer-golden-coverage` (FE-1091) — active continuation for prompt/subagent topology flattening after the render/prompt lock sweep. Prior sweep rows closed renderer and assembly evidence, but the accepted topology still requires flat foreground prompt files and a separate flat subagent resource home.

 ### Recently Completed

-- 2026-06-26 `renderer-golden-coverage` (FE-1091) — **context-pipeline RENDER + COMPOSE tripwire complete.** Closed the render/prompt sweep ledger: context-reference harvest disposition, foreground `elicitor`/`executor` prompt-body topology, background subagent body topology, foreground and adapter prompt assembly goldens, background subagent assembly snapshot, `specification/` → `spec/` renderer home, graph-derived spec/plan markdown outputs, session runtime frame D98 wording, seed renderer oracle note, elicitation text coverage, and structured-exchange renderer inventory coverage. `npm run verify` passed after the asset-copy fix.
- 2026-06-26 `data-model-legibility` (FE-1090) — **reference substrate complete.** Generated ontology tables are materialized from typed graph sources with `check:data-model`; authored graph-authoring heuristics are cited by `capture` + `commit-graph`; the final checkability/subtype audit closed with no schema/runtime expansion: progressive checkability is accepted only as skill-local oracle conduct, `checkability`/`strength` fields are rejected carrying cost, subtype enums are rejected as parallel ontology, and `detail.form` remains inert payload plus renderer hook. - 2026-06-25 `elicitor-generate` (FE-1059) — **generate capability done through promoted A31-L fan-out evidence.** Built slices: `present_candidates` tool/projection/renderer + pick path; intent/design/oracle facets under one plane-parameterized `generate-proposal` method; progressive-disclosure references; real-boot activation check; and real-model fan-out witness harness. Promoted run `.fixtures/runs/generate-fan-out/2026-06-24T16-51-13-704Z/` passed with `openai-codex/gpt-5.5`: oracle lens pinned, `SKILL.md` and `references/oracle.md` read, `present_candidates` emitted, no pre-prompt kick, no graph delta, no `mutate_graph`, and no approved review result. A32-L fan-in completion and the A1 anti-prompt remain follow-ups, not branch debt. - 2026-06-24 `subagent-reconciliation` (FE-1054) — foreground/background reconciliation complete through the execute-mode readiness target (D90-L-D93-L/I49-L): shared `AgentManifest`, code-owned background discovery, semi-permeable injected-world child sessions, sovereign grants gated by code-owned `canDelegate`, return rendering, and live `execute` -> `orchestrator` mode with a product-registered stub tool. `code` -> `pi-coder` remains future work. 
@@ -162,12 +161,13 @@ COMPOSE's deferred full-stack real-rendered-context tripwire was discharged by ` - **Linear:** [FE-1091](https://linear.app/hash/issue/FE-1091/renderer-golden-coverage-and-prompt-assembly-lock) - **Branch:** `ln/fe-1091-renderer-golden-coverage-and-prompt-assembly-lock` - **Kind:** coverage + build / hardening -- **Status:** done. Sweep ledger exhausted and retired. Foreground prompt bodies use `elicitor` / `executor`; background bodies are locked as subagent resources under the shared body-file convention; prompt assembly paths are golden/semantically locked; selected-spec context lives under `src/agents/contexts/spec/`; graph-derived spec/plan markdown outputs are covered; remaining renderer partials have focused goldens, semantic invariants, or an explicit README oracle note. +- **Status:** active. The render/prompt sweep ledger was exhausted for renderer and assembly evidence, but topology closure remains: foreground prompt bodies must flatten to `src/agents/prompts/{elicitor,executor}.md`, and background subagent bodies must flatten to `src/agents/subagents/{explorer,researcher,projector,reviewer}.md` rather than remaining under nested `prompts//SYSTEM.md` directories. +- **Current execution pointer:** `memory/cards/renderer-golden-coverage--prompt-subagent-topology.md`. - **Certainty:** earned — RENDER topology is now established; this frontier closed coverage, prompt assembly evidence, and stale topology ambiguity rather than proving a new seam. - **Closes:** context-pipeline RENDER stage plus the COMPOSE full-stack real-rendered-context tripwire. - **Locks in:** D83-L house style for model-facing context surfaces and prompt assembly as a golden/semantic-invariant surface. -- **Objective:** Finish the RENDER stage and lock system-prompt assembly as a golden surface. Remaining work lives by audience: model-facing context and prompt text under `src/agents/`, human/product text beside its app/session owner. 
Incidental prompt remodelling belongs here only when needed to make prompt assembly lockable: foreground prompts should collapse toward `elicitor` / `executor`, subagent prompt bodies should live as subagent resources, and `src/agents/` topology should make `contexts`, `prompts`, `runtime`, `shared`, `skills`, and `subagents` roles legible. This frontier also extends D83-L to thin graph-derived markdown document outputs for selected-spec and plan-plane material, as future web/download response sources. -- **Acceptance:** `src/agents/contexts/README.md`, `src/agents/prompts/README.md`, `src/agents/runtime/README.md`, `src/app/README.md`, and `src/session/README.md` carry the audience/topology split; required model-facing renderer rows are built in the house style and locked with focused goldens/semantic invariants; system prompt assembly is locked with goldens/semantic invariants; selected-spec context moves from `contexts/specification/specification-context.ts` to `contexts/spec/spec-context.ts`; `contexts/spec/spec-output.ts` and `contexts/plan/plan-output.ts` use md-pen to render thin markdown-flattened outputs from graph/projection input rather than from `memory/SPEC.md` / `memory/PLAN.md`; no adapter/transport imports enter `agents/contexts/`; prompt topology remodel deletes obsolete role/body aliases rather than preserving compatibility shims. +- **Objective:** Finish the RENDER stage and lock system-prompt assembly as a golden surface. Remaining work lives by audience: model-facing context and prompt text under `src/agents/`, human/product text beside its app/session owner. 
Incidental prompt remodelling belongs here only when needed to make prompt assembly lockable: foreground prompts flatten to `src/agents/prompts/elicitor.md` and `src/agents/prompts/executor.md`; subagent prompt bodies flatten to `src/agents/subagents/{explorer,reviewer,researcher,projector}.md`; `src/agents/` topology must make `contexts`, `prompts`, `runtime`, `shared`, `skills`, and `subagents` roles legible. This frontier also extends D83-L to thin graph-derived markdown document outputs for selected-spec and plan-plane material, as future web/download response sources. +- **Acceptance:** `src/agents/contexts/README.md`, `src/agents/prompts/README.md`, `src/agents/runtime/README.md`, `src/agents/subagents/README.md`, `src/app/README.md`, and `src/session/README.md` carry the audience/topology split; required model-facing renderer rows are built in the house style and locked with focused goldens/semantic invariants; system prompt assembly is locked with goldens/semantic invariants; selected-spec context moves from `contexts/specification/specification-context.ts` to `contexts/spec/spec-context.ts`; `contexts/spec/spec-output.ts` and `contexts/plan/plan-output.ts` use md-pen to render thin markdown-flattened outputs from graph/projection input rather than from `memory/SPEC.md` / `memory/PLAN.md`; foreground prompt files are flat (`prompts/elicitor.md`, `prompts/executor.md`); subagent files are flat under `subagents/`; no adapter/transport imports enter `agents/contexts/`; prompt topology remodel deletes obsolete role/body aliases rather than preserving compatibility shims. - **Traceability:** D19-L, D40-L, D52-L, D58-L, D60-L, D62-L, D83-L, D98-L. 
 ### exchange-symmetry-audit
diff --git a/memory/SPEC.md b/memory/SPEC.md
index bce9db773..91d330cc5 100644
--- a/memory/SPEC.md
+++ b/memory/SPEC.md
@@ -267,17 +267,17 @@ The POC's purpose is to prove three things: (a) that pi's coding-agent harness c
Depends on: D12-L, D18-L, D37-L, D50-L. Refined by: D82-L (the digest step generalizes the preface pattern for bulk acquisition — capture runs over the assistant-authored characterization, not the raw bulk). Supersedes: inventing a candidate-artefact substrate merely to carry ordinary next-question disambiguation material. - **D50-L — `capture_*` tools persist transcript-native ANALYSIS, not graph mutations.** Brunch may add a third structured-exchange tool family such as `capture_analysis` alongside `present_*` and `request_*`. A `capture_*` tool returns a normal persisted Pi `toolResult` with Brunch details and markdown content describing likely graph/node/edge changes, grouped into high-confidence candidates that could be committed later and low-confidence candidates that should drive clarification. `capture_*` output is transcript-visible evidence for Markdown/ASCII review and later graph-mutation cross-checking, but it is not graph truth and never bypasses the `CommandExecutor`. Product UI should hide capture analysis entirely if Pi exposes a supported hide seam; otherwise `renderResult` should be maximally collapsed/minimal while preserving full persisted `toolResult.content`/`details` for transcript renderers. The current schema layer deliberately defines only minimum capture details (`schema`, `v`, `exchange_id`, `tool_meta`) and rejects graph payloads; richer analysis payloads and shared component subparts (`Preface`, prompt body, option list, answer summary, capture analysis) require a later `ln-design` pass before implementation. Depends on: D12-L, D17-L, D18-L, D37-L, D41-L, D47-L. Supersedes: using ad hoc hidden custom entries, probe-only side files, or graph writes as the first carrier for pre-graph analysis. 
-- **D44-L — Subagents are main-agent-invoked, blocking Pi tool calls that gather data and propose variants through sealed SDK child sessions.** Brunch may register a single `subagent` Pi tool whose parameters are either `{ agent, task }` or `{ tasks: [] }` (parallel), never both. Each invocation runs an in-process SDK `AgentSession` built from explicit sealed services: in-memory settings/auth/session managers, no ambient resource discovery, a per-agent system prompt, the parent's model registry, and an explicit read-only/no-tool allowlist. The subagent has no inherited conversation context, so the task string must carry everything it needs. Background agent definitions are declarative `SYSTEM.md` bodies under the shared `src/agents/prompts//SYSTEM.md` convention with TypeBox-validated frontmatter (`name`, `description`, `tools`, `model`, `thinking`) plus a system-prompt body; duplicate frontmatter keys fail loud. Concurrency cap lives in [src/.pi/extensions/subagents/config.json](src/.pi/extensions/subagents/config.json) (default 4). The subagent's result text is returned directly to the main agent as tool result content; subagents do not append custom messages to the session log on their own behalf, do not invoke the `CommandExecutor`, and do not gain access to the parent's Brunch RPC handlers. Registration is opt-in: `src/app/pi-subagents.ts` can assemble deps for `createBrunchPiExtensions(...)`, but launch paths that omit those deps do not register or advertise the tool. POC starter agents split into two families: +- **D44-L — Subagents are main-agent-invoked, blocking Pi tool calls that gather data and propose variants through sealed SDK child sessions.** Brunch may register a single `subagent` Pi tool whose parameters are either `{ agent, task }` or `{ tasks: [] }` (parallel), never both. 
Each invocation runs an in-process SDK `AgentSession` built from explicit sealed services: in-memory settings/auth/session managers, no ambient resource discovery, a per-agent system prompt, the parent's model registry, and an explicit read-only/no-tool allowlist. The subagent has no inherited conversation context, so the task string must carry everything it needs. Background agent definitions are declarative flat markdown files under `src/agents/subagents/.md` with TypeBox-validated frontmatter (`name`, `description`, `tools`, `model`, `thinking`) plus a system-prompt body; duplicate frontmatter keys fail loud. Concurrency cap lives in [src/.pi/extensions/subagents/config.json](src/.pi/extensions/subagents/config.json) (default 4). The subagent's result text is returned directly to the main agent as tool result content; subagents do not append custom messages to the session log on their own behalf, do not invoke the `CommandExecutor`, and do not gain access to the parent's Brunch RPC handlers. Registration is opt-in: `src/app/pi-subagents.ts` can assemble deps for `createBrunchPiExtensions(...)`, but launch paths that omit those deps do not register or advertise the tool. POC starter agents split into two families: - **Data gatherers** — read-only context fetchers whose output grounds proposals: **explorer** (codebase + selected-spec graph recon: `read`, `grep`, `find`, `ls`, `read_graph`) and **researcher** (web research: `web_search`, `web_fetch`). `read_graph` is granted only when the app root injects parent graph readers; no write-capable graph child exists yet. - **Projectors/reviewers** — **projector** (no tools) emits one variant of a candidate proposal from a grounding bundle and lens frame; **reviewer** (no tools) checks supplied candidate material before main-agent presentation or commitment. 
The main agent achieves diversity by issuing parallel `tasks: []` invocations of `projector` with intentionally distinct framings — the subagent realization of the "design it twice" pattern from `ln-design` and the parallel fan-out anticipated by `ln-oracles`. Each `projector` invocation runs in its own isolated context so variants don't cross-contaminate; the main agent collects outputs and owns any product write. This division mirrors the batch-proposal flow in D26-L. Worker-style write-capable subagents and nesting remain deferred beyond the initial foreground-agent standup; D98-L moves the future write/execution surface under CODE/executor rather than a separate execute/orchestrator mode. Cross-extension agent registration (Amos's `globalThis.__pi_subagents` bridge), raw `pi` subprocesses, and ambient `~/.pi` discovery are rejected for the POC because they conflict with profile sealing. Subagents remain an optional enhancement to candidate-proposal diversity and future delegated acquisition, not a load-bearing M0–M9 substrate. Depends on: D2-L, D26-L, D27-L, D30-L, D31-L, D39-L, D40-L, D41-L. Distinct from: D15-L Side task (non-blocking, status-via-custom-message), the deferred Side chat (user-invoked overlay; see Future Direction Register). Supersedes: subprocess/argv-shaped subagents and the `globalThis.__pi_subagents` bridge. Refined by: D90-L (shared foreground/background manifest + code-owned background discovery), D91-L (semi-permeable seal + assembled prompt + injected world), D92-L (sovereign tool grants + op_mode delegatable-set gate). 
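The D44-L parameter contract (either `{ agent, task }` or `{ tasks: [] }`, never both, with a concurrency cap that defaults to 4) could be sketched roughly as below. This is an illustration under stated assumptions, not Brunch's implementation: `SubagentTask`, `normalizeSubagentParams`, and `runWithCap` are hypothetical names, and the plain checks stand in for whatever schema validation the real tool uses.

```typescript
// Hypothetical shapes for illustration; not taken from the Brunch codebase.
interface SubagentTask {
  agent: string;
  task: string;
}

interface RawSubagentParams {
  agent?: string;
  task?: string;
  tasks?: SubagentTask[];
}

// Normalize either accepted form into a task list; reject the mixed form.
function normalizeSubagentParams(raw: RawSubagentParams): SubagentTask[] {
  const hasSingle = raw.agent !== undefined || raw.task !== undefined;
  const tasks = raw.tasks;
  if (tasks !== undefined) {
    if (hasSingle) {
      throw new Error("pass either { agent, task } or { tasks }, never both");
    }
    return tasks;
  }
  if (raw.agent === undefined || raw.task === undefined) {
    throw new Error("single mode needs both agent and task");
  }
  return [{ agent: raw.agent, task: raw.task }];
}

// Run work items with at most `cap` in flight; results keep input order.
async function runWithCap<T, R>(
  items: T[],
  cap: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }
  const laneCount = Math.min(cap, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

A caller fanning out `projector` variants would normalize the params once, then pass the task list to `runWithCap` with the configured cap, keeping each child isolated inside its own `worker` call.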
-- **D90-L — Foreground and background agents share one manifest model; background discovery is code-owned (frontmatter is authoring DX, not a second agent model).** Agent definitions project into one `AgentManifest` (`id`, `kind`, `description`, `model`, `thinking`, body at the canonical `src/agents/prompts//SYSTEM.md` convention, a skills grant, a tools grant, and a `canDelegate` set naming the background agents it may spawn — D92-L/D93-L) discriminated by `kind: "foreground" | "background"` — the execution **lifecycle/host**, not a noun: a foreground agent is a live op_mode-derived Pi session; a background agent is a spawned-to-completion sealed child. The kinds keep **distinct authority sources**: a foreground agent's identity is derived from `op_mode` (D40-L) and its tool/skill legality is dynamic (op_mode policy + live gaps); a background agent's identity is caller-chosen (`{agent, task}`) and its skills/tools come from its authored manifest. DX-vs-strictness is reconciled by keeping **frontmatter as the authoring surface** for background agents while making **discovery code-owned**: the `readdir` scan over `agents/*.md` is retired for an explicit registry id list (mirroring how `state.ts` loads foreground bodies/skills through `loadSkills({ skillPaths, includeDefaults: false })`), so D39-L "no filesystem discovery" holds and frontmatter authoring survives. "subagent" stays the tool/UX noun (the main-agent tool call), not the kind name. Depends on: D39-L, D40-L, D44-L, D58-L. Refines: D44-L (the parallel frontmatter-discovered format collapses into the shared manifest; background agent bodies migrated from extension-local `.md` discovery onto the canonical `src/agents/prompts//SYSTEM.md` convention, so SPEC carries one agent-body layout — D44-L and the `src/.pi/extensions/subagents/README.md` topology notes reconcile to that path). Establishing frontier: `subagent-reconciliation`. 
Supersedes: `readdir` filesystem discovery of subagent definitions; the standalone subagent frontmatter format as a second, separate agent model. +- **D90-L — Foreground and background agents share one manifest model; background discovery is code-owned (frontmatter is authoring DX, not a second agent model).** Agent definitions project into one `AgentManifest` (`id`, `kind`, `description`, `model`, `thinking`, body at the canonical flat foreground prompt or subagent resource path, a skills grant, a tools grant, and a `canDelegate` set naming the background agents it may spawn — D92-L/D93-L) discriminated by `kind: "foreground" | "background"` — the execution **lifecycle/host**, not a noun: a foreground agent is a live op_mode-derived Pi session; a background agent is a spawned-to-completion sealed child. The kinds keep **distinct authority sources**: a foreground agent's identity is derived from `op_mode` (D40-L) and its tool/skill legality is dynamic (op_mode policy + live gaps); a background agent's identity is caller-chosen (`{agent, task}`) and its skills/tools come from its authored manifest. DX-vs-strictness is reconciled by keeping **frontmatter as the authoring surface** for background agents while making **discovery code-owned**: the `readdir` scan over `agents/*.md` is retired for an explicit registry id list (mirroring how `state.ts` loads foreground bodies/skills through `loadSkills({ skillPaths, includeDefaults: false })`), so D39-L "no filesystem discovery" holds and frontmatter authoring survives. "subagent" stays the tool/UX noun (the main-agent tool call), not the kind name. Depends on: D39-L, D40-L, D44-L, D58-L. 
Refines: D44-L (the parallel frontmatter-discovered format collapses into the shared manifest; background agent bodies migrate from extension-local `.md` discovery onto the canonical `src/agents/subagents/.md` resource home, while foreground bodies live in `src/agents/prompts/{elicitor,executor}.md`; D44-L and the `src/.pi/extensions/subagents/README.md` topology notes reconcile to that split). Establishing frontier: `subagent-reconciliation`; final topology closure rides `renderer-golden-coverage`. Supersedes: `readdir` filesystem discovery of subagent definitions; the standalone subagent frontmatter format as a second, separate agent model; nested `src/agents/prompts//SYSTEM.md` body paths. - **D91-L — Background subagents run a semi-permeable seal: explicitly-injected parent world handles plus an assembled (not verbatim) prompt; ambient leakage stays closed.** This deliberately reopens the D44-L/I29-L "no graph access, no Brunch RPC, no inherited context" clause. The seal stays closed against **ambient** leakage (in-memory auth/settings/session, no `~/.pi` discovery — D39-L intact) but opens to **explicitly injected** parent world handles the app root (`src/app/pi-subagents.ts`) supplies at spawn: the same `GraphReaders` the foreground uses scoped to the parent's `specId`, the spec/workspace context seed, and a bounded **session digest** (the parent branch flattened via `sessionManager.getBranch()`, the pattern in pi's `summarize.ts` example). The child's system prompt becomes **assembled, not verbatim**: body + a background control header (sealed child, delegated task, snapshot view) + world snapshot + a `` manifest built from the manifest's skills grant + router rules — reusing the foreground composer's extracted prompt-skill core (`renderBrunchSkills`, the skill-manifest loader) plus the selected workspace/spec seed renderer from `src/agents/contexts/seeds/turn-context.ts`, minus the foreground-only elicitation-recommendation block. 
World binding is **snapshot-at-spawn** (the child runs to completion against a fixed view) where the foreground is live-per-turn. Read access is asymmetric **by design**: the **session digest** is a snapshot block baked into the prompt (expensive, rarely re-pulled), while the **graph** is exposed as Brunch read tools (`read_graph` now; `read_session_context`, `read_elicitation_gaps`, … remain future grants) the child calls on demand (a recon agent iterates on graph). Return to the main agent is the ordinary tool-call result: findings re-enter main-agent context as the tool-result `content`; the structured `details` payload (`{ agent, status, text, … }`) is render-only via custom `renderCall`/`renderResult`, never model context. Write-capable children stay deferred (gated by D92-L); when they land, a `mutate_graph` against the parent's `specId` is a real side effect crossing back *outside* the tool result, and is named here so the write slice does not surprise. Depends on: D39-L, D43-L, D44-L, D58-L, D60-L, D82-L. Establishing frontier: `subagent-reconciliation`. Supersedes: the D44-L/I29-L clause that subagents have no graph access, no Brunch RPC/graph reads, no inherited world context, and a verbatim-body system prompt. - **D92-L — Background tool grants are sovereign per-agent ceilings gated by a code-owned, op_mode-keyed delegatable-set allowlist — not parent-subset containment.** The earlier containment invariant (child tools ⊆ the parent's current legal set) is rejected: delegation may be **capability-inverting on purpose** — a foreground agent may spawn a narrow higher-privilege child (e.g. a file-writing worker) so a risky write is quarantined in a child that does one job and exits. Each background agent's tool grant is therefore **sovereign** (authored in its manifest; may exceed the parent's). 
The surviving safety boundary is not a tool subset but **which background agents an op_mode may spawn**: a **code-owned, op_mode-keyed delegatable-set allowlist** living beside the op_mode policy, *not* authored in frontmatter (otherwise a manifest could self-advertise into a read-only mode). This lifts D40-L's registration ≠ advertisement from tools to agents: every background agent is registered; op_mode decides which are advertised as spawnable. A read-only `elicit` session is write-safe because elicit's delegatable set **excludes** write-capable agents, not because children are subset-bounded. Enabling write tools later = author the write-capable worker manifest + add it to the relevant op_mode's delegatable set (an advertisement change), not a re-derivation of parent authority. Depends on: D39-L, D40-L, D44-L. Establishing frontier: `subagent-reconciliation`. Refined by: D93-L (the delegatable-set allowlist becomes a per-agent `AgentManifest` `canDelegate` field; for a foreground mode it is that mode's code-owned delegatable set, and it generalizes to background→background nesting). Supersedes: the parent-subset tool-containment model for subagents; D44-L's "read-only/no-tool allowlist" as the only background tool posture; the framing that write-capable subagents wait on an execute mode raising both parent and child ceilings together. - **D93-L — Operational mode and foreground agent collapse to one op-mode-keyed source of truth.** A foreground agent and its operational mode are 1:1 (D40-L: the foreground agent is derived from operational mode), so the prior **three-record fragmentation** — id enums in `src/session/schema/kinds.ts`, `OPERATIONAL_MODE_DEFINITIONS` + `AGENT_ROLE_DEFINITIONS` + `TOOL_POLICY_DEFINITIONS` in the former projections runtime-policy module, and `AGENT_PROMPT_DEFINITIONS` in `src/agents/runtime/state.ts` (which duplicated `model`/`thinking`/prompt-resource grants across two of them) — collapses to a **single op-mode-keyed record**. 
An operational mode IS `{ foreground AgentManifest (D90-L), tool policy, canDelegate set }`; background agents live in a sibling `AgentManifest` registry, and the per-agent **`canDelegate`** field (D92-L generalized from op_mode-keyed to a manifest field) links a foreground mode to the background agents it may spawn — **code-owned for foreground modes** so the write-safety boundary (I49-L) holds; it also generalizes to background→background nesting. D98-L refines the roster from the earlier `elicit` / `execute` / `code` split to the product target **`SPEC` → `elicitor`** and **`CODE` → `executor`**. The executor merges the prior `orchestrator` and `pi-coder` directions: it is Brunch-data-aware, can perform ordinary coding-assistant work under the CODE tool policy, and owns the plan-execution orchestration tool surface instead of forcing a separate execute coordinator. Depends on: D23-L, D40-L, D58-L, D90-L, D92-L, D98-L; I49-L. Establishing frontier: `subagent-reconciliation` established the shared manifest/collapse substrate; the SPEC/CODE roster correction is owned by the data-model-legibility / executor follow-on planning. Supersedes: the three-record foreground-agent fragmentation as separate sources of truth; `defaultRole`/`allowedRoles` as a flexible many-roles-per-mode model (it is 1:1); and the three-foreground-mode split where `execute`/`orchestrator` and `code`/`pi-coder` were separate product directions. - **D36-L — Spec/session selection is a reusable hierarchical decision model with transport-specific presentations.** Brunch owns a pure spec/session selection model that renders cwd-scoped inventory under the discovered project name without calling the user-created object a “workspace”. 
In TUI mode, the model may present a fast “continue last session” affordance when `.brunch/workspace.json` points to a valid spec+session; otherwise, or after “other spec/session”, the durable tree is: `create new spec → provide spec name → session created automatically`; `resume existing spec → choose existing spec → create a new session OR resume existing session → choose existing session`. The UI should not list every spec as a top-level action label; “resume existing spec” is the top-level intent, and the spec list is the next screen/scrollable selector. The model returns a product decision (`new spec`, `new session for spec`, `open session`, `continue selected session`, `cancel/quit`) without opening Pi sessions or mutating `.brunch/workspace.json` itself. The `WorkspaceSessionCoordinator` activates that decision and owns all persistence/session-binding effects. TUI startup and in-session paths share branded `pi-tui` components and colocated logo assets under `src/.pi/components/workspace-dialog`; adapters differ only in terminal lifecycle and Pi session-replacement mechanics (`ProcessTerminal`/`TUI.showOverlay` before Pi starts, `ctx.ui.custom(..., { overlay: true })` inside Pi), not in product semantics. RPC/headless transports must not invoke the TUI picker; they expose the same initial-selection requirement and activation decisions as JSON-RPC/product results so CLI JSON-RPC clients can select or create spec/session correctly. Depends on: D11-L, D21-L, D24-L, D33-L. Supersedes: implicit resume of `.brunch/workspace.json` on TUI launch, Pi `/resume`/`/new` as Brunch's product session chooser, one-off startup-only picker implementations, a flat action list that says “workspace” for specs, top-level `resume spec X` labels, and a separate intermediate action chooser for switching. 
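The D36-L split between a pure decision model and an activating coordinator could look roughly like this. It is a sketch only: `SelectionDecision`, `SelectionInput`, and `decide` are hypothetical names, and the mapping from intents to decisions is an assumption drawn from the decision list in the text.

```typescript
// Hypothetical names throughout; a sketch of the D36-L decision shape only.
type SelectionDecision =
  | { kind: "new-spec"; specName: string }
  | { kind: "new-session-for-spec"; specId: string }
  | { kind: "open-session"; specId: string; sessionId: string }
  | { kind: "continue-selected-session" }
  | { kind: "cancel" };

interface SelectionInput {
  intent: "continue" | "create-spec" | "resume-spec" | "cancel";
  specName?: string;
  specId?: string;
  // Assumption: an absent sessionId under "resume-spec" means "create a new session".
  sessionId?: string;
}

// Pure function: returns a decision and performs no persistence or session effects.
function decide(input: SelectionInput): SelectionDecision {
  switch (input.intent) {
    case "continue":
      return { kind: "continue-selected-session" };
    case "create-spec":
      if (!input.specName) throw new Error("create-spec needs a spec name");
      return { kind: "new-spec", specName: input.specName };
    case "resume-spec":
      if (!input.specId) throw new Error("resume-spec needs a spec id");
      return input.sessionId
        ? { kind: "open-session", specId: input.specId, sessionId: input.sessionId }
        : { kind: "new-session-for-spec", specId: input.specId };
    case "cancel":
      return { kind: "cancel" };
  }
}
```

Because `decide` has no side effects, a TUI picker and a JSON-RPC handler could both call it and hand the result to the coordinator, which matches the text's requirement that transports differ in presentation rather than product semantics.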
- **D42-L — Session naming is Pi `session_info` presentation metadata, not spec identity.** Brunch-created sessions should be named at creation with neutral workspace-global defaults (`Untitled Session 1`, `Untitled Session 2`, …) so pickers/chrome never show an unnamed Brunch session and unchanged defaults do not collide across specs in the same cwd. These defaults are immediate lifecycle metadata, not LLM-generated summaries and not derived from the selected spec title. Brunch may later use Pi session lifecycle hooks to opportunistically replace a default with a short human-readable name that characterizes what happened in the transcript. The preferred generation trigger is `session_shutdown` for `quit`, `new`, and `resume` replacements because it sees the just-finished transcript and can name it before later picker lists need to distinguish sessions; `session_before_compact` or post-compaction (`session_compact`) may be used to refresh names after major summarization, and a manual/user rename command can force or override naming. The generation call should mirror the model-selection pattern in the local `summarize.ts` extension example: choose a cheap/fast authorized model, extract user/assistant text plus salient tool calls from the current branch, ask for a concise title, and append a Pi `session_info` entry through `SessionManager.appendSessionInfo`. Naming must be best-effort and non-blocking with a tight budget: failures, missing auth, empty transcripts, or shutdown aborts preserve the existing default/user label rather than blocking session replacement or exit. Session display names label sessions in pickers and chrome, but do not affect spec ids, session bindings, graph truth, or replay semantics. Depends on: D6-L, D17-L, D21-L, D35-L. 
Supersedes: using spec title or session UUID alone as the only durable display label once transcripts have meaningful content, leaving Brunch-created sessions unnamed, spec-local default numbering, or treating generated session names as canonical spec identity. -- **D58-L — Brunch prompt composition is a thin runtime header plus load-on-demand prompt resources, not eager selection of every objective pack.** The architectural commitment is: composition stays a projection layer, not a behavioral state machine; detailed guidance lives in read-on-demand prompt resources and agent-readable references rather than eager prompt-pack concatenation; runtime availability is Brunch's sealed resource manifest, not ambient Pi discovery; D98-L suspends prompt-resource axes as runtime state, so composition may advertise resources/pointers without presenting strategy/lens/method as selected posture; and the pushed-context slice stays compact, with deeper access governed by D60-L. Current prompt-resource topology, manifest emission, file-owned skill metadata, seed context composition, and ownership split across `agents/prompts/`, `agents/skills/`, `agents/runtime/`, `agents/contexts/`, and `.pi/extensions/agent-runtime/` live in [`src/agents/README.md`](src/agents/README.md), [`src/agents/prompts/README.md`](src/agents/prompts/README.md), [`src/agents/skills/README.md`](src/agents/skills/README.md), [`src/agents/runtime/README.md`](src/agents/runtime/README.md), [`src/agents/contexts/README.md`](src/agents/contexts/README.md), [`src/.pi/README.md`](src/.pi/README.md), [`src/.pi/extensions/README.md`](src/.pi/extensions/README.md), [`src/agents/runtime/compose.ts`](src/agents/runtime/compose.ts), [`src/agents/runtime/state.ts`](src/agents/runtime/state.ts), and [`src/agents/contexts/seeds/turn-context.ts`](src/agents/contexts/seeds/turn-context.ts). 
**Base-prompt relationship (validated 2026-06-18, slice 1):** the `before_agent_start` handler **appends** Brunch's composed block (now led by the agent `SYSTEM.md` body, then runtime header + manifests) to Pi's base system prompt (`${basePrompt}\n\n${composed}`), so a foreground agent currently *augments* Pi's base coding-agent prompt rather than replacing it. Whether a foreground role's `SYSTEM.md` body should suppress or replace that base is **open** and tied to the future `pi-coder` op-mode (which deliberately augments Pi's coding agent); the `elicitor` augmenting a coding base is a known follow-on question, not a settled choice. Refined by: D93-L (the `code`→`pi-coder` foreground mode instantiates the augment case; the replace option for other roles stays open). Composition is projection, not a behavioral state machine. Depends on: D23-L, D25-L, D39-L, D40-L, D52-L, D59-L, D60-L. Refined by: D85-L (implemented 2026-06-18/19: the manifest drops `` — two axes `strategy` + `lens` — and the `goal` body inlines into the `elicitor` role prompt) and by the 2026-06-22 prompt-skill-topology slice (all prompt resources adopt Agent Skills `SKILL.md` topology; `description` becomes file-owned frontmatter; the emitted wrapper becomes `` with per-skill ``). Supersedes: the flat "base + mode + role + strategy + lens + grade + …" layering; the fixed all-packs concatenation in `compose-brunch-prompt.ts`; "role preset / runtime bundle" as the composition unit; direct Layer-2 eager prompt-pack injection as the default mechanism; treating top-level `src/agents/` as Pi-only rather than Brunch LLM-context ingress; and `capability` as a parallel name for `method`. 
+- **D58-L — Brunch prompt composition is a thin runtime header plus load-on-demand prompt resources, not eager selection of every objective pack.** The architectural commitment is: composition stays a projection layer, not a behavioral state machine; detailed guidance lives in read-on-demand prompt resources and agent-readable references rather than eager prompt-pack concatenation; runtime availability is Brunch's sealed resource manifest, not ambient Pi discovery; D98-L suspends prompt-resource axes as runtime state, so composition may advertise resources/pointers without presenting strategy/lens/method as selected posture; and the pushed-context slice stays compact, with deeper access governed by D60-L. Current prompt-resource topology, manifest emission, file-owned skill metadata, seed context composition, and ownership split across `agents/prompts/`, `agents/subagents/`, `agents/skills/`, `agents/runtime/`, `agents/contexts/`, and `.pi/extensions/agent-runtime/` live in [`src/agents/README.md`](src/agents/README.md), [`src/agents/prompts/README.md`](src/agents/prompts/README.md), [`src/agents/subagents/README.md`](src/agents/subagents/README.md), [`src/agents/skills/README.md`](src/agents/skills/README.md), [`src/agents/runtime/README.md`](src/agents/runtime/README.md), [`src/agents/contexts/README.md`](src/agents/contexts/README.md), [`src/.pi/README.md`](src/.pi/README.md), [`src/.pi/extensions/README.md`](src/.pi/extensions/README.md), [`src/agents/runtime/compose.ts`](src/agents/runtime/compose.ts), [`src/agents/runtime/state.ts`](src/agents/runtime/state.ts), and [`src/agents/contexts/seeds/turn-context.ts`](src/agents/contexts/seeds/turn-context.ts). 
**Base-prompt relationship (validated 2026-06-18, slice 1):** the `before_agent_start` handler **appends** Brunch's composed block (now led by the foreground prompt body, then runtime header + manifests) to Pi's base system prompt (`${basePrompt}\n\n${composed}`), so a foreground agent currently *augments* Pi's base coding-agent prompt rather than replacing it. Whether a foreground prompt body should suppress or replace that base is **open** and tied to the future executor/CODE op-mode (which deliberately augments Pi's coding agent); the `elicitor` augmenting a coding base is a known follow-on question, not a settled choice. Refined by: D93-L (the `code`→`pi-coder` foreground mode instantiates the augment case; the replace option for other roles stays open). Composition is projection, not a behavioral state machine. Depends on: D23-L, D25-L, D39-L, D40-L, D52-L, D59-L, D60-L. Refined by: D85-L (implemented 2026-06-18/19: the manifest drops `` — two axes `strategy` + `lens` — and the `goal` body inlines into the `elicitor` role prompt) and by the 2026-06-22 prompt-skill-topology slice (all prompt resources adopt Agent Skills `SKILL.md` topology; `description` becomes file-owned frontmatter; the emitted wrapper becomes `` with per-skill ``). Supersedes: the flat "base + mode + role + strategy + lens + grade + …" layering; the fixed all-packs concatenation in `compose-brunch-prompt.ts`; "role preset / runtime bundle" as the composition unit; direct Layer-2 eager prompt-pack injection as the default mechanism; treating top-level `src/agents/` as Pi-only rather than Brunch LLM-context ingress; and `capability` as a parallel name for `method`. 
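The append relationship described in D58-L can be illustrated with a minimal sketch. Only the `${basePrompt}\n\n${composed}` join and the ordering (foreground prompt body, then runtime header, then manifests) come from the text above; `ComposedBlock`, `renderComposed`, `appendToBasePrompt`, and the blank-line separator inside the composed block are hypothetical names and assumptions for illustration, not the actual `compose.ts` API.

```typescript
// Hypothetical shape of the composed block; field names are assumptions.
interface ComposedBlock {
  foregroundBody: string; // foreground prompt body, e.g. the elicitor body
  runtimeHeader: string; // thin runtime header
  manifests: string[]; // rendered resource/reference manifests
}

// Render the composed block in the documented order:
// foreground body first, then runtime header, then manifests.
// Assumption: parts are separated by a blank line and empty parts are dropped.
function renderComposed(block: ComposedBlock): string {
  return [block.foregroundBody, block.runtimeHeader, ...block.manifests]
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join("\n\n");
}

// Augment, never replace: the base prompt always leads the result.
function appendToBasePrompt(basePrompt: string, composed: string): string {
  return `${basePrompt}\n\n${composed}`;
}

const systemPrompt = appendToBasePrompt(
  "You are a coding agent.",
  renderComposed({
    foregroundBody: "# Elicitor",
    runtimeHeader: "mode: spec",
    manifests: ["(manifest placeholder)"],
  }),
);
console.log(systemPrompt);
```

Under this sketch a replace-instead-of-augment policy would be a change to `appendToBasePrompt` only, which is one way to keep the open question isolated from composition itself.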
#### Continuity & origination (turn-boundary choreography) @@ -313,12 +313,12 @@ The POC's purpose is to prove three things: (a) that pi's coding-agent harness c - **Rollout** — incremental: ``, ``, graph, session runtime-frame, and structured-exchange result renders now live under `src/agents/contexts/`; transcript debug/report rendering lives in `src/session/transcript-markdown.ts` as a human/product debug artifact. - **Closed audit** — per-session `turnCount` is derived once while inspecting canonical session files and counts only current Pi v3 JSONL message entries (`type: "message"` with `message.role: "user" | "assistant"`); tool/custom entries are excluded, and downstream workspace/specification overview renders reuse that inspected count rather than reparsing the file. - **D85-L — Suspended prompt-resource axis model: strategy/lens/method are no longer runtime state.** A 2026-06-18 grill consolidation of the `agents/skills/` topology and the D58-L manifest axes, implemented across FE-893, FE-861, and FE-898, produced useful prompt-resource content and path topology. D98-L suspends the runtime-axis claim: strategy/lens/method may remain as prompt-resource organization or internal agent reasoning vocabulary, but Brunch should not expose or persist them as changeable runtime state unless later evidence earns that surface. Historical moves from that pass, retained only where D98-L does not supersede them: - 1. **Two AUTO objective axes, not three.** The runtime manifest advertises only `strategy` and `lens`; **`goal` is dropped as a manifest/runtime axis**. The four goal postures (`grounding-advance`, `elicit-expand`, `commit-converge`, always-on `capture-posture`) **inline into the `elicitor` agent role prompt** (`src/agents/prompts/elicitor/SYSTEM.md`), selected inline by the agent from the pushed readiness-band/posture context. 
Rationale: `goal` was already internal/readiness-derived and not user-mutable (D59-L), so advertising it as an AUTO-selectable axis was indirection over what is agent-directed-by-band anyway. Consequences for the build: `compose.ts` drops the `` family, `manifestsForState` drops `goals`, `runtime-state.ts` / `agents/runtime/policy.ts` drop the `goal` axis slot, and the runtime header drops the goal line. Capability-readiness (D74-L) is unaffected — it keys on gaps, not goal. + 1. **Two AUTO objective axes, not three.** The runtime manifest advertises only `strategy` and `lens`; **`goal` is dropped as a manifest/runtime axis**. The four goal postures (`grounding-advance`, `elicit-expand`, `commit-converge`, always-on `capture-posture`) **inline into the `elicitor` foreground prompt** (`src/agents/prompts/elicitor.md`), selected inline by the agent from the pushed readiness-band/posture context. Rationale: `goal` was already internal/readiness-derived and not user-mutable (D59-L), so advertising it as an AUTO-selectable axis was indirection over what is agent-directed-by-band anyway. Consequences for the build: `compose.ts` drops the `` family, `manifestsForState` drops `goals`, `runtime-state.ts` / `agents/runtime/policy.ts` drop the `goal` axis slot, and the runtime header drops the goal line. Capability-readiness (D74-L) is unaffected — it keys on gaps, not goal. 2. **Graph-write mechanism is method-routed, not a strategy-axis member.** `propose-graph` (direct-commit) and `project-graph` (review-set) describe the **graph-write capability ids** (the D26-L commitment mechanisms), not interaction shape; their strategy names are retired rather than rehomed. The existing methods absorb the mechanics: `commit-graph` carries direct-commit mechanics, and `generate-proposal` carries review-set mechanics. The offer→accept / derive→review choreography lives in the inlined `commit-converge` posture, not in method bodies. 
The graph-write readiness gate was originally placed on those method ids via capability-readiness (**removed by D86-L**: the graph-write methods are floor — readiness is advisory for them, never a tool gate), while the `strategy` axis keeps only genuine interaction shapes: `step-wise-decision-tree`, `step-wise-disambiguate`, and `freestyle` (AUTO-excluded, D66-L). 3. **Gap-reflection conduct belongs to the capture skill, not `review-for-gaps`.** D81-L spawn-on-noticing + close-on-answered is **always-on capture-sweep conduct** (every elicitor turn), so it lives with the D80-L capture skill, not an optionally-selected method. `review-for-gaps` is demoted to the **deliberate-audit sense only** (missing support, contradictions, verification debt). Read/interpret-gap semantics stay on the `read_elicitation_gaps` tool description (tool-local), not duplicated into a skill; the D81-L commitment gradient lives once, in the capture skill, with gap-spawn as its third outlet. - 4. **The prompt-content rewrite is design work entangled with live/stubbed seams — not a keyword fossil sweep.** The strategy/method bodies drift and overlap, but audit (2026-06-18) found their suspect tokens are mostly *not* dead history: `tool_meta` is live across every exchange projection; `capture_*` is a live `tool_meta.next` sequencing marker (`present_* → request_* → capture_*`), distinct from the D80-L-retired labeled-prefix capture core; and `present_candidates` + `user_rubric` / `meta_rubric` / `graph_refs` are the **anticipated payload of the live candidate topology stub** (`projections/exchanges/present-candidates.ts`, PLAN-confirmed stubbed), not removable fossils. Only `renderCall` is genuinely unreferenced (confirm against the Pi display API before removal). Rewriting prompt content must reconcile against the candidate stub, the exchange `tool_meta` model, and the D80-L sweep model rather than strip by keyword. 
Lexicon sweep in the same pass: `elicitation backlog` → `elicitation gaps` / `coverage obligation` (the D65-L rename; the inlined `elicit-expand` posture in `elicitor/SYSTEM.md` still carries the old term after the goal-axis drop relocated it). + 4. **The prompt-content rewrite is design work entangled with live/stubbed seams — not a keyword fossil sweep.** The strategy/method bodies drift and overlap, but audit (2026-06-18) found their suspect tokens are mostly *not* dead history: `tool_meta` is live across every exchange projection; `capture_*` is a live `tool_meta.next` sequencing marker (`present_* → request_* → capture_*`), distinct from the D80-L-retired labeled-prefix capture core; and `present_candidates` + `user_rubric` / `meta_rubric` / `graph_refs` are the **anticipated payload of the live candidate topology stub** (`projections/exchanges/present-candidates.ts`, PLAN-confirmed stubbed), not removable fossils. Only `renderCall` is genuinely unreferenced (confirm against the Pi display API before removal). Rewriting prompt content must reconcile against the candidate stub, the exchange `tool_meta` model, and the D80-L sweep model rather than strip by keyword. Lexicon sweep in the same pass: `elicitation backlog` → `elicitation gaps` / `coverage obligation` (the D65-L rename; the inlined `elicit-expand` posture in `prompts/elicitor.md` still carries the old term after the goal-axis drop relocated it). - Prompt-shape closure (revised 2026-06-22): (a) **`SKILL.md` directory topology is adopted for every strategy/lens/method** because the Agent Skills standard is now the target prompt-resource format; `references/` remains deferred until a concrete skill needs progressive disclosure, and D39-L's code-owned path list remains the availability surface. (b) **`src/agents/prompts//SYSTEM.md`** is adopted for live and named future bodies; no flat agent-body shape remains open. 
(c) **`[sub]` sub-agent notation** is deferred until the first real delegated sub-agent lands; no empty sub-agent stubs are introduced. (d) **generated typed-vocab context references** are **materialized** (first instance: the kind→band table at `src/agents/contexts/references/graph-ontology.md`, generated by `npm run generate:ontology` from the typed `graph/schema` sources and drift-checked by `npm run check:data-model`, wired into `npm run check`); they are read-only projections locked separately from authored prompt-resource bodies, and prompt resources cite them rather than restating vocabulary (the data-model-legibility frontier owns the expansion to further tables + the authored judgment layer). Current state: [`src/agents/contexts/README.md`](src/agents/contexts/README.md) and [`src/agents/skills/README.md`](src/agents/skills/README.md). Resolved 2026-06-18: the capture home is `methods/capture`, absorbing the former `infer-and-capture` method name; the full D80/D81/D82 conduct body remains FE-861. Depends on: D23-L, D25-L, D26-L, D39-L, D40-L, D58-L, D59-L, D65-L, D73-L, D80-L, D81-L. Refines: D25-L, D26-L, D40-L, D58-L, D59-L. Supersedes: `goal` as an AUTO-able manifest/runtime axis (the "objective axes `strategy`, `lens`, and `goal`" triple in D40-L/D58-L/D59-L → two axes, goal inlined); `propose-graph` / `project-graph` as `strategy`-axis members (D25-L/D26-L list them as strategies); treating gap spawn/close as a `review-for-gaps` method responsibility; `infer-and-capture` as a separate method; treating the `capture_*` / candidate / `tool_meta` prompt-resource references as removable fossils; and the 2026-06-19 deferral of Agent Skills `SKILL.md` topology. 
+ Prompt-shape closure (revised 2026-06-26): (a) **`SKILL.md` directory topology is adopted for every strategy/lens/method** because the Agent Skills standard is now the target prompt-resource format; `references/` remains deferred until a concrete skill needs progressive disclosure, and D39-L's code-owned path list remains the availability surface. (b) Foreground bodies flatten to **`src/agents/prompts/{elicitor,executor}.md`**; nested `SYSTEM.md` directories are retired. (c) Background subagent bodies flatten to **`src/agents/subagents/{explorer,researcher,projector,reviewer}.md`** with frontmatter; they are subagent resources, not foreground prompts. (d) **generated typed-vocab context references** are **materialized** (first instance: the kind→band table at `src/agents/contexts/references/graph-ontology.md`, generated by `npm run generate:ontology` from the typed `graph/schema` sources and drift-checked by `npm run check:data-model`, wired into `npm run check`); they are read-only projections locked separately from authored prompt-resource bodies, and prompt resources cite them rather than restating vocabulary (the data-model-legibility frontier owns the expansion to further tables + the authored judgment layer). Current state: [`src/agents/contexts/README.md`](src/agents/contexts/README.md) and [`src/agents/skills/README.md`](src/agents/skills/README.md). Resolved 2026-06-18: the capture home is `methods/capture`, absorbing the former `infer-and-capture` method name; the full D80/D81/D82 conduct body remains FE-861. Depends on: D23-L, D25-L, D26-L, D39-L, D40-L, D58-L, D59-L, D65-L, D73-L, D80-L, D81-L. Refines: D25-L, D26-L, D40-L, D58-L, D59-L. 
Supersedes: `goal` as an AUTO-able manifest/runtime axis (the "objective axes `strategy`, `lens`, and `goal`" triple in D40-L/D58-L/D59-L → two axes, goal inlined); `propose-graph` / `project-graph` as `strategy`-axis members (D25-L/D26-L list them as strategies); treating gap spawn/close as a `review-for-gaps` method responsibility; `infer-and-capture` as a separate method; treating the `capture_*` / candidate / `tool_meta` prompt-resource references as removable fossils; and the 2026-06-19 deferral of Agent Skills `SKILL.md` topology. Depends on: D19-L, D52-L, D60-L, D62-L, D65-L, D75-L. Refines: D60-L (RENDER stage). Supersedes: the ad-hoc `[bracket]`-header + bullet-list render style as the house convention; hand-rolled markdown and tree string generation in the old renderer layer; carrying sessions in the `` cwd render. - **D95-L — Elicitor capability spine: `capture` / `generate` / `project` are the three SPEC-mode capabilities.** The elicitor's work decomposes into three capabilities by what each does to the graph: **capture** commits ground material already present in the transcript tail into graph truth (the D80-L banded sweep + D81-L commitment gradient + D82-L acquisition layer, already specced); **generate** proposes new typed graph expressions on a requested plane from grounding plus a conceptual frame, fanning candidates out and committing the chosen one through review (D96-L); **project** derives nodes on one plane from a subset/plane of the existing graph with connecting cross-plane edges (e.g. requirements→design, design→oracles, A33-L). D98-L makes this a SPEC-mode capability vocabulary, not a runtime-axis topology: capture/generate/project are the elicitor's jobs, while strategy/lens/method files are optional prompt-resource organization if they improve behavior. 
Capture remains always-on conduct of every elicitor turn; generate and project are requested just-in-time and readiness remains advisory rather than a graph-write tool gate (D74-L/D86-L). Background acquisition subagents (D82-L near-future, A34-L) are the `acquire` arm feeding capture, not a fourth capability. Depends on: D74-L, D80-L, D81-L, D82-L, D85-L, D86-L, D98-L. Supersedes: the proposed `grounding` / `elicitation` / `projection` lifecycle directories as a replacement skill topology, and treating strategy/lens/method as the load-bearing runtime capability model. @@ -358,7 +358,7 @@ The POC's purpose is to prove three things: (a) that pi's coding-agent harness c | I27-L | Session display names are presentation metadata only: every Brunch-created session gets a neutral workspace-global default `session_info` label (`Untitled Session N`) at creation, unchanged defaults do not collide across specs in one cwd, later user/generated names may replace the default, and no naming path mutates spec identity, session binding, or graph truth. 
| planned (creation/boundary tests for workspace-global default allocation across specs and replacement sessions; session-lifecycle naming tests with empty transcript/auth failure/success paths; picker/chrome projection tests read session names when present) | D6-L, D21-L, D35-L, D42-L | | I26-L | Runtime schema-library imports stay deliberately scoped: Zod may appear in D41-L-acknowledged product/protocol schema seams — the structured-exchange schemas (`src/.pi/extensions/exchanges/schemas/`), the graph-owned `present_review_set` payload teaching schema co-located with its deep validator (`src/graph/review-set.ts`), and the dev-gated query-tool params (`src/.pi/extensions/{session-query,introspect-query}/`), each converting to Pi `TSchema` only through a single per-plane `z.toJSONSchema(..., { unrepresentable: 'throw' })` cast adapter (`exchanges/pi-schema.ts`, `shared/pi-tool-schema.ts`); TypeBox remains valid for unrelated Pi tool parameters (e.g. graph tools), small config/frontmatter contracts, and Drizzle-derived row schemas; no boundary may hand-author parallel Zod and TypeBox sources for the same shape. Pi tool parameter schemas authored in Zod must export JSON Schema draft 2020-12 (Zod v4 default), so tuples emit `prefixItems` rather than the draft-07 array-`items`/`additionalItems` form that strict provider validators (Anthropic) reject. Drizzle row/insert/update schemas are not hand-authored alongside their target tables. 
| covered (structured-exchange schema tests prove Zod parse/export and assert semantic details contracts stay in `src/.pi/extensions/exchanges/schemas/` except for the graph-owned review-set payload teaching schema imported from `src/graph/review-set.ts`; the legacy `shared/model.ts` details interface is retired; structured-exchange TypeBox usage is quarantined to the single Pi `TSchema` cast adapter in `src/.pi/extensions/exchanges/pi-schema.ts`, and the dev query tools to `src/.pi/extensions/shared/pi-tool-schema.ts`; `session-query`/`introspect-query` tests assert the advertised parameter schema is draft 2020-12 with no draft-07 tuple form; the no-direct-`db/`-imports-outside-`graph/` boundary is enforced statically by oxlint `no-restricted-imports` (`.oxlintrc.json`), with the residual `architecture.test.ts` greps covering only the db→graph kinds-only edge and `db/schema.ts` enum-array ownership that lint cannot express; Drizzle derivation via `drizzle-typebox` in `row-schemas.ts`) | D41-L | | I28-L | Auto-compaction output preserves the configured anchor set byte-stable: every entry kind listed in [src/.pi/extensions/compaction/index.ts](src/.pi/extensions/compaction/index.ts) is reconstructable post-compaction according to its `select` rule (`first | latest | active-leaves | all-unresolved`); LLM-generated narrative summary never replaces or rephrases preserved-anchor content; extension failure falls through to Pi default compaction rather than dropping anchors silently. 
| planned (compaction round-trip property tests at M9 plus inner-loop anchor-rendering unit tests and TypeBox schema validation of the anchor contract) | D43-L; R15, R13; I3-L, I4-L, I8-L, I12-L | -| I29-L | Subagent SDK child sessions inherit Brunch Pi Profile sealing while allowing explicitly injected parent-world reads: every `subagent` tool invocation builds an in-process `AgentSession` from explicit sealed services (in-memory auth/settings/session managers, no ambient resources, assembled background system prompt, parent model registry, explicit tool allowlist); subagents never load ambient user/project `.pi/` skills, prompts, themes, extensions, context files, or behavior-shaping settings; subagents never gain direct access to the parent's `CommandExecutor`, Brunch RPC handlers, or graph persistence; parent world access is injected by the app root as a snapshot prompt block plus selected-spec read tools such as `read_graph`; parent aborts prevent prompt execution before/during setup and abort live child sessions; subagent results return to the main agent only as tool result content (no side-effect transcript writes). | covered for the implemented SDK seam by `src/.pi/extensions/subagents/subagents.test.ts`: frontmatter/config validation (including duplicate keys), explicit registry loading from `src/agents/prompts//SYSTEM.md` while ignoring unlisted planted bodies, tool allowlist conformance for `explorer`/`projector`/`researcher`, sealed faux-provider child sessions with no inherited base prompt or conversation, assembled prompt snapshot coverage (selected spec/workspace/session digest, no foreground elicitation recommendation), unknown-tool failure, `read_graph` availability only with injected parent graph readers, parent-spec-only graph read content with sibling-spec negative assertion, bounded concurrency including waiter/new-arrival race, invalid invocation shape rejection before runner call, and parent-abort setup/live-session behavior. 
Startup advertisement remains dev-gated by whether a launch path supplies subagent deps to `createBrunchPiExtensions(...)`. | D2-L, D39-L, D40-L, D44-L, D91-L; I1-L, I2-L, I11-L, I24-L | +| I29-L | Subagent SDK child sessions inherit Brunch Pi Profile sealing while allowing explicitly injected parent-world reads: every `subagent` tool invocation builds an in-process `AgentSession` from explicit sealed services (in-memory auth/settings/session managers, no ambient resources, assembled background system prompt, parent model registry, explicit tool allowlist); subagents never load ambient user/project `.pi/` skills, prompts, themes, extensions, context files, or behavior-shaping settings; subagents never gain direct access to the parent's `CommandExecutor`, Brunch RPC handlers, or graph persistence; parent world access is injected by the app root as a snapshot prompt block plus selected-spec read tools such as `read_graph`; parent aborts prevent prompt execution before/during setup and abort live child sessions; subagent results return to the main agent only as tool result content (no side-effect transcript writes). | covered for the implemented SDK seam by `src/.pi/extensions/subagents/subagents.test.ts`: frontmatter/config validation (including duplicate keys), explicit registry loading from `src/agents/subagents/.md` while ignoring unlisted planted bodies, tool allowlist conformance for `explorer`/`projector`/`researcher`, sealed faux-provider child sessions with no inherited base prompt or conversation, assembled prompt snapshot coverage (selected spec/workspace/session digest, no foreground elicitation recommendation), unknown-tool failure, `read_graph` availability only with injected parent graph readers, parent-spec-only graph read content with sibling-spec negative assertion, bounded concurrency including waiter/new-arrival race, invalid invocation shape rejection before runner call, and parent-abort setup/live-session behavior. 
Startup advertisement remains dev-gated by whether a launch path supplies subagent deps to `createBrunchPiExtensions(...)`. | D2-L, D39-L, D40-L, D44-L, D91-L; I1-L, I2-L, I11-L, I24-L | | I30-L | Elicitor capture commits only high-confidence graph truth; under the D81-L gradient, directly-stated facts commit `explicit`, confidently-materialized facts/edges commit `implicit`, low-confidence noticings never become graph truth — they map to existing-or-new `elicitation_gaps` as agenda — and contradictions with existing graph truth route to `reconciliation_need` rather than gap or overwrite. | covered for deterministic routing (`src/graph/__tests__/capture-commitment-gradient-gate.test.ts` proves the FE-861 routing gate through the real `mutate_graph`, `update_elicitation_gaps`, and `update_reconciliation_needs` adapters: explicit→commit, implicit→commit, low→one gap, contradiction→one semantic-conflict recon need, structural answered derivation, manual gap close on the graph clock, illegal capture batches failing loud, and the closed capture-quality-spike scenario family re-aimed from binary `shouldCommit` to gradient `expectedOutcome` rows across free prose, file refs, implication-heavy, and contradiction classes. `src/probes/capture-quality-loop.ts` keeps the LLM-in-loop probe as fitness by scoring gradient-routing accuracy, not gating classification quality. `src/.pi/extensions/brunch-data/reconciliation/index.test.ts` proves the recon-need tool pair over `CommandExecutor`/`getOpenReconciliationNeeds` plus elicit-posture legality. `src/projections/session/sweep-watermark.test.ts` plus `src/.pi/__tests__/extension-registry.test.ts` prove the D80-L transcript-position sweep watermark: conversational/digest tail classification, raw background exclusion, idempotent marker advance, graph-LSN watermark separation, and live `before_agent_start` wiring. 
The submit-time labeled-prefix capture module, its `session.*` wiring, and the `capture-response-to-graph` / `submit-message-capture` proofs were deleted 2026-06-19 (D80-L fossil retirement); `session.submitMessage` / `session.submitExchangeResponse` results no longer carry a `capture` field. Confidence/dedup quality remains fitness.) | D8-L, D18-L, D47-L, D65-L, D80-L, D81-L; A22-L | | I31-L | Readiness never bars graph truth or work; it is just-in-time capability-readiness over relevant gaps, not a stored grade or kind whitelist. There is no `readiness_grade` scalar; capability availability is judged on request against the relevant `elicitation_gaps` (D74-L) and may proceed, proceed at low epistemic status, or negotiate — it never refuses outright. The `CommandExecutor` must not reject a graph node solely because its kind belongs to a later readiness band (D64-L). The soft `readiness estimate` (D45-L) is UI-only and gates nothing. Capability-readiness never *withholds a graph-write tool*: `mutate_graph` and the review-set tools stay in the active tool set regardless of readiness; `negotiate` is advisory (establishment offer + epistemic scaling), never a tool gate (D86-L). 
| partially covered (`src/agents/runtime/__tests__/capability-readiness.test.ts` covers the D74-L tracer gate, including proceed / proceed_low_epistemic / negotiate, no-refusal, no grade-symbol import, and a live `presence` coverage flip; `src/agents/runtime/__tests__/policy.test.ts` covers the first consumer rewire: menu legality omits gated options while relevant gaps negotiate and includes them when coverage rises, with no grade symbols in `agents/runtime/policy.ts`, and a required `NodeKind` absent from the gap register fails loud (config bug ≠ uncovered — readiness omission never masks a seeding error); `src/projections/session/readiness-estimate.test.ts` covers the soft D45-L estimate shape, empty-band zero, importance-weighted per-band coverage, honest regression, and no legality-path imports; `src/.pi/extensions/agent-runtime/runtime/state.test.ts`, `src/agents/runtime/__tests__/compose.test.ts`, `src/agents/contexts/seeds/__tests__/turn-context.test.ts`, and `src/.pi/__tests__/prompting.test.ts` cover the prompt consumer path: selected-spec gaps render as the soft per-band estimate, readiness-thin pinned axes remain visible, gated methods stay withheld, `readiness_grade=` is absent from prompt display, and the turn boundary threads the same gaps into cwd context without prompt-assembly failure; `src/session/workspace-session-coordinator.test.ts`, `src/app/__tests__/print-workspace-state.test.ts`, `src/session/workspace-overview-context.test.ts`, `src/.pi/__tests__/context-tools.test.ts`, `src/rpc/handlers.test.ts`, and `src/web/app.test.tsx` cover the workspace/chrome display retirement: `chrome.phase` / `chrome.chatMode` no longer project through coordinator/RPC/web/chrome fixtures, and workspace overview session inventory no longer carries or renders `readinessGrade`; `createSpec` / `getSpec` persistence, seed/export fixture contracts, probes, and selected-spec prompt carriers no longer persist or transport a readiness grade; the D86-L 
graph-write-tool-floor sub-claim is covered — `state.test.ts` proves `mutate_graph` + review-set tools stay floor while `propose-graph`/`project-graph` readiness `negotiate`s and only the non-graph-write `review-for-gaps` is withheld, and `dev/__tests__/tier-2-harness.test.ts` proves the same through a real `runBrunchTui` boot at thin vs covered grounding) | D20-L, D45-L, D64-L, D74-L, D86-L | | I32-L | Public RPC structured-exchange driving never requires a client to speak raw Pi RPC: after Brunch method discovery and workspace/spec/session activation, each pending assistant-originated exchange is answered exactly once through `session.submitExchangeResponse`, and the deterministic permutation run produces linear Pi JSONL whose structured exchange projection preserves the same prompt/answer/status/comment artifacts as the equivalent TUI structured-exchange path. | covered for deterministic FE-744 parity under canonical session method names (`session.triggerExchange`, `session.pendingExchange`, `session.submitExchangeResponse`, `session.exchanges`): `rpc.discover` contract tests, pending/respond lifecycle tests, current public-RPC structured-exchange permutations, terminal non-answered status handling, option content/rationale parity, no repeated deterministic prompts, and transcript/exchange parity assertions. | D5-L, D48-L, D49-L; I10-L, I13-L, I21-L, I23-L | @@ -406,15 +406,20 @@ The POC's purpose is to prove three things: (a) that pi's coding-agent harness c ### Prompt/runtime profile architecture - Brunch prompt composition is a **runtime-header + sealed resource/reference manifest** composed per agent by `composeAgentPrompt(...)` in `src/agents/runtime/compose.ts` (D58-L, D98-L). The direct injection is intentionally small: agent control summary, selected operational mode, a legal `` / reference manifest with per-resource `kind`, `name`, `description`, and `location`, and compact context handles/rendered context blocks. 
Detailed guidance bodies and canonical references are Brunch-owned markdown resources the agent loads with `read` when needed; they are not selected runtime axes. The old `src/.pi/context/` prompt-pack layout is retired; top-level `src/agents/` is now the Brunch-owned LLM-context ingress home, not a Pi-only agent tree. -- Concrete `agents/prompts` + `agents/skills` + `agents/runtime` topology (D52-L). The markdown/code boundary falls on the control-plane/behavior split: enforcement and projection are TypeScript under `agents/runtime/`; `.pi/extensions/agent-runtime/` is the hook/tool adapter. Semantic prompting material is markdown under `agents/prompts/{agent-name}/SYSTEM.md` for live agent bodies and `agents/skills/`. +- Concrete `agents/prompts` + `agents/subagents` + `agents/skills` + `agents/runtime` topology (D52-L). The markdown/code boundary falls on the control-plane/behavior split: enforcement and projection are TypeScript under `agents/runtime/`; `.pi/extensions/agent-runtime/` is the hook/tool adapter. Foreground agent bodies are flat markdown files under `agents/prompts/{elicitor,executor}.md`; background subagent bodies are flat markdown files under `agents/subagents/{explorer,researcher,projector,reviewer}.md`; prompt-resource skills stay under `agents/skills/`. 
```text src/agents/ prompts/ - README.md [md] ownership + migration note - elicitor/SYSTEM.md [md+] live foreground SPEC-mode body - reviewer/SYSTEM.md [md] background proposal/commitment review body - executor/SYSTEM.md [md] future CODE-mode Brunch-aware coding/execution body + README.md [md] foreground prompt ownership + migration note + elicitor.md [md+] live foreground SPEC-mode body + executor.md [md] foreground CODE-mode Brunch-aware coding/execution body + subagents/ + README.md [md] background subagent ownership + frontmatter contract + explorer.md [md] codebase/graph reconnaissance body + frontmatter + researcher.md [md] web-research body + frontmatter + projector.md [md] candidate-proposal body + frontmatter + reviewer.md [md] proposal/commitment review body + frontmatter skills/ README.md [md] ownership + body-lock ledger strategies/*/SKILL.md [md] legacy/suspended interaction-shape resources; prune or fold as evidence dictates @@ -433,8 +438,8 @@ src/.pi/ brunch-data/context/*.ts [ts] D60-L pull-tool context surface (read_workspace_context, read_session_context) ``` -- Manifest availability is code-owned, not filesystem-discovered: `agents/runtime/state.ts` binds each legal operational mode/agent policy to explicit resource paths and each live agent role to its `src/agents/prompts//SYSTEM.md` location. It loads prompt-resource `name` and `description` from `SKILL.md` frontmatter through pi's loader with `includeDefaults: false` and an explicit `skillPaths` list where skills remain useful; generated/authored context references are likewise explicit Brunch resources, not ambient files. `composeAgentPrompt()` emits legal resource bindings; the prompt extension reads the selected agent body explicitly and passes it into the pure composer. This keeps the legal set sealed while making the file body/frontmatter/reference file the description source of truth. 
-- The D60-L agent-context orchestration layer (TypeScript) lives in `src/agents/contexts/`: `seeds/` owns compact pushed/origination context, while `workspace/`, `specification/`, `session/`, `graph/`, `elicitation.ts`, and `exchanges/` own provider-visible context-tool and tool-result text. `.pi/extensions/agent-runtime/system-prompts/` and `.pi/extensions/brunch-data/context/` are adapters that gather data and call those renderers. Contexts are not part of the `read`-on-demand resource manifest and carry no `` family. +- Manifest availability is code-owned, not filesystem-discovered: `agents/runtime/state.ts` binds each legal operational mode/agent policy to explicit resource paths and each live foreground role to its `src/agents/prompts/.md` location; the subagent extension binds its explicit registry ids to `src/agents/subagents/.md`. It loads prompt-resource `name` and `description` from `SKILL.md` frontmatter through pi's loader with `includeDefaults: false` and an explicit `skillPaths` list where skills remain useful; generated/authored context references are likewise explicit Brunch resources, not ambient files. `composeAgentPrompt()` emits legal resource bindings; the prompt extension reads the selected agent body explicitly and passes it into the pure composer. This keeps the legal set sealed while making the file body/frontmatter/reference file the description source of truth. +- The D60-L agent-context orchestration layer (TypeScript) lives in `src/agents/contexts/`: `seeds/` owns compact pushed/origination context, while `workspace/`, `spec/`, `session/`, `graph/`, `elicitation.ts`, and `exchanges/` own provider-visible context-tool and tool-result text. `.pi/extensions/agent-runtime/system-prompts/` and `.pi/extensions/brunch-data/context/` are adapters that gather data and call those renderers. Contexts are not part of the `read`-on-demand resource manifest and carry no `` family. 
- Workspace **posture** is workspace-scoped product state persisted in `.brunch/workspace.json`, not spec state, session state, or graph truth. D57-L keeps it off the spec row and graph; D58-L composition injects known posture values into the runtime header as an axis of agent influence, and the `capture-posture` goal (D59-L) can confirm or refine those values conversationally. - Readiness is judged just-in-time per requested capability, not as a user-facing workflow stepper, a stored grade, a session-local phase, or a graph-node-kind whitelist. There is no `readiness_grade` on the spec row (D45-L); capability-readiness (D74-L) is evaluated over the relevant `elicitation_gaps`, and D64-L readiness bands describe non-exclusive evidence groupings feeding the readiness-estimate rollup, goal selection, and context filtering. The soft readiness estimate may surface in UI but gates nothing. A future structural milestone gate for export/plan/execute op-modes is deferred until such an op-mode exists; before readiness grows beyond the current tracer, Brunch still needs a real evaluator path for `manual` gaps and a more differentiated per-capability map than the shared grounding floor (A27-L). - Prompt resources, context references, and Pi skills are progressive-disclosure mechanisms, but they are not authority. Brunch code owns runtime-state projection, mode filtering, capability-readiness/allow-list gating, tool activation, and tool-call blocking. D98-L removes strategy/lens/method pins and AUTO choices from product runtime state; readiness negotiation changes response posture and advisory context, not authority. Pi-native skills may be used for startup-scoped capabilities; Brunch-owned resource availability is advertised through the sealed per-turn manifest so ambient user/project resources cannot leak into product behavior. 
@@ -503,7 +508,7 @@ src/.pi/ | **Subagent** | A main-agent-invoked, blocking background child session (D44-L/D91-L): caller chooses a background `AgentManifest`, Brunch starts a sealed in-process SDK `AgentSession`, injects only explicit parent-world snapshot/read handles, and returns the child's assistant text as ordinary tool-result content. Ambient `.pi` discovery, parent `CommandExecutor` access, and inherited conversation context remain sealed out. | | **Strategy** | Suspended as runtime state by D98-L. The term may survive only as prompt-resource or reference vocabulary for interaction shapes if a concrete agent behavior proves it useful; it is not a user-changeable axis, AUTO selection, or transcript-backed posture. | | **Lens** | Suspended as runtime state by D98-L. The term may survive only as prompt-resource or reference vocabulary for topical/plane framing (`intent`, `design`, `oracle`) if a concrete agent behavior proves it useful; payloads should carry explicit plane/provenance fields only when a downstream reader needs them. | -| **Goal posture** | Retired as a runtime/manifest axis by D85-L/D98-L. The former postures — `grounding-advance`, `elicit-expand`, `commit-converge`, plus always-on `capture-posture` — are inline objective guidance in `elicitor/SYSTEM.md`, selected by the agent from readiness bands, open gaps, and workspace posture. Distinct from graph `goal` node kind. | +| **Goal posture** | Retired as a runtime/manifest axis by D85-L/D98-L. The former postures — `grounding-advance`, `elicit-expand`, `commit-converge`, plus always-on `capture-posture` — are inline objective guidance in `src/agents/prompts/elicitor.md`, selected by the agent from readiness bands, open gaps, and workspace posture. Distinct from graph `goal` node kind. | | **AUTO** | Retired for prompt-resource axes by D98-L. 
Operational mode has explicit product choices (`SPEC` / `CODE`); prompt resources and context references are available for load-on-demand reading, not selected through persisted AUTO strategy/lens state. | | **Brunch Pi Profile** | The sealed programmatic wrapper around embedded Pi: settings policy, resource-loader policy, extension factories, keybinding/command policy, tool policy, and prompt policy. It allows Brunch-owned resources while suppressing ambient `.pi/` behavior. | | **Prompt resource** | A Brunch-owned markdown file under `src/agents/` containing detailed agent guidance. Prompt resources are loaded by the agent with `read` when needed; they are product control-plane assets, not ambient Pi prompt templates and not runtime state. | @@ -583,8 +588,8 @@ src/.pi/ | **Side task** | Main-agent-invoked, non-blocking work item tracked by the Brunch `SideTaskRegistry`. The main agent fires it and does not await a return value; the only path it influences the main agent is by appending a custom-message status update to the session log that arrives at the next-turn boundary via `prepareNextTurn`. Side-task writes route through the `CommandExecutor`. Distinct from Subagent (blocking) and Side chat (user-invoked). | | **Subagent** | Main-agent-invoked, **blocking** Pi tool call (`subagent`) that runs an isolated in-process SDK child `AgentSession` with sealed services, a per-agent tool allowlist, per-agent model resolution, and no ambient resource discovery. Has no inherited conversation context, no `CommandExecutor` access, and no Brunch RPC access. Result text returns directly as tool result content. POC starter agents split into **data gatherers** (`explorer` / `researcher` — read-only context fetchers that ground proposals), a **projector** (`projector` — system-prompt-only; one variant per invocation, fan-out via parallel mode realizes the "design it twice" pattern), and a no-tool `reviewer`. 
| | **Projector subagent** | The system-prompt-only starter subagent that emits exactly one well-formed candidate-proposal variant per invocation given a grounding bundle plus a batch-proposal lens frame. Diversity arises from parallel `tasks: []` invocations with intentionally distinct framings; the main agent assembles outputs into review-set structured-exchange proposal details via the D31-L meta-rubric. Realizes the "design it twice" / parallel-fan-out pattern from `ln-design` and `ln-oracles` skills in subagent form. | -| **Subagent registry** | The set of registered subagent definitions loaded from the unified `src/agents/prompts//SYSTEM.md` body home through the explicit `BACKGROUND_SUBAGENT_IDS` list at extension activation. Brunch-owned only for the POC; cross-extension agent registration is deferred. | -| **Subagent agent definition** | A `SYSTEM.md` body with TypeBox-validated frontmatter (`name`, `description`, `tools`, `model`, `thinking`) plus a system-prompt body. The frontmatter is the authoring contract; the code-owned registry is the discovery contract; the body is the subagent's standing instructions. | +| **Subagent registry** | The set of registered subagent definitions loaded from the `src/agents/subagents/.md` body home through the explicit `BACKGROUND_SUBAGENT_IDS` list at extension activation. Brunch-owned only for the POC; cross-extension agent registration is deferred. | +| **Subagent agent definition** | A flat markdown body under `src/agents/subagents/` with TypeBox-validated frontmatter (`name`, `description`, `tools`, `model`, `thinking`) plus a system-prompt body. The frontmatter is the authoring contract; the code-owned registry is the discovery contract; the body is the subagent's standing instructions. 
| | **Auto-compaction extension** | The Brunch-owned `session_before_compact` extension (`src/.pi/extensions/auto-compaction.ts`) that renders the preserved anchor set as a deterministic markdown header and prepends it to an LLM-generated narrative summary. Resolves its summarization model through the active agent definition's model preference; falls through to Pi default compaction on auth/empty-output/unexpected errors. | | **Preserved anchor set** | The configured list of transcript entry kinds and selection rules that must survive compaction byte-stable. Canonical source is [src/.pi/extensions/compaction/index.ts](src/.pi/extensions/compaction/index.ts); each rule is `{ kind, select, rationale }` where `select ∈ first | latest | active-leaves | all-unresolved`. Externalized so it can be reviewed and updated for correctness without SPEC churn. | | **Anchor contract** | The data inside the preserved-anchor TypeScript contract — distinct from the rendering policy (which lives in code) and the LLM summarization (which is bundle-resolved). | @@ -730,7 +735,7 @@ Dev-loop artifacts route to gitignored `.fixtures/scratch///`, res | Middle | **Streaming chat transport battery (topology A — `web-driver-streaming`)** | Web-as-driver streaming relay correctness on the tier-2 faux substrate: stream↔transcript differential (message assembled from `message_update` deltas == JSONL projection), ordered incremental `AgentSessionEvent` delivery (no gaps/dupes), Pi-turn-events + Brunch-domain notifications multiplexed on one WS, live `request_answer` answer convergence through `session.answerExchange`, reconnect/resume idempotence over turn cut-points, and one-driver/many-observer fan-out (no concurrent-driver serialization — out of scope by the 2026-06-15 relaxation). 
Claims 1–4 are production-wired through `SessionEventRelay` and the real TUI sidecar `/rpc` transport; claim 6 is covered by a replay-less reconnect test that proves projection refetch plus live continuation without frame history; claim 7 is covered by a fan-out test that proves byte-identical concurrent observer streams plus read-only observer write rejection; command-intake slice 1 is covered by a sidecar `session.driveTurn` test that proves web-driven plain turns fan out and reduce to JSONL truth, plus contract tests that prove observer `/rpc` sockets omit live driver methods even when handles exist, driverless discovery omits `session.driveTurn`, and attached-but-not-live drivers map to `-32010`; claim 5's answered leg is covered by a sidecar `session.answerExchange` test that proves a blocked broker-backed `request_answer` promise resumes when no interactive editor is present, reduces to JSONL truth, and fans out byte-identically while observer `/rpc` sockets omit live answer methods, driverless discovery omits the method, and no-pending answers map to `-32008`; the TUI-editor precedence regression is covered by the structured-exchange request tests. Render feel stays outer-loop manual. | R12, R24; D5-L, D19-L, D37-L, D49-L, D72-L, D84-L; I22-L; A5-L; A29-L. | | Middle | Capture-analysis transcript oracle | Future `capture_*` probes persist ANALYSIS as normal Brunch toolResults, assert no graph writes occur, render full analysis in Markdown/ASCII transcripts, and assert the TUI path hides or collapses the same result without losing persisted content/details. | D17-L, D18-L, D37-L, D47-L, D50-L; I23-L, I30-L, I33-L. 
| | Middle | **Capture commitment-gradient routing gate + sweep-watermark property (FE-861)** | The false-commit guard is landed as a deterministic faux-substrate gate (LLM out of the loop via fixed gradient-tagged extraction) in `src/graph/__tests__/capture-commitment-gradient-gate.test.ts`: every low-confidence item is abstract-mapped to exactly one existing-or-new `elicitation_gap` and **zero** of them commit to graph truth; every explicit/implicit commit routes via the `mutateGraph` grammar with the expected basis; contradictions route to exactly one `semantic_conflict` reconciliation need via `update_reconciliation_needs`, not a gap or graph overwrite; a commit satisfying a structural (`presence`/`coverage`) gap derives `answered` (never hand-set, D65-L); `manual`-gap close routes `setElicitationGapDisposition` through the one `{specId, lsn}`/`change_log` clock (no second clock); and the closed capture-quality-spike family is re-aimed from binary `shouldCommit` to `expectedOutcome` (`commit_explicit` / `commit_implicit` / `spawn_gap` / `reconciliation_need`) with every scenario class guarded through the real adapters. `src/probes/capture-quality-loop.ts` remains the LLM-in-loop fitness probe, re-scored as gradient-routing accuracy rather than precision/recall over `shouldCommit`. The paired **sweep-watermark invariant** (prior art I45-L) is now landed in `src/projections/session/sweep-watermark.test.ts` and wired through the real `before_agent_start` product extension path in `src/.pi/__tests__/extension-registry.test.ts`: after the turn-boundary advance, no conversational/digest content remains in the un-swept tail, raw tool/background continuity may remain behind the transcript-backed marker, and the graph-LSN assistant-visible watermark is not read or moved. 
Per the lean steer: this gate plus the watermark property are the only deterministic capture oracles; banded-traversal quality, confidence-classification accuracy, gap/recon abstract-map/dedup quality, carry-forward/reweight feel, and digest quality stay outer-loop fitness (manual + `.brunch/debug/*`). | D8-L, D65-L, D80-L, D81-L, D85-L; A22-L; I30-L. | -| Middle | **Subagent-reconciliation oracle battery (`subagent-reconciliation`)** | Four deterministic faux-substrate oracles for the foreground/background agent reconciliation. **(1) Extraction purity** — a slice-2-entry *characterization snapshot* of the exact current foreground composed prompt must be byte-identical after the composer core is extracted (D90-L slice 2). The snapshot is trusted only as a **stability baseline** ("did the refactor change output"), not as a quality golden ("is the output good") — output quality is owned by `renderer-golden-coverage`/COMPOSE, not this guard; where an existing COMPOSE golden is *already trusted* it doubles as the tripwire, otherwise a fresh pre-extraction snapshot is captured regardless of whether the current wording is final. **(2) Code-owned discovery** — a planted unlisted `src/agents/prompts//SYSTEM.md` is not spawnable, extending the I24-L/I29-L ambient-seal tests (D90-L/D39-L). **(3) Semi-permeable seal** — a faux-provider background run asserts the assembled child prompt contains the injected world snapshot (session digest + spec/workspace), the child's Brunch graph read tool returns the parent `specId`'s graph **and never a sibling spec** (mirrors I1-L spec isolation), the ambient seal is preserved (in-memory auth/settings/session, no `~/.pi` discovery), and the result returns as tool-result `content` while `details` is render-only (D91-L). 
**(4) Delegatable-set write-safety boundary** — a negative-space invariant over the code-owned op_mode→delegatable-set allowlist: spawnable(op_mode) equals the allowlist, a frontmatter manifest cannot self-advertise into a read-only mode, and a **test-only write-capable background manifest** proves `elicit` refuses to spawn it — so I49-L is proven *before* the execute-mode write worker exists. Outer fitness (not gated): a real delegated-acquisition run where explorer/researcher read the live graph and return a digest, judged for usefulness. | D39-L, D40-L, D44-L, D58-L, D60-L, D82-L, D90-L, D91-L, D92-L; I29-L, I49-L. | +| Middle | **Subagent-reconciliation oracle battery (`subagent-reconciliation`)** | Four deterministic faux-substrate oracles for the foreground/background agent reconciliation. **(1) Extraction purity** — a slice-2-entry *characterization snapshot* of the exact current foreground composed prompt must be byte-identical after the composer core is extracted (D90-L slice 2). The snapshot is trusted only as a **stability baseline** ("did the refactor change output"), not as a quality golden ("is the output good") — output quality is owned by `renderer-golden-coverage`/COMPOSE, not this guard; where an existing COMPOSE golden is *already trusted* it doubles as the tripwire, otherwise a fresh pre-extraction snapshot is captured regardless of whether the current wording is final. **(2) Code-owned discovery** — a planted unlisted `src/agents/subagents/.md` is not spawnable, extending the I24-L/I29-L ambient-seal tests (D90-L/D39-L). 
**(3) Semi-permeable seal** — a faux-provider background run asserts the assembled child prompt contains the injected world snapshot (session digest + spec/workspace), the child's Brunch graph read tool returns the parent `specId`'s graph **and never a sibling spec** (mirrors I1-L spec isolation), the ambient seal is preserved (in-memory auth/settings/session, no `~/.pi` discovery), and the result returns as tool-result `content` while `details` is render-only (D91-L). **(4) Delegatable-set write-safety boundary** — a negative-space invariant over the code-owned op_mode→delegatable-set allowlist: spawnable(op_mode) equals the allowlist, a frontmatter manifest cannot self-advertise into a read-only mode, and a **test-only write-capable background manifest** proves `elicit` refuses to spawn it — so I49-L is proven *before* the execute-mode write worker exists. Outer fitness (not gated): a real delegated-acquisition run where explorer/researcher read the live graph and return a digest, judged for usefulness. | D39-L, D40-L, D44-L, D58-L, D60-L, D82-L, D90-L, D91-L, D92-L; I29-L, I49-L. | | Outer | Manual walkthrough with checklist | UX/presentation life: TUI chrome, spec/session picker, web shell feel, coherence visibility, elicitation usefulness. Adds: ambient-affordance rendering from establishment-offer structured-exchange facets; proposal/framing quality review; lens-recommendation appropriateness; review-cycle UX (approve / request-changes / reject); meta-rubric comparative-usefulness review (D31-L hypothesis test). | A17-L; R4, R14, R16, R20, R21. | | Outer | Adversarial / generative probe runs | Elicitation quality, human-gated `needs_human`, contradictory requirements, cross-session updates, long-horizon compaction, and reviewer-finding precision through small targeted probe scenarios (brief-shaped inputs are allowed, but the probe run and transcript artifacts are canonical). 
POC scope remains one or two known-bad scenarios per relevant invariant, not exhaustive coverage. | A5-L, A8-L, A11-L, A14-L (and validated A9-L); I4-L, I6-L, I12-L, I13-L, I16-L. | @@ -776,7 +781,7 @@ The first required probe is M0: after manual TUI interaction, a checker proves ` | I25-L | Runtime-state tests: append init/switch custom entries, reload the linear transcript, reconstruct the active operational mode only (foreground role derived from mode), tolerate stale legacy `agentGoal`/strategy/lens fields on old entries without re-emitting them, and verify before-agent-start/tool-call policy suppresses disallowed tools for SPEC while CODE receives executor authority. | | I26-L | Structured-exchange schema tests prove the acknowledged Zod seam parses and exports JSON Schema; future M4 architectural tests should grep/import-audit schema libraries and Drizzle row-schema derivation boundaries. | | I28-L | Inner — TypeBox schema validation of [src/.pi/extensions/compaction/index.ts](src/.pi/extensions/compaction/index.ts) shape; deterministic anchor-rendering unit tests (same branch + same config → same header bytes). Middle (M9) — compaction round-trip property tests across all configured anchors and selection rules; fallback-to-Pi-default behavior under simulated auth failure, empty LLM output, and thrown error. Outer (M9) — long-horizon adversarial fixture confirms session binding, latest runtime state, latest establishment offer, in-flight side-task results, and unresolved staleness hints remain agent-intelligible post-compaction. 
| -| I29-L | Inner — SDK child-session tests prove sealed service construction, agent-body system prompt ownership, no inherited parent conversation, explicit tool allowlists per starter agent, no-tools projector/reviewer behavior, duplicate/malformed frontmatter failure, explicit registry discovery from `src/agents/prompts//SYSTEM.md`, config validation, bounded concurrency, invalid caller-shape rejection before runner invocation, and parent-abort behavior before/during setup and after session creation. Middle — when startup wiring lands, a product-path smoke should prove the launch gate supplies deps intentionally and ordinary elicit sessions without deps do not register/advertise `subagent`. Outer — probe-driven proposal-generation or delegated-acquisition runs invoking explorer/researcher/projector/reviewer confirm subagent outputs ground proposals/digests without bypassing primary authority. | +| I29-L | Inner — SDK child-session tests prove sealed service construction, agent-body system prompt ownership, no inherited parent conversation, explicit tool allowlists per starter agent, no-tools projector/reviewer behavior, duplicate/malformed frontmatter failure, explicit registry discovery from `src/agents/subagents/.md`, config validation, bounded concurrency, invalid caller-shape rejection before runner invocation, and parent-abort behavior before/during setup and after session creation. Middle — when startup wiring lands, a product-path smoke should prove the launch gate supplies deps intentionally and ordinary elicit sessions without deps do not register/advertise `subagent`. Outer — probe-driven proposal-generation or delegated-acquisition runs invoking explorer/researcher/projector/reviewer confirm subagent outputs ground proposals/digests without bypassing primary authority. | | I30-L | FE-807 covered the now-superseded labeled-text response tracer (D80-L retires it). 
The FE-861 **capture commitment-gradient routing gate** is now landed for the full closed matrix (explicit/implicit commits via `mutateGraph`; low-confidence never commits and maps to one gap; contradictions route to `semantic_conflict` reconciliation needs; structural gaps derive `answered`; `manual`-gap close on the one `{specId, lsn}` clock; binary `shouldCommit` retired in favor of gradient `expectedOutcome`). The paired **sweep-watermark property** is landed (`sweep-watermark.test.ts` + live `before_agent_start` wiring), and the submit-time labeled-prefix fossil + its `capture-response-to-graph` / `submit-message-capture` proofs are now deleted (D80-L fossil retirement). Confidence-classification accuracy and gap/recon dedup quality stay fitness/blind-spot (see below). | | I31-L | Capability-readiness tests proving live gap coverage negotiates/unlocks later actions without disabling gathering/refinement; prompt/tool-policy tests proving readiness does not withhold graph-write tools or require pinned runtime prompt-resource axes; graph write tests proving later-band node kinds are not rejected solely because grounding is thin. | | I32-L | FE-744 public-RPC structured-exchange parity proof: `rpc.discover` contract tests, pending/respond lifecycle tests, deterministic permutation run over Brunch JSON-RPC only, no repeated deterministic prompts, and parity assertions over the resulting Pi JSONL, transcript display, and session exchange projections. 
|
diff --git a/memory/cards/renderer-golden-coverage--prompt-subagent-topology.md b/memory/cards/renderer-golden-coverage--prompt-subagent-topology.md
new file mode 100644
index 000000000..ebc67cd0d
--- /dev/null
+++ b/memory/cards/renderer-golden-coverage--prompt-subagent-topology.md
@@ -0,0 +1,111 @@
+# Prompt and subagent topology flattening
+
+Frontier: renderer-golden-coverage
+Status: active
+Mode: single
+Created: 2026-06-26
+
+## Orientation
+
+- Containing seam: `src/agents/` prompt-resource topology inside FE-1091 / `renderer-golden-coverage`; this is the remaining RENDER/COMPOSE closure after the renderer/assembly evidence sweep.
+- The prior sweep locked current behavior but stopped short of the user's accepted topology: foreground prompts must be flat files under `prompts/`, and subagent bodies must have their own flat `subagents/` resource home.
+- Main open risk: code/tests currently encode the nested `prompts//SYSTEM.md` convention and package copying likely follows that tree. This slice should change the canonical paths directly, not add aliases or compatibility readers.
+- Cross-cutting obligations: preserve D39-L sealed/code-owned resource lists, D58-L thin composition, D90-L shared `AgentManifest` model, D91-L assembled subagent prompts, and D98-L SPEC/CODE foreground role vocabulary.
+
+Posture: earned (inherited from `renderer-golden-coverage`).
+
+## Target Behavior
+
+Brunch agent body resources use the accepted flat topology: foreground prompts at `src/agents/prompts/{elicitor,executor}.md` and background subagents at `src/agents/subagents/{explorer,researcher,projector,reviewer}.md`.
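A minimal sketch of how the flat topology in Target Behavior could be expressed as code-owned path resolution. The helper names (`foregroundPromptPath`, `subagentBodyPath`, `resolveAgentBodyPath`) and the `FOREGROUND_PROMPT_IDS` list are hypothetical. `BACKGROUND_SUBAGENT_IDS` is a name the SPEC uses, but its contents here are inferred from the path list above and are not copied from `src/agents/registry.ts`.

```typescript
// Hypothetical sketch: list contents and helper names are illustrative
// assumptions, not the actual registry code.
const FOREGROUND_PROMPT_IDS = ["elicitor", "executor"] as const;
const BACKGROUND_SUBAGENT_IDS = ["explorer", "researcher", "projector", "reviewer"] as const;

type ForegroundPromptId = (typeof FOREGROUND_PROMPT_IDS)[number];
type BackgroundSubagentId = (typeof BACKGROUND_SUBAGENT_IDS)[number];

// Foreground bodies are flat files directly under prompts/.
function foregroundPromptPath(id: ForegroundPromptId): string {
  return `src/agents/prompts/${id}.md`;
}

// Background bodies are flat files directly under subagents/.
function subagentBodyPath(id: BackgroundSubagentId): string {
  return `src/agents/subagents/${id}.md`;
}

const FOREGROUND_SET = new Set<string>(FOREGROUND_PROMPT_IDS);
const BACKGROUND_SET = new Set<string>(BACKGROUND_SUBAGENT_IDS);

// Resolve only ids present in the explicit lists; anything else has no path,
// so an unlisted file on disk is never reachable through this function.
function resolveAgentBodyPath(id: string): string | undefined {
  if (FOREGROUND_SET.has(id)) return foregroundPromptPath(id as ForegroundPromptId);
  if (BACKGROUND_SET.has(id)) return subagentBodyPath(id as BackgroundSubagentId);
  return undefined;
}
```

Keeping two separate id lists mirrors the card's requirement that subagent files never become foreground prompt resources: a path exists only for an id that one of the lists names.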
+
+## Full-card cold-start reads
+
+- `memory/SPEC.md` — D44-L, D58-L, D85-L, D90-L, D91-L, D92-L, D93-L, D98-L; I29-L
+- `memory/PLAN.md` — frontier: `renderer-golden-coverage`
+- `src/agents/README.md` — current `src/agents/` ownership and topology
+- `src/agents/prompts/README.md` — foreground prompt ownership to update
+- `src/.pi/extensions/subagents/README.md` — subagent loading/assembly topology to update
+- `src/agents/registry.ts` and `src/.pi/extensions/subagents/agents.ts` — current path registries/loaders
+- `package.json` — `build:pi-assets` prompt/subagent asset copying
+
+## Boundary Crossings
+
+```text
+→ src/agents/registry.ts foreground body path registry
+→ src/agents/runtime/policy.ts / state.ts foreground body lookup
+→ src/.pi/extensions/agent-runtime/system-prompts foreground adapter
+→ src/.pi/extensions/subagents/agents.ts background body loader
+→ src/.pi/extensions/subagents/session.ts / prompt-assembly.ts child prompt assembly
+→ package asset copy
+→ topology docs and tests
+```
+
+## Risks and Assumptions
+
+- RISK: hidden tests or build assets still expect `prompts//SYSTEM.md` directories → MITIGATION: search the repo for `SYSTEM.md`, `prompts/`, and each concrete old path; update package asset copying and topology tests in the same slice.
+- RISK: flattening subagents could accidentally make them foreground prompt resources → MITIGATION: keep explicit `BACKGROUND_SUBAGENT_IDS` loading and foreground `BUNDLED_AGENT_BODY_IDS` / policy lists separate; assert subagent files are not in the foreground prompt list.
+- ASSUMPTION: no external packaged consumer depends on the nested prompt asset layout.
+ → IMPACT IF FALSE: this would need a migration bridge in packaged assets.
+ → VALIDATE: pre-release/free-rewrite posture plus package-local tests; do not preserve old paths unless a build/runtime test proves an atomic update is impossible.
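The first mitigation above (searching for stale nested paths) could be sketched as a pure scan over file contents. This is an illustration under stated assumptions: `findStaleBodyPaths`, `StaleHit`, and the regex are hypothetical and are not part of the repo. The function takes an in-memory map of file name to text, so it makes no claim about how the real check reads the tree.

```typescript
// Hypothetical helper: the names and the pattern are illustrative assumptions.
// The pattern matches the retired nested convention, e.g.
// "agents/prompts/elicitor/SYSTEM.md", and also the placeholder form
// "agents/prompts//SYSTEM.md" that appears in docs.
const STALE_NESTED_BODY = /agents\/prompts\/[^\/\s`]*\/SYSTEM\.md/g;

interface StaleHit {
  file: string;
  line: number; // 1-based line number within the file
  match: string;
}

function findStaleBodyPaths(files: Record<string, string>): StaleHit[] {
  const hits: StaleHit[] = [];
  for (const file of Object.keys(files)) {
    const lines = files[file].split("\n");
    lines.forEach((lineText, index) => {
      // String.prototype.match with a global regex returns every match or null.
      const matches = lineText.match(STALE_NESTED_BODY) || [];
      for (const match of matches) {
        hits.push({ file, line: index + 1, match });
      }
    });
  }
  return hits;
}
```

Hits are candidates for review, not automatic failures: a retirement note may mention the old path legitimately, while the card's invariant only forbids presenting it as the live convention.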
+ +## Posture check + +Earned closure target: + +- **Canonicalizes** prompt-resource locations to the user's accepted topology. +- **Deletes / retires** nested `SYSTEM.md` body directories for agent bodies. +- **Materializes** separate `prompts/` foreground and `subagents/` background homes into filesystem topology, READMEs, registry tests, and package asset copying. +- **Locks in** no stale `prompts//SYSTEM.md` convention in docs/tests/build scripts. + +## Acceptance Criteria + +✓ Foreground prompt body tests — `elicitor` and `executor` load from `src/agents/prompts/elicitor.md` and `src/agents/prompts/executor.md`; old nested foreground paths are absent. +✓ Subagent loader tests — `explorer`, `researcher`, `projector`, and `reviewer` load from `src/agents/subagents/.md`; planted unlisted subagent files remain unspawnable. +✓ Registry/topology tests — foreground body ids exclude background subagents; subagent registry owns background ids; docs name the split. +✓ Build asset check — `build:pi-assets` copies flat foreground prompt files and flat subagent files into the corresponding dist homes. +✓ Repo search invariant — no canonical doc/test/source path still presents `src/agents/prompts//SYSTEM.md` as the live convention. + +## Verification Approach + +- Inner: targeted Vitest over `src/agents`, `src/agents/runtime`, and `src/.pi/extensions/__tests__/subagents.test.ts` — proves loaders, policy paths, and prompt assembly still work. +- Inner: `npm run check` — catches stale docs/format/skill/data-model drift after path edits. +- Gate: `npm run verify` — required before committing because package asset copying and build output are touched. + +## Cross-cutting obligations + +- D39-L: explicit/code-owned resource paths only; no directory discovery except explicit registry ids. +- D58-L: prompt composition remains thin; this slice moves files and loaders, not prompt behavior. 
+- D90-L/D91-L: foreground/background share manifest shape while retaining distinct homes and execution authority. +- D98-L: foreground vocabulary is SPEC/elicitor and CODE/executor; do not preserve orchestrator/pi-coder as product-role aliases. + +## Expected touched paths (tentative) + +```text +memory/cards/ +└── renderer-golden-coverage--prompt-subagent-topology.md + +memory/PLAN.md ~ +memory/SPEC.md ~ +package.json ~ +src/agents/ +├── README.md ~ +├── registry.ts ~ +├── __tests__/ +│ └── registry.test.ts ~ +├── prompts/ +│ ├── README.md ~ +│ ├── elicitor.md + +│ ├── executor.md + +│ ├── elicitor/ - +│ └── executor/ - +├── subagents/ +│ ├── README.md + +│ ├── explorer.md + +│ ├── researcher.md + +│ ├── projector.md + +│ └── reviewer.md + +└── runtime/ ? +src/.pi/extensions/subagents/ +├── README.md ~ +├── agents.ts ~ +└── tests ? +``` From 8957e83d39c156b84cb2db9c0557ce7ef78a019f Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 16:37:37 +0200 Subject: [PATCH 20/29] Rebuild agent asset homes from flat sources --- package.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/package.json b/package.json index 756d85ed7..a056435f8 100644 --- a/package.json +++ b/package.json @@ -34,7 +34,7 @@ "build": "tsc -p tsconfig.build.json && npm run build:info && npm run build:pi-assets && npm run build:web", "build:info": "node scripts/write-build-info.mjs", "prepack": "RELEASE=true npm run build", - "build:pi-assets": "mkdir -p dist/.pi/components/workspace-dialog dist/.pi/extensions/subagents dist/agents/prompts dist/agents/skills dist/agents/contexts && cp -R src/.pi/components/workspace-dialog/assets dist/.pi/components/workspace-dialog/ && cp -R src/agents/prompts/elicitor src/agents/prompts/explorer src/agents/prompts/executor src/agents/prompts/pi-coder src/agents/prompts/projector src/agents/prompts/researcher src/agents/prompts/reviewer dist/agents/prompts/ && cp -R src/agents/skills/strategies src/agents/skills/lenses 
src/agents/skills/methods dist/agents/skills/ && cp -R src/agents/contexts/references dist/agents/contexts/ && cp src/.pi/extensions/subagents/config.json dist/.pi/extensions/subagents/", + "build:pi-assets": "rm -rf dist/agents/prompts dist/agents/subagents && mkdir -p dist/.pi/components/workspace-dialog dist/.pi/extensions/subagents dist/agents/prompts dist/agents/subagents dist/agents/skills dist/agents/contexts && cp -R src/.pi/components/workspace-dialog/assets dist/.pi/components/workspace-dialog/ && cp src/agents/prompts/elicitor.md src/agents/prompts/executor.md dist/agents/prompts/ && cp src/agents/subagents/explorer.md src/agents/subagents/projector.md src/agents/subagents/researcher.md src/agents/subagents/reviewer.md dist/agents/subagents/ && cp -R src/agents/skills/strategies src/agents/skills/lenses src/agents/skills/methods dist/agents/skills/ && cp -R src/agents/contexts/references dist/agents/contexts/ && cp src/.pi/extensions/subagents/config.json dist/.pi/extensions/subagents/", "build:web": "vite build", "seed": "tsx src/graph/seed-fixtures.ts", "generate:ontology": "tsx src/graph/schema/generate-ontology-ref.ts", From 0df83f296cc5985434029b2f7a1e1b9a9390eb8f Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 16:38:10 +0200 Subject: [PATCH 21/29] Cover generated agent asset topology --- .../prompts/__tests__/prompt-bodies.test.ts | 98 ++++++++++++++----- 1 file changed, 72 insertions(+), 26 deletions(-) diff --git a/src/agents/prompts/__tests__/prompt-bodies.test.ts b/src/agents/prompts/__tests__/prompt-bodies.test.ts index 06e8dfb3c..954d6d4e0 100644 --- a/src/agents/prompts/__tests__/prompt-bodies.test.ts +++ b/src/agents/prompts/__tests__/prompt-bodies.test.ts @@ -1,67 +1,113 @@ -import { access, readFile } from 'node:fs/promises'; +import { execFile } from 'node:child_process'; +import { access, readFile, readdir } from 'node:fs/promises'; import { dirname, join } from 'node:path'; import { fileURLToPath } from 'node:url'; 
+import { promisify } from 'node:util'; import { describe, expect, it } from 'vitest'; +const execFileAsync = promisify(execFile); + const projectRoot = dirname(dirname(dirname(dirname(dirname(fileURLToPath(import.meta.url)))))); -const agentDefinitionExpectations = [ +const foregroundPromptExpectations = [ { - system: 'src/agents/prompts/elicitor/SYSTEM.md', + system: 'src/agents/prompts/elicitor.md', + oldNested: 'src/agents/prompts/elicitor/SYSTEM.md', legacyFlat: 'src/.pi/agents/elicitor.md', needles: ['# Agent: elicitor', 'multi-spec discipline'], }, { - system: 'src/agents/prompts/executor/SYSTEM.md', + system: 'src/agents/prompts/executor.md', + oldNested: 'src/agents/prompts/executor/SYSTEM.md', needles: ['# Agent: executor', 'execute mode'], }, +]; + +const backgroundSubagentExpectations = [ { - system: 'src/agents/prompts/reviewer/SYSTEM.md', + system: 'src/agents/subagents/reviewer.md', + oldNested: 'src/agents/prompts/reviewer/SYSTEM.md', legacyFlat: 'src/.pi/agents/reviewer.md', needles: ['name: reviewer', 'checking candidate'], }, { - system: 'src/agents/prompts/explorer/SYSTEM.md', + system: 'src/agents/subagents/explorer.md', + oldNested: 'src/agents/prompts/explorer/SYSTEM.md', needles: ['name: explorer', 'read-only reconnaissance agent'], }, { - system: 'src/agents/prompts/researcher/SYSTEM.md', + system: 'src/agents/subagents/researcher.md', + oldNested: 'src/agents/prompts/researcher/SYSTEM.md', needles: ['name: researcher', 'web-research agent'], }, { - system: 'src/agents/prompts/projector/SYSTEM.md', + system: 'src/agents/subagents/projector.md', + oldNested: 'src/agents/prompts/projector/SYSTEM.md', needles: ['name: projector', 'candidate-proposal'], }, - { - system: 'src/agents/prompts/pi-coder/SYSTEM.md', - needles: [ - 'expert coding assistant operating inside *brunch*', - 'Show file paths clearly when working with files', - ], - }, ]; +async function expectMissing(path: string): Promise<void> { + await expect(access(join(projectRoot,
path))).rejects.toThrow(); +} + describe('agent prompt bodies', () => { - it('keeps agent body resources under src/agents/prompts//SYSTEM.md', async () => { - for (const expectation of agentDefinitionExpectations) { + it('keeps foreground agent body resources as flat prompt files', async () => { + for (const expectation of foregroundPromptExpectations) { const content = await readFile(join(projectRoot, expectation.system), 'utf8'); for (const needle of expectation.needles) { expect(content).toContain(needle); } - if (expectation.legacyFlat) { - await expect(access(join(projectRoot, expectation.legacyFlat))).rejects.toThrow(); + await expectMissing(expectation.oldNested); + if (expectation.legacyFlat) await expectMissing(expectation.legacyFlat); + } + }); + + it('keeps background subagent bodies out of the foreground prompt home', async () => { + for (const expectation of backgroundSubagentExpectations) { + const content = await readFile(join(projectRoot, expectation.system), 'utf8'); + for (const needle of expectation.needles) { + expect(content).toContain(needle); } + await expectMissing(expectation.oldNested); + if (expectation.legacyFlat) await expectMissing(expectation.legacyFlat); } + + await expectMissing('src/agents/prompts/pi-coder/SYSTEM.md'); + }); + + it('builds generated agent assets without retired nested prompt-body directories', async () => { + await execFileAsync('npm', ['run', 'build:pi-assets'], { cwd: projectRoot }); + + await expectMissing('dist/agents/prompts/elicitor/SYSTEM.md'); + await expectMissing('dist/agents/prompts/executor/SYSTEM.md'); + await expectMissing('dist/agents/prompts/explorer/SYSTEM.md'); + await expectMissing('dist/agents/prompts/pi-coder/SYSTEM.md'); + await expectMissing('dist/agents/prompts/projector/SYSTEM.md'); + await expectMissing('dist/agents/prompts/researcher/SYSTEM.md'); + await expectMissing('dist/agents/prompts/reviewer/SYSTEM.md'); + expect((await readdir(join(projectRoot, 
'dist/agents/prompts'))).sort()).toEqual([ + 'elicitor.md', + 'executor.md', + ]); + expect((await readdir(join(projectRoot, 'dist/agents/subagents'))).sort()).toEqual([ + 'explorer.md', + 'projector.md', + 'researcher.md', + 'reviewer.md', + ]); }); - it('records the adopted body topology in the local README', async () => { - const readme = await readFile(join(projectRoot, 'src/agents/prompts/README.md'), 'utf8'); + it('records the foreground/background split in local READMEs', async () => { + const promptsReadme = await readFile(join(projectRoot, 'src/agents/prompts/README.md'), 'utf8'); + const subagentsReadme = await readFile(join(projectRoot, 'src/agents/subagents/README.md'), 'utf8'); - expect(readme).toContain('SYSTEM.md convention is adopted'); - expect(readme).toContain('Background bodies are subagent resources, not foreground prompts'); - expect(readme).toContain('BACKGROUND_SUBAGENT_IDS'); - expect(readme).toContain('Background frontmatter is authoring DX'); - expect(readme).toContain('Unlisted directories are not spawnable'); + expect(promptsReadme).toContain('Flat foreground files are canonical'); + expect(promptsReadme).toContain('src/agents/prompts/{elicitor,executor}.md'); + expect(promptsReadme).toContain('src/agents/subagents/'); + expect(promptsReadme).toContain('retired orchestrator / pi-coder body aliases are not preserved'); + expect(subagentsReadme).toContain('BACKGROUND_SUBAGENT_IDS'); + expect(subagentsReadme).toContain('Unlisted files are not spawnable'); }); }); From 4bec76372e2d8de07547112941fd5d7792afb123 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 16:38:49 +0200 Subject: [PATCH 22/29] Reconcile plan-check scope to executor surface --- ...orchestrator-tool-port--plan-check-tool.md | 29 ++++++++----------- 1 file changed, 12 insertions(+), 17 deletions(-) diff --git a/memory/cards/orchestrator-tool-port--plan-check-tool.md b/memory/cards/orchestrator-tool-port--plan-check-tool.md index 8e1535346..83ae9d877 100644 --- 
a/memory/cards/orchestrator-tool-port--plan-check-tool.md +++ b/memory/cards/orchestrator-tool-port--plan-check-tool.md @@ -7,7 +7,7 @@ Created: 2026-06-25 ## Orientation -- Containing seam: `execute` mode's foreground `orchestrator` agent and the `.pi/extensions` adapter boundary; this slice replaces the standup stub with the first real cook-orchestrator tool. +- Containing seam: `execute` mode's foreground `executor` agent and the `.pi/extensions` adapter boundary; this slice replaces the branch-local standup stub with the first real cook-plan inspection tool. - Relevant frontier item: `orchestrator-tool-port` / FE-1087, inherited as the Linear issue and branch boundary from `memory/PLAN.md`. - Volatile handoff state: none in `HANDOFF.md` (absent); source context comes from the prior port analysis and the external `../brunch` orchestrator docs/source. - Main open risk: accidentally importing the CLI's execution side effects before the read-only tool boundary is proved; preserve the D39-L sealed profile and D90-L-D93-L/I49-L code-owned authority model. @@ -16,14 +16,14 @@ Posture: proving (inherited from `orchestrator-tool-port`) ## Target Behavior -The execute-mode orchestrator can inspect a cook plan through a product-registered, read-only `cook_plan_check` tool whose result contains plan shape plus contract findings. +The execute-mode executor can inspect a cook plan through a product-registered, read-only `cook_plan_check` tool whose result contains plan shape plus contract findings. ## Full-card cold-start reads - `memory/SPEC.md` — decisions / invariants: D39-L, D40-L, D90-L, D91-L, D92-L, D93-L, I49-L. - `memory/PLAN.md` — frontier: `orchestrator-tool-port`. - `src/.pi/extensions/README.md` — adapter-only ownership and boundary rules. -- `src/agents/prompts/orchestrator/SYSTEM.md` — current execute-mode foreground prompt and stub wording to retire. +- `src/agents/prompts/executor.md` — current execute-mode foreground prompt and stub wording to retire. 
- `src/agents/runtime/policy.ts` — `execute` foreground roster and blocked direct tool policy. - `src/session/schema/tool-names.ts` — shared tool-name constants. - `/Users/lunelson/Code/hashintel/brunch/ORCHESTRATOR.md` — source CLI behavior and plan format. @@ -32,9 +32,9 @@ The execute-mode orchestrator can inspect a cook plan through a product-register ## Boundary Crossings ```text -→ execute-mode foreground `orchestrator` prompt +→ execute-mode foreground `executor` prompt → runtime policy tool grant / block list -→ `.pi/extensions/orchestrator` Pi tool adapter +→ `.pi/extensions/agent-runtime` Pi tool adapter → product-owned `src/orchestrator` plan loader + contract core → workspace cook plan path → typed Pi tool result content/details @@ -43,7 +43,7 @@ The execute-mode orchestrator can inspect a cook plan through a product-register ## Risks and Assumptions - RISK: CLI code pulls in process exits, git worktree creation, model auth, or child Pi sessions too early → MITIGATION: port only pure/read-only plan loading and contract checking in this slice; no sandbox, engine, Petrinaut stream, or worker session imports. -- RISK: The foreground `orchestrator` gains accidental write authority while replacing the stub → MITIGATION: keep `bash`, `edit`, and `write` blocked in `agents/runtime/policy.ts`; register only the read-only `cook_plan_check` tool for this card. +- RISK: The foreground `executor` gains accidental write authority while replacing the stub → MITIGATION: keep `bash`, `edit`, and `write` blocked in `agents/runtime/policy.ts`; register only the read-only `cook_plan_check` tool for this card. - RISK: External source names leak as temporary compatibility aliases → MITIGATION: canonicalize the product-facing tool name now; delete the `orchestrator_stub` tool path when the real tool is registered. - ASSUMPTION: The external cook plan contract is the right first tracer boundary for the port. 
→ IMPACT IF FALSE: the later `cook_run` surface may need a different plan source/result model, but this slice's blast radius is limited to read-only validation and prompt/tool naming. @@ -51,7 +51,7 @@ The execute-mode orchestrator can inspect a cook plan through a product-register ## Posture check -This is a proving tracer. It scores on proof of life by making execute mode call real orchestrator-derived product code, on invariants by locking the foreground no-direct-write boundary while still exposing orchestration capability, and on uncertainty by testing that the external `brunch cook` plan contract can be ported without shell-wrapping the CLI. +This is a proving tracer. It scores on proof of life by making execute mode call real cook-plan product code, on invariants by locking the foreground no-direct-write boundary while still exposing orchestration capability, and on uncertainty by testing that the external `brunch cook` plan contract can be ported without shell-wrapping the CLI. No separate spike is cheaper than this slice: the useful proof is whether the product registry, prompt, runtime policy, and plan contract all line up through the real execute-mode tool boundary. @@ -59,9 +59,9 @@ No separate spike is cheaper than this slice: the useful proof is whether the pr ✓ `cook_plan_check` is product-registered for execute mode and returns a typed result for a valid plan path containing mode, epic count, slice count, policy-relevant findings, and source path. ✓ Invalid or contract-failing plans return deterministic typed findings/errors without creating `.brunch/cook/runs`, git worktrees, Petrinaut artifacts, or child Pi sessions. -✓ `orchestrator_stub` is no longer advertised to the foreground orchestrator, and the old stub registration path is retired. +✓ The branch-local executor stub is no longer advertised to the foreground executor, and the old stub registration path is retired. 
✓ `agents/runtime/policy.ts` still blocks direct `bash`, `edit`, and `write` for `execute`, with tests or assertions covering the new tool grant. -✓ `src/agents/prompts/orchestrator/SYSTEM.md` tells the foreground agent to use the real plan-check tool and preserves the no-direct-write instruction. +✓ `src/agents/prompts/executor.md` tells the foreground agent to use the real plan-check tool and preserves the no-direct-write instruction. ## Verification Approach @@ -72,7 +72,7 @@ No separate spike is cheaper than this slice: the useful proof is whether the pr ## Cross-cutting obligations - Preserve D39-L sealed-profile discipline: no ambient Pi discovery, dynamic import scanning, or shell-wrapped CLI escape hatch. -- Preserve D90-L-D93-L/I49-L authority: foreground `orchestrator` remains low-privilege; any future write-capable worker must be code-owned and explicitly allowlisted. +- Preserve D90-L-D93-L/I49-L authority: foreground `executor` remains low-privilege; any future write-capable worker must be code-owned and explicitly allowlisted. - Keep `.pi/extensions` adapter-only: reusable plan-contract logic belongs in product core, not hidden extension memory. - Treat `.brunch/cook/runs/` as an execution artifact for later `cook_run`, not an artifact this read-only slice creates. @@ -92,8 +92,7 @@ src/ │ └── plan-check.test.ts + ├── agents/ │ ├── prompts/ -│ │ └── orchestrator/ -│ │ └── SYSTEM.md ~ +│ │ └── executor.md ~ │ └── runtime/ │ ├── policy.ts ~ │ └── __tests__/ ? @@ -101,11 +100,7 @@ src/ │ ├── extensions/ │ │ ├── README.md ~ │ │ ├── agent-runtime/ ~ -│ │ ├── orchestrator/ + -│ │ │ ├── index.ts + -│ │ │ └── __tests__/ -│ │ │ └── orchestrator-tool.test.ts + -│ │ └── orchestrator-stub/ - +│ │ └── agent-runtime/orchestrator-stub/ - │ └── __tests__/ ? 
├── app/ │ └── pi-extensions.ts ~ From 8b231afbf519d2cfce033f1986b40aa517470f28 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 16:39:49 +0200 Subject: [PATCH 23/29] Clarify current execute vocabulary in prompt prose --- memory/SPEC.md | 4 ++-- src/agents/README.md | 10 ++++++---- src/agents/prompts/README.md | 29 ++++++++++------------------- src/agents/prompts/executor.md | 5 +++++ 4 files changed, 23 insertions(+), 25 deletions(-) create mode 100644 src/agents/prompts/executor.md diff --git a/memory/SPEC.md b/memory/SPEC.md index 91d330cc5..9f3546178 100644 --- a/memory/SPEC.md +++ b/memory/SPEC.md @@ -277,7 +277,7 @@ The POC's purpose is to prove three things: (a) that pi's coding-agent harness c - **D93-L — Operational mode and foreground agent collapse to one op-mode-keyed source of truth.** A foreground agent and its operational mode are 1:1 (D40-L: the foreground agent is derived from operational mode), so the prior **three-record fragmentation** — id enums in `src/session/schema/kinds.ts`, `OPERATIONAL_MODE_DEFINITIONS` + `AGENT_ROLE_DEFINITIONS` + `TOOL_POLICY_DEFINITIONS` in the former projections runtime-policy module, and `AGENT_PROMPT_DEFINITIONS` in `src/agents/runtime/state.ts` (which duplicated `model`/`thinking`/prompt-resource grants across two of them) — collapses to a **single op-mode-keyed record**. An operational mode IS `{ foreground AgentManifest (D90-L), tool policy, canDelegate set }`; background agents live in a sibling `AgentManifest` registry, and the per-agent **`canDelegate`** field (D92-L generalized from op_mode-keyed to a manifest field) links a foreground mode to the background agents it may spawn — **code-owned for foreground modes** so the write-safety boundary (I49-L) holds; it also generalizes to background→background nesting. D98-L refines the roster from the earlier `elicit` / `execute` / `code` split to the product target **`SPEC` → `elicitor`** and **`CODE` → `executor`**. 
The executor merges the prior `orchestrator` and `pi-coder` directions: it is Brunch-data-aware, can perform ordinary coding-assistant work under the CODE tool policy, and owns the plan-execution orchestration tool surface instead of forcing a separate execute coordinator. Depends on: D23-L, D40-L, D58-L, D90-L, D92-L, D98-L; I49-L. Establishing frontier: `subagent-reconciliation` established the shared manifest/collapse substrate; the SPEC/CODE roster correction is owned by the data-model-legibility / executor follow-on planning. Supersedes: the three-record foreground-agent fragmentation as separate sources of truth; `defaultRole`/`allowedRoles` as a flexible many-roles-per-mode model (it is 1:1); and the three-foreground-mode split where `execute`/`orchestrator` and `code`/`pi-coder` were separate product directions. - **D36-L — Spec/session selection is a reusable hierarchical decision model with transport-specific presentations.** Brunch owns a pure spec/session selection model that renders cwd-scoped inventory under the discovered project name without calling the user-created object a “workspace”. In TUI mode, the model may present a fast “continue last session” affordance when `.brunch/workspace.json` points to a valid spec+session; otherwise, or after “other spec/session”, the durable tree is: `create new spec → provide spec name → session created automatically`; `resume existing spec → choose existing spec → create a new session OR resume existing session → choose existing session`. The UI should not list every spec as a top-level action label; “resume existing spec” is the top-level intent, and the spec list is the next screen/scrollable selector. The model returns a product decision (`new spec`, `new session for spec`, `open session`, `continue selected session`, `cancel/quit`) without opening Pi sessions or mutating `.brunch/workspace.json` itself. 
The `WorkspaceSessionCoordinator` activates that decision and owns all persistence/session-binding effects. TUI startup and in-session paths share branded `pi-tui` components and colocated logo assets under `src/.pi/components/workspace-dialog`; adapters differ only in terminal lifecycle and Pi session-replacement mechanics (`ProcessTerminal`/`TUI.showOverlay` before Pi starts, `ctx.ui.custom(..., { overlay: true })` inside Pi), not in product semantics. RPC/headless transports must not invoke the TUI picker; they expose the same initial-selection requirement and activation decisions as JSON-RPC/product results so CLI JSON-RPC clients can select or create spec/session correctly. Depends on: D11-L, D21-L, D24-L, D33-L. Supersedes: implicit resume of `.brunch/workspace.json` on TUI launch, Pi `/resume`/`/new` as Brunch's product session chooser, one-off startup-only picker implementations, a flat action list that says “workspace” for specs, top-level `resume spec X` labels, and a separate intermediate action chooser for switching. - **D42-L — Session naming is Pi `session_info` presentation metadata, not spec identity.** Brunch-created sessions should be named at creation with neutral workspace-global defaults (`Untitled Session 1`, `Untitled Session 2`, …) so pickers/chrome never show an unnamed Brunch session and unchanged defaults do not collide across specs in the same cwd. These defaults are immediate lifecycle metadata, not LLM-generated summaries and not derived from the selected spec title. Brunch may later use Pi session lifecycle hooks to opportunistically replace a default with a short human-readable name that characterizes what happened in the transcript. 
The preferred generation trigger is `session_shutdown` for `quit`, `new`, and `resume` replacements because it sees the just-finished transcript and can name it before later picker lists need to distinguish sessions; `session_before_compact` or post-compaction (`session_compact`) may be used to refresh names after major summarization, and a manual/user rename command can force or override naming. The generation call should mirror the model-selection pattern in the local `summarize.ts` extension example: choose a cheap/fast authorized model, extract user/assistant text plus salient tool calls from the current branch, ask for a concise title, and append a Pi `session_info` entry through `SessionManager.appendSessionInfo`. Naming must be best-effort and non-blocking with a tight budget: failures, missing auth, empty transcripts, or shutdown aborts preserve the existing default/user label rather than blocking session replacement or exit. Session display names label sessions in pickers and chrome, but do not affect spec ids, session bindings, graph truth, or replay semantics. Depends on: D6-L, D17-L, D21-L, D35-L. Supersedes: using spec title or session UUID alone as the only durable display label once transcripts have meaningful content, leaving Brunch-created sessions unnamed, spec-local default numbering, or treating generated session names as canonical spec identity. 
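
The workspace-global default naming in D42-L can be sketched as a pure function over the names already in use. This is an illustration only: `nextDefaultSessionName` and `DEFAULT_NAME_PATTERN` are hypothetical names, and taking the highest existing number plus one (rather than reusing gaps) is an assumption, since the decision text only requires that defaults not collide.

```typescript
// Matches only untouched defaults such as "Untitled Session 3".
const DEFAULT_NAME_PATTERN = /^Untitled Session (\d+)$/;

// Hypothetical helper: given every session name in the cwd (across all specs),
// return the next neutral default that cannot collide with an existing one.
function nextDefaultSessionName(existingNames: readonly string[]): string {
  let highest = 0;
  for (const name of existingNames) {
    const match = DEFAULT_NAME_PATTERN.exec(name);
    // User-chosen or generated names are ignored; only defaults advance the counter.
    if (match) highest = Math.max(highest, Number(match[1]));
  }
  return `Untitled Session ${highest + 1}`;
}
```

Because the input is the whole workspace's names rather than one spec's, the numbering stays global, which is the property D42-L contrasts with spec-local default numbering.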
-- **D58-L — Brunch prompt composition is a thin runtime header plus load-on-demand prompt resources, not eager selection of every objective pack.** The architectural commitment is: composition stays a projection layer, not a behavioral state machine; detailed guidance lives in read-on-demand prompt resources and agent-readable references rather than eager prompt-pack concatenation; runtime availability is Brunch's sealed resource manifest, not ambient Pi discovery; D98-L suspends prompt-resource axes as runtime state, so composition may advertise resources/pointers without presenting strategy/lens/method as selected posture; and the pushed-context slice stays compact, with deeper access governed by D60-L. Current prompt-resource topology, manifest emission, file-owned skill metadata, seed context composition, and ownership split across `agents/prompts/`, `agents/subagents/`, `agents/skills/`, `agents/runtime/`, `agents/contexts/`, and `.pi/extensions/agent-runtime/` live in [`src/agents/README.md`](src/agents/README.md), [`src/agents/prompts/README.md`](src/agents/prompts/README.md), [`src/agents/subagents/README.md`](src/agents/subagents/README.md), [`src/agents/skills/README.md`](src/agents/skills/README.md), [`src/agents/runtime/README.md`](src/agents/runtime/README.md), [`src/agents/contexts/README.md`](src/agents/contexts/README.md), [`src/.pi/README.md`](src/.pi/README.md), [`src/.pi/extensions/README.md`](src/.pi/extensions/README.md), [`src/agents/runtime/compose.ts`](src/agents/runtime/compose.ts), [`src/agents/runtime/state.ts`](src/agents/runtime/state.ts), and [`src/agents/contexts/seeds/turn-context.ts`](src/agents/contexts/seeds/turn-context.ts). 
**Base-prompt relationship (validated 2026-06-18, slice 1):** the `before_agent_start` handler **appends** Brunch's composed block (now led by the foreground prompt body, then runtime header + manifests) to Pi's base system prompt (`${basePrompt}\n\n${composed}`), so a foreground agent currently *augments* Pi's base coding-agent prompt rather than replacing it. Whether a foreground prompt body should suppress or replace that base is **open** and tied to the future executor/CODE op-mode (which deliberately augments Pi's coding agent); the `elicitor` augmenting a coding base is a known follow-on question, not a settled choice. Refined by: D93-L (the `code`→`pi-coder` foreground mode instantiates the augment case; the replace option for other roles stays open). Composition is projection, not a behavioral state machine. Depends on: D23-L, D25-L, D39-L, D40-L, D52-L, D59-L, D60-L. Refined by: D85-L (implemented 2026-06-18/19: the manifest drops `` — two axes `strategy` + `lens` — and the `goal` body inlines into the `elicitor` role prompt) and by the 2026-06-22 prompt-skill-topology slice (all prompt resources adopt Agent Skills `SKILL.md` topology; `description` becomes file-owned frontmatter; the emitted wrapper becomes `` with per-skill ``). Supersedes: the flat "base + mode + role + strategy + lens + grade + …" layering; the fixed all-packs concatenation in `compose-brunch-prompt.ts`; "role preset / runtime bundle" as the composition unit; direct Layer-2 eager prompt-pack injection as the default mechanism; treating top-level `src/agents/` as Pi-only rather than Brunch LLM-context ingress; and `capability` as a parallel name for `method`. 
+- **D58-L — Brunch prompt composition is a thin runtime header plus load-on-demand prompt resources, not eager selection of every objective pack.** The architectural commitment is: composition stays a projection layer, not a behavioral state machine; detailed guidance lives in read-on-demand prompt resources and agent-readable references rather than eager prompt-pack concatenation; runtime availability is Brunch's sealed resource manifest, not ambient Pi discovery; D98-L suspends prompt-resource axes as runtime state, so composition may advertise resources/pointers without presenting strategy/lens/method as selected posture; and the pushed-context slice stays compact, with deeper access governed by D60-L. Current prompt-resource topology, manifest emission, file-owned skill metadata, seed context composition, and ownership split across `agents/prompts/`, `agents/subagents/`, `agents/skills/`, `agents/runtime/`, `agents/contexts/`, and `.pi/extensions/agent-runtime/` live in [`src/agents/README.md`](src/agents/README.md), [`src/agents/prompts/README.md`](src/agents/prompts/README.md), [`src/agents/subagents/README.md`](src/agents/subagents/README.md), [`src/agents/skills/README.md`](src/agents/skills/README.md), [`src/agents/runtime/README.md`](src/agents/runtime/README.md), [`src/agents/contexts/README.md`](src/agents/contexts/README.md), [`src/.pi/README.md`](src/.pi/README.md), [`src/.pi/extensions/README.md`](src/.pi/extensions/README.md), [`src/agents/runtime/compose.ts`](src/agents/runtime/compose.ts), [`src/agents/runtime/state.ts`](src/agents/runtime/state.ts), and [`src/agents/contexts/seeds/turn-context.ts`](src/agents/contexts/seeds/turn-context.ts). 
**Base-prompt relationship (validated 2026-06-18, slice 1):** the `before_agent_start` handler **appends** Brunch's composed block (now led by the foreground prompt body, then runtime header + manifests) to Pi's base system prompt (`${basePrompt}\n\n${composed}`), so a foreground agent currently *augments* Pi's base coding-agent prompt rather than replacing it. Whether a foreground prompt body should suppress or replace that base is **open** and tied to the future executor/CODE op-mode (which deliberately augments Pi's coding agent); the `elicitor` augmenting a coding base is a known follow-on question, not a settled choice. Refined by: D93-L/D98-L (the CODE→`executor` foreground mode instantiates the augment case; the replace option for other roles stays open). Composition is projection, not a behavioral state machine. Depends on: D23-L, D25-L, D39-L, D40-L, D52-L, D59-L, D60-L. Refined by: D85-L (implemented 2026-06-18/19: the manifest drops `` — two axes `strategy` + `lens` — and the `goal` body inlines into the `elicitor` role prompt) and by the 2026-06-22 prompt-skill-topology slice (all prompt resources adopt Agent Skills `SKILL.md` topology; `description` becomes file-owned frontmatter; the emitted wrapper becomes `` with per-skill ``). Supersedes: the flat "base + mode + role + strategy + lens + grade + …" layering; the fixed all-packs concatenation in `compose-brunch-prompt.ts`; "role preset / runtime bundle" as the composition unit; direct Layer-2 eager prompt-pack injection as the default mechanism; treating top-level `src/agents/` as Pi-only rather than Brunch LLM-context ingress; and `capability` as a parallel name for `method`. 
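
The append relationship stated in D58-L (`${basePrompt}\n\n${composed}`, with the composed block led by the foreground prompt body, then the runtime header and manifests) can be sketched as two small functions. Only the final template string is taken from the decision text; the `ComposedParts` shape, the function names, and the blank-line separator inside the composed block are assumptions for illustration, not the code in `src/agents/runtime/compose.ts`.

```typescript
// Hypothetical shape for the pieces Brunch contributes to the system prompt.
interface ComposedParts {
  foregroundBody: string;
  runtimeHeader: string;
  manifests: string[];
}

// Order follows D58-L: foreground body first, then runtime header, then manifests.
// Joining with a blank line and skipping empty parts is an assumption.
function composeBrunchBlock(parts: ComposedParts): string {
  return [parts.foregroundBody, parts.runtimeHeader, ...parts.manifests]
    .filter((part) => part.trim().length > 0)
    .join('\n\n');
}

// The composed block augments Pi's base prompt; it does not replace it.
function appendToBasePrompt(basePrompt: string, composed: string): string {
  return `${basePrompt}\n\n${composed}`;
}
```

Keeping the append step separate from block composition leaves room for the open question in D58-L: a later role could swap `appendToBasePrompt` for a replacing variant without touching how the block itself is assembled.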
#### Continuity & origination (turn-boundary choreography) @@ -324,7 +324,7 @@ The POC's purpose is to prove three things: (a) that pi's coding-agent harness c - **D95-L — Elicitor capability spine: `capture` / `generate` / `project` are the three SPEC-mode capabilities.** The elicitor's work decomposes into three capabilities by what each does to the graph: **capture** commits ground material already present in the transcript tail into graph truth (the D80-L banded sweep + D81-L commitment gradient + D82-L acquisition layer, already specced); **generate** proposes new typed graph expressions on a requested plane from grounding plus a conceptual frame, fanning candidates out and committing the chosen one through review (D96-L); **project** derives nodes on one plane from a subset/plane of the existing graph with connecting cross-plane edges (e.g. requirements→design, design→oracles, A33-L). D98-L makes this a SPEC-mode capability vocabulary, not a runtime-axis topology: capture/generate/project are the elicitor's jobs, while strategy/lens/method files are optional prompt-resource organization if they improve behavior. Capture remains always-on conduct of every elicitor turn; generate and project are requested just-in-time and readiness remains advisory rather than a graph-write tool gate (D74-L/D86-L). Background acquisition subagents (D82-L near-future, A34-L) are the `acquire` arm feeding capture, not a fourth capability. Depends on: D74-L, D80-L, D81-L, D82-L, D85-L, D86-L, D98-L. Supersedes: the proposed `grounding` / `elicitation` / `projection` lifecycle directories as a replacement skill topology, and treating strategy/lens/method as the load-bearing runtime capability model. 
- **D96-L — `generate` is one deep plane-parameterized skill; fan-in is a three-value mode carried by `present_candidates` + the review-set path, not three skills.** Generative proposal across the intent, design, and oracle planes is **one** `generate` skill taking the target plane (and lens frame) as a parameter, not per-plane `propose-scenarios` / `propose-design-shapes` / `propose-oracle-ensembles` skills (the earlier per-plane sketch in [`docs/design/ELICITATION_LENSES.md`](docs/design/ELICITATION_LENSES.md)). The fan-out/fan-in shape is shared: the skill fans candidate expressions out, then **fan-in is a three-value mode** — `pick` (choose one), `synthesize` (merge candidates into one), `compose` (accept several) — expressed as plane-keyed method conduct over `present_candidates` plus the review-set path rather than branched per plane. Plane-specific judgment (the "design it twice" pattern for design, oracle-family selection for oracles, the kernel/lens heuristics in [`docs/design/BEHAVIORAL_KERNELS.md`](docs/design/BEHAVIORAL_KERNELS.md) / [`docs/design/ELICITATION_LENSES.md`](docs/design/ELICITATION_LENSES.md) for intent) lives in plane-keyed skill content read on demand (D58-L manifest world), not in separate skills. This **entailed un-stubbing the `present_candidates` topology** — the tool [`src/.pi/extensions/exchanges/present-candidates.ts`](src/.pi/extensions/exchanges/present-candidates.ts), the projection [`src/projections/exchanges/present-candidates.ts`](src/projections/exchanges/present-candidates.ts), and the renderer [`src/agents/contexts/exchanges/present-candidates.ts`](src/agents/contexts/exchanges/present-candidates.ts) — which D85-L move 4 confirmed as a live anticipated stub, not a fossil: candidate presentation gets its product owner here. 
The materialized tool remains **pick-only at the UI boundary**: intent uses the pick as recognition/provenance, while design-plane synthesize is performed by the method after the pick and then reviewed/committed through `present_review_set → request_response → acceptReviewSet`, so no `fan_in_mode` field is needed unless a later plane proves the UI itself must carry that mode. Commitment still flows through the review-set path (D27-L); `present_candidates` recognizes fan-out presentation and never commits graph truth itself (I51-L). Depends on: D26-L, D27-L, D30-L, D31-L, D58-L, D74-L, D85-L, D95-L; A31-L, A32-L. Supersedes: per-plane generative skills as the topology; treating `present_candidates` as a permanent stub without a product owner; prebuilding a fan-in schema field before a plane proves it necessary. - **D97-L — Skill ontology-heuristic provenance: three sources — consumed context renders, generated typed-vocab, hand-authored judgment — kept distinct.** Skill bodies that teach the agent how to think about the graph model draw ontology/heuristic content from three provenance classes that must not blur: (1) **dynamic instance context** rendered into the prompt by the context-render house style (D83-L, FE-870) — graph overviews, gap agendas, neighborhoods — consumed, never restated in skill prose; (2) **generated typed-vocab context references** (`src/agents/contexts/references/`, D85-L prompt-shape closure (d)) projected from the closed `kinds.ts` enums (D73-L) and drift-checked, for any skill that must enumerate node kinds / edge categories / bands / planes; and (3) **hand-authored judgment** — the irreducible "how to reason" content (kernels, lenses, oracle-family selection) that is neither instance data nor mechanical vocabulary. 
The materialized shared authored judgment reference is [`src/agents/contexts/references/graph-authoring-heuristics.md`](src/agents/contexts/references/graph-authoring-heuristics.md), cited by `capture` and `commit-graph` for declarative graph claims, settled commitment, low-confidence/contradiction routing, confident relation endpoints, and role-named mutation grammar. The rule: a skill cites the context renderer or the generated/authored reference rather than copying its content, so ontology drift (D73-L renames, D94-L band changes) propagates through one canonical source. Depends on: D58-L, D73-L, D83-L, D85-L, D94-L; FE-870. Supersedes: hand-restating node-kind / band / edge-category / plane vocabulary inside skill bodies. -- **D98-L — Operational mode only: suspend strategy/lens/method runtime axes; target product modes are SPEC and CODE.** The architectural correction is that the `strategy` / `lens` / `method` model is not yet proven as the right product/runtime abstraction. It may still organize prompt-resource files and concise agent-readable references, but it must not be a user-facing TUI picker, transcript-backed posture field, AUTO axis, tool gate, or the source of foreground-agent identity until live elicitor behavior proves that shape earns its cost. Runtime state narrows to one mutable axis: operational mode. The target product mode labels are **`SPEC`** and **`CODE`**. `SPEC` runs the `elicitor`, whose job is to get a user from zero to a complete spec through three capabilities: (1) capture arbitrary unstructured material into graph truth with correct confidence/basis/gap handling; (2) generate candidate graph concepts on the intent, design, and oracle planes and commit coherent graph expressions through the appropriate review/commit path; and (3) project selected graph subsets or planes into downstream planes with connecting nodes and edges. 
`CODE` runs the **`executor`**, a Brunch-aware coding assistant that merges the prior `orchestrator` and `pi-coder` directions: it can read/use Brunch graph and session context, can act as a normal coding assistant under the CODE tool policy, and owns the plan-execution orchestration tool surface (the previously stubbed orchestrator tool) instead of requiring a separate execute-mode coordinator. The TUI should expose only `mode: SPEC | CODE`; prompt-resource/reference loading remains agent-internal and load-on-demand unless a narrow runtime moment proves eager injection necessary. Depends on: D23-L, D40-L, D58-L, D85-L, D90-L, D93-L, D95-L, D97-L. Establishing frontier: `data-model-legibility` for the SPEC-mode guidance/reference substrate, followed by executor/tool-port work for CODE. Supersedes: runtime persistence or UI exposure of strategy/lens/method axes; the `elicit` / `execute` / planned `code` three-mode foreground roster; the separation between `orchestrator` as execute coordinator and `pi-coder` as direct-coding mode; and treating method routing as the product-level capability model. +- **D98-L — Operational mode only: suspend strategy/lens/method runtime axes; target product modes are SPEC and CODE.** The architectural correction is that the `strategy` / `lens` / `method` model is not yet proven as the right product/runtime abstraction. It may still organize prompt-resource files and concise agent-readable references, but it must not be a user-facing TUI picker, transcript-backed posture field, AUTO axis, tool gate, or the source of foreground-agent identity until live elicitor behavior proves that shape earns its cost. Runtime state narrows to one mutable axis: operational mode. Current runtime ids remain **`elicit`** and **`execute`** in this slice; **`SPEC`** and **`CODE`** are the target product labels, not yet the persisted/runtime ids. 
`elicit`/target-SPEC runs the `elicitor`, whose job is to get a user from zero to a complete spec through three capabilities: (1) capture arbitrary unstructured material into graph truth with correct confidence/basis/gap handling; (2) generate candidate graph concepts on the intent, design, and oracle planes and commit coherent graph expressions through the appropriate review/commit path; and (3) project selected graph subsets or planes into downstream planes with connecting nodes and edges. `execute`/target-CODE runs the **`executor`**, a Brunch-aware coding assistant that merges the prior `orchestrator` and `pi-coder` directions: it can read/use Brunch graph and session context, can act as a normal coding assistant under the execute/CODE tool policy, and owns the plan-execution orchestration tool surface (the previously stubbed orchestrator tool) instead of requiring a separate execute-mode coordinator. A future runtime rename may expose `mode: SPEC | CODE`; prompt-resource/reference loading remains agent-internal and load-on-demand unless a narrow runtime moment proves eager injection necessary. Depends on: D23-L, D40-L, D58-L, D85-L, D90-L, D93-L, D95-L, D97-L. Establishing frontier: `data-model-legibility` for the SPEC-mode guidance/reference substrate, followed by executor/tool-port work for CODE. Supersedes: runtime persistence or UI exposure of strategy/lens/method axes; the planned `code` third foreground roster; the separation between `orchestrator` as execute coordinator and `pi-coder` as direct-coding mode; and treating method routing as the product-level capability model. 
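The runtime-id versus product-label distinction in D98-L can be sketched as a small lookup. This is an illustrative sketch only: `MODE_TABLE`, `ModeInfo`, and `describeMode` are hypothetical names, not identifiers from the repository. The mapping itself (`elicit` to `elicitor` with target label SPEC, `execute` to `executor` with target label CODE) is taken from the decision text.

```typescript
// Current runtime ids per D98-L. SPEC and CODE are target product labels,
// not yet the persisted/runtime ids.
type OperationalMode = "elicit" | "execute";
type ForegroundAgent = "elicitor" | "executor";
type TargetLabel = "SPEC" | "CODE";

// Hypothetical record type pairing a foreground agent with its target label.
interface ModeInfo {
  agent: ForegroundAgent;
  targetLabel: TargetLabel;
}

// Hypothetical table: operational mode is the single mutable runtime axis,
// and the foreground agent is derived from it rather than stored separately.
const MODE_TABLE: Record<OperationalMode, ModeInfo> = {
  elicit: { agent: "elicitor", targetLabel: "SPEC" },
  execute: { agent: "executor", targetLabel: "CODE" },
};

// Hypothetical helper: look up the derived agent and label for a runtime mode.
function describeMode(mode: OperationalMode): ModeInfo {
  return MODE_TABLE[mode];
}
```

Keying the table by the runtime id keeps a later rename to `mode: SPEC | CODE` a change to the key type alone, with no strategy, lens, or method axis involved.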
### Critical Invariants diff --git a/src/agents/README.md b/src/agents/README.md index 745aad3e7..9d8723b89 100644 --- a/src/agents/README.md +++ b/src/agents/README.md @@ -9,11 +9,12 @@ SPEC decisions: D39-L, D40-L, D52-L, D60-L, D85-L, D90-L, D91-L, D93-L ```text agents/ ├── README.md -├── prompts/ bundled foreground/background agent body markdown +├── prompts/ flat foreground elicit/execute body markdown +├── subagents/ flat background subagent body markdown ├── skills/ strategy/lens/method prompt-resource markdown ├── runtime/ prompt composition and prompt-resource/tool legality ├── contexts/ agent-visible seed, context-tool, graph, and exchange text -├── registry.ts path registry for bundled agent bodies and prompt-resource skills +├── registry.ts path registry for foreground bodies and prompt-resource skills └── __tests__/ registry/topology tests ``` @@ -21,7 +22,8 @@ agents/ ```pseudo rules: - agents/registry.ts -> agents/prompts/*/SYSTEM.md [body file locations] + agents/registry.ts -> agents/prompts/{elicitor,executor}.md [foreground body file locations] + .pi/extensions/subagents/agents.ts -> agents/subagents/*.md [background body file locations] agents/registry.ts -> agents/skills/*/*/SKILL.md [prompt-resource locations] agents/contexts/ -> graph/, projections/, session/, workspace/ [agent-visible text over already-read facts] agents/runtime/ -> agents/registry, agents/prompts, agents/skills, session/schema @@ -33,4 +35,4 @@ rules: ## Migration note -Agent prompt bodies, prompt-resource skills, foreground roster/tool policy, capability-readiness policy, prompt composition, prompt-resource/tool legality, seed context composition, reusable agent-visible context renderers, and formerly adapter-local model-facing text live here. Pi extensions remain runtime adapters that register hooks/tools, gather data, and call this layer for Brunch-authored text. 
+Foreground prompt bodies, background subagent bodies, prompt-resource skills, foreground roster/tool policy, capability-readiness policy, prompt composition, prompt-resource/tool legality, seed context composition, reusable agent-visible context renderers, and formerly adapter-local model-facing text live here. Pi extensions remain runtime adapters that register hooks/tools, gather data, and call this layer for Brunch-authored text. diff --git a/src/agents/prompts/README.md b/src/agents/prompts/README.md index e262c02c4..eff6520c5 100644 --- a/src/agents/prompts/README.md +++ b/src/agents/prompts/README.md @@ -1,40 +1,31 @@ -# agents/prompts/ — agent role bodies +# agents/prompts/ — foreground agent bodies -SPEC decisions: D25-L, D40-L, D58-L, D85-L, D90-L, D91-L, D93-L +SPEC decisions: D25-L, D40-L, D58-L, D85-L, D90-L, D91-L, D93-L, D98-L ## Owns -Keyed foreground and background agent body resources — the markdown persona text a Brunch agent contributes to its system prompt. Background bodies intentionally stay here instead of a parallel `src/agents/subagents/` home because foreground and background manifests share the same `AgentManifest.body` file convention; spawnability is still owned by the subagent registry, not by this directory. +Flat markdown persona text for Brunch foreground operational modes. The foreground roster is code-owned in `src/agents/runtime/policy.ts`; body file locations are centralized in `src/agents/registry.ts`. 
```text prompts/ ├── README.md -├── elicitor/SYSTEM.md foreground elicit-mode body -├── executor/SYSTEM.md foreground execute-mode body -├── explorer/SYSTEM.md background codebase recon body + frontmatter -├── researcher/SYSTEM.md background web-research body + frontmatter -├── projector/SYSTEM.md background candidate-proposal body + frontmatter -├── reviewer/SYSTEM.md background proposal/commitment review body + frontmatter -└── pi-coder/SYSTEM.md future unwired coding-agent augmentation baseline +├── elicitor.md elicit runtime / target-SPEC foreground body +└── executor.md execute runtime / target-CODE foreground body ``` -This directory is markdown-only. It carries no TypeScript and registers no Pi hooks. Foreground metadata is code-owned in the op-mode-keyed foreground roster (`src/agents/runtime/policy.ts`), while body file locations are centralized in `src/agents/registry.ts`. Background metadata is authored as frontmatter but discovered only through the explicit `BACKGROUND_SUBAGENT_IDS` registry in `src/.pi/extensions/subagents/agents.ts`. +This directory is markdown-only. It carries no TypeScript and registers no Pi hooks. ## Prompt-shape decisions -- **SYSTEM.md convention is adopted:** foreground and background agent bodies use `src/agents/prompts//SYSTEM.md`. -- **Background bodies are subagent resources, not foreground prompts:** `explorer`, `researcher`, `projector`, and `reviewer` are loaded only through the explicit `BACKGROUND_SUBAGENT_IDS` registry in `src/.pi/extensions/subagents/agents.ts`; keeping their markdown beside foreground bodies is a shared body-file convention, not foreground availability. -- **Background frontmatter is authoring DX:** background `SYSTEM.md` files carry `name`/`description`/`tools`/`model`/`thinking`, but the code-owned registry decides which ids exist. Unlisted directories are not spawnable. +- **Flat foreground files are canonical:** foreground agent bodies live at `src/agents/prompts/{elicitor,executor}.md`. 
+- **Background bodies are subagent resources, not foreground prompts:** `explorer`, `researcher`, `projector`, and `reviewer` live under `src/agents/subagents/` and load only through the explicit `BACKGROUND_SUBAGENT_IDS` registry. +- **Runtime/product vocabulary stays honest:** current runtime ids are `elicit` and `execute`; target product labels are SPEC and CODE. `elicitor` is the elicit/target-SPEC foreground agent and `executor` is the execute/target-CODE foreground agent; retired orchestrator / pi-coder body aliases are not preserved. ## Does NOT own +- Background prompt bodies, frontmatter, or spawnability — `src/agents/subagents/` plus `src/.pi/extensions/subagents/`. - Foreground prompt composition, pushed seed contexts, prompt-resource manifest selection, or tool/method legality — `src/agents/runtime/` and `src/agents/contexts/seeds/`. -- Background prompt assembly and injected-world child-session wiring — `src/.pi/extensions/subagents/`. - Strategy/lens/method prompt-resource skills — `src/agents/skills/`. - Reusable model-facing context text — `src/agents/contexts/`. - Human/product-only text rendering — owned beside its product/session caller. - Pi tool definitions, lifecycle hooks, UI, and background child-session loading/running — `src/.pi/extensions/*`. - -## Migration note - -Pi extension code remains a runtime adapter: it loads foreground bodies and background agent definitions through `src/agents/registry.ts`, not through extension-local paths or directory discovery. `pi-coder` records Pi's `buildSystemPrompt` worked-example baseline while D58-L's augment-vs-replace question stays open. diff --git a/src/agents/prompts/executor.md b/src/agents/prompts/executor.md new file mode 100644 index 000000000..63a646148 --- /dev/null +++ b/src/agents/prompts/executor.md @@ -0,0 +1,5 @@ +# Agent: executor + +The executor is the foreground Brunch session agent for the current `execute` runtime mode and the target CODE product mode. 
It is a Brunch-aware coding/execution agent: read the selected spec/session context, explain what execution step is possible, and use only the tools exposed by the execute policy. + +Stay inside the current selected spec and session context. Do not call shell or file-writing tools; execute mode blocks direct `bash`, `edit`, and `write` access. This branch has no delegated workers yet, so treat `canDelegate = []` as a hard boundary. From 0fca41b802d68f8e48a84925e5fc658efdde2025 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 16:40:28 +0200 Subject: [PATCH 24/29] Lock executor prompt composition negative case --- .../executor--execute-default.md | 35 +++++++++++++++++++ src/agents/runtime/__tests__/compose.test.ts | 35 ++++++++++++++++++- 2 files changed, 69 insertions(+), 1 deletion(-) create mode 100644 src/agents/runtime/__snapshots__/executor--execute-default.md diff --git a/src/agents/runtime/__snapshots__/executor--execute-default.md b/src/agents/runtime/__snapshots__/executor--execute-default.md new file mode 100644 index 000000000..740ae4ae7 --- /dev/null +++ b/src/agents/runtime/__snapshots__/executor--execute-default.md @@ -0,0 +1,35 @@ +# Agent: executor + +Preview role body from `src/agents/prompts/executor.md`. 
+ +[Brunch agent control] +- agent: executor +- foreground role: executor (derived from op_mode=execute) +- model: default; thinking: medium +- tool authority: execute executor read-only plus a code-owned stub tool; direct shell and file writes are blocked +- active tools: read, grep, find, ls, orchestrator_stub + +[Brunch runtime state] +- op_mode: execute +- prompt strategy resource: auto +- prompt lens resource: auto +- spec: COMPOSE Preview Spec (#101), readiness estimate (soft; gates nothing): grounding=0.00, elicitation=0.00, projection=0.00, commitment=0.00 +- workspace: /work/brunch-preview +- workspace posture: certainty=proving; stakes=high; audience=internal; horizon=current-milestone; migration=free-rewrite; dependencies=resist + +[Brunch elicitation recommendation] +- next question: What should Brunch know about the constraint before proceeding? +- refers to: constraint +- rationale: Constraints bound the solution space; an unestablished constraint undermines proposal legality. + +[Brunch pushed context] +- handles: none pushed +- rendered context blocks: none pushed + +[Brunch prompt-resource routing] +- Use only resources advertised in ; do not infer availability from the filesystem. +- Strategy and lens names are prompt-resource routing hints, not user-changeable session identity or stored foreground-agent roles. +- When AUTO exposes several strategy or lens resources, choose at most one advertised resource of each kind, then read the selected resource before applying detailed behavior. +- Methods compose freely when advertised; read a method skill when that mechanism is relevant to the next turn. +- For code-selected singleton resources, that singleton is the selected resource. +- Current prompt-resource selection: strategy=auto; lens=auto. 
\ No newline at end of file diff --git a/src/agents/runtime/__tests__/compose.test.ts b/src/agents/runtime/__tests__/compose.test.ts index 8f08eabe7..abf70ce22 100644 --- a/src/agents/runtime/__tests__/compose.test.ts +++ b/src/agents/runtime/__tests__/compose.test.ts @@ -434,7 +434,7 @@ function composePreviewPrompt(input: Partial = {}): str workspace: previewWorkspace, activeTools: ['read', 'grep', 'find', 'ls', 'present_question', 'request_response'], gaps: previewFloorGaps(0), - agentBody: '# Agent: elicitor\n\nPreview role body from `src/agents/prompts/elicitor/SYSTEM.md`.', + agentBody: '# Agent: elicitor\n\nPreview role body from `src/agents/prompts/elicitor.md`.', ...input, }).prompt; } @@ -499,6 +499,39 @@ describe('composeAgentPrompt previews', () => { expect(rendered).toContain('design'); }); + it('executor--execute-default: executor prompt omits elicitor-only guidance', async () => { + const rendered = normalizeRepoPaths( + composePreviewPrompt({ + agentId: 'executor', + sessionState: projectBrunchAgentState([ + { + type: 'custom', + customType: 'brunch.agent_runtime_state', + data: { + schemaVersion: 1, + reason: 'switch', + source: 'user', + state: { + ...DEFAULT_BRUNCH_AGENT_STATE, + operationalMode: 'execute', + }, + }, + }, + ]), + activeTools: ['read', 'grep', 'find', 'ls', 'orchestrator_stub'], + agentBody: '# Agent: executor\n\nPreview role body from `src/agents/prompts/executor.md`.', + }), + ); + + await expect(rendered).toMatchFileSnapshot('../__snapshots__/executor--execute-default.md'); + expect(rendered).toContain('# Agent: executor'); + expect(rendered).toContain('- op_mode: execute'); + expect(rendered).not.toContain('[Brunch elicitation recommendation]'); + expect(rendered).not.toContain('[Brunch prompt-resource routing]'); + expect(rendered).not.toContain(''); + expect(rendered).not.toContain('Current prompt-resource selection'); + }); + it('elicitor--pushed-context: fixture handles and rendered contexts present', async () => { const 
rendered = normalizeRepoPaths( composePreviewPrompt({ From 0442f2897d0ed3f15115a363e341c78078418867 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 16:41:02 +0200 Subject: [PATCH 25/29] Gate prompt sections by foreground role --- .../__snapshots__/executor--execute-default.md | 15 +-------------- src/agents/runtime/compose.ts | 10 +++++++--- 2 files changed, 8 insertions(+), 17 deletions(-) diff --git a/src/agents/runtime/__snapshots__/executor--execute-default.md b/src/agents/runtime/__snapshots__/executor--execute-default.md index 740ae4ae7..bd4900c35 100644 --- a/src/agents/runtime/__snapshots__/executor--execute-default.md +++ b/src/agents/runtime/__snapshots__/executor--execute-default.md @@ -17,19 +17,6 @@ Preview role body from `src/agents/prompts/executor.md`. - workspace: /work/brunch-preview - workspace posture: certainty=proving; stakes=high; audience=internal; horizon=current-milestone; migration=free-rewrite; dependencies=resist -[Brunch elicitation recommendation] -- next question: What should Brunch know about the constraint before proceeding? -- refers to: constraint -- rationale: Constraints bound the solution space; an unestablished constraint undermines proposal legality. - [Brunch pushed context] - handles: none pushed -- rendered context blocks: none pushed - -[Brunch prompt-resource routing] -- Use only resources advertised in ; do not infer availability from the filesystem. -- Strategy and lens names are prompt-resource routing hints, not user-changeable session identity or stored foreground-agent roles. -- When AUTO exposes several strategy or lens resources, choose at most one advertised resource of each kind, then read the selected resource before applying detailed behavior. -- Methods compose freely when advertised; read a method skill when that mechanism is relevant to the next turn. -- For code-selected singleton resources, that singleton is the selected resource. 
-- Current prompt-resource selection: strategy=auto; lens=auto. \ No newline at end of file +- rendered context blocks: none pushed \ No newline at end of file diff --git a/src/agents/runtime/compose.ts b/src/agents/runtime/compose.ts index 7a9774148..cf16a56f2 100644 --- a/src/agents/runtime/compose.ts +++ b/src/agents/runtime/compose.ts @@ -40,10 +40,10 @@ export function composeAgentPrompt(input: ComposeAgentPromptInput): ComposeAgent input.agentBody ?? '', renderAgentControl(input, definition), renderRuntimeState(input), - renderElicitationRecommendation(input), + renderElicitorOnlySection(input, renderElicitationRecommendation(input)), renderPushedContext(input.context), - renderBrunchSkills(manifests), - renderRouterRules(input.sessionState), + renderElicitorOnlySection(input, renderBrunchSkills(manifests)), + renderElicitorOnlySection(input, renderRouterRules(input.sessionState)), ]); return { prompt, manifests }; @@ -84,6 +84,10 @@ function renderPosture(posture: AgentPromptWorkspaceContext['posture']): string return entries.length > 0 ? entries.map(([key, value]) => `${key}=${value}`).join('; ') : 'unrecorded'; } +function renderElicitorOnlySection(input: ComposeAgentPromptInput, section: string): string { + return input.sessionState.agentRole === 'elicitor' ? 
section : ''; +} + function renderElicitationRecommendation(input: ComposeAgentPromptInput): string { const gap = selectElicitationGap(input.gaps, input.sessionState); if (!gap) return ''; From f52d2f3d8700a01ec54ee89b80a1e6fd97dbe3ba Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 16:41:13 +0200 Subject: [PATCH 26/29] Flatten agent prompt topology --- memory/PLAN.md | 17 +-- ...lden-coverage--prompt-subagent-topology.md | 111 ------------------ .../extensions/__tests__/subagents.test.ts | 8 +- src/.pi/extensions/subagents/README.md | 14 +-- src/.pi/extensions/subagents/agents.ts | 12 +- src/agents/__tests__/registry.test.ts | 14 +-- .../{elicitor/SYSTEM.md => elicitor.md} | 0 src/agents/prompts/executor/SYSTEM.md | 5 - src/agents/prompts/pi-coder/SYSTEM.md | 18 --- src/agents/registry.ts | 14 +-- .../elicitor--auto-floor-gaps-open.md | 2 +- .../elicitor--auto-high-coverage.md | 2 +- .../elicitor--pinned-strategy-lens.md | 2 +- .../__snapshots__/elicitor--pushed-context.md | 2 +- src/agents/runtime/__tests__/state.test.ts | 2 +- src/agents/skills/README.md | 2 +- src/agents/subagents/README.md | 33 ++++++ .../SYSTEM.md => subagents/explorer.md} | 0 .../SYSTEM.md => subagents/projector.md} | 0 .../SYSTEM.md => subagents/researcher.md} | 0 .../SYSTEM.md => subagents/reviewer.md} | 0 src/treedocs.yaml | 22 ++-- 22 files changed, 78 insertions(+), 202 deletions(-) delete mode 100644 memory/cards/renderer-golden-coverage--prompt-subagent-topology.md rename src/agents/prompts/{elicitor/SYSTEM.md => elicitor.md} (100%) delete mode 100644 src/agents/prompts/executor/SYSTEM.md delete mode 100644 src/agents/prompts/pi-coder/SYSTEM.md create mode 100644 src/agents/subagents/README.md rename src/agents/{prompts/explorer/SYSTEM.md => subagents/explorer.md} (100%) rename src/agents/{prompts/projector/SYSTEM.md => subagents/projector.md} (100%) rename src/agents/{prompts/researcher/SYSTEM.md => subagents/researcher.md} (100%) rename 
src/agents/{prompts/reviewer/SYSTEM.md => subagents/reviewer.md} (100%) diff --git a/memory/PLAN.md b/memory/PLAN.md index 06acb9ec7..65f88fec3 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -48,7 +48,7 @@ Brunch-next has delivered the original composition spine: the host, sealed Pi pr - **Done-definition:** all three capabilities carry promoted real-model evidence; no capability remains a stub or a method-less axis member. Open follow-ups (A32-L fan-in completion, the A1 anti-prompt) are tracked on their assumptions, not as arc blockers. - **Anchors:** D95-L, D96-L; A31-L–A35-L; I51-L. -### context-pipeline — ◐ active +### context-pipeline — ✓ done (2026-06-26) - **Goal:** lock the PULL → PROJECT → RENDER → COMPOSE context pipeline (D60-L). @@ -56,10 +56,10 @@ Brunch-next has delivered the original composition spine: the host, sealed Pi pr context-pipeline/ ├── PULL graph + session reads ✓ done ├── PROJECT projections/ ✓ done -├── RENDER agents/contexts + local human outputs ◐ open: prompt/subagent topology flattening -└── COMPOSE system-prompts + skills ✓ done* +├── RENDER agents/contexts + local human outputs ✓ done +└── COMPOSE system-prompts + skills ✓ done -*COMPOSE's deferred full-stack real-rendered-context tripwire is discharged, but RENDER remains open until prompt/subagent resource topology matches the accepted `src/agents/` shape. +Foreground prompt bodies are flat under `src/agents/prompts/{elicitor,executor}.md`; background subagent bodies are flat under `src/agents/subagents/{explorer,researcher,projector,reviewer}.md`; the old nested prompt-body convention is retired from loaders, docs, tests, and package asset copying. 
``` - **Done-definition:** every pipeline stage closed; COMPOSE's full-stack tripwire discharged by RENDER; foreground prompt bodies flattened under `src/agents/prompts/{elicitor,executor}.md`; background subagent bodies flattened under `src/agents/subagents/{explorer,researcher,projector,reviewer}.md`; no stale `prompts//SYSTEM.md` convention remains in docs, tests, or packaged asset copying. @@ -69,10 +69,11 @@ context-pipeline/ ### Active -- `renderer-golden-coverage` (FE-1091) — active continuation for prompt/subagent topology flattening after the render/prompt lock sweep. Prior sweep rows closed renderer and assembly evidence, but the accepted topology still requires flat foreground prompt files and a separate flat subagent resource home. +- _None._ ### Recently Completed +- 2026-06-26 `renderer-golden-coverage` (FE-1091) — **context pipeline done.** The final topology slice flattened foreground prompt bodies to `src/agents/prompts/{elicitor,executor}.md`, moved background bodies to `src/agents/subagents/{explorer,researcher,projector,reviewer}.md`, retired nested prompt-body directories and the unwired `pi-coder` body, updated explicit registries/loaders and packaged asset copying, and reconciled `src/agents/` / prompt / subagent topology READMEs. - 2026-06-26 `data-model-legibility` (FE-1090) — **reference substrate complete.** Generated ontology tables are materialized from typed graph sources with `check:data-model`; authored graph-authoring heuristics are cited by `capture` + `commit-graph`; the final checkability/subtype audit closed with no schema/runtime expansion: progressive checkability is accepted only as skill-local oracle conduct, `checkability`/`strength` fields are rejected carrying cost, subtype enums are rejected as parallel ontology, and `detail.form` remains inert payload plus renderer hook. 
- 2026-06-25 `elicitor-generate` (FE-1059) — **generate capability done through promoted A31-L fan-out evidence.** Built slices: `present_candidates` tool/projection/renderer + pick path; intent/design/oracle facets under one plane-parameterized `generate-proposal` method; progressive-disclosure references; real-boot activation check; and real-model fan-out witness harness. Promoted run `.fixtures/runs/generate-fan-out/2026-06-24T16-51-13-704Z/` passed with `openai-codex/gpt-5.5`: oracle lens pinned, `SKILL.md` and `references/oracle.md` read, `present_candidates` emitted, no pre-prompt kick, no graph delta, no `mutate_graph`, and no approved review result. A32-L fan-in completion and the A1 anti-prompt remain follow-ups, not branch debt. - 2026-06-24 `subagent-reconciliation` (FE-1054) — foreground/background reconciliation complete through the execute-mode readiness target (D90-L-D93-L/I49-L): shared `AgentManifest`, code-owned background discovery, semi-permeable injected-world child sessions, sovereign grants gated by code-owned `canDelegate`, return rendering, and live `execute` -> `orchestrator` mode with a product-registered stub tool. `code` -> `pi-coder` remains future work. @@ -161,8 +162,8 @@ context-pipeline/ - **Linear:** [FE-1091](https://linear.app/hash/issue/FE-1091/renderer-golden-coverage-and-prompt-assembly-lock) - **Branch:** `ln/fe-1091-renderer-golden-coverage-and-prompt-assembly-lock` - **Kind:** coverage + build / hardening -- **Status:** active. The render/prompt sweep ledger was exhausted for renderer and assembly evidence, but topology closure remains: foreground prompt bodies must flatten to `src/agents/prompts/{elicitor,executor}.md`, and background subagent bodies must flatten to `src/agents/subagents/{explorer,researcher,projector,reviewer}.md` rather than remaining under nested `prompts//SYSTEM.md` directories. -- **Current execution pointer:** `memory/cards/renderer-golden-coverage--prompt-subagent-topology.md`. +- **Status:** done. 
The render/prompt sweep ledger closed renderer and assembly evidence, and the final topology slice flattened foreground prompt bodies to `src/agents/prompts/{elicitor,executor}.md` and background subagent bodies to `src/agents/subagents/{explorer,researcher,projector,reviewer}.md`. +- **Current execution pointer:** none; scope file consumed and retired. - **Certainty:** earned — RENDER topology is now established; this frontier closed coverage, prompt assembly evidence, and stale topology ambiguity rather than proving a new seam. - **Closes:** context-pipeline RENDER stage plus the COMPOSE full-stack real-rendered-context tripwire. - **Locks in:** D83-L house style for model-facing context surfaces and prompt assembly as a golden/semantic-invariant surface. @@ -187,7 +188,7 @@ context-pipeline/ frontiers: Active: renderer-golden-coverage - status: active (RENDER coverage + prompt assembly lock) + status: done (RENDER coverage + prompt assembly lock + prompt/subagent topology flattening) depends_on: context-pipeline PULL+PROJECT, D83-L, D52-L, D58-L, D98-L coordinates_with: data-model-legibility (references substrate), elicitor-generate (present_candidates render already landed in house style) diff --git a/memory/cards/renderer-golden-coverage--prompt-subagent-topology.md b/memory/cards/renderer-golden-coverage--prompt-subagent-topology.md deleted file mode 100644 index ebc67cd0d..000000000 --- a/memory/cards/renderer-golden-coverage--prompt-subagent-topology.md +++ /dev/null @@ -1,111 +0,0 @@ -# Prompt and subagent topology flattening - -Frontier: renderer-golden-coverage -Status: active -Mode: single -Created: 2026-06-26 - -## Orientation - -- Containing seam: `src/agents/` prompt-resource topology inside FE-1091 / `renderer-golden-coverage`; this is the remaining RENDER/COMPOSE closure after the renderer/assembly evidence sweep. 
-- The prior sweep locked current behavior but stopped short of the user's accepted topology: foreground prompts must be flat files under `prompts/`, and subagent bodies must have their own flat `subagents/` resource home. -- Main open risk: code/tests currently encode the nested `prompts//SYSTEM.md` convention and package copying likely follows that tree. This slice should change the canonical paths directly, not add aliases or compatibility readers. -- Cross-cutting obligations: preserve D39-L sealed/code-owned resource lists, D58-L thin composition, D90-L shared `AgentManifest` model, D91-L assembled subagent prompts, and D98-L SPEC/CODE foreground role vocabulary. - -Posture: earned (inherited from `renderer-golden-coverage`). - -## Target Behavior - -Brunch agent body resources use the accepted flat topology: foreground prompts at `src/agents/prompts/{elicitor,executor}.md` and background subagents at `src/agents/subagents/{explorer,researcher,projector,reviewer}.md`. - -## Full-card cold-start reads - -- `memory/SPEC.md` — D44-L, D58-L, D85-L, D90-L, D91-L, D92-L, D93-L, D98-L; I29-L -- `memory/PLAN.md` — frontier: `renderer-golden-coverage` -- `src/agents/README.md` — current `src/agents/` ownership and topology -- `src/agents/prompts/README.md` — foreground prompt ownership to update -- `src/.pi/extensions/subagents/README.md` — subagent loading/assembly topology to update -- `src/agents/registry.ts` and `src/.pi/extensions/subagents/agents.ts` — current path registries/loaders -- `package.json` — `build:pi-assets` prompt/subagent asset copying - -## Boundary Crossings - -```text -→ src/agents/registry.ts foreground body path registry -→ src/agents/runtime/policy.ts / state.ts foreground body lookup -→ src/.pi/extensions/agent-runtime/system-prompts foreground adapter -→ src/.pi/extensions/subagents/agents.ts background body loader -→ src/.pi/extensions/subagents/session.ts / prompt-assembly.ts child prompt assembly -→ package asset copy -→ topology docs 
and tests -``` - -## Risks and Assumptions - -- RISK: hidden tests or build assets still expect `prompts//SYSTEM.md` directories → MITIGATION: search the repo for `SYSTEM.md`, `prompts/`, and each concrete old path; update package asset copying and topology tests in the same slice. -- RISK: flattening subagents could accidentally make them foreground prompt resources → MITIGATION: keep explicit `BACKGROUND_SUBAGENT_IDS` loading and foreground `BUNDLED_AGENT_BODY_IDS` / policy lists separate; assert subagent files are not in the foreground prompt list. -- ASSUMPTION: no external packaged consumer depends on the nested prompt asset layout. - → IMPACT IF FALSE: this would need a migration bridge in packaged assets. - → VALIDATE: pre-release/free-rewrite posture plus package-local tests; do not preserve old paths unless a build/runtime test proves an atomic update is impossible. - -## Posture check - -Earned closure target: - -- **Canonicalizes** prompt-resource locations to the user's accepted topology. -- **Deletes / retires** nested `SYSTEM.md` body directories for agent bodies. -- **Materializes** separate `prompts/` foreground and `subagents/` background homes into filesystem topology, READMEs, registry tests, and package asset copying. -- **Locks in** no stale `prompts//SYSTEM.md` convention in docs/tests/build scripts. - -## Acceptance Criteria - -✓ Foreground prompt body tests — `elicitor` and `executor` load from `src/agents/prompts/elicitor.md` and `src/agents/prompts/executor.md`; old nested foreground paths are absent. -✓ Subagent loader tests — `explorer`, `researcher`, `projector`, and `reviewer` load from `src/agents/subagents/.md`; planted unlisted subagent files remain unspawnable. -✓ Registry/topology tests — foreground body ids exclude background subagents; subagent registry owns background ids; docs name the split. -✓ Build asset check — `build:pi-assets` copies flat foreground prompt files and flat subagent files into the corresponding dist homes. 
-✓ Repo search invariant — no canonical doc/test/source path still presents `src/agents/prompts//SYSTEM.md` as the live convention. - -## Verification Approach - -- Inner: targeted Vitest over `src/agents`, `src/agents/runtime`, and `src/.pi/extensions/__tests__/subagents.test.ts` — proves loaders, policy paths, and prompt assembly still work. -- Inner: `npm run check` — catches stale docs/format/skill/data-model drift after path edits. -- Gate: `npm run verify` — required before committing because package asset copying and build output are touched. - -## Cross-cutting obligations - -- D39-L: explicit/code-owned resource paths only; no directory discovery except explicit registry ids. -- D58-L: prompt composition remains thin; this slice moves files and loaders, not prompt behavior. -- D90-L/D91-L: foreground/background share manifest shape while retaining distinct homes and execution authority. -- D98-L: foreground vocabulary is SPEC/elicitor and CODE/executor; do not preserve orchestrator/pi-coder as product-role aliases. - -## Expected touched paths (tentative) - -```text -memory/cards/ -└── renderer-golden-coverage--prompt-subagent-topology.md + -memory/PLAN.md ~ -memory/SPEC.md ~ -package.json ~ -src/agents/ -├── README.md ~ -├── registry.ts ~ -├── __tests__/ -│ └── registry.test.ts ~ -├── prompts/ -│ ├── README.md ~ -│ ├── elicitor.md + -│ ├── executor.md + -│ ├── elicitor/ - -│ └── executor/ - -├── subagents/ -│ ├── README.md + -│ ├── explorer.md + -│ ├── researcher.md + -│ ├── projector.md + -│ └── reviewer.md + -└── runtime/ ? -src/.pi/extensions/subagents/ -├── README.md ~ -├── agents.ts ~ -└── tests ? 
-``` diff --git a/src/.pi/extensions/__tests__/subagents.test.ts b/src/.pi/extensions/__tests__/subagents.test.ts index d6ab2ce6f..29ef7daec 100644 --- a/src/.pi/extensions/__tests__/subagents.test.ts +++ b/src/.pi/extensions/__tests__/subagents.test.ts @@ -1,4 +1,4 @@ -import { mkdir, mkdtemp, writeFile } from 'node:fs/promises'; +import { mkdtemp, writeFile } from 'node:fs/promises'; import { tmpdir } from 'node:os'; import { join } from 'node:path'; @@ -132,11 +132,9 @@ describe('loadSubagentDefinitions (bundled agents)', () => { it('loads only the explicit registry ids and ignores planted unlisted SYSTEM.md files', async () => { const dir = await mkdtemp(join(tmpdir(), 'brunch-subagent-registry-')); - await mkdir(join(dir, 'explorer')); - await writeFile(join(dir, 'explorer', 'SYSTEM.md'), EXPLORER_MD); - await mkdir(join(dir, 'ghost')); + await writeFile(join(dir, 'explorer.md'), EXPLORER_MD); await writeFile( - join(dir, 'ghost', 'SYSTEM.md'), + join(dir, 'ghost.md'), '---\nname: ghost\ndescription: Should not load\ntools: bash\n---\nYou should not see me.', ); diff --git a/src/.pi/extensions/subagents/README.md b/src/.pi/extensions/subagents/README.md index f72af6c24..29483ef13 100644 --- a/src/.pi/extensions/subagents/README.md +++ b/src/.pi/extensions/subagents/README.md @@ -93,12 +93,12 @@ context that crosses back to the parent; structured `details` remain render-only | File | Responsibility | | --- | --- | -| [`agents.ts`](./agents.ts) | Markdown agent loader: tiny frontmatter parser (no YAML dep), TypeBox-validated schema (`name`, `description`, `tools`, `model`, `thinking`), explicit `BACKGROUND_SUBAGENT_IDS` registry, `loadSubagentDefinitions(dir, ids?)` over `src/agents/prompts//SYSTEM.md` → `Map`. Projects frontmatter into the shared `AgentManifest` background shape and fails loud on malformed/duplicate/id-drifted agents. 
| +| [`agents.ts`](./agents.ts) | Markdown agent loader: tiny frontmatter parser (no YAML dep), TypeBox-validated schema (`name`, `description`, `tools`, `model`, `thinking`), explicit `BACKGROUND_SUBAGENT_IDS` registry, `loadSubagentDefinitions(dir, ids?)` over `src/agents/subagents/.md` → `Map`. Projects frontmatter into the shared `AgentManifest` background shape and fails loud on malformed/duplicate/id-drifted agents. | | [`config.ts`](./config.ts) | TypeBox loader for [`config.json`](./config.json) (`version`, `maxConcurrency`; tolerates `$comment`). | | [`prompt-assembly.ts`](./prompt-assembly.ts) | Background prompt assembler: agent body + child-control header + injected world snapshot + `` + background router rules. Reuses the shared prompt-skill manifest renderer; deliberately omits the foreground elicitation recommendation block. | | [`session.ts`](./session.ts) | The sealed child-session runner. `resolveSubagentModel`, `createSubagentToolCatalog`, `planSubagentTools`, `runSubagent`. The catalog is the shared source that resolves sovereign manifest-authored grants. Never throws — failures return as error results. **Injectable SDK builders** (`createServices`/`createSession`) for testing. | | [`index.ts`](./index.ts) | `registerBrunchSubagents(pi, deps)` — registers the one `subagent` tool (single `{agent,task}` or parallel `{tasks:[…]}`), filters advertisement/execution to `definitions ∩ deps.delegatableAgents`, `createSemaphore` for bounded concurrency, result formatting. Re-exports the public surface. | -| [`../../../agents/prompts//SYSTEM.md`](../../../agents/prompts) | Declarative foreground/background agent body home. Background bodies carry frontmatter; `agents.ts` loads only registry-listed ids. | +| [`../../../agents/subagents/.md`](../../../agents/subagents) | Declarative background agent body home. Background bodies carry frontmatter; `agents.ts` loads only registry-listed ids. 
| | [`config.json`](./config.json) | Externalized concurrency cap (`maxConcurrency: 4`). | | [`subagents.test.ts`](./subagents.test.ts) | Tests parsing, config, model resolution, tool planning, semaphore fairness, registrar usage errors, abort lifecycle, and **two end-to-end faux-provider child-session runs** asserting the sealing invariants. | | [`../../../app/pi-subagents.ts`](../../../app/pi-subagents.ts) | **App composition root.** `loadBrunchSubagents({cwd, agentDir, delegatableAgents, world})` assembles `BrunchSubagentsDeps` using the sealed `pi-settings` helpers plus explicit parent-world handles and the code-owned op-mode delegatable set. Keeps `.pi/` free of `src/app` imports (deps are injected). | @@ -107,14 +107,14 @@ Boundary rule: `.pi/extensions/subagents/*` may import the SDK and `../web-tools (for `web_search`/`web_fetch`), but **never** `src/app/*`. The app layer injects the sealed primitives. -## Agent definitions (`src/agents/prompts//SYSTEM.md`) +## Agent definitions (`src/agents/subagents/.md`) Frontmatter is the background-agent authoring contract; the code-owned `BACKGROUND_SUBAGENT_IDS` list is the registry. The markdown body is the first section of the child's assembled system prompt, replacing Pi's coding base. -Foreground bodies share the same `/SYSTEM.md` home but their metadata is -owned by the op-mode keyed foreground roster. `canDelegate` is not a background -frontmatter field; background manifests project it to `[]`. +Foreground bodies live separately as flat files under `src/agents/prompts/` and +their metadata is owned by the op-mode keyed foreground roster. `canDelegate` is +not a background frontmatter field; background manifests project it to `[]`. 
```yaml --- @@ -193,7 +193,7 @@ and carry a depth/allowlist bound; pairs naturally with the future write-capable | Aspect | Original | Brunch (this) | | --- | --- | --- | -| Agent discovery | Bundled `agents/*.md` beside `index.ts` **+** `globalThis.__pi_subagents` runtime bridge for other extensions | Unified `src/agents/prompts//SYSTEM.md` home via explicit `BACKGROUND_SUBAGENT_IDS` → `loadSubagentDefinitions(dir, ids?)`; **no** bridge, **no** ambient `~/.pi` scan, and no directory scan | +| Agent discovery | Bundled `agents/*.md` beside `index.ts` **+** `globalThis.__pi_subagents` runtime bridge for other extensions | Flat `src/agents/subagents/.md` home via explicit `BACKGROUND_SUBAGENT_IDS` → `loadSubagentDefinitions(dir, ids?)`; **no** bridge, **no** ambient `~/.pi` scan, and no directory scan | | Frontmatter | Loose: string split + silent defaults; extra `subagent_agents` allowlist; `model` default `anthropic/claude-sonnet-4-6` | Strict TypeBox schema, **fails loud**; no `subagent_agents` (no nesting); `model: default` inherits parent | | Execution | `spawn()` a child `pi` process (`--mode json -p --no-session --no-skills --no-extensions`, re-adds `--extension` paths, `--append-system-prompt` temp file) | In-process SDK `AgentSession` with sealed services | | Isolation basis | OS process boundary + flags; depends on a resolvable `pi` binary on PATH | Sealed in-memory services; no binary, no ambient leakage | diff --git a/src/.pi/extensions/subagents/agents.ts b/src/.pi/extensions/subagents/agents.ts index ab39e8c55..25ca3c911 100644 --- a/src/.pi/extensions/subagents/agents.ts +++ b/src/.pi/extensions/subagents/agents.ts @@ -1,8 +1,8 @@ /** * Subagent agent definitions (D44-L / D90-L). * - * Background agents are declarative SYSTEM.md files under the shared - * `src/agents/prompts//` body home. Each file carries a small frontmatter block + * Background agents are declarative markdown files under `src/agents/subagents/`. 
+ * Each file carries a small frontmatter block * plus a system-prompt body. The frontmatter is the registry contract; the body * is the subagent's standing instructions and the first section of the assembled * child prompt. Frontmatter is validated through a TypeBox schema (D41-L) so a @@ -16,11 +16,11 @@ import { readFile } from 'node:fs/promises'; import { join } from 'node:path'; +import { fileURLToPath } from 'node:url'; import { Type } from 'typebox'; import { Value } from 'typebox/value'; -import { bundledAgentBodyHome } from '../../../agents/registry.js'; import type { BackgroundAgentManifest } from '../../../session/schema/agent-manifest.js'; export const BACKGROUND_SUBAGENT_IDS = ['explorer', 'researcher', 'projector', 'reviewer'] as const; @@ -139,9 +139,9 @@ export function parseSubagentMarkdown( }; } -/** Filesystem location of the unified bundled agent body home. */ +/** Filesystem location of the bundled background subagent body home. */ export function subagentAgentsDir(): string { - return bundledAgentBodyHome(); + return fileURLToPath(new URL('../../../agents/subagents', import.meta.url)); } /** @@ -155,7 +155,7 @@ export async function loadSubagentDefinitions( ): Promise> { const definitions = new Map(); for (const id of ids) { - const file = join(id, 'SYSTEM.md'); + const file = `${id}.md`; const source = await readFile(join(dir, file), 'utf8'); const definition = parseSubagentMarkdown(source, { sourcePath: file }); if (definition.name !== id) { diff --git a/src/agents/__tests__/registry.test.ts b/src/agents/__tests__/registry.test.ts index 84731c2f1..f135cf350 100644 --- a/src/agents/__tests__/registry.test.ts +++ b/src/agents/__tests__/registry.test.ts @@ -11,17 +11,9 @@ import { describe('agent context registry', () => { it('centralizes bundled prompt and current skill paths', () => { - expect(BUNDLED_AGENT_BODY_IDS).toEqual([ - 'elicitor', - 'executor', - 'explorer', - 'researcher', - 'projector', - 'reviewer', - 'pi-coder', - ]); - 
expect(bundledAgentBodyRepoPath('elicitor')).toBe('src/agents/prompts/elicitor/SYSTEM.md'); - expect(bundledAgentBodyLocation('reviewer')).toMatch(/src\/agents\/prompts\/reviewer\/SYSTEM\.md$/); + expect(BUNDLED_AGENT_BODY_IDS).toEqual(['elicitor', 'executor']); + expect(bundledAgentBodyRepoPath('elicitor')).toBe('src/agents/prompts/elicitor.md'); + expect(bundledAgentBodyLocation('executor')).toMatch(/src\/agents\/prompts\/executor\.md$/); expect(bundledAgentBodyHome()).toMatch(/src\/agents\/prompts$/); expect(promptResourceLocation('methods', 'generate-proposal')).toMatch( /src\/agents\/skills\/methods\/generate-proposal\/SKILL\.md$/, diff --git a/src/agents/prompts/elicitor/SYSTEM.md b/src/agents/prompts/elicitor.md similarity index 100% rename from src/agents/prompts/elicitor/SYSTEM.md rename to src/agents/prompts/elicitor.md diff --git a/src/agents/prompts/executor/SYSTEM.md b/src/agents/prompts/executor/SYSTEM.md deleted file mode 100644 index d1e349842..000000000 --- a/src/agents/prompts/executor/SYSTEM.md +++ /dev/null @@ -1,5 +0,0 @@ -# Agent: executor - -The executor is the foreground Brunch session agent for execute mode. In this branch it proves the execute-mode path by calling the code-owned `orchestrator_stub` tool and reporting its deterministic output. - -Stay inside the current selected spec and session context. Do not call shell or file-writing tools; execute mode blocks direct `bash`, `edit`, and `write` access. This branch has no delegated workers yet, so treat `canDelegate = []` as a hard boundary and use the stub tool directly for the standup proof. diff --git a/src/agents/prompts/pi-coder/SYSTEM.md b/src/agents/prompts/pi-coder/SYSTEM.md deleted file mode 100644 index 90c5d12b9..000000000 --- a/src/agents/prompts/pi-coder/SYSTEM.md +++ /dev/null @@ -1,18 +0,0 @@ -You are an expert coding assistant operating inside *brunch*, a software-specification-elicitation and -plan-execution agent harness, that also happens to be a coding agent harness. 
- -You help users by reading files, executing commands, editing code, and writing -new files. - -Available tools: -{visible selected tools, formatted as "- name: one-line snippet"; "(none)" when -no selected tool has a snippet} - -In addition to the tools above, you may have access to other custom tools -depending on the project. - -Guidelines: -- Use bash for file operations like ls, rg, find - {included when bash is selected without grep/find/ls tool aliases} -- {caller-provided prompt guidelines, trimmed and de-duplicated} -- Be concise in your responses -- Show file paths clearly when working with files diff --git a/src/agents/registry.ts b/src/agents/registry.ts index c9e796072..2732a5c89 100644 --- a/src/agents/registry.ts +++ b/src/agents/registry.ts @@ -1,14 +1,6 @@ import { fileURLToPath } from 'node:url'; -export const BUNDLED_AGENT_BODY_IDS = [ - 'elicitor', - 'executor', - 'explorer', - 'researcher', - 'projector', - 'reviewer', - 'pi-coder', -] as const; +export const BUNDLED_AGENT_BODY_IDS = ['elicitor', 'executor'] as const; export type BundledAgentBodyId = (typeof BUNDLED_AGENT_BODY_IDS)[number]; export type PromptResourceFamily = 'strategies' | 'lenses' | 'methods'; @@ -20,11 +12,11 @@ export function bundledAgentBodyHome(): string { /** Repo-relative path used by manifest bodies that are read later by the Pi runtime. */ export function bundledAgentBodyRepoPath(id: BundledAgentBodyId): string { - return `src/agents/prompts/${id}/SYSTEM.md`; + return `src/agents/prompts/${id}.md`; } export function bundledAgentBodyLocation(id: BundledAgentBodyId): string { - return fileURLToPath(new URL(`./prompts/${id}/SYSTEM.md`, import.meta.url)); + return fileURLToPath(new URL(`./prompts/${id}.md`, import.meta.url)); } /** Agent directory passed to Pi's Agent Skills loader for Brunch prompt resources. 
*/ diff --git a/src/agents/runtime/__snapshots__/elicitor--auto-floor-gaps-open.md b/src/agents/runtime/__snapshots__/elicitor--auto-floor-gaps-open.md index 639e8c0a9..96c26de21 100644 --- a/src/agents/runtime/__snapshots__/elicitor--auto-floor-gaps-open.md +++ b/src/agents/runtime/__snapshots__/elicitor--auto-floor-gaps-open.md @@ -1,6 +1,6 @@ # Agent: elicitor -Preview role body from `src/agents/prompts/elicitor/SYSTEM.md`. +Preview role body from `src/agents/prompts/elicitor.md`. [Brunch agent control] - agent: elicitor diff --git a/src/agents/runtime/__snapshots__/elicitor--auto-high-coverage.md b/src/agents/runtime/__snapshots__/elicitor--auto-high-coverage.md index ffa0b169e..2c99452bc 100644 --- a/src/agents/runtime/__snapshots__/elicitor--auto-high-coverage.md +++ b/src/agents/runtime/__snapshots__/elicitor--auto-high-coverage.md @@ -1,6 +1,6 @@ # Agent: elicitor -Preview role body from `src/agents/prompts/elicitor/SYSTEM.md`. +Preview role body from `src/agents/prompts/elicitor.md`. [Brunch agent control] - agent: elicitor diff --git a/src/agents/runtime/__snapshots__/elicitor--pinned-strategy-lens.md b/src/agents/runtime/__snapshots__/elicitor--pinned-strategy-lens.md index 9bc61e7c9..c7d0df15e 100644 --- a/src/agents/runtime/__snapshots__/elicitor--pinned-strategy-lens.md +++ b/src/agents/runtime/__snapshots__/elicitor--pinned-strategy-lens.md @@ -1,6 +1,6 @@ # Agent: elicitor -Preview role body from `src/agents/prompts/elicitor/SYSTEM.md`. +Preview role body from `src/agents/prompts/elicitor.md`. [Brunch agent control] - agent: elicitor diff --git a/src/agents/runtime/__snapshots__/elicitor--pushed-context.md b/src/agents/runtime/__snapshots__/elicitor--pushed-context.md index d64ae6db6..36d471c0c 100644 --- a/src/agents/runtime/__snapshots__/elicitor--pushed-context.md +++ b/src/agents/runtime/__snapshots__/elicitor--pushed-context.md @@ -1,6 +1,6 @@ # Agent: elicitor -Preview role body from `src/agents/prompts/elicitor/SYSTEM.md`. 
+Preview role body from `src/agents/prompts/elicitor.md`. [Brunch agent control] - agent: elicitor diff --git a/src/agents/runtime/__tests__/state.test.ts b/src/agents/runtime/__tests__/state.test.ts index 514a16458..ff3221f78 100644 --- a/src/agents/runtime/__tests__/state.test.ts +++ b/src/agents/runtime/__tests__/state.test.ts @@ -218,7 +218,7 @@ describe('agent posture policy', () => { it('resolves agent SYSTEM.md bodies through the central agent context registry location', () => { const location = agentBodyResourceLocation('elicitor'); expect(location).toBe(bundledAgentBodyLocation('elicitor')); - expect(location).toMatch(/src\/agents\/prompts\/elicitor\/SYSTEM\.md$/); + expect(location).toMatch(/src\/agents\/prompts\/elicitor\.md$/); const body = readFileSync(location, 'utf8'); expect(body).toContain('# Agent: elicitor'); }); diff --git a/src/agents/skills/README.md b/src/agents/skills/README.md index 9b07f64a7..d6544010a 100644 --- a/src/agents/skills/README.md +++ b/src/agents/skills/README.md @@ -32,7 +32,7 @@ rules: agents/skills/ x> graph mutation [guidance only] ``` -The legal set is sealed by the code-owned path list in `agents/runtime/state.ts`; adding a `SKILL.md` does not make it available until that table enumerates it. `src/agents/registry.ts` owns file locations. Frontmatter owns `name` and `description`; code owns axis family, legality, and location enumeration. The former `goals/` family is retired by D85-L; the elicitor objective postures are inline in `src/agents/prompts/elicitor/SYSTEM.md`. +The legal set is sealed by the code-owned path list in `agents/runtime/state.ts`; adding a `SKILL.md` does not make it available until that table enumerates it. `src/agents/registry.ts` owns file locations. Frontmatter owns `name` and `description`; code owns axis family, legality, and location enumeration. The former `goals/` family is retired by D85-L; the elicitor objective postures are inline in `src/agents/prompts/elicitor.md`. 
## Prompt-resource sub-shapes diff --git a/src/agents/subagents/README.md b/src/agents/subagents/README.md new file mode 100644 index 000000000..05ff0f177 --- /dev/null +++ b/src/agents/subagents/README.md @@ -0,0 +1,33 @@ +# agents/subagents/ — background agent bodies + +SPEC decisions: D39-L, D44-L, D90-L, D91-L, D92-L, D93-L + +## Owns + +Flat markdown body resources for background subagents. These are not foreground prompt bodies; spawnability is owned by the explicit `BACKGROUND_SUBAGENT_IDS` registry in `src/.pi/extensions/subagents/agents.ts`. + +```text +subagents/ +├── README.md +├── explorer.md background codebase recon body + frontmatter +├── researcher.md background web-research body + frontmatter +├── projector.md background candidate-proposal body + frontmatter +└── reviewer.md background proposal/commitment review body + frontmatter +``` + +Each file carries frontmatter (`name`, `description`, `tools`, `model`, `thinking`) plus the child system-prompt body. Frontmatter is authoring DX; the code-owned registry decides which ids exist. Unlisted files are not spawnable. + +## Boundary rules + +```pseudo +rules: + .pi/extensions/subagents/agents.ts -> agents/subagents/*.md [explicit BACKGROUND_SUBAGENT_IDS only] + agents/subagents/ x> foreground prompt roster + agents/subagents/ x> Pi hooks or runtime registration +``` + +## Does NOT own + +- Foreground SPEC/CODE prompt bodies — `src/agents/prompts/`. +- Background prompt assembly, child-session sealing, tool grants, and spawn execution — `src/.pi/extensions/subagents/`. +- Prompt-resource skills — `src/agents/skills/`. 
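The boundary rule in the new `src/agents/subagents/README.md` (explicit registry ids only, flat `<id>.md` files, unlisted files not spawnable) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the loader in `agents.ts`: the names `SKETCH_BACKGROUND_IDS` and `flatSubagentBodyPaths` are hypothetical, and the sketch only models path resolution, leaving out file reads and frontmatter validation.

```typescript
// Hypothetical sketch: models "registry ids decide what exists" for a flat body home.
// The id list mirrors the four background ids named in the README above.
const SKETCH_BACKGROUND_IDS = ['explorer', 'researcher', 'projector', 'reviewer'] as const;

/**
 * Resolve one flat `<id>.md` path per registry id.
 * Nothing is discovered from the directory itself, so a planted
 * `ghost.md` never appears in the result unless its id is listed.
 */
function flatSubagentBodyPaths(
  dir: string,
  ids: readonly string[] = SKETCH_BACKGROUND_IDS,
): Map<string, string> {
  const paths = new Map<string, string>();
  for (const id of ids) {
    // Flat topology: `<dir>/<id>.md`, not `<dir>/<id>/SYSTEM.md`.
    paths.set(id, `${dir}/${id}.md`);
  }
  return paths;
}

const sketchPaths = flatSubagentBodyPaths('src/agents/subagents');
```

Because the map is built only from the id list, spawnability stays a code-owned decision: adding a markdown file to the directory changes nothing until the registry enumerates it.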
diff --git a/src/agents/prompts/explorer/SYSTEM.md b/src/agents/subagents/explorer.md similarity index 100% rename from src/agents/prompts/explorer/SYSTEM.md rename to src/agents/subagents/explorer.md diff --git a/src/agents/prompts/projector/SYSTEM.md b/src/agents/subagents/projector.md similarity index 100% rename from src/agents/prompts/projector/SYSTEM.md rename to src/agents/subagents/projector.md diff --git a/src/agents/prompts/researcher/SYSTEM.md b/src/agents/subagents/researcher.md similarity index 100% rename from src/agents/prompts/researcher/SYSTEM.md rename to src/agents/subagents/researcher.md diff --git a/src/agents/prompts/reviewer/SYSTEM.md b/src/agents/subagents/reviewer.md similarity index 100% rename from src/agents/prompts/reviewer/SYSTEM.md rename to src/agents/subagents/reviewer.md diff --git a/src/treedocs.yaml b/src/treedocs.yaml index 63d3a8455..a69ac8337 100644 --- a/src/treedocs.yaml +++ b/src/treedocs.yaml @@ -215,20 +215,14 @@ tree: README.md: 'Documents this source subtree.' prompts: README.md: 'Documents this source subtree.' - elicitor: - SYSTEM.md: 'Defines the elicitor agent system prompt.' - explorer: - SYSTEM.md: 'Defines the explorer agent system prompt.' - orchestrator: - SYSTEM.md: 'Defines the orchestrator agent system prompt.' - pi-coder: - SYSTEM.md: 'Defines the pi-coder agent system prompt.' - projector: - SYSTEM.md: 'Defines the projector agent system prompt.' - researcher: - SYSTEM.md: 'Defines the researcher agent system prompt.' - reviewer: - SYSTEM.md: 'Defines the reviewer agent system prompt.' + elicitor.md: 'Defines the elicitor agent system prompt.' + executor.md: 'Defines the executor agent system prompt.' + subagents: + README.md: 'Documents this source subtree.' + explorer.md: 'Defines the explorer background agent system prompt.' + projector.md: 'Defines the projector background agent system prompt.' + researcher.md: 'Defines the researcher background agent system prompt.' 
+ reviewer.md: 'Defines the reviewer background agent system prompt.' registry.ts: 'Implements registry.' runtime: README.md: 'Documents this source subtree.' From c093d669023c7faadee292f6be789c570760b56e Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 17:22:00 +0200 Subject: [PATCH 27/29] Harden topology test oracles --- .agents/skills/ln-review/SKILL.md | 2 + .../ln-review/references/contract-lenses.md | 2 + src/.pi/__tests__/architecture.test.ts | 58 +------- src/agents/__tests__/registry.test.ts | 21 ++- .../prompts/__tests__/prompt-bodies.test.ts | 84 ++---------- .../skills/__tests__/prompt-resources.test.ts | 129 +++++------------- 6 files changed, 72 insertions(+), 224 deletions(-) diff --git a/.agents/skills/ln-review/SKILL.md b/.agents/skills/ln-review/SKILL.md index 5679c568b..16e5d7b66 100644 --- a/.agents/skills/ln-review/SKILL.md +++ b/.agents/skills/ln-review/SKILL.md @@ -65,6 +65,8 @@ If `memory/SPEC.md` §Oracle Strategy by Loop Tier exists, check whether recent Collect gaps as numbered findings (category: `oracle-coverage`). +**Test-oracle hygiene for topology/prose sentinels.** Source/doc/resource substring tests are suspicious unless they either cross the production consumer that uses the resource (loader, prompt composition, package asset build, public API) or name a live topology contract from SPEC / a co-located README. Brunch intentionally materializes architecture into topology, so do not ban topology tests; distinguish named architecture sentinels from arbitrary path/prose locks. Route repairs to consumer-level tests, named boundary sentinels with decision citations, or deletion of completed-migration residue. + **Notation aid.** Map test artifacts against acceptance leaves with `pseudo matrix` (coverage variant): rows = obligation leaves from a `pseudo tree` decomposition of the frontier acceptance, columns = test artifacts. Gaps surface as `.` cells; partial coverage as `~`. 
Compact, scannable, and the matrix itself becomes a coverage artifact reviewers can re-run. ### Load-bearing layer coverage (category: `coverage-candidacy`) diff --git a/.agents/skills/ln-review/references/contract-lenses.md b/.agents/skills/ln-review/references/contract-lenses.md index 1ec86a495..ca8ee547f 100644 --- a/.agents/skills/ln-review/references/contract-lenses.md +++ b/.agents/skills/ln-review/references/contract-lenses.md @@ -19,6 +19,8 @@ Each finding routes to one of three repairs: **enforce it loudly** (fail on viol - A **tagged-union arm that is representable and boundary-accepted but has no semantics anywhere downstream** (validation checks kind membership only; derivation hits a default/zero branch) → a "dark variant" persists as permanently-inert data, silently (graduated 2026-06-11: `field`/`coverage` gap predicates were creatable but derived coverage 0 forever and could not be hand-answered). Repair: one exhaustive `never`-checked owner of per-arm semantics that both validation and derivation ride — every accepted arm gets an implementation or loud rejection at the boundary, and adding an arm without deciding its semantics fails to compile. - A **provider-visible string composed outside the ledgered render planes** (`content:` fields, custom entry copy, toolResult text built inline rather than in `renderers/` or the compose path) that restates behavior or decisions maintained elsewhere → the string is a duplicated behavioral claim with no wording oracle and no decision traceability, so decision revisions sweep the surrounding *comments* (reviewed) but not the *payload* (un-oracled), and the model acts on stale instructions (graduated 2026-06-12: `kickTurnMessage` still instructed the model that "a structured exchange offer was just presented" after revised D78-L retired the canned offer — the same module's docstring had been updated; elicitor gap guidance was similarly ad-hoc rather than source-of-truthed). 
Repair: name the contract — give the surface a ledger row (renderer ledger entry-copy rows) with a wording oracle (golden) or derive it from the source of truth; at minimum tag it with the decision ID it restates so revisions sweep it. Project-scoped lens: it is operative here because the provider-visible channel is enumerable and the golden/ledger discipline exists, and the stakes are high because these strings are control surfaces, not cosmetic copy. (Universal kernel, for calibration only: any string restating a contract maintained elsewhere drifts silently — but at that altitude it stops being searchable.) +- A **topology/prose sentinel test without a product-entrypoint or named-contract oracle** (`readFile` / `access` / `readdir` plus hardcoded `src/...`, README phrases, retired path checks, or arbitrary prompt-resource `toContain` needles) → the test freezes representation or completed-migration residue rather than the behavior a user/runtime observes. It can pass while the production consumer is unwired, or fail during a harmless topology refactor. **Discriminator:** if the assertion crosses the real consumer (loader, prompt composition, package asset build, public API) or explicitly names a live SPEC / topology README contract that cannot be cheaply expressed elsewhere, it is legitimate architecture coverage; otherwise it is a prose/path lock. Repair class **name or relocate the oracle**: move to a consumer-level test, keep a narrow named boundary sentinel with decision citation, or delete the residue once the migration is closed. Do not generalize to "no topology tests" — Brunch deliberately materializes architecture into topology. + - A **capability-readiness / permission gate whose required precondition can only be produced by the capability it gates** — a *circular-precondition gate* (bootstrap-deadlock smell). 
The gate reads as ordinary defensive readiness ("don't allow X until the frame for X exists"), but the only path to the required state runs *through* X, so a fresh/empty subject can never satisfy it and the gate never opens — silently, with no error, presenting as "the tool just isn't available" (graduated 2026-06-22: `mutate_graph` was gated on `propose-graph` readiness, which required `context`/`thesis`/`goal`/`constraint` nodes that only `mutate_graph` can create — a fresh or foundation-light spec could never establish its own frame; D86-L floored the write tools). **Discriminator (mechanical, clears false positives): does the gated action *produce* the gated-on state?** If yes → circular, floor it. If the gated action only *consumes* state it doesn't write (audit gated on graph truth; a downstream/generative lens gated on a grounded frame it defers creating elsewhere), the gate is legitimate — not every readiness gate is circular. Repair class **floor the bootstrapping capability**: make the state-producing action ungated so the precondition can ever become true; keep readiness *advisory* for it (scale epistemic status / surface an establishment offer) rather than withholding the tool. Search seam: cross the capability→required-state map against the capability→granted-tools/actions map and flag any capability whose granted action writes its own required state. - An **opaque companion to an enforced discriminant** — a param that carries an enforced discriminant (`kind`/`type`/`action`/`mode` enum) alongside a *companion* (a payload, detail blob, or required-field set) typed `Unknown` / `z.unknown` / `Type.Unknown()` / `z.record(_, z.unknown)` / `looseObject` whose real per-discriminant shape is reconstructed only in a downstream validator. 
The discriminant is taught and rejected at the boundary; its companion is not — so the author (LLM agent or human) must *know* the per-discriminant shape from outside the schema, guesses it, and burns a turn on a downstream rejection (loud variant), **or** a malformed companion is silently absorbed into an empty/default result and the caller never learns it called wrong (silent variant) (graduated 2026-06-23 from three recurrences: `present_review_set.payload` was `z.unknown()`/`schemaVersion`-only while the nested `grounding`/`pitch`/`epistemicStatus`/`entityDrafts`/`edgeDrafts` shape lived only in `validateReviewSetPayloadShape` (`64fe9a41`, then taught at the boundary); `mutate_graph` create_node `detail: Type.Unknown()` ×2 (agent tool + dev-RPC mirror) while the per-kind `decision`/`term` shape lived only in `command-validation.ts`; the silent face is `read_graph`'s flat-all-optional `mode`↔companion schema returning an empty slice on a malformed call). **Discriminator: is the companion's legal shape a pure function of the discriminant value, and is that mapping enforced only downstream of the boundary?** If yes → opaque companion. (A genuinely free-form blob with no per-discriminant contract is not this lens.) Repair class **name the contract at the boundary**: teach the companion's per-discriminant shape in the boundary schema — structurally (a discriminated union per `create_edge`'s role-named-per-`category` template) or by description — from **one owner the boundary schema and the downstream validator share** (do not fork a second nested model); for the silent variant, make the malformed call unrepresentable (per-`mode` union) or fail loud (`update_elicitation_gaps`-style per-field diagnostics) rather than returning empty. Keep the downstream validator as the diagnostic authority; the boundary only teaches. 
Search seam: cross each enforced discriminant enum against its companion param's type and flag any companion typed `Unknown`/`unknown`/`record(_, unknown)`/`looseObject` whose real shape is reconstructed in a separate validator. See also `fixture-vs-real-audit` (PLAN) — the same `z.unknown()`/`Type.Unknown()` sites read from the untested-against-real angle. diff --git a/src/.pi/__tests__/architecture.test.ts b/src/.pi/__tests__/architecture.test.ts index 92709bfdc..7daa6fafb 100644 --- a/src/.pi/__tests__/architecture.test.ts +++ b/src/.pi/__tests__/architecture.test.ts @@ -5,32 +5,6 @@ import { fileURLToPath } from 'node:url'; import { describe, expect, it } from 'vitest'; const projectRoot = dirname(dirname(dirname(dirname(fileURLToPath(import.meta.url))))); -const legacyContextPath = join(projectRoot, 'src/.pi/context'); - -const legacyImportNeedles = [ - ['src', '.pi', 'context'].join('/'), - 'compose' + '-brunch-prompt', - ['context', 'prompt-packs'].join('/'), - ['context', 'builders'].join('/'), -]; - -const runtimeRegistryExpectations = [ - { - file: 'src/session/schema/kinds.ts', - required: "export const AGENT_ROLE_IDS = ['elicitor', 'executor'] as const;", - forbidden: ['reviewer', 'pi-coder'], - }, - { - file: 'src/agents/runtime/policy.ts', - required: - 'export const FOREGROUND_AGENT_ROSTER: Record = {', - // `reviewer` is a non-write background agent that legitimately appears in - // elicit's code-owned `canDelegate` set (D92-L delegatable-set lives beside - // the op_mode policy). Only `pi-coder` — an unwired planned foreground — - // must stay out of the foreground registry here. 
- forbidden: ['pi-coder'], - }, -]; const modelTextAdapterDirs = [ join(projectRoot, 'src/.pi/extensions/brunch-data'), @@ -42,34 +16,10 @@ const allowedModelTextAdapterFiles = new Set([ ]); describe('agents topology', () => { - it('removes the legacy .pi context source', async () => { - await expect(readdir(legacyContextPath)).rejects.toThrow(); - }); - - it('keeps named future agent bodies out of the runtime registry', async () => { - for (const expectation of runtimeRegistryExpectations) { - const content = await readFile(join(projectRoot, expectation.file), 'utf8'); - expect(content).toContain(expectation.required); - for (const needle of expectation.forbidden) { - expect(content, `${expectation.file} must not register ${needle}`).not.toContain(needle); - } - } - }); - - it('keeps product source imports free of legacy .pi context prompt paths', async () => { - const files = await listSourceFiles(join(projectRoot, 'src')); - - for (const file of files) { - const rel = relative(projectRoot, file); - if (rel.endsWith('.test.ts') || rel.includes('/__tests__/')) continue; - const content = await readFile(file, 'utf8'); - for (const needle of legacyImportNeedles) { - expect(content, `${rel} must not reference ${needle}`).not.toContain(needle); - } - } - }); - - it('keeps Pi tool adapters from owning Brunch-authored model text', async () => { + it('keeps Pi tool adapters from owning Brunch-authored provider text', async () => { + // D39-L/D60-L: `.pi/extensions` is the harness adapter surface; Brunch-authored + // model/provider text belongs under `src/agents/contexts` or prompt composition. + // This is a named architecture sentinel, not a generic source-prose lock. 
const files = (await Promise.all(modelTextAdapterDirs.map((dir) => listSourceFiles(dir)))).flat(); for (const file of files) { diff --git a/src/agents/__tests__/registry.test.ts b/src/agents/__tests__/registry.test.ts index f135cf350..030002db9 100644 --- a/src/agents/__tests__/registry.test.ts +++ b/src/agents/__tests__/registry.test.ts @@ -1,3 +1,6 @@ +import { access } from 'node:fs/promises'; +import { relative } from 'node:path'; + import { describe, expect, it } from 'vitest'; import { @@ -10,14 +13,18 @@ import { } from '../registry.js'; describe('agent context registry', () => { - it('centralizes bundled prompt and current skill paths', () => { + it('owns the foreground body registry contract', async () => { expect(BUNDLED_AGENT_BODY_IDS).toEqual(['elicitor', 'executor']); expect(bundledAgentBodyRepoPath('elicitor')).toBe('src/agents/prompts/elicitor.md'); - expect(bundledAgentBodyLocation('executor')).toMatch(/src\/agents\/prompts\/executor\.md$/); - expect(bundledAgentBodyHome()).toMatch(/src\/agents\/prompts$/); - expect(promptResourceLocation('methods', 'generate-proposal')).toMatch( - /src\/agents\/skills\/methods\/generate-proposal\/SKILL\.md$/, - ); - expect(promptResourceAgentDir()).toMatch(/src\/agents\/?$/); + + for (const id of BUNDLED_AGENT_BODY_IDS) { + await expect(access(bundledAgentBodyLocation(id))).resolves.toBeUndefined(); + expect(relative(bundledAgentBodyHome(), bundledAgentBodyLocation(id))).toBe(`${id}.md`); + } + }); + + it('resolves prompt-resource skills under the Brunch agent resource home', () => { + const location = promptResourceLocation('methods', 'generate-proposal'); + expect(relative(promptResourceAgentDir(), location)).toBe('skills/methods/generate-proposal/SKILL.md'); }); }); diff --git a/src/agents/prompts/__tests__/prompt-bodies.test.ts b/src/agents/prompts/__tests__/prompt-bodies.test.ts index 954d6d4e0..a94e8f66e 100644 --- a/src/agents/prompts/__tests__/prompt-bodies.test.ts +++ 
b/src/agents/prompts/__tests__/prompt-bodies.test.ts @@ -1,80 +1,36 @@ import { execFile } from 'node:child_process'; -import { access, readFile, readdir } from 'node:fs/promises'; +import { access, readdir } from 'node:fs/promises'; import { dirname, join } from 'node:path'; import { fileURLToPath } from 'node:url'; import { promisify } from 'node:util'; import { describe, expect, it } from 'vitest'; +import { + BACKGROUND_SUBAGENT_IDS, + loadSubagentDefinitions, + subagentAgentsDir, +} from '../../../.pi/extensions/subagents/agents.js'; +import { BUNDLED_AGENT_BODY_IDS, bundledAgentBodyLocation } from '../../registry.js'; + const execFileAsync = promisify(execFile); const projectRoot = dirname(dirname(dirname(dirname(dirname(fileURLToPath(import.meta.url)))))); -const foregroundPromptExpectations = [ - { - system: 'src/agents/prompts/elicitor.md', - oldNested: 'src/agents/prompts/elicitor/SYSTEM.md', - legacyFlat: 'src/.pi/agents/elicitor.md', - needles: ['# Agent: elicitor', 'multi-spec discipline'], - }, - { - system: 'src/agents/prompts/executor.md', - oldNested: 'src/agents/prompts/executor/SYSTEM.md', - needles: ['# Agent: executor', 'execute mode'], - }, -]; - -const backgroundSubagentExpectations = [ - { - system: 'src/agents/subagents/reviewer.md', - oldNested: 'src/agents/prompts/reviewer/SYSTEM.md', - legacyFlat: 'src/.pi/agents/reviewer.md', - needles: ['name: reviewer', 'checking candidate'], - }, - { - system: 'src/agents/subagents/explorer.md', - oldNested: 'src/agents/prompts/explorer/SYSTEM.md', - needles: ['name: explorer', 'read-only reconnaissance agent'], - }, - { - system: 'src/agents/subagents/researcher.md', - oldNested: 'src/agents/prompts/researcher/SYSTEM.md', - needles: ['name: researcher', 'web-research agent'], - }, - { - system: 'src/agents/subagents/projector.md', - oldNested: 'src/agents/prompts/projector/SYSTEM.md', - needles: ['name: projector', 'candidate-proposal'], - }, -]; - async function expectMissing(path: string): Promise<void>
{ await expect(access(join(projectRoot, path))).rejects.toThrow(); } describe('agent prompt bodies', () => { - it('keeps foreground agent body resources as flat prompt files', async () => { - for (const expectation of foregroundPromptExpectations) { - const content = await readFile(join(projectRoot, expectation.system), 'utf8'); - for (const needle of expectation.needles) { - expect(content).toContain(needle); - } - await expectMissing(expectation.oldNested); - if (expectation.legacyFlat) await expectMissing(expectation.legacyFlat); + it('loads foreground bodies through the code-owned registry', async () => { + for (const id of BUNDLED_AGENT_BODY_IDS) { + await expect(access(bundledAgentBodyLocation(id))).resolves.toBeUndefined(); } }); - it('keeps background subagent bodies out of the foreground prompt home', async () => { - for (const expectation of backgroundSubagentExpectations) { - const content = await readFile(join(projectRoot, expectation.system), 'utf8'); - for (const needle of expectation.needles) { - expect(content).toContain(needle); - } - await expectMissing(expectation.oldNested); - if (expectation.legacyFlat) await expectMissing(expectation.legacyFlat); - } - - await expectMissing('src/agents/prompts/pi-coder/SYSTEM.md'); + it('loads background subagents through their explicit registry', async () => { + const definitions = await loadSubagentDefinitions(subagentAgentsDir()); + expect([...definitions.keys()].sort()).toEqual([...BACKGROUND_SUBAGENT_IDS].sort()); }); it('builds generated agent assets without retired nested prompt-body directories', async () => { @@ -98,16 +54,4 @@ describe('agent prompt bodies', () => { 'reviewer.md', ]); }); - - it('records the foreground/background split in local READMEs', async () => { - const promptsReadme = await readFile(join(projectRoot, 'src/agents/prompts/README.md'), 'utf8'); - const subagentsReadme = await readFile(join(projectRoot, 'src/agents/subagents/README.md'), 'utf8'); - - 
expect(promptsReadme).toContain('Flat foreground files are canonical'); - expect(promptsReadme).toContain('src/agents/prompts/{elicitor,executor}.md'); - expect(promptsReadme).toContain('src/agents/subagents/'); - expect(promptsReadme).toContain('retired orchestrator / pi-coder body aliases are not preserved'); - expect(subagentsReadme).toContain('BACKGROUND_SUBAGENT_IDS'); - expect(subagentsReadme).toContain('Unlisted files are not spawnable'); - }); }); diff --git a/src/agents/skills/__tests__/prompt-resources.test.ts b/src/agents/skills/__tests__/prompt-resources.test.ts index 7f50e1142..e390833bd 100644 --- a/src/agents/skills/__tests__/prompt-resources.test.ts +++ b/src/agents/skills/__tests__/prompt-resources.test.ts @@ -1,114 +1,57 @@ -import { readFile } from 'node:fs/promises'; -import { dirname, join } from 'node:path'; +import { access, readFile } from 'node:fs/promises'; +import { dirname, join, relative } from 'node:path'; import { fileURLToPath } from 'node:url'; +import { parseFrontmatter } from '@earendil-works/pi-coding-agent'; import { describe, expect, it } from 'vitest'; -const projectRoot = dirname(dirname(dirname(dirname(dirname(fileURLToPath(import.meta.url)))))); +import { LENS_RESOURCES, METHOD_RESOURCES, STRATEGY_RESOURCES } from '../../runtime/state.js'; -const resourceExpectations = [ - { - file: 'src/agents/skills/methods/run-structured-exchange/SKILL.md', - needles: ['details.schema', 'schema` plus `v', 'answered`, `cancelled`, or `unavailable`'], - }, - { - file: 'src/agents/skills/methods/capture/SKILL.md', - needles: [ - 'single home', - 'FE-861', - 'Gap close/spawn responsibility belongs here', - 'graph-authoring-heuristics.md', - ], - }, - { - file: 'src/agents/skills/methods/commit-graph/SKILL.md', - needles: ['graph-authoring-heuristics.md', 'role-named mutation grammar'], - }, - { - file: 'src/agents/contexts/references/graph-authoring-heuristics.md', - needles: ['Graph authoring heuristics', 'graph-ontology.md', 
'low-confidence', 'mutate_graph'], - }, - { - file: 'src/agents/skills/methods/generate-proposal/SKILL.md', - needles: ['legibility_cost_of_knowing', 'core_bet', 'graph_refs', '`{ node_id: string }` only'], - }, -]; +const projectRoot = dirname(dirname(dirname(dirname(dirname(fileURLToPath(import.meta.url)))))); const generateProposalDisclosureExpectations = { skill: 'src/agents/skills/methods/generate-proposal/SKILL.md', references: [ - { - file: 'src/agents/skills/methods/generate-proposal/references/intent.md', - needles: ['intent plane', 'single pick', 'present_candidates'], - }, - { - file: 'src/agents/skills/methods/generate-proposal/references/design.md', - needles: ['design plane', 'synthesize', 'present_review_set'], - }, - { - file: 'src/agents/skills/methods/generate-proposal/references/oracle.md', - needles: ['oracle plane', 'compose', 'blind spots'], - }, + 'src/agents/skills/methods/generate-proposal/references/intent.md', + 'src/agents/skills/methods/generate-proposal/references/design.md', + 'src/agents/skills/methods/generate-proposal/references/oracle.md', ], - probes: 'src/agents/skills/methods/generate-proposal/probes.md', }; describe('prompt-resource skills', () => { - it('keeps prompt-resource guidance in skill resources', async () => { - for (const expectation of resourceExpectations) { - const content = await readFile(join(projectRoot, expectation.file), 'utf8'); - for (const needle of expectation.needles) { - expect(content).toContain(needle); - } + it('keeps every code-owned prompt resource readable and substantial', async () => { + const entries = [ + ...Object.values(STRATEGY_RESOURCES), + ...Object.values(LENS_RESOURCES), + ...Object.values(METHOD_RESOURCES), + ]; + + for (const entry of entries) { + expect(relative(projectRoot, entry.location).startsWith('src/agents/skills/')).toBe(true); + expect(entry.location.endsWith(`/${entry.name}/SKILL.md`)).toBe(true); + await expect(access(entry.location)).resolves.toBeUndefined(); + + const 
raw = await readFile(entry.location, 'utf8'); + const { frontmatter, body } = parseFrontmatter(raw); + expect(frontmatter).toMatchObject({ name: entry.name, description: entry.description }); + expect( + body.length, + `${entry.name} should carry prompt-resource guidance beyond a placeholder`, + ).toBeGreaterThanOrEqual(700); } }); - it('keeps generate-proposal plane details behind explicit disclosed references', async () => { + it('keeps generate-proposal progressive-disclosure references reachable from the owning skill', async () => { const skill = await readFile(join(projectRoot, generateProposalDisclosureExpectations.skill), 'utf8'); - expect(skill).toContain('references/intent.md'); - expect(skill).toContain('references/design.md'); - expect(skill).toContain('references/oracle.md'); - expect(skill).toContain('Do not write picked intent candidates to the graph'); - expect(skill).toContain('Cite existing ontology/render surfaces'); - for (const expectation of generateProposalDisclosureExpectations.references) { - const content = await readFile(join(projectRoot, expectation.file), 'utf8'); - for (const needle of expectation.needles) { - expect(content).toContain(needle); - } + for (const reference of generateProposalDisclosureExpectations.references) { + await expect(access(join(projectRoot, reference))).resolves.toBeUndefined(); + expect(skill).toContain( + relative( + dirname(join(projectRoot, generateProposalDisclosureExpectations.skill)), + join(projectRoot, reference), + ), + ); } - - const probes = await readFile(join(projectRoot, generateProposalDisclosureExpectations.probes), 'utf8'); - expect(probes).toContain('Model: GPT-5.5 Last run: 2026-06-24'); - expect(probes).toContain('intent-pick'); - expect(probes).toContain('design-synthesize'); - expect(probes).toContain('oracle-compose'); - expect(probes).toContain('should NOT fire'); - }); - - it('records adopted prompt-skill topology and deferred prompt-skill triggers in the local README', async () => { - 
const readme = await readFile(join(projectRoot, 'src/agents/skills/README.md'), 'utf8'); - - expect(readme).toContain('Agent Skills-standard prompt resources'); - expect(readme).toContain('/SKILL.md'); - expect(readme).toContain('references/` subfiles'); - expect(readme).toContain('progressive disclosure'); - expect(readme).toContain('Shared typed-vocab context references'); - expect(readme).toContain('src/agents/contexts/references/graph-ontology.md'); - expect(readme).toContain('edge-policy, detail-payload, and `detail.form` vocabulary'); - expect(readme).toContain('drift-checked'); - expect(readme).toContain('Shared authored context references'); - expect(readme).toContain('src/agents/contexts/references/graph-authoring-heuristics.md'); - }); - - it('records the shared context-reference and backstage curation homes', async () => { - const contextsReadme = await readFile(join(projectRoot, 'src/agents/contexts/README.md'), 'utf8'); - expect(contextsReadme).toContain('references/ runtime-eligible shared context references'); - expect(contextsReadme).toContain('references/graph-ontology.md'); - expect(contextsReadme).toContain('references/graph-authoring-heuristics.md'); - - const docsReadme = await readFile(join(projectRoot, 'src/agents/docs/README.md'), 'utf8'); - expect(docsReadme).toContain('backstage notes for curating Brunch-authored agent resources'); - expect(docsReadme).toContain('not copied into packaged runtime assets'); }); }); From 2c4c136126e21163a1844d46daa35ee882326227 Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 17:30:34 +0200 Subject: [PATCH 28/29] design brainstorm on context slicing/drafting Signed-off-by: Lu Nelson --- src/.pi/extensions/brunch-data/graph/index.ts | 4 +- src/agents/contexts/README.md | 2 +- src/agents/contexts/drafting/README.md | 31 ++ .../drafting/intent-graph-semantics.md | 454 ++++++++++++++++++ .../drafting/slice-detail-payloads.md | 74 +++ .../contexts/drafting/slice-edge-authoring.md | 90 ++++ 
.../contexts/drafting/slice-kind-selection.md | 90 ++++ .../drafting/slice-neighborhood-reading.md | 71 +++ .../drafting/slice-plane-authoring.md | 109 +++++ .../drafting/slice-promotion-capture.md | 74 +++ .../references/context-slice-index.md | 61 +++ .../references/design-projection-slice.md | 99 ++++ .../references/graph-authoring-heuristics.md | 290 ++++++++++- .../references/intent-capture-slice.md | 77 +++ .../neighborhood-consumption-slice.md | 104 ++++ .../references/oracle-witness-slice.md | 109 +++++ .../references/plan-sequencing-slice.md | 116 +++++ .../references/review-set-drafting-slice.md | 122 +++++ src/agents/docs/context-reference-harvest.md | 285 +++++------ src/agents/skills/lenses/README.md | 10 +- src/agents/skills/lenses/oracle/SKILL.md | 6 +- .../generate-proposal/references/oracle.md | 2 +- 22 files changed, 2110 insertions(+), 170 deletions(-) create mode 100644 src/agents/contexts/drafting/README.md create mode 100644 src/agents/contexts/drafting/intent-graph-semantics.md create mode 100644 src/agents/contexts/drafting/slice-detail-payloads.md create mode 100644 src/agents/contexts/drafting/slice-edge-authoring.md create mode 100644 src/agents/contexts/drafting/slice-kind-selection.md create mode 100644 src/agents/contexts/drafting/slice-neighborhood-reading.md create mode 100644 src/agents/contexts/drafting/slice-plane-authoring.md create mode 100644 src/agents/contexts/drafting/slice-promotion-capture.md create mode 100644 src/agents/contexts/references/context-slice-index.md create mode 100644 src/agents/contexts/references/design-projection-slice.md create mode 100644 src/agents/contexts/references/intent-capture-slice.md create mode 100644 src/agents/contexts/references/neighborhood-consumption-slice.md create mode 100644 src/agents/contexts/references/oracle-witness-slice.md create mode 100644 src/agents/contexts/references/plan-sequencing-slice.md create mode 100644 src/agents/contexts/references/review-set-drafting-slice.md diff 
--git a/src/.pi/extensions/brunch-data/graph/index.ts b/src/.pi/extensions/brunch-data/graph/index.ts index 3b70d4b87..9689b40b3 100644 --- a/src/.pi/extensions/brunch-data/graph/index.ts +++ b/src/.pi/extensions/brunch-data/graph/index.ts @@ -165,8 +165,8 @@ export function registerBrunchGraph(pi: ExtensionAPI, deps: BrunchGraphDeps): vo 'Use mutate_graph to persist specification elements (goals, requirements, decisions, etc.) after the user has accepted the concept.', 'Each create_node op must have a unique batch `ref` string. create_edge ops reference nodes by role-named fields using that `ref` or `{existingCode: "G1"}` for nodes already in the selected spec.', 'If mutate_graph returns STRUCTURAL_ILLEGAL, read the diagnostics, fix the issues, and retry. Do not show intermediate failures to the user.', - 'The `stance` field is required on `proof` and `support` create_edge ops, and invalid on all other categories.', - 'Node kinds `decision` and `term` require a `detail` object; all other kinds must omit `detail`.', + 'The `stance` field is required on `witness` and `rationale` create_edge ops, and invalid on all other categories.', + 'Detail rules: `decision` and `term` require detail; `requirement`, `criterion`, and `invariant` may use `plain`/`gherkin`/`formal`; `context` may use `given`; other kinds omit detail.', ], parameters: MutateGraphParams, diff --git a/src/agents/contexts/README.md b/src/agents/contexts/README.md index b720558cf..95adad387 100644 --- a/src/agents/contexts/README.md +++ b/src/agents/contexts/README.md @@ -31,7 +31,7 @@ rules: `src/.pi/__tests__/architecture.test.ts` guards the adapter half of this boundary for `brunch-data` and structured-exchange tools: Pi adapters may own schemas, labels, descriptions, prompt snippets, and TUI rendering, but provider-visible Brunch text must be imported from this subtree rather than formatted inline. -`references/` files are runtime-eligible agent-readable context references. 
They are shared cite targets for prompt resources when vocabulary or judgment content should be loaded on demand without copying tables into skill bodies. Generated references, such as `references/graph-ontology.md`, are committed artifacts with their source-of-truth and drift-check command named in the file. Authored references, such as `references/graph-authoring-heuristics.md`, carry shared judgment with concrete skill readers and must point to generated references for vocabulary rather than restating tables. The packaged CLI copies this subtree into `dist/agents/contexts/references/` because skills may cite these files at runtime. +`references/` files are runtime-eligible agent-readable context references. They are shared cite targets for prompt resources when vocabulary or judgment content should be loaded on demand without copying tables into skill bodies. Generated references, such as `references/graph-ontology.md`, are committed artifacts with their source-of-truth and drift-check command named in the file. Authored references, such as `references/graph-authoring-heuristics.md`, carry shared judgment with concrete skill readers and must point to generated references for vocabulary rather than restating tables. Draft injectable slice candidates may live here while being evaluated when they self-label as drafts and are not treated as required prompt-resource payload until a skill or prompt cites them. The packaged CLI copies this subtree into `dist/agents/contexts/references/` because skills may cite these files at runtime. ## Snapshot convention diff --git a/src/agents/contexts/drafting/README.md b/src/agents/contexts/drafting/README.md new file mode 100644 index 000000000..f2da784c7 --- /dev/null +++ b/src/agents/contexts/drafting/README.md @@ -0,0 +1,31 @@ +# agents/contexts/drafting/ — scratch, not wired + +Draft, isolated experiments. Nothing here is runtime prompt payload, packaged into agent assets, cited by a skill/prompt, or imported by code. 
This directory exists to develop candidate context material before any decision to promote it. + +Promoting anything from here into `src/agents/contexts/references/` is a separate, deliberate step: a runtime-eligible reference needs a named skill/prompt reader under D97-L, and the generated vocabulary tables ([`../references/graph-ontology.md`](../references/graph-ontology.md)) remain the source of truth that authored slices cite rather than restate. + +## Contents + +- [`intent-graph-semantics.md`](intent-graph-semantics.md) — the design-reasoning synthesis: the current ontology (4 planes / 24 kinds / 4 bands, 9 edge categories, `detail`/`detail.form`, reconciliation + elicitation substrates) with the rationale preserved from the recovered `INTENT_GRAPH_SEMANTICS.md`. Read this for *why*; read the slices for *do this now*. +- `slice-*.md` — compact, model-facing injectable slices distilled from that synthesis. + +## Injectable slices — when to inject which + +``` +policy: cumulative (more than one slice may apply to a turn) + +slice | inject when the agent is… | primary readers +-------------------------------|-------------------------------------------------|----------------------------- +slice-kind-selection.md | picking a node `kind` for new graph truth | elicitor capture, generate +slice-edge-authoring.md | relating two nodes (which category + stance) | commit-graph, generate +slice-detail-payloads.md | creating decision/term, or attaching detail.form | capture, generate +slice-promotion-capture.md | sweeping a turn into truth/gaps/reconciliation | capture, review-for-gaps +slice-neighborhood-reading.md | consuming an anchored context pack to reason | any agent reading graph context +slice-plane-authoring.md | generating coherent intent/oracle/design/plan | generate-proposal (per lens) +``` + +`slice-plane-authoring.md` is section-anchored (`#intent`, `#oracle`, `#design`, `#plan`) so a per-lens caller can inject one plane's conduct rather than the whole file. 
+ +## Slice form conventions + +Slices use the `pseudo` notations — `matrix` decision tables (with explicit `policy:`), `chain` flows, `graph` node/edge lists, `data-shape` YAML — plus markdown tables, kept terse and activation-dense. Each slice header states its purpose, its inject-trigger, and the source of truth it cites. A slice is operational ("do this"); the synthesis doc is explanatory ("why"). diff --git a/src/agents/contexts/drafting/intent-graph-semantics.md b/src/agents/contexts/drafting/intent-graph-semantics.md new file mode 100644 index 000000000..a3fb4ba55 --- /dev/null +++ b/src/agents/contexts/drafting/intent-graph-semantics.md @@ -0,0 +1,454 @@ +# Intent graph semantics + +> **Status: draft, isolated, not wired.** This file lives in `src/agents/contexts/drafting/` — a scratch directory. It is **not** runtime prompt payload, is not copied into packaged agent assets, is not cited by any skill or prompt, and is imported by nothing. It exists to carry the *design reasoning* behind the Brunch intent graph in one legible place, updated to the current model. +> +> **Provenance.** This is a faithful-to-current redraft of the recovered `docs/design/INTENT_GRAPH_SEMANTICS.md` — the now-dangling companion that [`ONTOLOGY_REVIEW_PROTOCOL.md`](../../../../docs/design/ONTOLOGY_REVIEW_PROTOCOL.md) and [`BEHAVIORAL_KERNELS.md`](../../../../docs/design/BEHAVIORAL_KERNELS.md) both still link to at `./INTENT_GRAPH_SEMANTICS.md`. The original described an older nine-kind claim ontology. The model has since moved through FE-1052 (schema enum changes; `GRAPH_MODEL.md` retired) and FE-1090 (data-model-legibility: generated references + adjudication of the salvaged richness). This draft describes **today's** model accurately and preserves the original's design thinking where it still earns its keep, noting where old schema ideas were deliberately superseded rather than re-proposing dead schema as live. 
+> +> **Stance.** Describe the current model; preserve the reasoning; do not re-open settled verdicts as if undecided. Where the old doc proposed structure the current model rejected (kind subtypes, `checkability`/`strength` stored fields, the five-family relation taxonomy as edge kinds, `support`/`status` edge metadata), this draft maps the *intent* onto the mechanism that carries it now. + +## Source of truth this draft cites, never overrides + +This is reasoning prose, not authority. The canonical artifacts: + +- **Generated vocabulary tables** — [`src/agents/contexts/references/graph-ontology.md`](../references/graph-ontology.md), projected by `src/graph/schema/generate-ontology-ref.ts` from [`kinds.ts`](../../../graph/schema/kinds.ts), [`nodes.ts`](../../../graph/schema/nodes.ts), and [`category-policy.ts`](../../../graph/policy/category-policy.ts) (D73-L). Regenerate with `npm run generate:ontology`; drift is caught by `npm run check:data-model`. +- **Authored authoring judgment** — [`src/agents/contexts/references/graph-authoring-heuristics.md`](../references/graph-authoring-heuristics.md): the runtime-eligible shared reference cited by `capture` and `commit-graph` (D97-L). +- **Schema leaves** — `src/graph/schema/kinds.ts` (closed enums), `nodes.ts` (`GraphNode`, detail schemas), `edges.ts` (`GraphEdge`), `reconciliation-need.ts`, `elicitation-gaps.ts`; `src/graph/policy/category-policy.ts` (edge-category metadata); `src/graph/projection/labels.ts` + `direction.ts` (anchor-relative phrasing + impact direction). 
+- **SPEC decisions** — D51-L (closed edge categories + ReconciliationNeed), D54-L (node shape), D55-L (provenance retired → `change_log`), D56-L (13 intent kinds, per-kind rubric, no derived category axis), D57-L (LLM-judged readiness), D61-L (spec = initiative; "claim" is an umbrella over truth-bearing kinds), D62-L (projected codes), D63-L (`basis` = approval directness), D64-L/D94-L (derived readiness bands), D65-L (elicitation_gaps), D73-L (domain owns vocabulary), D87-L/D88-L/D89-L (closure rule, `detail.form`, `spec.kind`), D97-L (cite-don't-inline), D98-L (SPEC/CODE mode-only runtime), D8-L/D29-L (reconciliation substrate). +- **Worked rationale companion** — [`ONTOLOGY_REVIEW_PROTOCOL.md`](../../../../docs/design/ONTOLOGY_REVIEW_PROTOCOL.md) §6–9 records exactly how the older ontology narrowed into the current one (the closure rule, node/edge deltas, the epistemic triad, the Gherkin validation). + +When this draft and a generated table disagree, the generated table wins; this prose is stale and should be fixed. + +## The framing + +**A spec is a graph of typed claims.** Each node kind is a *modality* of claim — a stance toward the world — not just a section bucket. The original doc's central thesis survives intact; what changed is the partitioning. Where the old model had nine flat top-level kinds, the current model partitions the node space into **four planes** carrying **24 kinds**, and pushes method-specific structure (BDD, formal verification) down into inert `detail.form` payload rather than into the kind set (D87-L closure rule). 
+ +```pseudo +spec graph (one per spec; no cross-spec claim sharing — D61-L): + intent plane what / why / obligation / uncertainty / examples + oracle plane how claims are checked or evidenced + design plane how the system is shaped + plan plane how the work is sequenced + +accepted graph truth: + nodes stable items: kind, basis, source, optional detail (no status) + edges closed structural categories with role-named endpoints (no status) + +not graph truth, adjacent substrates: + elicitation_gaps prospective coverage obligations (D65-L) + reconciliation_needs retrospective repair obligations (D8-L / D29-L) + review-set drafts candidate material awaiting human acceptance +``` + +The conceptual load-bearing rule, repeated throughout: **`kind` drives behavior** — readiness band (D94-L), edge legality (D51-L), and the elicitor's source-question (D56-L) all key off `kind`. `detail.form` is inert payload plus a renderer hook; it never changes what kind of graph thing a node is. + +## The four planes and their kinds + +Twenty-four kinds across four planes, in canonical plane order. Codes and bands are generated in [`graph-ontology.md`](../references/graph-ontology.md) (reproduced here for legibility; that file is the source of truth). A band of `—` means the kind carries no readiness band (D94-L); band-less kinds are `example`, `sketch`, `term`. + +### Intent plane — what and why (13 kinds) + +| Kind | Code | Modality of claim | Source-question | Bands | +| --- | --- | --- | --- | --- | +| `goal` | G | Value / outcome claim | "What outcome are we after?" | grounding | +| `thesis` | TH | Position / bet claim | "Who is this for, and why does it matter?" | grounding | +| `term` | T | Vocabulary commitment | "What do we mean by X?" | — | +| `context` | CTX | Descriptive claim | "What is true about the world this lives in?" | grounding, elicitation | +| `story` | ST | Intra-spec grouping | "What cluster of behavior does this belong to?" 
| elicitation | +| `unknown` | UNK | Known-unknown claim | "What can't we answer yet but must accommodate?" | elicitation | +| `requirement` | REQ | Obligation claim | "What must the system do?" | commitment | +| `assumption` | A | Deferred-falsifiable belief | "What might be false?" | elicitation | +| `constraint` | CON | Boundary claim | "What does this rule out?" | grounding, elicitation | +| `invariant` | INV | Preservation claim | "What must never be broken?" | elicitation | +| `decision` | D | Choice claim | "What did we pick among real alternatives?" | elicitation | +| `criterion` | AC | Oracle claim | "How will we judge that it holds?" | commitment | +| `example` | EX | Witness / disambiguator | "What concrete case would settle this?" | — | + +What is new relative to the salvaged nine-kind doc: + +- **`thesis`** (TH) — the who/what/why/for-whom framing, target user, problem theory, product bet (La Carte Blanche style, D56-L). The old doc folded this into `goal`/`context`. A goal commits the team to a target; a thesis stakes a refutable position about who the work is for and why. +- **`term`** (T) — canonical naming commitments / ubiquitous language. The old doc explicitly said `term` was *not* part of the typed-claim kind set "until a future lexicon model promotes terms into graph-addressable claim records." **That future arrived:** `term` is now a first-class, graph-addressable intent kind. It carries a required `detail` payload (`definition`, optional `aliases`) and is band-less. +- **`story`** (ST) — mid-level narrative grouping inside one spec (a Gherkin `Feature` expressed inside a single spec; ONTOLOGY_REVIEW_PROTOCOL §6.5). Adds no edge of its own — it reuses `composition` (story → requirement) and `witness` (criterion → requirement). +- **`unknown`** (UNK) — a known-unknown: a domain uncertainty not presently answerable that the spec or plan must structurally accommodate. It completes the epistemic triad (below). 
+ +`thesis` carries the conceptual weight the salvaged doc's earlier drafts wanted to assign to a renamed `claim` kind; the ONTOLOGY_REVIEW_PROTOCOL §6.2 proposed `thesis → claim`, but the code kept `thesis`. "Claim" is now an **umbrella vocabulary term** (D61-L) for the truth-bearing intent kinds (`requirement`, `assumption`, `constraint`, `invariant`, `decision`, `criterion`, `example`), not a node kind. + +### Oracle plane — how we know (4 kinds) + +| Kind | Code | Role | Band | +| --- | --- | --- | --- | +| `check` | CH | A concrete verification check (a test, assertion, step-def) | projection | +| `vv_method` | VV | A verification method (prover / solver / golden / probe family) | projection | +| `evidence` | E | Observed evidence | projection | +| `vv_obligation` | O | A proof / verification obligation | projection | + +The salvaged doc's `criterion` subtypes (`acceptance`, `test`, `manual_review`, `runtime_check`, `proof`, `observability`) are reconstructed here as: **the intent-plane `criterion`** (the oracle *claim* — how we judge a property) plus **oracle-plane nodes** (the concrete machinery). The discrimination the subtypes carried is preserved as the intent/oracle plane boundary, not as a subtype enum. Link a concrete oracle to the claim it judges with a `witness` edge. 
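As a rough illustration of that last step, the sketch below builds a `witness` edge from a concrete oracle-plane node to the claim it judges. The trimmed `WitnessEdge` shape, the `linkOracleToClaim` helper, and the node codes `CH1` / `AC1` are hypothetical. It also assumes the oracle sits on the source side of the edge. The real schema in `edges.ts` may differ.

```typescript
// Hypothetical, trimmed edge shape for illustration only (not the real GraphEdge).
type Stance = 'for' | 'against';
type Basis = 'explicit' | 'implicit';

interface WitnessEdge {
  category: 'witness';
  sourceId: string; // assumed: the oracle endpoint, e.g. a `check` node
  targetId: string; // assumed: the claim being judged
  stance: Stance;   // witness edges always carry a stance
  basis: Basis;
}

// Link a concrete oracle to the claim it judges.
function linkOracleToClaim(
  oracleId: string,
  claimId: string,
  stance: Stance = 'for',
  basis: Basis = 'explicit',
): WitnessEdge {
  return { category: 'witness', sourceId: oracleId, targetId: claimId, stance, basis };
}

// A check that supports a criterion, and one that refutes it.
const supporting = linkOracleToClaim('CH1', 'AC1');
const refuting = linkOracleToClaim('CH2', 'AC1', 'against');
console.log(supporting, refuting);
```

Defaulting `stance` to `for` is a convenience of this sketch, not something the source prescribes.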
+ +### Design plane — how it's shaped (4 kinds) + +| Kind | Code | Role | Band | +| --- | --- | --- | --- | +| `module` | MOD | An implementation seam / module | projection | +| `interface` | API | An interface / contract surface | projection | +| `entity` | ENT | A data / domain entity | projection | +| `sketch` | SKT | An intentionally lightweight design sketch (advisory, not hardened) | — | + +### Plan plane — how it's sequenced (3 kinds) + +| Kind | Code | Role | Band | +| --- | --- | --- | --- | +| `milestone` | M | A bounded phase | commitment | +| `frontier` | F | The plan / tracker / branch unit | commitment | +| `slice` | S | The buildable implementation unit inside a frontier | commitment | + +### Spec scope is not a node kind + +`spec.kind ∈ product | feature | function` (D89-L) is an **ownership relation to the codebase**, resolved on the spec row, not in the node graph: + +- `product` — the spec owns the whole codebase. +- `feature` — the spec owns a part and a cycle within a brownfield codebase. +- `function` — the spec captures (often formal) verification around a focused area. + +The recurring "feature" intuition the old doc would have modeled as a kind is spec-scope leaking into the node taxonomy. `actor` and `scenario` remain deferred (ONTOLOGY_REVIEW_PROTOCOL §8). + +## Why there are no subtypes + +The salvaged doc gave each of `constraint`, `criterion`, `invariant`, and `example` an enum of subtypes "to keep the top-level kind set small while preserving the discriminations the LLM needs." The current model reached the same goal — a small kind set with preserved discrimination — by a **different mechanism**, and FE-1090 explicitly rejected subtype enums as a parallel ontology carrying cost. The discriminations are preserved in three places instead: + +1. 
**The plane boundary** — what the old `criterion` subtypes split (`test`, `runtime_check`, `proof`, `observability`) now splits across `criterion` (intent oracle-claim) and the oracle-plane kinds (`check`, `vv_method`, `evidence`, `vv_obligation`). +2. **Edge structure + stance** — what the old `example` subtypes split (`positive`, `negative`) is now polarity on a `witness` edge (`stance: for | against`); `not_relevant` is an `exclusion` edge from a `constraint`/non-goal boundary. +3. **`detail.form`** — what method-specific subtypes (Gherkin, formal) carried is now inert `detail.form` payload (D88-L). + +Mapping the old subtype intents onto current mechanisms: + +| Old subtype intent | Current mechanism | +| --- | --- | +| `constraint.non_goal` | `constraint` node + `exclusion` edge to the excluded subject | +| `constraint.scope` / `technical` / `policy` / `resource` / `compatibility` / `environmental` | `constraint` node; the nuance lives in `title`/`body`, not a stored subtype | +| `criterion.acceptance` | `criterion` (the default reading of AC) | +| `criterion.test` / `runtime_check` | oracle-plane `check`, linked by `witness` | +| `criterion.proof` | oracle-plane `vv_obligation` / `vv_method`; `detail.form:"formal"` on the claim | +| `criterion.manual_review` | `criterion` + `vv_method` naming a reviewer rubric | +| `criterion.observability` | oracle-plane `evidence` / `check` | +| `invariant.state` / `transition` / `authority` / `provenance` / `consistency` / `security` / `data_integrity` | `invariant` node; nuance in `title`/`body`; `detail.form:"formal"` when round-tripping a prover | +| `example.positive` | `example` + `witness:for` | +| `example.negative` / counterexample | `example` + `witness:against` | +| `example.edge_case` / `trace` | `example`; the kind of case lives in wording | +| `example.not_relevant` | `example` + `exclusion` edge from a boundary `constraint` | + +`invariant` being first-class (not a `constraint` subtype, as some readings of 
the old doc implied) is load-bearing per D56-L: its operational role differs — **invariants take `dependency` and `witness` edges; constraints take `exclusion` edges.** + +## The epistemic triad: context / assumption / unknown + +The old doc's `context` promotion rules implied a two-way fork between "known" and "might be false." The current model makes this a **three-way informal certainty triad** — a routing heuristic, not a stored `epistemic_status` field (ONTOLOGY_REVIEW_PROTOCOL §6.6): + +- `context` — known / stipulated true for this spec. +- `assumption` — believed enough to proceed, but **deferred-falsifiable** ("what might be false"). +- `unknown` — a known-unknown; explicitly not known, and the system or plan must accommodate that ignorance. + +Do not launder a known-unknown into an assumption to make the graph look complete. Routing for formal work: an **axiom / given → `context` + `detail.form:"given"`** (known *and* load-bearing); load-bearing-ness comes from outgoing `dependency` edges, not from the kind. A **theorem / property → `invariant`** (a preservation claim carrying `witness` edges). + +## Promotion rules + +The interviewer and the capture sweep should treat the kinds as a partial lattice with explicit promotion. The most common drift is `context` — the broadest attractor — absorbing material that deserves a sharper kind. This is the authored judgment in [`graph-authoring-heuristics.md`](../references/graph-authoring-heuristics.md); reproduced here with the triad and the new kinds folded in. 
+ +| If the descriptive material… | Promote to… | +| --- | --- | +| states the desired outcome or why the work matters | `goal` or `thesis` | +| defines a term or naming commitment | `term` | +| must be true for success or safety | `requirement` or `invariant` | +| limits acceptable solutions or scope | `constraint` | +| is believed but might be materially false | `assumption` | +| is an acknowledged unknown that can't simply be answered now | `unknown` | +| chooses among alternatives with durable consequences | `decision` | +| explains how success will be judged | `criterion` or an oracle-plane node | +| gives a concrete case, trace, or counterexample | `example` | +| only helps interpretation, no stronger role yet | keep `context` | + +Cross-kind pairings the old doc named, still true: + +- **`requirement` ↔ `invariant`** — a requirement to *do* X often pairs with an invariant to *preserve* P across the doing of it. +- **`decision` ↔ `invariant`** — the decision captures the choice; the invariant captures the rule that must keep holding after it. +- **`assumption` retirement** — a validated assumption does not become a requirement. It becomes a `decision` (if validation forced a choice) or it is retired as confirmed `context`; dependents stop carrying the assumption dependency. + +## Decision-capture criteria + +Unchanged judgment, reconciled fields. A claim becomes a `decision` only if **all** hold (the old doc's five tests survive in spirit): + +1. Plausible alternatives existed. +2. The choice is durable — it constrains future design, implementation, or interpretation. +3. The choice is explicit — statable as "we chose A over B/C," not as a description of current behavior. +4. At least one rejected alternative can be named. +5. There is a rationale. + +**Required `detail` fields, reconciled to code** ([`nodes.ts`](../../../graph/schema/nodes.ts) `DecisionDetail`): `chosen_option`, `rejected` (≥ 1), `rationale`.
The old doc also required `scope` and `consequences`; the current schema **dropped both** — put scope and downstream consequences in the node `body` or express them with edges (`exclusion` for what the decision rules out, `dependency` for what now relies on it). Do not invent decision-detail fields. + +## Classification guide + +When the capture sweep turns an answered turn into graph truth, a one-line rule per kind decides how to classify a span. Abstain rather than guess; speculative captures degrade graph signal and should route to an `elicitation_gap` instead. + +| Kind | One-line classification rule | +| --- | --- | +| `goal` | "X so that Y" / "we want Y" — outcome, no implementation committed | +| `thesis` | "this is for X because…" — target user / problem theory / bet | +| `term` | "by X we mean…" — naming commitment | +| `context` | descriptive present-tense fact that does not commit the system | +| `story` | "this group of behavior is about…" — intra-spec cluster | +| `unknown` | "a known unknown is…" — can't answer now, must accommodate | +| `constraint` | "must not", "cannot", "out of scope", "only if" — bounds solution space | +| `assumption` | "we think", "probably", "if X is true" — material belief that could be wrong | +| `decision` | "we chose A over B because" — see decision-capture criteria | +| `requirement` | "the system shall" / "must do" — obligation | +| `invariant` | "always true", "never", "must remain" — preservation across states/transitions | +| `criterion` | "we'll know it works when", "tested by", "we'll review for" — oracle for a property | +| `example` | "for instance", "like when", "what about the case where" — concrete witness | + +## Readiness bands replace phases + +The old doc mapped capture to four **phases** (grounding / design / requirements review / criteria review). 
The current model derives a **readiness band** per kind (D64-L/D94-L via `bandsForKind`) over four bands — `grounding`, `elicitation`, `projection`, `commitment`. Bands guide questioning and projection; **they do not gate graph truth.** If the user states a later-band item early, capture it honestly with the right kind and basis. + +| Band | What it gathers | Kinds (intent unless noted) | +| --- | --- | --- | +| `grounding` | the starting frame | `goal`, `thesis`, `context`, `constraint` | +| `elicitation` | the working middle | `context`, `story`, `unknown`, `assumption`, `constraint`, `invariant`, `decision` | +| `projection` | materialized structure | oracle + design plane kinds (`check`, `vv_method`, `evidence`, `vv_obligation`, `module`, `interface`, `entity`) | +| `commitment` | hardened obligations | `requirement`, `criterion`; plan plane (`milestone`, `frontier`, `slice`) | +| `—` (band-less) | always-available | `term`, `example`, `sketch` | + +The conceptual shift the old doc anticipated holds: **hardening is requirements + invariants + criteria + examples**, with preservation claims and witness claims durable rather than conversational. Operationally, the runtime exposes only two modes (D98-L): **`SPEC`** runs the elicitor (the band ladder above); **`CODE`** runs the executor. The old per-phase "materialized at review acceptance" column is now the `basis` distinction (below) plus review-set acceptance. + +## Edges: nine closed structural categories + +The old doc proposed a **five-family relation taxonomy** with open named relations (`derived_from`, `motivated_by`, `rules_out`, `tested_by`, …). The current model (D51-L) is a **closed set of nine structural categories** with role-named endpoints and per-category policy. The named-relation dialects are retired as edge categories — do **not** use `derived_from`, `motivated_by`, `rules_out`, `counterexample_for`, or `tested_by` as categories. 
The category metadata is the source of truth ([`category-policy.ts`](../../../graph/policy/category-policy.ts)); reproduced for legibility: + +| Category | Endpoint roles | Affected | Impact | Stance | Criteria help | Projection effect | +| --- | --- | --- | --- | --- | --- | --- | +| `dependency` | dependency → dependent | target | cascade | — | no | none | +| `witness` | oracle → claim | source | advisory | required | yes | none | +| `rationale` | support → claim | source | advisory | required | no | none | +| `realization` | abstract → concrete | target | advisory | — | no | none | +| `refinement` | abstract → concrete | target | advisory | — | no | none | +| `exclusion` | boundary → subject | target | advisory | — | no | none | +| `composition` | whole → part | source | advisory | — | no | none | +| `cross_reference` | peer → peer | — | none | — | no | none | +| `supersession` | successor → predecessor | source | advisory | — | no | hide_predecessor_from_active_context | + +Stance (`for | against`) is **required** on `witness` and `rationale`, **invalid** everywhere else. 
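The stance rule above can be sketched as a small legality check. This is a minimal illustration only: the `EdgeCategory` union is copied from the table, while `STANCE_REQUIRED` and `stanceError` are hypothetical names, not the actual exports of `category-policy.ts`.

```typescript
// The nine closed edge categories, as listed in the table above.
type EdgeCategory =
  | 'dependency' | 'witness' | 'rationale'
  | 'realization' | 'refinement' | 'exclusion'
  | 'composition' | 'cross_reference' | 'supersession';

type Stance = 'for' | 'against';

// Hypothetical lookup: only these two categories carry a stance.
const STANCE_REQUIRED: ReadonlySet<EdgeCategory> =
  new Set<EdgeCategory>(['witness', 'rationale']);

// Returns an error message when the stance rule is violated, otherwise null.
function stanceError(category: EdgeCategory, stance?: Stance): string | null {
  const required = STANCE_REQUIRED.has(category);
  if (required && stance === undefined) {
    return `${category} edges require a stance`;
  }
  if (!required && stance !== undefined) {
    return `${category} edges must not carry a stance`;
  }
  return null;
}

console.log(stanceError('witness'));           // a message: stance is missing
console.log(stanceError('rationale', 'for'));  // null: legal
console.log(stanceError('dependency', 'for')); // a message: stance is invalid here
```

Returning a message rather than throwing keeps the check usable for collecting several violations in one pass; a real validator may well choose otherwise.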
+ +### Old families → current categories + +| Old family (old relations) | Current category | +| --- | --- | +| Justification — `motivated_by`, `supports` | `rationale` (`stance: for`) | +| Justification — `derived_from` | `dependency` (reliance) or `refinement` (specialization), per intent | +| Dependency — `depends_on`, `assumes`, `requires` | `dependency` | +| Boundary — `constrains`, `excludes`, `rules_out`, `bounds_scope_of` | `exclusion` | +| Refinement — `refines`, `specializes` | `refinement` | +| Refinement — `decomposes` | `composition` | +| Verification — `verifies`, `illustrates`, `disambiguates`, `tested_by` | `witness` (`stance: for`) | +| Verification — `counterexample_for` | `witness` (`stance: against`) | +| (catch-all `related_to`) | `cross_reference` | +| (replacement lineage) | `supersession` | + +### Negative knowledge is first-class + +The old doc's most important insight — *intent is clarified by ruling out plausible interpretations* — survives, carried by stance and `exclusion` rather than by negative relation kinds: + +```pseudo +counterexample / rejected interpretation: + EX2: rejected review item appears in export + witness oracle: EX2 claim: INV4 stance: against + +out-of-scope disambiguator: + EX3: importing old local dev fixtures + exclusion boundary: CON2 subject: EX3 +``` + +Prefer a concrete `example` plus `witness:against`, or an `exclusion` edge, over vague prose ("not that"). Contradiction between two accepted claims is **not** an edge: with the `conflict` edge deliberately deferred, a contradiction surfaces as a `reconciliation_need` of kind `semantic_conflict` (ONTOLOGY_REVIEW_PROTOCOL §8). + +## Edge and node records: basis, not support/status + +The old doc's `KnowledgeEdge` carried `support` (explicit / strong_inference / weak_candidate) and `status` (proposed / accepted / rejected / stale) plus `provenanceTurnId`. 
The current `GraphEdge` ([`edges.ts`](../../../graph/schema/edges.ts)) and `GraphNode` ([`nodes.ts`](../../../graph/schema/nodes.ts)) are leaner: + +```ts +interface GraphEdge { + category: EdgeCategory // one of the nine + sourceId, targetId: NodeId // storage order carries NO impact meaning + stance?: 'for' | 'against' // required iff witness | rationale + basis: 'explicit' | 'implicit' // approval directness (D63-L) + rationale?: string // why the relation holds + // + id, specId, createdAtLsn, updatedAtLsn +} + +interface GraphNode { + plane, kind, kindOrdinal // kind drives behavior; code = label+ordinal (D62-L) + title, body? + basis: 'explicit' | 'implicit' + source?: string // lightweight epistemic attribution text, not policy + detail?: NodeDetail // decision | term | claim-form union + // + id, specId, createdAtLsn, updatedAtLsn +} +``` + +How the old fields map: + +- **`support` → `basis`.** Approval *directness*, two values: `explicit` (user stated it or approved the exact item in a review set) vs `implicit` (user accepted a concept and the agent materialized the specific item without per-item review, D63-L). The old "weak_candidate" tier does **not** become graph truth at all — it routes to an `elicitation_gap` or a review-set draft. +- **`status` → absent.** Accepted nodes and edges are **present-or-absent** (no mutable `status`). `proposed` lives in review-set drafts; `rejected` is absence plus `change_log` audit; `stale` is a `reconciliation_need`, not a mutated field. +- **`provenanceTurnId` → retired.** `change_log` owns the full audit trail keyed by LSN (D55-L). Transcript-entry pointers are fragile under compaction. +- **`rationale` → kept** on edges; `source` on nodes carries free-form epistemic attribution ("stakeholder", "regulatory", "derived"). + +Do not re-introduce `support`, `status`, `provenanceTurnId`, `createdBy`, or per-claim `checkability`/`strength` fields. 
+ +## Node detail payloads + +Two kinds carry required, non-form detail; four kinds carry the inert `detail.form` union (D88-L). Source of truth: [`nodes.ts`](../../../graph/schema/nodes.ts). + +```ts +// required detail +decision: { chosen_option: string; rejected: string[] /* ≥1 */; rationale: string } +term: { definition: string; aliases?: string[] } + +// inert method payload (form-discriminated) +type ClaimForm = + | { form: 'plain' } + | { form: 'gherkin'; given?: string[]; when?: string[]; then: string[] /* ≥1 */ } + | { form: 'formal'; language: string; statement: string } + | { form: 'given'; statement: string } + +requirement | criterion | invariant : form ∈ plain | gherkin | formal +context : form ∈ given +``` + +The closure rule (D87-L): a specification *method* — BDD, EDD, formal verification — earns no kind of its own. It maps onto the ontology as `spec.kind` + `detail.form` + a renderer + a heuristic-set. One shared `form` discriminant across kinds lets a lens query "all `formal`-form nodes in this spec" to round-trip a LEAN/Dafny file regardless of kind. Do **not** infer edge legality, readiness, or commitment strength from `detail.form` — it is structure plus a renderer hook only. + +## Endpoint-relative labels and direction + +The old doc's relation-policy registry called for storing `source_label` / `target_label` and `source_change` / `target_change` so a snapshot centered on either endpoint reads correctly and so directionality is never recovered from a verb name. The current model implements exactly this split across two projections that both read `category-policy.ts`: + +- **`projection/labels.ts`** — anchor-relative phrasing. A two-tier table keyed on `(category, anchorRole, stance)` (≈18 base cells) plus a small tier-2 refinement keyed on `(category, sourceKind, targetKind)`. Renderers never leak the structural vocabulary. 
+- **`projection/direction.ts`** — upstream / downstream / lateral, read from the `affected` endpoint in the policy table, **not** from storage geometry. "Downstream" is the endpoint that needs reconciliation when the other changes. + +Base anchor-relative labels (from [`labels.ts`](../../../graph/projection/labels.ts)): + +| Category | Anchor = source | Anchor = target | +| --- | --- | --- | +| `dependency` | required by | depends on | +| `witness` | witnesses / refutes | witnessed by / challenged by | +| `rationale` | supports / argues against | motivated by / opposed by | +| `realization` | realized by | realizes | +| `refinement` | refined by | refines | +| `exclusion` | bounds | bounded by | +| `composition` | contains | part of | +| `supersession` | supersedes | superseded by | +| `cross_reference` | related to | related to | + +Tier-2 refinements sharpen a few `realization` verbs by endpoint kind: requirement→module / interface→module render "implemented by" / "implements"; requirement→slice render "established by" / "establishes"; invariant→requirement render "expressed by" / "expresses". + +The old doc's worry — "context packs and reconciliation never recover directionality from verb names alone" — is now an enforced invariant: directionality comes from `category` + `affected`, and labels are projections of `category` + `anchorRole` + `stance`. + +## Edge-local neighborhoods are the useful context unit + +The old doc's strongest practical recommendation — provide **edge-local neighborhoods**, not grouped item lists ("all goals, all requirements") — is the live rendering shape under `src/agents/contexts/graph/`. 
A neighborhood pack anchors on one node and groups incident edges by impact direction with policy-derived labels: + +```text +anchor node +- REQ1: Stage 2 configuration-space requirement (hub anchor) + +upstream nodes (3) — review anchor if these change +- depends on A1: Local-only execution assumption +- expresses INV1: No network call invariant +- bounded by CON1: No cloud dependencies constraint + +downstream nodes (9) — reconcile these if anchor changes +- required by D1: Two-stage split decision {hard} +- implemented by MOD1: SQLite configuration store module +- established by S1: Persist configuration spaces slice +- witnessed by AC1: Airplane-mode acceptance criterion +- challenged by EX1: Network-outage counterexample +- motivated by CTX1: Stakeholder offline-first preference +- opposed by CTX2: Conflicting always-connected note +- part of F1: Configuration-space data frontier +- superseded by REQ2: Revised configuration-space requirement + +lateral nodes (1) — cross-check with anchor if either changes +- related to G1: Offline-first product goal +``` + +The old doc's "dependencies / dependents / evidence / reconciliation / historical" neighborhood selectors map onto: `upstream` (premises the anchor relies on), `downstream` (impact if the anchor changes), `lateral` (`cross_reference` peers), the `criteriaHelpSignal` axis (evidence selection via `witness` edges), and `reconciliation_need` records (the reconciliation selector). The **historical** selector remains as the old doc left it — changeset-derived, not approximated from current graph order — and is not faked before a changeset ledger exists. + +## Topology-driven question ranking + +Once the graph carries kinds and typed edges, the interviewer ranks the next question by topology rather than template. These are ranking heuristics, not automatic writes; low-confidence material routes to an `elicitation_gap`, never to a speculative node. 
They complement the behavioral-kernel signal-phrase routing in [`BEHAVIORAL_KERNELS.md`](../../../../docs/design/BEHAVIORAL_KERNELS.md): kernels suggest *what kind* of question to ask; topology heuristics suggest *which item* to ask about next. + +| Signal | Suggested question shape | +| --- | --- | +| High-fanout `assumption` with thin evidence | "Many claims depend on X. Validate it, or mark the risk?" | +| `requirement` / `invariant` with no `witness` path | "How will we know this holds?" | +| `criterion` not linked to the claim it judges | "Which requirement or invariant does this criterion check?" | +| Candidate `decision` lacking rejected alternatives or rationale | "What did we consider and rule out before choosing this?" | +| `exclusion`/constraints disagreeing about one subject | "These boundaries conflict. Which one wins?" | +| `goal`/`thesis` with no path into requirements, design, or plan | "What would satisfy this in the actual system?" | +| Requirement with no example and high ambiguity | "What concrete case would settle this interpretation?" | +| `unknown` blocking a design or plan edge | "Accommodate it, investigate it, or narrow scope around it?" | + +This substrate is the `elicitation_gaps` register (D65-L): a flat table of prospective coverage obligations, each with a `predicate` (`presence` is structurally derivable; `field` and `coverage` are not yet supported; `manual` rides disposition), a `band`, an `importance`, and a `disposition` (`open` / `answered` / `not_applicable` / `irrelevant` / `reopened`). Structural coverage is derived from the graph at read time, not stored. + +## Translation table — user phrases to kinds + +The bridge between user vocabulary and the ontology. Treat these as **strong priors**, not rigid rules; the classification rule still governs the final assignment. 
+ +| User phrase pattern | Most likely route | +| --- | --- | +| "we want Y" / "X so that Y" | `goal` | +| "this is for X because…" | `thesis` | +| "by X we mean…" | `term` | +| "true about the environment / repo / domain…" | `context` (unless promotable) | +| "a known unknown is…" | `unknown` | +| "always true that…" / "should never…" / "must remain" | `invariant` | +| "valid transition from X to Y" | `invariant` (a transition-flavored one) | +| "must not" / "cannot" / "out of scope" / "we don't care about X" | `constraint` (with an `exclusion` edge for non-goals) | +| "probably" / "we think" / "if X is true" | `assumption` | +| "the system must…" | `requirement` | +| "we picked Y over Z because…" | `decision` | +| "we'll know it works when…" / "tested by" | `criterion` or an oracle-plane node | +| "for example, when…" | `example` (link `witness:for`) | +| "but what about the case where…" | `example` (edge case) | +| "we wouldn't want…" / counterexample | `example` + `witness:against`, or `constraint` | +| "another plausible interpretation is…" | `example` (a disambiguating one) | +| "module" / "API" / "entity" / "sketch" | design-plane kind | +| "test" / "proof" / "evidence" / "verification method" | oracle-plane kind | +| "milestone" / "frontier" / "slice" | plan-plane kind | + +## Progressive checkability is conduct, not schema + +The old doc proposed a stored `checkability` ladder and a `ClaimMetadata` record (`checkability`, `oracle`, `strength`, `validTraces`, `invalidTraces`). FE-1090 **rejected these as carrying cost**: claim-level `checkability` / `strength` / trace-list fields are not added to the schema. The *discipline* survives as **oracle conduct**, documented for the oracle lens in [`generate-proposal/references/oracle.md`](../../skills/methods/generate-proposal/references/oracle.md). 
+ +The ladder is a reasoning tool, weakest sufficient artifact first: + +``` +human review → example / counterexample → regression / golden + → runtime contract → property / model-based rule + → probe / transcript → proof obligation +``` + +Emit the **weakest sufficient** artifact for the claim at hand, and express it as existing graph vocabulary (`criterion`, `check`, `vv_method`, `evidence`, `vv_obligation`, `example`, `witness` edges). The old `strength` field's honesty function ("checked on three examples" ≠ "proved for all reachable states") is preserved as a **prose disclosure** of evidence breadth and blind spots in the oracle ensemble, not as graph metadata. If a future scoped reader proves it needs evidence breadth as schema, that is a fresh decision — it is not assumed here. + +## Consumers of the typed graph + +What reads this ontology today, and which part each consumer leans on: + +| Consumer | Leans on | +| --- | --- | +| Capture sweep (`methods/capture`) | kinds, promotion rules, `basis`, capture routes (graph truth vs gap vs reconciliation need) | +| Commit-graph (`methods/commit-graph`) | edge categories, role-named `mutate_graph` grammar, stance legality | +| Generate-proposal (`methods/generate-proposal`) | per-plane fan-out (intent / design / oracle), review-set fan-in; oracle conduct ladder | +| Review-for-gaps (`methods/review-for-gaps`) | `elicitation_gaps` predicates, topology question ranking | +| Reconciliation flow | `affected` endpoint + impact direction; `reconciliation_need` (incl. 
`semantic_conflict`) | +| Neighborhood / overview rendering (`contexts/graph/`) | anchor-relative labels, upstream/downstream/lateral, edge-local packs | +| Spec / plan document outputs (`contexts/spec/`, `contexts/plan/`) | kinds, bands, codes | +| Export grounding | neighborhood traversal to explain why each requirement is present | + +## What this draft deliberately does not do + +- It does not re-propose kind subtypes, a `conflict` edge, an `epistemic_status` field, `actor` / `scenario` kinds, the speculation/`bench` plane, or a project graph — all deferred or rejected (ONTOLOGY_REVIEW_PROTOCOL §8, FE-1090). Where the original doc leaned on them, the mapping above shows the current carrier. +- It does not add `checkability` / `strength` / `validTraces` / `invalidTraces` to any record. +- It is not wired into runtime payload and claims no skill reader. Promoting any section into `src/agents/contexts/references/` (with a named skill reader, per D97-L) would be a separate, deliberate decision. diff --git a/src/agents/contexts/drafting/slice-detail-payloads.md b/src/agents/contexts/drafting/slice-detail-payloads.md new file mode 100644 index 000000000..9ed2d9389 --- /dev/null +++ b/src/agents/contexts/drafting/slice-detail-payloads.md @@ -0,0 +1,74 @@ +# Slice: node detail payloads + +> Draft injectable context slice (scratch; not wired). Inject when an agent creates a `decision` or `term` node, or attaches a `detail.form` to a claim/`context` node. Source of truth is [`graph-ontology.md`](../references/graph-ontology.md) (projected from `src/graph/schema/nodes.ts`). + +Two kinds require a non-form `detail` payload. Four kinds accept the inert `detail.form` method payload. **`kind` drives behavior; `detail.form` is inert** — it changes how a node renders or round-trips, never its readiness band, edge legality, or commitment strength. 
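The "`kind` drives behavior; `detail.form` is inert" split can be sketched as a small legality lookup. This is only an illustration under stated assumptions: `DetailForm`, `FORM_LEGALITY`, and `isLegalForm` are hypothetical names, not the real schema in `src/graph/schema/nodes.ts`. The kind-to-form pairs are copied from the legality list under "Claim detail.form" in this slice.

```typescript
// Hypothetical sketch; the authoritative schema is src/graph/schema/nodes.ts.
type DetailForm = "plain" | "gherkin" | "formal" | "given";

// Which kinds accept which forms, as listed in this slice.
// Kinds missing from the map accept no detail.form at all.
const FORM_LEGALITY = new Map<string, readonly DetailForm[]>([
  ["requirement", ["plain", "gherkin", "formal"]],
  ["criterion", ["plain", "gherkin", "formal"]],
  ["invariant", ["plain", "gherkin", "formal"]],
  ["context", ["given"]],
]);

// Legality depends on the kind; the form itself never feeds back into
// readiness band, edge legality, or commitment strength.
function isLegalForm(kind: string, form: DetailForm): boolean {
  const allowed = FORM_LEGALITY.get(kind) ?? [];
  return allowed.indexOf(form) !== -1;
}
```

A caller might run `isLegalForm(node.kind, node.detail.form)` before accepting a payload; the check deliberately reads nothing except the kind and the form.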
+ +## Required detail + +```yaml +# decision — all three required; rejected must be non-empty +decision: + chosen_option: string # the selected option/position + rejected: string[] # >= 1 named alternative + rationale: string # why the chosen option won + # scope/consequences are NOT fields — put them in body or express with edges + +# term — definition required; aliases optional +term: + definition: string # canonical definition + aliases: string[]? # optional alternate names +``` + +A `decision` without a named rejected alternative is just a description — capture it as `context` or `decision` only once an alternative can be named (see `slice-kind-selection.md`). + +## Claim detail.form + +```yaml +# legality: which kinds accept which forms +requirement: form in [plain, gherkin, formal] +criterion: form in [plain, gherkin, formal] +invariant: form in [plain, gherkin, formal] +context: form in [given] +# all other kinds: no detail.form + +# form payloads (discriminated by `form`) +plain: + form: literal "plain" # default; no structured payload + +gherkin: + form: literal "gherkin" + given: string[]? # preconditions + when: string[]? # actions + then: string[] # outcomes — >= 1 required + +formal: + form: literal "formal" + language: string # e.g. lean | dafny + statement: string # formal statement text for round-trip + +given: + form: literal "given" # context only + statement: string # stipulated axiom/given +``` + +## Routing forms + +``` +policy: first-match + +material | kind + form +--------------------------------------------------|----------------------------- +plain prose claim | + form: plain +Given/When/Then behavior spec | + form: gherkin +theorem/property for a prover (LEAN/Dafny) | invariant + form: formal +stipulated axiom, load-bearing & known-true | context + form: given + +notes: + - load-bearing-ness of a `given` comes from its outgoing `dependency` edges, not the form. 
+ - one shared `form` vocabulary across kinds lets a lens collect all `formal`-form nodes to round-trip one prover file. +``` + +## Do not invent + +Accepted nodes/edges carry no `status`, `support`, `provenanceTurnId`, `createdBy`, `checkability`, or `strength` fields. Approval directness is `basis: explicit | implicit`; epistemic attribution is the free-text `source` on a node; audit/provenance lives in `change_log` by LSN; staleness is a `reconciliation_need`. diff --git a/src/agents/contexts/drafting/slice-edge-authoring.md b/src/agents/contexts/drafting/slice-edge-authoring.md new file mode 100644 index 000000000..efa653904 --- /dev/null +++ b/src/agents/contexts/drafting/slice-edge-authoring.md @@ -0,0 +1,90 @@ +# Slice: edge authoring + +> Draft injectable context slice (scratch; not wired). Inject when an agent is about to relate two nodes. Source of truth for edge-category policy is [`graph-ontology.md`](../references/graph-ontology.md) (projected from `src/graph/policy/category-policy.ts`); authoring judgment is [`graph-authoring-heuristics.md`](../references/graph-authoring-heuristics.md). + +Edges are a **closed set of nine structural categories** with role-named endpoints. Do not use retired named-relation dialects (`derived_from`, `motivated_by`, `rules_out`, `counterexample_for`, `tested_by`) as categories — they map onto the nine below. Endpoint storage order carries no meaning; category metadata owns direction. 
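As a rough sketch of what "closed set with role-named endpoints" could look like in code, the snippet below lists the nine categories with their roles and checks the stance rule. All names (`CategorySpec`, `CATEGORIES`, `validateEdge`) are assumptions made for illustration; the authoritative policy is projected from `src/graph/policy/category-policy.ts`, and the role names are taken from the category table in this slice.

```typescript
// Hypothetical sketch; not the real category-policy module.
type Stance = "for" | "against";

interface CategorySpec {
  sourceRole: string;
  targetRole: string;
  stanceRequired: boolean;
}

// The closed set of nine categories with their role-named endpoints.
const CATEGORIES = new Map<string, CategorySpec>([
  ["dependency", { sourceRole: "dependency", targetRole: "dependent", stanceRequired: false }],
  ["witness", { sourceRole: "oracle", targetRole: "claim", stanceRequired: true }],
  ["rationale", { sourceRole: "support", targetRole: "claim", stanceRequired: true }],
  ["realization", { sourceRole: "abstract", targetRole: "concrete", stanceRequired: false }],
  ["refinement", { sourceRole: "abstract", targetRole: "concrete", stanceRequired: false }],
  ["exclusion", { sourceRole: "boundary", targetRole: "subject", stanceRequired: false }],
  ["composition", { sourceRole: "whole", targetRole: "part", stanceRequired: false }],
  ["cross_reference", { sourceRole: "peer", targetRole: "peer", stanceRequired: false }],
  ["supersession", { sourceRole: "successor", targetRole: "predecessor", stanceRequired: false }],
]);

// Returns null when the category/stance pair is legal, else a reason string.
function validateEdge(category: string, stance?: Stance): string | null {
  const spec = CATEGORIES.get(category);
  if (spec === undefined) return `unknown category: ${category}`;
  if (spec.stanceRequired && stance === undefined) return `${category} requires a stance`;
  if (!spec.stanceRequired && stance !== undefined) return `${category} takes no stance`;
  return null;
}
```

Under this sketch a retired dialect name such as `derived_from` is rejected as an unknown category rather than silently accepted.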
+ +## The nine categories + +``` +policy: exclusive (each edge is exactly one category) + +category | source role -> target role | affected | impact | stance +----------------|------------------------------|----------|----------|---------- +dependency | dependency -> dependent | target | cascade | - +witness | oracle -> claim | source | advisory | required +rationale | support -> claim | source | advisory | required +realization | abstract -> concrete | target | advisory | - +refinement | abstract -> concrete | target | advisory | - +exclusion | boundary -> subject | target | advisory | - +composition | whole -> part | source | advisory | - +cross_reference | peer -> peer | - | none | - +supersession | successor -> predecessor | source | advisory | - +``` + +Stance (`for | against`) is **required** on `witness` and `rationale`, **omitted** everywhere else. `supersession` hides the predecessor from active context and must stay acyclic. + +## If you mean… use… + +``` +policy: exclusive + +if you mean… | use category | stance +------------------------------------------------------------|--------------|-------- +one claim relies on another staying true | dependency | - +an oracle/example/check/evidence supports a claim | witness | for +an oracle/example/counterexample refutes a claim | witness | against +a goal/thesis/argument motivates a claim | rationale | for +an argument opposes a claim | rationale | against +an abstract claim is implemented/expressed by a concrete one| realization | - +a general claim/model is specialized by a more specific one | refinement | - +a boundary/non-goal/constraint limits a subject | exclusion | - +a whole contains a part | composition | - +two items relate but no stronger relation is justified | cross_reference | - +a newer item replaces an older item | supersession | - +``` + +## Role-named grammar + +Author with role names, never with `source`/`target` geometry: + +``` +create_edge dependency: dependency: A1 dependent: REQ1 
+create_edge witness: oracle: AC1 claim: REQ1 stance: for +create_edge witness: oracle: EX2 claim: INV4 stance: against +create_edge rationale: support: G2 claim: REQ1 stance: for +create_edge realization: abstract: REQ1 concrete: MOD1 +create_edge refinement: abstract: REQ1 concrete: REQ2 +create_edge exclusion: boundary: CON2 subject: EX3 +create_edge composition: whole: F1 part: S1 +create_edge supersession: successor: REQ2 predecessor: REQ1 +create_edge cross_reference: peer: REQ1 peer: G1 +``` + +## Negative knowledge is first-class + +Intent is often clarified by what is ruled out. Prefer a concrete node + stance/exclusion over vague "not that" prose. + +``` +counterexample / rejected interpretation: + EX2: rejected review item appears in export + create_edge witness: oracle: EX2 claim: INV4 stance: against + +out-of-scope disambiguator: + EX3: importing old local dev fixtures + create_edge exclusion: boundary: CON2 subject: EX3 +``` + +Contradiction between two accepted claims is **not** an edge — there is no `conflict` category. Raise a `reconciliation_need` of kind `semantic_conflict`. + +## Author edges only between settled endpoints + +``` +chain relation-authoring: + candidate relation + -> both endpoints settled enough to be graph truth? + x> no: spawn/reuse an elicitation_gap for the missing endpoint; skip the edge + -> contradicts existing truth? + x> yes: raise a reconciliation_need; do not overwrite or add a competing edge + -> create_edge with role-named endpoints + stance (witness/rationale only) +``` diff --git a/src/agents/contexts/drafting/slice-kind-selection.md b/src/agents/contexts/drafting/slice-kind-selection.md new file mode 100644 index 000000000..40f638e8a --- /dev/null +++ b/src/agents/contexts/drafting/slice-kind-selection.md @@ -0,0 +1,90 @@ +# Slice: node-kind selection + +> Draft injectable context slice (scratch; not wired). Inject when an agent is about to write graph truth and must pick a node `kind`. 
Source of truth for the exact kind list/codes/bands is [`graph-ontology.md`](../references/graph-ontology.md); authoring judgment is [`graph-authoring-heuristics.md`](../references/graph-authoring-heuristics.md). This slice is a compact decision aid, not authority. + +Pick the `kind` by the **role the material plays**, not the words the user used. `kind` drives behavior (readiness band, edge legality, the source-question you answer next). When support is weak, do not guess a kind — route to an elicitation gap (see `slice-promotion-capture.md`). + +## Intent kinds — modality and source-question + +| Kind | Code | Modality of claim | Answer this | Band | +| --- | --- | --- | --- | --- | +| `goal` | G | value / outcome | "What outcome are we after?" | grounding | +| `thesis` | TH | position / bet | "Who is this for, and why does it matter?" | grounding | +| `term` | T | vocabulary commitment | "What do we mean by X?" | — | +| `context` | CTX | descriptive | "What is true about the world this lives in?" | grounding, elicitation | +| `story` | ST | intra-spec grouping | "What cluster of behavior is this part of?" | elicitation | +| `unknown` | UNK | known-unknown | "What can't we answer yet but must accommodate?" | elicitation | +| `requirement` | REQ | obligation | "What must the system do?" | commitment | +| `assumption` | A | deferred-falsifiable belief | "What might be false?" | elicitation | +| `constraint` | CON | boundary | "What does this rule out?" | grounding, elicitation | +| `invariant` | INV | preservation | "What must never be broken?" | elicitation | +| `decision` | D | choice | "What did we pick among real alternatives?" | elicitation | +| `criterion` | AC | oracle | "How will we judge that it holds?" | commitment | +| `example` | EX | witness / disambiguator | "What concrete case would settle this?" 
| — | + +Other planes (use when the material is no longer intent capture): oracle — `check` (CH), `vv_method` (VV), `evidence` (E), `vv_obligation` (O); design — `module` (MOD), `interface` (API), `entity` (ENT), `sketch` (SKT); plan — `milestone` (M), `frontier` (F), `slice` (S). + +## Signal → kind (first-match) + +`context` is the broadest attractor, so it sits last: try every sharper kind before filing as `context`. + +``` +policy: first-match +context: classifying one span of settled material + +rule | signal in the material | -> kind +-----|--------------------------------------------------------|------------------ +R1 | "we chose A over B because…" (real alternatives ruled out) | decision +R2 | "the system must / shall…" | requirement +R3 | "always / never / must remain true while it runs" | invariant +R4 | "must not / cannot / out of scope / we don't care about"| constraint +R5 | "we'll know it works when / tested by / reviewed for" | criterion +R6 | "for instance / like when / the case where / counterexample" | example +R7 | "probably / we think / assuming / if X holds" (could be false) | assumption +R8 | "we don't know yet / open question / TBD" (must accommodate) | unknown +R9 | "by X we mean / call it X" (naming commitment) | term +R10 | "this is for X because…" (target + problem theory) | thesis +R11 | "we want / so that…" (outcome, no implementation) | goal +R12 | "this group of behavior is about…" (mid-level cluster) | story +R13 | otherwise descriptive, aids interpretation only | context + +notes: + - R3 vs R4: invariant = a property preserved across operation/evolution; constraint = a bound on the solution space. Invariants take dependency/witness edges; constraints take exclusion edges. + - R1: if no real alternative was ever on the table, it is context, not a decision. + - A claim can spawn a paired node (a requirement to DO X often pairs with an invariant to PRESERVE P). Capture both; relate with edges.
+``` + +## Disambiguate nearby kinds + +``` +goal vs thesis + -> goal commits to a target outcome + -> thesis stakes a refutable position about who/why; carries the problem theory + +context vs assumption vs unknown (epistemic triad) + -> context : known / stipulated true for this spec + -> assumption: believed enough to proceed, but later validation could overturn it + -> unknown : explicitly not known; the spec/plan must accommodate the ignorance + x> do not launder a known-unknown into an assumption to look complete + +constraint vs invariant + -> constraint narrows acceptable solutions (a non-goal, scope, policy, platform bound) + -> invariant protects a property across states/transitions/versions + +criterion vs oracle-plane node + -> criterion : the oracle CLAIM in intent space (how we judge a property) + -> check / vv_method / evidence / vv_obligation : the concrete verification machinery + -> link the concrete oracle to the claim with a `witness` edge + +story vs example + -> story groups related behavior inside the spec (a Gherkin Feature lives here) + -> example is a concrete witness (a Scenario/row is criterion or example) + +sketch vs committed design node + -> sketch : advisory/early design, not yet graph truth + -> module/interface/entity : settled design claim +``` + +## Abstain rule + +Weak classification support is a signal to **stop**, not to pick the nearest kind. Route low-confidence material to an `elicitation_gap` with a question + rationale; route a contradiction with existing truth to a `reconciliation_need`. Speculative captures degrade graph signal. diff --git a/src/agents/contexts/drafting/slice-neighborhood-reading.md b/src/agents/contexts/drafting/slice-neighborhood-reading.md new file mode 100644 index 000000000..7a8e67fad --- /dev/null +++ b/src/agents/contexts/drafting/slice-neighborhood-reading.md @@ -0,0 +1,71 @@ +# Slice: reading an anchored neighborhood + +> Draft injectable context slice (scratch; not wired). 
Inject when an agent consumes an anchored neighborhood / context pack and must reason about consequences, dependencies, or drift. Direction and labels are projected from `src/graph/policy/category-policy.ts` via `src/graph/projection/{labels,direction}.ts`. + +An edge-local neighborhood is a stronger context object than "all goals, all requirements." It anchors on one node and groups incident edges by **impact direction**, each rendered with an **anchor-relative label**. Read the grouping and the label as the meaning — never reconstruct direction from the English verb or from `sourceId`/`targetId`. + +## How to read a pack + +``` +anchor node +- REQ1: Stage 2 configuration-space requirement (hub anchor) + +upstream nodes — review the anchor if these change +- depends on A1: Local-only execution assumption +- expresses INV1: No network call invariant +- bounded by CON1: No cloud dependencies constraint + +downstream nodes — reconcile these if the anchor changes +- required by D1: Two-stage split decision {hard} +- implemented by MOD1: SQLite configuration store module +- witnessed by AC1: Airplane-mode acceptance criterion +- challenged by EX1: Network-outage counterexample +- superseded by REQ2: Revised configuration-space requirement + +lateral nodes — cross-check with the anchor if either changes +- related to G1: Offline-first product goal +``` + +Reading rules: + +- **upstream** = premises the anchor relies on; if they change, **review the anchor**. +- **downstream** = claims affected if the anchor changes; **reconcile them** on edit. `{hard}` marks a cascade-strength (`dependency`) edge. +- **lateral** = symmetric `cross_reference` peers; no impact direction. + +## Anchor-relative label table + +The same edge reads differently from each endpoint. Labels are projections of `(category, anchor end, stance)`. 
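A label projection of this shape could be sketched as a plain lookup keyed by category, anchor end, and stance. The names `LABELS` and `labelFor` are hypothetical and stand in for whatever `src/graph/projection/labels.ts` actually exports. The label strings are the ones in this slice's label table, and the kind-sharpened `realization` verbs are left out for brevity.

```typescript
// Hypothetical sketch of an anchor-relative label lookup.
type AnchorEnd = "source" | "target";
type Stance = "for" | "against";
type LabelEntry = string | Record<Stance, string>;

const LABELS = new Map<string, Record<AnchorEnd, LabelEntry>>([
  ["dependency", { source: "required by", target: "depends on" }],
  ["witness", {
    source: { for: "witnesses", against: "refutes" },
    target: { for: "witnessed by", against: "challenged by" },
  }],
  ["rationale", {
    source: { for: "supports", against: "argues against" },
    target: { for: "motivated by", against: "opposed by" },
  }],
  ["realization", { source: "realized by", target: "realizes" }],
  ["refinement", { source: "refined by", target: "refines" }],
  ["exclusion", { source: "bounds", target: "bounded by" }],
  ["composition", { source: "contains", target: "part of" }],
  ["supersession", { source: "supersedes", target: "superseded by" }],
  ["cross_reference", { source: "related to", target: "related to" }],
]);

// The label depends only on (category, anchor end, stance),
// never on the English verb or on endpoint storage order.
function labelFor(category: string, anchorEnd: AnchorEnd, stance?: Stance): string {
  const row = LABELS.get(category);
  if (row === undefined) throw new Error(`unknown category: ${category}`);
  const entry = row[anchorEnd];
  if (typeof entry === "string") return entry;
  if (stance === undefined) throw new Error(`${category} requires a stance`);
  return entry[stance];
}
```

For example, the same `dependency` edge would render as "depends on" when the anchor sits at the target end and "required by" when it sits at the source end.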
+ +| Category | Anchor = source side | Anchor = target side | +| --- | --- | --- | +| `dependency` | required by | depends on | +| `witness` (for / against) | witnesses / refutes | witnessed by / challenged by | +| `rationale` (for / against) | supports / argues against | motivated by / opposed by | +| `realization` | realized by | realizes | +| `refinement` | refined by | refines | +| `exclusion` | bounds | bounded by | +| `composition` | contains | part of | +| `supersession` | supersedes | superseded by | +| `cross_reference` | related to | related to | + +Kind-sharpened `realization` verbs: requirement/interface → module render "implemented by / implements"; requirement → slice render "established by / establishes"; invariant → requirement render "expressed by / expresses". + +## Direction is metadata, not verb + +``` +x> do NOT infer upstream/downstream from the verb ("depends on" sounds passive) +x> do NOT infer direction from which node is stored as sourceId +-> direction = the category's `affected` endpoint (direction projection) +-> label = the category + anchor end + stance (label projection) +``` + +`reconciliation_need` records are advisory side-channel items, **not** edges; they will not appear as neighborhood edges even though they point at graph state. + +## What a pack is good for + +``` +why does this item stand? -> read upstream (premises, constraints, assumptions, motivating goals) +what breaks if I change it? -> read downstream (impact; reconcile these) +is it verified? -> read witness/criterion/evidence neighbors +is there drift/contradiction?-> check for reconciliation_needs touching the anchor +``` diff --git a/src/agents/contexts/drafting/slice-plane-authoring.md b/src/agents/contexts/drafting/slice-plane-authoring.md new file mode 100644 index 000000000..c92a59640 --- /dev/null +++ b/src/agents/contexts/drafting/slice-plane-authoring.md @@ -0,0 +1,109 @@ +# Slice: authoring by plane + +> Draft injectable context slice (scratch; not wired). 
Inject the whole, or excerpt one section by anchor (`#intent`, `#oracle`, `#design`, `#plan`), when an agent is generating coherent content on that plane. Source of truth: [`graph-ontology.md`](../references/graph-ontology.md) + [`graph-authoring-heuristics.md`](../references/graph-authoring-heuristics.md). Pairs with `slice-kind-selection.md`, `slice-edge-authoring.md`, and `slice-detail-payloads.md`. + +Each plane answers a different concern. Stay on the plane the active work is on; cross-plane links are edges, not kind changes. Promote across planes only when the material genuinely hardens. + +## intent — what and why #intent + +Build the spec's truth-bearing claims. Coherence here means: every goal/thesis has a path toward something that satisfies it, and every commitment is judged. + +``` +tree intent obligations: + goal / thesis value + bet — the why + requirement obligations that serve the why + invariant what must stay true while requirements operate + criterion how each requirement/invariant is judged + example concrete witnesses + counterexamples + context / term the stipulated frame + lexicon + assumption / unknown what might be false / what is not yet known + constraint what the solution space rules out +``` + +``` +coherence checks (intent) + goal/thesis -> has a rationale edge into >=1 requirement? else: gap + requirement -> has a witness path (criterion/example/oracle)? else: gap + requirement -> pairs with an invariant it must not break? consider + decision -> names >=1 rejected alternative + rationale? else: not a decision + assumption -> high fanout + thin evidence? surface risk + constraint/non-goal -> attached to its subject via exclusion? else: vague +``` + +Typical edges: `rationale` (goal→requirement), `dependency` (claim→claim), `exclusion` (constraint→subject), `refinement` (general→specific), `witness` (example→claim, with stance). + +## oracle — how we know #oracle + +Make claims checkable. 
Distinguish the intent-plane `criterion` (the oracle *claim*) from oracle-plane nodes (the concrete *machinery*). Choose the **weakest sufficient** artifact; redundancy across independent oracle families is a feature when it reduces bad degrees of freedom at acceptable cost. + +``` +kinds: check (CH) | vv_method (VV) | evidence (E) | vv_obligation (O) + +ladder (weakest sufficient first — this is conduct, not a stored field): + human review -> example/counterexample -> regression/golden + -> runtime contract -> property/model-based -> probe/transcript -> proof obligation +``` + +``` +coherence checks (oracle) + criterion -> witness edge into the requirement/invariant it judges? else: orphan oracle + check/vv_method -> names the observation that discriminates pass from fail? else: it's a task, not an oracle + ensemble -> blind spots named in prose? (false-positive shape, trigger to revisit) + counterexample -> example + witness:against the claim it falsifies +``` + +Typical edges: `witness` (oracle→claim, `stance: for|against`), `realization` (criterion→check). Express evidence breadth (reviewed / example-backed / regression-covered / enforced / proved) as prose, never as graph metadata. + +## design — how it's shaped #design + +Name the implementation shape that realizes the intent. Keep advisory material as `sketch` until it is settled enough to be graph truth. + +``` +kinds: module (MOD) | interface (API) | entity (ENT) | sketch (SKT) + + module a seam / unit of implementation + interface a contract surface + entity a data/domain entity + sketch intentionally lightweight, not yet hardened +``` + +``` +coherence checks (design) + module/interface -> realizes >=1 requirement (realization edge)? else: unanchored design + entity -> referenced by a requirement/criterion it serves? 
+ sketch -> promote to module/interface/entity once settled; don't leave advisory truth + interface -> precondition as constraint, postcondition as criterion/invariant, hung on the API +``` + +Typical edges: `realization` (requirement→module, renders "implemented by"), `refinement` (model→specialization), `composition` (whole→part), `dependency` (module→module). + +## plan — how it's sequenced #plan + +Sequence the work. A `frontier` is the plan/tracker/branch unit; a `slice` is the buildable unit inside it; a `milestone` is a bounded phase. + +``` +kinds: milestone (M) | frontier (F) | slice (S) + +tree plan containment: + milestone + frontier the tracker/branch unit + slice the buildable unit; establishes requirements +``` + +``` +coherence checks (plan) + frontier -> contains >=1 slice (composition)? + slice -> establishes >=1 requirement (realization, renders "established by")? + frontier -> dependencies mirror intent dependencies, not intra-frontier slice order + milestone-> frontiers map to an invariant bundle to establish +``` + +Typical edges: `composition` (milestone→frontier→slice), `realization` (requirement→slice, renders "established by"), `dependency` (frontier→frontier). + +## Cross-plane discipline + +``` +-> the same concept does not change kind to cross planes; you add an edge +-> readiness bands guide questioning/projection; they do NOT gate truth +-> if the user states a later-plane item early, capture it honestly with the right kind + basis +``` diff --git a/src/agents/contexts/drafting/slice-promotion-capture.md b/src/agents/contexts/drafting/slice-promotion-capture.md new file mode 100644 index 000000000..ee3e5f938 --- /dev/null +++ b/src/agents/contexts/drafting/slice-promotion-capture.md @@ -0,0 +1,74 @@ +# Slice: promotion & capture routing + +> Draft injectable context slice (scratch; not wired). Inject during the capture sweep — turning an answered turn into graph truth, gaps, or reconciliation needs. 
Source of truth is [`graph-authoring-heuristics.md`](../references/graph-authoring-heuristics.md). + +Two disciplines: **promote** descriptive material to its sharpest kind before filing, and **route** each span to the substrate that matches its confidence and conflict state. Capture first, then ask from the updated world. + +## Capture-then-ask + +``` +chain capture-then-ask: + unswept transcript tail + -> classify each span by modality (see slice-kind-selection) + -> promote context to its sharpest kind + -> route by confidence/conflict (table below) + -> mutate_graph / update_elicitation_gaps / raise reconciliation_need + -> compose next question over the updated graph + gaps +``` + +## Promote before filing as context + +`context` is the broadest attractor and the most common misclassification. Promote before writing. + +``` +policy: first-match +context: a span that reads as "descriptive" + +if the material… | -> route to +--------------------------------------------------------|------------------------ +states the desired outcome / why the work matters | goal or thesis +defines a term or naming commitment | term +must be true for success or safety | requirement or invariant +limits acceptable solutions or scope | constraint +is believed but might be materially false | assumption +is an acknowledged unknown, not answerable now | unknown +chooses among alternatives with durable consequences | decision +explains how success will be judged | criterion or oracle node +gives a concrete case / trace / counterexample | example +only aids interpretation, no stronger role yet | keep context +``` + +## Route by confidence and conflict + +``` +policy: first-match +context: where does this span go? 
+ +state of the material | -> route +--------------------------------------------------------|--------------------------------- +directly stated, or exact-review approved | graph truth, basis: explicit +confidently materialized from accepted content | graph truth, basis: implicit +low-confidence noticing / suspicion / possible implication / missing piece | elicitation_gap (question + rationale) +contradicts existing graph truth | reconciliation_need +a batch awaiting human judgment | review-set draft (not accepted truth) + +notes: + - basis is approval DIRECTNESS, not the mutation path. The path lives in change_log. + - rejected proposals are absent from active truth (audit only). There is no `status` field. +``` + +## Substrates at a glance + +``` +graph truth accepted nodes + edges; present-or-absent; no status +elicitation_gaps prospective coverage obligations; flat table; NOT graph nodes + predicate: presence (structural) | field, coverage (unsupported) | manual + disposition: open | answered | not_applicable | irrelevant | reopened +reconciliation_needs retrospective repair obligations; NOT edges + kind: edge_revalidation | possible_relation | possible_duplicate | semantic_conflict +review-set drafts candidate material before acceptance +``` + +## Abstain + +Weak support is a stop signal. Prefer an `elicitation_gap` over a speculative node, and a `reconciliation_need` over a competing edge. Speculative captures degrade graph signal. diff --git a/src/agents/contexts/references/context-slice-index.md b/src/agents/contexts/references/context-slice-index.md new file mode 100644 index 000000000..a1307e586 --- /dev/null +++ b/src/agents/contexts/references/context-slice-index.md @@ -0,0 +1,61 @@ +# Context slice index + +Draft injectable reference. Use this as a selector when composing short-lived LLM context over the graph ontology. Exact vocabulary lives in `graph-ontology.md`; broad graph-authoring judgment lives in `graph-authoring-heuristics.md`. 
+ +## Selection rule + +Inject the smallest slice that matches the current job. Prefer one topical slice plus `neighborhood-consumption-slice.md` over a full ontology dump. + +| Current job | Inject | Avoid | +| --- | --- | --- | +| Capture user/world/spec material into graph truth | `intent-capture-slice.md` | oracle/design/plan slices unless the user gave that material directly | +| Project accepted intent into modules, interfaces, entities, or sketches | `design-projection-slice.md` + relevant neighborhoods | free-floating design guesses with no intent anchor | +| Design verification, criteria, tests, probes, evidence, or proof obligations | `oracle-witness-slice.md` + relevant neighborhoods | claim-level checkability fields or bespoke oracle tools | +| Sequence milestones, frontiers, and buildable slices | `plan-sequencing-slice.md` + relevant neighborhoods | task lists detached from requirements/invariants/design seams | +| Draft a reviewable graph proposal batch | `review-set-drafting-slice.md` + the topical slice | direct graph writes for unsettled or low-confidence material | +| Explain or edit one existing item | `neighborhood-consumption-slice.md` | global kind lists without incident edges | + +## Context-stack graph + +```pseudo +nodes: + ontology: generated vocabulary + authoring: shared judgment + neighborhood: item-centered context + intent: spec claim capture + design: shape projection + oracle: verification projection + plan: sequencing projection + review: human-adjudicated proposal batch + +edges: + ontology -> authoring + authoring -> intent + intent -> design + intent -> oracle + intent -> plan + design -> oracle + design -> plan + oracle -> plan + neighborhood -> intent, design, oracle, plan, review + intent, design, oracle, plan -> review + +notes: + - ontology is generated; do not restate its tables in topical slices. + - topical slices are conduct, not schema. They teach how to use the graph vocabulary. 
+``` + +## Mainline use chain + +```pseudo +incoming task + -> read selected spec overview and relevant neighborhoods + -> choose topical slice from the table above + -> classify or project material using current graph vocabulary + -> route unsettled material to gaps or review drafts + -> commit only settled graph truth through the graph mutation boundary +``` + +## Draft status + +These slices are candidate injectable references. A skill or prompt should cite a slice only when that slice has a concrete reader and improves behavior more than loading the larger `graph-authoring-heuristics.md` reference. diff --git a/src/agents/contexts/references/design-projection-slice.md b/src/agents/contexts/references/design-projection-slice.md new file mode 100644 index 000000000..d1aa518a5 --- /dev/null +++ b/src/agents/contexts/references/design-projection-slice.md @@ -0,0 +1,99 @@ +# Design projection slice + +Draft injectable reference for agents projecting accepted intent into coherent design-plane content. Use when generating, reviewing, or explaining `module`, `interface`, `entity`, or `sketch` nodes. + +## Job + +Turn intent pressure into design shape without pretending early sketches are settled architecture. 
+ +```pseudo +accepted intent neighborhood + -> identify load-bearing goals, constraints, invariants, requirements, examples + -> project candidate design seams + -> choose design kind: sketch | module | interface | entity + -> attach design nodes back to intent with role-named edges + -> surface missing anchors as gaps or review notes, not design facts +``` + +## Design-kind routing + +| Design material | Kind | Use when | Avoid | +| --- | --- | --- | --- | +| implementation part that hides complexity | `module` | there is a named responsibility and boundary | dumping every file/class into graph truth | +| contract across a seam | `interface` | callers/callees, tool schemas, API contracts, or data exchange matter | using interface as a synonym for module | +| domain or data object | `entity` | identity, lifecycle, relationships, or storage shape matter | modelling every noun as an entity | +| tentative diagram, option, or advisory shape | `sketch` | design helps thinking but should not yet constrain work | hardening speculative architecture | + +## Projection graph + +```pseudo +nodes: + goal: intent + constraint: intent + invariant: intent + requirement: intent + example: intent + module: design + interface: design + entity: design + sketch: design + +edges: + goal, requirement -[rationale:for]-> module + constraint -[exclusion]-> module, interface, entity + invariant -[dependency]-> module, interface + requirement -[realization]-> module, interface + interface -[composition]-> entity + sketch -[refinement]-> module, interface, entity # only after accepted + example -[witness:for]-> interface, entity # if the case demonstrates the seam + +notes: + - `realization` reads abstract -> concrete: requirement/invariant/interface -> module/slice/check. + - `refinement` reads broad -> specific: generic model -> specialized model. + - Use `sketch` for early advisory design material instead of promoting it prematurely. 
+``` + +## Coherent design content checklist + +A design node is coherent when it names: + +- the pressure it answers: which requirement, invariant, constraint, goal, or example forced this shape; +- what it hides or stabilizes; +- what may depend on it downstream; +- whether it is settled design (`module`/`interface`/`entity`) or advisory design (`sketch`); +- at least one useful edge back to intent unless the node is explicitly a `sketch`. + +## Design projection matrix + +| Intent pressure | Likely design response | Edge to create when settled | +| --- | --- | --- | +| requirement needs behavior | `module` or `interface` | `realization` | +| invariant protects state or authority | `interface`, `entity`, or module boundary | `dependency` from invariant to design subject, plus `realization` if the design expresses it | +| constraint rules out options | boundary/design node that is limited | `exclusion` | +| example reveals a domain object | `entity` | `witness` or `rationale` depending on whether it proves or motivates | +| term stabilizes vocabulary | `entity` or `interface` name | `rationale` only if the term motivates the design choice | +| unknown must be accommodated | `sketch` or explicit design seam | `dependency` only if the unknown is truly load-bearing | + +## Anti-patterns + +- Do not make a module for every source file; graph design nodes are conceptual seams. +- Do not use `sketch` as a permanent parking lot for accepted design truth. +- Do not create a design node whose only support is “this is how systems like this usually look.” +- Do not encode method/style vocabulary as node kinds; use existing kinds plus `detail.form` and renderer/heuristic conduct. +- Do not infer design readiness from `detail.form`; behavior comes from `kind` and edges. 
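The design projection matrix above can also be read as a plain lookup. The following is a minimal TypeScript sketch, not the Brunch implementation: `IntentPressure`, `Projection`, `projectDesign`, and `hasRequiredAnchor` are hypothetical names introduced only for illustration, and the candidate lists are one reading of the matrix rows.

```typescript
// Hypothetical sketch: these names and shapes are assumptions, not the Brunch API.
type DesignKind = "module" | "interface" | "entity" | "sketch";
type EdgeCategory = "realization" | "dependency" | "exclusion" | "witness" | "rationale";
type IntentPressure = "requirement" | "invariant" | "constraint" | "example" | "term" | "unknown";

interface Projection {
  candidates: DesignKind[]; // likely design responses, per the matrix
  edges: EdgeCategory[];    // edge categories to create only once the design is settled
}

const PROJECTION_MATRIX: Record<IntentPressure, Projection> = {
  requirement: { candidates: ["module", "interface"], edges: ["realization"] },
  invariant: { candidates: ["interface", "entity", "module"], edges: ["dependency", "realization"] },
  constraint: { candidates: ["module", "interface", "entity"], edges: ["exclusion"] },
  example: { candidates: ["entity"], edges: ["witness", "rationale"] },
  term: { candidates: ["entity", "interface"], edges: ["rationale"] },
  unknown: { candidates: ["sketch"], edges: ["dependency"] },
};

// Look up the matrix row for one kind of intent pressure.
function projectDesign(pressure: IntentPressure): Projection {
  return PROJECTION_MATRIX[pressure];
}

// Checklist rule: a design node needs an edge back to intent unless it is a sketch.
function hasRequiredAnchor(kind: DesignKind, intentRefs: string[]): boolean {
  return kind === "sketch" || intentRefs.length > 0;
}
```

The lookup only proposes candidates. Choosing among them, and deciding whether the design is settled enough to create the edge, remains authoring judgment.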
+ +## Minimum design proposal shape + +```yaml +design_candidate: + node: + kind: module | interface | entity | sketch + title: string + body: string # responsibility, boundary, and why it exists + anchors: + intent_refs: string[] # projected graph codes or titles + edge_plan: string[] # role-named edges to create if accepted + risk_notes: string[] # what is speculative or missing +``` + +If `anchors.intent_refs` is empty, the design is probably a brainstorm; keep it as prose, a `sketch`, or a candidate review item until anchored. diff --git a/src/agents/contexts/references/graph-authoring-heuristics.md b/src/agents/contexts/references/graph-authoring-heuristics.md index 008b8a1ab..9c4d0c0f4 100644 --- a/src/agents/contexts/references/graph-authoring-heuristics.md +++ b/src/agents/contexts/references/graph-authoring-heuristics.md @@ -1,28 +1,187 @@ # Graph authoring heuristics -Runtime-eligible shared reference for graph-writing judgment (D97-L/D98-L). Use this for authoring discipline that is shared by `capture` and `commit-graph`; use `graph-ontology.md` for generated kind/band, edge-category, and detail/form vocabulary instead of restating tables here. +Runtime-eligible shared reference for graph-writing judgment (D97-L/D98-L). Use `graph-ontology.md` for generated vocabulary tables: exact kind list, codes, readiness bands, edge-category policy, required detail payloads, and `detail.form` legality. This file carries the authored judgment the elicitor needs to classify, promote, relate, and verify graph material without copying schema tables into skill bodies. -## Author declarative graph claims +## Mental model -Every graph node should read as a stable claim, not an interview prompt or scratch note. +Brunch's graph is a typed graph of stable specification material. Most nodes should read as declarative claims or named artifacts, not interview prompts, scratch notes, or hidden chain-of-thought. 
-- Normalize questions into the underlying declarative claim before writing graph truth. -- Keep follow-ups with no stable claim out of graph truth; route them to elicitation gaps instead. -- Promote before filing as `context`: if the material is success-critical, limiting, possibly false but consequential, a choice among alternatives, or a value bet, use the sharper node kind. -- Use `context` only for descriptive material that aids interpretation but does not yet carry a stronger graph role. +```pseudo +spec graph: + intent plane what / why / obligation / uncertainty / examples + oracle plane how claims are checked or evidenced + design plane how the system is shaped + plan plane how the work is sequenced + +accepted graph truth: + nodes: stable graph items with kind, basis, source, optional detail + edges: structural categories with role-named endpoints + gaps: prospective elicitation obligations, not graph truth + reconciliation_needs: retrospective repair obligations, not graph edges +``` + +The old nine-kind claim ontology is superseded. The current model has four planes and 24 node kinds. The current exact set is generated in `graph-ontology.md`; use this guide for the semantic routing behind those kinds. + +## Classify by modality, then by plane + +Start from the role the material plays, not the words the user happened to use. + +### Intent plane + +- `goal` — value or outcome claim: what result is sought, without committing to implementation. +- `thesis` — position or bet claim: who/what/why framing, target user, problem theory, or product bet. +- `term` — vocabulary commitment: canonical definition, alias, or ubiquitous-language clarification. `term` is graph-addressable now, but band-less. +- `context` — descriptive claim: a relevant fact about the world, repo, domain, environment, or starting situation. +- `story` — intra-spec grouping: a mid-level narrative or Gherkin-Feature-like cluster inside one spec. 
+- `unknown` — known-unknown: a domain uncertainty that is not presently answerable but must be structurally accommodated. +- `requirement` — obligation claim: what the system shall do or satisfy. +- `assumption` — deferred-falsifiable belief: something believed enough to proceed, but possibly false. +- `constraint` — boundary claim: what rules out solution space, scope, policy, resource envelope, platform, or non-goal interpretations. +- `invariant` — preservation claim: what must remain true across states, transitions, versions, or semantic revisions. +- `decision` — choice claim: a durable selected option among real alternatives; requires chosen option, rejected alternatives, and rationale. +- `criterion` — oracle claim: how a requirement, invariant, or other claim will be judged. +- `example` — concrete witness or disambiguator: positive case, counterexample, edge case, trace, or labelled out-of-scope case. Polarity comes from wording and edges, not a subtype field. + +### Oracle, design, and plan planes + +Use these when the material is no longer only intent capture. -## Commit only settled material +- Oracle plane (`check`, `vv_method`, `evidence`, `vv_obligation`) — concrete verification checks, verification methods, observed evidence, or proof/verification obligations. +- Design plane (`module`, `interface`, `entity`, `sketch`) — implementation shape, seams, data/domain entities, or intentionally lightweight design sketches. +- Plan plane (`milestone`, `frontier`, `slice`) — sequencing units. A `frontier` is the plan/tracker/branch unit; a `slice` is the buildable implementation unit inside it. -Graph writes are for material whose commitment path is settled. +Readiness bands guide questioning and projection; they do not gate graph truth. If the user clearly states a later-band item early, capture it honestly with the right kind and basis. -- Direct user statements and approved review-set items are `explicit` graph truth. 
-- Confident structure materialized from accepted content may be `implicit` graph truth. -- Low-confidence noticings, suspicions, possible implications, or missing pieces do not become graph truth; route them to an `elicitation_gap` question plus rationale. -- Contradictions with existing graph truth are retrospective repair work; route them to a `reconciliation_need`, not a gap and not an overwrite. +## Promote before filing as context -## Build relation-bearing batches from confident endpoints +`context` is the broadest attractor and therefore the most common misclassification. Promote to a sharper kind before writing graph truth. -Create relation-bearing graph batches only after endpoint confidence is settled. +| If the descriptive material... | Route to... | +| --- | --- | +| states the desired outcome or why the work matters | `goal` or `thesis` | +| defines a term or naming commitment | `term` | +| must be true for the system to succeed or stay safe | `requirement` or `invariant` | +| limits acceptable solutions or scope | `constraint` | +| is believed but might be false in a material way | `assumption` | +| is an acknowledged unknown that cannot simply be answered now | `unknown` | +| chooses among alternatives with durable consequences | `decision` | +| explains how success will be judged | `criterion` or an oracle-plane node | +| gives a concrete case, trace, or counterexample | `example` | +| only helps interpretation and has no stronger graph role yet | keep `context` | + +A formal axiom or given is `context` with `detail.form:"given"` when it is stipulated as true and load-bearing. Load-bearing-ness comes from edges such as `dependency`, not from inventing a `given` kind. + +## Distinguish nearby kinds + +### `requirement` vs `invariant` + +A requirement says the system must do or provide something. An invariant says a property must keep holding while the system operates or evolves. 
They often pair: + +```pseudo +requirement: users can export accepted review items +invariant: rejected or draft review items never appear in exports +``` + +### `criterion` vs oracle-plane nodes + +A criterion is the acceptance/oracle claim in intent space: how we judge a property. Oracle-plane nodes name concrete verification machinery or evidence. + +```pseudo +criterion: export excludes draft review items in the reviewer-visible artifact +check: vitest golden for exported review payload +vv_method: golden-file regression plus fixture replay +``` + +Link the concrete oracle to the claim with a `witness` edge (`stance: for` when it supports, `stance: against` when it refutes or falsifies). + +### `assumption` vs `unknown` vs `context` + +- `context`: treated as known or stipulated for the current spec. +- `assumption`: believed enough to proceed, but later validation could overturn it. +- `unknown`: explicitly not known; the system or plan must accommodate that ignorance. + +Do not launder a known-unknown into an assumption just to make the graph look complete. + +### `constraint` vs `invariant` + +A constraint narrows the acceptable solution space. An invariant protects a property across operation or change. + +```pseudo +constraint: must not require a network service during local CLI runs +invariant: local CLI runs never send workspace graph data to a remote service +``` + +### `story` vs `example` + +A story groups related behavior inside a spec. An example is a concrete witness. A Gherkin `Feature` inside one spec usually maps to `story`; a Scenario / Examples row usually maps to `criterion` or `example` depending on whether it is the oracle statement or a concrete case. + +### `sketch` vs committed design nodes + +Use `sketch` for advisory or early design material that should not yet harden into module/interface/entity truth. Promote to `module`, `interface`, or `entity` only when the design claim is settled enough to be part of graph truth. 
+ +## Decision capture criteria + +Do not turn every user answer into a `decision`. A `decision` needs all of these: + +1. Real alternatives existed. +2. The choice is durable enough to constrain future interpretation or implementation. +3. The choice can be stated as “we chose A over B/C.” +4. At least one rejected alternative can be named. +5. There is a rationale. + +Current required detail fields are `chosen_option`, `rejected`, and `rationale` (see `graph-ontology.md`). Put scope and consequences in the title/body or express them with edges; do not invent decision-detail fields. + +## Examples and negative knowledge + +There are no `example` subtype fields. Preserve example semantics through the node text and edge structure. + +```pseudo +positive witness: + EX1 concrete accepted export case + create_edge witness: + oracle: EX1 + claim: REQ3 + stance: for + +counterexample / rejected interpretation: + EX2 rejected review item appears in export + create_edge witness: + oracle: EX2 + claim: INV4 + stance: against + +out-of-scope disambiguator: + EX3 importing old local dev fixtures + create_edge exclusion: + boundary: CON2 + subject: EX3 +``` + +Intent is often clarified by what has been ruled out. Prefer a concrete `example` plus `witness:against` or an `exclusion` edge over vague prose such as “not that.” + +## Edge authoring + +Accepted edges use the closed structural categories generated in `graph-ontology.md`. Do not use retired named-relation dialects such as `derived_from`, `motivated_by`, `rules_out`, `counterexample_for`, or `tested_by` as edge categories. + +Use the role-named `mutate_graph` grammar. Endpoint storage order does not carry impact meaning; category metadata owns endpoint roles, affected endpoint, impact strength, criteria-help signal, and projection effect. + +| If you mean... 
| Use current edge category | +| --- | --- | +| one claim relies on another remaining true | `dependency` (`dependency` -> `dependent`) | +| an oracle, example, check, evidence, or criterion supports/refutes a claim | `witness` with `stance: for` or `stance: against` | +| a goal, thesis, rationale, or argument motivates/opposes a claim | `rationale` with `stance: for` or `stance: against` | +| an abstract claim is implemented or expressed by a concrete artifact | `realization` | +| a general claim/model is specialized by a more specific one | `refinement` | +| a boundary, non-goal, or constraint limits a subject | `exclusion` | +| a whole contains a part | `composition` | +| two items are related but no stronger relation is justified | `cross_reference` | +| one item replaces an older item | `supersession` | + +Stance is required only for `witness` and `rationale`; omit it everywhere else. + +## Relation-bearing batches need confident endpoints + +Create edges only after both endpoints are settled enough to stand as graph truth. ```pseudo chain relation-bearing-authoring: @@ -32,12 +191,103 @@ chain relation-bearing-authoring: -> use role-named mutate_graph endpoints ``` -Do not use capture-local or prose-local edge dialects. `graph-ontology.md` lists the generated edge-category policy table; `mutate_graph` edges use role fields such as `dependency/dependent`, `support/claim`, `abstract/concrete`, and `boundary/subject`; diagnostics from `structural_illegal` are the repair path. +If an endpoint is uncertain, spawn or reuse an elicitation gap for the missing claim. If a relation contradicts existing graph truth, create a reconciliation need instead of overwriting or adding a competing edge. + +## Accepted graph truth has no proposal/status fields + +Do not add old edge metadata such as `support`, `status`, `provenanceTurnId`, `createdBy`, or per-claim `checkability`/`strength` fields. 
+ +Current ownership: + +- `basis: explicit | implicit` records approval directness for accepted nodes and edges. +- `source` on nodes is lightweight epistemic attribution text, not policy. +- `rationale` on edges explains the relation. +- `change_log` owns audit/provenance by LSN. +- Review-set drafts own proposed graph material before acceptance. +- Rejected proposals are absent from active graph truth plus audit history. +- Staleness is represented by `reconciliation_need`, not by mutating edge status. + +## Treat `detail.form` as inert payload + +`kind` drives graph behavior. `detail.form` is method payload plus a renderer hook. + +- `plain`, `gherkin`, and `formal` are legal on `requirement`, `criterion`, and `invariant`. +- `given` is legal on `context`. +- `decision` and `term` have their own required detail payloads. + +Do not infer edge legality, readiness, commitment strength, or runtime method state from `detail.form`. A Gherkin or formal payload changes how the node renders or round-trips; it does not change what kind of graph thing the node is. + +## Capture routes + +| Material confidence / conflict state | Route | +| --- | --- | +| directly stated or exact-review approved graph material | graph truth with `basis: explicit` | +| confidently materialized from accepted content | graph truth with `basis: implicit` | +| low-confidence noticing, suspicion, possible implication, or missing piece | `elicitation_gap` with a question and rationale | +| contradiction with existing graph truth | `reconciliation_need` | +| candidate batch awaiting human judgment | review-set draft, not accepted graph truth | + +Abstain rather than guess. Speculative captures degrade graph signal. + +## Edge-local neighborhoods are the useful context unit + +For LLM collaboration, an item-centered neighborhood is usually stronger than “all goals” or “all requirements.” Read/render neighborhoods with the policy-derived labels and impact direction. 
+ +```pseudo +REQ17: Each phase exposes an explicit kickoff/frontier/recovery/handoff affordance. + upstream / dependencies: + motivated by G2: avoid fake closure and stranded users + bounded by CON8: no generic task-planning surface + downstream / impact: + realized by S13: phase affordance renderer slice + refined by REQ18: interview phases expose kickoff/frontier/generation/recovery + evidence: + witnessed by AC13: open phases bottom-load one visible artifact +``` + +Do not reconstruct directionality from raw `sourceId` / `targetId` or from the English verb. Use the category policy and label projections. + +## Topology-driven question ranking + +Use graph topology to pick the next useful question: + +| Signal | Suggested question shape | +| --- | --- | +| High-fanout `assumption` with thin evidence | “Many claims depend on X. Should we validate it or mark the risk?” | +| `requirement` or `invariant` with no witness/evidence path | “How will we know this holds?” | +| `criterion` not linked to the claim it judges | “Which requirement or invariant does this criterion check?” | +| Candidate `decision` lacks rejected alternatives or rationale | “What did we consider and rule out before choosing this?” | +| Constraints/exclusions appear to disagree about one subject | “These boundaries conflict. Which one wins?” | +| `goal` / `thesis` has no path into requirements, design, or plan | “What would satisfy this goal in the actual system?” | +| Requirement has no example/counterexample and high ambiguity | “What concrete case would settle this interpretation?” | +| `unknown` blocks a design or plan edge | “Do we accommodate the unknown, investigate it, or narrow scope around it?” | + +These are ranking heuristics, not automatic graph writes. 
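Two of the signals in the table above (a high-fanout `assumption` with thin evidence, and a `requirement` or `invariant` with no witness path) can be sketched as a small ranking pass. This is an illustrative TypeScript sketch under stated assumptions: the `GraphNode` and `GraphEdge` shapes, the `fanoutThreshold` default, and the scoring are all hypothetical simplifications, and any `witness` edge (either stance) is treated as an evidence path.

```typescript
// Hypothetical sketch: simplified shapes, not the real graph store or mutate_graph types.
interface GraphNode { code: string; kind: string }

type GraphEdge =
  | { category: "dependency"; dependency: string; dependent: string }
  | { category: "witness"; oracle: string; claim: string; stance: "for" | "against" };

interface QuestionCandidate { about: string; question: string; score: number }

function rankQuestions(
  nodes: GraphNode[],
  edges: GraphEdge[],
  fanoutThreshold = 3, // assumed cutoff for "high fanout"
): QuestionCandidate[] {
  const witnessed = new Set<string>();      // claims with at least one witness edge
  const fanout = new Map<string, number>(); // number of dependents per dependency
  for (const edge of edges) {
    if (edge.category === "witness") {
      witnessed.add(edge.claim);
    } else {
      fanout.set(edge.dependency, (fanout.get(edge.dependency) ?? 0) + 1);
    }
  }

  const candidates: QuestionCandidate[] = [];
  for (const node of nodes) {
    const dependents = fanout.get(node.code) ?? 0;
    if (node.kind === "assumption" && dependents >= fanoutThreshold && !witnessed.has(node.code)) {
      candidates.push({
        about: node.code,
        question: "Many claims depend on this. Should we validate it or mark the risk?",
        score: dependents,
      });
    }
    if ((node.kind === "requirement" || node.kind === "invariant") && !witnessed.has(node.code)) {
      candidates.push({ about: node.code, question: "How will we know this holds?", score: 1 });
    }
  }
  // Highest score first; the output is a list of suggestions, never a graph write.
  return candidates.sort((a, b) => b.score - a.score);
}
```

The function returns question candidates only, which keeps the ranking separate from any mutation path.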
+ +## Progressive checkability is conduct, not schema + +Choose the weakest sufficient oracle artifact for the claim at hand: human review, example/counterexample, regression/golden, runtime contract, property/model-based rule, probe/transcript, or proof obligation. Express the artifact as existing graph vocabulary (`criterion`, `check`, `vv_method`, `evidence`, `vv_obligation`, `example`, `witness` edges) and name blind spots in prose. -## Keep mutation grammar role-named +Do not add claim-level `checkability`, `strength`, `validTraces`, or `invalidTraces` fields. Evidence breadth is prompt/oracle conduct unless a future scoped reader proves it needs schema. -Prepare one coherent `mutate_graph` batch when the user-facing commitment is already settled. Prefer create-only direct commits in the current product posture: `create_node` ops plus role-named `create_edge` ops. Do not invent graph payload fields, LSNs, edge categories, result shapes, or partial-write recovery paths. +## Phrase-to-kind priors -## Treat detail.form as inert payload +Treat these as priors, not rigid rules. -Use `graph-ontology.md` for the generated required-detail and allowed-form tables. Node `kind` drives graph behavior; `detail.form` is only method payload plus a renderer hook. Do not infer edge legality, readiness, commitment strength, or runtime method state from `form`. 
+| User phrase pattern | Likely route | +| --- | --- | +| “we want Y” / “so that Y” | `goal` | +| “this is for X because...” | `thesis` | +| “by X we mean...” | `term` | +| “true about the environment/repo/domain...” | `context` unless promotable | +| “a known unknown is...” | `unknown` | +| “must not”, “cannot”, “out of scope” | `constraint` | +| “probably”, “we think”, “if X is true” | `assumption` | +| “the system must...” | `requirement` | +| “always”, “never”, “must remain” | `invariant` | +| “we chose A over B because...” | `decision` | +| “we'll know it works when...” | `criterion` or oracle-plane node | +| “for example”, “case where”, “counterexample” | `example` | +| “module”, “API”, “entity”, “sketch” | design-plane kind | +| “test”, “proof”, “evidence”, “verification method” | oracle-plane kind | +| “milestone”, “frontier”, “slice” | plan-plane kind | diff --git a/src/agents/contexts/references/intent-capture-slice.md b/src/agents/contexts/references/intent-capture-slice.md new file mode 100644 index 000000000..38e0930dd --- /dev/null +++ b/src/agents/contexts/references/intent-capture-slice.md @@ -0,0 +1,77 @@ +# Intent capture slice + +Draft injectable reference for the elicitor, capture sweep, and any foreground agent turning user/world material into coherent specification graph content. Use after reading `graph-ontology.md` for exact kind legality. + +## Job + +Capture stable intent-plane material as graph truth; route everything else to the correct non-truth substrate. 
+ +```pseudo +incoming material + -> normalize to declarative claim or named graph item + -> classify by modality + -> promote away from context when a sharper kind is earned + -> decide route: graph truth | elicitation_gap | reconciliation_need | review draft + -> add edges only after endpoint confidence is settled +``` + +## Intent-kind routing matrix + +| Material role | Kind | Good node title shape | Common false route | +| --- | --- | --- | --- | +| desired outcome, value, win | `goal` | “Reduce fake closure in review flows” | `requirement` too early | +| audience/problem/bet/positioning | `thesis` | “The spec workspace is for teams evolving uncertain software intent” | vague `context` | +| canonical vocabulary | `term` | “Frontier means plan/tracker/branch unit” | duplicating prose in every node | +| ambient fact about world/repo/domain | `context` | “Runtime state is transcript-backed” | absorbing constraints or decisions | +| intra-spec behavior grouping | `story` | “Review-set approval story” | `milestone` or `example` | +| acknowledged unknown | `unknown` | “Provider payload drift is not fully known” | pretending it is an `assumption` | +| required behavior/property | `requirement` | “Review acceptance commits the batch atomically” | `criterion` | +| believed-but-falsifiable premise | `assumption` | “LLMs can produce legal edge drafts after prompt guidance” | `context` | +| boundary/non-goal/resource/policy | `constraint` | “Graph writes must not bypass CommandExecutor” | `invariant` when it is only design-space narrowing | +| always/never/must-remain property | `invariant` | “Rejected drafts never enter accepted graph truth” | `constraint` when it protects runtime/evolution | +| durable choice among alternatives | `decision` | “Use role-named edges over generic source/target drafts” | any ordinary answer | +| acceptance/oracle condition | `criterion` | “Mutation batch is accepted only if dry-run validates” | `check` too concrete | +| concrete 
case/counterexample/trace | `example` | “Counterexample: rejected item appears in export” | hidden note in body text | + +## Promotion decision table + +policy: first-match + +| rule | If material... | → route | +| --- | --- | --- | +| R1 | contradicts existing accepted graph truth | `reconciliation_need` | +| R2 | is low-confidence, suspected, or missing | `elicitation_gap` | +| R3 | is a candidate batch awaiting approval | review-set draft | +| R4 | selects A over named B/C with rationale | `decision` | +| R5 | rules out solution space or scope | `constraint` | +| R6 | must remain true across operation/change | `invariant` | +| R7 | describes how success will be judged | `criterion` or oracle-plane node | +| R8 | gives a concrete witness/counterexample | `example` | +| R9 | only helps interpretation | `context` | + +notes: + - #R1 is retrospective repair; do not file contradictions as gaps. + - #R2 is prospective elicitation agenda; the gap stores question/rationale, not hidden domain truth. + - #R4 must satisfy the decision criteria in `graph-authoring-heuristics.md`. + +## Coherent intent content checklist + +- Each node can be read aloud as a stable claim or named item. +- `context` nodes are not carrying obligations, choices, boundaries, or uncertainty in disguise. +- Requirements say what must hold; criteria say how we judge; examples make interpretation concrete. +- Invariants protect preservation; constraints narrow solution space. +- Decisions name rejected alternatives and rationale, not just “the user answered yes.” +- Negative knowledge is preserved as `example` + `witness:against` or as `constraint`/`exclusion`, not as vague prose. 
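The first-match policy of the promotion decision table can be sketched as an ordered rule list. This is a hedged TypeScript illustration: the `Material` flags and the `routeMaterial` function are hypothetical names, and R7 is simplified to `criterion` even though the table also allows an oracle-plane node.

```typescript
// Hypothetical sketch: `Material` flags are illustrative stand-ins for authoring judgment.
type Route =
  | "reconciliation_need" | "elicitation_gap" | "review_set_draft"
  | "decision" | "constraint" | "invariant" | "criterion" | "example" | "context";

interface Material {
  contradictsGraphTruth?: boolean;
  lowConfidence?: boolean;
  awaitingApproval?: boolean;
  selectsAmongAlternatives?: boolean; // A over named B/C, with rationale
  rulesOutSolutionSpace?: boolean;
  mustRemainTrue?: boolean;
  describesJudgment?: boolean;
  isConcreteCase?: boolean;
  onlyAidsInterpretation?: boolean;
}

interface Rule { id: string; when: (m: Material) => boolean; route: Route }

// Order matters: the first matching rule wins.
const RULES: ReadonlyArray<Rule> = [
  { id: "R1", when: (m) => m.contradictsGraphTruth === true, route: "reconciliation_need" },
  { id: "R2", when: (m) => m.lowConfidence === true, route: "elicitation_gap" },
  { id: "R3", when: (m) => m.awaitingApproval === true, route: "review_set_draft" },
  { id: "R4", when: (m) => m.selectsAmongAlternatives === true, route: "decision" },
  { id: "R5", when: (m) => m.rulesOutSolutionSpace === true, route: "constraint" },
  { id: "R6", when: (m) => m.mustRemainTrue === true, route: "invariant" },
  { id: "R7", when: (m) => m.describesJudgment === true, route: "criterion" },
  { id: "R8", when: (m) => m.isConcreteCase === true, route: "example" },
  { id: "R9", when: (m) => m.onlyAidsInterpretation === true, route: "context" },
];

// Returns null when no rule matches; the caller then consults the intent-kind routing matrix.
function routeMaterial(m: Material): { rule: string; route: Route } | null {
  for (const rule of RULES) {
    if (rule.when(m)) return { rule: rule.id, route: rule.route };
  }
  return null;
}
```

Returning `null` instead of defaulting to `context` is a deliberate choice in this sketch: kinds such as `goal` or `thesis` are not covered by the table and should not fall into `context` by accident.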
+ +## Edge hints + +| From | To | Edge | +| --- | --- | --- | +| `goal` / `thesis` | `requirement` / `decision` | `rationale` with `stance: for` | +| `constraint` | any bounded subject | `exclusion` | +| `assumption` / `context` | claim depending on it | `dependency` | +| `criterion` / `example` | claim it checks or illustrates | `witness` with `stance: for` or `against` | +| broader claim | narrower claim | `refinement` | +| story | grouped requirements/criteria/examples | `composition` | + +Do not create relation-bearing batches until both endpoints are confident graph truth. diff --git a/src/agents/contexts/references/neighborhood-consumption-slice.md b/src/agents/contexts/references/neighborhood-consumption-slice.md new file mode 100644 index 000000000..0507ce30a --- /dev/null +++ b/src/agents/contexts/references/neighborhood-consumption-slice.md @@ -0,0 +1,104 @@ +# Neighborhood consumption slice + +Draft injectable reference for agents reading existing graph context before answering, proposing, or mutating. Use when a task is centered on one claim, design seam, oracle, or plan item. + +## Job + +Reason from an item-centered neighborhood, not from global kind buckets. Direction, labels, and impact come from edge policy projections, not raw `sourceId` / `targetId` order. + +## Neighborhood data shape + +```yaml +neighborhood: + anchor: + code: string + kind: string + title: string + body: string? 
+ buckets: + dependencies: item_edge[] # premises, constraints, assumptions, upstream support + dependents: item_edge[] # likely affected claims/work if anchor changes + evidence: item_edge[] # criteria, checks, examples, evidence, counterexamples + refinements: item_edge[] # abstract/concrete, whole/part, successor/predecessor as useful + lateral: item_edge[] # cross_reference or low-impact neighbors + open_needs: + gaps: string[] + reconciliation_needs: string[] + +item_edge: + label: string # rendered from anchor perspective + neighbor_code: string + neighbor_kind: string + neighbor_title: string + stance: string? # for | against, only when category carries stance + impact: string # upstream | downstream | lateral + rationale: string? +``` + +## Reading chain + +```pseudo +task about anchor + -> read anchor text and edge-local neighborhood + -> bucket neighbors by relation to anchor: dependencies | dependents | evidence | refinements | lateral + -> inspect open gaps/reconciliation needs before assuming completeness + -> answer or propose with explicit references to affected neighbors +``` + +## Bucket interpretation matrix + +| Bucket | Means | Ask | +| --- | --- | --- | +| dependencies | what the anchor relies on or is conditioned by | “If this changes, does the anchor still stand?” | +| dependents | what may be affected if the anchor changes | “What downstream claims/work need reconciliation?” | +| evidence | what witnesses, refutes, checks, or exemplifies the anchor | “Is the claim actually observed?” | +| refinements | more abstract/concrete versions or parts | “Is this the right level of specificity?” | +| lateral | related but non-driving context | “Is this merely adjacent or should it become a stronger edge?” | + +## Edge-local context pattern + +```pseudo +REQ17: Each phase exposes explicit kickoff/frontier/recovery/handoff affordances.
+ dependencies: + motivated by G2: avoid fake closure and stranded users + bounded by CON8: no generic task-planning surface + depends on A4: users will tolerate visible phase state + dependents: + implemented by MOD5: phase affordance renderer + establishes S13: open-phase artifact slice + evidence: + witnessed by AC13: open phases bottom-load one visible artifact + challenged by EX4: cancelled interview with no handoff artifact + open: + gap: Should recovery affordances appear before or after generation failure? +``` + +This shape is more useful than “all goals, all constraints, all requirements” because it carries why the anchor stands and what changes if it moves. + +## Consumption rules + +- Do not infer direction from raw storage coordinates. Use rendered labels and impact buckets. +- Do not treat lack of visible evidence as proof that no evidence exists; ask for a targeted read if needed. +- Do not flatten `witness:against` into generic “related evidence.” Negative evidence is semantically important. +- Do not update an anchor without checking dependents; changes can create reconciliation needs. +- Do not answer “why is this here?” without looking for `rationale`, `dependency`, and `realization` paths. 
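The consumption rules above can be illustrated with a small renderer that works only from already-projected `item_edge` records. This TypeScript sketch is an assumption-laden illustration: `renderNeighborhood` and the `[against]` marker are hypothetical, the field names mirror the YAML data shape, and nothing here reads raw `sourceId` / `targetId` order.

```typescript
// Hypothetical sketch: mirrors the neighborhood data shape; not the actual renderer.
type Bucket = "dependencies" | "dependents" | "evidence" | "refinements" | "lateral";

interface ItemEdge {
  label: string;            // already rendered from the anchor's perspective
  neighbor_code: string;
  neighbor_title: string;
  stance?: "for" | "against";
}

interface Neighborhood {
  anchor: { code: string; title: string };
  buckets: Partial<Record<Bucket, ItemEdge[]>>;
  gaps: string[];
}

const BUCKET_ORDER: Bucket[] = ["dependencies", "dependents", "evidence", "refinements", "lateral"];

function renderNeighborhood(n: Neighborhood): string {
  const lines: string[] = [`${n.anchor.code}: ${n.anchor.title}`];
  for (const bucket of BUCKET_ORDER) {
    const edges = n.buckets[bucket] ?? [];
    if (edges.length === 0) continue; // skip empty buckets
    lines.push(`  ${bucket}:`);
    for (const edge of edges) {
      // Keep negative evidence visible instead of flattening it into "related".
      const marker = edge.stance === "against" ? " [against]" : "";
      lines.push(`    ${edge.label} ${edge.neighbor_code}: ${edge.neighbor_title}${marker}`);
    }
  }
  if (n.gaps.length > 0) {
    lines.push("  open:");
    for (const gap of n.gaps) lines.push(`    gap: ${gap}`);
  }
  return lines.join("\n");
}
```

Because direction lives entirely in the pre-rendered `label`, the renderer cannot reintroduce a storage-order bug.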
+ +## Task-specific neighborhood choices + +| Task | Prioritize | +| --- | --- | +| explain why a claim exists | dependencies + rationale edges | +| edit a claim | dependents + reconciliation needs | +| generate criteria | evidence gaps + existing criteria/checks/examples | +| project design | requirements/invariants/constraints + examples | +| plan work | requirements/design seams/oracle obligations + dependency edges | +| reconcile conflict | conflicting neighbors + edge rationale + change history if available | + +## Output discipline + +When responding from a neighborhood, name the graph relation rather than merely citing nodes: + +- Good: “REQ17 depends on A4 and is bounded by CON8; changing A4 would affect REQ17.” +- Weak: “Relevant nodes: A4, CON8, REQ17.” + +When proposing a mutation, include the edge intent in prose before committing or presenting review-set drafts. diff --git a/src/agents/contexts/references/oracle-witness-slice.md b/src/agents/contexts/references/oracle-witness-slice.md new file mode 100644 index 000000000..37ed2d185 --- /dev/null +++ b/src/agents/contexts/references/oracle-witness-slice.md @@ -0,0 +1,109 @@ +# Oracle witness slice + +Draft injectable reference for agents designing verification, criteria, checks, evidence, and proof obligations. Use when the task is “how will we know this holds?” + +## Job + +Make claims checkable using existing graph vocabulary. Do not add checkability metadata to claims. + +```pseudo +claim neighborhood + -> identify property under test: requirement | invariant | decision | design seam + -> choose weakest sufficient oracle artifact + -> express it as criterion, check, vv_method, evidence, vv_obligation, or example + -> attach it with witness edge and stance + -> name blind spots in prose or proposal notes +``` + +## Criterion vs oracle-plane routing + +| User gives... 
| Graph route | Why | +| --- | --- | --- | +| “we know it works when...” | `criterion` | acceptance/oracle claim in intent space | +| “test X should run” | `check` | concrete executable or manual check | +| “use property testing / golden / proof” | `vv_method` | verification method family | +| “this run/transcript/log proves it” | `evidence` | observed artifact | +| “must prove this before release” | `vv_obligation` | outstanding proof/verification obligation | +| “for example, this case should pass” | `example` + `witness:for` | concrete positive witness | +| “this counterexample should fail” | `example` + `witness:against` | concrete negative witness | +| “metric M moved” | `evidence` or `criterion` | evidence if observed, criterion if proposed | + +## Weakest-sufficient oracle ladder + +Use the weakest artifact that honestly witnesses the claim. + +```pseudo +unwitnessed claim + -> human review # qualitative judgment enough + -> example/counterexample # concrete disambiguation enough + -> regression/golden # stable fixture can catch drift + -> runtime contract # boundary must fail loud + -> property/model rule # many cases matter + -> probe/transcript # LLM or integration behavior needs repeated evidence + -> proof obligation # formal proof is economically justified +``` + +This ladder is conduct. Do not store `checkability`, `strength`, `validTraces`, or `invalidTraces` on graph nodes. + +## Witness edge patterns + +```pseudo +positive acceptance: + criterion AC4 + create_edge witness: + oracle: AC4 + claim: REQ9 + stance: for + +negative case: + example EX2 + create_edge witness: + oracle: EX2 + claim: INV3 + stance: against + +observed evidence: + evidence E7 + create_edge witness: + oracle: E7 + claim: REQ9 + stance: for + +method rationale: + vv_method VV2 + create_edge rationale: + support: VV2 + claim: AC4 + stance: for +``` + +Use `witness` for evidence that bears on truth. Use `rationale` for why an oracle/method is a good choice. 
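
The witness and rationale patterns above can be sketched as plain data, assuming a simple dictionary edge shape. The `witness_edge` and `rationale_edge` helpers are hypothetical names introduced for illustration; the real mutation schema is owned by graph code and may differ.

```python
VALID_STANCES = {"for", "against"}


def _check_stance(stance: str) -> str:
    """Reject anything that is not an explicit for/against stance."""
    if stance not in VALID_STANCES:
        raise ValueError(f"stance must be one of {sorted(VALID_STANCES)}, got {stance!r}")
    return stance


def witness_edge(oracle: str, claim: str, stance: str) -> dict:
    """Evidence that bears on whether the claim is true (role-named endpoints)."""
    return {"category": "witness", "oracle": oracle, "claim": claim,
            "stance": _check_stance(stance)}


def rationale_edge(support: str, claim: str, stance: str) -> dict:
    """Why an oracle or method is a good choice for the claim."""
    return {"category": "rationale", "support": support, "claim": claim,
            "stance": _check_stance(stance)}


edges = [
    witness_edge("AC4", "REQ9", "for"),      # positive acceptance
    witness_edge("EX2", "INV3", "against"),  # negative case
    rationale_edge("VV2", "AC4", "for"),     # method rationale
]
```

Keeping the two constructors separate makes it harder to record "why this method" as if it were evidence for the claim itself.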
+ +## Oracle-family matrix + +| Oracle family | Good for | Graph expression | Blind spot to name | +| --- | --- | --- | --- | +| human/manual review | judgment, UX, semantic quality | `criterion` or `check` | reviewer variance | +| example/counterexample | ambiguity collapse | `example` + `witness` | narrow coverage | +| fixture/golden | stable render/projection output | `check` + `evidence` when run | overfitting to fixture | +| schema/static check | boundary shape and structural legality | `check` or `vv_method` | behavior may still be wrong | +| property/model-based | invariant across many generated cases | `vv_method` + `vv_obligation` | model may omit real-world cases | +| probe/transcript | LLM/tool/harness behavior | `check` + `evidence` | non-determinism, provider drift | +| runtime contract | trust boundary / data loss prevention | `check` or design `interface` realization | only observes reached paths | +| formal proof | all-state property in a formal model | `vv_obligation`, `vv_method`, `invariant` | proof-model mismatch | + +## Coherent oracle content checklist + +- Every oracle node says what observation would discriminate success from failure. +- Criteria point to the requirement/invariant/claim they judge. +- Checks and evidence do not masquerade as requirements. +- Counterexamples are preserved with `witness:against`. +- The oracle’s breadth is stated honestly in prose: reviewed, example-backed, regression-covered, enforced, or proved. +- Blind spots are named; a passing check is not generalized into a proof. + +## Anti-patterns + +- Do not present implementation work as an oracle unless it names the observation it makes possible. +- Do not create a bespoke oracle tool or schema field when review-set + graph vocabulary can express the proposal. +- Do not use “tested by” as an edge category; use `witness` with role-named endpoints. +- Do not require the strongest oracle by default. Strong oracles have carrying cost. 
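
One way to read the weakest-sufficient ladder together with the last anti-pattern, as a hedged sketch: walk the rungs from weakest to strongest and stop at the first one judged sufficient. The rung names paraphrase the ladder in this reference; the `is_sufficient` predicate is a hypothetical stand-in for prompt-time judgment, and nothing here is meant to be stored on graph nodes.

```python
from typing import Callable, Optional

# Rungs ordered weakest to strongest, paraphrasing the ladder in this reference.
LADDER = [
    "human review",
    "example/counterexample",
    "regression/golden",
    "runtime contract",
    "property/model rule",
    "probe/transcript",
    "proof obligation",
]


def weakest_sufficient(is_sufficient: Callable[[str], bool]) -> Optional[str]:
    """Return the first rung the caller judges sufficient, or None if none is."""
    for rung in LADDER:
        if is_sufficient(rung):
            return rung
    return None


# Hypothetical judgment: catching drift in a stable render needs a fixture or stronger.
catches_drift = {"regression/golden", "runtime contract", "property/model rule", "proof obligation"}
choice = weakest_sufficient(lambda rung: rung in catches_drift)
print(choice)
```

Stronger rungs remain available, but the loop never reaches them unless the weaker ones are rejected, which keeps carrying cost proportional to the claim.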
diff --git a/src/agents/contexts/references/plan-sequencing-slice.md b/src/agents/contexts/references/plan-sequencing-slice.md new file mode 100644 index 000000000..cf9c75d41 --- /dev/null +++ b/src/agents/contexts/references/plan-sequencing-slice.md @@ -0,0 +1,116 @@ +# Plan sequencing slice + +Draft injectable reference for agents creating or reviewing plan-plane content: `milestone`, `frontier`, and `slice` nodes. Use when the question is “what work should happen, in what unit, and why now?” + +## Job + +Turn accepted intent/design/oracle pressure into sequenced work without losing the distinction between phase, tracker unit, and implementation slice. + +```pseudo +accepted graph pressure + -> identify invariant bundle or product threshold + -> group into milestone if it marks phase readiness + -> define frontier if it is the canonical named work item + -> thin to slice when it is buildable execution scope + -> link plan nodes back to the claims/design/oracles they realize or protect +``` + +## Plan-kind routing + +| Planning material | Kind | Use when | False route | +| --- | --- | --- | --- | +| phase boundary / invariant bundle | `milestone` | advancing means a bundle of properties now holds | vague roadmap heading | +| named frontier / tracker / branch unit | `frontier` | a coherent work item owns a seam or coverage frontier | build task too small | +| thin buildable unit | `slice` | one execution context can implement and verify it | dumping an entire frontier into one slice | + +## Sequencing graph + +```pseudo +nodes: + goal: intent + requirement: intent + invariant: intent + criterion: intent/oracle-anchor + module: design + check: oracle + milestone: plan + frontier: plan + slice: plan + +edges: + goal, requirement, invariant -[rationale:for]-> milestone + milestone -[composition]-> frontier + frontier -[composition]-> slice + requirement, invariant -[realization]-> frontier, slice + module, interface -[realization]-> slice + check, criterion 
-[dependency]-> slice # if work depends on oracle being present + frontier -[dependency]-> frontier # sequencing dependency + +notes: + - Plan nodes should explain what accepted graph pressure they discharge. + - `composition` groups plan units; `dependency` orders them; `realization` ties work to claims/design. +``` + +## Coherent plan content checklist + +A plan node is coherent when it names: + +- the claim/design/oracle pressure it exists to satisfy; +- the unit boundary: phase, frontier, or slice; +- the acceptance signal for that unit; +- the dependency or composition edges that matter; +- what is explicitly out of scope for this unit. + +## Frontier vs slice decision table + +policy: exclusive + +| rule | Work shape | → kind | +| --- | --- | --- | +| R1 | establishes a phase threshold across multiple work items | `milestone` | +| R2 | is the canonical named work item with its own planning/tracker identity | `frontier` | +| R3 | is one buildable execution scope inside a frontier | `slice` | +| R4 | is just a note, risk, or unresolved question | not plan graph truth; use body text, gap, or unknown | + +notes: + - #R2 can contain several slices through `composition`. + - #R3 should have a plausible verification route. 
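
The exclusive decision table above can be sketched as a function that refuses ambiguous inputs. The `WorkShape` flags are hypothetical names for the table's "Work shape" column, introduced only for illustration, and returning `None` stands in for R4 (not plan graph truth).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkShape:
    """Hypothetical flags for the 'Work shape' column; not a real schema."""
    phase_threshold: bool = False  # R1: phase threshold across multiple work items
    named_work_item: bool = False  # R2: canonical named item with tracker identity
    buildable_scope: bool = False  # R3: one buildable execution scope


def plan_kind(shape: WorkShape) -> Optional[str]:
    """Apply R1-R3 exclusively; None means R4 (body text, gap, or unknown)."""
    matches = [
        kind
        for flag, kind in (
            (shape.phase_threshold, "milestone"),
            (shape.named_work_item, "frontier"),
            (shape.buildable_scope, "slice"),
        )
        if flag
    ]
    if len(matches) > 1:
        # "policy: exclusive" means two matching rules is an authoring error.
        raise ValueError(f"work shape matches several rules: {matches}")
    return matches[0] if matches else None


print(plan_kind(WorkShape(buildable_scope=True)))
```

Raising on multiple matches is a design choice of this sketch: it surfaces a muddled unit boundary instead of silently picking the first rule.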
+ +## Plan projection matrix + +| Upstream pressure | Plan response | Edge hint | +| --- | --- | --- | +| goal has no satisfying work | create or attach frontier | `rationale:for` from goal to frontier | +| requirement needs implementation | frontier or slice | `realization` | +| invariant needs protection | frontier/slice plus oracle | `realization` and `dependency` | +| criterion/check missing | oracle slice or attach to existing work | `dependency` when required before proceeding | +| design seam needs materialization | slice | `realization` from module/interface to slice | +| high-fanout assumption is risky | validation slice or milestone gate | `dependency` from assumption to work that relies on it | +| known unknown blocks sequencing | investigation slice or scoped non-goal | `dependency` only if the work truly relies on it | + +## Anti-patterns + +- Do not turn every task into a `frontier`; frontiers are named work items, slices are build units. +- Do not create plan nodes detached from accepted claims/design/oracles. +- Do not sequence by aesthetic completeness; sequence by pressure, dependency, risk, and verification economics. +- Do not use plan nodes as a hidden backlog for uncertain facts; use `elicitation_gap`, `unknown`, or review notes. +- Do not infer that a passing slice proves a whole frontier; state the acceptance breadth honestly. + +## Minimum plan proposal shape + +```yaml +plan_candidate: + node: + kind: milestone | frontier | slice + title: string + body: string # boundary, objective, and out-of-scope + anchors: + satisfies: string[] # requirements/goals/invariants/design/oracles + blocked_by: string[] # dependencies or unknowns + verifies_with: string[] + acceptance: + observation: string + breadth: example | bounded | sweep | milestone +``` + +Keep `acceptance.breadth` in proposal prose unless a current schema field exists for it. Do not invent plan-node detail fields. 
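
A hedged sketch of checking a proposal against the minimum shape above before presenting it. The `check_plan_candidate` helper is hypothetical and works on a plain dictionary; it is explanatory only and not a graph schema.

```python
PLAN_KINDS = {"milestone", "frontier", "slice"}
NODE_FIELDS = {"kind", "title", "body"}


def check_plan_candidate(candidate: dict) -> list[str]:
    """Return human-readable problems; an empty list means the shape looks coherent."""
    problems: list[str] = []
    node = candidate.get("node", {})
    if node.get("kind") not in PLAN_KINDS:
        problems.append(f"unknown plan kind: {node.get('kind')!r}")
    extra = sorted(set(node) - NODE_FIELDS)
    if extra:
        problems.append(f"invented node fields: {extra}")
    if not candidate.get("anchors", {}).get("satisfies"):
        problems.append("detached plan node: nothing listed under anchors.satisfies")
    if not candidate.get("acceptance", {}).get("observation"):
        problems.append("no acceptance observation named")
    return problems


candidate = {
    "node": {"kind": "slice", "title": "Phase affordance renderer", "body": "Render REQ17 affordances."},
    "anchors": {"satisfies": ["REQ17"], "blocked_by": [], "verifies_with": ["AC13"]},
    "acceptance": {"observation": "open phases bottom-load one visible artifact", "breadth": "example"},
}
print(check_plan_candidate(candidate))
```

The check deliberately ignores `acceptance.breadth`, since the note above keeps breadth in proposal prose rather than in a schema field.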
diff --git a/src/agents/contexts/references/review-set-drafting-slice.md b/src/agents/contexts/references/review-set-drafting-slice.md new file mode 100644 index 000000000..300ba99dc --- /dev/null +++ b/src/agents/contexts/references/review-set-drafting-slice.md @@ -0,0 +1,122 @@ +# Review-set drafting slice + +Draft injectable reference for agents proposing graph changes for human review. Use when material is plausible and structured but should not be directly committed yet. + +## Job + +Draft a coherent reviewable graph batch: settled enough to inspect, not yet accepted graph truth. Keep low-confidence noticings out of review sets unless the proposal explicitly asks the user to accept them as graph truth. + +```pseudo +candidate material + -> choose topical slice: intent | design | oracle | plan + -> draft nodes with stable titles and bodies + -> draft only edges whose endpoints are present or resolvable + -> validate role-named edge shape mentally before presenting + -> ask for approve | request changes | reject +``` + +## Review-set batch shape + +```yaml +review_set_candidate: + heading: string + purpose: string + grounding: + summary: string + support: string[] # why this batch is worth reviewing + nodes: + - draft_id: string + plane: intent | oracle | design | plan + kind: string + title: string + body: string? + detail: object? + edges: + - category: string + roles: object # use role-named endpoints, not generic source/target + stance: for | against | null + rationale: string? + user_choice: + options: approve | request_changes | reject +``` + +This shape is explanatory, not a replacement schema. The actual review-set and graph mutation schemas are owned by graph code. + +## Draft quality matrix + +| Draft element | Required quality | Reject or ask changes when... 
| +| --- | --- | --- | +| heading | names the reviewable unit | it is generic (“Proposed updates”) | +| grounding.summary | says why this batch exists | it hides uncertainty or overclaims source support | +| node title | stable claim/item title | it is phrased as a question or TODO | +| node body | enough context to review | it contains multiple unrelated claims | +| node kind | current legal kind | it revives old subtype/relation vocabulary | +| detail | only legal kind/detail payload | it invents fields or stores prompt conduct | +| edge | role-named endpoints | either endpoint is missing or low-confidence | +| stance | present only on `witness`/`rationale` | stance is omitted there or added elsewhere | +| rationale | explains non-obvious relation | it merely repeats the category name | + +## Plane-specific drafting hints + +| Plane | Good proposal content | Common overreach | +| --- | --- | --- | +| intent | goals, requirements, constraints, assumptions, decisions, criteria, examples | treating every answer as a decision or requirement | +| design | modules, interfaces, entities, sketches anchored in intent | speculative architecture without anchors | +| oracle | criteria, checks, methods, evidence, obligations, examples | implementation tasks with no observation | +| plan | milestones, frontiers, slices tied to claims/design/oracles | backlog tasks detached from graph pressure | + +## Edge drafting patterns + +```pseudo +claim motivated by goal: + category: rationale + support: G1 + claim: REQ2 + stance: for + +counterexample challenges invariant: + category: witness + oracle: EX3 + claim: INV4 + stance: against + +requirement depends on assumption: + category: dependency + dependency: A5 + dependent: REQ2 + +constraint bounds design: + category: exclusion + boundary: CON6 + subject: MOD7 + +frontier contains slice: + category: composition + whole: F2 + part: S4 +``` + +## Direct commit vs review set vs gap + +policy: exclusive + +| rule | Material state | → 
route | +| --- | --- | --- | +| R1 | exact user-approved graph item or safe explicit direct commit | `mutate_graph` | +| R2 | coherent candidate batch that needs human judgment | `present_review_set` | +| R3 | useful but low-confidence noticing | `elicitation_gap` | +| R4 | conflicts with accepted graph truth | `reconciliation_need` | +| R5 | only conversational response needed | no graph artifact | + +notes: + - #R2 becomes `basis: explicit` only after the user approves the exact reviewed items. + - #R3 stores a question/rationale, not the low-confidence claim as hidden truth. + +## Coherence checks before presenting + +- Can the user approve the whole batch atomically without guessing your intent? +- Does each edge have both endpoints in the batch or selected spec? +- Are negative cases represented with `witness:against` or `exclusion` where appropriate? +- Are design/oracle/plan nodes anchored to accepted or proposed intent? +- Are all invented schema fields removed? +- Are uncertainty and blind spots visible in proposal prose, not hidden in graph truth? diff --git a/src/agents/docs/context-reference-harvest.md b/src/agents/docs/context-reference-harvest.md index c37aa6cc9..e11c3de1e 100644 --- a/src/agents/docs/context-reference-harvest.md +++ b/src/agents/docs/context-reference-harvest.md @@ -1,138 +1,147 @@ -# Context reference harvest ledger - -Status: closed backstage-only curation ledger. This file is not runtime prompt payload, is not copied into packaged agent assets, and is not a shared context reference. Runtime-eligible references live under `src/agents/contexts/references/`; skill-local progressive-disclosure references live under the owning skill's `references/` directory. - -Purpose: record the row-by-row disposition of recovered or design-era data-model guidance after the data-model-legibility frontier. Rows point to materialized homes, rejected carrying cost, or explicit future tripwires; they do not restate the ontology. 
- -A source may carry more than one disposition class when it has separable uses. Treat the classes as labels, not an exclusive enum. Generated-reference inputs are the exception that preserves D97-L: only typed code sources may generate reference tables; recovered/design prose may motivate which table to generate, but it is authored-reference input or backstage rationale, not the source of truth. - -## Disposition classes - -| Class | Meaning | -| -------------------------------- | ----------------------------------------------------------------------------------------------- | -| generated-reference input | Typed code source for generated content, not hand-authored prose. | -| authored-runtime-reference input | Candidate source for a shared reference under `src/agents/contexts/references/`. | -| skill-local-reference input | Candidate source for a specific skill's `references/` payload. | -| backstage-only rationale | Useful design history or validation record, but not model-facing prompt payload. | -| historical/archive candidate | Superseded or stale enough that future work should retire/archive rather than harvest directly. | -| leave-as-is | Current prompt/resource file already sits in the right home; no harvest action now. 
| - -## Source ledger - -| Source | Disposition labels | Candidate future reference | Reader / blocker | D98-sensitive notes | Next action | -| ------------------------------------------------------------------ | ------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `/private/tmp/igs_recovered.md` (`INTENT_GRAPH_SEMANTICS`) | skill-local-reference input; backstage-only rationale; historical/archive candidate | `graph-authoring-heuristics.md` materialized; generated edge-category/detail-form table materialized from typed sources; oracle checkability guidance accepted skill-local only | Reader: capture/commit methods already cite shared graph-authoring judgment. Oracle `generate-proposal` may use progressive-checkability language when designing verification ensembles. 
| Contains retired subtype proposals and old edge/ontology language; do not revive stale modality/subtype claims, claim metadata, `strength`, or runtime `strategy` / `lens` / `method` session state. | Verdict complete: promotion rules already accepted in `graph-authoring-heuristics.md`; the 8-rung ladder is narrowed to skill-local oracle prompting; `strength` and claim-level `checkability` fields are rejected carrying cost; subtype/detail candidates are rejected as parallel enums except for already-shipped `detail.form` from typed sources. | -| `docs/design/ELICITATION_QUESTIONS.md` | authored-runtime-reference input | `elicitation-question-hints.md` | Reader: future elicitor question/gap guidance. Blocker: refresh against post-FE-1052 kind names, `story` / `unknown` / `entity` / `sketch`, and four-band D94-L model. | Uses older band framing and mentions strategy/lens as prompt-space terms; keep as prompt-resource vocabulary only, not runtime state. | Treat the durable thesis as: node kind is closed ontology; questions are open/projectable hints inside a kind. Rewrite examples before model-facing use. | -| `docs/design/ONTOLOGY_REVIEW_PROTOCOL.md` | backstage-only rationale; authored-runtime-reference input | possible `graph-authoring-heuristics.md` citations; may motivate generated edge-category/detail-form table scope but typed code remains the generated-reference input | Reader: data-model maintainers and future generated-reference authors. Blocker: live code/SPEC are authoritative; §0/§2–3/§9 are historical and `thesis → claim` did not land. | Mentions methods as validation lenses; preserve only as prompt/resource vocabulary where useful, never as user-changeable runtime axes. | Use as design-validation record for D87-L/D88-L, not as prompt payload. Pull only claims that still match current SPEC/code. 
| -| `docs/design/ELICITATION_LENSES.md` | authored-runtime-reference input; skill-local-reference input; historical/archive candidate | `proposal-meta-rubric.md`; `projection-guidance.md` | Reader: `generate-proposal` and future `project` capability. Blocker: D98-L retired `strategy` / `lens` / `method` as runtime state; A33-L still design-gates `project`. | Highly D98-sensitive: old lens catalogue must not reintroduce runtime lens/strategy/method axes. Fan-out/fan-in and D31 meta-rubric may survive as prompt conduct. | Harvest fan-out/fan-in, grounding-density, and meta-rubric ideas only into the relevant method/reference home after translating away runtime-axis assumptions. | -| `docs/design/BEHAVIORAL_KERNELS.md` | skill-local-reference input; backstage-only rationale; historical/archive candidate | oracle checkability phrasing in `generate-proposal/references/oracle.md`; possible future `elicitation-question-hints.md` | Reader: oracle generate guidance for weakest-sufficient verification artifacts; future elicitation/gap guidance only if a scoped reader appears. Blocker: no current runtime kernel ontology; must not create a parallel data model or prompt taxonomy without a concrete reader. | Kernel terminology is interviewer machinery at most, not graph state and not runtime session state. | Accepted only as skill-local oracle prompting for progressive verification artifacts; kernel labels/taxonomy remain rejected as runtime model and deferred for elicitation questions. | -| `src/agents/skills/methods/capture/SKILL.md` | leave-as-is; authored-runtime-reference input; partially materialized | `graph-authoring-heuristics.md` materialized; `checkability-ladder.md` deferred | Reader: capture now cites the shared authoring reference for declarative graph claims, low-confidence routing, contradiction routing, relation-bearing confidence, and role-named mutation grammar. FE-861 sweep sequencing, gap conduct, and commitment-gradient table remain local. 
| Method is a prompt-resource id, not runtime state. No D98 issue while it stays code-owned prompt-resource conduct. | Materialized shared graph-authoring guidance; defer checkability-ladder extraction until a second concrete reader needs it. | -| `src/agents/skills/methods/commit-graph/SKILL.md` | leave-as-is; authored-runtime-reference input; materialized | `graph-authoring-heuristics.md` | Reader: graph-write methods needing declarative-node, promotion, settled-commitment, confident-endpoint, and role-named mutation discipline. | Method remains prompt-resource conduct; do not make it a user-changeable runtime mode. | Materialized shared authoring reference and cite from this method; remaining direct-commit sequencing stays local. | -| `src/agents/skills/methods/generate-proposal/SKILL.md` | leave-as-is; skill-local-reference input; authored-runtime-reference input | `proposal-meta-rubric.md`; `projection-guidance.md` | Reader: current generate method and future project design. Blocker: proposal meta-rubric might belong skill-local unless `project` becomes a second reader. | Names intent/design/oracle lenses/planes as prompt conduct; keep out of runtime state and schema fields. | Leave body unchanged now. Revisit after `elicitor-project` design chooses whether projection folds into generate or needs a distinct surface. | -| `src/agents/skills/methods/generate-proposal/references/intent.md` | leave-as-is; skill-local-reference input | `proposal-meta-rubric.md` only if shared beyond generate | Reader: generate intent-plane fan-out. Blocker: no second reader yet. | Plane-specific prompt payload is okay; do not turn `pick` into a schema/runtime field. | Leave in skill-local home. | -| `src/agents/skills/methods/generate-proposal/references/design.md` | leave-as-is; skill-local-reference input | `proposal-meta-rubric.md`; possible `projection-guidance.md` | Reader: generate design-plane fan-out and possible future project design. Blocker: A33-L design verdict. 
| `synthesize` is method conduct, not schema or runtime axis. | Leave in skill-local home; use as input to `project` design only if that frontier needs it. | -| `src/agents/skills/methods/generate-proposal/references/oracle.md` | leave-as-is; skill-local-reference input; materialized | `proposal-meta-rubric.md`; skill-local progressive-checkability guidance | Reader: generate oracle-plane fan-out and fan-in. | `compose` and progressive-checkability language are method conduct, not schema fields, stored claim metadata, or runtime axes. | Materialized the accepted narrow verdict here: choose the weakest sufficient oracle artifact and name evidence breadth/blind spots without adding `checkability`, `strength`, kernel, or subtype schema. | - -## Closed reference dispositions - -```pseudo -tree context-reference-dispositions: - graph-authoring-heuristics.md: - home: src/agents/contexts/references/ - status: materialized - readers: - - capture/SKILL.md - - commit-graph/SKILL.md - - future project/generate graph-draft guidance - included: - - declarative graph claims - - promotion before context - - settled commitment paths - - low-confidence and contradiction routing - - relation-bearing endpoint confidence - - role-named mutate_graph grammar - rejected_or_deferred: - - claim-level checkability / strength fields (rejected carrying cost) - - constraint/invariant subtype enums (rejected parallel ontology) - - generated edge-category/detail-form tables (materialized in graph-ontology.md) - d98_guard: method vocabulary allowed only as prompt conduct - - checkability-ladder.md: - status: rejected_as_shared_runtime_reference - verdict: - - no new context reference: only `generate-proposal/references/oracle.md` has a concrete present reader - - no schema fields: do not add `checkability`, `strength`, `validTraces`, or `invalidTraces` to graph nodes - - accepted nucleus: oracle prompting should choose the weakest sufficient verification artifact and name evidence breadth plus 
blind spots - materialized_in: - - src/agents/skills/methods/generate-proposal/references/oracle.md - deferred: - - future elicitation question/gap guidance may reopen shared reference status only with a second concrete reader - d98_guard: no new runtime lens/kernel state - - elicitation-question-hints.md: - status: deferred_to_future_scope - owner_tripwire: elicitation-gap-guidance or another scoped elicitor-question reader - reason: - - no current second reader needs reusable question patterns - - source examples need refresh against current kind names and D94-L bands before model-facing use - reopen_only_if: - - a scoped build names concrete readers and updates stale examples - d98_guard: examples are prompt hints, not strategy/lens/method runtime state - - proposal-meta-rubric.md: - status: skill_local_unless_project_design_earns_shared_home - current_home: src/agents/skills/methods/generate-proposal/references/ - owner_tripwire: elicitor-project / A33-L - reason: - - generate-proposal is the only present reader - - project may become a second reader, but its shape is design-gated - d98_guard: pick/synthesize/compose remain conduct, not schema/runtime fields - - projection-guidance.md: - status: deferred_to_elicitor-project - owner_tripwire: elicitor-project / A33-L - reason: - - projection has no current capability surface or prompt reader - - A33-L must decide whether projection folds into generate or becomes distinct - d98_guard: no revival of project strategy/lens/method as user-changeable state -``` - -## Closed verdict: checkability, subtypes, and inert detail facets - -```pseudo -tree data-model-legibility-verdict: - accepted: - graph-authoring promotion rules: - home: src/agents/contexts/references/graph-authoring-heuristics.md - readers: [capture, commit-graph] - generated vocabulary tables: - home: src/agents/contexts/references/graph-ontology.md - source: typed graph schema and policy only - oracle progressive-checkability conduct: - home: 
src/agents/skills/methods/generate-proposal/references/oracle.md - reader: generate-proposal oracle plane - rule: choose the weakest sufficient oracle artifact and disclose evidence breadth/blind spots - rejected_carrying_cost: - graph node fields: [checkability, strength, validTraces, invalidTraces] - subtype enums: [constraint subtype, invariant subtype, criterion subtype, example subtype] - reason: no current reader needs a stored axis beyond kind, readiness band, edge policy, and detail.form - already_covered: - method-specific claim structure: - mechanism: detail.form - guard: kind drives behavior; form is inert payload plus renderer hook - negative/positive examples: - mechanism: example kind + edge stance/category policy where structurally legal - deferred: - elicitation question hints: - trigger: a future scoped reader needs reusable question patterns - project/projection guidance: - trigger: A33-L design verdict -``` - -## Runtime/backstage guardrails - -- This ledger is a pointer and disposition table, not a canonical ontology or prompt body. -- Generated kind/band, edge-category, and detail/form tables must come from typed graph sources, not recovered prose or design docs. -- Authored references need concrete readers; otherwise leave material in the current skill-local or backstage home. -- D98-sensitive vocabulary is allowed only when it describes prompt-resource organization or internal conduct. It must not become session-agent state beyond SPEC/CODE operational mode. -- Rows are harvested one at a time. Do not bulk-import old design docs into runtime references. +# Context reference harvest notes + +Status: backstage curation note. This file is not runtime prompt payload and is not copied into packaged agent assets. Runtime-eligible shared references live in `src/agents/contexts/references/`; skill-local progressive-disclosure references live under the owning skill's `references/` directory. 
+ +## Current outcome + +The ontology/context harvest now has one generated vocabulary home, one broad authored judgment home, and draft topical slice candidates: + +- `src/agents/contexts/references/graph-ontology.md` — generated from typed graph schema/policy sources. Exact kind list, readiness bands, edge-category policy, detail payloads, and `detail.form` legality live there. +- `src/agents/contexts/references/graph-authoring-heuristics.md` — broad authored prompt judgment: kind classification, promotion rules, decision criteria, negative examples, edge authoring, edge-local neighborhoods, topology-driven question ranking, and progressive-checkability conduct. +- `src/agents/contexts/references/context-slice-index.md` — draft selector for smaller injectable slices. +- `src/agents/contexts/references/intent-capture-slice.md` — draft intent/spec capture slice. +- `src/agents/contexts/references/design-projection-slice.md` — draft design projection slice. +- `src/agents/contexts/references/oracle-witness-slice.md` — draft oracle/verification slice. +- `src/agents/contexts/references/plan-sequencing-slice.md` — draft plan-plane slice. +- `src/agents/contexts/references/neighborhood-consumption-slice.md` — draft edge-local context consumption slice. +- `src/agents/contexts/references/review-set-drafting-slice.md` — draft review-set proposal slice. + +This note exists only to explain what was translated, rejected, or deferred while turning older design material into those references. The topical slices are not yet cited by skill bodies; promote citations only when a concrete reader chooses them. + +## Harvest rule + +Design prose can motivate authored guidance, but it is not the source of closed vocabulary. Closed node kinds, edge categories, endpoint roles, readiness bands, and detail shapes come from typed code and are projected into `graph-ontology.md`. 
+ +When an old document proposes a schema field or enum that the current model rejected, translate the useful conduct into existing graph vocabulary instead of reviving the old field. + +## Source dispositions + +### Salvaged `INTENT_GRAPH_SEMANTICS` / older nine-kind ontology + +**Translated into:** `graph-authoring-heuristics.md` and the draft topical slice files listed above. + +Kept: + +- graph-as-typed-claims mental model, updated to the current four-plane graph; +- modality-based kind classification; +- promotion before defaulting to `context`; +- strict decision-capture criteria; +- negative examples and counterexamples as first-class prompt guidance; +- edge-local neighborhood guidance; +- topology-driven question ranking; +- weakest-sufficient verification/checkability conduct. + +Updated for the current model: + +- nine top-level kinds became current four-plane vocabulary from `graph-ontology.md`; +- `term`, `thesis`, `story`, `unknown`, oracle-plane, design-plane, and plan-plane kinds are included; +- subtype proposals are not schema: preserve subtype-like distinctions in node text, `detail.form`, or edges; +- old named relations map to current structural edge categories; +- edge status/provenance/support metadata maps to `basis`, edge `rationale`, `change_log`, review-set drafts, and `reconciliation_need`; +- checkability/strength fields remain prompt/oracle conduct, not graph metadata. + +Rejected as current schema: + +- `constraint` / `invariant` / `criterion` / `example` subtype enums; +- accepted-edge `support` / `status` / `provenanceTurnId` metadata; +- claim-level `checkability`, `strength`, `validTraces`, or `invalidTraces` fields; +- a per-relation policy registry with free-form relation names. + +### `docs/design/ONTOLOGY_REVIEW_PROTOCOL.md` + +**Translated into:** `graph-authoring-heuristics.md`; canonical facts already live in `memory/SPEC.md` and typed schema. 
+ +Kept: + +- method closure rule: a method is `spec.kind` + `detail.form` + renderer + heuristic set, not new node/edge kinds; +- `detail.form` is inert payload; `kind` drives graph behavior; +- context/assumption/unknown routing; +- Gherkin/formal-verification mapping discipline; +- role-named endpoints and explicit impact policy, not verb-direction inference. + +Do not revive: + +- historical `thesis -> claim` rename proposal (did not land); +- stale pre-FE-1052 baseline tables; +- workbench/bench/speculation plane proposals without a new scoped reader; +- deferred nodes/edges such as `actor`, `scenario`, `conflict`, `participation`, `coverage`. + +### `docs/design/ELICITATION_QUESTIONS.md` + +**Partially translated into:** `graph-authoring-heuristics.md` kind classification and phrase-to-kind priors. + +Kept: + +- node kind is the closed ontology; +- questions are open, situated projections inside a kind; +- elicitation gaps carry free-text questions referring to node kinds, not a parallel question-type enum. + +Still deferred: + +- a refreshed `elicitation-question-hints.md` shared reference. Reopen only when a scoped reader such as `elicitation-gap-guidance` needs reusable question patterns and updates examples against current kind names and D94-L bands. + +### `docs/design/ELICITATION_LENSES.md` + +**Disposition:** skill-local/reference input only when a concrete reader appears. + +Kept as conduct where already relevant: + +- fan-out/fan-in prompting; +- grounding-density judgment; +- D31-style meta-rubric language for proposal/oracle generation. + +Do not revive: + +- runtime `strategy` / `lens` / `method` axes as user-changeable session state; +- old lens catalogues as schema or graph state. + +Possible future homes: + +- `proposal-meta-rubric.md` if a second reader beyond `generate-proposal` earns a shared reference; +- `projection-guidance.md` only after the `elicitor-project` design verdict. 
+ +### `docs/design/BEHAVIORAL_KERNELS.md` + +**Disposition:** skill-local oracle and elicitation conduct; not runtime ontology. + +Kept: + +- examples/counterexamples clarify intent; +- weakest-sufficient verification artifact language; +- behavioral kernels as question/probe inspiration. + +Do not revive: + +- kernel labels as graph kinds, runtime state, or a parallel prompt taxonomy. + +### Current skill bodies + +`capture` and `commit-graph` cite `graph-authoring-heuristics.md` for shared graph-authoring judgment. `generate-proposal/references/oracle.md` remains the skill-local home for progressive verification/oracle conduct until another concrete reader needs the same payload. + +## Deferred tripwires + +Reopen a shared reference only when the reader is concrete: + +- `elicitation-question-hints.md` — when `elicitation-gap-guidance` or another elicitor-question feature needs reusable question patterns. +- `proposal-meta-rubric.md` — when `project` or another generator becomes a second reader for the current generate-proposal rubric. +- `projection-guidance.md` — after `elicitor-project` decides whether cross-plane derivation folds into `generate` or needs a distinct surface. + +Until then, do not bulk-import old design docs into runtime prompt references. + +## Guardrails for future harvests + +- Keep generated vocabulary generated; run `npm run generate:ontology` only when typed sources change. +- Keep authored runtime guidance tied to at least two concrete readers, or keep it skill-local. +- Preserve negative knowledge through `example` + `witness:against` or `exclusion`, not through new relation names. +- Put low-confidence material in `elicitation_gap`; put contradictions in `reconciliation_need`. +- Treat D98-sensitive vocabulary as prompt-resource conduct only, not persisted runtime axes. 
diff --git a/src/agents/skills/lenses/README.md b/src/agents/skills/lenses/README.md index 9affe47ff..bcb82b4f3 100644 --- a/src/agents/skills/lenses/README.md +++ b/src/agents/skills/lenses/README.md @@ -18,9 +18,9 @@ Future execute-mode lenses (`plan`, `sync`, `scope`) are deferred. ## Heuristic provenance Topology-driven next-question heuristics (look for goals with no derived -requirements, requirements with no examples/proof, decisions with empty rejected -alternatives, conflicting boundaries — ask about the most graph-shaping absence -first) are authored and locked into each `/SKILL.md` body in distilled form -(D97-L: cite/distill, do not copy vocabulary tables). Graph vocabulary itself is -owned by `src/graph/schema/kinds.ts`. This README owns the current lens membership +requirements, requirements with no examples/witnesses, decisions with empty +rejected alternatives, conflicting boundaries — ask about the most graph-shaping +absence first) are authored and locked into each `/SKILL.md` body in distilled +form (D97-L: cite/distill, do not copy vocabulary tables). Graph vocabulary itself is owned by +`src/graph/schema/kinds.ts`. This README owns the current lens membership only — not a parallel copy of the per-lens ranking heuristics. diff --git a/src/agents/skills/lenses/oracle/SKILL.md b/src/agents/skills/lenses/oracle/SKILL.md index 202dd2ec1..982df60ab 100644 --- a/src/agents/skills/lenses/oracle/SKILL.md +++ b/src/agents/skills/lenses/oracle/SKILL.md @@ -7,8 +7,8 @@ description: "Focus on verification obligations, checks, evidence, and blind spo Use this lens when the conversation is about how claims will be checked, witnessed, or kept honest. The plane focus is oracle: checks, validation methods, evidence, obligations, criteria, and blind spots. -Favor oracle-plane checks and validation methods, criteria/examples in intent when they express expected behavior, and proof/support edges from evidence to claims. 
Ask what would convince the user, what counterexample would break the claim, what fixture or probe would reveal failure, and which obligation remains unwitnessed. +Favor oracle-plane checks and validation methods, criteria/examples in intent when they express expected behavior, and `witness` edges from evidence to claims. Ask what would convince the user, what counterexample would break the claim, what fixture or probe would reveal failure, and which obligation remains unwitnessed. -Interpretation rule: do not confuse an implementation task with an oracle. A good oracle says what observation would discriminate success from failure. If the user gives a metric, ask what claim it validates; if they give a requirement, ask what evidence would prove it. Treat absence honestly as verification debt, not as a passed check. +Interpretation rule: do not confuse an implementation task with an oracle. A good oracle says what observation would discriminate success from failure. If the user gives a metric, ask what claim it validates; if they give a requirement, ask what evidence would witness it. Treat absence honestly as verification debt, not as a passed check. -Topology-driven next questions: prioritize requirements with no incoming proof, criteria with no outgoing proof target, high-fanout assumptions with low confidence, and review/proposal material that lacks evidence. Ask the smallest question that turns an unwitnessed claim into a checkable obligation. +Topology-driven next questions: prioritize requirements with no incoming witness, criteria with no outgoing witness target, high-fanout assumptions with low confidence, and review/proposal material that lacks evidence. Ask the smallest question that turns an unwitnessed claim into a checkable obligation. 
diff --git a/src/agents/skills/methods/generate-proposal/references/oracle.md b/src/agents/skills/methods/generate-proposal/references/oracle.md index 0f1f26cb2..bd78623ef 100644 --- a/src/agents/skills/methods/generate-proposal/references/oracle.md +++ b/src/agents/skills/methods/generate-proposal/references/oracle.md @@ -37,6 +37,6 @@ present_candidates({ heading, candidates: [oracle ensembles] }) Do not add an oracle-specific tool, schema field, multi-select affordance, or bespoke commit path. The user may recognize one ensemble as the base direction and ask for pieces from another; the composition happens in your reasoning and is made concrete only in the review-set batch. If the ensemble cannot be expressed without a multi-select affordance, stop and surface that as the fan-in falsifier instead of inventing `fan_in_mode` here. -When producing the review set, express oracle commitments as graph vocabulary learned from the active context/rendered ontology surfaces. Prefer checks, criteria, evidence obligations, proof/support edges, fixture/probe commitments, and named blind spots. Do not present an implementation task as an oracle unless it names the observation that discriminates success from failure. +When producing the review set, express oracle commitments as graph vocabulary learned from the active context/rendered ontology surfaces. Prefer checks, criteria, evidence obligations, `witness` / `rationale` edges, fixture/probe commitments, and named blind spots. Do not present an implementation task as an oracle unless it names the observation that discriminates success from failure. Keep epistemic status honest. With thin grounding, offer low-resolution oracle ensembles and name what evidence would make them safer. With richer graph context, attach the ensemble to specific claims, invariants, criteria, and known failure modes. 
From 6ea6a9d8efa6fb26f78a101a35f9407e979e588e Mon Sep 17 00:00:00 2001 From: Lu Nelson Date: Fri, 26 Jun 2026 17:41:18 +0200 Subject: [PATCH 29/29] massive new skill drafting Signed-off-by: Lu Nelson --- HANDOFF.md | 117 ++++++++++++++++++ src/agents/contexts/drafting/README.md | 18 ++- src/agents/contexts/drafting/skill-ingest.md | 64 ++++++++++ .../contexts/drafting/slice-band-walk.md | 70 +++++++++++ src/agents/skills/methods/capture/SKILL.md | 37 ++++-- .../methods/elicit-by-question/SKILL.md | 6 +- .../methods/explore-and-characterize/SKILL.md | 6 +- .../skills/methods/generate-proposal/SKILL.md | 31 +++-- .../skills/methods/ingest-paste/SKILL.md | 4 +- .../skills/methods/read-context/SKILL.md | 18 ++- .../read-referenced-documents/SKILL.md | 4 +- 11 files changed, 340 insertions(+), 35 deletions(-) create mode 100644 HANDOFF.md create mode 100644 src/agents/contexts/drafting/skill-ingest.md create mode 100644 src/agents/contexts/drafting/slice-band-walk.md diff --git a/HANDOFF.md b/HANDOFF.md new file mode 100644 index 000000000..0f8a3b6df --- /dev/null +++ b/HANDOFF.md @@ -0,0 +1,117 @@ +# Handoff + +> Generated by `ln-handoff` at 2026-06-26T14:45:15Z. Read this file to resume work. +> This file is volatile transfer state only. After its contents are reconciled into canonical docs or superseded by a newer handoff, overwrite or delete it. + +## Goal + +Finish and tie off FE-1091 `renderer-golden-coverage` by ensuring the accepted flat foreground prompt / background subagent topology is implemented, verified, and free of stale `SYSTEM.md` packaging or planning regressions. + +## Session State + +- **Last completed skill**: `ln-refactor` — produced `memory/REFACTOR.md`; builder implemented all planned commits and deleted the refactor plan. +- **Current skill**: `ln-handoff` — capturing final volatile review/refactor context. 
+- **Flow position**: `grill → spec → plan → [design] → [oracles] → scope → [spike] → build → review → [refactor] → [sync]`; current branch is after review/refactor/build, with no active implementation work remaining. +- **Handoff trigger**: user requested handoff after builder reported success and a read-only spot check confirmed the judo blockers are closed. + +## In-flight work + +> CRITICAL: These artifacts exist only in the prior conversation, not on disk. +> Reproduce them here with full fidelity. + +No active scope card or refactor plan remains in flight. `memory/REFACTOR.md` was intentionally deleted after completion. The only remaining volatile state is the review/refactor closure evidence below. + +### Completed refactor sequence + +The deleted `memory/REFACTOR.md` planned these commits; builder reports they were completed and committed: + +1. Make generated agent asset packaging destructive only inside the generated agent-body asset homes, then rebuild those homes from canonical flat foreground and background bodies. +2. Add packaging/topology characterization that fails if generated or package-bound agent assets contain retired nested prompt-body directories after a build. +3. Reconcile the active orchestration follow-on scope from the retired orchestrator body to the executor surface and real plan-check target, without implementing that future tool. +4. Correct prompt/topology prose so current runtime terms and future product vocabulary are not conflated. +5. Add an executor prompt-composition golden that proves executor prompts do not receive elicitation recommendations or prompt-resource guidance meant only for the elicitor. +6. Introduce the smallest role boundary in prompt composition so elicitor-only sections are emitted only for elicitor, leaving current elicitor goldens behaviorally unchanged. +7. Run the full gate and delete the refactor plan once the branch is clean and the topology closure is committed. 
+ +### Review findings + +> ALL findings from ln-judo-review, not just the one being acted on. + +| # | Finding | Status | Implications | +| --- | --- | --- | --- | +| 1 | Packaged assets still preserved deleted nested prompt topology because `build:pi-assets` copied into existing `dist/agents/prompts` without cleaning old generated homes. | `addressed` | Commit `86261476` rebuilds generated agent asset homes from flat sources; commit `135f3416` covers generated topology. Spot check showed `dist/agents/prompts` contains only `elicitor.md` and `executor.md`. | +| 2 | The active `orchestrator-tool-port` scope pointed at retired `src/agents/prompts/orchestrator/SYSTEM.md`, risking resurrection of obsolete topology. | `addressed` | Commit `47ff6748` reconciled the scope to the execute-mode `executor` surface and `cook_plan_check` target. `memory/cards/orchestrator-tool-port--plan-check-tool.md` now reads from `src/agents/prompts/executor.md`. | +| 3 | SPEC/CODE vocabulary was overclaimed while runtime still exposes `elicit` / `execute`; prompt prose conflated target vocabulary with current runtime state. | `addressed` | Commit `e9d323ab` clarified current execute vocabulary in prompt prose. Do not infer that runtime mode ids have been renamed to SPEC/CODE. | +| 4 | Prompt composition could leak elicitor-only recommendation / prompt-resource guidance into executor prompts. | `addressed` | Commits `baf1c307` and `054ca69a` locked the executor negative case and gated prompt sections by foreground role. | +| 5 | `src/.pi/extensions/__tests__/subagents.test.ts` is large (~894 lines) and approaching the 1000-line review threshold. | `deferred` | Not a blocker because this branch did not add new substantive subagent behavior after review. Next subagent feature should consider splitting parser/registry/session/registrar coverage before adding much more. | + +### Diagnostic evidence + +> Concrete proof points that informed diagnoses or shifted direction. 
+> Without these, a new thread inherits conclusions but not reasoning. + +- `git status --short --untracked-files=all` returned clean after builder completion: proves no dirty/untracked branch work remained at handoff. +- `git log --oneline -8` showed the completed refactor stack: `86261476`, `135f3416`, `47ff6748`, `e9d323ab`, `baf1c307`, `054ca69a`, `0809c47d`, preceded by `623fdb39`. +- `find dist/agents/prompts -maxdepth 3 -type f` showed only `dist/agents/prompts/elicitor.md` and `dist/agents/prompts/executor.md`: proves stale generated nested `SYSTEM.md` bodies are gone from the prompt package asset home. +- `find dist/agents/subagents -maxdepth 2 -type f` showed `explorer.md`, `projector.md`, `researcher.md`, and `reviewer.md`: proves background subagent assets live in the flat subagent home. +- `rg` for retired `prompts//SYSTEM.md` references found only canonical SPEC/PLAN retirement notes and negative test assertions under `src/agents/prompts/__tests__/prompt-bodies.test.ts`: no active loader/package/reference path was found in the spot check. +- Builder reported `npm run verify` passed after the final commit: 145 test files passed; 1091 tests passed, 1 skipped, 1 todo; build completed successfully. + +## Decisions and assumptions + +| Item | Type | Status | Source | +| --- | --- | --- | --- | +| Foreground bodies are flat files at `src/agents/prompts/{elicitor,executor}.md`. | `decision` | `persisted` | `memory/SPEC.md`, `memory/PLAN.md`, topology READMEs, code/tests. | +| Background subagents are flat files at `src/agents/subagents/{explorer,researcher,projector,reviewer}.md`. | `decision` | `persisted` | `memory/SPEC.md`, `memory/PLAN.md`, topology READMEs, code/tests. | +| Nested `src/agents/prompts//SYSTEM.md` is obsolete; no alias/fallback/bridge should be restored. | `decision` | `persisted` | `memory/SPEC.md`, `memory/PLAN.md`, prompt-body tests, build asset topology tests. 
| +| Generated package asset homes may be cleaned/rebuilt during build; they are not durable state. | `decision` | `persisted` | Refactor commits and package/build tests. | +| Runtime mode ids are still current-code `elicit` / `execute`; SPEC/CODE remains target/product vocabulary until a future runtime rename. | `decision` | `persisted` | Prompt/prose clarification commit and current runtime schema. | +| `orchestrator-tool-port` is next work and should target the execute-mode `executor` surface plus `cook_plan_check`, not a separate orchestrator prompt body. | `decision` | `persisted` | `memory/cards/orchestrator-tool-port--plan-check-tool.md`. | +| Large subagent test file should be split before substantial future subagent expansion. | `assumption` | `volatile` | Judo review watch item; not yet canonical planning state. | + +## Repo state + +- **Branch**: `ln/fe-1091-renderer-golden-coverage-and-prompt-assembly-lock` +- **Recent commits**: + - `0809c47d` Flatten agent prompt topology + - `054ca69a` Gate prompt sections by foreground role + - `baf1c307` Lock executor prompt composition negative case + - `e9d323ab` Clarify current execute vocabulary in prompt prose + - `47ff6748` Reconcile plan-check scope to executor surface + - `135f3416` Cover generated agent asset topology + - `86261476` Rebuild agent asset homes from flat sources + - `623fdb39` Reopen prompt topology flattening scope +- **Dirty files**: none at pre-handoff spot check; this `HANDOFF.md` is newly written by `ln-handoff` and should be deleted/overwritten once no longer needed. +- **Test status**: last known `npm run verify` passed per builder report; not rerun during handoff because no code changed after the builder's full gate. + +## Artifact status + +| Artifact | Exists | Current vs conversation | +| --- | --- | --- | +| `memory/SPEC.md` | yes | current for flat prompt/subagent topology and retired nested prompt-body convention. 
| +| `memory/PLAN.md` | yes | current: `renderer-golden-coverage` / context pipeline marked done after topology closure. | +| `memory/cards/` | one file | `memory/cards/orchestrator-tool-port--plan-check-tool.md` remains live for the next frontier and has been reconciled to the executor surface. The renderer topology scope card was deleted after completion. | +| `memory/REFACTOR.md` | no | deleted after completed refactor sequence; absence is expected. | + +## Next steps + +1. If continuing this branch, prepare tie-off/submission: read `docs/praxis/graphite-workflow.md`, inspect the final diff/stack, and use `gt` per project workflow. +2. If starting the next frontier, use `ln-build` only after confirming the existing active scope `memory/cards/orchestrator-tool-port--plan-check-tool.md` still matches `memory/PLAN.md` and the current executor topology. +3. Delete `HANDOFF.md` once the next thread has absorbed this volatile state or the branch is submitted. + +## Retirement rule + +- Delete or overwrite this file once the volatile state above is absorbed into `memory/SPEC.md`, `memory/PLAN.md`, code, or a newer `HANDOFF.md`. + +## Open questions + +- Should the completed FE-1091 branch be submitted now, or should any additional manual PR review happen first? +- When the next subagent feature touches `src/.pi/extensions/__tests__/subagents.test.ts`, should that test file be split as preparatory refactor before behavior changes? + +## Resume prompt + +Paste this into a new session: + +> Read `HANDOFF.md` in the workspace root for this work area. It contains the full state of in-progress work. +> The immediate next step is: prepare tie-off/submission for the completed FE-1091 branch, or start the next `orchestrator-tool-port` scope if the user asks to move on. +> Start by checking `git status`, reading `docs/praxis/graphite-workflow.md` before any `gt submit`, and confirming `memory/cards/orchestrator-tool-port--plan-check-tool.md` if starting next work. 
diff --git a/src/agents/contexts/drafting/README.md b/src/agents/contexts/drafting/README.md index f2da784c7..c391923ed 100644 --- a/src/agents/contexts/drafting/README.md +++ b/src/agents/contexts/drafting/README.md @@ -7,7 +7,8 @@ Promoting anything from here into `src/agents/contexts/references/` is a separat ## Contents - [`intent-graph-semantics.md`](intent-graph-semantics.md) — the design-reasoning synthesis: the current ontology (4 planes / 24 kinds / 4 bands, 9 edge categories, `detail`/`detail.form`, reconciliation + elicitation substrates) with the rationale preserved from the recovered `INTENT_GRAPH_SEMANTICS.md`. Read this for *why*; read the slices for *do this now*. -- `slice-*.md` — compact, model-facing injectable slices distilled from that synthesis. +- `slice-*.md` — compact, model-facing injectable slices distilled from that synthesis (reference tier: vocabulary + judgment). +- [`skill-ingest.md`](skill-ingest.md) — a draft method skill (step tier) for generalized-content ingestion: one deep procedure with *source* as a shallow branch, citing the slices. Demonstrates the consolidated shape that would replace the four live acquisition modes. 
## Injectable slices — when to inject which @@ -20,12 +21,27 @@ slice-kind-selection.md | picking a node `kind` for new graph truth slice-edge-authoring.md | relating two nodes (which category + stance) | commit-graph, generate slice-detail-payloads.md | creating decision/term, or attaching detail.form | capture, generate slice-promotion-capture.md | sweeping a turn into truth/gaps/reconciliation | capture, review-for-gaps +slice-band-walk.md | walking bands while ingesting/sweeping material | capture, ingest slice-neighborhood-reading.md | consuming an anchored context pack to reason | any agent reading graph context slice-plane-authoring.md | generating coherent intent/oracle/design/plan | generate-proposal (per lens) ``` `slice-plane-authoring.md` is section-anchored (`#intent`, `#oracle`, `#design`, `#plan`) so a per-lens caller can inject one plane's conduct rather than the whole file. +## Skill drafts + +- [`skill-ingest.md`](skill-ingest.md) — the consolidated generalized-content ingestion method (step tier). It sequences the slices: identify source → digest-if-raw → banded capture sweep ([`slice-band-walk.md`](slice-band-walk.md) + [`slice-kind-selection.md`](slice-kind-selection.md)) → route by confidence/conflict ([`slice-promotion-capture.md`](slice-promotion-capture.md)) → ask. It collapses the four live acquisition modes into one deep procedure with *source* as the only shallow branch. + +## Design rationale (meta-skill-design) + +These drafts apply the meta-skill-design levers: + +- **Description as routing surface.** `skill-ingest`'s `description` front-loads the leading word (ingest/acquire) and names one trigger per source branch, disambiguated from `capture` (the sweep), edge authoring, and review. +- **Deep module, simple interface.** One ingestion procedure; *source* is the only shallow branch. The four live acquisition modes split a single behavior four ways and duplicate one spine — `skill-ingest` shows the merged shape. 
+- **Single source of truth.** The band-walk, kind selection, confidence routing, and edge grammar each live in one slice; the skill cites them rather than restating tables (honors D97-L). +- **Reference tier vs step tier.** `slice-*.md` are reference (vocabulary + judgment); `skill-ingest.md` is the sequencing step layer that cites them. Progressive disclosure runs skill → slices → generated `graph-ontology.md`. +- **Completion criteria.** Each ingest step ends on a checkable, exhaustive criterion ("every span classified or abstained") to resist premature completion. + ## Slice form conventions Slices use the `pseudo` notations — `matrix` decision tables (with explicit `policy:`), `chain` flows, `graph` node/edge lists, `data-shape` YAML — plus markdown tables, kept terse and activation-dense. Each slice header states its purpose, its inject-trigger, and the source of truth it cites. A slice is operational ("do this"); the synthesis doc is explanatory ("why"). diff --git a/src/agents/contexts/drafting/skill-ingest.md b/src/agents/contexts/drafting/skill-ingest.md new file mode 100644 index 000000000..6789b70d3 --- /dev/null +++ b/src/agents/contexts/drafting/skill-ingest.md @@ -0,0 +1,64 @@ +--- +name: ingest +description: "Ingest source material into the selected spec — a human answer, a pasted block, a referenced document/URL, or a bounded brownfield area — through one banded capture sweep. Not for relating existing nodes (edge authoring), committing a settled batch (commit-graph), or auditing accepted truth (review-for-gaps)." +--- + +# Method: ingest (draft) + +> Draft skill (scratch; not wired). This file demonstrates the consolidated shape for generalized-content ingestion. It is **not** enumerated in `agents/runtime/state.ts` / `agents/registry.ts`, so it is inert and advertises nothing. 
It collapses the four current acquisition modes (`elicit-by-question`, `ingest-paste`, `read-referenced-documents`, `explore-and-characterize`) into one deep procedure with *source* as the only shallow branch. +> +> Source of truth: the band-walk [`slice-band-walk.md`](slice-band-walk.md), kind selection [`slice-kind-selection.md`](slice-kind-selection.md), confidence/conflict routing [`slice-promotion-capture.md`](slice-promotion-capture.md), edges [`slice-edge-authoring.md`](slice-edge-authoring.md); generated vocabulary [`graph-ontology.md`](../references/graph-ontology.md). Cite these; do not restate their tables (D97-L). + +Ingest is one procedure: whatever the source, material enters the transcript, a banded capture sweep turns it into graph truth or agenda, then you ask from the updated world. The source only changes how the material arrives and whether it needs a digest first. + +## Procedure + +``` +chain ingest: + identify source (ask | paste | reference | brownfield) + -> digest if raw/large (reference + brownfield: required; paste: if large; ask: n/a) + -> banded capture sweep (walk slice-band-walk over digest + conversation) + -> route by confidence/conflict (slice-promotion-capture: truth | gap | reconciliation) + -> ask from the updated graph + gaps +``` + +Each step ends on a checkable criterion: + +1. **Identify source.** Name the source in ordinary language. Done when the source and its provenance phrasing are explicit. +2. **Digest if raw/large.** For a referenced document or brownfield area, read with legal read tools and write an assistant-authored digest in the transcript that separates direct claims from interpretation and names open uncertainties; raw tool output stays background. Done when the sweep has a bounded digest to work from, not unbounded raw bulk. (Skip for a direct human answer; optional for a small paste.) +3. 
**Banded capture sweep.** Walk [`slice-band-walk.md`](slice-band-walk.md) over the digest + conversation, classifying each span to a kind ([`slice-kind-selection.md`](slice-kind-selection.md)). Done when every span is either classified to a kind or deliberately abstained — none left as untyped prose. +4. **Route by confidence/conflict.** Send each classified span to its substrate ([`slice-promotion-capture.md`](slice-promotion-capture.md)): high-confidence → graph truth (`explicit` / `implicit`); low-confidence → `elicitation_gap`; contradiction → `reconciliation_need`. Relate only settled endpoints with edges ([`slice-edge-authoring.md`](slice-edge-authoring.md)). Done when nothing low-confidence is committed and no contradiction was written as truth. +5. **Ask from the updated world.** Compose the next question over the updated graph + gaps, not the pre-ingest state. + +## Source branch + +The only thing that differs by source is arrival + whether a digest is required: + +``` +matrix source -> conduct +policy: exclusive (one source per ingest) + +source | trigger | digest? | provenance phrasing +-----------|-------------------------------------------|----------|--------------------------- +ask | the human is the authority for the answer | no | "you said…" +paste | user pasted a block of text/notes/logs | if large | "from your pasted …" +reference | user named files / URLs / tickets | yes | "from " +brownfield | an existing codebase/area needs a map | yes | "from as inspected" +``` + +- ask: the human is the source; do not read or search just because a question *could* be researched. +- paste: do not require saving to a file before learning from it. +- reference: use only legal read tools; `web_fetch` for a known URL, `web_search` only when external context would change the next move. +- brownfield: smallest useful reconnaissance bounded by the user's area and the current gap; do not crawl for completeness. 
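
The source matrix and the routing step above can be read as two small pure functions. The following is a minimal sketch under stated assumptions, not the repo's implementation: the names `Source`, `ClassifiedSpan`, `needsDigest`, and `routeSpan` are hypothetical, "large" is passed in as a flag because the draft does not define a threshold, and confidence is reduced to two levels for illustration. Checking contradiction before confidence is one possible ordering, chosen so that a confident but conflicting span is never written as truth.

```typescript
// Illustrative sketch only: every name here is hypothetical and is not part
// of the repo's runtime, registry, or graph schema.

type Source = "ask" | "paste" | "reference" | "brownfield";
type Substrate = "graph_truth" | "elicitation_gap" | "reconciliation_need";

interface ClassifiedSpan {
  text: string;
  // Assumed two-level confidence; the cited slice may use a richer judgment.
  confidence: "high" | "low";
  // True when the span contradicts already accepted graph truth.
  contradictsTruth: boolean;
}

// Digest rule from the source matrix: reference and brownfield always digest,
// paste digests only when large, a direct human answer never does.
function needsDigest(source: Source, isLarge: boolean): boolean {
  switch (source) {
    case "reference":
    case "brownfield":
      return true;
    case "paste":
      return isLarge;
    case "ask":
      return false;
  }
}

// Routing rule from step 4: contradictions go to reconciliation first, then
// confidence decides between graph truth and an elicitation gap.
function routeSpan(span: ClassifiedSpan): Substrate {
  if (span.contradictsTruth) return "reconciliation_need";
  return span.confidence === "high" ? "graph_truth" : "elicitation_gap";
}

// Usage: a confident claim that conflicts with accepted truth is not committed.
const conflicting: ClassifiedSpan = {
  text: "example span",
  confidence: "high",
  contradictsTruth: true,
};
const destination: Substrate = routeSpan(conflicting);
```

Keeping both rules as total functions over closed unions mirrors the draft's "exclusive" matrix policy: each source and each span gets exactly one outcome, and the compiler flags a missing case if a source is added later.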
+ +## Anti-goals (one source of truth for all sources) + +- Do not treat every sentence as a graph node. +- Do not make raw tool output the capture source for bulk material; digest first. +- Do not launder ambiguous material into graph truth to avoid a follow-up question. +- Do not bypass the capture sweep with direct graph claims in prose. +- Do not run a product-side extraction pass or revive observer/auditor queues; this is transcript conduct plus the standard sweep. + +## If promoted (not in scope now) + +To wire this, it becomes `src/agents/skills/methods/ingest/SKILL.md` enumerated in `agents/runtime/state.ts` + `agents/registry.ts`. The four current acquisition modes either collapse into the source branch here or shrink to thin trigger-shells that delegate to it; `capture` keeps the banded sweep (this skill cites it rather than duplicating it). That restructuring touches the sealed skills tree and is out of scope for this drafting pass. diff --git a/src/agents/contexts/drafting/slice-band-walk.md b/src/agents/contexts/drafting/slice-band-walk.md new file mode 100644 index 000000000..932e5c4cc --- /dev/null +++ b/src/agents/contexts/drafting/slice-band-walk.md @@ -0,0 +1,70 @@ +# Slice: the band-walk (ingestion movements) + +> Draft injectable context slice (scratch; not wired). Inject as the procedural backbone for generalized-content ingestion: the order in which the elicitor walks readiness bands while sweeping ingested material into graph truth. Source of truth for the kind→band table is [`graph-ontology.md`](../references/graph-ontology.md) (D94-L); kind selection is [`slice-kind-selection.md`](slice-kind-selection.md); confidence/conflict routing is [`slice-promotion-capture.md`](slice-promotion-capture.md). This slice owns the *walk* (a procedure), not the vocabulary (a lookup). + +Readiness bands are `grounding → elicitation → projection → commitment` (plus band-less kinds), derived per-kind by the schema (D94-L). 
Walked as a procedure, they are four **movements** the elicitor moves through while ingesting any source material. Bands guide *what to look for and ask next*; they **do not gate truth** — if the user states a later-movement item early, capture it honestly with the right kind and basis.
+
+```
+chain band-walk:
+  ingested material (digest + conversation)
+    -> GROUND   establish the initiative frame
+    -> ELICIT   expand the working middle
+    -> PROJECT  materialize structure (design / oracle)
+    -> CLOSE    harden obligations + sequence the work
+  ANYTIME: term / example / sketch are capturable in any movement
+```
+
+## GROUND — grounding band
+
+- Gathers: `goal`, `thesis`, `context`, `constraint` (band membership: cite ontology).
+- Routing question: "What outcome, for whom and why, is true about the world, and what is ruled out?"
+- Completion: the initiative frame is anchored (problem / for-whom / value / bounding context present as truth) or the smallest missing anchor is a spawned gap. Do not push deeper just to look complete.
+
+(reconciliation: the sketch's "pitch" = `thesis`.)
+
+## ELICIT — elicitation band
+
+- Gathers: `context`, `story`, `unknown`, `assumption`, `constraint`, `invariant`, `decision`.
+- Routing question: "What is believed-but-falsifiable, what is an acknowledged unknown, what was chosen among alternatives, what must stay true, what bounds the space?"
+- Completion: open forks captured as truth or gaps; tentative language preserved as `assumption` / `unknown`, not laundered into commitment.
+
+(reconciliation: the sketch placed `story` under ANYTIME and omitted `invariant`; canonically both are elicitation-band.)
+
+## PROJECT — projection band
+
+Projection is gated: do not materialize structure ahead of a settled-enough intent frame. The sketch splits this into two sub-movements:
+
+- design — `module`, `interface`, `entity`. Routing question: "How is it shaped?"
+- oracle — `check`, `vv_method`, `evidence`, `vv_obligation`.
Routing question: "How is it checked or evidenced?"
+- Completion: a projection node only when the intent it serves is settled — each design node realizes a claim (`realization`), each oracle node witnesses one (`witness`).
+
+(reconciliation: the sketch placed `check` under CLOSURE; canonically `check` is projection-band, and `entity` / `evidence` belong here too.)
+
+## CLOSE — commitment band
+
+- Gathers: `requirement`, `criterion`; plan kinds `milestone`, `frontier`, `slice`.
+- Routing question: "What must the system do, how will we judge it, and how is the work sequenced?"
+- Completion: commitments are reviewed; `requirement` / `criterion` become truth via the user's direct statement or an accepted review set, not auto-promoted from a sweep.
+
+(reconciliation: the sketch grouped `milestone` / `frontier` under PROJECTION:PLAN; canonically plan kinds are commitment-band, and `slice` belongs here too.)
+
+## ANYTIME — band-less
+
+`term`, `example`, `sketch` carry no readiness band; capture them in whatever movement they surface. `term` fixes lexicon; `example` is a witness (pair with a `witness` edge, stance `for` / `against`); `sketch` is advisory design, not yet hardened.
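
The movement sections above can be summarized as a lookup from movement to the kinds it gathers. This is only an illustrative TypeScript sketch: the kind lists are copied from the "Gathers" bullets and the ANYTIME paragraph in this slice, `MOVEMENT_KINDS` and `movementsFor` are hypothetical names, and `graph-ontology.md` remains the source of truth for band membership.

```typescript
// Hypothetical lookup mirroring this slice's movement lists; not the generated ontology table.
const MOVEMENT_KINDS = {
  GROUND: ["goal", "thesis", "context", "constraint"],
  ELICIT: ["context", "story", "unknown", "assumption", "constraint", "invariant", "decision"],
  PROJECT: ["module", "interface", "entity", "check", "vv_method", "evidence", "vv_obligation"],
  CLOSE: ["requirement", "criterion", "milestone", "frontier", "slice"],
  ANYTIME: ["term", "example", "sketch"],
} as const;

type Movement = keyof typeof MOVEMENT_KINDS;

// Returns every movement whose list names the kind, in walk order.
// A kind listed in two movements (such as `context`) yields two entries.
function movementsFor(kind: string): Movement[] {
  const movements = Object.keys(MOVEMENT_KINDS) as Movement[];
  return movements.filter((movement) => {
    const kinds: readonly string[] = MOVEMENT_KINDS[movement];
    return kinds.indexOf(kind) !== -1;
  });
}
```

Because bands guide attention rather than gate truth, a lookup like this would only suggest where to look next; it should not be used to reject a kind that surfaces early.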
+
+## Sketch → canonical reconciliation (overlay)
+
+```
+matrix sketch-group -> canonical band
+policy: overlay (procedural; no schema change)
+
+sketch group        | canonical band | reconciliation
+--------------------|----------------|--------------------------------------------------
+GROUNDING           | grounding      | "pitch" = thesis
+ELICITATION         | elicitation    | + invariant; + story (sketch put story in ANYTIME)
+ANYTIME             | band-less      | term, example, sketch (story is elicitation-band)
+PROJECTION:DESIGN   | projection     | + entity
+PROJECTION:ORACLE   | projection     | + evidence; check is projection (sketch put it in CLOSURE)
+PROJECTION:CLOSURE  | commitment     | requirement, criterion
+PROJECTION:PLAN     | commitment     | milestone, frontier, + slice
+```
diff --git a/src/agents/skills/methods/capture/SKILL.md b/src/agents/skills/methods/capture/SKILL.md
index 5ea1765be..0fdc17121 100644
--- a/src/agents/skills/methods/capture/SKILL.md
+++ b/src/agents/skills/methods/capture/SKILL.md
@@ -1,27 +1,41 @@
 ---
 name: capture
-description: "Capture selected-spec facts and gap noticings through the deferred FE-861 sweep conduct."
+description: "Capture selected-spec facts and gap noticings through the banded capture-sweep conduct."
 ---
 
 # Method: capture
 
-Capture is the single home for FE-861 foreground elicitor selected-spec sweep discipline. Use it after every elicitor turn, before composing the next question: first turn the un-swept transcript tail into graph truth or elicitation agenda, then ask from the updated world.
+Capture is the single home for foreground elicitor selected-spec sweep discipline. Use it after every elicitor turn, before composing the next question: first turn the un-swept transcript tail into graph truth, elicitation agenda, or reconciliation agenda, then ask from the updated world.
 
 ## Goal
 
-Keep graph truth high-confidence without losing useful low-confidence material.
+Keep graph truth high-confidence without losing useful low-confidence material.
The sweep bands are procedural passes over the conversation, not ontology law: they guide what to notice next, while graph legality still comes from the generated ontology and mutation tools.
 
 ```pseudo
 chain capture-then-ask:
-  unswept transcript tail
-    -> banded capture sweep
-    -> mutate_graph / update_elicitation_gaps
+  unswept transcript tail or acquisition digest
+    -> capture-sweep passes
+    -> mutate_graph / update_elicitation_gaps / update_reconciliation_needs
     -> next question over updated graph + gaps
 ```
 
-## Sweep frame
+## Capture-sweep passes
 
-Walk the un-swept material once by readiness band and likely node kind. The canonical band order and per-kind band membership are the generated kind→band table in `src/agents/contexts/references/graph-ontology.md` (projected from the typed schema — cite it, do not restate it; D97-L). Shared graph-authoring judgment — declarative claims, low-confidence routing, contradiction routing, confident endpoints, and role-named mutation grammar — lives in `src/agents/contexts/references/graph-authoring-heuristics.md`. Conversational answers, ordinary user text, and acquisition digests are all sweep inputs. Large raw reads or tool results should be digested first; capture from the digest plus the conversation, not from unbounded raw bulk.
+Walk the un-swept material once using the current conversation stage as an attention order. If the user gives later-band material early, capture it honestly; do not down-rank or discard it because the session has not “reached” that band. Stage guides questioning, not graph validity.
+
+| Pass | Attention job | Common route |
+| --- | --- | --- |
+| Grounding → orient | domain, protagonist, pain/pull, vocabulary, constraints, ambient context | intent graph truth or grounding gaps |
+| Elicitation → strengthen | requirements, assumptions, constraints, invariants, decisions, criteria, examples | graph truth when explicit/implicit confidence is high; gaps otherwise |
+| Anytime sidecar | `story`, `sketch`, and `example` material that appears opportunistically | capture as the given kind when useful; do not force it into the current band |
+| Projection: design → derive shape | modules, interfaces, entities, sketches implied by accepted intent | usually review-set/proposal material; direct capture only when user explicitly establishes it |
+| Projection: oracle → derive witness | criteria, checks, methods, evidence, obligations, examples/counterexamples | graph truth or proposal material; keep checkability as conduct, not schema |
+| Projection: closure → commitments | reviewable decisions, accepted batches, contradictions, unsettled commitments | graph truth, review set, or reconciliation need |
+| Projection: plan → sequence work | milestones, frontiers, slices, dependencies, acceptance signals | graph truth or proposal material anchored to accepted pressure |
+
+Use `src/agents/contexts/references/context-slice-index.md` to choose the smallest topical reference for the pass. For ordinary capture, prefer `intent-capture-slice.md` plus `graph-authoring-heuristics.md`; pull design/oracle/plan slices only when the user actually supplied or requested that material. The exact node/edge vocabulary lives in `src/agents/contexts/references/graph-ontology.md` and the shared graph-authoring judgment — declarative claims, low-confidence routing, contradiction routing, confident endpoints, and role-named mutation grammar — lives in `src/agents/contexts/references/graph-authoring-heuristics.md`.
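
The low-confidence and contradiction routing named above, together with the tool names this method lists as its mutation boundary, can be pictured as one dispatch. What follows is a hedged TypeScript sketch, not the method's implementation: the `Noticing` and `Route` types and the `routeNoticing` function are hypothetical, and the returned objects are illustrative labels only, not real tool payloads (the method explicitly forbids inventing payload fields).

```typescript
// Hypothetical classification of a swept span; these labels are invented for illustration.
type Noticing =
  | "stated"          // directly stated by the user
  | "materialized"    // confidently derived from stated content
  | "judgment_heavy"  // coherent, but needs the user's review
  | "low_confidence"  // suspicion, possible implication, missing piece
  | "contradiction";  // conflicts with existing graph truth

// Illustrative route labels only; NOT the tools' real argument shapes.
type Route =
  | { tool: "mutate_graph"; basis: "explicit" | "implicit" }
  | { tool: "present_review_set" }
  | { tool: "update_elicitation_gaps" }
  | { tool: "update_reconciliation_needs" };

// Confidence controls commitment: only the first two cases write graph truth.
function routeNoticing(noticing: Noticing): Route {
  switch (noticing) {
    case "stated":
      return { tool: "mutate_graph", basis: "explicit" };
    case "materialized":
      return { tool: "mutate_graph", basis: "implicit" };
    case "judgment_heavy":
      return { tool: "present_review_set" };
    case "low_confidence":
      return { tool: "update_elicitation_gaps" };
    case "contradiction":
      return { tool: "update_reconciliation_needs" };
  }
}
```

The exhaustive `switch` is a deliberate choice in this sketch: adding a new noticing label without a route would fail type-checking instead of silently falling through to a graph write.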
+
+Conversational answers, ordinary user text, and acquisition digests are all sweep inputs. Large raw reads or tool results should be digested first; capture from the digest plus the conversation, not from unbounded raw bulk.
 
 Use the graph, gap, and reconciliation tools as the mutation boundary:
 
@@ -31,6 +45,7 @@ Use the graph, gap, and reconciliation tools as the mutation boundary:
 | New agenda | `update_elicitation_gaps` `spawn` | one gap write; question/rationale only, not domain truth |
 | Manual gap disposition | `update_elicitation_gaps` `set_disposition` | one disposition write on the graph clock |
 | Contradiction with existing graph truth | `update_reconciliation_needs` `create` | one reconciliation need; records the impasse, never overwrites the conflicting node |
+| Candidate batch needing judgment | `present_review_set` | reviewable proposal only; commitment waits for explicit approval |
 
 Do not invent graph payload fields, LSNs, result shapes, or capture-local edge syntax. Follow the role-named mutation grammar in `graph-authoring-heuristics.md`.
 
@@ -42,6 +57,7 @@ Confidence controls commitment. Directness alone does not.
 | --- | --- | --- |
 | directly stated by the user | commit graph truth | `explicit` |
 | confidently materialized from stated content, including safe implied edges or structure | commit graph truth | `implicit` |
+| coherent but judgment-heavy candidate material | present a review set | no graph basis until accepted |
 | a low-confidence noticing, suspicion, possible implication, or missing piece | never commit; map to an elicitation gap | gap `basis: implicit` |
 | a contradiction with existing graph truth | never commit or spawn a gap; create a reconciliation need | `semantic_conflict` over the conflicting `node_pair` |
 
@@ -74,11 +90,14 @@ Structural gaps become answered from graph truth. Do not hand-set `answered` for
 
 Review captured nodes before adding edges.
Use `graph-authoring-heuristics.md` for the shared relation-bearing rule: commit missing high-confidence endpoints first, use role-named endpoints, and skip the edge when either endpoint is low-confidence. Spawn or reuse a gap for the missing endpoint/relationship instead.
 
+When reading existing context for an anchor, prefer edge-local neighborhoods over global kind buckets: dependencies, dependents, evidence, refinements, lateral neighbors, gaps, and reconciliation needs tell you why the anchor stands and what changes if it moves.
+
 ## Anti-goals
 
 - Do not run a product-side extraction pass or revive submit-time labeled-prefix capture.
+- Do not create a separate model-invoked skill for each ontology kind; the capture-sweep bands are procedural conduct.
 - Do not create observer/auditor queues as the primary capture path.
 - Do not store low-confidence domain content inside gaps as if it were truth.
 - Do not file contradictions as elicitation gaps; use reconciliation needs for retrospective impasses.
-- Do not create a second mutation clock; graph mutations, gap writes, and reconciliation-need writes share the selected spec's `{specId, lsn}` change log.
+- Do not create a second mutation clock; graph mutations, gap writes, review acceptance, and reconciliation-need writes share the selected spec's `{specId, lsn}` change log.
 - Do not use capture-local `{category, source, target}` edge dialects; use the canonical role-named `mutate_graph` grammar.
diff --git a/src/agents/skills/methods/elicit-by-question/SKILL.md b/src/agents/skills/methods/elicit-by-question/SKILL.md
index d0b6b372e..bd2278b8d 100644
--- a/src/agents/skills/methods/elicit-by-question/SKILL.md
+++ b/src/agents/skills/methods/elicit-by-question/SKILL.md
@@ -5,7 +5,7 @@ description: "Acquire missing material by asking the human one focused question.
 
 # Method: elicit by question
 
-Use this acquisition mode when the next missing material is best obtained by asking the human one focused question and letting the answer enter the transcript directly. This is the ordinary Brunch elicitation path: ask, receive, then let the capture sweep decide what becomes graph truth and what becomes agenda.
+Use this acquisition mode when the next missing material is best obtained by asking the human one focused question and letting the answer enter the transcript directly. This is the ordinary Brunch elicitation path: ask, receive, then let the banded capture-sweep decide what becomes graph truth and what becomes agenda.
 
 ## Use when
 
@@ -25,7 +25,7 @@ chain elicit-by-question:
   open gap or uncertainty
     -> one focused assistant question
     -> human answer in transcript
-    -> capture sweep over answer
+    -> banded capture-sweep over answer
     -> next question from updated graph + gaps
 ```
 
@@ -34,4 +34,4 @@ chain elicit-by-question:
 - Do not read files or search the web just because a question could be researched; ask when the human is the source of truth.
 - Do not batch unrelated questions into a questionnaire.
 - Do not treat a leading question as established graph truth.
-- Do not bypass the capture sweep with direct graph claims in prose.
+- Do not bypass the banded capture-sweep with direct graph claims in prose.
diff --git a/src/agents/skills/methods/explore-and-characterize/SKILL.md b/src/agents/skills/methods/explore-and-characterize/SKILL.md
index e08e00e82..9b204bb0e 100644
--- a/src/agents/skills/methods/explore-and-characterize/SKILL.md
+++ b/src/agents/skills/methods/explore-and-characterize/SKILL.md
@@ -12,20 +12,20 @@ Use this acquisition mode when the session needs an initial map of a brownfield
 - The situating gap or user says this is a brownfield codebase or existing system.
 - The user asks Brunch to understand an area before asking detailed questions.
 - The current gap cannot be answered until you know the shape of files, docs, routes, APIs, or tests.
-- A digest of the territory would let the capture sweep produce grounded graph truth or better gaps.
+- A digest of the territory would let the banded capture-sweep produce grounded graph truth or better gaps.
 
 ## Conduct
 
 Start with the smallest useful reconnaissance: list or search only the named area, read nearby README or topology notes first, then inspect a few files that answer the current orientation question. Keep exploration bounded by the user's stated area and the current gap. Use `web_search` / `web_fetch` only for external references that are actually needed; local brownfield reading should prefer local read/search tools.
 
-After exploring, write an assistant-authored characterization digest in the transcript. The digest is the handoff artifact to capture: it should name the area inspected, the observed topology, high-confidence facts, and open uncertainties. The capture sweep then commits high-confidence material or spawns gaps. Raw file listings, search hits, and tool outputs stay background.
+After exploring, write an assistant-authored characterization digest in the transcript. The digest is the handoff artifact to capture: it should name the area inspected, the observed topology, high-confidence facts, and open uncertainties. The banded capture-sweep then commits high-confidence material or spawns gaps. Raw file listings, search hits, and tool outputs stay background.
 
 ```pseudo
 chain explore-and-characterize:
   brownfield orientation need
     -> bounded local/web reads
     -> assistant characterization digest
-    -> capture sweep over digest
+    -> banded capture-sweep over digest
     -> next question from updated graph + gaps
 ```
 
diff --git a/src/agents/skills/methods/generate-proposal/SKILL.md b/src/agents/skills/methods/generate-proposal/SKILL.md
index 17c7c8f6b..7c81e8e9e 100644
--- a/src/agents/skills/methods/generate-proposal/SKILL.md
+++ b/src/agents/skills/methods/generate-proposal/SKILL.md
@@ -12,15 +12,18 @@ Generate proposal material by fanning out alternatives, making comparison legibl
 Use the same spine for every plane:
 
 1. Read the active lens/plane and the current graph/session context.
-2. Load the matching plane reference:
-   - `references/intent.md` when generating intent-plane territory candidates.
-   - `references/design.md` when generating design-plane module or boundary candidates.
-   - `references/oracle.md` when generating oracle-plane verification ensembles.
-3. Fan out candidate alternatives with explicit comparison axes.
-4. Call `present_candidates` for recognition/comparison before any commit-facing draft.
-5. Call `present_review_set` only when the user is reviewing a structurally valid graph-draft batch.
-6. Call `request_response` after the presentation tool. Do not call retired request-specific tools directly.
-7. Treat the user's response as selection/review input, not as an automatic graph write.
+2. Prefer edge-local neighborhoods for the anchors under proposal, then load the smallest matching topical slice from `../../../contexts/references/context-slice-index.md`.
+3. Load the matching plane reference:
+   - `references/intent.md` plus `../../../contexts/references/intent-capture-slice.md` when generating intent-plane territory candidates.
+   - `references/design.md` plus `../../../contexts/references/design-projection-slice.md` when generating design-plane module or boundary candidates.
+   - `references/oracle.md` plus `../../../contexts/references/oracle-witness-slice.md` when generating oracle-plane verification ensembles.
+   - `../../../contexts/references/plan-sequencing-slice.md` only when the proposal is explicitly about milestones, frontiers, slices, or work sequencing.
+   - `../../../contexts/references/review-set-drafting-slice.md` before presenting any commit-facing graph-draft batch.
+4. Fan out candidate alternatives with explicit comparison axes.
+5. Call `present_candidates` for recognition/comparison before any commit-facing draft.
+6. Call `present_review_set` only when the user is reviewing a structurally valid graph-draft batch.
+7. Call `request_response` after the presentation tool. Do not call retired request-specific tools directly.
+8. Treat the user's response as selection/review input, not as an automatic graph write.
 
 Intent uses `pick`, design uses `synthesize`, and oracle uses `compose`. These are method conduct inside this skill, not schema fields or bespoke tools. Do not add a fan-in field, a multi-select affordance, or a plane-specific commit path unless a later build explicitly changes the architecture.
 
@@ -51,8 +54,10 @@ present_candidates({ heading, candidates: [...] })
 
 ## Plane references
 
-The disclosed references are branch-specific payload, not independently advertised skills. Load exactly one unless the user explicitly asks to compare planes.
+The disclosed references are branch-specific payload, not independently advertised skills. Load exactly one branch reference unless the user explicitly asks to compare planes. Topical context slices are supporting references, not new model-invoked skills.
 
-- `references/intent.md`: intent plane, single pick, grounding-density scaling.
-- `references/design.md`: design plane, synthesize, radically different module/interface shapes.
-- `references/oracle.md`: oracle plane, compose, additive verification ensembles and blind spots.
+- `references/intent.md`: intent plane, single pick, grounding-density scaling. Pair with `../../../contexts/references/intent-capture-slice.md` when candidate material names graph kinds or capture routes.
+- `references/design.md`: design plane, synthesize, radically different module/interface shapes. Pair with `../../../contexts/references/design-projection-slice.md` and relevant neighborhoods.
+- `references/oracle.md`: oracle plane, compose, additive verification ensembles and blind spots. Pair with `../../../contexts/references/oracle-witness-slice.md` and relevant neighborhoods.
+- `../../../contexts/references/plan-sequencing-slice.md`: supporting slice for explicit plan projection; keep it behind this method for now rather than adding a separate `project-graph` or plan lens.
+- `../../../contexts/references/review-set-drafting-slice.md`: supporting slice for turning selected material into a reviewable graph batch.
diff --git a/src/agents/skills/methods/ingest-paste/SKILL.md b/src/agents/skills/methods/ingest-paste/SKILL.md
index f35b48b1b..299e70fe1 100644
--- a/src/agents/skills/methods/ingest-paste/SKILL.md
+++ b/src/agents/skills/methods/ingest-paste/SKILL.md
@@ -5,7 +5,7 @@ description: "Acquire user-provided pasted material as conversational transcript
 
 # Method: ingest paste
 
-Use this acquisition mode when the human provides a block of text, notes, requirements, logs, transcript excerpts, or other pasted material as the ground material for the selected spec. The paste enters the conversation directly; capture stays uniform and happens through the next capture sweep.
+Use this acquisition mode when the human provides a block of text, notes, requirements, logs, transcript excerpts, or other pasted material as the ground material for the selected spec. The paste enters the conversation directly; this method stays thin and capture stays uniform through the next banded capture-sweep.
 
 ## Use when
 
@@ -24,7 +24,7 @@ For large pasted material, compress before committing: name the sections, preser
 chain ingest-paste:
   user paste
     -> brief assistant orientation if useful
-    -> capture sweep over paste/conversation
+    -> banded capture-sweep over paste/conversation
     -> explicit or implicit graph commits only when confidence is high
     -> gaps for unresolved or ambiguous implications
 ```
diff --git a/src/agents/skills/methods/read-context/SKILL.md b/src/agents/skills/methods/read-context/SKILL.md
index d55298607..9ebe410b9 100644
--- a/src/agents/skills/methods/read-context/SKILL.md
+++ b/src/agents/skills/methods/read-context/SKILL.md
@@ -7,8 +7,22 @@ description: "Use pushed context handles and read-only context tools for selecte
 
 Use this method when pushed prompt context is insufficient for the next elicitation move. It tells you how to sequence selected-spec reads without turning context gathering into a separate research project.
 
-Start from the handles in the runtime prompt: selected spec, soft readiness estimate, active strategy/lens, workspace posture, and any graph overview. Pull more context only when it will change the next question, proposal, capture decision, or graph write. Prefer compact overview for orientation and focused node neighborhoods for a specific claim or projected code.
+Start from the handles in the runtime prompt: selected spec, soft readiness estimate, active strategy/lens, workspace posture, and any graph overview. Pull more context only when it will change the next question, proposal, capture decision, or graph write. Prefer compact overview for orientation, then edge-local neighborhoods for the claim, design seam, oracle, or plan item under discussion.
 
-Use read-only context tools such as `read_graph`, `read_session_context`, `web_fetch`, and `web_search` where available.
Reach for `web_fetch` when a specific URL is already in hand; use `web_search` only when current external context or alternate sources would change the next elicitation move. Keep graph truth distinct from active-context projections: accepted records are truth, while rendered summaries and web extracts are orientation until captured through Brunch's graph path. If the user mentions a node code, resolve it through the product read path rather than guessing from memory.
+## Edge-local preference
+
+When a move is centered on an existing graph item, read the anchor and its neighborhood before asking or proposing. Use `src/agents/contexts/references/neighborhood-consumption-slice.md` as the conduct reference: bucket neighbors as dependencies, dependents, evidence, refinements, lateral context, open gaps, and reconciliation needs. This is usually more useful than loading all nodes of a kind, because it shows why the anchor stands and what downstream material changes if it moves.
+
+```pseudo
+context read:
+  selected-spec overview
+    -> anchor node / node code when present
+    -> edge-local neighborhood buckets
+    -> topical slice only if the next move needs projection or capture guidance
+```
+
+Use global kind lists only for orientation, coverage scans, or when no anchor exists yet. Do not infer relation direction from raw storage coordinates; rely on rendered labels, role names, and impact buckets. If the user mentions a node code, resolve it through the product read path rather than guessing from memory.
+
+Use read-only context tools such as `read_graph`, `read_session_context`, `web_fetch`, and `web_search` where available. Reach for `web_fetch` when a specific URL is already in hand; use `web_search` only when current external context or alternate sources would change the next elicitation move.
Keep graph truth distinct from active-context projections: accepted records are truth, while rendered summaries and web extracts are orientation until captured through Brunch's graph path.
 
 Compose this before `generate-proposal`, `commit-graph`, and topology-driven lens questions. Out of scope: filesystem exploration unrelated to the selected spec, direct DB inspection, or treating stale prompt context as proof when a fresh graph read is needed.
diff --git a/src/agents/skills/methods/read-referenced-documents/SKILL.md b/src/agents/skills/methods/read-referenced-documents/SKILL.md
index 7eac32175..c58bfb238 100644
--- a/src/agents/skills/methods/read-referenced-documents/SKILL.md
+++ b/src/agents/skills/methods/read-referenced-documents/SKILL.md
@@ -5,7 +5,7 @@ description: "Read bounded user-referenced documents and digest them before capt
 
 # Method: read referenced documents
 
-Use this acquisition mode when the human points at specific files, URLs, docs, tickets, or other bounded references that should ground the spec. The job is to read the referenced material, author a digest in the conversation, and let the capture sweep work from that digest plus any conversational framing.
+Use this acquisition mode when the human points at specific files, URLs, docs, tickets, or other bounded references that should ground the spec. The job is to read the referenced material, author a digest in the conversation, and hand off to the banded capture-sweep over that digest plus any conversational framing.
 
 ## Use when
 
@@ -25,7 +25,7 @@ chain read-referenced-documents:
   bounded user reference
     -> legal read/fetch/search tools
     -> assistant-authored digest in transcript
-    -> capture sweep over digest + conversation
+    -> banded capture-sweep over digest + conversation
     -> graph truth / elicitation gaps by confidence
 ```