feat(v2.1.0): prompt quality patterns — Iterations 1, 2, 3#1

Merged
demwick merged 26 commits into main from develop
Apr 15, 2026

Conversation

demwick commented Apr 15, 2026

Owner

Summary

Ships v2.1.0 prompt-quality layer across three iterations, all merged into develop:

  • Iter 1 — Comprehension + Evidence: Rule 7 (Evidence-Bearing Exit Reports) in _common.md, Step 0 (Demonstrate Comprehension) in researcher.md, planner.md, executor.md. verifier.md intentionally excluded.
  • Iter 2 — Negative scope bounds: per-task Allowed paths / Forbidden paths in planner Mode B schema, pre-commit scope check in executor with scope violation blocked status, sample-plan-with-scope.md fixture, scope-creep-detection.sh suite.
  • Iter 3 — Risk gates: per-plan risk_gates taxonomy in planner, gate-pause protocol in executor (STATUS: gate + .sea/phases/phase-N/gate-pending.json marker), Step 4.5 "Risk gate inspection" + "Resume after gate" in sea-go/SKILL.md, sample-plan-with-gates.md fixture, risk-gate-flow.sh suite, docs/STATE.md marker entry.
  • Infra: scripts/check-coverage.sh + coverage-matrix-shape.sh eval suite; .DS_Store added to .gitignore; source spec + superpowers plan committed under docs/.
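
To make the Iter 2 scope check concrete, here is a minimal sketch of how a pre-commit check against Allowed paths / Forbidden paths might work. It is not the executor's actual logic: the function name `check_scope`, the use of shell-style glob patterns, and the rule that a forbidden match always wins are all assumptions made for illustration.

```python
from fnmatch import fnmatch


def check_scope(changed, allowed, forbidden):
    """Return the changed paths that fall outside a task's scope.

    Assumed semantics (not confirmed by this PR): patterns are shell-style
    globs, a forbidden match always wins, and every path must match at
    least one allowed pattern.
    """
    violations = []
    for path in changed:
        if any(fnmatch(path, pattern) for pattern in forbidden):
            violations.append(path)
        elif not any(fnmatch(path, pattern) for pattern in allowed):
            violations.append(path)
    return violations


# Hypothetical staged files for one task.
changed = ["agents/executor.md", "docs/STATE.md", ".github/workflows/ci.yml"]
violations = check_scope(
    changed,
    allowed=["agents/*", "docs/*"],
    forbidden=[".github/*"],
)

# Any violation turns the exit report into a blocked status.
status = "blocked" if violations else "done"
```

Checking forbidden patterns first keeps the rule easy to reason about: a path listed as forbidden can never be rescued by a broad allowed pattern.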

Eval status: bash evals/run.sh → 23 passed, 0 failed.

Test plan

  • Structural evals: bash evals/run.sh → 23/23 green
  • bash evals/suites/agents/prompt-quality.sh → passed
  • bash evals/suites/agents/scope-creep-detection.sh → passed
  • bash evals/suites/agents/risk-gate-flow.sh → passed
  • Live end-to-end validation (spec gate, Iter 3): run claude --plugin-dir "$(pwd)" against a throwaway repo with sample-plan-with-gates.md as the target plan; observe executor pausing at gate task, sea-go surfacing confirmation prompt, and resume path completing. Log run in PR comment.
  • Remove "Pending (Iter 3)" notice from CHANGELOG once live validation passes
  • Tag v2.1.0 after merge

🤖 Generated with Claude Code

demwick and others added 26 commits April 15, 2026 20:33
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Source spec and its superpowers execution plan for Iterations 1 & 2
(already merged in 2b9149c). Commits the authoring artifacts so the
rationale behind Rule 7, Step 0, and scope bounds is traceable in-repo.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Parses criterion ids (Rx.y) from a plan file and intersects them with
completed_tasks[].covered[] from a progress.json to emit covered /
uncovered / errors sets. Used by the new coverage-matrix-shape eval
suite and available for future planner/verifier integration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
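
The coverage computation described in this commit can be sketched in a few lines. This is an illustrative Python version, not the contents of scripts/check-coverage.sh. It assumes criterion ids look like `R<number>.<number>` and that "errors" means ids claimed in progress.json that the plan never declares; neither detail is spelled out in the commit message.

```python
import json
import re

# Assumed id shape: "R", digits, a dot, digits (for example R1.2).
CRITERION_RE = re.compile(r"\bR\d+\.\d+\b")


def coverage(plan_text, progress):
    """Intersect criterion ids in a plan with ids claimed by completed tasks."""
    declared = set(CRITERION_RE.findall(plan_text))
    claimed = set()
    for task in progress.get("completed_tasks", []):
        claimed.update(task.get("covered", []))
    return {
        "covered": sorted(declared & claimed),
        "uncovered": sorted(declared - claimed),
        # Assumption: an "error" is an id that was claimed but never declared.
        "errors": sorted(claimed - declared),
    }


# Hypothetical plan excerpt and progress.json content.
plan_text = "- R1.1 parse plan\n- R1.2 emit report\n- R2.1 handle missing file\n"
progress = json.loads(
    '{"completed_tasks": [{"id": "t1", "covered": ["R1.1", "R9.9"]}]}'
)
result = coverage(plan_text, progress)
```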
Planner now declares a risk_gates section at the top of every plan,
enumerating tasks that trigger destructive ops, schema migrations, dep
removals, or network state mutation. Each gate carries kind, reason,
and confirmation prompt consumed by sea-go and executor in subsequent
commits.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
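
One way to picture a risk_gates entry is as a small record with a validation pass over it. The sketch below works on already-parsed data (a list of dicts) rather than on the plan's YAML. The four kinds are taken from the fixture described later in this PR; the `task` field name, the `RiskGate` class, and `validate_gates` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Gate kinds as listed for the sample-plan-with-gates.md fixture.
GATE_KINDS = {
    "schema-migration",
    "dependency-removal",
    "destructive-git",
    "network-state-mutation",
}

REQUIRED_FIELDS = ("task", "kind", "reason", "confirmation")


@dataclass(frozen=True)
class RiskGate:
    task: str  # hypothetical field linking the gate to a plan task
    kind: str
    reason: str
    confirmation: str


def validate_gates(raw_gates):
    """Split parsed gate entries into valid RiskGate records and problems."""
    gates, problems = [], []
    for index, raw in enumerate(raw_gates):
        missing = [
            name for name in REQUIRED_FIELDS
            if not str(raw.get(name, "")).strip()
        ]
        if missing:
            problems.append(f"gate {index}: missing {', '.join(missing)}")
            continue
        if raw["kind"] not in GATE_KINDS:
            problems.append(f"gate {index}: unknown kind {raw['kind']!r}")
            continue
        gates.append(RiskGate(*(raw[name] for name in REQUIRED_FIELDS)))
    return gates, problems


# Hypothetical entries: the second one has an empty confirmation prompt.
raw_gates = [
    {"task": "T2", "kind": "schema-migration",
     "reason": "drops a column", "confirmation": "Apply the migration?"},
    {"task": "T5", "kind": "destructive-git",
     "reason": "rewrites history", "confirmation": ""},
]
gates, problems = validate_gates(raw_gates)
```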
…nding marker

Executor now checks the current task against the plan's risk_gates
section before execution. On a gate match, writes
.sea/phases/phase-N/gate-pending.json, marks progress.json task status
'gated', and exits with STATUS: gate. Resume path deletes the marker
and proceeds as a normal task after sea-go captures user confirmation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
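
The pause and resume halves of this protocol can be sketched as two small functions. The marker path and the 'gated' status come from the commit message above; the marker's field names, the shape of the in-memory `progress` dict, and the 'in_progress' status used on resume are assumptions, as are the function names.

```python
import json
import tempfile
from pathlib import Path


def marker_path(root, phase):
    """Location of the gate marker for a given phase."""
    return Path(root) / ".sea" / "phases" / f"phase-{phase}" / "gate-pending.json"


def pause_at_gate(root, phase, task_id, gate, progress):
    """Write the marker, flag the task as gated, and report STATUS: gate."""
    marker = marker_path(root, phase)
    marker.parent.mkdir(parents=True, exist_ok=True)
    # Assumed marker fields: the task id plus the gate's own fields.
    marker.write_text(json.dumps({"task": task_id, **gate}))
    progress.setdefault("tasks", {})[task_id] = "gated"
    return "STATUS: gate"


def resume_after_gate(root, phase, progress):
    """Delete the marker and hand the task back as a normal task."""
    marker = marker_path(root, phase)
    pending = json.loads(marker.read_text())
    marker.unlink()
    progress["tasks"][pending["task"]] = "in_progress"  # assumed status name
    return pending["task"]


# Round trip in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    progress = {"tasks": {}}
    gate = {"kind": "destructive-git", "reason": "force-pushes main",
            "confirmation": "Proceed with the force push?"}
    status = pause_at_gate(root, 3, "T4", gate, progress)
    marker_written = marker_path(root, 3).exists()
    gated_status = progress["tasks"]["T4"]
    resumed_task = resume_after_gate(root, 3, progress)
    marker_left = marker_path(root, 3).exists()
```

Keeping the pending state in a file rather than in memory means a fresh executor launch can pick up where the paused one stopped.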
… path

Step 4.5 surfaces plan risk_gates entries for explicit user confirmation
before executor launch. Step 5 grows a 'gate' case alongside done/blocked
that reads gate-pending.json, collects user confirmation, and re-launches
executor with resume context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
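
As a rough illustration of the Step 4.5 idea, the sketch below asks for explicit confirmation on every gate before anything is launched. The function name `confirm_gates`, the injected `ask` callable, and the rule that only a literal "yes" counts are assumptions, not the wording in sea-go/SKILL.md.

```python
def confirm_gates(gates, ask):
    """Return True only if every gate is explicitly confirmed.

    `ask` is any callable that takes a prompt string and returns the
    user's answer, for example the built-in input().
    """
    for gate in gates:
        prompt = f"[{gate['kind']}] {gate['reason']}\n{gate['confirmation']} (yes/no) "
        if ask(prompt).strip().lower() != "yes":
            return False  # stop at the first gate that is not confirmed
    return True


# Hypothetical gates and scripted answers instead of a live prompt.
gates = [
    {"kind": "schema-migration", "reason": "drops a column",
     "confirmation": "Apply the migration?"},
    {"kind": "dependency-removal", "reason": "removes an unused package",
     "confirmation": "Remove the dependency?"},
]
answers = iter(["yes", "no"])
launch_allowed = confirm_gates(gates, lambda prompt: next(answers))
```

Passing `ask` in as a parameter keeps the check testable without a terminal.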
Adds the v2.1.0 gate-pending marker to the STATE.md inventory: writer,
readers, required fields, invariants, and missing/corrupted behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixture plan exercising the v2.1.0 risk_gates taxonomy with one task
per gate kind: schema-migration, dependency-removal, destructive-git,
network-state-mutation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Structural simulation of the gate state machine: parses the fixture
plan's risk_gates yaml block, asserts each expected gate kind is
present with a non-empty confirmation, and round-trips a synthetic
gate-pending.json marker through the fields sea-go reads on resume.
Does not run a real executor.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds structural checks for v2.1.0 Iter 3: risk_gates in planner.md,
Gate-pause protocol and STATUS: gate in executor.md, Risk gate
inspection and Resume after gate in sea-go/SKILL.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Logs risk_gates schema, gate-pause protocol, Step 4.5/Resume-after-gate,
gate-pending.json, fixture, risk-gate-flow suite, and prompt-quality
extension. Flags pending live end-to-end validation as merge gate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merges feat/prompt-quality-risk-gates into develop.

Iteration 3 installs the risk_gates taxonomy in planner, a gate-pause
protocol in executor (STATUS: gate + gate-pending.json), and Step 4.5
risk gate inspection plus resume-after-gate handling in sea-go.

Live end-to-end validation still pending per spec — see CHANGELOG
'Pending (Iter 3)'.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
demwick merged commit 4e8535b into main Apr 15, 2026
2 checks passed
demwick deleted the develop branch April 15, 2026 20:38
demwick added a commit that referenced this pull request Apr 17, 2026
…guard

Fix #1: Executor must complete all tasks or report STATUS: blocked.
Sea-go now validates progress.json after STATUS: done and re-launches
if tasks are missing. Ambiguous/missing STATUS treated as blocked.
Sea-go never completes remaining tasks itself — only executor writes code.

Fix #2: Sea-go Step 7d no longer blindly sets current_phase=N+1.
When N >= total_phases, current_phase stays at total_phases and
completed=true is set instead. Prevents out-of-range phase numbers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
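
Both fixes in this follow-up commit are small decision rules, sketched below under stated assumptions. The key names `current_phase`, `total_phases`, `completed`, and `completed_tasks` come from this PR; the exact `STATUS:` line format, the per-task `id` field, and the function names are hypothetical.

```python
import re

KNOWN_STATUSES = {"done", "blocked", "gate"}


def parse_status(report):
    """Return the report's status, treating missing or ambiguous ones as blocked."""
    found = set(re.findall(r"^STATUS:\s*(\w+)\s*$", report, flags=re.MULTILINE))
    if len(found) == 1 and found <= KNOWN_STATUSES:
        return found.pop()
    return "blocked"


def next_action(report, planned_ids, progress):
    """Fix #1: trust STATUS: done only if every planned task is recorded."""
    status = parse_status(report)
    if status != "done":
        return status
    finished = {task["id"] for task in progress.get("completed_tasks", [])}
    # Missing tasks mean the executor is launched again; nothing else writes code.
    return "done" if set(planned_ids) <= finished else "relaunch"


def advance_phase(state):
    """Fix #2: never move current_phase past total_phases."""
    current, total = state["current_phase"], state["total_phases"]
    if current >= total:
        return {**state, "current_phase": total, "completed": True}
    return {**state, "current_phase": current + 1}


# Hypothetical run: the executor claims done but recorded only one of two tasks.
action = next_action(
    "work finished\nSTATUS: done\n",
    planned_ids=["T1", "T2"],
    progress={"completed_tasks": [{"id": "T1"}]},
)
final_state = advance_phase({"current_phase": 3, "total_phases": 3})
```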