fix(shield): add P0-gate to /plan-review verdict + fix doc inconsistencies by ashwinimanoj · Pull Request #63 · infraspecdev/tesseract

ashwinimanoj · 2026-06-04T17:13:06Z

Summary

Brings /plan-review to parity with /prd-review's verdict semantics and fixes four file-level contradictions discovered while comparing the two skills.

The core fix is the P0-gate: prd-review/scoring.md was explicitly forked from plan-review/scoring.md with a P0-gate added — but the gate was never back-ported. So a plan with a Critical-severity D/F finding could still score Ready, because that F got averaged across the 10 PM dims and then down-weighted to 0.7 (the "averaging problem"). /plan-review computed P0s but never let them gate the verdict.
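The averaging problem can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the repository's code: the 4-point grade mapping, the `architect` persona and its weight of 1.0, and the helper names are all assumptions made for illustration. Only the 10 PM dimensions, the 0.7 PM weight, and the 2.5 cutoff come from the description above.

```python
# Assumed 4-point scale; the PR does not state the actual grade mapping.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def persona_score(grades):
    """Unweighted mean of one persona's dimension grades."""
    return sum(GRADE_POINTS[g] for g in grades) / len(grades)


def composite(persona_scores, weights):
    """Weighted mean of the persona scores."""
    total_weight = sum(weights[p] for p in persona_scores)
    weighted_sum = sum(persona_scores[p] * weights[p] for p in persona_scores)
    return weighted_sum / total_weight


# Hypothetical review: 10 PM dimensions, one of them a Critical-severity F.
pm_grades = ["A"] * 9 + ["F"]
scores = {"pm": persona_score(pm_grades), "architect": 4.0}
weights = {"pm": 0.7, "architect": 1.0}  # 0.7 is from the PR; 1.0 is assumed

pm = scores["pm"]
overall = composite(scores, weights)
print(f"PM score {pm:.2f}, composite {overall:.2f}")
```

Under these assumptions the single F barely moves the PM score, and the composite stays well above 2.5, so a composite-only verdict would still read Ready.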

Changes

| # | Issue | Fix |
|---|-------|-----|
| 4 | No P0-gate (vs prd-review) | New `shield/scripts/compute_plan_verdict.py` computes composite + P0 count and applies the gate (composite ≥ 2.5 with any P0 → Needs Work). Skill scoring step now invokes it. |
| 3 | `scoring.md` weight table incomplete + divergent persona names | Script `WEIGHTS` is now the single source of truth; `scoring.md`/`dimensions.md` reference it. Added platform/backend engineers; fixed `Cloud Architect`/`Operations` naming; fixed overlapping verdict-label ranges. |
| 1 | `templates.md` reported the dead `plan-review/{N}-{slug}/` path | Replaced with date-keyed `reviews/plan/{date}{_counter}/`. |
| 2 | `templates.md` Dispatch Prompt used the pre-restructure inlined-persona pattern | Rewrote for Pattern A `subagent_type` dispatch. |
| 6 | Summary template predated gates 0a–0i | Added a Deterministic Gates section. |
| 5 | No immutable source snapshot (vs prd-review's `source-prd.md`) | New Step 1b writes `source-plan.md`. |
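The gate in row 4 can be sketched as follows. This is an illustration under stated assumptions, not the contents of `compute_plan_verdict.py`: the `Finding` type, the function names, the lower cutoff of 1.5, and the bare "Needs Work" / "Not Ready" labels are hypothetical. The 2.5 cutoff, the P0 definition (Critical-severity findings graded D/F), and the gated message format are taken from this PR.

```python
from dataclasses import dataclass

READY_THRESHOLD = 2.5       # from the PR: the gate applies at composite >= 2.5
NEEDS_WORK_THRESHOLD = 1.5  # assumed; the PR does not state this cutoff


@dataclass
class Finding:
    severity: str  # e.g. "Critical", "High"
    grade: str     # "A" through "F"


def count_p0(findings):
    """Count P0s: Critical-severity findings graded D or F."""
    return sum(
        1 for f in findings
        if f.severity == "Critical" and f.grade in ("D", "F")
    )


def verdict(composite, findings):
    """Composite-based verdict, with any P0 blocking a Ready result."""
    p0 = count_p0(findings)
    if composite >= READY_THRESHOLD:
        if p0:
            return f"Needs Work (composite {composite:.2f}, blocked by {p0} P0)"
        return "Ready"
    if composite >= NEEDS_WORK_THRESHOLD:
        return "Needs Work"
    return "Not Ready"


print(verdict(3.61, [Finding("Critical", "F")]))
```

Keeping the P0 count separate from the composite is the point of the design: the gate looks at individual findings, so it cannot be diluted by averaging or persona weights.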

Eval coverage (per `updating-plugin-assets`)

New deterministic suite: `shield/evals/plan-review-verdict.yaml` (+ 4 `grades.json` fixtures), run via `uv run shield/evals/run.py plan-review-verdict`.

RED (script absent): 0/4 cases passed.
GREEN: 4/4 cases passed, including high-composite-p0 → Needs Work (composite 3.61, blocked by 1 P0) — the exact bug.
No regression: existing plan-review-trd gates suite still 5/5.

=== eval suite: plan-review-verdict (4 cases) ===
  PASS clean-ready
  PASS high-composite-p0
  PASS needs-work-threshold
  PASS not-ready
=== 4/4 cases passed ===

Note for reviewers

This makes the verdict script-computed rather than LLM-computed — a deliberate execution-model change that makes the gate deterministic and testable, matching the existing run.py + check_plan_review_trd.py pattern. If preferred, the script can instead stay an eval-only oracle with the verdict left to the LLM (small revert of the SKILL.md scoring step).

Version: shield 2.26.0 → 2.27.0.

🤖 Generated with Claude Code

…ncies

Brings /plan-review to parity with /prd-review's verdict semantics and resolves four file-level contradictions in the skill.

P0-gate (#4), the substantive fix:
- New shield/scripts/compute_plan_verdict.py computes the weighted composite, detects P0s (Critical-severity findings graded D/F), and applies the P0-gate: a composite >= 2.5 with any P0 is gated to "Needs Work", not "Ready". Previously the verdict was composite-only, so a Critical-F could be diluted across the 10 PM dims (then down-weighted to 0.7) and the plan still scored "Ready". prd-review/scoring.md was forked from plan-review WITH this gate added; it was never back-ported until now.
- The script is the single source of truth for persona weights (WEIGHTS), which resolves #3 (scoring.md omitted platform/backend engineers and used divergent persona names vs dimensions.md).

Eval (RED -> GREEN, committed):
- shield/evals/plan-review-verdict.yaml + 4 grades.json fixtures; run.py gains a `verdict` branch. RED: 0/4 (script absent). GREEN: 4/4, including high-composite-p0 -> "Needs Work (composite 3.61, blocked by 1 P0)".
- Existing plan-review-trd gates suite: 5/5, no regression.

Doc/consistency fixes:
- scoring.md (#3,#4): P0-gate verdict table, completed+renamed weight table, fixed overlapping verdict-label ranges, references the script for weights.
- templates.md (#1,#2,#6): replaced dead plan-review/{N}-{slug}/ paths with the date-keyed reviews/plan/{date}{_counter}/ paths; rewrote the Dispatch Prompt for Pattern A subagent_type dispatch (no inlined agent markdown); added a Deterministic Gates section to the summary template.
- SKILL.md (#4,#5): P0-gated verdict in description; new Step 1b source-plan.md immutable snapshot; scoring step now invokes compute_plan_verdict.py; documented source-plan.md + grades.json in the output tree.
- dimensions.md: weights reference the canonical script table.

Version: shield 2.26.0 -> 2.27.0 (marketplace.json).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ashwinimanoj merged commit 5ab6a65 into main Jun 4, 2026
12 checks passed

ashwinimanoj deleted the fix/plan-review-p0-gate-and-consistency branch June 4, 2026 17:47

ashwinimanoj merged 1 commit into main from fix/plan-review-p0-gate-and-consistency
