Test discipline: behavioral-name linter + runbook stage-readiness doctrine #668

## Background

PR #663 (closes #662) enforced behavioral test names (no planning-artifact identifiers like F-row labels or sketch IDs) and used a counter-test runbook for fresh-session validation of hook behavior. Two test-discipline follow-ups surfaced during review.

## Task A — Behavioral-name linter for tests

Add a static check that flags planning-artifact identifiers in test names. The pinned `feedback_no_planning_artifacts_in_repo` rule already states this in prose; a mechanical check keeps test names from drifting away from it.

Audit pattern:

```bash
rg -n '\bF[0-9]+\b|\bT[0-9]+\b|\bS-[0-9]+\b|Sketch [A-Z]|architect §' \
   pact-plugin/tests/ \
   --type py
```

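The audit should return zero hits. Wrap the check as a pytest plugin or a pre-commit hook. Counter-test by adding a test named `test_f1_deny_when_name_empty` and confirming the linter catches it. The audit regex alone will not catch that name: it is case-sensitive, and `\b` finds no word boundary between `_` and `f`. The linter therefore needs a name-aware variant that matches `_`-delimited tokens case-insensitively.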

Carve-outs:
- `Closes #N` / `Fixes #N` in commit messages and runbook titles
- External standards refs (RFC, OWASP, CVE)
- Internal Python identifiers with no user-visible counterpart (e.g., a private module-local constant)

Surfaced by test-engineer-blind in PR #663 review as Q2.

## Task B — Stage-readiness doctrine for runbooks

PR #663 added `pact-plugin/tests/runbooks/662-dispatch-gate.md` as a counter-test runbook for fresh-session validation. The runbook format works but lacks a documented standard for:

- When to add a runbook (which PRs warrant one)
- Required sections (counter-test by mutation, fail-closed wrapper sabotage, observable-behavior diff against `main`)
- Stage-readiness criteria — when is a runbook "ready" to ship as part of a PR

Document the doctrine in `pact-plugin/tests/runbooks/README.md`. Reference the 662-dispatch-gate runbook as the canonical example.

Stage-readiness criteria draft:
1. Counter-test cardinality is documented (e.g., "revert against `main` produces deny cardinality {3} for matcher mutation")
2. Each section header is behavioral, not planning-artifact (e.g., "Matcher registration fidelity (counter-test by mutation)" not "F22 fidelity")
3. Section content includes the exact bash/python commands to reproduce the result, not just a prose description
4. Runbook title may include `(#N)` provenance per the planning-artifact carve-out
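
Criteria 2 and 3 above lend themselves to a mechanical spot check, even though the doctrine itself is documentation. The sketch below is a hedged illustration only. It assumes runbook sections are level-2 Markdown headers (`## ...`), which this issue does not specify, and `check_runbook` is a hypothetical helper name. The title line is deliberately not checked, so a `(#N)` provenance marker there is allowed.

```python
import re

# Prose form of the planning-artifact shapes from the Task A audit pattern.
ARTIFACT = re.compile(
    r"\bF[0-9]+\b|\bT[0-9]+\b|\bS-[0-9]+\b|Sketch [A-Z]|architect §"
)
# An opening code fence tagged bash, sh, or python at the start of a line.
COMMAND_FENCE = re.compile(r"^`{3}(?:bash|sh|python)\b", re.MULTILINE)


def check_runbook(markdown: str) -> list[str]:
    """Return one message per stage-readiness problem found in a runbook."""
    problems = []
    # Assumption: sections start with "## ". Everything before the first
    # section (title, preamble) is skipped.
    sections = re.split(r"^## +", markdown, flags=re.MULTILINE)[1:]
    for section in sections:
        header, _, body = section.partition("\n")
        if ARTIFACT.search(header):
            problems.append(f"planning-artifact identifier in header: {header!r}")
        if not COMMAND_FENCE.search(body):
            problems.append(f"no bash/python command block under: {header!r}")
    return problems
```

A header such as "F22 fidelity" would be reported, while "Matcher registration fidelity (counter-test by mutation)" with a command block beneath it would pass. The split is naive: a line beginning with `## ` inside a code fence would be misread as a section header. Criterion 1 (documented cardinality) is left to human review.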

Surfaced by test-engineer-blind in PR #663 review as Q3.

## Relationship to other follow-ups

- `feedback_no_planning_artifacts_in_repo` (project CLAUDE.md pin): the pin states the rule in prose; Task A enforces it mechanically.
- PR #663 (closed): established the counter-test runbook format that Task B documents.

## Test plan

For Task A: add a parametrized test that exercises each planning-artifact pattern against a known-bad fixture and confirms the linter flags it. Counter-test by removing the linter rule and confirming the test fails.
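
The Task A plan could look roughly like the following. To stay standard-library only, this sketch uses `unittest` with `subTest` instead of `pytest.mark.parametrize`; the structure carries over directly. `RULES` and `flagged_rules` are hypothetical stand-ins for the real linter, and the fixture names in `KNOWN_BAD` are invented for illustration.

```python
import re
import unittest

# Hypothetical stand-in for the linter: one named rule per artifact shape.
RULES = {
    "f_row": re.compile(r"(?:^|_)f[0-9]+(?=_|$)"),
    "t_label": re.compile(r"(?:^|_)t[0-9]+(?=_|$)"),
    "s_label": re.compile(r"(?:^|_)s_[0-9]+(?=_|$)"),
    "sketch_id": re.compile(r"(?:^|_)sketch_[a-z](?=_|$)"),
}


def flagged_rules(test_name, rules=RULES):
    """Return the names of the rules that fire on a test name."""
    lowered = test_name.lower()
    return {rule for rule, pattern in rules.items() if pattern.search(lowered)}


# Known-bad fixtures (invented names), one per rule.
KNOWN_BAD = [
    ("f_row", "test_f1_deny_when_name_empty"),
    ("t_label", "test_t3_allows_named_dispatch"),
    ("s_label", "test_s_4_rejects_unknown_matcher"),
    ("sketch_id", "test_sketch_b_wrapper_fails_closed"),
]


class PlanningArtifactLinterTest(unittest.TestCase):
    def test_each_pattern_is_flagged(self):
        # Each fixture is reported separately, like a parametrized case.
        for rule, name in KNOWN_BAD:
            with self.subTest(rule=rule, name=name):
                self.assertIn(rule, flagged_rules(name))

    def test_behavioral_name_is_clean(self):
        self.assertEqual(flagged_rules("test_deny_when_name_empty"), set())
```

The counter-test then amounts to deleting one entry from `RULES` (say `"f_row"`) and confirming that the matching subtest fails. If every subtest still passes, the test is not actually exercising that rule.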

For Task B: this is documentation, not code. Stage-readiness is met when the doctrine README is written and the next runbook PR references it.

