
Test discipline: behavioral-name linter + runbook stage-readiness doctrine #668

@michael-wojcik


Background

PR #663 (closes #662) enforced behavioral test names (no planning-artifact identifiers like F-row labels or sketch IDs) and used a counter-test runbook for fresh-session validation of hook behavior. Two test-discipline follow-ups surfaced during review.

Task A — Behavioral-name linter for tests

Add a static check that flags planning-artifact identifiers in test names. The pinned feedback_no_planning_artifacts_in_repo rule covers this in prose; mechanical enforcement closes the drift surface.

Audit pattern:

rg -n '\bF[0-9]+\b|\bT[0-9]+\b|\bS-[0-9]+\b|Sketch [A-Z]|architect §' \
   pact-plugin/tests/ \
   --type py

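The command should return zero hits. Wrap the check as a pytest plugin or a pre-commit hook. Counter-test by adding a test named test_f1_deny_when_name_empty and confirming the linter catches it. Note that the audit pattern above is case-sensitive and relies on \b word boundaries, so it does not match that name as written: the f is lowercase, and the underscore before it is a word character. The linter needs a case-insensitive, underscore-aware variant for snake_case test names.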

Carve-outs:

  • Closes #N / Fixes #N in commit messages and runbook titles
  • External standards refs (RFC, OWASP, CVE)
  • Internal Python identifiers with no user-visible counterpart (e.g., a private module-local constant)

Surfaced by test-engineer-blind in PR #663 review as Q2.

Task B — Stage-readiness doctrine for runbooks

PR #663 added pact-plugin/tests/runbooks/662-dispatch-gate.md as a counter-test runbook for fresh-session validation. The runbook format works but lacks a documented standard for:

  • When to add a runbook (which PRs warrant one)
  • Required sections (counter-test by mutation, fail-closed wrapper sabotage, observable-behavior diff against main)
  • Stage-readiness criteria — when is a runbook "ready" to ship as part of a PR

Document the doctrine in pact-plugin/tests/runbooks/README.md. Reference the 662-dispatch-gate runbook as the canonical example.

Stage-readiness criteria draft:

  1. Counter-test cardinality is documented (e.g., "revert against main produces deny cardinality {3} for matcher mutation")
  2. Each section header is behavioral, not planning-artifact (e.g., "Matcher registration fidelity (counter-test by mutation)" not "F22 fidelity")
  3. Section content includes the exact bash/python commands to reproduce, not just prose description
  4. Runbook title may include (#N) provenance per the planning-artifact carve-out
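Criteria 2 and 3 lend themselves to a mechanical check. The sketch below is one possible shape, assuming runbooks are Markdown with "#" headings, a flat list of sections under a single title, and triple-backtick fences around commands. The function name check_runbook and the message wording are hypothetical. Criterion 1 still needs human review.

```python
import re

# Same identifiers as the audit pattern for test files.
PLANNING_ARTIFACT = re.compile(
    r"\bF[0-9]+\b|\bT[0-9]+\b|\bS-[0-9]+\b|Sketch [A-Z]|architect §"
)
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # Markdown code-fence marker


def check_runbook(text: str) -> list[str]:
    """Return problems found for criteria 2 and 3; an empty list means none."""
    problems = []
    current = None       # heading text of the section being scanned
    has_commands = True  # nothing to report before the first section
    in_fence = False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            has_commands = True
            continue
        if in_fence:
            continue  # '#' lines inside a fence are shell comments, not headings
        match = HEADING.match(line)
        if not match:
            continue
        if current is not None and not has_commands:
            problems.append(f"section '{current}' has no fenced commands")
        level, title = len(match.group(1)), match.group(2)
        if PLANNING_ARTIFACT.search(title):
            problems.append(f"heading '{title}' contains a planning-artifact identifier")
        # A level-1 heading is the runbook title; only sections below it need commands.
        current, has_commands = (title, False) if level > 1 else (None, True)
    if current is not None and not has_commands:
        problems.append(f"section '{current}' has no fenced commands")
    return problems
```

A title such as "Dispatch gate (#662)" passes unchanged, since a (#N) provenance suffix does not match the planning-artifact pattern.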

Surfaced by test-engineer-blind in PR #663 review as Q3.

Relationship to other follow-ups

Test plan

For Task A: add a parametrized test that exercises each planning-artifact pattern against a known-bad fixture and confirms the linter flags it. Counter-test by removing the linter rule and confirming the test fails.
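A sketch of how the Task A test could be laid out. It uses the standard library's unittest.subTest so it runs without extra dependencies; under pytest the same table would feed pytest.mark.parametrize. The flags_name function and its pattern are hypothetical stand-ins for the real linter rule, and the fixture names other than test_f1_deny_when_name_empty are invented for illustration.

```python
import re
import unittest

# Hypothetical stand-in for the linter rule; the real check would be imported.
PLANNING_TOKEN = re.compile(
    r"(?:^|_)(f[0-9]+|t[0-9]+|s_?[0-9]+|sketch_[a-z])(?=_|$)", re.IGNORECASE
)


def flags_name(name: str, rule: re.Pattern = PLANNING_TOKEN) -> bool:
    """Return True when the rule finds a planning-artifact token in the name."""
    return rule.search(name) is not None


# One known-bad fixture per planning-artifact pattern.
KNOWN_BAD = [
    "test_f1_deny_when_name_empty",   # F-row label
    "test_t3_allow_on_match",         # T-number
    "test_s_12_rejects_empty_input",  # S-number
    "test_sketch_b_round_trip",       # sketch ID
]
KNOWN_GOOD = ["test_deny_when_name_empty", "test_allow_when_matcher_registered"]


class PlanningArtifactLinterTest(unittest.TestCase):
    def test_flags_every_known_bad_name(self):
        for name in KNOWN_BAD:
            with self.subTest(name=name):
                self.assertTrue(flags_name(name))

    def test_accepts_behavioral_names(self):
        for name in KNOWN_GOOD:
            with self.subTest(name=name):
                self.assertFalse(flags_name(name))


if __name__ == "__main__":
    unittest.main()
```

For the counter-test, swapping the rule for one that never matches (for example re.compile(r"(?!)")) should make test_flags_every_known_bad_name fail, which shows the test depends on the rule.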

For Task B: this is documentation, not code. Stage-readiness is met when the doctrine README is written and the next runbook PR references it.
