[problem] @windyroad/risk-scorer@0.12.7 — appetite-from-policy parser regex too narrow + subagent prompt hardcodes ≤4 with false RISK-POLICY citation (follow-up to closed #149) #241

@tompahoward

Description

Follow-up to #149 (closed-with-fix 2026-05-24 — appetite-from-policy parser shipped in @windyroad/risk-scorer@0.11.0). Two distinct defects persist in 0.12.7 against a real adopter project's RISK-POLICY.md format. Both were observed empirically during a 2026-06-08 release session on a downstream private repo running @windyroad/risk-scorer@0.12.7; the release ultimately landed via BYPASS_RISK_GATE=1 inline as the documented P007 adopter-side workaround. This report files the two follow-up defects so they can be fixed structurally.

Defect 1 — Appetite-from-policy parser regex doesn't match this project's RISK-POLICY.md format

The parser shipped in @windyroad/risk-scorer@0.11.0 was intended to read RISK-POLICY.md and derive the project-specific appetite threshold (closing #149 / P007). On the adopter's RISK-POLICY.md format, the parser falls through to the default N=4 instead of detecting the policy's stated 9.

The adopter's RISK-POLICY.md uses this shape:

## 3. Risk Appetite

**Threshold**: pipeline actions are blocked when cumulative residual risk exceeds **9/25** (Medium band).

And section 6:

**Label bands**:

| Score | Label |
|-------|-------|
| 1-2 | Very Low |
| 3-4 | Low |
| 5-9 | Medium |
| 10-16 | High |
| 17-25 | Very High |

Pipeline appetite (§ 3) of 9 means commit/push/release actions whose cumulative residual risk lands at Medium-band-top (9) or lower pass the gate. Anything 10 or higher (High or Very High) blocks pending remediation.
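The gate semantics the quoted policy describes can be sketched as follows. This is a minimal illustration, not code from the package; the names `LABEL_BANDS`, `label_for`, and `passes_gate` are hypothetical, and the values are taken from the § 3 and § 6 excerpts above.

```python
# Label bands from section 6 of the quoted RISK-POLICY.md (inclusive ranges).
LABEL_BANDS = [
    (1, 2, "Very Low"),
    (3, 4, "Low"),
    (5, 9, "Medium"),
    (10, 16, "High"),
    (17, 25, "Very High"),
]

POLICY_APPETITE = 9  # section 3 of the quoted policy


def label_for(score: int) -> str:
    """Return the band label for a cumulative residual risk score (1-25)."""
    for low, high, label in LABEL_BANDS:
        if low <= score <= high:
            return label
    raise ValueError(f"score out of range: {score}")


def passes_gate(score: int, appetite: int = POLICY_APPETITE) -> bool:
    """An action passes when its score does not exceed the appetite."""
    return score <= appetite


for score in (4, 5, 9, 10):
    verdict = "pass" if passes_gate(score) else "block"
    print(score, label_for(score), verdict)
```

With the policy's appetite of 9, scores 5 through 9 pass; with the fallback of 4 they block, which is the behaviour reported below.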

The parser regex does NOT recognize either form (the § 3 threshold sentence or the § 6 appetite sentence). Observed empirically: every release attempt with a score of 5-9/25 was rejected by the hook with "release risk score N/25 exceeds the project appetite of 4/25 (RISK-POLICY.md)", despite RISK-POLICY.md explicitly stating the appetite is 9.
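For illustration, a more tolerant parse of this policy shape might look like the sketch below. The package's actual regex is not quoted in this report, so every pattern and name here (`HEADING_RE`, `SCORE_RE`, `PROSE_RE`, `parse_appetite`) is an assumption, written in Python for readability rather than in the hook library's own language.

```python
import re
from typing import Optional

# Hypothetical patterns; none of these are taken from the package.
# Matches "## Risk Appetite" as well as numbered headings like "## 3. Risk Appetite".
HEADING_RE = re.compile(
    r"^#{1,6}[ \t]+(?:\d+\.[ \t]+)?Risk Appetite[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)
NEXT_HEADING_RE = re.compile(r"^#{1,6}[ \t]+", re.MULTILINE)
# "N/25" anywhere in the section. The pattern is not anchored to neighbouring
# words, so emphasis markers such as ** around the number do not matter.
SCORE_RE = re.compile(r"(?<!\d)(\d{1,2})\s*/\s*25\b")
# Secondary signal: prose such as "appetite (§ 3) of 9".
PROSE_RE = re.compile(r"appetite\b[^.\n]*?\bof\s+(\d{1,2})\b", re.IGNORECASE)


def _in_range(text: str) -> Optional[int]:
    value = int(text)
    return value if 1 <= value <= 25 else None


def parse_appetite(policy_text: str) -> Optional[int]:
    """Return the stated appetite, or None when nothing recognisable is found."""
    heading = HEADING_RE.search(policy_text)
    if heading:
        rest = policy_text[heading.end():]
        nxt = NEXT_HEADING_RE.search(rest)
        section = rest[: nxt.start()] if nxt else rest
        match = SCORE_RE.search(section)
        if match and _in_range(match.group(1)) is not None:
            return _in_range(match.group(1))
    match = PROSE_RE.search(policy_text)
    if match:
        return _in_range(match.group(1))
    return None


SAMPLE = """## 3. Risk Appetite

**Threshold**: pipeline actions are blocked when cumulative residual risk exceeds **9/25** (Medium band).
"""
print(parse_appetite(SAMPLE))  # 9
```

Returning `None` instead of silently substituting a default leaves the caller free to report that the policy was not matched.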

Workaround used today: BYPASS_RISK_GATE=1 npm run release:watch. Fired three times in one session for in-appetite work.

Defect 2 — Subagent prompt hardcodes ≤ 4 / > 4 per RISK-POLICY.md with false policy citation

The wr-risk-scorer:pipeline subagent's prompt at agents/pipeline.md carries hardcoded threshold instructions at lines 162, 171, 208, and 266 (line numbers from @windyroad/risk-scorer@0.11.1; positions may have shifted since). These lines instruct the subagent to gate at ≤ 4 and cite the project's RISK-POLICY.md as the source. The citation is false on any adopter whose RISK-POLICY sets a different appetite (this adopter sets ≤9, and the project explicitly documents this in § 3 + § 6).

Net effect: the subagent emits verdicts that contradict the project's stated policy AND claims the project's own policy as the authority. The verdict is then further gated by the hook library (hooks/lib/risk-gate.sh) which has its own independent threshold logic — which Defect 1 above shows is ALSO miscalibrated.

This defect was not in #149's fix scope (@windyroad/risk-scorer@0.11.0's changeset is "appetite-from-policy parser" — gate-library scope only). The subagent-prompt surface was never updated.
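One way to avoid a hardcoded threshold in the prompt is to render it from the resolved appetite at invocation time. The sketch below is an assumption about shape only: the real wording of agents/pipeline.md is not quoted in this report, and `PROMPT_FRAGMENT` and `render_threshold_instruction` are hypothetical names.

```python
from string import Template

# Hypothetical prompt fragment with placeholders instead of a literal 4.
PROMPT_FRAGMENT = Template(
    "PASS when cumulative residual risk is <= $appetite/25; "
    "FAIL when it is > $appetite/25 (appetite source: $source)."
)


def render_threshold_instruction(appetite: int, source: str) -> str:
    """Fill the threshold and its provenance into the prompt fragment."""
    return PROMPT_FRAGMENT.substitute(appetite=appetite, source=source)


print(render_threshold_instruction(9, "parsed from RISK-POLICY.md"))
```

Carrying the source alongside the number means the subagent cites RISK-POLICY.md only when the value was actually read from it.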

Composed framing — both defects are downstream of the same trust-mismatch shape

Both defects share the same root cause: the package treats 4 as a hardcoded "safe default" without surfacing that default to the adopter at install or letting the adopter's RISK-POLICY.md reliably replace it. Adopters whose policy intentionally sets a different appetite (per ISO 31000 risk-management policy authoring practice) are silently overridden.

Symptoms

  • Adopter's release is rejected with release risk score N/25 exceeds the project appetite of 4/25 (RISK-POLICY.md) for any 5-9 score, despite RISK-POLICY.md explicitly setting appetite at 9.
  • The hook's deny message cites RISK-POLICY.md as authority but uses a contradictory threshold.
  • The subagent emits verdict text using the same false threshold.
  • Adopter resorts to BYPASS_RISK_GATE=1 for every in-appetite release. Empirically, 3 firings in a single 2-hour session on the adopter's repo on 2026-06-08.
  • The cumulative effect is to train adopters to bypass the gate mechanically, degrading the gate's signal-to-noise ratio for the cases where the gate IS load-bearing.

Workaround

BYPASS_RISK_GATE=1 inline on the affected npm run release:watch / git commit invocation. This undermines the gate's load-bearing intent: the bypass has the same shape adopters would use to skip a genuine policy violation, so routine use trains them to apply it mechanically.

Alternative workarounds discussed in the adopter's P007 ticket:

  1. Project-side RISK_APPETITE=9 env override — assumes the package looks for the env var; status unverified in 0.12.7.
  2. Reformat RISK-POLICY.md to match the parser's regex — requires the adopter to reformat policy authoring around a tool's regex, which inverts the responsibility direction.

Affected plugin or component

@windyroad/risk-scorer

Frequency

Every release attempt whose cumulative risk score is 5-9/25 (Medium band) on this adopter. Observed at least 3 times during a single 2026-06-08 session.

Local plugin version

@windyroad/risk-scorer@0.12.7

Upstream package version

not applicable

Claude Code CLI version

2.1.150

Node version

v22.17.1

Operating system

Darwin 25.3.0 x86_64

Evidence

  • Hook deny message verbatim from this session: Release blocked: release risk score 5/25 exceeds the project appetite of 4/25 (RISK-POLICY.md). To proceed: (1) split the release, (2) add risk-reducing measures, or (3) for a LIVE INCIDENT, delegate to wr-risk-scorer:pipeline ... with incident context for an incident bypass.
  • The adopter's RISK-POLICY.md sections 3 + 6 (quoted above) clearly state appetite is 9.
  • Subagent agent description at packages/risk-scorer/agents/pipeline.md lines 162, 171, 208, 266 hardcode ≤ 4 / > 4 per RISK-POLICY.md (line numbers as of 0.11.1; may have shifted in 0.12.7).
  • Adopter's P007 ticket re-investigation (2026-05-30) confirmed the parser fall-through and the subagent-prompt defect class.

Probable fix shapes (preference is option 1 for Defect 1; option 3 addresses Defect 2 and applies regardless)

  1. Broaden the appetite-from-policy parser's regex so it tolerates the heading shape ## N. Risk Appetite (numbered heading) and emphasis-wrapped numeric phrasing **N/25**. Also parse the prose that follows the Label-bands table in § 6 ("Pipeline appetite (§ 3) of 9 means...") as a secondary signal. The adopter's policy is shaped per ISO 31000 risk-management practice; the parser should meet that shape rather than constraining it.
  2. Add RISK_APPETITE env var override as a first-class adopter knob, documented in SKILL.md / agent description / hook deny message. Falls back to parsed value or default 4. Adopters with non-standard policy shapes get a documented escape hatch.
  3. Update the subagent prompt (agents/pipeline.md) to read the appetite from the parsed config rather than hardcoding ≤ 4 per RISK-POLICY.md. Defect 2 should be fixed alongside Defect 1; the false citation is a documentation-as-code defect independent of the parser fix.
  4. Improve the hook deny message to say ... exceeds the configured appetite of N/25 (parsed from RISK-POLICY.md OR default N=4) so the adopter can see whether the parser actually matched their policy. Currently the message claims RISK-POLICY.md as authority even when the parser fell through to the default — adopters cannot diagnose from the message alone.
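Options 2 and 4 can be combined: resolve the appetite in a fixed precedence order and carry its provenance into the deny message. The sketch below assumes a RISK_APPETITE env var (whose support in 0.12.7 is unverified, as noted above); `Appetite`, `resolve_appetite`, and `deny_message` are hypothetical names, not the package's API.

```python
import os
from typing import Mapping, NamedTuple, Optional

DEFAULT_APPETITE = 4  # the fallback value observed in this report


class Appetite(NamedTuple):
    value: int
    source: str  # where the value came from, for diagnostics


def resolve_appetite(
    parsed: Optional[int], env: Mapping[str, str] = os.environ
) -> Appetite:
    """Precedence: env override, then parsed policy value, then default."""
    raw = env.get("RISK_APPETITE", "").strip()
    if raw.isascii() and raw.isdigit() and 1 <= int(raw) <= 25:
        return Appetite(int(raw), "RISK_APPETITE env var")
    if parsed is not None:
        return Appetite(parsed, "parsed from RISK-POLICY.md")
    return Appetite(DEFAULT_APPETITE, "default; no appetite found in RISK-POLICY.md")


def deny_message(score: int, appetite: Appetite) -> str:
    """Build a deny message that names the source of the threshold."""
    return (
        f"release risk score {score}/25 exceeds the configured appetite of "
        f"{appetite.value}/25 ({appetite.source})"
    )


# Parser fell through: the message now says so instead of citing the policy.
print(deny_message(5, resolve_appetite(None, env={})))
```

An adopter reading that message can tell immediately whether the parser matched their policy or fell back to the default.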

Cross-reference

Reported from a downstream private repo. Tracked locally as P007 in the downstream project's docs/problems/known-error/ directory. Composes-with #149 (closed-with-fix; this report covers the residual defects in that fix scope plus the subagent-prompt defect outside that scope).
