Description
Follow-up to #149 (closed-with-fix 2026-05-24 — appetite-from-policy parser shipped in @windyroad/risk-scorer@0.11.0). Two distinct defects persist in 0.12.7 against a real adopter project's RISK-POLICY.md format. Both were observed empirically during a 2026-06-08 release session on a downstream private repo running @windyroad/risk-scorer@0.12.7; the release ultimately landed via BYPASS_RISK_GATE=1 inline as the documented P007 adopter-side workaround. This report files the two follow-up defects so they can be fixed structurally.
Defect 1 — Appetite-from-policy parser regex doesn't match this project's RISK-POLICY.md format
The parser shipped in @windyroad/risk-scorer@0.11.0 was intended to read RISK-POLICY.md and derive the project-specific appetite threshold (closing #149 / P007). On the adopter's RISK-POLICY.md format, the parser falls through to the default N=4 instead of detecting the policy's stated 9.
The adopter's RISK-POLICY.md uses this shape:
## 3. Risk Appetite
**Threshold**: pipeline actions are blocked when cumulative residual risk exceeds **9/25** (Medium band).
And section 6:
**Label bands**:
| Score | Label |
|-------|-------|
| 1-2 | Very Low |
| 3-4 | Low |
| 5-9 | Medium |
| 10-16 | High |
| 17-25 | Very High |
Pipeline appetite (§ 3) of 9 means commit/push/release actions whose cumulative residual risk lands at Medium-band-top (9) or lower pass the gate. Anything 10 or higher (High or Very High) blocks pending remediation.
The parser regex does NOT recognize either form. Observed empirically: every release attempt with a score of 5-9/25 was rejected by the hook with `release risk score N/25 exceeds the project appetite of 4/25 (RISK-POLICY.md)`, despite RISK-POLICY.md explicitly stating the appetite is 9.
Workaround used today: BYPASS_RISK_GATE=1 npm run release:watch. Fired three times in one session for in-appetite work.
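For illustration, a minimal TypeScript sketch of a matcher that would accept the adopter's shape. This is not the shipped parser (its regex is not reproduced in this report); `parseAppetite` and the three patterns are hypothetical names introduced here. It assumes the appetite is written as `N/25`, optionally wrapped in `**`, somewhere under a heading whose text is "Risk Appetite", optionally numbered.

```typescript
// Hypothetical sketch: NOT the parser shipped in @windyroad/risk-scorer.
// Assumption: the appetite is written as "N/25" under a "Risk Appetite" heading.

// Accepts "## Risk Appetite" as well as numbered forms like "## 3. Risk Appetite".
const APPETITE_HEADING = /^#{1,6}\s*(?:\d+\.\s*)?Risk Appetite\s*$/i;
const ANY_HEADING = /^#{1,6}\s/;
// Accepts "9/25" and "**9/25**" (the emphasis markers are optional).
const APPETITE_VALUE = /(?:\*\*)?(\d{1,2})\s*\/\s*25(?:\*\*)?/;

function parseAppetite(policy: string): number | null {
  const lines = policy.split(/\r?\n/);
  let inSection = false;
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    if (APPETITE_HEADING.test(line)) {
      inSection = true;
      continue;
    }
    if (inSection && ANY_HEADING.test(line)) {
      inSection = false; // the next heading closes the appetite section
    }
    if (!inSection) continue;
    const m = APPETITE_VALUE.exec(line);
    if (m) {
      const n = parseInt(m[1], 10);
      if (n >= 1 && n <= 25) return n;
    }
  }
  return null; // no match: the caller decides which fallback applies
}

// The section 3 text quoted in this report.
const adopterPolicy = [
  "## 3. Risk Appetite",
  "**Threshold**: pipeline actions are blocked when cumulative residual risk exceeds **9/25** (Medium band).",
].join("\n");

const parsed = parseAppetite(adopterPolicy);
```

Returning `null` rather than a number on a miss keeps "parser did not match" distinguishable from "policy says 4", which is the distinction the current deny message hides.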
Defect 2 — Subagent prompt hardcodes ≤ 4 / > 4 per RISK-POLICY.md with false policy citation
The wr-risk-scorer:pipeline subagent's prompt at agents/pipeline.md carries hardcoded threshold instructions at lines 162, 171, 208, and 266 (line numbers from @windyroad/risk-scorer@0.11.1; positions may have shifted since). These lines instruct the subagent to gate at ≤ 4 and cite the project's RISK-POLICY.md as the source. The citation is false on any adopter whose RISK-POLICY sets a different appetite (this adopter sets ≤9, and the project explicitly documents this in § 3 + § 6).
Net effect: the subagent emits verdicts that contradict the project's stated policy AND claims the project's own policy as the authority. The verdict is then further gated by the hook library (hooks/lib/risk-gate.sh) which has its own independent threshold logic — which Defect 1 above shows is ALSO miscalibrated.
This defect was not in #149's fix scope (@windyroad/risk-scorer@0.11.0's changeset is "appetite-from-policy parser" — gate-library scope only). The subagent-prompt surface was never updated.
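Because the cited line numbers come from 0.11.1 and may have shifted, a scan is a more durable way to locate the hardcoded thresholds than fixed positions. The sketch below is hypothetical: `findHardcodedThresholds` is a name introduced here, the sample text is invented, and the pattern is a guess based only on the `≤ 4 / > 4` wording cited above, since the prompt's exact phrasing is not quoted in this report.

```typescript
// Hypothetical helper: the exact wording in agents/pipeline.md is not quoted
// in this report, so the pattern below is an assumption.
// Flags a comparison against the literal 4 ("≤ 4", "<= 4", "> 4") that is not
// followed by another digit, so "≤ 40" is not reported.
const HARDCODED_THRESHOLD = /(?:≤|<=|>)\s*4(?!\d)/;

function findHardcodedThresholds(promptText: string): number[] {
  const hits: number[] = [];
  const lines = promptText.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    if (HARDCODED_THRESHOLD.test(lines[i])) {
      hits.push(i + 1); // report 1-based line numbers
    }
  }
  return hits;
}

// Invented sample text standing in for the real prompt file.
const samplePrompt = [
  "Score each action from 1 to 25.",
  "Pass when the score is ≤ 4 per RISK-POLICY.md.",
  "Block when the score is > 4 per RISK-POLICY.md.",
  "Pass when the score is within the configured appetite.",
].join("\n");

const hardcodedLines = findHardcodedThresholds(samplePrompt);
```

Run against the real prompt in CI, a check of this kind would fail whenever a literal threshold reappears after the prompt is switched to the resolved appetite.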
Composed framing — both defects are downstream of the same trust-mismatch shape
Both defects share the same root cause: the package treats `4` as a hardcoded "safe default" without surfacing that default to the adopter at install time or letting the adopter's RISK-POLICY.md reliably correct it. Adopters whose policy intentionally sets a different appetite (per ISO 31000 risk-management policy authoring practice) are silently overridden.
Symptoms
- Adopter's release is rejected with `release risk score N/25 exceeds the project appetite of 4/25 (RISK-POLICY.md)` for any 5-9 score, despite RISK-POLICY.md explicitly setting appetite at 9.
- The hook's deny message cites RISK-POLICY.md as authority but uses a contradictory threshold.
- The subagent emits verdict text using the same false threshold.
- Adopter resorts to `BYPASS_RISK_GATE=1` for every in-appetite release. Empirically, it was used 3 times in a single 2-hour session on 2026-06-08.
- The cumulative effect is to train adopters to mechanically bypass the gate, regressing the gate's signal-to-noise ratio for the cases where the gate IS load-bearing.
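To make the affected range concrete, the sketch below compares the gate decision under the default appetite of 4 and the policy's stated 9. The band boundaries come from the Label-bands table quoted above. `bandFor`, `passesGate`, and `disagreement` are hypothetical names, and "pass when score ≤ appetite" is read from the policy wording ("blocked when cumulative residual risk exceeds"), not from the package source.

```typescript
// Label bands as quoted from the adopter's RISK-POLICY.md, section 6.
const BANDS: { max: number; label: string }[] = [
  { max: 2, label: "Very Low" },
  { max: 4, label: "Low" },
  { max: 9, label: "Medium" },
  { max: 16, label: "High" },
  { max: 25, label: "Very High" },
];

function bandFor(score: number): string {
  if (score < 1) throw new Error("score out of range: " + score);
  for (let i = 0; i < BANDS.length; i++) {
    if (score <= BANDS[i].max) return BANDS[i].label;
  }
  throw new Error("score out of range: " + score);
}

// Passes when the score is at or below the appetite; the policy blocks only
// when the score exceeds it.
function passesGate(score: number, appetite: number): boolean {
  return score <= appetite;
}

// Scores on which two appetite values give different gate decisions.
function disagreement(appetiteA: number, appetiteB: number): number[] {
  const out: number[] = [];
  for (let s = 1; s <= 25; s++) {
    if (passesGate(s, appetiteA) !== passesGate(s, appetiteB)) out.push(s);
  }
  return out;
}

// Default 4 versus the policy's stated 9: scores 5 through 9 disagree.
const wronglyBlocked = disagreement(4, 9);
```

The disagreement set is exactly the Medium band, which matches the 5-9/25 rejections reported here.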
Workaround
Set `BYPASS_RISK_GATE=1` inline on the affected `npm run release:watch` / `git commit` invocation. This undermines the gate's load-bearing intent: the bypass is the same mechanism adopters would use to skip a genuine policy violation, so routine use trains them to apply it mechanically.
Alternative workarounds discussed in the adopter's P007 ticket:
- Project-side `RISK_APPETITE=9` env override: assumes the package looks for the env var; status unverified in 0.12.7.
- Reformat RISK-POLICY.md to match the parser's regex: requires the adopter to reformat policy authoring around a tool's regex, which inverts the responsibility direction.
Affected plugin or component
@windyroad/risk-scorer
Frequency
Every release attempt whose cumulative risk score is 5-9/25 (Medium band) on this adopter. Observed at least 3 times during a single 2026-06-08 session.
Local plugin version
@windyroad/risk-scorer@0.12.7
Upstream package version
not applicable
Claude Code CLI version
2.1.150
Node version
v22.17.1
Operating system
Darwin 25.3.0 x86_64
Evidence
- Hook deny message verbatim from this session:
Release blocked: release risk score 5/25 exceeds the project appetite of 4/25 (RISK-POLICY.md). To proceed: (1) split the release, (2) add risk-reducing measures, or (3) for a LIVE INCIDENT, delegate to wr-risk-scorer:pipeline ... with incident context for an incident bypass.
- The adopter's RISK-POLICY.md sections 3 + 6 (quoted above) clearly state appetite is 9.
- Subagent agent description at `packages/risk-scorer/agents/pipeline.md` lines 162, 171, 208, 266 hardcode `≤ 4 / > 4 per RISK-POLICY.md` (line numbers as of 0.11.1; may have shifted in 0.12.7).
- Adopter's P007 ticket re-investigation (2026-05-30) confirmed the parser fall-through and the subagent-prompt defect class.
Probable fix shapes (pick one; preference is option 1)
1. Broaden the appetite-from-policy parser's regex so it tolerates the heading shape `## N. Risk Appetite` (numbered heading) and emphasis-wrapped numeric phrasing `**N/25**`. Also parse from the Label-bands table in § 6 ("Pipeline appetite (§ 3) of 9 means...") as a secondary signal. The adopter's policy is shaped per ISO 31000 risk-management practice; the parser should meet that shape rather than constraining it.
2. Add `RISK_APPETITE` env var override as a first-class adopter knob, documented in SKILL.md / agent description / hook deny message. Falls back to parsed value or default `4`. Adopters with non-standard policy shapes get a documented escape hatch.
3. Update the subagent prompt (`agents/pipeline.md`) to read the appetite from the parsed config rather than hardcoding `≤ 4 per RISK-POLICY.md`. Defect 2 should be fixed alongside Defect 1; the false citation is a documentation-as-code defect independent of the parser fix.
4. Improve the hook deny message to say `... exceeds the configured appetite of N/25 (parsed from RISK-POLICY.md OR default N=4)` so the adopter can see whether the parser actually matched their policy. Currently the message claims RISK-POLICY.md as authority even when the parser fell through to the default, so adopters cannot diagnose the mismatch from the message alone.
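The env-override and deny-message proposals above could compose as in the following sketch. It is an assumption-laden illustration, not the hook library's logic: `resolveAppetite`, `denyMessage`, and the `AppetiteSource` type are hypothetical, and `RISK_APPETITE` is the proposed variable, which this report has not confirmed 0.12.7 reads. The precedence shown (env, then parsed policy, then default 4) follows the fallback wording in the override proposal. The environment is passed in as a plain object so the sketch stays self-contained.

```typescript
// Hypothetical sketch: names and precedence are assumptions, not package code.
type AppetiteSource = "env" | "policy" | "default";

interface ResolvedAppetite {
  value: number;
  source: AppetiteSource;
}

const DEFAULT_APPETITE = 4;

function resolveAppetite(
  env: { [key: string]: string | undefined },
  parsedFromPolicy: number | null
): ResolvedAppetite {
  const raw = env["RISK_APPETITE"];
  // Only a plain integer in 1..25 counts as a valid override.
  if (raw !== undefined && /^\d{1,2}$/.test(raw)) {
    const n = parseInt(raw, 10);
    if (n >= 1 && n <= 25) return { value: n, source: "env" };
  }
  if (parsedFromPolicy !== null) {
    return { value: parsedFromPolicy, source: "policy" };
  }
  return { value: DEFAULT_APPETITE, source: "default" };
}

// States where the threshold came from, so a parser fall-through is visible.
function denyMessage(score: number, appetite: ResolvedAppetite): string {
  const origin =
    appetite.source === "env"
      ? "set via RISK_APPETITE"
      : appetite.source === "policy"
      ? "parsed from RISK-POLICY.md"
      : "default; no appetite could be parsed from RISK-POLICY.md";
  return (
    "release risk score " + score + "/25 exceeds the configured appetite of " +
    appetite.value + "/25 (" + origin + ")"
  );
}
```

With the source carried alongside the value, the subagent prompt could be rendered from the same `ResolvedAppetite` instead of a literal, so the hook and the subagent cannot cite different thresholds.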
Cross-reference
Reported from a downstream private repo. Tracked locally as P007 in the downstream project's docs/problems/known-error/ directory. Composes-with #149 (closed-with-fix; this report covers the residual defects in that fix scope plus the subagent-prompt defect outside that scope).