feat: add JSON settings runtime policy flow#1
Conversation
WalkthroughAdds a runtime policy system and verifier registry, refactors CLI to forward Changes
Sequence Diagram(s)sequenceDiagram
participant CLI
participant RuntimePolicy as RuntimePolicy\n(runner)
participant Verifier as SuccessVerifiers\n(verifier)
participant ParserAgent as SubAgent\n(parser)
participant FS as Filesystem
Note over CLI,RuntimePolicy: CLI invokes spawn/monitor/verify-step/parse-output
CLI->>RuntimePolicy: load_runtime_policy(state_file?)
RuntimePolicy->>FS: read bundled/override policy & snapshot
FS-->>RuntimePolicy: policy JSON + snapshot hash
RuntimePolicy-->>CLI: step contract + parser/runtime config
alt verify-step / monitor completion
CLI->>Verifier: run_success_verifier(step, story_key, contract, state_file)
Verifier->>FS: read artifacts / story files / sprint-status
FS-->>Verifier: artifact list / frontmatter / sprint data
Verifier-->>CLI: {verified: true|false, reason?, metadata}
else parse-output
CLI->>ParserAgent: run sub-agent with provider/model/timeout and prompt
ParserAgent->>FS: read prompt templates / parse schema if needed
ParserAgent-->>CLI: agent stdout (JSON)
CLI->>RuntimePolicy: validate returned JSON against parse schema
RuntimePolicy-->>CLI: validation result (ok / error reason)
end
CLI->>FS: write/read policy snapshot or emit structured JSON result
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🤖 Augment PR SummarySummary: This PR introduces a snapshot-backed JSON “runtime policy” layer for story-automator so prompts, parse contracts, budgets, and success verification become data-driven and deterministic on resume. Changes:
Technical Notes: Resume paths for new-format state docs are enforced to use the pinned snapshot (hash-verified), while legacy state docs continue to use bundled defaults. 🤖 Was this summary useful? React with 👍 or 👎 |
There was a problem hiding this comment.
Actionable comments posted: 18
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
payload/.claude/skills/bmad-story-automator/steps-c/step-03-execute.md (1)
155-157:⚠️ Potential issue | 🟠 MajorPass
--state-filein devbuild-cmdto keep policy snapshot pinned.
Line 155-157 still builds the dev command without--state-file, while create now pins it. This can cause create/dev to resolve different runtime policy sources in the same orchestration run.Proposed fix
session=$("$scripts" tmux-wrapper spawn dev {epic} {story_id} \ --agent "$current_agent" \ - --command "$("$scripts" tmux-wrapper build-cmd dev {story_id} --agent "$current_agent")") + --command "$("$scripts" tmux-wrapper build-cmd dev {story_id} --agent "$current_agent" --state-file "$state_file")")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@payload/.claude/skills/bmad-story-automator/steps-c/step-03-execute.md` around lines 155 - 157, The dev tmux-wrapper invocation (session=... tmux-wrapper spawn dev) builds the dev command via tmux-wrapper build-cmd but omits the --state-file flag, so dev can pick a different policy snapshot than create; update the build-cmd call inside the session assignment to pass the same --state-file value used by create (e.g. --state-file "$state_file" or {state_file}) so tmux-wrapper build-cmd dev {story_id} --agent "$current_agent" includes --state-file and both spawn dev and build-cmd use the pinned policy snapshot.source/src/story_automator/commands/tmux.py (1)
206-219:⚠️ Potential issue | 🔴 CriticalUse
shlex.quote()to properly escape the rendered prompt before passing it to the shell.The current escaping
prompt.replace("\\", "\\\\").replace('"', '\\"')only neutralizes\and"characters. However, within double-quoted shell strings,$VAR,$(...), and backticks will still be expanded. Sincepromptis constructed from policy templates (repo-controlled) andextra_instruction(CLI-provided), these shell metacharacters create a command injection vulnerability.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/src/story_automator/commands/tmux.py` around lines 206 - 219, The prompt built by _render_step_prompt is being escaped manually into escaped and passed into shell strings which leaves shell metacharacters injectable; replace the manual replace-based escaping with shlex.quote(prompt) (import shlex at top) and use that quoted value in both branches (the codex exec command and the fallback print using cli) so the prompt is safely shell-escaped; update references to escaped in the codex branch (the CODEX_EXEC string) and the else branch (the cli invocation) to use shlex.quote(prompt) instead of the current escaped variable.
🧹 Nitpick comments (1)
source/tests/test_state_policy_metadata.py (1)
254-262: Assert the state-file value is excluded too.This only checks that the literal
--state-filetoken is absent. A parser regression could still leakstr(state_file)into the rendered prompt and this test would stay green.✅ Small test hardening
rendered = stdout.getvalue() self.assertNotIn("--state-file", rendered) + self.assertNotIn(str(state_file), rendered)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/tests/test_state_policy_metadata.py` around lines 254 - 262, The test test_build_cmd_does_not_treat_state_file_flag_as_prompt_text currently only asserts the literal "--state-file" is absent from the rendered output; extend the assertion to also ensure the actual state file path value (the str(state_file) variable) is not present. Modify the test to add an assertion like self.assertNotIn(str(state_file), rendered) after captured output is read so the test rejects any regression that leaks the state file value into the rendered prompt.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/development.md`:
- Around line 16-18: The verify sequence is inconsistent: the bullet list under
`npm run verify` currently lists `npm run test:python -> npm run pack:dry-run ->
npm run test:smoke` but the Mermaid diagram shows `npm run test:python -> npm
run test:smoke -> npm run pack:dry-run`; pick the correct canonical order and
make both places match by editing either the bullet list or the Mermaid flow to
use the same sequence of scripts (`npm run test:python`, `npm run pack:dry-run`,
`npm run test:smoke` or the alternate ordering), ensuring the script names
(`test:python`, `pack:dry-run`, `test:smoke`, and the `npm run verify`
description) are updated consistently.
In `@docs/plans/json-settings/02-policy-model.md`:
- Around line 132-160: The example uses unimplemented keys (workflow.optional
and workflow.triggers) and models "retro" as a trigger
(retrospective_on_epic_complete), which will be rejected by the current
validator; update the example to only use the implemented policy surface: keep
workflow.sequence (include "retro" in the sequence if you want that step),
workflow.repeat (review block with maxCycles, successVerifier, onIncomplete,
onExhausted) and workflow.crash (maxRetries, onExhausted), and remove
workflow.optional, the triggers array, and the retrospective_on_epic_complete
trigger entry as well as skipWhenOverride/skipAutomate usage so the example
matches the bundled policy and covered tests.
In `@docs/plans/json-settings/README.md`:
- Around line 61-63: The README currently lists only the snapshot pointer/hash
but omits other runtime-required metadata; update the "state-policy metadata
contract" to explicitly enumerate all four fields used by the runtime loader:
the snapshot pointer, snapshot hash, policyVersion, and legacyPolicy (or
alternatively add a cross-reference link to the section that fully defines
them). For clarity, show expected types/semantics for each field (e.g., pointer:
string, hash: string, policyVersion: semver/string, legacyPolicy: boolean) and
indicate how the loader uses policyVersion and legacyPolicy to distinguish
legacy state from snapshot-backed state.
In `@payload/.claude/skills/bmad-story-automator-review/workflow.yaml`:
- Line 19: The top-level "contract" field in workflow.yaml is unused by the
runtime; remove or deprecate the top-level contract entry ("contract:
\"./contract.json\"") and ensure all policy contracts are referenced via
step-level settings ("contractFile" / "contractPath") in the step configs or
update the loader to read the top-level field—pick one approach: either delete
the top-level "contract" key from workflow.yaml to avoid confusion, or modify
the policy resolution code to merge/consider the top-level "contract" alongside
step-level "contractFile"/"contractPath" so runtime enforcement is consistent.
In `@payload/.claude/skills/bmad-story-automator/data/orchestration-policy.json`:
- Around line 110-115: The policy currently relies on the text field
defaultExtraInstruction to trigger autonomous auto-fixes which is brittle;
instead add a stable boolean flag (e.g., "autoFix": true) at the same level as
"prompt" or inside the "prompt" object and remove reliance on
defaultExtraInstruction for behavior control; update the
orchestration-policy.json entry containing "prompt", "templateFile",
"interactionMode", "acceptExtraInstruction", and "defaultExtraInstruction" so
that the runtime checks the new "autoFix" flag (and ignore/override
defaultExtraInstruction) to deterministically enable auto-fix behavior.
- Around line 43-49: The snapshot currently records only
templateFile/schemaFile/contract paths (e.g., templateFile, schemaFile,
contractPath) while runtime consumers call read_text() to load files, so mid-run
edits bypass snapshot integrity; change the snapshot logic that builds
orchestration-policy.json to store the actual file contents (e.g.,
templateContent, schemaContent, contractContent) or at minimum the content hash
alongside paths, and update consumers that now call read_text() to instead use
the embedded content from the snapshot (or verify read_text() results against
the stored content hash and fail if mismatched); specifically update the code
that resolves "prompt.templateFile"/"parse.schemaFile" to capture file contents
or hashes at snapshot time and use those fields during resume.
In `@payload/.claude/skills/bmad-story-automator/data/parse/review.json`:
- Around line 2-14: The parser currently only checks top-level "requiredKeys"
presence but must enforce the nested "schema" rules; update the
parsing/validation logic so that "status" only accepts the enum values
SUCCESS|FAILURE|AMBIGUOUS, "issues_found" is an object with integer fields
"critical","high","medium","low", "all_fixed" is a boolean (true|false), and
"next_action" only allows proceed|retry|escalate; locate the code that reads the
"requiredKeys"/"schema" definition (the parser/validator handling review.json)
and implement recursive shape validation and enum/type checks for those specific
keys ("status","issues_found","all_fixed","next_action") and return/raise a
clear validation error when the shape or types do not match.
In `@source/src/story_automator/commands/orchestrator_parse.py`:
- Around line 69-77: The _load_parse_contract function currently only checks
that payload.get("requiredKeys") is a list; update _load_parse_contract to
validate that each element of requiredKeys is a non-empty string (or other
expected key type) and raise ValueError("invalid parse schema") if any item is
not; locate the checks in _load_parse_contract (payload, requiredKeys, schema)
and add an iteration over requiredKeys to assert isinstance(k, str) and
k.strip() != "" (or the project-specific key format), so malformed entries like
true are rejected before runtime and avoid downstream _has_required_keys
failures.
- Around line 14-21: The loop that parses args currently silently ignores a bare
"--state-file" token and leaves state_file as empty, causing
load_runtime_policy(...) to read the live policy; change the parsing in the
while loop that inspects args and state_file so that when args[idx] ==
"--state-file" but there is no following value (idx + 1 >= len(args)) you fail
closed: raise or return an error (e.g., raise ValueError or exit with non-zero
and clear error message) indicating a malformed "--state-file" flag instead of
continuing; ensure the error path is tied to the same parsing block that sets
state_file so load_runtime_policy(...) is only called when a valid non-empty
state_file was provided.
In `@source/src/story_automator/commands/orchestrator.py`:
- Around line 424-444: The loop over tail should treat a flag present without a
following value as a contract error instead of silently ignoring it and the
verifier run must not allow ValueError to escape; update the for loop that
inspects tail (the block that sets state_file/output_file) so that if arg ==
"--state-file" or "--output-file" but idx+1 is out of range or
tail[idx+1].startswith("--"), raise a PolicyError (or otherwise set payload to
the same verifier_contract_invalid error path) instead of ignoring it, and
expand the exception handler to catch ValueError in addition to
FileNotFoundError and PolicyError so that resolve_success_contract and
run_success_verifier errors (including malformed contract ValueError) map to the
verifier_contract_invalid payload.
In `@source/src/story_automator/commands/state.py`:
- Line 46: The snapshot_effective_policy(get_project_root()) call in the
state-doc creation path can raise and currently will crash the command; wrap
that call in a try/except, catch exceptions from snapshot_effective_policy, log
the error, and ensure the command returns a structured JSON result instead of
propagating the exception (for example include keys like "policy_snapshot": null
and "policy_error": "<error message>" in the state-doc output). Update the code
around snapshot_effective_policy/get_project_root so failures are handled
gracefully and the rest of state-doc creation continues or returns the handled
JSON error response.
In `@source/src/story_automator/commands/tmux.py`:
- Around line 787-788: The current branch returning None when verifier_name is
one of {"create_story_artifact", "review_completion", "epic_complete"} and
story_key.strip() is empty hides a verifier misconfiguration; change this to
return an explicit failure (e.g., raise a ValueError or return a failure
sentinel) instead of None so cmd_monitor_session can detect and abort rather
than falling back to normal_completion; update the branch around the
verifier_name check that references story_key.strip() to raise (or return) a
clear error containing the verifier_name and that story_key is required.
In `@source/src/story_automator/commands/validate_story_creation.py`:
- Around line 180-184: The try/except around
create_check_payload()/build_check_response() currently only catches PolicyError
and ValueError so a FileNotFoundError from resolve_success_contract (when
--state-file points to a missing snapshot or resolved contract is absent)
bubbles up; update the exception handling to also catch FileNotFoundError and
call print_check_error(story_id, reason=str(exc), before_count=before_count,
after_count=after_count) so missing snapshot/contract errors are normalized into
the JSON retry/error payload. Ensure the change references create_check_payload,
build_check_response, resolve_success_contract, and print_check_error so the
missing-file path is handled the same as PolicyError/ValueError.
In `@source/src/story_automator/core/review_verify.py`:
- Around line 16-20: The wrapper currently uses "success_contract or
resolve_success_contract(...)" which prevents an injected empty contract and
only catches FileNotFoundError/ValueError, so PolicyError leaks; change to
select the contract with an explicit None check (if success_contract is None:
contract = resolve_success_contract(... ) else: contract = success_contract) and
extend the except clause to include PolicyError (e.g., except
(FileNotFoundError, ValueError, PolicyError) as exc) so malformed/missing
contracts are normalized to the existing {"verified": False, "reason":
"review_contract_invalid", "input": story_key, "error": str(exc)} return;
reference resolve_success_contract and review_completion to locate the code to
update.
In `@source/src/story_automator/core/runtime_policy.py`:
- Around line 70-74: The snapshot currently only hashes the policy JSON but
keeps templateFile, schemaFile, and contractFile as paths so resumed runs
re-resolve those files against the working tree; update
snapshot_effective_policy()/_stable_policy_json to read and embed the actual
contents (or content hashes) of any referenced templateFile, schemaFile, and
contractFile into the stable JSON before computing snapshot_hash and writing
snapshot_path, and ensure load_policy_snapshot() uses those embedded contents
(or verifies embedded content hashes) instead of re-reading from disk; reference
the symbols _stable_policy_json, snapshot_effective_policy,
load_policy_snapshot, templateFile, schemaFile, contractFile, snapshot_dir,
snapshot_path, and policySnapshotHash when making the change.
- Around line 68-74: The code builds snapshot_dir from
_snapshot_relative_dir(policy) and then calls ensure_dir/write_atomic, allowing
an absolute or path-traversal override via snapshot.relativeDir; restrict this
by validating and normalizing the returned path against the workspace root:
obtain root (project_root), compute candidate = (root /
_snapshot_relative_dir(policy)).resolve(), ensure
candidate.is_relative_to(root.resolve()) (or compute candidate.relative_to(root)
and raise/log and abort if it fails), and use that constrained candidate for
ensure_dir and write_atomic; apply the same validation/sanitization to the other
places that use _snapshot_relative_dir or snapshot.relativeDir (the occurrences
around the other reported blocks).
- Around line 354-386: Both _resolve_candidate_file and _resolve_data_path
currently allow absolute paths and parent traversal to arbitrary files; update
them to reject any resolved path that escapes its allowed root(s). For
_resolve_candidate_file (references: function name _resolve_candidate_file,
variables skill_dir, candidates, project_root) resolve each candidate path and
ensure candidate.resolve().is_relative_to(skill_dir.resolve()) (or use try:
candidate.resolve().relative_to(skill_dir.resolve())) before checking is_file(),
and if it escapes raise PolicyError. For _resolve_data_path (references:
function name _resolve_data_path, parameters project_root and bundle_root)
disallow absolute paths that are outside both bundle_root.resolve() and
project_root.resolve(); for relative candidates resolve against each base and
verify resolved_candidate.relative_to(base.resolve()) before accepting it,
otherwise raise PolicyError indicating the path escapes allowed roots. Ensure
checks occur prior to returning file paths so no external filesystem locations
can be referenced.
In `@source/tests/test_runtime_policy.py`:
- Around line 120-132: The tests currently expect load_runtime_policy to
silently fall back to the live policy when a .claude/.story-automator-active
marker is dangling or malformed; instead update the two tests
(test_missing_marker_state_falls_back_to_effective_policy and
test_malformed_marker_falls_back_to_effective_policy) to assert that
load_runtime_policy(str(self.project_root)) raises PolicyError for those marker
conditions (dangling stateFile and invalid JSON respectively), so the codebase
fails closed on a broken active-run marker rather than switching back to the
live policy.
---
Outside diff comments:
In `@payload/.claude/skills/bmad-story-automator/steps-c/step-03-execute.md`:
- Around line 155-157: The dev tmux-wrapper invocation (session=... tmux-wrapper
spawn dev) builds the dev command via tmux-wrapper build-cmd but omits the
--state-file flag, so dev can pick a different policy snapshot than create;
update the build-cmd call inside the session assignment to pass the same
--state-file value used by create (e.g. --state-file "$state_file" or
{state_file}) so tmux-wrapper build-cmd dev {story_id} --agent "$current_agent"
includes --state-file and both spawn dev and build-cmd use the pinned policy
snapshot.
In `@source/src/story_automator/commands/tmux.py`:
- Around line 206-219: The prompt built by _render_step_prompt is being escaped
manually into escaped and passed into shell strings which leaves shell
metacharacters injectable; replace the manual replace-based escaping with
shlex.quote(prompt) (import shlex at top) and use that quoted value in both
branches (the codex exec command and the fallback print using cli) so the prompt
is safely shell-escaped; update references to escaped in the codex branch (the
CODEX_EXEC string) and the else branch (the cli invocation) to use
shlex.quote(prompt) instead of the current escaped variable.
---
Nitpick comments:
In `@source/tests/test_state_policy_metadata.py`:
- Around line 254-262: The test
test_build_cmd_does_not_treat_state_file_flag_as_prompt_text currently only
asserts the literal "--state-file" is absent from the rendered output; extend
the assertion to also ensure the actual state file path value (the
str(state_file) variable) is not present. Modify the test to add an assertion
like self.assertNotIn(str(state_file), rendered) after captured output is read
so the test rejects any regression that leaks the state file value into the
rendered prompt.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c0373739-64c5-4287-bdbb-6d49573cb2e8
📒 Files selected for processing (45)
docs/changelog/260413.mddocs/cli-reference.mddocs/development.mddocs/plans/json-settings/01-why-json-settings.mddocs/plans/json-settings/02-policy-model.mddocs/plans/json-settings/03-code-and-payload-changes.mddocs/plans/json-settings/04-migration-testing-and-risks.mddocs/plans/json-settings/README.mddocs/plans/json-settings/TODO.mddocs/state-and-resume.mdpackage.jsonpayload/.claude/skills/bmad-story-automator-review/contract.jsonpayload/.claude/skills/bmad-story-automator-review/workflow.yamlpayload/.claude/skills/bmad-story-automator/data/code-review-loop.mdpayload/.claude/skills/bmad-story-automator/data/monitoring-pattern.mdpayload/.claude/skills/bmad-story-automator/data/orchestration-policy.jsonpayload/.claude/skills/bmad-story-automator/data/parse/auto.jsonpayload/.claude/skills/bmad-story-automator/data/parse/create.jsonpayload/.claude/skills/bmad-story-automator/data/parse/dev.jsonpayload/.claude/skills/bmad-story-automator/data/parse/retro.jsonpayload/.claude/skills/bmad-story-automator/data/parse/review.jsonpayload/.claude/skills/bmad-story-automator/data/prompts/auto.mdpayload/.claude/skills/bmad-story-automator/data/prompts/create.mdpayload/.claude/skills/bmad-story-automator/data/prompts/dev.mdpayload/.claude/skills/bmad-story-automator/data/prompts/retro.mdpayload/.claude/skills/bmad-story-automator/data/prompts/review.mdpayload/.claude/skills/bmad-story-automator/data/retry-fallback-implementation.mdpayload/.claude/skills/bmad-story-automator/data/scripts-reference.mdpayload/.claude/skills/bmad-story-automator/steps-c/step-03-execute.mdpayload/.claude/skills/bmad-story-automator/templates/state-document.mdpayload/.claude/skills/bmad-story-automator/workflow.mdscripts/smoke-test.shsource/src/story_automator/commands/orchestrator.pysource/src/story_automator/commands/orchestrator_parse.pysource/src/story_automator/commands/state.pysource/src/story_automator/commands/tmux.pysource/src/story_automator/commands/validate_story_creation.pysource/src/story_automator/core/review_verify.pysource/src/story_automator/core/runtime_policy.pysource/src/story_automator/core/success_verifiers.pysource/src/story_automator/core/workflow_paths.pysource/tests/test_orchestrator_parse.pysource/tests/test_runtime_policy.pysource/tests/test_state_policy_metadata.pysource/tests/test_success_verifiers.py
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
payload/.claude/skills/bmad-story-automator/steps-c/step-03a-execute-review.md (1)
77-84:⚠️ Potential issue | 🟠 MajorThe compact review summary drops issue counts from the shipped review contract.
parse-output "$review_log" reviewis validated against the review parse contract, which guaranteesissues_foundandsummary, not an.issuesarray. In this form,issues_countwill stay0andtop_issueswill stay[]unless the model happens to emit undocumented extras, so the parent flow loses the actual review signal.Suggested fix
-review_summary=$("$scripts" orchestrator-helper parse-output "$review_log" review --state-file "$state_file" | jq -c ' - { - next_action: (.next_action // "retry"), - confidence: (.confidence // 0), - error_class: (.error_class // "unknown"), - issues_count: ((.issues // []) | length), - top_issues: ((.issues // [])[:3]) - } -') +review_summary=$("$scripts" orchestrator-helper parse-output "$review_log" review --state-file "$state_file" | jq -c ' + { + next_action: (.next_action // "retry"), + confidence: (.confidence // 0), + error_class: (.error_class // "unknown"), + issues_count: ((.issues_found // {}) | [.critical, .high, .medium, .low] | map(. // 0) | add), + top_issues: ([.summary] | map(select(type == "string" and length > 0))[:3]) + } +')🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@payload/.claude/skills/bmad-story-automator/steps-c/step-03a-execute-review.md` around lines 77 - 84, The compact review summary is reading a non-existent .issues array so issues_count and top_issues are always empty; update the parsing of the output produced by parse-output "$review_log" review to use the contract fields (e.g. use .issues_found for the count and .summary for the returned summary/top issues) instead of .issues; modify the block that builds review_summary (the jq transform invoked after parse-output) to compute issues_count from (.issues_found // 0) and set top_issues from (.summary // [] or appropriate shape) while preserving the existing next_action/confidence/error_class fallbacks.
🧹 Nitpick comments (2)
source/src/story_automator/core/runtime_policy.py (1)
501-516: Consider clarifying the fallback logic for missing policy fields.The
_state_policy_modefunction returns("", "", True)(legacy mode) in two cases: whenlegacyPolicy == "true"and when all policy fields are absent. This fallback behavior is correct but the duplicate return on lines 515-516 could be consolidated.💡 Simplify fallback return
The final two returns (lines 514-516) both return
("", "", True). Consider consolidating:if legacy_policy == "true": return "", "", True - return "", "", True + # No policy metadata present - treat as legacy + return "", "", TrueA comment would clarify that reaching line 516 means no policy metadata was found, defaulting to legacy mode.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/src/story_automator/core/runtime_policy.py` around lines 501 - 516, In _state_policy_mode, consolidate the duplicate final returns that both return ("", "", True) by removing the redundant branch and leaving a single default return for legacy mode; specifically, handle the explicit legacy_policy == "true" case with one return and then have a final return ("", "", True) as the default when no metadata is present, and add a short comment explaining that the final return is the fallback to legacy mode when no policy metadata is supplied.source/tests/test_orchestrator_parse.py (1)
131-146: Verify mock assertion for runtime config wiring.The test checks that
run_cmdis called with expected provider/model arguments and timeout. However, the assertion usescall_args.args[:4]which assumes positional argument ordering.💡 Consider using keyword argument assertions for clarity
If
run_cmdsignature changes, this test could become fragile. Consider verifying specific arguments by name if the mock supports it, or add a comment documenting the expected argument order.# Current assertion relies on positional order: (provider, "-p", "--model", model) self.assertEqual(mock_run.call_args.args[:4], ("claude", "-p", "--model", "sonnet"))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/tests/test_orchestrator_parse.py` around lines 131 - 146, The assertion in test_parser_runtime_uses_policy_settings is fragile because it inspects run_cmd positional args via mock_run.call_args.args[:4]; update the test to assert the specific parameters more robustly by either checking mock_run.call_args.kwargs for named parameters (e.g., "timeout") and using helper checks for the provider/model flags, or use mock_run.assert_any_call / mock_run.assert_called_with with the exact expected signature to explicitly verify ("claude", "-p", "--model", "sonnet") and timeout=33; update references to run_cmd, parse_output_action, and mock_run accordingly so the test no longer depends on positional ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@source/src/story_automator/commands/tmux.py`:
- Around line 793-801: The code in resolve_success_contract handling returns
None when the contract omits "verifier", causing cmd_monitor_session() to treat
the run as "normal_completion"; change that behavior to fail closed like
orchestrator.py by returning the same error tuple used for invalid contracts.
Specifically, inside the block after verifier_name =
str(contract.get("verifier") or "").strip(), replace the "if not verifier_name:
return None" with a return ({"verified": False, "reason":
"verifier_contract_invalid"}, "") so resolve_success_contract /
cmd_monitor_session and the verifier_name logic (and the subsequent special-case
check for create_story_artifact/review_completion/epic_complete) handle missing
verifiers as an invalid contract rather than a successful normal completion.
- Around line 202-216: The code currently catches
load_runtime_policy/step_contract errors but calls _render_step_prompt
afterwards which can raise FileNotFoundError (missing template) and crash;
update the flow to catch template-loading/rendering errors too by either moving
the call to _render_step_prompt (and subsequent shlex.quote usage) into the same
try block that handles load_runtime_policy and step_contract, or wrap the
_render_step_prompt call in its own try/except that catches FileNotFoundError
(and PolicyError if applicable), prints the exception to stderr (same style as
existing handler) and returns 1; reference policy, contract,
_render_step_prompt, prompt, and shlex.quote so the fix keeps consistent error
handling for missing prompt templates.
---
Outside diff comments:
In
`@payload/.claude/skills/bmad-story-automator/steps-c/step-03a-execute-review.md`:
- Around line 77-84: The compact review summary is reading a non-existent
.issues array so issues_count and top_issues are always empty; update the
parsing of the output produced by parse-output "$review_log" review to use the
contract fields (e.g. use .issues_found for the count and .summary for the
returned summary/top issues) instead of .issues; modify the block that builds
review_summary (the jq transform invoked after parse-output) to compute
issues_count from (.issues_found // 0) and set top_issues from (.summary // []
or appropriate shape) while preserving the existing
next_action/confidence/error_class fallbacks.
---
Nitpick comments:
In `@source/src/story_automator/core/runtime_policy.py`:
- Around line 501-516: In _state_policy_mode, consolidate the duplicate final
returns that both return ("", "", True) by removing the redundant branch and
leaving a single default return for legacy mode; specifically, handle the
explicit legacy_policy == "true" case with one return and then have a final
return ("", "", True) as the default when no metadata is present, and add a
short comment explaining that the final return is the fallback to legacy mode
when no policy metadata is supplied.
In `@source/tests/test_orchestrator_parse.py`:
- Around line 131-146: The assertion in test_parser_runtime_uses_policy_settings
is fragile because it inspects run_cmd positional args via
mock_run.call_args.args[:4]; update the test to assert the specific parameters
more robustly by either checking mock_run.call_args.kwargs for named parameters
(e.g., "timeout") and using helper checks for the provider/model flags, or use
mock_run.assert_any_call / mock_run.assert_called_with with the exact expected
signature to explicitly verify ("claude", "-p", "--model", "sonnet") and
timeout=33; update references to run_cmd, parse_output_action, and mock_run
accordingly so the test no longer depends on positional ordering.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c8edcd3e-5163-4da9-91f9-7aab50250164
📒 Files selected for processing (21)
docs/changelog/260414.mddocs/changelog/260415.mddocs/development.mddocs/plans/json-settings/02-policy-model.mddocs/plans/json-settings/03-code-and-payload-changes.mddocs/plans/json-settings/README.mdpayload/.claude/skills/bmad-story-automator/steps-c/step-03-execute.mdpayload/.claude/skills/bmad-story-automator/steps-c/step-03a-execute-review.mdscripts/smoke-test.shsource/src/story_automator/commands/orchestrator.pysource/src/story_automator/commands/orchestrator_parse.pysource/src/story_automator/commands/state.pysource/src/story_automator/commands/tmux.pysource/src/story_automator/commands/validate_story_creation.pysource/src/story_automator/core/review_verify.pysource/src/story_automator/core/runtime_policy.pysource/src/story_automator/core/success_verifiers.pysource/tests/test_orchestrator_parse.pysource/tests/test_runtime_policy.pysource/tests/test_state_policy_metadata.pysource/tests/test_success_verifiers.py
✅ Files skipped from review due to trivial changes (4)
- docs/changelog/260415.md
- payload/.claude/skills/bmad-story-automator/steps-c/step-03-execute.md
- docs/plans/json-settings/README.md
- docs/plans/json-settings/02-policy-model.md
🚧 Files skipped from review as they are similar to previous changes (4)
- docs/development.md
- docs/plans/json-settings/03-code-and-payload-changes.md
- source/src/story_automator/commands/validate_story_creation.py
- source/src/story_automator/commands/orchestrator.py
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@source/src/story_automator/commands/tmux.py`:
- Around line 736-755: The code currently infers output_verified from the
presence of the output file when calling _emit_monitor, causing contradictory
JSON; instead thread the actual verifier result returned by
_verify_monitor_completion into _emit_monitor. In the branch where verification
is not None (the variables verification -> verified, verifier_name), pass
output_verified as bool(verified.get("verified")) (or the appropriate truthy
check of verified["verified"]) to _emit_monitor for both the "completed" and
"incomplete" calls rather than using bool(output) so the emitted JSON reflects
the verifier outcome; make the same change in the similar block around the lines
noted (the other _emit_monitor calls at 777-779).
- Around line 802-811: The try/except around run_success_verifier currently only
catches PolicyError, so file-missing errors (e.g., FileNotFoundError from a
verifier contract referencing a deleted JSON) escape cmd_monitor_session; update
the handler around run_success_verifier to also catch FileNotFoundError (and
optionally OSError) and return ({"verified": False, "reason":
"verifier_contract_invalid"}, verifier_name) just like the existing PolicyError
branch; touch the block that calls run_success_verifier (and related
resolve_success_contract flow) so missing verifier-side files are normalized to
the same verifier_contract_invalid result.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0dc5f9d4-2169-44a1-910f-1d56791d95a1
📒 Files selected for processing (4)
docs/changelog/260415.mdsource/src/story_automator/commands/tmux.pysource/tests/test_state_policy_metadata.pysource/tests/test_success_verifiers.py
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/changelog/260415.md
There was a problem hiding this comment.
🧹 Nitpick comments (2)
source/tests/test_success_verifiers.py (1)
700-717: Consider importingosat module level.The
patch_envclass importsoslocally in each method. Sinceosis a standard library module with no side effects on import, moving it to the top-level imports would be more idiomatic.🔧 Suggested change
Add to imports at top of file:
import io import json +import os import shutilThen simplify the class:
def __enter__(self) -> None: - import os - self.previous = os.environ.get("PROJECT_ROOT") os.environ["PROJECT_ROOT"] = self.project_root def __exit__(self, exc_type, exc, tb) -> None: - import os - if self.previous is None:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/tests/test_success_verifiers.py` around lines 700 - 717, Import the standard library module os at the module level and remove the local imports inside the patch_env class; update the class (patch_env) to reference os directly in its __enter__ and __exit__ methods instead of performing "import os" there, keeping the existing behavior of saving/restoring PROJECT_ROOT via self.previous and os.environ.source/src/story_automator/commands/tmux.py (1)
743-765: Dead code path at line 765.
_verify_monitor_completion()always returns atuple[dict, str]—no code path returnsNone. This makes line 765 unreachable under the current implementation. While not harmful (it's defensive), consider either:
- Removing the
Nonereturn type from_verify_monitor_completionand this fallback, or- Adding a comment explaining this is intentional defensive code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/src/story_automator/commands/tmux.py` around lines 743 - 765, The final fallback branch handling verification is None is unreachable because _verify_monitor_completion() always returns a (dict, str) tuple; remove the dead path by deleting the unconditional return that assumes verification is None and instead rely on the verified/incomplete branches; also update _verify_monitor_completion's return type/signature (and any callers) to reflect it never returns None or, if you prefer defensive clarity, keep the fallback but add a concise comment above it stating it is intentionally defensive and why. Ensure references to _verify_monitor_completion and the local variable verification are updated accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@source/src/story_automator/commands/tmux.py`:
- Around line 743-765: The final fallback branch handling verification is None
is unreachable because _verify_monitor_completion() always returns a (dict, str)
tuple; remove the dead path by deleting the unconditional return that assumes
verification is None and instead rely on the verified/incomplete branches; also
update _verify_monitor_completion's return type/signature (and any callers) to
reflect it never returns None or, if you prefer defensive clarity, keep the
fallback but add a concise comment above it stating it is intentionally
defensive and why. Ensure references to _verify_monitor_completion and the local
variable verification are updated accordingly.
In `@source/tests/test_success_verifiers.py`:
- Around line 700-717: Import the standard library module os at the module level
and remove the local imports inside the patch_env class; update the class
(patch_env) to reference os directly in its __enter__ and __exit__ methods
instead of performing "import os" there, keeping the existing behavior of
saving/restoring PROJECT_ROOT via self.previous and os.environ.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c94a33d9-0635-4ee7-a317-11a091c4498e
📒 Files selected for processing (3)
docs/changelog/260415.mdsource/src/story_automator/commands/tmux.pysource/tests/test_success_verifiers.py
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/changelog/260415.md
Summary
runtime_policyand shared success verifier flowsTesting
Summary by CodeRabbit
New Features
Bug Fixes & Improvements
Documentation
Tests