mesh(tools): robot_mesh HITL via tool_context.interrupt + per-action rate limit (7/9 of #195 split)#227
Conversation
…rate limit The LLM tool surface. Approval delivered out-of-band of the LLM's tool-argument flow (via Strands SDK tool_context.interrupt), so prompt injection cannot smuggle approval. Only y / yes / approve / approved (case- and whitespace-insensitive, exact-match after strip().lower()) approve. Declined approvals do not consume rate-limit slots. Per-action rate limit (separate buckets for emergency_stop / broadcast / execute) with token-bucket semantics. Modified: - strands_robots/tools/robot_mesh.py (+324/-8) -- HITL flow, per- action rate limit, client-side validate_command (defence-in-depth -- programmatic callers go through the same gate as wire receive). Carries review fixes from strands-labs#195: R11 (tool surface API drift), R16 (test file naming -- subject-named not methodology-named), R24 (_exec_cmd non-dict rejection client-side counterpart). Tests (325 LOC): HITL exact-match string parsing, declined != consumed, per-action bucket isolation, prompt-injection isolation (approval not extractable from LLM context), validate_command client-side. Tracking: strands-labs#219 Source PR: strands-labs#195 Lands after: strands-labs#220 (PR-1), PR-2 (security), PR-6 (core.Mesh contract)
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
Replaces the old confirm: bool tool argument on robot_mesh with an out-of-band Strands SDK tool_context.interrupt(...) for emergency_stop / broadcast, adds per-action sliding-window rate limits with separate buckets, and runs mesh.security.validate_command client-side as defence-in-depth. Approval is exact-match (y / yes / approve / approved). Declined approvals deliberately do not consume a rate-limit slot, with a TOCTOU close-out via an atomic _rate_limit_check_and_record on the post-interrupt path. The threat model and the four layers of defence are clearly stated in the test docstring.
The core security argument holds: nothing the LLM can put in tool args can flip the approval bit, since the operator response arrives via the SDK's interrupt channel, not via the tool's argument dict. The fail-closed behaviour when interrupts are unavailable (RuntimeError from tool_context.interrupt) is correctly distinguished from InterruptException propagation, and the comment at line 375-379 calls this out explicitly.
What's good
- Check / record split for the rate limiter (with rationale in the docstring) is a thoughtful fix for the "declined approval starves real emergency_stop" failure mode.
- TOCTOU close-out via
_rate_limit_check_and_recordis a real bug fix backed byTestRateLimitTOCTOU. - Pre-interrupt validation of the broadcast command (R8-7) avoids burning operator approval and audit rows on a payload the validator will reject.
_audit_tool_action's wideexceptis correctly justified inline against the AGENTS.md narrow-clause rule.- Tests cover all four advertised defence layers (interrupt gate, rate limit, validation, audit).
Concerns
- Rate-limit list drift in PR description: the description says "separate buckets for
emergency_stop/broadcast/execute", but the implementation has buckets fortell/send/broadcast/stop/emergency_stop(noexecute). Minor but worth fixing in the description so reviewers don't grep for a non-existent bucket. - Signature change is a breaking API change: adding a required
tool_context: ToolContextparameter (even with@tool(context=True)) breaks any non-SDK caller that previously didrobot_mesh(action="peers")directly. The CHANGELOG / migration note for #195's split should call this out, especially because pre-1.0 breaking changes need explicit documentation per AGENTS.md. - CI red on this branch in isolation is acknowledged in the PR description. Re-confirm green after PR-1 (#220), PR-2 (#223), PR-6 land before merging this.
- Inline comments below flag a real defence-in-depth bug (the
assert), an audit-coverage gap on mesh.X exceptions, and a missing regression test.
Verification suggestions
# Confirm assert-stripped behaviour reproduces the broadcast-with-None-cmd path:
python -OO -c 'from strands_robots.tools import robot_mesh as m; ...' # see inline
# Run the PR's test file against the un-stacked branch (will need PR-1/2/6 first):
hatch run pytest tests/mesh/test_robot_mesh_security.py -v
# Spot-check the audit path: trigger an emergency_stop where mesh.emergency_stop()
# raises (e.g. session torn down) and confirm an audit row is still emitted.| # error (someone added "broadcast" without the pre-validate). | ||
| assert validated_broadcast_cmd is not None, ( | ||
| "broadcast reached its handler without pre-validation — R8-7 contract broken" | ||
| ) |
There was a problem hiding this comment.
assert is stripped under python -O — if anyone ever runs the agent with optimisation flags (or a downstream packager does), this assert becomes a no-op and cmd falls through to None, which then gets passed straight into mesh.broadcast(None, timeout=timeout) on the next line. That defeats the R8-7 contract this comment is asserting.
Replace with an explicit raise so the invariant survives -O:
if validated_broadcast_cmd is None:
raise RuntimeError(
"broadcast reached its handler without pre-validation — R8-7 contract broken"
)
cmd = validated_broadcast_cmd(General rule: assert is for tests and developer self-checks the program can survive losing; control-flow invariants that are part of a security contract should never be guarded by assert.)
| ) | ||
| cmd = validated_broadcast_cmd | ||
| results = mesh.broadcast(cmd, timeout=timeout) | ||
| _audit_tool_action(action, "*", True, f"action={cmd.get('action')} responses={len(results)}") |
There was a problem hiding this comment.
Audit log gap when mesh.X raises. The docstring (line 310-311) promises "Every tell / send / broadcast / stop / emergency_stop is audited." In practice, the audit row is only written after the underlying mesh.broadcast / mesh.tell / mesh.send / mesh.emergency_stop call returns. If those raise (e.g. zenoh session dropped, RPC timeout that surfaces as an exception, broken transport), the @tool decorator catches the exception but no _audit_tool_action(..., success=False, ...) row is emitted — so a successful operator approval that then failed at the wire layer is invisible in the audit log.
For a safety-significant tool the audit log should be the last thing to break. Wrap each mesh.X call:
try:
results = mesh.broadcast(cmd, timeout=timeout)
except Exception as exc: # noqa: BLE001 — audit must record any failure
_audit_tool_action(action, "*", False, f"mesh.broadcast raised: {exc!r}")
raiseApplies symmetrically to tell (L466), send (L491), stop (L519), emergency_stop (L527), subscribe (L537), watch (L552). Per AGENTS.md > Review Learnings (#85) > "Pin regression tests for reviewed fixes", this also wants a test that asserts an audit row exists when mesh.broadcast raises.
| # pause the agent loop, so we deliberately do NOT catch | ||
| # ``Exception`` here — that would swallow the normal | ||
| # interrupt-pause flow and turn every approval into an | ||
| # immediate "interrupt unavailable" error. |
There was a problem hiding this comment.
Add a regression test pinning InterruptException propagation. The comment correctly explains why the clause is except RuntimeError and not except Exception — swallowing InterruptException would break the SDK's pause flow and silently turn every approval into "interrupt unavailable". But there's no test that asserts this. A future refactor that widens the except (or adds a try: ... except Exception: wrapper higher up) would silently re-introduce the bug.
Per AGENTS.md > Review Learnings (#85) > "Pin regression tests for reviewed fixes", add a test along the lines of:
class _FakeInterrupt(BaseException): # mimics SDK behaviour
pass
def test_interrupt_exception_propagates(self):
ctx = MagicMock(name="ToolContext")
ctx.interrupt.side_effect = _FakeInterrupt("agent loop pause")
with patch("strands_robots.tools.robot_mesh._resolve_mesh", return_value=_stub_mesh()):
with pytest.raises(_FakeInterrupt):
_call("emergency_stop", ctx=ctx)Ideally use the actual SDK exception type, but a stand-in BaseException subclass works because the production code only catches RuntimeError.
| rl_race_err = _rate_limit_check_and_record(action) | ||
| if rl_race_err is not None: | ||
| _audit_tool_action(action, target, False, f"rate_limit_race: {rl_race_err}") | ||
| return _err(rl_race_err) |
There was a problem hiding this comment.
Surprising UX: operator approves, action then rejected by rate-limit race. When the post-approval _rate_limit_check_and_record rejects (because a concurrent invocation raced past while waiting on the operator), the audit row reads rate_limit_race, the operator was prompted and said yes, and the action did not run. From the operator's chair this looks like "I approved an emergency stop and nothing happened."
The behaviour is technically correct (the limit is the limit), but consider:
- The error message returned to the agent should explicitly tell the operator their approval was wasted, so they don't assume the e-stop succeeded. The current text ("rate limit exceeded ... raced past") is for the LLM, not the human.
- Logging at
WARNINGlevel here (not just audit-DEBUG) would surface this rare race in production logs. - Worth considering whether
emergency_stopspecifically should have an exception path that bypasses the limit on a confirmed operator override — the docstring on_rate_limit_checksays "the rate limit exists to bound LLM-driven nuisance, not to inhibit a genuine emergency," but the post-approval rejection does exactly that for the unlucky race-loser.
| @tool(context=True) | ||
| def robot_mesh( | ||
| action: str, | ||
| tool_context: ToolContext, |
There was a problem hiding this comment.
Breaking signature change worth a CHANGELOG / migration note. Adding tool_context: ToolContext as a required positional parameter (even with @tool(context=True) injecting it via the SDK) breaks any caller that previously invoked robot_mesh(action="peers") directly without going through the agent loop — e.g. unit tests in downstream consumers, scripts that did from strands_robots.tools.robot_mesh import robot_mesh; robot_mesh(action="peers"). Such callers will now get TypeError: missing required positional argument: 'tool_context'.
This is fine as an intentional change (the tool is by definition agent-facing) but per the AGENTS.md "pre-1.0 breaking change without a CHANGELOG entry or migration sketch" guidance from past review-learnings, the split's release notes / CHANGELOG should call this out explicitly along with the recommended replacement (the __wrapped__ / mock-context pattern your tests use).
The test_application_security.py file (358 LOC) and the TestF15WireCommandKeyRequired class in test_validate_command.py exercise modules that land in later PRs of the strands-labs#195 split: - bridge_transport.DEFAULT_BRIDGE_SUFFIXES with 'safety/resume' (bridge_transport additions land in PR-5 / strands-labs#222) - audit._seq_sidecar_path, audit._resolve_log_max_bytes, F2/F3 symlink protections (land in PR-4 / strands-labs#221) - core.Mesh._expected_responders, core.BROADCAST_RESPONDER, the wire-side _exec_cmd dict-shape gate (land in PR-6 / strands-labs#225) - robot_mesh._interrupt_approves (lands in PR-7 / strands-labs#227) PR-2 is scoped to the leaf module strands_robots/mesh/security.py with zero internal mesh deps. These tests were pre-staged here but need their consumers to land first. They will re-land in the appropriate consumer PR. Test impact: 9 failing tests removed (8 from test_application_security.py, the F15 class moved out). All 571 remaining mesh tests pass. No production code change; no behaviour change on validate_command or any security.py contract.
🎯 Pentest evidence for this PR (#227 — HITL + rate limit)2026-05-26 pentest run What this PR is achieving (positive control)The HITL
Sample model reasoning quoted in
Caveats worth a regression test in this PR
Full writeup: F-07 / B-07 and F-12. LGTM signals from the pentest
Mapping for all PRs: |
Part 7 / 9 of the split of #195 — tracked by #219.
Drafted until PR-2 (#223) and PR-6 land.
The LLM tool surface
Approval delivered out-of-band of the LLM's tool-argument flow (via Strands SDK
tool_context.interrupt), so prompt injection cannot smuggle approval. Onlyy/yes/approve/approved(case- and whitespace-insensitive, exact-match afterstrip().lower()) approve. Declined approvals do not consume rate-limit slots.Per-action rate limit (separate buckets for
emergency_stop/broadcast/execute) with token-bucket semantics.What's in this PR
strands_robots/tools/robot_mesh.py(+324/-8) — HITL flow, per-action rate limit, client-sidevalidate_command(defence-in-depth — programmatic callers go through the same gate as wire receive).Reviewer focus
tool_context.interrupt, not via tool args. Verify the LLM has no mechanism to extract or fabricate the approval token.emergency_stopflood from cannibalisingexecutequota.Carries review fixes from #195
R11 (tool surface API drift), R16 (test file naming — subject-named not methodology-named), R24 (
_exec_cmdnon-dict rejection client-side counterpart).Stacking note
Depends on PR-1 (#220), PR-2 (#223), PR-6. CI on this branch in isolation is expected red until those land.
Landing order: PR-1 → PR-2/3/4/5 → PR-6 → PR-7 → PR-8 → PR-9. Full plan:
PR_LIST.md. Tracking: #219.