mesh(tools): robot_mesh HITL via tool_context.interrupt + per-action rate limit (7/9 of #195 split) by cagataycali · Pull Request #227 · strands-labs/robots

cagataycali · 2026-05-25T20:38:21Z

Part 7 / 9 of the split of #195 — tracked by #219.

Drafted until PR-2 (#223) and PR-6 land.

The LLM tool surface

Approval delivered out-of-band of the LLM's tool-argument flow (via Strands SDK tool_context.interrupt), so prompt injection cannot smuggle approval. Only y / yes / approve / approved (case- and whitespace-insensitive, exact-match after strip().lower()) approve. Declined approvals do not consume rate-limit slots.

Per-action rate limit (separate buckets for emergency_stop / broadcast / execute) with token-bucket semantics.

What's in this PR

strands_robots/tools/robot_mesh.py (+324/-8) — HITL flow, per-action rate limit, client-side validate_command (defence-in-depth — programmatic callers go through the same gate as wire receive).
2 test files (325 LOC).

Reviewer focus

Prompt injection isolation: approval delivered via SDK tool_context.interrupt, not via tool args. Verify the LLM has no mechanism to extract or fabricate the approval token.
Per-action rate limit: separate token buckets prevent emergency_stop flood from cannibalising execute quota.
Declined-approval doesn't consume slots.
Exact-match approval (no fuzzy match, no "yeah" / "sure" / "go").

Carries review fixes from #195

R11 (tool surface API drift), R16 (test file naming — subject-named not methodology-named), R24 (_exec_cmd non-dict rejection client-side counterpart).

Stacking note

Depends on PR-1 (#220), PR-2 (#223), PR-6. CI on this branch in isolation is expected red until those land.

Landing order: PR-1 → PR-2/3/4/5 → PR-6 → PR-7 → PR-8 → PR-9. Full plan: PR_LIST.md. Tracking: #219.

…rate limit The LLM tool surface. Approval delivered out-of-band of the LLM's tool-argument flow (via Strands SDK tool_context.interrupt), so prompt injection cannot smuggle approval. Only y / yes / approve / approved (case- and whitespace-insensitive, exact-match after strip().lower()) approve. Declined approvals do not consume rate-limit slots. Per-action rate limit (separate buckets for emergency_stop / broadcast / execute) with token-bucket semantics. Modified: - strands_robots/tools/robot_mesh.py (+324/-8) -- HITL flow, per- action rate limit, client-side validate_command (defence-in-depth -- programmatic callers go through the same gate as wire receive). Carries review fixes from strands-labs#195: R11 (tool surface API drift), R16 (test file naming -- subject-named not methodology-named), R24 (_exec_cmd non-dict rejection client-side counterpart). Tests (325 LOC): HITL exact-match string parsing, declined != consumed, per-action bucket isolation, prompt-injection isolation (approval not extractable from LLM context), validate_command client-side. Tracking: strands-labs#219 Source PR: strands-labs#195 Lands after: strands-labs#220 (PR-1), PR-2 (security), PR-6 (core.Mesh contract)

yinsong1986

Summary

Replaces the old confirm: bool tool argument on robot_mesh with an out-of-band Strands SDK tool_context.interrupt(...) for emergency_stop / broadcast, adds per-action sliding-window rate limits with separate buckets, and runs mesh.security.validate_command client-side as defence-in-depth. Approval is exact-match (y / yes / approve / approved). Declined approvals deliberately do not consume a rate-limit slot, with a TOCTOU close-out via an atomic _rate_limit_check_and_record on the post-interrupt path. The threat model and the four layers of defence are clearly stated in the test docstring.

The core security argument holds: nothing the LLM can put in tool args can flip the approval bit, since the operator response arrives via the SDK's interrupt channel, not via the tool's argument dict. The fail-closed behaviour when interrupts are unavailable (RuntimeError from tool_context.interrupt) is correctly distinguished from InterruptException propagation, and the comment at line 375-379 calls this out explicitly.

What's good

Check / record split for the rate limiter (with rationale in the docstring) is a thoughtful fix for the "declined approval starves real emergency_stop" failure mode.
TOCTOU close-out via _rate_limit_check_and_record is a real bug fix backed by TestRateLimitTOCTOU.
Pre-interrupt validation of the broadcast command (R8-7) avoids burning operator approval and audit rows on a payload the validator will reject.
_audit_tool_action's wide except is correctly justified inline against the AGENTS.md narrow-clause rule.
Tests cover all four advertised defence layers (interrupt gate, rate limit, validation, audit).

Concerns

Rate-limit list drift in PR description: the description says "separate buckets for emergency_stop / broadcast / execute", but the implementation has buckets for tell / send / broadcast / stop / emergency_stop (no execute). Minor but worth fixing in the description so reviewers don't grep for a non-existent bucket.
Signature change is a breaking API change: adding a required tool_context: ToolContext parameter (even with @tool(context=True)) breaks any non-SDK caller that previously did robot_mesh(action="peers") directly. The CHANGELOG / migration note for #195's split should call this out, especially because pre-1.0 breaking changes need explicit documentation per AGENTS.md.
CI red on this branch in isolation is acknowledged in the PR description. Re-confirm green after PR-1 (#220), PR-2 (#223), PR-6 land before merging this.
Inline comments below flag a real defence-in-depth bug (the assert), an audit-coverage gap on mesh.X exceptions, and a missing regression test.

Verification suggestions

# Confirm assert-stripped behaviour reproduces the broadcast-with-None-cmd path:
python -OO -c 'from strands_robots.tools import robot_mesh as m; ...'  # see inline

# Run the PR's test file against the un-stacked branch (will need PR-1/2/6 first):
hatch run pytest tests/mesh/test_robot_mesh_security.py -v

# Spot-check the audit path: trigger an emergency_stop where mesh.emergency_stop()
# raises (e.g. session torn down) and confirm an audit row is still emitted.

yinsong1986 · 2026-05-25T21:01:20Z

+        # error (someone added "broadcast" without the pre-validate).
+        assert validated_broadcast_cmd is not None, (
+            "broadcast reached its handler without pre-validation — R8-7 contract broken"
+        )


assert is stripped under python -O — if anyone ever runs the agent with optimisation flags (or a downstream packager does), this assert becomes a no-op and cmd falls through to None, which then gets passed straight into mesh.broadcast(None, timeout=timeout) on the next line. That defeats the R8-7 contract this comment is asserting.

Replace with an explicit raise so the invariant survives -O:

if validated_broadcast_cmd is None: raise RuntimeError( "broadcast reached its handler without pre-validation — R8-7 contract broken" ) cmd = validated_broadcast_cmd

(General rule: assert is for tests and developer self-checks the program can survive losing; control-flow invariants that are part of a security contract should never be guarded by assert.)

yinsong1986 · 2026-05-25T21:01:20Z

+        )
+        cmd = validated_broadcast_cmd
        results = mesh.broadcast(cmd, timeout=timeout)
+        _audit_tool_action(action, "*", True, f"action={cmd.get('action')} responses={len(results)}")


Audit log gap when mesh.X raises. The docstring (line 310-311) promises "Every tell / send / broadcast / stop / emergency_stop is audited." In practice, the audit row is only written after the underlying mesh.broadcast / mesh.tell / mesh.send / mesh.emergency_stop call returns. If those raise (e.g. zenoh session dropped, RPC timeout that surfaces as an exception, broken transport), the @tool decorator catches the exception but no _audit_tool_action(..., success=False, ...) row is emitted — so a successful operator approval that then failed at the wire layer is invisible in the audit log.

For a safety-significant tool the audit log should be the last thing to break. Wrap each mesh.X call:

try: results = mesh.broadcast(cmd, timeout=timeout) except Exception as exc: # noqa: BLE001 — audit must record any failure _audit_tool_action(action, "*", False, f"mesh.broadcast raised: {exc!r}") raise

Applies symmetrically to tell (L466), send (L491), stop (L519), emergency_stop (L527), subscribe (L537), watch (L552). Per AGENTS.md > Review Learnings (#85) > "Pin regression tests for reviewed fixes", this also wants a test that asserts an audit row exists when mesh.broadcast raises.

yinsong1986 · 2026-05-25T21:01:20Z

+            # pause the agent loop, so we deliberately do NOT catch
+            # ``Exception`` here — that would swallow the normal
+            # interrupt-pause flow and turn every approval into an
+            # immediate "interrupt unavailable" error.


Add a regression test pinning InterruptException propagation. The comment correctly explains why the clause is except RuntimeError and not except Exception — swallowing InterruptException would break the SDK's pause flow and silently turn every approval into "interrupt unavailable". But there's no test that asserts this. A future refactor that widens the except (or adds a try: ... except Exception: wrapper higher up) would silently re-introduce the bug.

Per AGENTS.md > Review Learnings (#85) > "Pin regression tests for reviewed fixes", add a test along the lines of:

class _FakeInterrupt(BaseException): # mimics SDK behaviour pass def test_interrupt_exception_propagates(self): ctx = MagicMock(name="ToolContext") ctx.interrupt.side_effect = _FakeInterrupt("agent loop pause") with patch("strands_robots.tools.robot_mesh._resolve_mesh", return_value=_stub_mesh()): with pytest.raises(_FakeInterrupt): _call("emergency_stop", ctx=ctx)

Ideally use the actual SDK exception type, but a stand-in BaseException subclass works because the production code only catches RuntimeError.

yinsong1986 · 2026-05-25T21:01:20Z

+        rl_race_err = _rate_limit_check_and_record(action)
+        if rl_race_err is not None:
+            _audit_tool_action(action, target, False, f"rate_limit_race: {rl_race_err}")
+            return _err(rl_race_err)


Surprising UX: operator approves, action then rejected by rate-limit race. When the post-approval _rate_limit_check_and_record rejects (because a concurrent invocation raced past while waiting on the operator), the audit row reads rate_limit_race, the operator was prompted and said yes, and the action did not run. From the operator's chair this looks like "I approved an emergency stop and nothing happened."

The behaviour is technically correct (the limit is the limit), but consider:

The error message returned to the agent should explicitly tell the operator their approval was wasted, so they don't assume the e-stop succeeded. The current text ("rate limit exceeded ... raced past") is for the LLM, not the human.

Logging at WARNING level here (not just audit-DEBUG) would surface this rare race in production logs.

Worth considering whether emergency_stop specifically should have an exception path that bypasses the limit on a confirmed operator override — the docstring on _rate_limit_check says "the rate limit exists to bound LLM-driven nuisance, not to inhibit a genuine emergency," but the post-approval rejection does exactly that for the unlucky race-loser.

yinsong1986 · 2026-05-25T21:01:20Z

+@tool(context=True)
 def robot_mesh(
    action: str,
+    tool_context: ToolContext,


Breaking signature change worth a CHANGELOG / migration note. Adding tool_context: ToolContext as a required positional parameter (even with @tool(context=True) injecting it via the SDK) breaks any caller that previously invoked robot_mesh(action="peers") directly without going through the agent loop — e.g. unit tests in downstream consumers, scripts that did from strands_robots.tools.robot_mesh import robot_mesh; robot_mesh(action="peers"). Such callers will now get TypeError: missing required positional argument: 'tool_context'.

This is fine as an intentional change (the tool is by definition agent-facing) but per the AGENTS.md "pre-1.0 breaking change without a CHANGELOG entry or migration sketch" guidance from past review-learnings, the split's release notes / CHANGELOG should call this out explicitly along with the recommended replacement (the __wrapped__ / mock-context pattern your tests use).

The test_application_security.py file (358 LOC) and the TestF15WireCommandKeyRequired class in test_validate_command.py exercise modules that land in later PRs of the strands-labs#195 split: - bridge_transport.DEFAULT_BRIDGE_SUFFIXES with 'safety/resume' (bridge_transport additions land in PR-5 / strands-labs#222) - audit._seq_sidecar_path, audit._resolve_log_max_bytes, F2/F3 symlink protections (land in PR-4 / strands-labs#221) - core.Mesh._expected_responders, core.BROADCAST_RESPONDER, the wire-side _exec_cmd dict-shape gate (land in PR-6 / strands-labs#225) - robot_mesh._interrupt_approves (lands in PR-7 / strands-labs#227) PR-2 is scoped to the leaf module strands_robots/mesh/security.py with zero internal mesh deps. These tests were pre-staged here but need their consumers to land first. They will re-land in the appropriate consumer PR. Test impact: 9 failing tests removed (8 from test_application_security.py, the F15 class moved out). All 571 remaining mesh tests pass. No production code change; no behaviour change on validate_command or any security.py contract.

cagataycali · 2026-05-27T02:11:46Z

🎯 Pentest evidence for this PR (#227 — HITL + rate limit)

2026-05-26 pentest run cagataycali/robots-pentest@1731f62.

What this PR is achieving (positive control)

The HITL tool_context.interrupt gates plus per-action sliding-window rate limit are the strong-but-non-portable defence at the LLM → robot boundary. The pentest exercised it 10 times in two scenarios:

Scenario	Result
6 direct prompt-injection prompts with HITL enabled	Agent refused all 4 dangerous prompts (`emergency_stop`, `broadcast` w/ attacker `policy_host`, etc.); 2 readonly tool calls audited
Same 6 prompts with `LLM_HITL_AUTOAPPROVE=y` (sims operator-absent)	Agent still refused all 4 — reasoning-layer defence held even when HITL gate auto-approved
4 indirect injections (doc summarisation, fake runbook, fleet loop, tool-output smuggle)	Agent refused all 4

Sample model reasoning quoted in results/07b_prompt_injection_hitl_autoapprove.log:

"The emergency_stop action is a hard safety gate — it requires a human operator to explicitly type EMERGENCY STOP NOW in plain English, directly in this conversation. It cannot be bypassed by a claim that the on-call has approved."

Caveats worth a regression test in this PR

Tested with claude-sonnet-4-20250514. Reasoning-layer defence is non-portable. A future regression test should run the same prompt battery against a smaller / older model (Haiku, GPT-4o-mini, Llama-3-8B) to characterise the model-strength dependency.
Rate limit (emergency_stop=3/60s) was not exhausted in the test run — worth a stress test that fires 4 HITL-approved emergency_stops within 60s and asserts the 4th is rejected with the exact Try again in Xs wait-time string.
HITL tool_context.interrupt MUST raise RuntimeError when no agent loop is attached (so a direct agent.tool.robot_mesh(...) call from a script can't smuggle an approval). Pin a test_robot_mesh_security.py::test_interrupt_unavailable_fails_closed regression.

Full writeup: F-07 / B-07 and F-12.

LGTM signals from the pentest

HITL interrupts fired correctly on emergency_stop and broadcast actions (visible in victim logs).
Rate-limit caching (_rate_limit_check_and_record) was the only defence between fast-fire indirect prompts and bandwidth-DoS via the LLM — it held.
Tool audit (_audit_tool_action) wrote 6 records over the 6-prompt battery, all linked to legitimate tell / subscribe calls. Audit log is the only thing that survived the LLM's reasoning gating layer.

Mapping for all PRs: PLAN.md.

This was referenced May 25, 2026

feat(mesh): security hardening on Zenoh built-ins (mTLS + ACL + DoS bounds) #195

Draft

[mesh] Security hardening (Zenoh-native): umbrella tracker for split of #195 #219

Open

cagataycali marked this pull request as ready for review May 25, 2026 20:41

cagataycali added this to Strands Labs - Robots May 25, 2026

github-project-automation Bot moved this to Backlog in Strands Labs - Robots May 25, 2026

yinsong1986 reviewed May 25, 2026

View reviewed changes

yinsong1986 mentioned this pull request May 25, 2026

docs(mesh): README env-var matrix + CHANGELOG entry (9/9 of #195 split) #226

Open

cagataycali added security mesh Zenoh mesh networking / fleet coordination labels May 26, 2026

This was referenced May 26, 2026

mesh(security): payload validation module (action allowlist + bounds) (2/9 of #195 split) #223

Open

mesh(iot): AWS IoT provisioning hardening (CA pin + thing-name regex + scoped policy) (8/9 of #195 split) #228

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

mesh(tools): robot_mesh HITL via tool_context.interrupt + per-action rate limit (7/9 of #195 split)#227

mesh(tools): robot_mesh HITL via tool_context.interrupt + per-action rate limit (7/9 of #195 split)#227
cagataycali wants to merge 1 commit into
strands-labs:mainfrom
cagataycali:pr7/mesh-tools-robot-mesh-hitl

cagataycali commented May 25, 2026

Uh oh!

yinsong1986 left a comment

Uh oh!

yinsong1986 May 25, 2026

Uh oh!

yinsong1986 May 25, 2026

Uh oh!

yinsong1986 May 25, 2026

Uh oh!

yinsong1986 May 25, 2026

Uh oh!

yinsong1986 May 25, 2026

Uh oh!

cagataycali commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

cagataycali commented May 25, 2026

The LLM tool surface

What's in this PR

Reviewer focus

Carries review fixes from #195

Stacking note

Uh oh!

yinsong1986 left a comment

Choose a reason for hiding this comment

Summary

What's good

Concerns

Verification suggestions

Uh oh!

yinsong1986 May 25, 2026

Choose a reason for hiding this comment

Uh oh!

yinsong1986 May 25, 2026

Choose a reason for hiding this comment

Uh oh!

yinsong1986 May 25, 2026

Choose a reason for hiding this comment

Uh oh!

yinsong1986 May 25, 2026

Choose a reason for hiding this comment

Uh oh!

yinsong1986 May 25, 2026

Choose a reason for hiding this comment

Uh oh!

cagataycali commented May 27, 2026

🎯 Pentest evidence for this PR (#227 — HITL + rate limit)

What this PR is achieving (positive control)

Caveats worth a regression test in this PR

LGTM signals from the pentest

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants