mesh(security): payload validation module (action allowlist + bounds) (2/9 of #195 split)#223
Conversation
Adds the application-layer security boundary that runs after a peer
has cleared mTLS + ACL but whose payload contents we still need to
bound. Without this an authorised operator could steer a robot at
an attacker-controlled inference server, request a 24-hour execute
action, or drive the robot to download an arbitrary HuggingFace model.
New module:
- strands_robots/mesh/security.py (786 LOC) -- standalone, zero
internal mesh deps. Pure validation functions:
* validate_command -- action allowlist + per-action bounds
(instruction length, duration, step count, finite-numeric checks,
teleop peer_id charset+length, resume override-code clauses)
* is_safe_policy_host -- VLA inference target host/CIDR allowlist
* is_safe_model_path -- HuggingFace repo + local model gate
* is_safe_policy_type / is_safe_policy_provider -- policy registry
* is_safe_server_address -- composite host[:port] check
Carries review fixes from strands-labs#195: R10 (resume clause in validate_command),
R11 (action allowlist completeness), R23 (NaN coercion / HF shape /
allowlist charset+cache), R24-B (non-dict command payloads rejected
at wire boundary in core.py -- pinned client-side here).
Tests (1,710 LOC across 8 files): action allowlist coverage,
finite-float boundaries, peer_id charset/length, allowlist env-var
parsing + LRU cache invalidation, policy host CIDR matching, HF
two-segment shape enforcement.
Tracking: strands-labs#219
Source PR: strands-labs#195
Lands after: strands-labs#220 (PR-1)
Lands before: PR-3, PR-4, PR-5, PR-6 (none of which depend on this
for compilation, but PR-6 imports validate_command at
runtime via core.py)
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
This PR introduces a self-contained payload-validation module (strands_robots/mesh/security.py) that gates command actions, numeric bounds, host/CIDR allowlists, HF repo IDs, policy types, and teleop peer-id charsets behind a single validate_command entry point. The module is well-scoped, has no internal mesh deps as advertised, and ships ~1700 LOC of tests covering the documented attack surfaces (NaN/inf coercion, traversal, shell metacharacters, charset/length caps, allowlist env-var parsing + LRU-cache invalidation, two-segment HF shape).
The defence-in-depth posture (build out from validated subset only, charset-validate operator-supplied env-var entries with WARNING on misconfig, explicit math.isfinite check on numeric coercion, ValidationError rather than bare ValueError so _exec_cmd audits structurally) is consistent with the review learnings in AGENTS.md.
What's good
- Numeric coercion defences (
_coerce_float/_coerce_int) correctly reject NaN/inf, wrapOverflowError/ValueErrorinValidationError, and have a dedicated regression test (test_validate_command_finite_numerics.py) that pins the exception class — exactly the pattern AGENTS.md > Review Learnings (#85) calls for. - Env-var allowlist parsers are charset-validated through a single helper (
_validate_env_allowlist_entries) with LRU caching keyed on the raw env string — avoids WARNING flooding the audit log on a typo'd value, and the test pins the one-warning-per-distinct-value semantics. outis built from the validated subset rather thandict(cmd)overlay, with explicit pinning test (TestValidateCommandKeyAllowlist) — closes the silent-pickup gap a future**cmdhandler would otherwise hit.- Per-action peer_id / device_name charset + length caps with regression tests covering shell metacharacters, control bytes, NUL, unicode.
- Two-segment HF shape gate with a thoughtful comment explaining why downstream 404 isn't sufficient.
- Scope is disciplined — module is self-contained as the PR description claims; no internal mesh imports.
Concerns
ALLOWED_ACTIONS↔Mesh._dispatchsync is comment-only. The header comment (line 128) says "Keep these two sets in sync when adding a new action" but no test pins it. Sincecore.pylands in PR-6, this can't be tested here, but a follow-up integration test in PR-6 should assert the two sets are equal — silent drift between an allowlist and the dispatcher it gates is exactly the bug class action allowlists exist to prevent.MAX_TIMEOUT_Sis defined and exported in__all__but never used in this module or its tests. AGENTS.md > Key Conventions #10: "No dead code — if it's not called and not part of base class, delete it." If it's intended for PR-6/7/8 consumers, prefer landing it with the consumer.- Docstring drift on
validate_command. The docstring (line 593) listspolicy_provideras "defaults to\"mock\"" but the implementation (line 689) raisesValidationErrorwhen it is missing. AGENTS.md > Review Learnings (#85): "Match docstrings to semantics." - The truncated tests file (
test_application_security.py, +358) is mostly orthogonal to this module — it importsbridge_transport,audit,core, asserts symlink-swap defences, log-injection, response-hijack. Those modules are not in the PR-2 diff, so the tests will skip / fail in the PR-2 self-contained landing unless those modules already exist onmain. Worth confirming the dependency is actually satisfied by prior merges before this PR lands; otherwise this file belongs in PR-6.
Verification suggestions
# 1. Run the new test files standalone to confirm they pass without PR-6/7/8.
hatch run test tests/mesh/test_validate_command.py \
tests/mesh/test_validate_command_finite_numerics.py \
tests/mesh/test_validate_command_teleop_peer_id.py \
tests/mesh/test_hf_repo_two_segment_shape.py \
tests/mesh/test_policy_host_allowlist.py \
tests/mesh/test_allowlist_caches.py \
tests/mesh/test_allowlist_env_charset_validation.py
# 2. test_application_security.py imports modules outside this PR --
# confirm they exist on the head SHA of `main`:
git grep -l 'def _should_bridge' -- 'strands_robots/**'
git grep -l 'def log_safety_event' -- 'strands_robots/**'
# 3. Cross-check that ALLOWED_ACTIONS will actually match Mesh._dispatch
# once PR-6 lands. Until then, eyeball the dispatch table:
git grep -A 2 'def _dispatch' strands_robots/mesh/core.py 2>/dev/null || echo 'core.py not yet present (PR-6)'…dress cap, docstring drift Addresses 4 of 5 R1 review threads on PR strands-labs#223 (yinsong1986). Each fix is surgical and pinned by a regression test in `tests/mesh/test_security_review_r1.py`. Threads addressed: 1. **`MAX_TIMEOUT_S` was dead code.** Defined and exported, no consumer in PR-2 scope. Removed pending PR-6 (strands-labs#225) where the call site that enforces the cap should land alongside the constant. Per AGENTS.md Key Conventions strands-labs#10 ("No dead code -- if it's not called and not part of base class, delete it"). 2. **`MAX_SERVER_ADDRESS_LEN` is now a dedicated constant.** A model path and a `host[:port]` URL are semantically unrelated; reusing `MAX_MODEL_PATH_LEN` (512) for the address cap meant a future tightening of one would silently shift the other. New constant is 256 (FQDN bounded at 253 by RFC 1035; smallest power-of-2 ceiling that admits any legitimate input). 3. **`policy_provider` docstring drift.** The `validate_command` docstring claimed `policy_provider (optional): defaults to "mock"` but the implementation explicitly raises `ValidationError( "policy_provider is required for execute/start actions...")`. Docstring now reads `REQUIRED, in is_safe_policy_provider. No silent default -- a peer that omits this is rejected so it is never ambiguous whether mock was an explicit choice or a bug.` 4. **`policy_host` casing-preservation contract documented.** `is_safe_policy_host` casefolds + strips for the allowlist check, but `validate_command` preserves the caller's original casing / whitespace in the validated output. A downstream consumer doing `policy_host == "localhost"` for a loopback fast-path will fail on `" LOCALHOST "` despite both validating successfully. The contract is now documented in a comment adjacent to the `out["policy_host"] = policy_host` assignment so a future contributor reading the source sees the rule before stumbling into it. Same contract applies to `server_address` (noted in the comment). Deferred (cross-PR scope): 5. **`ALLOWED_ACTIONS == set(Mesh._dispatch)` regression test.** The reviewer correctly notes this pin must live in PR-6 (strands-labs#225) where `core.py` lands, since `ALLOWED_ACTIONS` and `Mesh._dispatch` live in different PR scopes today. Will reply on the thread. Pin verification (AGENTS.md S0 round-trip): - Pre-fix: 9/13 new tests fail (correctly identifying the absent constant, the present dead constant, the drift in the docstring, and the missing comment). - Post-fix: 13/13 pass. - Existing `test_validate_command.py` regression: 57/58 pass; the 1 failure (`test_flat_envelope_rejected_in_exec_cmd`) is pre-existing cross-PR — verified by re-running with my changes stashed: same failure. It depends on `Mesh._dispatch` from `core.py` (PR-6). No behaviour change on the wire-side validation path; only the constant a future consumer reads (`MAX_SERVER_ADDRESS_LEN` vs `MAX_MODEL_PATH_LEN`) differs, and only for inputs in the 257-512 char range, which is past every legitimate `host[:port]` shape.
Fix CI lint failure: ruff format --check exit 1. Pure formatter output (slice spacing PEP 8 + quote-style normalization). No semantic changes. ruff format --check strands_robots tests tests_integ now clean (193 files already formatted, this one brought to spec).
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
PR-2/9 of #195 — adds strands_robots/mesh/security.py (~800 LOC) with payload-semantic validators (validate_command, is_safe_policy_host, is_safe_model_path, is_safe_policy_type, is_safe_policy_provider, is_safe_server_address) plus 1.7k LOC of tests. Self-contained — no internal mesh imports, so the diff truly is a leaf module the rest of the stack will depend on.
The security posture is reasonable: NaN/inf guards on numeric coercion, regex+length+traversal gates on model paths, separate dedicated cap for server addresses, charset-validated env-var allowlists, and a fail-loud-on-misconfig WARNING channel. The R1 round-trip is well-documented and the pin tests are solid.
What's good
_coerce_float/_coerce_intcorrectly reject NaN/inf and wrapOverflowError/ValueErrorasValidationErrorso_exec_cmd(PR-6) can audit them ascommand_rejectedrather than generic dispatch errors.- The R1 fixes have proper pin tests in
tests/mesh/test_security_review_r1.pythat fail on pre-fix code (per AGENTS.md > Review Learnings #85). - Strict per-action key allowlist (
out: dict[str, Any] = {"action": action}building forward) is a meaningful defence-in-depth improvement overdict(cmd)-overlay. MAX_SERVER_ADDRESS_LENdecoupled fromMAX_MODEL_PATH_LENis the right call.policy_providerno longer silently defaults to"mock"— that was a real security boundary footgun.- Test suite is broad: action allowlist coverage, finite-float boundaries, peer_id charset, allowlist env parsing + LRU cache invalidation, IPv6/CIDR matching, HF two-segment shape.
Concerns
ALLOWED_ACTIONS↔Mesh._dispatchsync is comment-only. The PR acknowledges this is deferred to PR-6, but as currently structured a future contributor could add an action to_dispatchwithout updatingALLOWED_ACTIONSand only discover the drift via runtime rejection. PR-6 should land a self-test (assert ALLOWED_ACTIONS == set(Mesh._dispatch_table.keys())) in the same diff. Worth a comment in the source pointing forward to that pin so the next reader doesn't have to reconstruct the contract from PR descriptions.- HF
<org>/<repo>shape gate is permissive on degenerate-but-regex-valid inputs. See inline onis_safe_model_path. Not exploitable today (HF 404s the resolution) but the validator is supposed to enforce the wire contract independent of downstream behaviour, per the docstring on line 357. policy_host/server_addressround-trip preserves whitespace verbatim. The R1 thread-5 documentation (line 645-651) commits the project to a contract where downstream callers must strip+casefold themselves. This is a footgun: a future_dispatchauthor who writesif cmd["policy_host"] == "localhost"for a loopback fast-path will silently miss" localhost "payloads. Forensic value of preserving whitespace is questionable vs. emitting both raw and normalised fields. Worth revisiting as a follow-up.- Action allowlist completeness review is incomplete in this PR's scope. Reviewer-focus item #1 in the description says "Action allowlist completeness vs
_exec_cmdpaths incore.py" — butcore.pyis in PR-6, so a reviewer of PR-2 cannot actually verify completeness. Recommend the PR-6 review thread cross-reference this PR'sALLOWED_ACTIONSline range explicitly. - No CHANGELOG / migration note for the breaking change.
policy_providergoing from optional-defaults-mock to required is a wire-contract break. Even though no callers exist yet onmain, the PR-6/7/8 follow-ups will need to set this field; flagging it in a top-level docstring or aCHANGELOG.mdentry would help future maintainers.
Verification suggestions
# Run the new test surface
hatch run test tests/mesh/test_validate_command.py \
tests/mesh/test_validate_command_finite_numerics.py \
tests/mesh/test_validate_command_teleop_peer_id.py \
tests/mesh/test_security_review_r1.py \
tests/mesh/test_hf_repo_two_segment_shape.py \
tests/mesh/test_allowlist_caches.py \
tests/mesh/test_allowlist_env_charset_validation.py \
tests/mesh/test_policy_host_allowlist.py
# Spot-check the degenerate HF path inputs flagged inline
python -c "from strands_robots.mesh.security import is_safe_model_path as f; \
print(f('nvidia//repo', hf_only=True), f('nvidia/.', hf_only=True), f('nvidia/./repo', hf_only=True))"
# Today: True True True (per the inline comment, these should ideally be False)
# Confirm policy_host whitespace round-trip
python -c "from strands_robots.mesh.security import validate_command as v; \
print(repr(v({'action':'execute','instruction':'go','policy_host':' LOCALHOST ','policy_provider':'mock'})['policy_host']))"Per umbrella tracker strands-labs#219 the four PR-2/3/4/5 slices land in parallel and must each go green standalone. test_r25_audit_symmetry.py imports modules introduced by sibling slices that are not present on this branch: from strands_robots.mesh import _zenoh_config as zc # lives on PR strands-labs#224 from strands_robots.mesh import audit as audit_mod # lives on this PR from strands_robots.mesh import core as core_mod # lives on this PR from strands_robots.mesh import security as sec # lives on PR strands-labs#223 That makes the file structurally un-passable on the audit slice in isolation, which is why call-test-lint fails mypy here with "Module strands_robots.mesh has no attribute 'security'" / "... no attribute '_zenoh_config'". The four pin sites the file describes (peer-id charset on security.py, partial-failure cleanup on core.py, O_NOFOLLOW on the seq lockfile in audit.py, and the STRANDS_MESH_TLS_KEY mode-0o600 docstring on _zenoh_config.py) span four sibling slices and only make sense as a cross-cutting integration pin once strands-labs#223, strands-labs#224, strands-labs#221 and strands-labs#225 have all merged. The natural home is PR-6 (strands-labs#225, mesh/core replay caches + override-resume + safety topic handlers) which is the integration slice that needs all the listed modules present anyway, or a follow-up issue queued behind PR-9 (strands-labs#226, docs). The audit-side pin (point 3, O_NOFOLLOW on the seq lockfile) is already covered on this PR by tests/mesh/test_audit_seq_symlink.py and tests/mesh/test_audit_psk_poison_record.py, so removing this file does not weaken regression coverage on the audit slice itself.
sundargthb
left a comment
There was a problem hiding this comment.
LGTM. Please address the lint issues.
The test_application_security.py file (358 LOC) and the TestF15WireCommandKeyRequired class in test_validate_command.py exercise modules that land in later PRs of the strands-labs#195 split: - bridge_transport.DEFAULT_BRIDGE_SUFFIXES with 'safety/resume' (bridge_transport additions land in PR-5 / strands-labs#222) - audit._seq_sidecar_path, audit._resolve_log_max_bytes, F2/F3 symlink protections (land in PR-4 / strands-labs#221) - core.Mesh._expected_responders, core.BROADCAST_RESPONDER, the wire-side _exec_cmd dict-shape gate (land in PR-6 / strands-labs#225) - robot_mesh._interrupt_approves (lands in PR-7 / strands-labs#227) PR-2 is scoped to the leaf module strands_robots/mesh/security.py with zero internal mesh deps. These tests were pre-staged here but need their consumers to land first. They will re-land in the appropriate consumer PR. Test impact: 9 failing tests removed (8 from test_application_security.py, the F15 class moved out). All 571 remaining mesh tests pass. No production code change; no behaviour change on validate_command or any security.py contract.
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
Introduces strands_robots/mesh/security.py (~797 LOC) as a self-contained payload-validation module: action allowlist + per-action bounds, host/CIDR allowlist for VLA targets, HuggingFace <org>/<repo> shape-and-prefix gate, and a composite host[:port] validator. Pairs the production module with 1,155 LOC of tests across 8 test files (the description says 7 files / 1,285 LOC, minor drift). The module is genuinely zero-dep on the rest of mesh/, which makes it a clean PR-2 of the 9-way split and lets reviewers verify the boundary in isolation. The R1 changelog and pin tests in test_security_review_r1.py are the kind of evidence-of-care that the AGENTS.md "Pin regression tests for reviewed fixes" rule asks for.
What's good
- Numeric coercion explicitly rejects NaN/inf via
math.isfinite, with a tightly-pinned regression test (test_validate_command_finite_numerics.py) that catches the IEEE-754 bounds bypass. The H2 pin (rejection class isValidationError, not bareValueError) is exactly the audit-shape contract_exec_cmdwill need. - HF
<org>/<repo>shape gate (exactly 2 non-empty segments + leading-/rejection) closes thenvidia/etc/passwd-shape hole with a dedicated test file. - Strict per-action key allowlist on the output dict (build-from-validated-only rather than
dict(cmd)overlay) is the correct defence-in-depth posture against future handlers that might do**cmd. lru_cache(maxsize=1)keyed on the raw env-var string +_clear_security_caches_for_testshelper threads a clean needle: WARNING fires once per distinct misconfig in production, deterministically per-test under monkeypatch.- DNS-resolution caveat is called out in the
is_safe_policy_hostdocstring (a not-uncommon thing reviewers find missing).
Concerns
ALLOWED_ACTIONS↔Mesh._dispatchdrift is acknowledged but unpinned. The R1 changelog defers the cross-check to PR-6 (#225). That is reasonable scoping, but means PR-6 ships with a hard dependency that is invisible from this PR's CI. Consider adding a stub test here (skipped ifcoreis unimportable) so the assertion lights up automatically the moment PR-6 lands; otherwise the sync invariant lives only in a code comment until someone remembers.policy_providerandpolicy_typeshare one allowlist.is_safe_policy_providerreads_policy_type_allowlist()(line 435) and the dispatch on line 695 directs operators to extendSTRANDS_MESH_POLICY_TYPE_ALLOWfor both. This is documented but couples two different security knobs: an operator who legitimately needs a new internalpolicy_type(e.g.my_runtime_v2) silently grants peers permission to use it as apolicy_providerin execute/start commands. Worth a separate env var or at least a more prominent docstring warning on the coupling.- HF allowlist defaults can override operator tightening.
_hf_repo_allowlist_cachedalways prepends("nvidia", "huggingface", "lerobot")(line 308). An operator who wants to restrict access to specificallynvidia/gr00t-n1.5cannot — the bare-orgnvidiafrom defaults always matches first. The asymmetry is a real footgun for tightening deployments and is undocumented in the env-var help. - Test count mismatch with PR description. PR description says "1,285 LOC across 7 files"; the diff shows 8 test files totalling 1,155 LOC. Minor.
LockoutErrordefined and exported with no consumer in this PR. Same shape as the R1 fix that removedMAX_TIMEOUT_S("will re-land in PR-6 alongside the consumer"). Calling out for consistency — either remove now or accept the precedent has shifted.
Verification suggestions
# Smoke-test the validator boundary with adversarial payloads
hatch run pytest tests/mesh/ -x -v -k "validate_command or hf_repo or allowlist or policy_host or finite"
# Verify the action allowlist matches the eventual dispatch table once PR-6 is checked out:
python -c "
from strands_robots.mesh.security import ALLOWED_ACTIONS
# (PR-6 only) from strands_robots.mesh.core import Mesh
# assert ALLOWED_ACTIONS == set(Mesh._dispatch_table().keys())
print(sorted(ALLOWED_ACTIONS))
"
# Confirm cache-clearing helper isolation:
hatch run pytest tests/mesh/test_allowlist_caches.py -v…ne 167) Per AGENTS.md Key Conventions strands-labs#10 ("No dead code"), LockoutError was defined and exported in __all__ but had no consumer in this PR scope. Same pattern as R1 removal of MAX_TIMEOUT_S. Will re-land in PR-6 alongside Mesh._dispatch (the consumer that raises it). Pin test: tests/mesh/test_security_no_dead_exports.py - test_lockout_error_not_exported: fails on pre-fix HEAD - test_lockout_error_class_not_defined: fails on pre-fix HEAD - test_all_exports_are_importable: guards __all__ integrity
🎯 Pentest evidence for this PR (#223 — payload validation)2026-05-26 pentest run Finding landing in this PR
Pinned regression test waiting for this fix
Posture confirmed by this PR
Mapping for all PRs: |
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
Introduces strands_robots/mesh/security.py (~785 LOC) as the payload-semantic security boundary for mesh commands: action allowlist + per-action bounds in validate_command, plus host/CIDR allowlist (is_safe_policy_host), HF/local-path gate (is_safe_model_path), policy-type/provider gate, and a composite is_safe_server_address. Self-contained module with no internal mesh imports; ~1,285 LOC of tests across seven files cover the action vocabulary, finite-float boundaries, peer_id charset, env-var allowlist parsing + LRU caching, and HF two-segment shape. PR-2 of a 9-PR split tracked in #219.
Overall the diff is well-scoped and the validator is conservative-by-default (loopback-only policy hosts, three built-in HF orgs, REQUIRED policy_provider with no silent mock default, math.isfinite checks on numeric fields, narrow charset regexes anchored with fullmatch). The R-round commits land each review concern with a paired pin test, which matches AGENTS.md > Review Learnings (#85) > "Pin regression tests for reviewed fixes".
What's good
- Defence-in-depth posture is consistent. Charset regexes are anchored (
re.fullmatch+^...$), control-byte/whitespace/NUL classes excluded, traversal guarded explicitly. - Numeric coercion handles IEEE-754 footguns.
_coerce_floatand_coerce_intwrapfloat()/int()and explicitly reject NaN / inf viamath.isfinite;_coerce_intalso pre-checksisfinitebeforeint(value)to prevent the bareValueErrorfromint(nan)bypassing the_exec_cmdaudit path. The pin test attest_validate_command_finite_numerics.py::test_step_count_rejects_nan_via_validation_error_not_value_erroris exactly the right shape. - Per-key allowlist on the output dict (lines 612-619) defends future
**cmdconsumers from picking up unvalidated attacker-controlled keys. Good defence-in-depth. - Required
policy_providerwith no silentmockfallback closes the "forgot the field looked the same as explicit mock" footgun called out in R1. - Cache + WARNING discipline. Each
_*_allowlistparser islru_cache(maxsize=1)keyed on the raw env-var string, so the malformed-entry WARNING fires once per distinct misconfig value rather than once per inbound command._clear_security_caches_for_testsis the right escape hatch for fixtures. - Test pins for every R-round fix.
test_security_no_dead_exports.pyassertsLockoutErroris neither defined nor exported (R3),test_security_review_r1.pypins all four R1 fixes including a structural source-text check on the casing-contract comment.
Concerns
- Five tracked follow-ups in #239 (buckets A-E) cover real validator gaps, including the
policy_hostwhitespace/control-character preservation that this review's #1 inline flags as the highest-impact remaining concern. The R4 deferral rationale ("decision deferred until consumer-side literal-equality call sites are visible in PR-6") is defensible for the normalisation question, but rejecting CRLF / NUL / control-character payloads at the validator is independent of how downstream normalises and should not wait. Same shape applies to bucket E (HF shape-gate acceptsnvidia//repoandnvidia/.): the tightening is one line and has reproducer tests pre-staged. turn_id/sender_idpassthrough at line 619 is unvalidated. Inline-flagged. This contradicts the defence-in-depth rationale immediately above it.resume.override_codeat line 765 accepts control characters and NUL bytes. Inline-flagged. The comment correctly identifies that the string flows intoMesh._resume_lockoutand.strip(), but a charset gate matching the peer_id treatment costs one regex.- No CHANGELOG entry, but the module is internal and not yet imported by any consumer in this PR's scope, so this is more of an FYI for the PR-6 / PR-7 review than a blocker here.
- (Style) Several class/function docstrings reference cross-PR symbols (
Mesh._dispatch,_exec_cmd) that don't exist in this PR's diff. Acknowledged in R1 as deferred to PR-6, but a# TODO(PR-6 #225):marker at each forward-reference would help reviewers grep for the lift later.
Verification suggestions
# Smoke-test the CRLF-injection concern raised in inline #1:
python -c "
from strands_robots.mesh import security as sec
out = sec.validate_command({
'action': 'execute', 'instruction': 'go',
'policy_host': 'localhost\r\n',
'policy_provider': 'mock',
})
print(repr(out['policy_host'])) # currently 'localhost\\r\\n'
"
# Verify turn_id passthrough accepts non-string types (inline #2):
python -c "
from strands_robots.mesh import security as sec
out = sec.validate_command({'action': 'status', 'sender_id': {'evil': 'dict'}})
print(repr(out.get('sender_id'))) # currently {'evil': 'dict'}
"
# HF shape-gate degenerate inputs (inline #3, #239 bucket E):
python -c "
from strands_robots.mesh.security import is_safe_model_path
print(is_safe_model_path('nvidia//repo', hf_only=True)) # currently True
print(is_safe_model_path('nvidia/.', hf_only=True)) # currently True
"| # for loopback fast-paths MUST do their own normalisation -- the | ||
| # validator gates membership, not canonical form. Same contract for | ||
| # ``server_address`` below. | ||
| out["policy_host"] = policy_host |
There was a problem hiding this comment.
policy_host preserved verbatim is a wire-injection vector, not just a forensics trade-off.
is_safe_policy_host strips and lowercases for the allowlist match (line 261), so "localhost\r\n", "localhost\t ", " localhost " all pass — but out["policy_host"] then preserves the original string verbatim. Reproducer:
out = validate_command({
'action': 'execute', 'instruction': 'go',
'policy_host': 'localhost\r\n',
'policy_provider': 'mock',
})
assert out['policy_host'] == 'localhost\r\n' # holds todayThe "preserve for forensics" rationale (lines 634-639) is reasonable for case (LocalHost vs localhost) and benign whitespace, but CRLF / NUL / control characters in a host string are never legitimate input — they only arise from malicious payload construction or buggy callers. Any downstream consumer that builds an HTTP/gRPC URL by string-concatenation (f"http://{policy_host}:{port}") inherits a CRLF-header-injection primitive.
The deferral note in #239 bucket A frames this as a normalisation question (validator-side reject vs. dual-emit policy_host_raw), but rejecting control characters at the validator is independent of how downstream chooses to canonicalise — even the "preserve forensics" path can't justify wire-injection-shaped bytes reaching the audit log.
Suggest adding a one-line charset gate at the validator (mirroring _PEER_ID_RE discipline; allow [A-Za-z0-9.:_/-]+ plus brackets for IPv6):
if not _SAFE_HOST_INPUT_RE.fullmatch(str(policy_host)):
raise ValidationError(f"policy_host={policy_host!r} contains control or whitespace characters.")Same contract for server_address at line 700. Per AGENTS.md > Review Learnings (#92) > "Validate before subprocess interpolation… Reject shell metacharacters in paths… \n, \r, \x00."
| # correlation working. Anything else gets dropped silently. | ||
| for passthrough in ("turn_id", "sender_id"): | ||
| if passthrough in cmd: | ||
| out[passthrough] = cmd[passthrough] |
There was a problem hiding this comment.
Unvalidated passthrough of turn_id and sender_id defeats the defence-in-depth rationale 7 lines above.
The comment block at lines 602-611 explicitly explains that out is built from the validated subset only "so any future action handler that did **cmd or pulled a not-yet-validated key would silently pick up an attacker-controlled value." Then this loop accepts turn_id and sender_id as-is — including non-string types and arbitrary control bytes:
out = validate_command({
'action': 'status',
'turn_id': '\x00\nINJECT',
'sender_id': {'evil': 'dict'}, # not even a string
})
# turn_id == '\x00\nINJECT', sender_id == {'evil': 'dict'}These fields flow into _on_cmd / _on_response correlation and very likely into log records. At minimum, both should be type-checked (isinstance(value, str)) and length-bounded. If they're meant to be ULID/UUID-shaped, an explicit charset regex is the right fix; if they really are operator-supplied free-form, then _PEER_ID_RE is a reasonable lower bound. As written, the per-key allowlist immediately above is doing only half its job.
| # Match parts excluding the trailing-slash artefact (split | ||
| # produces an empty trailing segment for "org/"). | ||
| non_empty_parts = [seg for seg in parts if seg] | ||
| if len(non_empty_parts) != 2: |
There was a problem hiding this comment.
Two-segment shape gate accepts degenerate inputs.
The len(non_empty_parts) != 2 check is a real defence (and the test file test_hf_repo_two_segment_shape.py is a great pin), but the empty-segment filter creates a gap:
is_safe_model_path('nvidia//repo', hf_only=True) # True today
is_safe_model_path('nvidia/.', hf_only=True) # True today ('.', not '..')
is_safe_model_path('nvidia/./repo', hf_only=True) # False (3 non-empty segments)Acknowledged in #239 bucket E with a one-line tightening already drafted:
if any(seg in ('', '.', '..') for seg in parts):
return FalseGiven the existing test file already covers nvidia/foo/bar and traversal, folding the additional cases (nvidia//repo, nvidia/., nvidia/./repo) in here costs ~3 lines of code + ~3 parametrised cases and removes a deferred-follow-up. Worth landing in this PR rather than waiting on #239 — the validator's job is to enforce the contract at the wire, and "HF will 404 on these" is exactly the rely-on-downstream-rejection posture R1's review explicitly rejected for etc/passwd-style inputs.
| raise ValidationError("resume.override_code must be a string") | ||
| if len(override_code) > 256: | ||
| raise ValidationError("resume.override_code too long (>256 chars)") | ||
| out["override_code"] = override_code |
There was a problem hiding this comment.
resume.override_code accepts NUL / control characters / CRLF.
The field is bounded only by type and length:
out = validate_command({'action': 'resume', 'override_code': '\x00\x01\x02secret\r\nINJECT'})
assert out['override_code'] == '\x00\x01\x02secret\r\nINJECT' # holds todayThe comment correctly notes that the value flows into Mesh._resume_lockout and .strip(). Two concerns:
- The override-code is operator-supplied and certainly ends up in audit log lines /
command_rejectedrecords. Control bytes and CRLF in log records are an own-foot CRLF-injection primitive on log collectors that don't escape — same family as thepolicy_hostconcern. - This PR's other secret-shaped fields (
source_peer_id,device_name) get a charset+length pair via_PEER_ID_RE+MAX_PEER_ID_LEN. The asymmetry is hard to justify:override_codeis at least as security-sensitive as a peer_id.
One extra regex (printable-ASCII only, no whitespace, no controls) closes this; the existing 256-char cap stays.
| #: ``STRANDS_MESH_POLICY_HOST_ALLOW``. Rejects shell metacharacters, | ||
| #: whitespace, NUL bytes, and any byte outside the printable ASCII | ||
| #: subset typical of DNS labels and CIDR ranges. | ||
| _POLICY_HOST_ENTRY_RE = re.compile(r"^[A-Za-z0-9.:/_\-\[\]]+$") |
There was a problem hiding this comment.
Charset regex admits brackets that downstream is_safe_policy_host cannot match.
The regex allows [ ] (line 174) so an operator can paste STRANDS_MESH_POLICY_HOST_ALLOW="[::1]". The entry survives _validate_env_allowlist_entries, lands in the allowlist, but is_safe_policy_host does literal lowercased compare against host.strip().lower() and ipaddress.ip_address(host_lc) — never against the bracketed form. So [::1] in the env is dead chars (the inbound policy_host would have to also literally be "[::1]" to match, which is not how anyone writes IPv6 in a host field).
Acknowledged in #239 bucket D-1. Two clean fixes (either is fine):
- Drop
\[\]from_POLICY_HOST_ENTRY_REso the entry triggers the malformed-entry WARNING and the operator gets a fail-loud signal. That matches the "fail-loud-on-misconfig" posture documented in_validate_env_allowlist_entries's docstring. - Strip brackets in
_policy_host_allowlist_cached(entry.strip('[]')) so[::1]works.
Option 1 is more consistent with the surrounding code's posture; option 2 is more user-friendly. Either lands in <5 lines.
… gate
Addresses R3 review feedback on the hf_only path-shape gate. The prior
non_empty_parts length-2 check accepted three degenerate inputs that
the regex / traversal gates above did not catch:
is_safe_model_path('nvidia//repo', hf_only=True) # was True
is_safe_model_path('nvidia/.', hf_only=True) # was True
is_safe_model_path('nvidia/repo/', hf_only=True) # was True
HuggingFace would 404 on resolution, but the validator's job is to
enforce the <org>/<repo> wire contract at the boundary -- relying on
downstream rejection is the same anti-pattern that R1 rejected for
the 'nvidia/etc/passwd' case. The reviewer's suggested one-line fix
in the hf_only branch closes all three:
if any(seg in ('', '.') for seg in parts):
return False
Scoped to hf_only because legitimate local relative paths like
'./local/checkpoint' MUST keep passing under hf_only=False (existing
test_hf_repo_two_segment_shape.py::test_local_path_branch_unchanged
covers this; broadening the reject would have regressed lerobot_local
consumers that point model_path at on-disk dirs).
Pin: tests/mesh/test_hf_repo_degenerate_segments.py (18 tests across
4 categories). Pre-fix verification per AGENTS.md S0 round-trip:
pre-fix HEAD f7fd065: 7 failed, 11 passed
post-fix HEAD this: 18 passed
The 11 that pre-pass are the canonical-shape regression guards and
the local-path-with-dot-segment guards (verifying the fix stays
scoped to the hf_only branch).
Verification:
tests/mesh/test_hf_repo_degenerate_segments.py 18 passed
tests/mesh/test_hf_repo_two_segment_shape.py 16 passed
tests/mesh/test_validate_command.py + sib pins 118 passed
ruff check + format --check clean
…s threads line 640, 619, 765, 174) Adds _SAFE_PASSTHROUGH_RE ([\\x20-\\x7E]+) control-character rejection to all string fields that previously passed through unvalidated: 1. policy_host: rejects CRLF/NUL/C0 after allowlist membership passes (closes wire-injection vector where "localhost\\r\\n" passed membership via strip+lower but stored injection-shaped bytes in out). 2. server_address: same control-char gate. 3. turn_id / sender_id: type-check (must be str), length-bound (MAX_PASSTHROUGH_LEN=128), and charset-validate. Previously accepted dicts, lists, and control-byte strings without any gate. 4. resume.override_code: printable-ASCII-only charset gate. 5. _POLICY_HOST_ENTRY_RE: drop dead bracket chars (\\[\\]) that the allowlist matching code could never match against. Per AGENTS.md > Review Learnings (strands-labs#92): "Reject shell metacharacters in paths... \\n, \\r, \\x00" and "Validate before subprocess interpolation." Pin: tests/mesh/test_security_control_char_gates.py (25 tests across 5 classes). Updated tests/mesh/test_security_review_r1.py to reflect the comment change from "Preserve caller casing" to "Gate control characters."
Part 2 / 9 of the split of #195 — tracked by #219.
The application-layer security boundary that runs after a peer has cleared mTLS + ACL but whose payload contents we still need to bound. Without this, an authorised operator could steer a robot at an attacker-controlled inference server, request a 24-hour
executeaction, or drive the robot to download an arbitrary HuggingFace model.What's in this PR
New module:
strands_robots/mesh/security.py(~830 LOC) — standalone, zero internal mesh deps.Reviewer focus
_exec_cmdpaths incore.py(PR-6 carries the wire-side counterpart in_exec_cmdnon-dict rejection).^...$).math.isfinite).Tests
~1,600 LOC across 9 files: action allowlist coverage, finite-float boundaries, peer_id charset/length, allowlist env-var parsing + LRU cache invalidation, policy host CIDR matching, HF two-segment shape enforcement, HF degenerate-segment rejection (R5), and control-character charset gates (R6).
Stacking note
No internal mesh deps — PR-2 is self-contained. PR-6 (core), PR-7 (tools), and PR-8 (iot) will later import from this module.
§13 R-round changelog
acda084MAX_TIMEOUT_S, docstring drift,MAX_SERVER_ADDRESS_LENsemantic owner, casing-contract docstests/mesh/test_security_review_r1.pybc89c0ff7fd065LockoutErrorclass + exporttests/mesh/test_security_no_dead_exports.py04849bftests/mesh/test_hf_repo_degenerate_segments.pyc058814tests/mesh/test_security_control_char_gates.pyR6 (2026-05-27 ~05:4X UTC) —
c058814Addresses the five R5 review threads (2026-05-27T03:38:41Z) that flagged control-character/injection vectors in passthrough string fields. All five are the same semantic concern: string fields stored in
outwithout charset validation can carry CRLF/NUL/C0 control bytes into audit logs and downstream string ops.policy_hostCRLF injection_SAFE_PASSTHROUGH_REgate rejects[\x00-\x1F\x7F]after allowlist membership passesturn_id/sender_idunvalidated passthroughmust be str), length-bound (MAX_PASSTHROUGH_LEN=128), charset-validate04849bf)resume.override_codecontrol chars_SAFE_PASSTHROUGH_REregex_POLICY_HOST_ENTRY_REdead bracket chars\[\]from regex[::1]now triggers WARNING via fail-loud misconfig postureNew constant:
_SAFE_PASSTHROUGH_RE = re.compile(r"^[\x20-\x7E]+$")— printable ASCII only (0x20 space through 0x7E tilde). Shared by all five gates.New constant:
MAX_PASSTHROUGH_LEN: int = 128— boundsturn_idandsender_id(ULID/UUID-shaped correlation tokens).server_addressalso gets the same control-char gate (same posture aspolicy_host).Pin test:
tests/mesh/test_security_control_char_gates.py(25 tests across 5 classes).Verification:
#239 bucket status after R6:
policy_host_raw) remains deferred to PR-6 where consumer-side literal-equality call sites are visible.Round-budget accounting: R1 (substantive), R2 (test cleanup), R3 (dead code), R5 (HF shape gate), R6 (charset gates) = 4 substantive code rounds. Acknowledged exceeding the AGENTS.md 3-round cap; justified because R6 addresses a genuine wire-injection vector flagged as highest-priority by the reviewer. No further code rounds on PR-2.
Landing order: PR-1 → PR-2 (parallel with PR-3/4/5) → PR-6 → PR-7 → PR-8 → PR-9. Tracking: #219.