docs(mesh): README env-var matrix + CHANGELOG entry (9/9 of #195 split)#226
docs(mesh): README env-var matrix + CHANGELOG entry (9/9 of #195 split)#226cagataycali wants to merge 6 commits into
Conversation
Final PR in the split of strands-labs#195. Lands last because the env-var matrix and CHANGELOG entry can only be accurate once every code PR has landed and the final env-var surface is known. - README.md (+190/-1) -- mesh security configuration matrix: STRANDS_MESH_TLS_{CA,CERT,KEY}, STRANDS_MESH_ACL_FILE, STRANDS_MESH_NAMESPACE, STRANDS_MESH_MULTICAST, STRANDS_MESH_CMD_RATE_HZ, STRANDS_MESH_MAX_{CMD,CAMERA,SAFETY}_BYTES, STRANDS_MESH_MAX_SESSIONS, STRANDS_MESH_SAFETY_RATE_HZ, STRANDS_MESH_AUDIT_PSK, STRANDS_MESH_AUTH_MODE, STRANDS_MESH_I_KNOW_THIS_IS_INSECURE, STRANDS_MESH_RESUME_*, STRANDS_MESH_BRIDGE_TOPICS{,_PREFIX,_DEDUP_STRICT}, STRANDS_MESH_POLICY_HOST_ALLOW, STRANDS_MESH_HF_REPO_ALLOW, STRANDS_MESH_POLICY_TYPE_ALLOW, STRANDS_MESH_CAMERA_OFFLOAD_DISABLE. - CHANGELOG.md (+219) -- consolidated entry covering all 8 code PRs. Tracking: strands-labs#219 Source PR: strands-labs#195 Lands after: every other PR in the split
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
Docs-only PR (CHANGELOG + README) capturing the mesh security hardening landed across PRs 2-8 of the #195 split. The matrix is dense and well-organised; the threat-vector table is the kind of thing operators actually need. As the PR description acknowledges, this is the last slice and depends on the other 8 to be in main before the documented surface is real.
What's good
- Env-var matrix covers nearly every new var in the dependent branches (
AUTH_MODE,TLS_*,ACL_FILE,MAX_*,RATE_HZ,RESUME_*,CA_PINS, etc.). Cross-referenced against the source onmesh/zenoh-native-security— the new-var coverage matches the code surface. - Migration block (CHANGELOG L180-217) is concrete: enumerates removed env vars and shows the new
policy_providerrequirement with before/after code. - Defaults in the table match the source (e.g.
MAX_SESSIONS=256,CMD_RATE_HZ=20.0,RESUME_FRESHNESS_S=60) — verified againstmesh/_zenoh_config.pyandmesh/core.pyon the author's full branch. - Captures the four Zenoh 1.x quirks empirically (CHANGELOG L57-84) — useful institutional memory.
Concerns
- PR is not actually marked Draft. The description says "Drafted until every other PR in the split lands" but the GitHub state is
draft: false. None of the dependent PRs (#220-#225, #227, #228) are merged yet. Either flip to draft or update the description so reviewers don't assume this is mergeable in isolation. - Pre-existing env vars omitted from the matrix. The README is intended as a comprehensive matrix but a few mesh env vars currently in
mainaren't listed:STRANDS_MESH_BACKEND(used inmesh/core.py,mesh/session.py,mesh/transport/factory.py),STRANDS_MESH_CAMERA_S3_BUCKET,STRANDS_MESH_CAMERA_S3_PREFIX, plus the per-sensor frequency tunables (STRANDS_MESH_CAMERA_HZ,_HAND_HZ,_HEALTH_HZ,_IMU_HZ,_LIDAR_SUMMARY_HZ,_MAP_INFO_HZ,_ODOM_HZ,_POSE_HZ). Out of scope for the security hardening PR per se, but worth a follow-up issue if the matrix is meant to be exhaustive. STRANDS_MESH_BRIDGE_TOPICSis referenced inside another row's description (L506) but has no row of its own — readers can't look up what it does or its default.- Inconsistent log-level claim for
auth_mode=none. Three places, two different levels: README L508 and CHANGELOG L139 say ERROR; CHANGELOG L222 says WARNING. Pick one (the source onmesh/_zenoh_config.pyactually emits ERROR, so CHANGELOG L222 is the wrong one). - "839 mesh tests pass" / specific test counts (CHANGELOG L167-178) will become stale fast and there is no CI gate enforcing the number. Consider switching to "the mesh test suite (
tests/mesh/) passes; see CI for the live count." - No CI gate for env-var matrix drift. Right now nothing notices when a new
STRANDS_MESH_*lands without a README row. A simple test that greps the source and the README and asserts the README is a superset would prevent the matrix from going stale on the next PR. Worth a follow-up issue (tests/test_envvar_matrix_completeness.py).
Verification suggestions
After PRs 2-8 merge into main and this PR is rebased, before the human re-reviews:
# Every STRANDS_MESH_* in source has a row in the README:
grep -hoE 'STRANDS_MESH_[A-Z0-9_]+' strands_robots/**/*.py | sort -u > /tmp/source_vars
grep -oE 'STRANDS_MESH_[A-Z0-9_]+' README.md | sort -u > /tmp/doc_vars
comm -23 /tmp/source_vars /tmp/doc_vars # should be empty
# Defaults in the README match the source:
grep -E 'STRANDS_MESH_(MAX_SESSIONS|CMD_RATE_HZ|RESUME_FRESHNESS_S|MAX_CMD_BYTES|MAX_CAMERA_BYTES)' \
strands_robots/mesh/_zenoh_config.py strands_robots/mesh/core.py
# Markdown table renders cleanly:
npx markdownlint README.md CHANGELOG.md| |---|---| | ||
| | LAN outsider, no cert | **Mitigated.** TLS handshake rejects the connection. | | ||
| | Cert from a different CA | **Mitigated.** Our CA bundle does not verify it; handshake fails. | | ||
| | Valid cert, CN does not match any ACL rule | **Mitigated** when `STRANDS_MESH_ACL_FILE` is set with literal CN allowlists. With the built-in default ACL (`any_authenticated_peer`, intentionally permissive to avoid silent fleet outage on first run), any CA-signed peer is admitted to all key-expressions. Operators are expected to ship a literal-CN ACL file in production. | |
There was a problem hiding this comment.
Threat-vector verdict misleading for the default state. This row (and L697) reads as "Mitigated" but the body of the cell admits the default ACL is permissive — any_authenticated_peer admits any CA-signed peer to all key-expressions until the operator ships a literal-CN STRANDS_MESH_ACL_FILE. An operator scanning the table sees "Mitigated" and assumes a default install is safe against rogue-CN peers; it isn't.
Suggest splitting the verdict so the default state is honest:
| Valid cert, CN does not match any ACL rule | **Mitigated** when `STRANDS_MESH_ACL_FILE` is set with literal CN allowlists. **Not mitigated by default** — the built-in permissive ACL admits any CA-signed peer to every key-expression. |
Apply the same wording fix to L697. Note that STRANDS_MESH_ACCEPT_PERMISSIVE_ACL (CHANGELOG L141) requires explicit opt-in to start under the permissive default — that gate is worth surfacing in this row as the actual default-state safety net.
| --- | ||
|
|
||
| Dev / lab environments without PKI run `STRANDS_MESH_AUTH_MODE=none` AND `STRANDS_MESH_I_KNOW_THIS_IS_INSECURE=1` | ||
| to keep plain-TCP behaviour; the mesh logs a WARNING at session |
There was a problem hiding this comment.
Inconsistent log level for auth_mode=none. This line says "the mesh logs a WARNING at session open", but README L508 ("Logs ERROR (not WARNING) on every session open") and the CHANGELOG matrix L139 ("ERROR-level log at every session open") both say ERROR — and STRANDS_MESH_I_KNOW_THIS_IS_INSECURE is explicitly designed to be loud. Source on the author's full branch (mesh/_zenoh_config.py) emits ERROR. This line should say ERROR for consistency.
|
|
||
| Bridge-transport deployments still need cross-transport deduplication | ||
| because the same Zenoh + IoT MQTT topic can deliver the same payload | ||
| twice. `mesh.transport.bridge_transport._CommandDeduplicator` |
There was a problem hiding this comment.
Underscore-prefixed class referenced in user-facing docs. _CommandDeduplicator is a private name (leading underscore) and AGENTS.md > Review Learnings (PR #86) > "Public API Hygiene" rules out recommending _methods in user-facing docstrings or error messages because it locks in a private dependency contract.
Either (a) drop the class name and describe the behaviour without it, or (b) promote it to a public name (CommandDeduplicator) before merge so the doc reference is on the public API surface.
| | `STRANDS_MESH_RESUME_REPLAY_CACHE_MAX` | `4096` | Maximum entries in the per-receiver resume proof_nonce replay cache (LRU eviction) | | ||
|
|
||
| All three resume-hardening env vars are parsed via | ||
| ``_parse_positive_float_env`` / ``_parse_positive_int_env`` helpers in |
There was a problem hiding this comment.
Underscore-prefixed helpers in user-facing CHANGELOG. Same AGENTS.md (PR #86) hygiene rule as on README L716: _parse_positive_float_env / _parse_positive_int_env are private. Either rephrase without naming them ("parsed via internal helpers in strands_robots.mesh.core") or promote them to public names if operators are expected to know about them.
| | `STRANDS_MESH_AUDIT_DIR` | Directory for the safety audit log (`mesh_audit.jsonl`) and sequence-counter sidecar (`mesh_audit.seq.json`) | `~/.strands_robots/` | | ||
| | `STRANDS_MESH_AUDIT_MAX_BYTES` | Maximum size (bytes) of the active audit log before rotation. Hard-capped at 10 GiB. Phase-4 / E1: prevents an attacker who can publish safety events from filling the disk. | `104857600` (100 MiB) | | ||
| | `STRANDS_MESH_AUDIT_MAX_FILES` | Maximum number of rotated audit log copies kept (`mesh_audit.jsonl.1` … `.N`). Hard-capped at 100. Older rotations are discarded. | `5` | | ||
| | `STRANDS_MESH_BRIDGE_TOPICS_PREFIX` | Comma-separated list of bridge filter entries that match as path-prefix (entry matches `entry/<anything>`). Default: `response` (so `response/<turn-id>` bridges). All other entries in `STRANDS_MESH_BRIDGE_TOPICS` match exactly — Phase-4 / A2 hardening that closes the prefix-bypass attack. | `response` | |
There was a problem hiding this comment.
STRANDS_MESH_BRIDGE_TOPICS referenced but has no row in the matrix. This row's description says "All other entries in STRANDS_MESH_BRIDGE_TOPICS match exactly" but STRANDS_MESH_BRIDGE_TOPICS itself isn't a documented row, so a reader can't look up its semantics, format, or default. Add a row for it (it's a real env var read by mesh.transport.bridge_transport) above this _PREFIX row.
…L state Two rows in the threat-vector matrix (rogue-CN and unknown-CN) read as "Mitigated" but the body of each cell admits the default ACL is permissive. An operator scanning the table sees "Mitigated" and assumes a default install is safe against rogue-CN peers; it isn't unless STRANDS_MESH_ACL_FILE is shipped or STRANDS_MESH_ACCEPT_PERMISSIVE_ACL is the explicit opt-in. Reword each row with a leading "**Mitigated** when ..." / "**Not mitigated by default** ..." split and surface STRANDS_MESH_ACCEPT_PERMISSIVE_ACL as the actual default-state safety net (already documented in CHANGELOG L141). Addresses review feedback on README L692 / L697.
The CHANGELOG line described auth_mode=none as logging a WARNING at session open, but README L508 and the env-var matrix at CHANGELOG L139 both correctly say ERROR, and STRANDS_MESH_I_KNOW_THIS_IS_INSECURE is explicitly designed to be loud (the actual emit at session open is at ERROR level). Single-source the level so an operator cross-checking the three docs gets a consistent answer. Addresses review feedback on CHANGELOG L222.
…duplicator Per the public-API hygiene rule (don't recommend underscore-prefixed names in user-facing docs because it locks the project into a private dependency contract), describe the dedup behaviour at the module level instead of pointing at the private class name. The behaviour the operator actually cares about — fingerprint on the (sender_id, turn_id, command) tuple, dispatch once, forward to the application-layer handler — is unchanged. If a future PR promotes the class to a public name we can re-cite it by symbol; until then, the module-level reference is the correct level of granularity for an operator-facing README. Addresses review feedback on README L716.
…lpers Same public-API hygiene rule as on README L716: don't reference underscore-prefixed helpers (_parse_positive_float_env / _parse_positive_int_env) in user-facing CHANGELOG text because it locks the project into a private contract. Replace with a behavioural description at the module level — operators care that malformed values log a WARNING and fall back to the documented default, not which private helper does the parsing. Pin reference is unchanged. Addresses review feedback on CHANGELOG L159.
The neighbouring _PREFIX row references STRANDS_MESH_BRIDGE_TOPICS in
its description ("All other entries in STRANDS_MESH_BRIDGE_TOPICS
match exactly") but the parent var has no row of its own, so a reader
trying to look up its semantics, format, or default has nowhere to land.
Add a row above the _PREFIX entry documenting:
- The format (comma-separated suffix allowlist).
- The consumer (mesh.transport.bridge_transport).
- The default suffix set verbatim (presence, health, safety/event,
safety/estop, cmd, response, broadcast — verified against
DEFAULT_BRIDGE_SUFFIXES in bridge_transport.py).
- The not-bridged-by-default list (state, pose, imu, odom, lidar,
camera, input, hand) so operators understand the cost-math choice
before they extend the var.
- The match-semantics interaction with _PREFIX.
Addresses review feedback on README L506.
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
Docs-only PR that backfills the README env-var matrix and a consolidated CHANGELOG entry covering the 8 mesh-security PRs (#195 split). +411 / -1 across README.md and CHANGELOG.md. Lands last per the stack plan; on upstream/main the referenced code surface (mesh/security.py, mesh/_zenoh_config.py, mesh/_acl_config.py, examples/mesh_acl_example.json5, every new tests/mesh/test_*.py, every new STRANDS_MESH_* security env var) does not yet exist, so the docs are necessarily aspirational at the moment of review.
What's good
- Scope is tight: two doc files, no code churn, no test churn, no host paths added.
- The 5-commit R1 round addresses real review threads on the same PR (default-ACL verdict honesty, log-level alignment with the underlying emitter, dropping
_CommandDeduplicatorfrom user prose, replacing_parse_positive_*with behavioural prose, adding the missingSTRANDS_MESH_BRIDGE_TOPICSrow). Each commit cites the line and the fix is surgical. - No emojis added by this PR (verified U+1F000–U+1FFFF + U+FE0F sweep over
+lines). Pre-existing emojis in the README are out of scope. - The threat-vector matrix split into
Mitigated/Not mitigated by defaultrows is exactly the right honesty for an operator scanning the table.
Concerns
- Docs land before the code they document. The PR description acknowledges this ("Drafted until every other PR in the split lands") but the PR is in
draft: false. Either re-flag as draft until 1–8 land, or add a banner row at the top of the new "Unreleased – mesh security hardening" CHANGELOG section pointing to the umbrella issue (#219) so a reader landing on the CHANGELOG viagit blameknows the symbols won't resolve in the working tree at this commit. The current entry reads as if the work is complete and merged. - CHANGELOG "New env vars" table is not actually complete. Reviewer focus #1 is "matrix completeness vs the actual code surface". The README env-var matrix (L503–534) lists ~27 mesh security vars; the CHANGELOG "New env vars" table at L136–155 lists 16. Vars present in README but missing from the CHANGELOG table:
STRANDS_MESH_BRIDGE_TOPICS,STRANDS_MESH_BRIDGE_TOPICS_PREFIX,STRANDS_MESH_AUDIT_MAX_BYTES,STRANDS_MESH_AUDIT_MAX_FILES,STRANDS_MESH_DEDUP_TTL,STRANDS_MESH_CAMERA_PRESIGN_TTL,STRANDS_MESH_POLICY_HOST_ALLOW,STRANDS_MESH_AUDIT_PSK,STRANDS_MESH_OVERRIDE_CODE,STRANDS_MESH_HF_REPO_ALLOW,STRANDS_MESH_POLICY_TYPE_ALLOW,STRANDS_MESH_CA_PINS,STRANDS_MESH_DISABLE_CA_PIN. Several of these are mentioned in the CHANGELOG narrative but only the ones in the table will be picked up by anyone grep'ing the CHANGELOG for an env-var name. - R1 private-symbol sweep was incomplete. Commit
9893db7correctly dropped_CommandDeduplicatorfrom user-facing prose, but several other_-prefixed symbols still appear in the README and CHANGELOG (see inline). AGENTS.mdReview Learnings (#86) > Public API Hygieneis explicit: "Never recommend a_methodin user-facing docstrings or error messages." - Hardcoded README line numbers in CHANGELOG. CHANGELOG L222–223 reference
README L508andmatrix at L139. Any future edit above those points silently breaks the cross-reference and the CHANGELOG turns into archeology bait.
Verification suggestions
# Cross-check the CHANGELOG matrix against the README matrix:
comm -23 \
<(grep -oE 'STRANDS_MESH_[A-Z_]+' README.md | sort -u) \
<(awk '/^### New env vars/,/^### Tests/' CHANGELOG.md | grep -oE 'STRANDS_MESH_[A-Z_]+' | sort -u)
# Anything emitted is a var documented in the README but missing from
# the CHANGELOG "New env vars" table.
# Find any other private symbols that slipped into user-facing docs:
grep -nE '`[A-Za-z_.]*\._[a-z_]+' README.md CHANGELOG.md
# When the underlying code lands, sanity-check every env var in the
# README matrix actually has an os.getenv call somewhere:
for v in $(grep -oE 'STRANDS_MESH_[A-Z_]+' README.md | sort -u); do
rg -q "\"$v\"" strands_robots/ || echo "undocumented in code: $v"
done| | `STRANDS_MESH_RESUME_REPLAY_CACHE_MAX` | Maximum number of `proof_nonce` values remembered in the per-receiver replay cache. Bounded LRU eviction prevents memory exhaustion from high-volume resume attempts. | `4096` | | ||
| | `STRANDS_MESH_DEDUP_TTL` | Bridge-transport deduplication window (seconds). Caps how long the same envelope nonce is remembered across the Zenoh + IoT subscriber wrappers. | `120` | | ||
| | `STRANDS_MESH_CAMERA_PRESIGN_TTL` | Lifetime (seconds) of presigned S3 GET URLs published in `/camera/.../ref` messages. Capped at 3600. | `60` | | ||
| | `STRANDS_MESH_CAMERA_DISABLED` | Set to `true` to disable camera publishing entirely (privacy kill switch). `Mesh._publish_cameras_once` short-circuits before any frame is built. | `false` | |
There was a problem hiding this comment.
Private method (_publish_cameras_once) named in a user-facing env-var description. Per AGENTS.md > Review Learnings (#86) > Public API Hygiene ("Never recommend a _method in user-facing docstrings or error messages"). The R1 round already pulled _CommandDeduplicator out via commit 9893db7; that sweep missed several siblings:
README.md:530andCHANGELOG.md:152—Mesh._publish_cameras_onceREADME.md:697andCHANGELOG.md:94—Mesh._on_responseREADME.md:621—mesh._acl_config.default_acl()README.md:724-725—mesh/_zenoh_config.py,mesh/_acl_config.py(modules whose names start with_are by-convention private; the public surface should bemesh.security/mesh.configor whatever the eventual public re-export is)CHANGELOG.md:97—_resume_lockout
Suggest the same treatment as _CommandDeduplicator: describe behaviour at module level (e.g. "the camera publisher short-circuits before any frame is built"), cite the public module path, and let git blame carry the implementation pin. Otherwise the very next refactor that renames any of these private symbols silently invalidates user-facing docs.
| | `STRANDS_MESH_CAMERA_DISABLED` | `false` | Privacy kill switch -- when `true`, `Mesh._publish_cameras_once` short-circuits before any frame is built | | ||
| | `STRANDS_MESH_FILTER_INTERFACES` | unset (wildcard) | Optional comma-separated NIC allowlist for the `low_pass_filter` rules. Unset means "every link" (Zenoh's `SubjectProperty::Wildcard`). | | ||
| | `STRANDS_MESH_RESUME_FRESHNESS_S` | `60` | Maximum age (seconds) of a resume envelope before rejection as stale | | ||
| | `STRANDS_MESH_RESUME_FORWARD_SKEW_S` | `5` | Maximum forward clock skew (seconds) tolerated in resume envelope timestamps | |
There was a problem hiding this comment.
Reviewer focus #1 ("env-var matrix completeness vs the actual code surface") looks not-yet-met: the CHANGELOG "New env vars" table here lists ~16 vars, but the README matrix (L503–534) documents ~27 mesh security vars. Missing from this table but present in the README:
STRANDS_MESH_BRIDGE_TOPICS
STRANDS_MESH_BRIDGE_TOPICS_PREFIX
STRANDS_MESH_AUDIT_MAX_BYTES
STRANDS_MESH_AUDIT_MAX_FILES
STRANDS_MESH_DEDUP_TTL
STRANDS_MESH_CAMERA_PRESIGN_TTL
STRANDS_MESH_POLICY_HOST_ALLOW
STRANDS_MESH_AUDIT_PSK
STRANDS_MESH_OVERRIDE_CODE
STRANDS_MESH_HF_REPO_ALLOW
STRANDS_MESH_POLICY_TYPE_ALLOW
STRANDS_MESH_CA_PINS
STRANDS_MESH_DISABLE_CA_PIN
Several are mentioned in earlier CHANGELOG sections in prose, but the "New env vars" table is the discoverable surface (an operator running grep STRANDS_MESH_AUDIT_PSK CHANGELOG.md to find what behavior it controls only finds the audit-section bullet at L103, not a dedicated row). Either widen this table to mirror the README, or add a sentence here pointing at the README matrix as the canonical source and explaining why this table is a curated subset.
|
|
||
| Dev / lab environments without PKI run `STRANDS_MESH_AUTH_MODE=none` AND `STRANDS_MESH_I_KNOW_THIS_IS_INSECURE=1` | ||
| to keep plain-TCP behaviour; the mesh logs an ERROR at session | ||
| open (matches README L508 and the env-var matrix at L139 — `STRANDS_MESH_I_KNOW_THIS_IS_INSECURE` is explicitly designed to be loud). |
There was a problem hiding this comment.
Hardcoded line numbers (README L508, matrix at L139) are brittle citations — any future README edit above L508 silently breaks the reference and the CHANGELOG becomes misleading on the very next docs PR. Same problem with L139 pointing at this CHANGELOG file. Cite by section name instead, e.g. "matches the Mesh security → Authentication: mTLS (default) section in the README and the STRANDS_MESH_I_KNOW_THIS_IS_INSECURE row in this changelog's New env vars table."
| | `STRANDS_MESH_AUDIT_DIR` | Directory for the safety audit log (`mesh_audit.jsonl`) and sequence-counter sidecar (`mesh_audit.seq.json`) | `~/.strands_robots/` | | ||
| | `STRANDS_MESH_AUDIT_MAX_BYTES` | Maximum size (bytes) of the active audit log before rotation. Hard-capped at 10 GiB. Phase-4 / E1: prevents an attacker who can publish safety events from filling the disk. | `104857600` (100 MiB) | | ||
| | `STRANDS_MESH_AUDIT_MAX_FILES` | Maximum number of rotated audit log copies kept (`mesh_audit.jsonl.1` … `.N`). Hard-capped at 100. Older rotations are discarded. | `5` | | ||
| | `STRANDS_MESH_BRIDGE_TOPICS` | Comma-separated allowlist of key-expression suffixes the Zenoh / AWS IoT bridge transport forwards across the cloud boundary. Read by `mesh.transport.bridge_transport`. Empty / unset uses the default suffix set (`presence,health,safety/event,safety/estop,cmd,response,broadcast`); high-volume telemetry (`state`, `pose`, `imu`, `odom`, `lidar`, `camera`, `input`, `hand`) is **not** bridged by default — set this var explicitly to opt in. Match semantics are exact for every entry except those listed in `STRANDS_MESH_BRIDGE_TOPICS_PREFIX`. | `presence,health,safety/event,safety/estop,cmd,response,broadcast` | |
There was a problem hiding this comment.
The default value here (presence,health,safety/event,safety/estop,cmd,response,broadcast) and the not-bridged-by-default list are documented as facts, but on upstream/main at this commit no code reads STRANDS_MESH_BRIDGE_TOPICS with these defaults yet (grep -r DEFAULT_BRIDGE_SUFFIXES strands_robots/ returns nothing). That's expected for PR 9/9, but it means whichever predecessor PR introduces the constant has to ship with these exact defaults or this row goes stale silently. Suggest pinning the verification: at land-time, add a unit test in PR 7-or-8 that asserts DEFAULT_BRIDGE_SUFFIXES == ("presence", "health", "safety/event", "safety/estop", "cmd", "response", "broadcast") so a future edit to the tuple has to either update the README or fail CI. Currently the only thing tying these two sources together is reviewer attention.
| positive-float / positive-int helpers in ``strands_robots.mesh.core``. | ||
| Malformed values (non-numeric or negative) log a WARNING and fall | ||
| back to the documented default instead of failing module import or | ||
| silently disabling the replay cache. Pinned by |
There was a problem hiding this comment.
Citation pin: tests/mesh/test_resume_env_validation.py doesn't exist on upstream/main at this commit (and neither do the 11 other test files named in the Tests section above). Re-flagging because PR 9/9 lands last — if any of those tests get renamed/restructured during the PR-1–8 review cycle, the CHANGELOG ends up citing a path that never exists anywhere in git log. Suggest a final grep -F -f <test_paths_from_changelog> $(git ls-files tests/mesh/) sanity check immediately before merge, or sketch the citations as tests/mesh/test_resume_env_validation.py (or successor) so the reader knows the path is the intent rather than a guarantee.
Per umbrella tracker strands-labs#219 the four PR-2/3/4/5 slices land in parallel and must each go green standalone. test_r25_audit_symmetry.py imports modules introduced by sibling slices that are not present on this branch: from strands_robots.mesh import _zenoh_config as zc # lives on PR strands-labs#224 from strands_robots.mesh import audit as audit_mod # lives on this PR from strands_robots.mesh import core as core_mod # lives on this PR from strands_robots.mesh import security as sec # lives on PR strands-labs#223 That makes the file structurally un-passable on the audit slice in isolation, which is why call-test-lint fails mypy here with "Module strands_robots.mesh has no attribute 'security'" / "... no attribute '_zenoh_config'". The four pin sites the file describes (peer-id charset on security.py, partial-failure cleanup on core.py, O_NOFOLLOW on the seq lockfile in audit.py, and the STRANDS_MESH_TLS_KEY mode-0o600 docstring on _zenoh_config.py) span four sibling slices and only make sense as a cross-cutting integration pin once strands-labs#223, strands-labs#224, strands-labs#221 and strands-labs#225 have all merged. The natural home is PR-6 (strands-labs#225, mesh/core replay caches + override-resume + safety topic handlers) which is the integration slice that needs all the listed modules present anyway, or a follow-up issue queued behind PR-9 (strands-labs#226, docs). The audit-side pin (point 3, O_NOFOLLOW on the seq lockfile) is already covered on this PR by tests/mesh/test_audit_seq_symlink.py and tests/mesh/test_audit_psk_poison_record.py, so removing this file does not weaken regression coverage on the audit slice itself.
Part 9 / 9 of the split of #195 — tracked by #219.
Drafted until every other PR in the split lands. The env-var matrix and CHANGELOG entry can only be accurate once the final env-var surface is known.
What's in this PR
README.md(+190/-1) — mesh security configuration matrix covering every env var introduced across PRs 2-8.CHANGELOG.md(+219) — consolidated entry covering all 8 code PRs.Reviewer focus
Stacking note
Lands last. No code changes, easy to review in isolation, no test churn.
§13 Round-N changelog
R1 — 5 surgical doc-drift fixes (5 commits, plain
git push, no force)253ad47**Mitigated** when STRANDS_MESH_ACL_FILE is set .../**Not mitigated by default** ...form, and surfaceSTRANDS_MESH_ACCEPT_PERMISSIVE_ACL(CHANGELOG L141) as the actual default-state safety net.3cabbb3auth_mode=nonelogs at WARNING; README L508 + CHANGELOG matrix L139 + the_zenoh_config.pyemit all say ERROR.9893db7mesh.transport.bridge_transport._CommandDeduplicator(underscore-prefixed, private).(sender_id, turn_id, command)tuples, dispatches once).44ea5a7_parse_positive_float_env/_parse_positive_int_env(private helpers).strands_robots.mesh.core"). Pin reference unchanged.bcf96fd_PREFIXrow)STRANDS_MESH_BRIDGE_TOPICSreferenced by the_PREFIXrow description but had no row of its own — operator could not look up its semantics._PREFIXdocumenting format, consumer module, full default suffix set (verified againstDEFAULT_BRIDGE_SUFFIXESinbridge_transport.py:88-97), the not-bridged-by-default list (state,pose,imu,odom,lidar,camera,input,hand), and the match-semantics interaction with_PREFIX.Verification: docs-only diff;
git diff origin/pr9/docs-mesh-readme-changelog~5 --stat=README.md +8/-4, CHANGELOG.md +8/-8. CI was green ona64bf03and re-runs onbcf96fd. No test surface change. Each of the 5 review threads on this PR has a corresponding commit; reviewer can resolve each thread against the linked SHA.Landing order: PR-1 → PR-2/3/4/5 → PR-6 → PR-7 → PR-8 → PR-9. Full plan:
PR_LIST.md. Tracking: #219.