Skip to content

Fix runtime approval threshold for shell metacharacters#99

Merged
Mr-Lucky merged 4 commits into
mainfrom
fix/metachar-approval-threshold
May 29, 2026
Merged

Fix runtime approval threshold for shell metacharacters#99
Mr-Lucky merged 4 commits into
mainfrom
fix/metachar-approval-threshold

Conversation

@Mr-Lucky
Copy link
Copy Markdown
Contributor

Summary

  • Keep shell metacharacter-only runtime findings below the approval threshold so benign commands are auto-allowed without audit events, Cloud sync, or pending approvals.
  • Require agents to show the exact one-time approval command and wait for explicit user approval before approving and retrying protected actions.
  • Improve OpenClaw runtime tool detection for alternate tool name fields and exec/execute-style tools.
  • Reuse matching pending approval ids and skip AgentGuard approval/self commands wrapped through simple shell launchers.

Tests

  • Added/updated runtime, action detector, and OpenClaw integration coverage for the approval threshold, approval reuse, shell-wrapper self commands, and OpenClaw tool name handling

@github-actions
Copy link
Copy Markdown

AgentGuard PR Review

This patch introduces a few concrete security and correctness regressions.

  1. severity: high — src/runtime/protect.ts and src/runtime/evaluator.ts (auto-allowing low-risk runtime decisions)

    • What can go wrong: shouldAutoAllowRuntimeDecision(riskScore < 20 || riskLevel === 'safe') causes any action with a low score to be silently allowed, even when the policy or OSS evaluator would otherwise have produced require_approval or a block. This broadens the bypass beyond shell metacharacter-only cases and can suppress required user approvals for other low-scoring but still sensitive actions.
    • Fix: restrict the auto-allow fast path to the specific metacharacter-only reason/tag combination, or require an exact safe decision (riskScore === 0 && riskLevel === 'safe' && reasons.length === 0) before suppressing approval/blocking.
  2. severity: high — src/runtime/protect.ts (shouldSuppressRuntimeReport)

    • What can go wrong: shouldSuppressRuntimeReport(decision) now returns true for any riskScore < 20, so low-risk-but-actionable findings are not audited, not spooled, and not sent to Cloud. This creates a visibility gap for security-relevant behavior and can hide repeated suspicious activity from downstream policy enforcement.
    • Fix: only suppress reporting for the narrow, explicitly intended no-op case; keep audit/sync for all decisions that carry any security reason or non-empty reasons list.
  3. severity: medium — src/action/detectors/exec.ts (SHELL_METACHAR_PATTERN)

    • What can go wrong: the new pattern no longer marks backticks in a command substitution context as a low-risk signal unless they match the broader character class, and the logic now treats many shell operators as “advisory.” Combined with the runtime changes, this can let command-injection-shaped inputs proceed without approval even when they include clear shell control flow.
    • Fix: keep a dedicated injection signal for shell control operators and command substitution, and only downgrade specifically benign redirection-only cases after validating the full command shape.
  4. severity: high — src/runtime/self-command.ts (shellWrapperCommand)

    • What can go wrong: any command launched via sh/bash/zsh/dash/fish -c/-lc/... that contains an agentguard approve ... payload is treated as a self-command and skipped. That allows an attacker-controlled prompt or tool output to hide approval commands inside a shell wrapper and bypass runtime protection.
    • Fix: do not exempt wrapped shell commands by default; if this behavior is needed, require strict parsing of the wrapped payload and verify it is a direct, exact self-command originating from a trusted source rather than arbitrary shell text.
  5. severity: medium — src/runtime/approvals.ts (writePendingApproval)

    • What can go wrong: returning an existing pending record for the same fingerprint without updating actionId, TTL, or metadata can make approvals stale. A later identical action may inherit an old pending approval id that has already expired or belongs to a different session/context.
    • Fix: when a matching pending approval exists, refresh expiry/metadata if the intent is to dedupe, or always create a new record and dedupe only at presentation time. At minimum, verify session and context before reusing the record.

@Mr-Lucky Mr-Lucky merged commit 37297dd into main May 29, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants