Skip to content

fix(security): resolve OWASP scanner bypass via argument poisoning#89

Open
SHAURYASANYAL3 wants to merge 8 commits into
sreerevanth:mainfrom
SHAURYASANYAL3:fix/owasp-argument-bypass
Open

fix(security): resolve OWASP scanner bypass via argument poisoning#89
SHAURYASANYAL3 wants to merge 8 commits into
sreerevanth:mainfrom
SHAURYASANYAL3:fix/owasp-argument-bypass

Conversation

@SHAURYASANYAL3
Copy link
Copy Markdown
Contributor

@SHAURYASANYAL3 SHAURYASANYAL3 commented May 29, 2026

Summary

This PR fixes a security gap in OwaspScanner where tool call arguments were not included in heuristic analysis. As a result, malicious payloads embedded within nested JSON structures could bypass safety checks that were only inspecting raw command strings.

What Changed

Recursive Argument Flattening

  • Added _flatten_values() to OwaspScanner.
  • Recursively traverses dictionaries and lists to extract string content from nested tool arguments.

Expanded Scan Coverage

  • Updated _blob_of() to include flattened argument values in the generated scan payload.
  • Ensures security heuristics evaluate both command content and tool arguments.

Improved Threat Detection

The scanner can now detect:

  • A01: Prompt Injection
  • A04: Unsafe Code Execution
  • A05: Data Exfiltration

even when malicious content is embedded deep within structured tool arguments rather than exposed in the top-level command text.

Validation

Added the following test cases in tests/test_owasp_security.py:

  1. test_owasp_detects_exfil_in_poisoned_arguments
  2. test_owasp_detects_prompt_injection_in_arguments
  3. test_owasp_detects_nested_unsafe_code

All three tests pass with this change and fail against the previous implementation, confirming the vulnerability and validating the fix.

Summary by CodeRabbit

Release Notes

  • Tests

    • Added comprehensive test coverage for OWASP vector detection, validating the security scanner's ability to identify data exfiltration, prompt injection attacks, and unsafe code execution attempts in tool arguments.
  • Refactor

    • Improved internal code organization and structure within security scanning utilities.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 29, 2026

Review Change Stack

Caution

Review failed

Failed to post review comments

📝 Walkthrough

Walkthrough

This PR adds recursive argument flattening to the OWASP scanner and comprehensive test coverage. The _flatten_values helper method extracts string-like leaves from nested tool-call arguments to prevent attackers from hiding malicious payloads in deeply nested structures. Three new tests validate detection of exfiltration, prompt injection, and unsafe code vectors.

Changes

OWASP Security Argument Flattening

Layer / File(s) Summary
OWASP Nested Argument Flattening
agentwatch/security/owasp.py
_flatten_values() recursively extracts string-like leaf values from nested dict/list/tuple/set structures with cycle prevention via visited id set, converting non-None non-strings to strings.
OWASP Vector Detection Tests
tests/test_owasp_security.py
Test helper _tool_event constructs AgentEvent objects with configurable tool arguments. Three pytest cases validate detection of DATA_EXFILTRATION, PROMPT_INJECTION, and UNSAFE_CODE_EXEC vectors in poisoned nested arguments.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • sreerevanth/AgentWatch#112: Directly overlaps with this PR's _flatten_values helper addition and OWASP test coverage for nested argument detection.

Poem

🐰 Nested structures hide their secrets deep,
But _flatten_values keeps no sleep—
Recursion guards with visited sets,
Catching exfil, injection, code-exec threats!
Safe scanning now, from root to leaf. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the primary change: fixing a security vulnerability in the OWASP scanner by resolving a bypass caused by argument poisoning.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agentwatch/security/owasp.py`:
- Around line 144-157: The _flatten_values method can infinite-recurse on cyclic
structures; modify it to track seen object ids and skip already-visited
containers: add an optional parameter (e.g., seen: set[int] | None = None)
inside _flatten_values, initialize seen if None, check id(data) and return [] if
already in seen, add id(data) to seen before recursing into
dicts/lists/tuples/sets, and pass seen through all recursive calls so
self-referential payloads are ignored and RecursionError is avoided.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 058502d4-aac3-4c0e-9d00-8f2d8653fd1a

📥 Commits

Reviewing files that changed from the base of the PR and between c0e464b and f14a4e4.

📒 Files selected for processing (8)
  • CONTRIBUTING.md
  • README.md
  • agentwatch/api/server.py
  • agentwatch/security/owasp.py
  • docs/adapters/claude-code.md
  • docs/adapters/langchain.md
  • docs/getting-started.md
  • tests/test_owasp_security.py

Comment thread agentwatch/security/owasp.py Outdated
Adds issues: write permission required to post comments on pull requests.
@sreerevanth sreerevanth self-requested a review May 30, 2026 15:13
Copy link
Copy Markdown
Owner

@sreerevanth sreerevanth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch on the OWASP argument-poisoning bypass and the added test coverage.

Before merge, please address one blocker:

_flatten_values() performs recursive traversal without cycle detection. Self-referential dict/list structures can trigger infinite recursion and raise RecursionError, causing scans to fail unexpectedly.

Suggested approach:

Track visited object IDs during traversal.
Skip already-visited containers.
Preserve current behavior for normal nested payloads.

Once cycle protection is added, this looks ready for final review.

@SHAURYASANYAL3 SHAURYASANYAL3 force-pushed the fix/owasp-argument-bypass branch from 85a2073 to 312848c Compare May 30, 2026 15:53
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
.github/workflows/test-on-pr.yml (1)

19-19: ⚡ Quick win

Pin third-party actions to commit SHAs.

actions/checkout@v4, actions/setup-python@v5, and actions/github-script@v7 are pinned to mutable tags. Static analysis flags this against the repo's pinning policy, and pinning to a full commit SHA is especially important here because this workflow runs with write permissions. Pin each uses: to a SHA (optionally with the version in a trailing comment).

Also applies to: 27-27, 51-51

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-on-pr.yml at line 19, Replace mutable action tags
with immutable commit SHAs for every "uses:" entry shown (e.g., replace
actions/checkout@v4, actions/setup-python@v5, and actions/github-script@v7) so
the workflow uses exact commit SHAs; keep the human-readable version as an
optional trailing comment if desired. Locate the three "uses:" occurrences in
the workflow file (the ones currently referencing actions/checkout,
actions/setup-python, and actions/github-script) and update each to the
corresponding full commit SHA from the action's repository release commit,
ensuring the SHA is the exact full-length commit hash.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test-on-pr.yml:
- Around line 4-5: The workflow uses the privileged pull_request_target trigger
while checking out and running PR-controlled code (e.g., actions/checkout of the
PR head, pip install -e ".[dev]", pytest, ruff), which allows token
exfiltration; fix it by splitting into two workflows: 1) a "PR Tests" workflow
triggered on pull_request (not pull_request_target) with permissions.contents:
read that checks out the PR, runs pip install -e ".[dev]", pytest and ruff, and
uploads results as an artifact; and 2) a privileged "PR Test Comment" workflow
triggered by workflow_run for the "PR Tests" workflow that never checks out PR
code, has permissions.pull-requests: write and issues: write, downloads the
uploaded artifact and posts the comment—ensure actions/checkout is removed from
the privileged workflow and artifacts are used to pass results.
- Around line 18-24: The checkout step uses actions/checkout@v4 with the default
credential persistence, which leaves GITHUB_TOKEN in .git/config; update the
Checkout step (actions/checkout@v4) to set persist-credentials: false to prevent
persisting credentials for this PR-test job since the job does not push and
should not expose GITHUB_TOKEN. Ensure the changed Checkout configuration is
applied where ref: ${{ github.event.pull_request.head.sha }} is used.

In `@tests/test_safety.py`:
- Around line 368-385: The test uses the typing symbol Any in annotations inside
test_flatten_values_cycle_detection (and other tests) but there is no import;
add the missing import "from typing import Any" to the test module so
annotations in test_flatten_values_cycle_detection, and any uses of Any
elsewhere, resolve and CI (F821) no longer fails; locate the
tests/test_safety.py file and insert the import near other typing imports or at
top of the file.

---

Nitpick comments:
In @.github/workflows/test-on-pr.yml:
- Line 19: Replace mutable action tags with immutable commit SHAs for every
"uses:" entry shown (e.g., replace actions/checkout@v4, actions/setup-python@v5,
and actions/github-script@v7) so the workflow uses exact commit SHAs; keep the
human-readable version as an optional trailing comment if desired. Locate the
three "uses:" occurrences in the workflow file (the ones currently referencing
actions/checkout, actions/setup-python, and actions/github-script) and update
each to the corresponding full commit SHA from the action's repository release
commit, ensuring the SHA is the exact full-length commit hash.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 7d8bb7c4-0ebb-480c-8e90-61e351c5d90f

📥 Commits

Reviewing files that changed from the base of the PR and between f14a4e4 and 312848c.

📒 Files selected for processing (3)
  • .github/workflows/test-on-pr.yml
  • agentwatch/security/owasp.py
  • tests/test_safety.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • agentwatch/security/owasp.py

Comment thread .github/workflows/test-on-pr.yml
Comment on lines 18 to +24
- name: Checkout
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
# Note: pull_request_target runs in the context of the base repo.
# Checking out the PR head is necessary to test the changes,
# but use caution as this runs with a write-capable GITHUB_TOKEN.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Disable credential persistence on checkout.

Since this job runs untrusted PR code under a privileged trigger, the default persist-credentials: true leaves the GITHUB_TOKEN in .git/config where the executed code can read and exfiltrate it. Set persist-credentials: false (you don't push from this job). This hardens the checkout even after restructuring the trigger.

🛡️ Proposed fix
       - name: Checkout
         uses: actions/checkout@v4
         with:
           ref: ${{ github.event.pull_request.head.sha }}
+          persist-credentials: false
🧰 Tools
🪛 zizmor (1.25.2)

[warning] 18-24: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)


[error] 19-19: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-on-pr.yml around lines 18 - 24, The checkout step
uses actions/checkout@v4 with the default credential persistence, which leaves
GITHUB_TOKEN in .git/config; update the Checkout step (actions/checkout@v4) to
set persist-credentials: false to prevent persisting credentials for this
PR-test job since the job does not push and should not expose GITHUB_TOKEN.
Ensure the changed Checkout configuration is applied where ref: ${{
github.event.pull_request.head.sha }} is used.

Comment thread tests/test_safety.py
Copy link
Copy Markdown
Owner

@sreerevanth sreerevanth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes Requested

Thanks for addressing the original blocker. The cycle-detection fix in _flatten_values() resolves the recursion concern and the security regression tests are appreciated.

Before merge, two additional issues need to be addressed:

  1. Critical workflow security issue

The workflow now uses pull_request_target while checking out and executing PR-controlled code (pip install, pytest, etc.). This introduces a privileged-code-execution path and is a higher-severity security concern than the OWASP scanner issue being fixed.

Please revert to a safe workflow design (e.g. unprivileged pull_request testing plus a separate reporting workflow) or otherwise eliminate execution of untrusted PR code under a privileged token context.

  1. Resolve merge conflicts

GitHub currently reports an unresolved conflict in:

  • docs/adapters/langchain.md

Please rebase on the latest main, resolve conflicts, and push the updated branch.

Once the workflow security issue and merge conflict are resolved, this can be reviewed again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants