Please do not open a public issue. Email aihackathon@sans.org with subject VERDICT: <one-line summary>, or DM the maintainer in the SANS AI Hackathon Slack (https://join.slack.com/t/sansaihackathon/shared_invite/zt-3srjz86zo-bwHi_v1aKTg2IJAU4_4OwA).
Include:
- Affected version / commit SHA
- Repro steps (minimal evidence + commands)
- Threat surface (see docs/spec/VERDICT_AUDIT_v4.4.md: insider, prompt-injection-from-evidence, malicious-tool-output, external-attacker)
- Suggested mitigation if you have one
VERDICT is a forensic agent that touches evidence — disk images, memory captures, packet captures, registry hives, event logs. Even though all tool execution happens inside read-only microsandboxes (CLAUDE.md §4.2), evidence integrity is the highest-value asset; we treat any path that lets a writer reach /evidence/ as critical.
In scope:
- Bypass of the three-layer immutability defense (PreToolUse hook, DenyRuleWrapper, microsandbox read-only mount)
- Prompt injection from evidence content that drives the planner or executor to malicious behaviour
- Credential leakage into a microVM (CLAUDE.md §3.9: credentials must never enter)
- HMAC ledger forgery / chain-of-custody breakage
- Mode-lock bypass on resume
- Any path that allows write to a host evidence file
Out of scope:
- DoS against your own SGLang or Langfuse instance
- Vulnerabilities in upstream dependencies (report to the upstream)
- Issues that require physical access to the host
We will:
- Acknowledge within 5 business days
- Coordinate on a fix and a disclosure timeline
- Credit you in the release notes (or stay anonymous, your choice)
See CLAUDE.md §3 for the load-bearing rules built into the codebase. If a reported issue is a violation of one of those rules, it gets prioritized accordingly.
The issues below were discovered by internal review and are disclosed here so that contributors and judges can see what is in flight. All open items are tracked in docs/BUILD_PLAN.md and will be addressed before submission.
- Severity: High
- Affected: verdict/graph/wrappers/deny_rule.py:221-248 (_to_path_str, _is_under_evidence)
- Discovered: 2026-05-02 (internal security review of feat/W2.C.4-compose-executor-work)
- Status: Open; fix tracked under W2.C.1.b (deny-rule normalization hardening)
- Scope mapping: "Bypass of the three-layer immutability defense" (in-scope §)
_to_path_str uses pathlib.PurePosixPath(value), which deliberately does not resolve .. segments or collapse leading //. The deny check then compares the un-normalized string against "/evidence/" via prefix match. Inputs like /work/../evidence/out.txt, /tmp/../evidence/out.txt, and //evidence/out.txt slip past Layer 2 even though the kernel resolves them to /evidence/out.txt at syscall time. Layer 3 (read-only mount + noexec + host chattr +i) is the actual write blocker in correctly-configured deployments, but CLAUDE.md §3.1 designates Layer 2 as the architectural guarantee that fires in all three modes — defense-in-depth must hold even if Layer 3 is degraded.
Remediation: replace PurePosixPath with os.path.normpath (lexical .. collapse) plus an explicit double-slash strip, then compare via pathlib.PurePosixPath parents rather than string prefix. Add RED tests for .. traversal, //-prefix, NUL injection, and symlink-style siblings of /evidence.
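The remediation can be sketched roughly as follows. This is a minimal illustration, not the code in deny_rule.py: the function names `legacy_prefix_check`, `normalize`, and `is_under_evidence` are hypothetical, the legacy check is a reconstruction from the description above, and relative paths are left out of scope because the sandbox working directory is not specified here.

```python
import posixpath
from pathlib import PurePosixPath

# Evidence mount point as named in the text above.
EVIDENCE_ROOT = PurePosixPath("/evidence")


def legacy_prefix_check(value: str) -> bool:
    """Reconstruction of the described flaw: no normalization, string prefix match."""
    return str(PurePosixPath(value)).startswith("/evidence/")


def normalize(value: str) -> PurePosixPath:
    """Lexically normalize a POSIX path (hypothetical helper)."""
    if "\x00" in value:
        raise ValueError("NUL byte in path")
    # normpath collapses ".." segments lexically, but POSIX rules let it keep
    # exactly two leading slashes, so strip those explicitly.
    collapsed = posixpath.normpath(value)
    if collapsed.startswith("//"):
        collapsed = "/" + collapsed.lstrip("/")
    return PurePosixPath(collapsed)


def is_under_evidence(value: str) -> bool:
    """Compare by path components rather than by string prefix."""
    path = normalize(value)
    return path == EVIDENCE_ROOT or EVIDENCE_ROOT in path.parents


if __name__ == "__main__":
    for candidate in (
        "/work/../evidence/out.txt",
        "//evidence/out.txt",
        "/evidence-old/out.txt",
        "/work/out.txt",
    ):
        print(candidate, legacy_prefix_check(candidate), is_under_evidence(candidate))
```

Comparing against `parents` instead of a string prefix is what keeps a sibling such as /evidence-old from matching. The check stays purely lexical, so real symlinks that resolve into /evidence/ would still need to be caught by another layer.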
- Severity: High
- Affected: verdict/ledger/hmac_key.py:134 (_TPMHMACProvider.sign and the symmetric verify)
- Discovered: 2026-05-02 (internal security review of feat/W2.C.4-compose-executor-work)
- Status: Open; fix tracked under W2.C.3.b (TPM HMAC sequencing)
- Scope mapping: "HMAC ledger forgery / chain-of-custody breakage" (in-scope §)
_TPMHMACProvider.sign() truncates its message argument with TPM2B_MAX_BUFFER(message[:1024]) and returns the digest as if it covered the full input. There is no length guard, no error on overflow, no chunked path via TPM2_HMAC_Start / TPM2_SequenceUpdate / TPM2_SequenceComplete. LedgerWriter._compute_payload_hash (writer.py:73-81) appends prev_entry_hash and entry_id after the JSON payload — for any tool-call entry whose serialized payload exceeds ~960 bytes (the steady state once langfuse_trace_id, output_files_sha256, parse_warnings, and the NIST SP 800-86 metadata are populated), the chain-linkage bytes fall past the truncation window and are not authenticated at all. An attacker with write access to cases/<id>/ledger.jsonl can rewrite prev_entry_hash / entry_id (or splice forged entries) while verdict validate still reports the chain as intact in the TPM configuration. Software (hmac.HMAC) and gpg-derived paths are unaffected.
Remediation: raise an explicit HMACMessageTooLargeError for inputs > TPM2B_MAX_BUFFER until sequenced-HMAC is implemented; then implement TPM2_HMAC_Start / SequenceUpdate / SequenceComplete chunking. Mirror the change in verify. Add a unit test (the §3.10 single-system-boundary mock exception applies at the tpm2_pytss boundary) that signs a ≥ 4 KB message and asserts that two messages differing only in bytes after position 1024 produce different signatures.
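The truncation problem and both remediation stages can be illustrated without a TPM. The sketch below uses the standard-library hmac module as a software stand-in, so it does not call tpm2_pytss and is not the provider's code. `MAX_BUFFER`, `truncating_sign`, `guarded_sign`, and `chunked_sign` are hypothetical names; the 1024-byte window is taken from the description above, and the incremental `update` loop only mirrors the shape of a start / update / complete sequence.

```python
import hashlib
import hmac

MAX_BUFFER = 1024  # truncation window described above (assumed constant name)


class HMACMessageTooLargeError(ValueError):
    """Raised instead of silently signing a truncated message."""


def truncating_sign(key: bytes, message: bytes) -> bytes:
    # Stand-in for the flawed behaviour: only the first MAX_BUFFER bytes are signed.
    return hmac.new(key, message[:MAX_BUFFER], hashlib.sha256).digest()


def guarded_sign(key: bytes, message: bytes) -> bytes:
    # Stage 1: fail loudly until a sequenced path exists.
    if len(message) > MAX_BUFFER:
        raise HMACMessageTooLargeError(
            f"message is {len(message)} bytes, limit is {MAX_BUFFER}"
        )
    return hmac.new(key, message, hashlib.sha256).digest()


def chunked_sign(key: bytes, message: bytes, chunk: int = MAX_BUFFER) -> bytes:
    # Stage 2: feed the whole message in buffer-sized pieces.
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for offset in range(0, len(message), chunk):
        mac.update(message[offset:offset + chunk])
    return mac.digest()


if __name__ == "__main__":
    key = b"demo-key"
    original = b"A" * 4096
    tampered = original[:2000] + b"B" + original[2001:]  # differs only past byte 1024
    print(truncating_sign(key, original) == truncating_sign(key, tampered))
    print(chunked_sign(key, original) == chunked_sign(key, tampered))
```

The two 4 KB messages differ only after position 1024, which is the case the proposed unit test targets: the truncating signer cannot tell them apart, while the chunked signer can.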