EvilTrace

License: MIT

EvilTrace is an autonomous DFIR investigation agent for the SANS SIFT Workstation that verifies every finding against raw, SHA-256-sealed forensic tool output and tags each claim GROUNDED, INFERRED, or UNVERIFIED — so an autonomous AI investigation cannot quietly report something the evidence does not support.

Status: hackathon prototype. Built for the SANS Find Evil! AI Hackathon (2026). Developed and tested primarily against one case (Operation SHIELDBASE, host base-rd-01), with a second public case (alihadi_01) used as a generalization check. This is a focused proof-of-concept that prioritizes a verifiable end-to-end audit chain over breadth. It is not production software and is not a substitute for a trained examiner's judgment.


Submission Compliance Map

| # | Requirement | Location |
|---|---|---|
| 1 | Code repository URL | https://github.com/1hackerway/eviltrace (this README) |
| 2 | Open-source license | LICENSE (MIT, repo root); also shown in the GitHub About sidebar once public |
| 3 | README with setup instructions | This file → Try It Out Locally |
| 4 | Step-by-step run instructions for judges | Try It Out Locally (no live deployment; local run on SIFT per rules) |
| 5 | Text description of features / functionality | This README + DEVPOST.md; samples/investigative_narrative.md and samples/alihadi_01/investigative_narrative.md (structured investigative narratives; analytical-reasoning requirement) |
| 6 | Demonstration video | Watch on YouTube |
| 7 | Architecture diagram | Architecture (Mermaid diagram in this README) |
| 8 | Evidence dataset documentation | datasets/README.md + per-case briefings cases/incident_01/CLAUDE.md, cases/alihadi_01/CLAUDE.md |
| 9 | Accuracy report | samples/accuracy_report.md, samples/accuracy_report_memory.md, samples/alihadi_01/accuracy_report.md, samples/alihadi_01/accuracy_report_memory.md, and samples/accuracy_self_assessment.md (honest self-assessment incl. evidence-integrity / spoliation section) |
| 10 | Agent execution logs | samples/execution_log.jsonl and samples/alihadi_01/execution_log.jsonl (per-tool timestamps + per-turn token usage) |

Every finding in the reports can be traced to the specific tool execution that produced it via the artifact citation and the execution log's SHA-256 seals.

What EvilTrace does

EvilTrace orchestrates SIFT forensic tools (Sleuth Kit, Zimmerman EZ Tools, Plaso, Volatility 3, YARA) through a controlled wrapper layer, runs an autonomous Anthropic-SDK agent loop over them, and then validates every finding it produces against the exact, pre-hashed tool output it claims to be based on.

Each reported finding carries:

  • an exact artifact citation (file:LINE) pointing to the raw tool output line that supports it,
  • a confidence tag: GROUNDED (the cited line literally contains the claimed evidence), INFERRED (a defensible deduction across grounded facts, but not a single literal line), or UNVERIFIED (no backing line; excluded from conclusions),
  • an integrity verdict tying the citation back to a SHA-256 hash that was computed on the raw tool output before any LLM saw it.

The validator that assigns these tags is deterministic and never calls an LLM. It cannot be argued with, and it cannot hallucinate.
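As a minimal sketch of the first step such a validator needs, the snippet below parses a citation string into a file name and a 1-based line number. The citation shape is assumed from strings like mft_output.csv:L219674 that appear in the validator excerpt; `Citation` and `parse_citation` are hypothetical names, not taken from validator.py.

```python
import re
from typing import NamedTuple


class Citation(NamedTuple):
    file: str
    line: int


# Assumed shape: "<file>:L<line>", e.g. "mft_output.csv:L219674".
_CITATION_RE = re.compile(r"^(?P<file>.+):L(?P<line>\d+)$")


def parse_citation(text: str) -> Citation:
    """Split a file:LINE citation; reject anything that does not match exactly."""
    match = _CITATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a file:LINE citation: {text!r}")
    line = int(match.group("line"))
    if line < 1:
        raise ValueError("line numbers are 1-based")
    return Citation(match.group("file"), line)
```

Rejecting malformed citations outright, rather than guessing, keeps this step deterministic: a finding either points at one concrete line or it cannot be checked at all.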

Why it matters

Autonomous DFIR agents are fast, but they can fabricate. Protocol SIFT's own documentation names this as its primary unsolved risk and prescribes the fix — "use tool outputs as the source of truth, never the AI summary text" — without implementing it. Dr. Brian Carrier (creator of The Sleuth Kit) makes the same point: it is critical for tools to identify what came from AI so you know what to verify.

EvilTrace is that fix, automated. The grounding validator is a direct, deterministic implementation of Carrier's "query for item existence" method: when a finding references an artifact, the validator re-fetches that exact line, re-hashes the source against the pre-LLM seal, and refuses to certify anything it cannot reproduce.
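The re-fetch and re-hash step described above might look roughly like this. It is a sketch under stated assumptions, not the code in validator.py: `sha256_of` and `fetch_sealed_line` are hypothetical helpers, and the seal is assumed to be a hex SHA-256 digest of the whole raw output file.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large tool outputs do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_sealed_line(path: Path, line_no: int, sealed_sha256: str) -> Optional[str]:
    """Return line `line_no` (1-based) only if the file still matches its seal."""
    if not path.is_file() or sha256_of(path) != sealed_sha256.lower():
        return None  # missing or modified since sealing: refuse to certify
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if number == line_no:
                return line.rstrip("\n")
    return None  # cited line is past the end of the file
```

Checking the seal before reading the line means a tampered or regenerated output file can never supply evidence, even if the cited line happens to look right.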

How EvilTrace relates to Protocol SIFT — and to human-in-the-loop platforms

EvilTrace extends the Protocol SIFT baseline rather than replacing it. It adopts the patterns that are table stakes — a layered CLAUDE.md agent persona, on-demand skill context, an stderr → self-correct → re-run loop, an immutable audit log, and a strict "never ask questions mid-task" execution rule — and adds the inference-constraint layer that Protocol SIFT's documentation prescribes but does not build.

It also occupies a deliberately different point on the autonomy/oversight spectrum than human-in-the-loop platforms (for example, AppliedIR's Valhuntir, the hackathon's reference example):

| | Human-in-the-loop platform | EvilTrace |
|---|---|---|
| Trust anchor | A human examiner approves every finding (findings staged as DRAFT; the AI cannot approve its own work) | A deterministic grounding validator certifies every finding against hash-sealed raw output |
| Speed of trust | Bounded by human review | Machine speed |
| What it guarantees | A human signed off | Every reported finding cites real, pre-LLM-hashed evidence; ungrounded claims are downgraded automatically |

Honest boundary: EvilTrace's validator catches fabrication and over-certainty — it does not replace human judgment on interpretation. INFERRED findings are precisely where a human analyst still matters. EvilTrace is not "better than human review"; it is a different tradeoff — full autonomy with a deterministic floor under every claim, useful where human review at machine speed is not available.

Architecture

The model reasons flexibly. Tool execution is restricted by code. Two interfaces — the SDK agent and the MCP server — both converge on the same read-only wrapper layer, so there is exactly one audit chain regardless of which interface drives the investigation.

Framework posture. EvilTrace's orchestrator is a raw Anthropic SDK tool-use loop — a comparable agentic architecture, expressly permitted by the official hackathon rules alongside Claude Code and OpenClaw. The custom MCP server (eviltrace_mcp.py) is the Claude Code integration path: the same wrappers, the same audit chain, runnable directly from a Claude Code session.

graph TB
    PATTERN["ARCHITECTURAL PATTERN<br/>Custom MCP Server + comparable raw-SDK agent loop"]
    subgraph prompt ["PROMPT-BASED GUARDRAILS — guidance only (CLAUDE.md hierarchy)"]
        C1["~/.claude/CLAUDE.md<br/>global persona"]
        C2["./CLAUDE.md<br/>project rules"]
        C3["cases/incident_01/CLAUDE.md<br/>case objective + IOCs"]
    end

    TASK["Investigator task"] --> AGENT

    subgraph agentbox ["agent.py — raw Anthropic SDK tool-use loop"]
        AGENT["loop: sliding context · max-turns ceiling ·<br/>deterministic vs transient failure recovery"]
        ALLOW["typed tool allowlist<br/>(the tool list IS the allowlist)"]
        AGENT --> ALLOW
    end

    prompt -. concatenated into system prompt .-> AGENT

    MCP["eviltrace_mcp.py<br/>Custom MCP Server<br/>(3 read-only tools)"]

    subgraph arch ["ARCHITECTURAL GUARDRAILS — code-enforced"]
        DISK["tools/disk_tools.py<br/>structured tool implementations"]
        WRAP["bin/run_* wrappers + _common.sh<br/>read-only guard · line-numbered capture ·<br/>pre-LLM SHA-256 seal · JSONL audit log"]
        DISK --> WRAP
    end

    ALLOW --> DISK
    MCP -. shells out to .-> WRAP

    WRAP -->|invokes| SIFTT["SIFT forensic tools<br/>TSK · Zimmerman EZ Tools · Plaso · Volatility 3 · YARA"]
    SIFTT -->|read-only| EV[("evidence/ — rd01.E01<br/>READ-ONLY")]
    WRAP -->|writes only| OUT[("analysis/ · reports/<br/>tool_runs/ + execution_log.jsonl")]

    OUT --> VAL

    subgraph valbox ["validator.py — deterministic, never calls an LLM"]
        VAL["re-fetch cited line · re-hash vs pre-LLM seal ·<br/>tag GROUNDED / INFERRED / UNVERIFIED"]
    end

    VAL --> REP["case_report.md + accuracy_report.md<br/>(confidence-tagged)"]
    HOOK["hooks/stop_hook.py<br/>completion-promise verification"] -. gates completion .-> AGENT

    classDef patternLabel fill:none,stroke:none,font-weight:bold
    classDef promptLayer stroke-dasharray:6 4
    classDef archLayer stroke-width:3px
    class PATTERN patternLabel
    class prompt promptLayer
    class arch,valbox archLayer

Prompt-based guardrails (the CLAUDE.md hierarchy) shape how the agent reasons — persona, objectives, the no-mid-task-questions rule. They are advisory.

Architectural guardrails (the wrapper layer + the typed tool allowlist) decide what can physically execute. The agent has no raw shell. It can only call the tools on its allowlist; every tool shells through a bin/run_* wrapper that refuses to write inside the evidence tree, captures line-numbered output, and SHA-256-seals the raw bytes before any LLM sees them. This is the distinction judges look for, and EvilTrace enforces the safety-critical part in code, not in the prompt.
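To illustrate "the tool list is the allowlist", here is a minimal dispatch sketch. The tool names and bodies are placeholders invented for this example (the real implementations live in tools/disk_tools.py and shell out through bin/run_* wrappers); only the pattern is the point.

```python
from typing import Any, Callable, Dict


# Hypothetical stand-ins for structured tool implementations.
def list_files(image: str) -> str:
    return f"(would list files in {image} via a read-only wrapper)"


def parse_mft(mft_path: str) -> str:
    return f"(would parse {mft_path} via a read-only wrapper)"


# The registry doubles as the allowlist: nothing outside it can run.
TOOLS: Dict[str, Callable[..., str]] = {
    "list_files": list_files,
    "parse_mft": parse_mft,
}


def dispatch(name: str, arguments: Dict[str, Any]) -> str:
    """Run a tool requested by the model, or return an error string instead."""
    tool = TOOLS.get(name)
    if tool is None:
        # Nothing is executed; the model only sees a structured refusal.
        return f"ERROR: tool {name!r} is not on the allowlist"
    return tool(**arguments)
```

Because there is no generic "run this command" entry in the registry, a request such as `dispatch("bash", {"cmd": "..."})` has nothing to resolve to and is refused without executing anything.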

Security model and guardrails

  • Evidence is treated as read-only; wrappers refuse any output path inside the evidence / /mnt / /media trees and write only under analysis/, reports/, and exports/.
  • The LLM has no raw shell access. The exposed tool list is the allowlist.
  • All forensic execution flows through wrapper scripts that hash raw output before it is summarized or validated.
  • Findings require an exact artifact citation; the validator is deterministic and LLM-free.
  • The completion Stop hook verifies that promised artifacts actually exist before a run is accepted.
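The completion check in the last bullet can be pictured as a simple existence test over promised artifacts. This is a sketch only: the promise format used by hooks/stop_hook.py is not shown in this README, so the list of relative paths and the "non-empty file" rule below are assumptions.

```python
from pathlib import Path
from typing import Iterable, List


def missing_artifacts(promised: Iterable[str], case_dir: Path) -> List[str]:
    """Return promised paths (relative to case_dir) that are absent or empty."""
    missing = []
    for relative in promised:
        candidate = case_dir / relative
        # Assumption: an empty file does not count as a delivered artifact.
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(relative)
    return missing
```

A run would then be accepted only when `missing_artifacts(...)` returns an empty list.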

Bypass testing. The read-only guardrail (assert_safe_output in bin/_common.sh) is verified by tests/test_guardrail.sh, which probes it with five paths: a legitimate output path, a direct write into the evidence tree, a .. path-traversal escape, a protected-mount (/mnt) write, and an out-of-tree write. Only the legitimate path is allowed; every bypass attempt is refused. Because the guard canonicalizes with readlink -f before checking, the .. traversal collapses into the evidence tree and is caught. Captured run: tests/guardrail_selftest.txt.
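The actual guard is a shell function (assert_safe_output in bin/_common.sh). As a rough Python analogue of the same idea, the sketch below canonicalizes the path first and only then applies the checks; `is_safe_output` and its constants are hypothetical names, and the allowed and protected directories are taken from the bullets above.

```python
import os

ALLOWED_SUBDIRS = ("analysis", "reports", "exports")
PROTECTED_ROOTS = ("/mnt", "/media")


def _inside(path: str, root: str) -> bool:
    """True if `path` is `root` itself or lies beneath it."""
    return path == root or path.startswith(root.rstrip(os.sep) + os.sep)


def is_safe_output(path: str, case_dir: str) -> bool:
    """Allow writes only under the case's analysis/, reports/, or exports/ trees."""
    real = os.path.realpath(path)  # collapses ".." and symlinks, like readlink -f
    case = os.path.realpath(case_dir)
    if _inside(real, os.path.join(case, "evidence")):
        return False
    if any(_inside(real, os.path.realpath(root)) for root in PROTECTED_ROOTS):
        return False
    return any(_inside(real, os.path.join(case, sub)) for sub in ALLOWED_SUBDIRS)
```

Canonicalizing before checking is what defeats the traversal probe: a path like analysis/../evidence/x.csv resolves into the evidence tree and is refused there, rather than passing a naive prefix check on the string as typed.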

EvilTrace reduces hallucination and spoliation risk through architectural guardrails and deterministic validation. It is not designed to defend against a deliberately malicious model, and it does not replace write-blockers, verified evidence handling, or examiner judgment.

Try It Out Locally on SANS SIFT Workstation

EvilTrace runs locally on a SANS SIFT Workstation. There is no hosted deployment — it operates on local forensic evidence and local DFIR tooling.

Prerequisites

  • SANS SIFT Workstation (provides Sleuth Kit, Zimmerman EZ Tools via .NET, Plaso, Volatility 3, YARA)
  • Python 3.10+
  • An Anthropic API key
  • pip install anthropic --break-system-packages (add fastmcp only if you want to run the MCP server)

Run the disk investigation

git clone https://github.com/1hackerway/eviltrace.git
cd eviltrace

export ANTHROPIC_API_KEY="sk-ant-..."

# EvilTrace ships no evidence. Place (or symlink) the disk image:
mkdir -p cases/incident_01/evidence
# cases/incident_01/evidence/rd01.E01   <-- put the image here

# Autonomous investigation (extracts artifacts, parses them, records findings):
python3 agent.py incident_01

# Deterministic validation: tag every finding + emit the accuracy report:
python3 validator.py --write

Review the outputs

cases/incident_01/reports/case_report.md        # findings with confidence tags
cases/incident_01/analysis/findings.json        # structured findings + citations
cases/incident_01/reports/accuracy_report.md    # grounded / inferred / unverified rates
cases/incident_01/reports/execution_log.jsonl   # per-tool + per-turn audit trail

What to expect. The disk run extracts $MFT, Amcache.hve, and the SOFTWARE/SYSTEM hives, parses them, and builds a Plaso timeline — the timeline step alone can take 20–40 minutes. A representative run produces 7 disk findings (5 GROUNDED, 2 INFERRED, 0 UNVERIFIED). Because the agent is autonomous, exact findings can vary slightly run-to-run; the committed samples/ reflect a validated reference run.

Don't have the evidence, or want to skip the run? Pre-computed reference outputs are in samples/ — the disk leg (findings.json, 7 findings) and the memory leg (memory_findings.json, 6 findings) together make up the full incident_01 case: 13 findings — 11 GROUNDED, 2 INFERRED, 0 UNVERIFIED. A judge never needs the 3 GB memory image to verify the project.

Across both cases: 18 findings — 14 GROUNDED, 3 INFERRED, 1 UNVERIFIED. The honest-accuracy write-up — including why the single UNVERIFIED is a correct floor rather than a miss — is in samples/accuracy_self_assessment.md.

Generalization case — alihadi_01

A second, independent host shows the pipeline generalizes beyond the SHIELDBASE case: the Ali Hadi public Web Server case (Windows Server 2008 / XAMPP, a partitioned disk with NTFS at sector 2048 plus a memory image), run with no answer key in the briefing. It produced 5 findings — 3 GROUNDED / 1 INFERRED / 1 UNVERIFIED. The lone UNVERIFIED is a true-but-not-mechanically-certifiable directory-creation finding: the event happened, but a low-entropy directory name is not a hard anchor the deterministic validator can ground against a single line, so it floors honestly rather than over-certifying (see samples/accuracy_self_assessment.md §3). Reference outputs are in samples/alihadi_01/.

Optional — the MCP server (Custom MCP Server pattern)

pip install fastmcp --break-system-packages
claude mcp add eviltrace-forensics python3 eviltrace_mcp.py
claude mcp list      # should show eviltrace-forensics ✓ Connected

The registration is written to your local Claude config, not to this repo, so it must be re-run on a fresh clone.

Verify every number yourself

All confidence counts in the demo come from committed files in this repo. Clone it and run these three commands — no tools or evidence images required:

# Combined scoreboard (per-case + total)
sed -n '/## 2. Results summary/,/Combined/p' samples/accuracy_self_assessment.md
# Per-case scoreboards (independent backing for the sum)
grep -niE 'scoreboard' samples/case_report.md              # incident_01 → 11 / 2 / 0
grep -niE 'scoreboard' samples/alihadi_01/case_report.md   # alihadi_01  →  3 / 1 / 1

Expected: incident_01 (13 = 11/2/0) + alihadi_01 (5 = 3/1/1) = 18 findings — 14 GROUNDED / 3 INFERRED / 1 UNVERIFIED.
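The sum can also be checked mechanically. The per-case counts below are copied from the figures stated in this README; the snippet only adds them up.

```python
TAGS = ("GROUNDED", "INFERRED", "UNVERIFIED")

# Per-case counts as reported above, in TAGS order.
per_case = {
    "incident_01": (11, 2, 0),
    "alihadi_01": (3, 1, 1),
}

# Add the cases column by column.
combined = tuple(sum(column) for column in zip(*per_case.values()))

print(dict(zip(TAGS, combined)), "total:", sum(combined))
# {'GROUNDED': 14, 'INFERRED': 3, 'UNVERIFIED': 1} total: 18
```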

Outputs

Live runs write under cases/incident_01/. The analysis/ outputs (extracted artifacts, parsed CSVs) are gitignored; the reports/ outputs are tracked. For one-stop review, samples/ bundles a copy of every submission-facing artifact (disk + memory).

| Output | Live-run path | Tracked copy |
|---|---|---|
| Disk findings (structured) | cases/incident_01/analysis/findings.json | samples/findings.json |
| Memory findings (structured) | cases/incident_01/analysis/memory_findings.json | samples/memory_findings.json |
| Case report (disk) | cases/incident_01/reports/case_report.md | samples/case_report.md |
| Accuracy report (disk) | cases/incident_01/reports/accuracy_report.md | samples/accuracy_report.md |
| Accuracy report (memory) | cases/incident_01/reports/accuracy_report_memory.md | samples/accuracy_report_memory.md |
| Execution log | cases/incident_01/reports/execution_log.jsonl | samples/execution_log.jsonl |
| Self-correction excerpt | (extracted from the execution log) | samples/self_correction_excerpt.jsonl |
| Raw tool outputs | cases/incident_01/analysis/tool_runs/ | (excluded, large) |

Demo-video run provenance. The demo video's live-run footage is a separate cold run on a scratch copy of the incident_01 case. Its complete audit artifacts — execution log (including the on-camera self-corrections), findings, session promises, and rendered case report — are committed verbatim in samples/video_demo_run/, so every frame of terminal output in the video traces to a logged tool execution. Canonical submitted results remain samples/ (incident_01) and samples/alihadi_01/.

Accuracy and validation method

EvilTrace measures two distinct things and keeps them separate:

  1. Grounding / inference-constraint accuracy (self-contained — this is the differentiator). For every finding, does the cited line exist, does it match the pre-LLM hash, and does it actually support the claim? This needs no external answer key — it is verifiable from the repo alone.
  2. Investigative recall (needs an external answer key). Did the agent find everything a human would? This is not yet externally benchmarked, and the accuracy report says so plainly.

A real excerpt of validator.py output (trimmed):

[F001] ✔ GROUNDED    mft_output.csv:L219674
  files matched : procdump.exe
  sizes matched : 515776 bytes
  path corrob.  : tdungan, dashlane, procdump.exe
  integrity     : VERIFIED  (input_seal=OK, csv_seal=SEALED_OK)

[F004] ~ INFERRED    SOFTWARE_recmd.csv:L31
  files matched : msascuil.exe
  MISSING       : file:vmtoolsd.exe  -> not GROUNDED
  integrity     : VERIFIED  (input_seal=OK, csv_seal=SEALED_OK)

F004 is the moat in one frame: the agent's claim referenced a file the cited line did not contain, so the validator downgraded it from GROUNDED to INFERRED automatically — no human cross-examination required.
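The downgrade shown for F004 can be approximated with a literal containment check. This is a simplified sketch, not the rules in validator.py: `assign_tag` is a hypothetical function, matching is a case-insensitive substring test, and treating "no claimed token found" as UNVERIFIED is an assumption.

```python
from typing import Iterable, Optional


def assign_tag(cited_line: Optional[str], claimed_tokens: Iterable[str]) -> str:
    """Tag a finding from what the cited line literally contains."""
    tokens = [token.lower() for token in claimed_tokens]
    if cited_line is None or not tokens:
        return "UNVERIFIED"  # nothing to check against
    haystack = cited_line.lower()
    missing = [token for token in tokens if token not in haystack]
    if not missing:
        return "GROUNDED"  # every claimed item appears on the cited line
    if len(missing) < len(tokens):
        return "INFERRED"  # partly backed: cannot be certified as literal
    return "UNVERIFIED"  # the line backs none of the claim
```

With a line that mentions msascuil.exe but not vmtoolsd.exe, a claim naming both files comes back INFERRED, which mirrors the F004 outcome above.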

The reference run also correctly rejected look-alikes rather than over-reporting — e.g. LogUploader.dll and Qt5*.dll (legitimate OneDrive/Dashlane components), csscan.exe (McAfee), remsh.exe (Windows rempl), and Office-installer binaries — and recorded zero hallucinated findings. Documented IOCs that live only in the memory image (e.g. STUN.exe, msedge.exe, 172.15.1.20) were correctly absent from the disk findings rather than fabricated.

Audit trail and evidence integrity

The audit trail is not a narrative summary — it is structured evidence of what the agent and tools actually did. Any finding traces back through a single chain:

finding → artifact citation (file:LINE) → tool_runs/ raw output → pre-LLM SHA-256 seal in execution_log.jsonl
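The last link in this chain, the pre-LLM seal, could be produced along the following lines. The real wrappers are shell scripts (bin/run_* with _common.sh); this Python sketch only illustrates the ordering, and the record keys (`type`, `raw_output`, `sha256`, and so on) are assumed names rather than the actual log schema.

```python
import hashlib
import json
import subprocess
import time
from pathlib import Path


def run_and_seal(command: list, out_path: Path, log_path: Path) -> dict:
    """Run a tool, hash its raw stdout, then persist output and an audit record."""
    proc = subprocess.run(command, capture_output=True)
    # Seal the raw bytes before anything downstream (including an LLM) reads them.
    seal = hashlib.sha256(proc.stdout).hexdigest()
    out_path.write_bytes(proc.stdout)
    record = {
        "type": "tool_run",
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "command": command,
        "returncode": proc.returncode,
        "raw_output": str(out_path),
        "sha256": seal,
    }
    # One JSON object per line, appended, so earlier records are never rewritten.
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record
```

Hashing the in-memory bytes and writing the same bytes to disk means a later re-hash of the file reproduces the seal exactly, which is what lets a validator detect any change made after capture.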

execution_log.jsonl carries three record types:

  • tool runs — timestamp, tool, command, return code, raw-output path, pre-LLM SHA-256 (and a csv_seal for parsed CSVs),
  • self_correction — when a deterministic tool error occurs, the decision is logged, the structured error is returned to the model, and the model reroutes (it is not a blind retry),
  • llm_turn — per-turn timestamp, model, stop reason, and input/output token usage.
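A reviewer could summarize such a log with a few lines of Python. The key names used here (`type`, `usage`, `input_tokens`, `output_tokens`) are assumptions for illustration; check them against samples/execution_log.jsonl before relying on this sketch.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_log(lines: Iterable[str]) -> dict:
    """Count records by type and total the token usage of llm_turn records."""
    kinds = Counter()
    tokens_in = tokens_out = 0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        record = json.loads(raw)
        kind = record.get("type", "unknown")
        kinds[kind] += 1
        if kind == "llm_turn":
            usage = record.get("usage", {})
            tokens_in += usage.get("input_tokens", 0)
            tokens_out += usage.get("output_tokens", 0)
    return {"records": dict(kinds), "input_tokens": tokens_in, "output_tokens": tokens_out}
```

Usage would be along the lines of `summarize_log(open("samples/execution_log.jsonl", encoding="utf-8"))`.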

The validator's integrity verdicts describe exactly how far the chain can be proven:

| Verdict | Meaning |
|---|---|
| VERIFIED | Input artifact seal and parsed-CSV seal both match |
| INPUT_VERIFIED | Input artifact seal matches; the parser did not emit a CSV seal |
| CSV_VERIFIED | Parsed output is byte-stable since capture; producer chain not regex-traceable |
| RAW_VERIFIED | Raw tool-output seal matches (used for memory/Volatility findings) |
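One way to read the verdict meanings above is as a mapping from individual seal checks to the strongest verdict they support. The sketch below is an interpretation, not the logic in validator.py: `integrity_verdict` is a hypothetical function, each argument is True (seal matches), False (mismatch), or None (no seal recorded), and returning None on any mismatch is an assumption.

```python
from typing import Optional


def integrity_verdict(
    input_seal: Optional[bool],
    csv_seal: Optional[bool],
    raw_seal: Optional[bool] = None,
) -> Optional[str]:
    """Return the strongest verdict the available seals support, else None."""
    if False in (input_seal, csv_seal, raw_seal):
        return None  # any mismatch: nothing is certified
    if input_seal and csv_seal:
        return "VERIFIED"
    if input_seal:
        return "INPUT_VERIFIED"  # parser emitted no CSV seal
    if csv_seal:
        return "CSV_VERIFIED"  # parsed output stable, producer chain not traced
    if raw_seal:
        return "RAW_VERIFIED"  # e.g. memory/Volatility raw output
    return None
```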

Dataset documentation

Full provenance is in datasets/README.md. In brief: Operation SHIELDBASE, host base-rd-01, disk image rd01.E01 (EWF, single-volume NTFS, image MD5 391be74b6830344eace7272f697cf1ae). Evidence files are not included in this repository.

Submission compliance

| Requirement | Location |
|---|---|
| Public GitHub repository | this repo |
| Open-source license (MIT) | LICENSE |
| Demo video | (Devpost submission link) |
| Architecture diagram (prompt-based vs architectural guardrails) | Architecture above |
| Dataset documentation | datasets/README.md |
| Accuracy report | samples/accuracy_report.md (disk) + samples/accuracy_report_memory.md (memory) |
| Try-it-out instructions | Try It Out Locally above |
| Agent execution logs (timestamps + token usage) | samples/execution_log.jsonl |

Known limitations

  • The primary investigation agent is a raw Anthropic SDK loop. eviltrace_mcp.py is a separate Custom MCP Server interface that demonstrates the MCP pattern over the same wrapper layer; it is not on the agent's runtime path.
  • The pipeline is tuned to the reference image (EWF, single-volume NTFS). Other image formats may require adjustment.
  • Grounding accuracy is self-contained and verifiable; investigative recall is not yet externally benchmarked against an independent answer key.
  • Some findings are intentionally INFERRED when a single cited line does not prove the full claim — this is by design, not a defect.
  • Local SIFT tool paths can vary between workstation builds.
  • Evidence files are not included in the public repository.

Responsible use and legal

EvilTrace is a research prototype. The AI is a tool to be used by trained incident-response professionals; responsibility for the accuracy and completeness of findings remains with the human examiner. Use only on systems and data you are authorized to analyze. This software is provided "as is", without warranty of any kind — see LICENSE.

SIFT Workstation is a product of the SANS Institute.

Acknowledgments

Built for the SANS Find Evil! AI Hackathon (2026) by Anand Kumar. EvilTrace extends the Protocol SIFT reference architecture (Rob T. Lee, SANS). Its hallucination-combat design is informed by Dr. Brian Carrier's work on AI verification in DFIR (cybertriage.com). Implementation was done with assistance from Claude Code (Anthropic). The evaluation dataset is the SANS Operation SHIELDBASE scenario.

License

MIT — see LICENSE.
