mulder

Mulder takes a directory of forensic evidence (disk images, memory dumps, PCAPs, event logs) and runs a five-phase autonomous investigation with hard quality gates between phases. It produces structured incident reports with MITRE ATT&CK mappings, IOC exports, and a full audit trail. An adversarial "Alternative Narrative" phase challenges every finding before the report is generated. All tool invocations go through typed MCP interfaces, never through a shell. Every evidence citation is validated at the API boundary against an append-only audit log, so findings with fabricated evidence citations are structurally impossible to submit.

Results

Four autonomous investigations were run against real forensic datasets; the results below are unmodified from tool output. Each case has an interactive HTML report on GitHub Pages (sidebar navigation, dark/light theme, audit trail). See the examples index for all report links.

Case	Systems	Evidence	Sources	Tool Calls	Findings	Runtime	Tokens	Report
Rocba	1	~8 GB	67	292	7 (1 high)	66 min	313K	HTML
SRL-2015	4	~30 GB	159	610	29 (4 crit, 9 high)	126 min	300K	HTML
SRL-2018	11	~120 GB	457	1,508	55 (11 crit, 19 high)	336 min	698K	HTML
NIST Data Leakage	4	~8 GB	88	723	33 (15 high)	102 min	330K	HTML

The NIST Data Leakage case has a detailed accuracy report validated against published NIST ground truth: 60% full match, 90% detection rate, 5% false positive rate. The single false positive involved incorrect causal attribution (blaming CCleaner for artifact destruction when the answer key confirms it was launched and closed without action).

How It Works

Each investigation runs through five phases with quality gates between them. Phases 2-4 use a plan-and-execute pipeline with three specialized roles (planner, executor, analyst) that can optionally be assigned to different models for cost optimization.

1. Catalog - scan evidence directory, classify file types, identify distinct systems
2. Extraction - run applicable forensic tools per system, index results into FTS5 database
3. Cross-System Analysis - correlate events across systems, map MITRE ATT&CK techniques, deduplicate findings
4. Alternative Narrative - challenge the primary narrative with counter-evidence, test alternative hypotheses, audit for tool and evidence coverage gaps
5. Report - write the investigation narrative, generate Markdown/HTML reports, export IOCs and ATT&CK Navigator layers

Each gate validates structural criteria (minimum sources indexed, findings submitted, MITRE mappings present, audit tools invoked). Failed gates trigger retries with escalating turn budgets and gap-specific remediation instructions. See Architecture for the full pipeline design.
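The gate-and-retry behavior described above can be illustrated with a minimal sketch. This is not Mulder's implementation: `GateResult`, `extraction_gate`, `run_phase_with_gate`, the doubling budget schedule, and the state keys are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GateResult:
    passed: bool
    gaps: list[str]  # descriptions of unmet structural criteria


def extraction_gate(state: dict, min_sources: int = 1) -> GateResult:
    """Structural check only: it counts things, it does not judge quality."""
    gaps = []
    if state.get("sources_indexed", 0) < min_sources:
        gaps.append(f"index at least {min_sources} evidence source(s)")
    if state.get("findings_submitted", 0) == 0:
        gaps.append("submit at least one finding")
    return GateResult(passed=not gaps, gaps=gaps)


def run_phase_with_gate(
    run_phase: Callable[[dict, int, list[str]], dict],
    gate: Callable[[dict], GateResult],
    state: dict,
    base_turns: int = 20,
    max_attempts: int = 3,
) -> tuple[dict, bool]:
    """Run a phase, retrying with a larger turn budget while the gate fails."""
    remediation: list[str] = []
    for attempt in range(max_attempts):
        turn_budget = base_turns * (2 ** attempt)  # assumed schedule: 20, 40, 80
        state = run_phase(state, turn_budget, remediation)
        result = gate(state)
        if result.passed:
            return state, True
        remediation = result.gaps  # gap-specific instructions for the next attempt
    return state, False


def fake_phase(state: dict, turn_budget: int, remediation: list[str]) -> dict:
    """Stand-in for an agent run: it only makes progress with 40+ turns."""
    new_state = dict(state)
    new_state["budgets"] = state.get("budgets", []) + [turn_budget]
    if turn_budget >= 40:
        new_state["sources_indexed"] = 5
        new_state["findings_submitted"] = 2
    return new_state


final_state, ok = run_phase_with_gate(fake_phase, extraction_gate, {})
```

In this toy run the first attempt fails the gate and the second passes with a doubled budget. Which criteria each real gate checks, and how budgets escalate, is covered in Architecture.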

Key Design Decisions

No shell access. Every tool invocation goes through one of 140+ typed MCP operations with validated parameters. The agent never gets a shell. Every action is auditable and every parameter is constrained to its declared type.
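To show why typed parameters remove the shell-injection surface, here is a small sketch. The `HashFileParams` schema and the `build_argv` and `run_tool` helpers are hypothetical and are not taken from Mulder's Tool Manifest; the example wraps the coreutils `sha256sum` family only because it is widely available.

```python
import subprocess
from dataclasses import dataclass
from pathlib import Path

ALLOWED_ALGORITHMS = {"md5", "sha1", "sha256"}


@dataclass(frozen=True)
class HashFileParams:
    """Hypothetical parameter schema for a 'hash a file' operation."""
    path: Path
    algorithm: str = "sha256"

    def __post_init__(self):
        if not isinstance(self.path, Path):
            raise TypeError("path must be a pathlib.Path")
        if self.algorithm not in ALLOWED_ALGORITHMS:
            raise ValueError(f"unsupported algorithm: {self.algorithm!r}")


def build_argv(params: HashFileParams) -> list[str]:
    # Each value becomes exactly one argv element; no shell ever parses it.
    return [f"{params.algorithm}sum", "--", str(params.path)]


def run_tool(params: HashFileParams) -> subprocess.CompletedProcess:
    # shell=False (the default) executes the binary directly with the argv list.
    return subprocess.run(
        build_argv(params), shell=False, capture_output=True, text=True, check=False
    )


# A path containing shell metacharacters stays a single, inert argument.
argv = build_argv(HashFileParams(Path("disk.dd; echo pwned")))
```

Because the algorithm is checked against a fixed set and the path is passed as one argv element, a value such as `disk.dd; echo pwned` is treated as a (nonexistent) filename rather than as two commands.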

Anti-hallucination at the API boundary. Every finding must cite evidence_refs that are real tool_call_id values from the append-only audit log. The MCP server validates these references at submission time and rejects findings that cite nonexistent tool calls. Timestamps are validated as ISO-8601 and auto-nullified when they appear fabricated. This is enforced architecturally, not by prompting.
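A rough sketch of what submission-time validation could look like follows. The field names `evidence_refs` and `tool_call_id` come from the description above; everything else (`validate_finding`, `FindingRejected`, the `timestamp` key) is assumed. The sketch only nullifies timestamps that fail to parse as ISO-8601; whatever additional heuristics Mulder uses to detect fabricated timestamps are not described here and are not modeled.

```python
from datetime import datetime


class FindingRejected(Exception):
    """Raised when a finding fails structural validation."""


def validate_finding(finding: dict, known_tool_call_ids: set[str]) -> dict:
    """Reject findings whose evidence_refs are not in the audit log."""
    refs = finding.get("evidence_refs", [])
    if not refs:
        raise FindingRejected("finding cites no evidence")
    unknown = [ref for ref in refs if ref not in known_tool_call_ids]
    if unknown:
        raise FindingRejected(f"unknown tool_call_id(s): {unknown}")

    cleaned = dict(finding)
    timestamp = cleaned.get("timestamp")
    if timestamp is not None:
        try:
            # Note: before Python 3.11, fromisoformat() does not accept a "Z" suffix.
            datetime.fromisoformat(timestamp)
        except (TypeError, ValueError):
            cleaned["timestamp"] = None  # nullify rather than keep a bad value
    return cleaned


audit_log_ids = {"tc-001", "tc-002"}  # IDs that would be read from the audit log

accepted = validate_finding(
    {"title": "RDP logon", "evidence_refs": ["tc-001"],
     "timestamp": "2015-03-15T10:00:00+00:00"},
    audit_log_ids,
)
nullified = validate_finding(
    {"title": "USB insert", "evidence_refs": ["tc-002"],
     "timestamp": "March 15th-ish"},
    audit_log_ids,
)
```

The point of the design is that the check runs in the server, so a model cannot talk its way past it: a finding that cites `tc-999` raises `FindingRejected` no matter how it is worded.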

Adversarial self-review. Phase 4 explicitly challenges the primary narrative before report generation. It formulates counter-hypotheses, searches for disconfirming evidence, and runs coverage audits to identify which tools were applicable but never invoked and which evidence sources were indexed but never cited.
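The two coverage audits mentioned above reduce to set differences. The sketch below assumes the caller already has the four collections; `coverage_gaps` and its argument names are illustrative, not Mulder's API.

```python
def coverage_gaps(
    applicable_tools: set[str],
    invoked_tools: set[str],
    indexed_sources: set[str],
    cited_sources: set[str],
) -> dict[str, list[str]]:
    """Report tools that could have run but did not, and evidence nobody cited."""
    return {
        "tools_never_invoked": sorted(applicable_tools - invoked_tools),
        "sources_never_cited": sorted(indexed_sources - cited_sources),
    }


gaps = coverage_gaps(
    applicable_tools={"hayabusa", "regripper", "volatility3"},
    invoked_tools={"hayabusa", "volatility3"},
    indexed_sources={"host1/Security.evtx", "host1/memory.raw", "host2/NTUSER.DAT"},
    cited_sources={"host1/Security.evtx"},
)
```

Here the audit would flag `regripper` as applicable but unused, and two indexed sources as never cited in any finding.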

Token efficiency. The SRL-2018 investigation (11 systems, ~120 GB of evidence, 1,508 tool calls over 336 minutes) consumed 698K tokens. For cost optimization, the three pipeline roles (planner, executor, analyst) can be assigned to different models, routing mechanical tool-calling to a cheaper model while preserving reasoning quality for analysis.
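As an illustration of per-role model routing, the sketch below resolves a model for each of the three roles from a default plus optional overrides. The role names come from the text; `RoleModels`, `resolve_models`, and the model names `large-model` and `small-model` are placeholders, and the real configuration mechanism is documented in the Usage Guide.

```python
from dataclasses import dataclass
from typing import Optional

ROLES = ("planner", "executor", "analyst")


@dataclass(frozen=True)
class RoleModels:
    planner: str
    executor: str
    analyst: str


def resolve_models(
    default_model: str, overrides: Optional[dict[str, str]] = None
) -> RoleModels:
    """Use the default model for every role unless an override names another."""
    overrides = overrides or {}
    unknown = set(overrides) - set(ROLES)
    if unknown:
        raise ValueError(f"unknown role(s): {sorted(unknown)}")
    return RoleModels(**{role: overrides.get(role, default_model) for role in ROLES})


# Route only the mechanical tool-calling role to a cheaper (placeholder) model.
models = resolve_models("large-model", {"executor": "small-model"})
```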

Quick Start

docker pull ghcr.io/calebevans/mulder:1.3

mkdir -p ~/mulder-cases

docker run -it --privileged \
  -v /path/to/evidence:/evidence:ro \
  -v ~/mulder-cases:/home/mulder/.mulder/cases \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  ghcr.io/calebevans/mulder:1.3

# Inside the container, where the evidence is mounted at /evidence:
mulder investigate /evidence my-case-id

For Vertex AI, Amazon Bedrock, non-Anthropic models via LiteLLM, and full CLI options, see the Usage Guide.

Case Briefing (Optional)

Drop a MULDER.md file in your evidence directory to provide case context:

## What We Know
- The network was breached on March 15
- Suspect account: jsmith

## What We're Looking For
- How did the attacker gain initial access?
- Was data exfiltrated?

The briefing is injected into every investigation phase, guiding tool selection, analysis focus, and report framing. See the Usage Guide for details.
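One plausible way to picture the injection is a helper that loads `MULDER.md` once and prepends it to each phase's instructions. The file name comes from the text above; `load_briefing`, `build_phase_prompt`, and the prompt layout are assumptions made for this sketch and may differ from how Mulder actually assembles prompts.

```python
from pathlib import Path
from typing import Optional


def load_briefing(evidence_dir: Path) -> Optional[str]:
    """Return the briefing text, or None if MULDER.md is absent or empty."""
    briefing_path = evidence_dir / "MULDER.md"
    if not briefing_path.is_file():
        return None
    text = briefing_path.read_text(encoding="utf-8").strip()
    return text or None


def build_phase_prompt(phase_instructions: str, briefing: Optional[str]) -> str:
    """Prepend the case briefing (when present) to a phase's instructions."""
    if briefing is None:
        return phase_instructions
    return (
        f"# Case briefing\n{briefing}\n\n"
        f"# Phase instructions\n{phase_instructions}"
    )
```

Because the briefing is optional, a missing file simply leaves each phase prompt unchanged.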

Forensic Tools

Mulder integrates 35+ open-source forensic tools exposed as 140+ typed MCP operations:

Category	Tools
Memory	Volatility 3 (14 plugins)
Disk	Sleuthkit, Plaso, foremost, PhotoRec, Scalpel
Windows artifacts	EZ Tools (Prefetch, Amcache, ShimCache, MFT, USN Journal, Jump Lists, Shellbags, SRUM), RegRipper, Hayabusa (3,700+ Sigma rules), Chainsaw
Event logs	python-evtx, Zircolite
Network	tshark, Zeek, Suricata, tcpflow, tcpxtract
Malware	YARA, CAPA, FLOSS, Detect-It-Easy, ClamAV, radare2
Documents	oletools, PDF tools, pst-utils
Mobile	ALEAPP, iLEAPP, MVT
Other	bulk_extractor, binwalk, ExifTool, ssdeep, hashdeep, steghide, Hindsight

Full API reference: Tool Manifest

Output

Each investigation produces:

Markdown and HTML reports - executive summary, attack timeline, findings with MITRE ATT&CK mappings, IOC tables, and audit trail (example HTML reports)
Per-case SQLite database - FTS5 full-text search across all indexed evidence
Append-only audit log - JSONL recording every tool invocation with BLAKE2b output hashes
Optional exports - STIX 2.1 IOC bundle, CSV IOC list, and MITRE ATT&CK Navigator layer via mulder export-iocs and mulder export-navigator
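To make the audit-log format concrete, here is a minimal sketch of writing and checking JSONL records with BLAKE2b output hashes using Python's standard library. The record fields other than the hash (`tool_call_id`, `tool`, `params`) and both function names are assumptions; the actual schema is not specified in this README. Opening the file in append mode only prevents this code from rewriting earlier lines; it is not by itself a tamper-proofing mechanism.

```python
import hashlib
import json
from pathlib import Path


def append_audit_record(
    log_path: Path, tool_call_id: str, tool: str, params: dict, output: bytes
) -> dict:
    """Append one JSON line describing a tool invocation and its output hash."""
    record = {
        "tool_call_id": tool_call_id,
        "tool": tool,
        "params": params,
        "output_blake2b": hashlib.blake2b(output).hexdigest(),
    }
    # Mode "a" always writes at the end of the file.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def verify_output(log_path: Path, tool_call_id: str, output: bytes) -> bool:
    """Check that stored output still matches the hash recorded at call time."""
    digest = hashlib.blake2b(output).hexdigest()
    with log_path.open(encoding="utf-8") as handle:
        for line in handle:
            record = json.loads(line)
            if record["tool_call_id"] == tool_call_id:
                return record["output_blake2b"] == digest
    return False  # unknown tool_call_id
```

Hashing the output at invocation time means a later reader can confirm that the evidence a finding cites is byte-for-byte what the tool produced.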

Documentation

Document	Description
Usage Guide	Installation, providers, CLI reference, Docker configuration
Architecture	System design, pipeline phases, quality gates, data flow
Tool Manifest	API reference for all MCP tools
Adding Tools	Contributor guide for adding new forensic tools
Glossary	Terminology and definitions

License

Apache-2.0
