A production-ready malware analysis system that safely collects and analyzes malware behavior in an isolated VirtualBox sandbox. It captures system activity with Procmon and network traffic with tshark, then automatically extracts Indicators of Compromise (IOCs).
- ✅ Automated malware execution in isolated VirtualBox sandbox
- ✅ VirtualBox snapshot restore after each analysis
- ✅ Procmon + tshark concurrent data collection
- ✅ IOC extraction (IPs, domains, registry keys, file paths, hashes)
- ✅ Behavior classification (Ransomware, Trojan, Botnet, Worm, etc.)
- ✅ Structured data parsing (CSV from Procmon, PCAP from tshark)
- ✅ Command-line interface with flexible options
- ✅ Comprehensive error handling and safety features
# 1. Install dependencies
pip install -r requirements.txt
# 2. Enable sandbox (once VM is configured)
# Edit config.yaml: sandbox.enabled = true
# 3. Collect malware data
python collect_malware.py --sample samples/test.exe
# Analyze a sample
python collect_malware.py --sample samples/malware.exe
# Outputs to: logs/ (procmon.csv, capture.pcap)
- 8GB+ RAM (16GB recommended)
- 20GB+ free disk (VM snapshots + logs)
- Windows 10+ host OS
- Python 3.8+
- VirtualBox 6.1+
- Wireshark (for tshark)
- Windows 7/10/11 (guest VM)
- Process Monitor (Procmon.exe)
pip install -r requirements.txt
# Includes: pyyaml, pandas, scapy, reportlab, pyshark
python -m venv venv
# Windows PowerShell:
.\venv\Scripts\Activate.ps1
# Windows CMD:
venv\Scripts\activate.bat
pip install --upgrade pip
pip install -r requirements.txt
- Open VirtualBox
- Create Windows VM:
  - Name: analysis_vm (match config.yaml)
  - RAM: 4GB+
  - Disk: 50GB (dynamic)
- Disable:
  - Network (or use FakeNet-NG)
  - Shared folders
  - Clipboard/Drag-drop
- Install Procmon.exe on guest
- Take snapshot: Name it clean
mkdir samples logs
Execute malware and collect system/network activity:
python collect_malware.py --sample samples/test.exe
python collect_malware.py --sample samples/malware.exe --config custom_config.yaml
When you run python collect_malware.py --sample samples/malware.exe:
- ✅ Load Configuration → reads config.yaml
- ✅ Create Directories → ensures logs/ exists
- ✅ Snapshot Restore → reverts VM to clean state
- ✅ Start VM → powers on analysis_vm
- ✅ Start Collectors → launches Procmon + tshark
- ✅ Copy Sample → transfers sample to guest VM
- ✅ Execute Sample → runs binary inside VM for execution_timeout seconds
- ✅ Stop Collectors → terminates Procmon + tshark
- ✅ Parse Logs → converts PML → CSV, reads PCAP
- ✅ Extract IOCs → finds IPs, domains, registry keys, file paths
- ✅ Cleanup → stops VM, restores snapshot

Outputs:
- logs/procmon.csv - System activity (file, registry, process operations)
- logs/capture.pcap - Network traffic (packets)
- Console output showing collected IOCs and behavior
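
The step sequence above depends on cleanup running even when an earlier step fails. A minimal sketch of that ordering, assuming each step is a plain callable: `run_pipeline`, `Step`, and the placeholder actions are hypothetical names for illustration, not the actual contents of collect_malware.py.

```python
from typing import Callable, List, Tuple

# Each step is a (name, action) pair; the actions used here are placeholders.
Step = Tuple[str, Callable[[], None]]


def run_pipeline(steps: List[Step], cleanup: Callable[[], None]) -> List[str]:
    """Run steps in order and always call cleanup, even if a step raises."""
    completed: List[str] = []
    try:
        for name, action in steps:
            action()
            completed.append(name)
    finally:
        # Mirrors the Cleanup step: stop the VM and restore the snapshot.
        cleanup()
    return completed


if __name__ == "__main__":
    trace: List[str] = []
    demo_steps: List[Step] = [
        ("restore_snapshot", lambda: trace.append("restore_snapshot")),
        ("start_vm", lambda: trace.append("start_vm")),
        ("start_collectors", lambda: trace.append("start_collectors")),
        ("execute_sample", lambda: trace.append("execute_sample")),
        ("stop_collectors", lambda: trace.append("stop_collectors")),
    ]
    run_pipeline(demo_steps, cleanup=lambda: trace.append("cleanup"))
    print(trace)
```

Putting cleanup in a `finally` block means a failed copy or a crashed collector still leaves the VM reverted, at the cost of the original exception propagating to the caller after cleanup has run.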
sandbox:
  vm_name: "analysis_vm" # VirtualBox VM name
  snapshot: "clean" # Snapshot to restore
  vbox_path: "C:\\Program Files\\Oracle\\VirtualBox\\VBoxManage.exe"
  enabled: false # Set true when VM ready
  vm_user: "Administrator"
  vm_password: "Password123!"
  guest_sample_path: "C:/Windows/Temp/sample.exe"
  procmon_export_timeout: 120 # Max seconds to wait for CSV export
tools:
  procmon_path: "tools/Procmon.exe"
  tshark_path: "tshark" # Must be in PATH
  execution_timeout: 300 # Malware execution timeout (seconds)
paths:
  sample_dir: "samples/"
  logs_dir: "logs/"
analysis:
  network_simulation: false # Use FakeNet-NG simulation
VirtualBox VM
│
├─ Procmon.exe ──→ procmon.pml ──→ procmon.csv (system activity)
│
└─ Network ──→ tshark ──→ capture.pcap (network packets)
    │                              │
    └──────────────────┬───────────┘
                       │
              parser.py (parsing)
                       │
          ┌────────────┼────────────┐
          │            │            │
    ioc_extractor   analyzer (console output)
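
To illustrate the parsing stage in the flow above, here is a small sketch that counts operations per process in a Procmon CSV export. It assumes Procmon's default column names ("Process Name", "Operation", and so on); `summarize_operations` and the inline sample rows are hypothetical and are not taken from parser.py.

```python
import csv
import io
from collections import Counter

# Tiny stand-in for logs/procmon.csv, using Procmon's default column names.
SAMPLE_CSV = "\n".join([
    '"Time of Day","Process Name","PID","Operation","Path","Result","Detail"',
    '"10:00:01.1","sample.exe","4242","CreateFile","C:\\Users\\lab\\a.txt","SUCCESS",""',
    '"10:00:01.2","sample.exe","4242","WriteFile","C:\\Users\\lab\\a.txt","SUCCESS",""',
    '"10:00:01.3","sample.exe","4242","WriteFile","C:\\Users\\lab\\b.txt","SUCCESS",""',
    '"10:00:01.4","explorer.exe","1000","RegQueryValue","HKCU\\Software\\Demo","SUCCESS",""',
])


def summarize_operations(handle, process_name):
    """Count Procmon operations recorded for a single process name."""
    counts = Counter()
    for row in csv.DictReader(handle):
        if (row.get("Process Name") or "").lower() == process_name.lower():
            counts[row.get("Operation") or ""] += 1
    return counts


if __name__ == "__main__":
    # For a real export, open the file instead, for example:
    #   open("logs/procmon.csv", newline="", encoding="utf-8-sig")
    # (the encoding is an assumption; adjust it if parsing fails).
    print(summarize_operations(io.StringIO(SAMPLE_CSV), "sample.exe"))
```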
├─ sandbox.py → VM lifecycle (restore/start/stop)
├─ execution.py → Malware execution with timeout
├─ collector.py → Procmon + tshark management
├─ parser.py → CSV/PCAP parsing
├─ ioc_extractor.py → IOC extraction (IP, domain, file, registry)
└─ analyzer.py → Behavior classification
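
The VM lifecycle handled by sandbox.py (restore/start/stop) maps onto three VBoxManage subcommands. The sketch below only builds the argument lists; the helper names are hypothetical, and how sandbox.py actually invokes VBoxManage is not shown in this README.

```python
from typing import List


def restore_snapshot_cmd(vbox_path: str, vm_name: str, snapshot: str) -> List[str]:
    """VBoxManage snapshot <vm> restore <snapshot>"""
    return [vbox_path, "snapshot", vm_name, "restore", snapshot]


def start_vm_cmd(vbox_path: str, vm_name: str, headless: bool = True) -> List[str]:
    """VBoxManage startvm <vm> --type headless|gui"""
    return [vbox_path, "startvm", vm_name, "--type", "headless" if headless else "gui"]


def poweroff_vm_cmd(vbox_path: str, vm_name: str) -> List[str]:
    """VBoxManage controlvm <vm> poweroff"""
    return [vbox_path, "controlvm", vm_name, "poweroff"]


if __name__ == "__main__":
    vbox = r"C:\Program Files\Oracle\VirtualBox\VBoxManage.exe"
    for cmd in (
        restore_snapshot_cmd(vbox, "analysis_vm", "clean"),
        start_vm_cmd(vbox, "analysis_vm"),
        poweroff_vm_cmd(vbox, "analysis_vm"),
    ):
        # To execute for real: subprocess.run(cmd, check=True, timeout=120)
        print(cmd)
```

Passing the command as a list rather than a single string avoids shell quoting problems with the space in "Program Files".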
| Module | Purpose |
|---|---|
| sandbox.py | VirtualBox VM lifecycle: snapshot/start/stop/copy/execute |
| execution.py | Executes sample in sandbox with timeout |
| collector.py | Manages Procmon + tshark data collection |
| parser.py | Converts raw logs (PML, PCAP) to structured formats |
| ioc_extractor.py | Regex-based IOC detection (IPs, domains, hashes, registry, files) |
| analyzer.py | Behavioral classification (Ransomware, Trojan, Botnet, etc.) |
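
As a rough illustration of the regex-based detection attributed to ioc_extractor.py in the table above, the following sketch pulls IPv4 addresses, domains, SHA-256 hashes, and registry keys out of free text. The patterns and the `extract_iocs` function are assumptions written for this example; the project's real patterns may differ.

```python
import re

# Illustrative patterns only; these are not the project's actual expressions.
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b", re.IGNORECASE
)
SHA256_RE = re.compile(r"\b[a-fA-F0-9]{64}\b")
REGISTRY_RE = re.compile(r"\bHK(?:LM|CU|CR|U|CC)\\[^\s\"]+")


def extract_iocs(text):
    """Return de-duplicated, sorted IOC candidates found in text."""
    return {
        "ips": sorted(set(IPV4_RE.findall(text))),
        "domains": sorted({d.lower() for d in DOMAIN_RE.findall(text)}),
        "sha256": sorted({h.lower() for h in SHA256_RE.findall(text)}),
        "registry_keys": sorted(set(REGISTRY_RE.findall(text))),
    }


if __name__ == "__main__":
    line = r"beacon to 192.168.1.100 via c2-server.com then set HKCU\Software\Run\demo"
    print(extract_iocs(line))
```

One known limitation of a domain pattern like this is that file names such as sample.exe also match, so a real extractor would need an allowlist of TLDs or a filter for known file extensions.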
Malware Analysis Collection - IOC Extraction System
Starting analysis: samples/malware.exe
Collected Data:
- Procmon CSV: logs/procmon.csv (542 entries)
- Network PCAP: logs/capture.pcap (1,234 packets)
IOCs Extracted:
IP Addresses:
192.168.1.100
10.0.0.5
172.16.0.50
Domains:
malicious.net
c2-server.com
attacker.io
Registry Keys Modified: 15
Files Created: 42
Classification: Trojan/Backdoor
Risk Level: CRITICAL
Detected Behaviors:
- Persistence mechanism
- Process injection
- C2 communication
- Credential harvesting
Analysis Complete. Data saved to logs/
- logs/procmon.csv - System activity parsed from Procmon
- logs/capture.pcap - Raw network packets from tshark
- logs/analysis_results.json - Extracted IOCs and analysis (if exported)
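
The optional analysis_results.json export could combine the extracted IOCs with a classification. The sketch below uses a simple "most matching indicators wins" rule; the rule table, behavior names, JSON layout, and function names are all hypothetical, since the README does not specify analyzer.py's logic or the export schema.

```python
import json

# Hypothetical behavior-to-family rules, for illustration only.
RULES = {
    "Ransomware": {"mass_file_encryption", "shadow_copy_deletion", "ransom_note"},
    "Trojan/Backdoor": {
        "persistence", "process_injection", "c2_communication", "credential_harvesting",
    },
    "Worm": {"network_scanning", "self_replication"},
}


def classify(behaviors):
    """Pick the family whose indicator set overlaps most with observed behaviors."""
    observed = set(behaviors)
    best_label, best_hits = "Unknown", 0
    for label, indicators in RULES.items():
        hits = len(observed & indicators)
        if hits > best_hits:  # ties keep the earlier rule
            best_label, best_hits = label, hits
    return best_label


def build_results(iocs, behaviors):
    """Serialize IOCs, behaviors, and the classification as a JSON string."""
    payload = {
        "iocs": iocs,
        "behaviors": sorted(set(behaviors)),
        "classification": classify(behaviors),
    }
    return json.dumps(payload, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(build_results(
        {"domains": ["c2-server.com"], "ips": ["10.0.0.5"]},
        ["persistence", "c2_communication"],
    ))
```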
malware-analyzer/
├── README.md # This file
├── config.yaml # Configuration
├── requirements.txt # Python dependencies
├── TODO.md # Progress tracking
│
├── collect_malware.py (200 LOC) # Main collection script
│
├── src/
│ ├── sandbox.py (200 LOC) # VM control
│ ├── execution.py (150 LOC) # Malware execution
│ ├── collector.py (200 LOC) # Data collection
│ ├── parser.py (200 LOC) # Log parsing
│ ├── ioc_extractor.py (250 LOC) # IOC extraction
│ ├── analyzer.py (200 LOC) # Behavior analysis
│ └── __init__.py
│
├── tools/ # External utilities
│ └── Procmon.exe # (download from Sysinternals)
│
├── samples/ # Benign test samples
│ └── test.exe # (user-provided)
│
└── logs/ # Generated during analysis
    ├── procmon.csv
    ├── procmon.pml
    ├── capture.pcap
    └── analysis_results.json # Optional: IOC export
- Isolation Requirements
  - Run malware ONLY inside VM snapshot
  - Never expose host to untrusted binaries
  - Disable: shared folders, network (unless intentional), clipboard, drag-drop
- Snapshot Restore
  - ALWAYS restore VM after each analysis
  - Prevents malware persistence
  - Recreate clean snapshot monthly
- Storage Best Practices
  - Never store actual malware in repository
  - Use separate encrypted drive for samples
  - Version control only metadata/hashes
- Network Safety
  - Run with network disabled by default
  - Use FakeNet-NG for network-based malware analysis
  - Monitor all connections without exposing real network
- Legal Compliance
  - Only analyze authorized samples
  - Use in controlled lab only
  - Never run on production systems
  - Comply with local cybersecurity laws
- Access Control
  - Dedicated analysis machine only
  - Restrict unauthorized access
  - Log all analyses with timestamps
  - Document analysis intentions
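
For the guidance above on versioning only metadata and hashes and logging analyses with timestamps, one possible approach is to record a SHA-256 digest and a UTC timestamp for each sample instead of the binary itself. `sha256_of_file`, `sample_metadata`, and the field names are hypothetical.

```python
import hashlib
import json
import os
from datetime import datetime, timezone


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large samples are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sample_metadata(path):
    """Build a small record that is safe to commit in place of the sample."""
    return {
        "file_name": os.path.basename(path),
        "size_bytes": os.path.getsize(path),
        "sha256": sha256_of_file(path),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    sample_path = os.path.join("samples", "test.exe")
    if os.path.exists(sample_path):
        print(json.dumps(sample_metadata(sample_path), indent=2))
```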
- ✅ Track only configuration changes in Git
- ✅ Maintain audit trail with timestamps
- ✅ Test benign files before malware
- ✅ Update tools monthly
- ✅ Use hardware virtualization (AMD-V/Intel VT-x)
- ✅ Disable VM acceleration if concerned about bypasses
| Problem | Solution |
|---|---|
| VBoxManage not found | Verify VirtualBox path in config.yaml |
| tshark not found | Add Wireshark to PATH or update config.yaml |
| VM not starting | Check VM name matches config.yaml; verify snapshot exists |
| Procmon not capturing | Run Procmon manually once to accept EULA |
| Pipeline timeout | Increase tools.execution_timeout in config.yaml |
| CSV parsing fails | Verify Procmon export successful; check encoding |
| Python errors | Verify all dependencies: pip install -r requirements.txt |
| Permission denied | Check write permissions for logs/ directory |
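
Several of the problems in the table can be caught before a run starts. The sketch below checks an already loaded configuration dictionary against the keys shown in config.yaml; the `preflight` function and its messages are hypothetical and are not part of the project.

```python
import os
import shutil


def preflight(config):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    sandbox = config.get("sandbox", {})
    tools = config.get("tools", {})
    paths = config.get("paths", {})

    if not os.path.isfile(sandbox.get("vbox_path", "")):
        problems.append("VBoxManage not found: check sandbox.vbox_path")

    if shutil.which(tools.get("tshark_path", "tshark")) is None:
        problems.append("tshark not found: add Wireshark to PATH or set tools.tshark_path")

    logs_dir = paths.get("logs_dir", "logs/")
    if not (os.path.isdir(logs_dir) and os.access(logs_dir, os.W_OK)):
        problems.append("logs directory missing or not writable: " + logs_dir)

    if int(tools.get("execution_timeout", 0)) <= 0:
        problems.append("execution timeout must be a positive number of seconds")

    return problems


if __name__ == "__main__":
    example = {
        "sandbox": {"vbox_path": r"C:\Program Files\Oracle\VirtualBox\VBoxManage.exe"},
        "tools": {"tshark_path": "tshark", "execution_timeout": 300},
        "paths": {"logs_dir": "logs/"},
    }
    for problem in preflight(example):
        print("WARNING:", problem)
```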
python collect_malware.py --sample SAMPLE_PATH [--config CONFIG_PATH]
# --sample SAMPLE_PATH: malware to execute
# --config CONFIG_PATH: optional custom config
- Follow Installation section
- Configure config.yaml (update VM name, paths)
- Test with benign sample: python collect_malware.py --sample samples/test.exe
- Review logs/ directory for output files
- Check Troubleshooting section
- Review config.yaml comments
- Verify Procmon/tshark setup
- Test with benign file first
This project is for educational and authorized malware analysis only. Users must comply with all applicable laws and regulations in their jurisdiction.