BB-24/Malware-Analyzer

Malware Analysis Collection & IOC Extraction System

Overview

A production-ready malware analysis system that safely collects and analyzes malware behavior inside an isolated VirtualBox sandbox. It captures system activity with Procmon and network traffic with tshark, then automatically extracts Indicators of Compromise (IOCs).

Key Features

  • Automated malware execution in isolated VirtualBox sandbox
  • VirtualBox snapshot restore after each analysis
  • Procmon + tshark concurrent data collection
  • IOC extraction (IPs, domains, registry keys, file paths, hashes)
  • Behavior classification (Ransomware, Trojan, Botnet, Worm, etc.)
  • Structured data parsing (CSV from Procmon, PCAP from tshark)
  • Command-line interface with flexible options
  • Comprehensive error handling and safety features

Table of Contents

  1. Quick Start
  2. Installation
  3. Usage
  4. Configuration
  5. Architecture
  6. Output
  7. Safety & Ethics
  8. Troubleshooting

Quick Start

30-Second Setup

# 1. Install dependencies
pip install -r requirements.txt

# 2. Enable sandbox (once VM is configured)
# Edit config.yaml: sandbox.enabled = true

# 3. Collect malware data
python collect_malware.py --sample samples/test.exe

Basic Command

# Analyze a sample
python collect_malware.py --sample samples/malware.exe

# Outputs to: logs/ (procmon.csv, capture.pcap)

Installation

Requirements

Hardware

  • 8GB+ RAM (16GB recommended)
  • 20GB+ free disk (VM snapshots + logs)

Software - Host Machine

  • Windows 10+ host OS
  • Python 3.8+
  • VirtualBox 6.1+
  • Wireshark (for tshark)

Software - Guest VM

  • Windows 7/10/11
  • Process Monitor (Procmon.exe)

Python Packages

pip install -r requirements.txt
# Includes: pyyaml, pandas, scapy, reportlab, pyshark

Installation Steps

Step 1: Setup Virtual Environment

python -m venv venv
# Windows PowerShell:
.\venv\Scripts\Activate.ps1
# Windows CMD:
venv\Scripts\activate.bat

Step 2: Install Dependencies

pip install --upgrade pip
pip install -r requirements.txt

Step 3: Configure VirtualBox VM

  1. Open VirtualBox
  2. Create Windows VM:
    • Name: analysis_vm (match config.yaml)
    • RAM: 4GB+
    • Disk: 50GB (dynamic)
  3. Disable:
    • Network (or use FakeNet-NG)
    • Shared folders
    • Clipboard/Drag-drop
  4. Install Procmon.exe on guest
  5. Take snapshot: Name it clean

Step 4: Create Directories

mkdir samples logs

Usage

Basic Analysis

Execute malware and collect system/network activity:

python collect_malware.py --sample samples/test.exe

With Custom Config

python collect_malware.py --sample samples/malware.exe --config custom_config.yaml

Process Flow

When you run python collect_malware.py --sample samples/malware.exe:

  1. Load Configuration → reads config.yaml
  2. Create Directories → ensures logs/ exists
  3. Snapshot Restore → reverts VM to clean state
  4. Start VM → powers on analysis_vm
  5. Start Collectors → launches Procmon + tshark
  6. Copy Sample → transfers sample to guest VM
  7. Execute Sample → runs binary inside VM for execution_timeout seconds
  8. Stop Collectors → terminates Procmon + tshark
  9. Parse Logs → converts PML → CSV, reads PCAP
  10. Extract IOCs → finds IPs, domains, registry keys, file paths
  11. Cleanup → stops VM, restores snapshot
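
One property of the flow above is worth making explicit: cleanup (step 11) should run even when an earlier step fails, otherwise a crashed analysis can leave an infected VM running. The sketch below shows one way to guarantee that with `try`/`finally`. The names `run_pipeline` and `Step` are hypothetical; how `collect_malware.py` actually sequences its steps is not shown in this README.

```python
from typing import Callable, List, Tuple

# A step is a (name, action) pair; the actions here stand in for real work
Step = Tuple[str, Callable[[], None]]


def run_pipeline(steps: List[Step], cleanup: Callable[[], None]) -> List[str]:
    """Run steps in order and return the names of the steps that completed.

    cleanup() always runs, including when a step raises; in that case the
    exception propagates to the caller after cleanup has finished.
    """
    completed: List[str] = []
    try:
        for name, action in steps:
            action()
            completed.append(name)
    finally:
        cleanup()  # e.g. stop the VM and restore the clean snapshot
    return completed


log: List[str] = []
demo_steps: List[Step] = [
    ("restore_snapshot", lambda: log.append("restore_snapshot")),
    ("start_vm", lambda: log.append("start_vm")),
    ("execute_sample", lambda: log.append("execute_sample")),
]
print(run_pipeline(demo_steps, cleanup=lambda: log.append("cleanup")))
```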

Output

  • logs/procmon.csv - System activity (file, registry, process operations)
  • logs/capture.pcap - Network traffic (packets)
  • Console output showing collected IOCs and behavior
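
As a quick sanity check on `logs/procmon.csv`, the operations can be tallied with the standard library alone. This sketch assumes the export uses Procmon's default column headers (`Time of Day`, `Process Name`, `PID`, `Operation`, `Path`, `Result`, `Detail`); the helper name `count_operations` and the in-memory demo rows are illustrative only.

```python
import csv
import io
from collections import Counter


def count_operations(csv_file) -> Counter:
    """Count rows per 'Operation' value from an open text-mode CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row["Operation"] for row in reader if row.get("Operation"))


# Small in-memory stand-in for logs/procmon.csv (made-up rows)
demo = io.StringIO(
    '"Time of Day","Process Name","PID","Operation","Path","Result","Detail"\n'
    '"10:00:01","test.exe","1234","CreateFile","C:/Temp/a.txt","SUCCESS",""\n'
    '"10:00:02","test.exe","1234","RegSetValue","HKCU/Software/Demo","SUCCESS",""\n'
    '"10:00:03","test.exe","1234","CreateFile","C:/Temp/b.txt","SUCCESS",""\n'
)
print(count_operations(demo).most_common())
```

For a real export, open the file with `open("logs/procmon.csv", newline="", encoding="utf-8-sig")`; the `utf-8-sig` codec tolerates a byte-order mark if the export has one.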

Configuration

config.yaml Reference

sandbox:
  vm_name: "analysis_vm"              # VirtualBox VM name
  snapshot: "clean"                   # Snapshot to restore
  vbox_path: "C:\\Program Files\\Oracle\\VirtualBox\\VBoxManage.exe"
  enabled: false                      # Set true when VM ready
  vm_user: "Administrator"
  vm_password: "Password123!"
  guest_sample_path: "C:/Windows/Temp/sample.exe"
  procmon_export_timeout: 120         # Max seconds to wait for CSV export

tools:
  procmon_path: "tools/Procmon.exe"
  tshark_path: "tshark"               # Must be in PATH
  execution_timeout: 300              # Malware execution timeout (seconds)

paths:
  sample_dir: "samples/"
  logs_dir: "logs/"

analysis:
  network_simulation: false           # Use FakeNet-NG simulation
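
A missing key in `config.yaml` is easier to diagnose before the VM starts than halfway through a run. The sketch below validates a configuration that has already been loaded into a dict (for example with PyYAML's `yaml.safe_load`). Which keys count as required is an assumption made for illustration, as is the function name `validate_config`.

```python
from typing import Any, Dict, List

# Assumption: these keys are treated as mandatory for this sketch
REQUIRED_KEYS = {
    "sandbox": ["vm_name", "snapshot", "vbox_path", "enabled"],
    "tools": ["procmon_path", "tshark_path", "execution_timeout"],
    "paths": ["sample_dir", "logs_dir"],
}


def validate_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    for section, keys in REQUIRED_KEYS.items():
        block = cfg.get(section)
        if not isinstance(block, dict):
            problems.append(f"missing section: {section}")
            continue
        for key in keys:
            if key not in block:
                problems.append(f"missing key: {section}.{key}")
    tools = cfg.get("tools")
    if isinstance(tools, dict):
        timeout = tools.get("execution_timeout")
        # bool is a subclass of int, so exclude it explicitly
        is_number = isinstance(timeout, (int, float)) and not isinstance(timeout, bool)
        if is_number and timeout <= 0:
            problems.append("tools.execution_timeout must be positive")
    return problems


example = {
    "sandbox": {"vm_name": "analysis_vm", "snapshot": "clean",
                "vbox_path": "VBoxManage.exe", "enabled": False},
    "tools": {"procmon_path": "tools/Procmon.exe", "tshark_path": "tshark",
              "execution_timeout": 300},
    "paths": {"sample_dir": "samples/", "logs_dir": "logs/"},
}
print(validate_config(example))
```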

Architecture

Data Collection Pipeline

VirtualBox VM
│
├─ Procmon.exe ──→ procmon.pml ──→ procmon.csv (system activity)
│
└─ Network ──→ tshark ──→ capture.pcap (network packets)
   │                              │
   └──────────────────┬───────────┘
                      │
            parser.py (parsing)
                      │
         ┌────────────┼────────────┐
         │            │            │
    ioc_extractor   analyzer    (console output)

├─ sandbox.py          → VM lifecycle (restore/start/stop)
├─ execution.py        → Malware execution with timeout
├─ collector.py        → Procmon + tshark management
├─ parser.py           → CSV/PCAP parsing
├─ ioc_extractor.py    → IOC extraction (IP, domain, file, registry)
└─ analyzer.py         → Behavior classification
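
The network branch of the pipeline can be illustrated with a bounded tshark capture. This is a sketch only: the function name and the interface name are hypothetical (no capture interface is defined in `config.yaml`), while `-i`, `-w`, and `-a duration:N` are standard tshark options.

```python
from typing import List


def tshark_capture_command(tshark_path: str, interface: str,
                           pcap_path: str, duration_s: int) -> List[str]:
    """Build a tshark command that writes a PCAP and stops by itself."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return [
        tshark_path,
        "-i", interface,                # capture interface (assumed name)
        "-w", pcap_path,                # write raw packets to this file
        "-a", f"duration:{duration_s}", # autostop after N seconds
    ]


argv = tshark_capture_command("tshark", "VirtualBox Host-Only Network",
                              "logs/capture.pcap", 300)
print(" ".join(argv))
```

Using an autostop condition means the capture ends even if the controlling script crashes before it can terminate tshark. `tshark -D` lists the interfaces available on the host.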

Module Responsibilities

Module              Purpose
sandbox.py          VirtualBox VM lifecycle: snapshot/start/stop/copy/execute
execution.py        Executes sample in sandbox with timeout
collector.py        Manages Procmon + tshark data collection
parser.py           Converts raw logs (PML, PCAP) to structured formats
ioc_extractor.py    Regex-based IOC detection (IPs, domains, hashes, registry, files)
analyzer.py         Behavioral classification (Ransomware, Trojan, Botnet, etc.)
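
To make "regex-based IOC detection" concrete, here is a minimal sketch of the idea. The patterns, the `NOT_TLDS` filter, and the function name `extract_iocs` are illustrative assumptions and are not the contents of `ioc_extractor.py`. Plain regexes produce false positives (for example, file names look like domains), which is why the sketch filters a few common file extensions.

```python
import re
from typing import Dict, List

OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_RE = re.compile(rf"\b(?:{OCTET}\.){{3}}{OCTET}\b")
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b", re.IGNORECASE
)
REGISTRY_RE = re.compile(r"\b(?:HKLM|HKCU|HKCR|HKU|HKCC)\\[^\s,;]+")
SHA256_RE = re.compile(r"\b[a-fA-F0-9]{64}\b")

# Assumption: endings treated as file extensions rather than TLDs
NOT_TLDS = {"exe", "dll", "sys", "tmp", "log"}


def extract_iocs(text: str) -> Dict[str, List[str]]:
    """Return sorted, de-duplicated IOC candidates found in text."""
    domains = {
        d.lower() for d in DOMAIN_RE.findall(text)
        if d.rsplit(".", 1)[-1].lower() not in NOT_TLDS
    }
    return {
        "ips": sorted(set(IPV4_RE.findall(text))),
        "domains": sorted(domains),
        "registry_keys": sorted(set(REGISTRY_RE.findall(text))),
        "sha256": sorted(set(SHA256_RE.findall(text))),
    }


sample_text = r"dropper.exe contacted 203.0.113.7 and c2-server.com, then set HKCU\Software\Run\evil"
print(extract_iocs(sample_text))
```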

Output

Console Output

Malware Analysis Collection - IOC Extraction System
Starting analysis: samples/malware.exe

Collected Data:
- Procmon CSV: logs/procmon.csv (542 entries)
- Network PCAP: logs/capture.pcap (1,234 packets)

IOCs Extracted:

IP Addresses:
  192.168.1.100
  10.0.0.5
  172.16.0.50

Domains:
  malicious.net
  c2-server.com
  attacker.io

Registry Keys Modified: 15
Files Created: 42

Classification: Trojan/Backdoor
Risk Level: CRITICAL

Detected Behaviors:
  - Persistence mechanism
  - Process injection
  - C2 communication
  - Credential harvesting

Analysis Complete. Data saved to logs/

Generated Logs

  • logs/procmon.csv - System activity parsed from Procmon
  • logs/capture.pcap - Raw network packets from tshark
  • logs/analysis_results.json - Extracted IOCs and analysis (if exported)
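
The layout of `analysis_results.json` is not documented here, so the following is only a sketch of one possible export, with all field names assumed. It also tags each IP as private or not using the standard `ipaddress` module, since private-range addresses such as those in the sample console output usually point at lab infrastructure rather than external hosts.

```python
import ipaddress
import json
from typing import Any, Dict, Iterable


def build_results(sample: str, ips: Iterable[str],
                  domains: Iterable[str]) -> Dict[str, Any]:
    """Assemble an export dict (hypothetical schema)."""
    return {
        "sample": sample,
        "ips": [
            {"address": ip, "private": ipaddress.ip_address(ip).is_private}
            for ip in sorted(set(ips))
        ],
        "domains": sorted(set(domains)),
    }


def to_json(results: Dict[str, Any]) -> str:
    """Serialize with stable key order so diffs between runs stay readable."""
    return json.dumps(results, indent=2, sort_keys=True)


results = build_results("samples/test.exe", ["192.168.1.100", "8.8.8.8"],
                        ["c2-server.com"])
print(to_json(results))
```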

File Structure

malware-analyzer/
├── README.md                      # This file
├── config.yaml                    # Configuration
├── requirements.txt               # Python dependencies
├── TODO.md                        # Progress tracking
│
├── collect_malware.py       (200 LOC) # Main collection script
│
├── src/
│   ├── sandbox.py           (200 LOC) # VM control
│   ├── execution.py         (150 LOC) # Malware execution
│   ├── collector.py         (200 LOC) # Data collection
│   ├── parser.py            (200 LOC) # Log parsing
│   ├── ioc_extractor.py     (250 LOC) # IOC extraction
│   ├── analyzer.py          (200 LOC) # Behavior analysis
│   └── __init__.py
│
├── tools/                         # External utilities
│   └── Procmon.exe                # (download from Sysinternals)
│
├── samples/                       # Benign test samples
│   └── test.exe                   # (user-provided)
│
└── logs/                          # Generated during analysis
    ├── procmon.csv
    ├── procmon.pml
    ├── capture.pcap
    └── analysis_results.json      # Optional: IOC export

Safety & Ethics

⚠️ Critical Guidelines

  1. Isolation Requirements

    • Run malware ONLY inside the VM, started from the clean snapshot
    • Never expose host to untrusted binaries
    • Disable: shared folders, network (unless intentional), clipboard, drag-drop
  2. Snapshot Restore

    • ALWAYS restore VM after each analysis
    • Prevents malware persistence
    • Recreate clean snapshot monthly
  3. Storage Best Practices

    • Never store actual malware in repository
    • Use separate encrypted drive for samples
    • Version control only metadata/hashes
  4. Network Safety

    • Run with network disabled by default
    • Use FakeNet-NG for network-based malware analysis
    • Monitor all connections without exposing real network
  5. Legal Compliance

    • Only analyze authorized samples
    • Use in controlled lab only
    • Never run on production systems
    • Comply with local cybersecurity laws
  6. Access Control

    • Dedicated analysis machine only
    • Restrict unauthorized access
    • Log all analyses with timestamps
    • Document analysis intentions
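
Two of the guidelines above (version-controlling only hashes, and logging every analysis with a timestamp) can be combined into a small audit record. This is a sketch under assumptions: the record fields and the function names are hypothetical, and the record is returned as a dict rather than written anywhere.

```python
import hashlib
from datetime import datetime, timezone
from typing import Dict


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large samples are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_record(sample_path: str, analyst: str, purpose: str) -> Dict[str, str]:
    """Describe one analysis run without storing the sample itself."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "sample_sha256": sha256_of_file(sample_path),
        "analyst": analyst,
        "purpose": purpose,
    }
```

Appending one such record per run, for example as a JSON line in a log file kept outside the samples drive, gives an audit trail that is safe to commit because it contains no malware bytes.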

Recommended Practices

  • ✅ Track only configuration changes in Git
  • ✅ Maintain audit trail with timestamps
  • ✅ Test benign files before malware
  • ✅ Update tools monthly
  • ✅ Use hardware virtualization (AMD-V/Intel VT-x)
  • ✅ Disable VM acceleration if concerned about bypasses

Troubleshooting

Problem                   Solution
VBoxManage not found      Verify VirtualBox path in config.yaml
tshark not found          Add Wireshark to PATH or update config.yaml
VM not starting           Check VM name matches config.yaml; verify snapshot exists
Procmon not capturing     Run Procmon manually once to accept EULA
Pipeline timeout          Increase tools.execution_timeout in config.yaml
CSV parsing fails         Verify Procmon export successful; check encoding
Python errors             Verify all dependencies: pip install -r requirements.txt
Permission denied         Check write permissions for logs/ directory
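
Several of the problems in this table can be caught before a run with a short preflight check. The sketch below uses only the standard library; the function name `preflight` and the message wording are hypothetical and not part of this project.

```python
import os
import shutil
from typing import List


def preflight(vbox_path: str, tshark_path: str, logs_dir: str) -> List[str]:
    """Return a list of setup problems; an empty list means checks passed."""
    problems: List[str] = []
    if not os.path.isfile(vbox_path):
        problems.append(f"VBoxManage not found: {vbox_path}")
    # shutil.which resolves bare command names via PATH and also accepts paths
    if shutil.which(tshark_path) is None:
        problems.append(f"tshark not found: {tshark_path}")
    if not os.path.isdir(logs_dir):
        problems.append(f"logs directory missing: {logs_dir}")
    elif not os.access(logs_dir, os.W_OK):
        problems.append(f"logs directory not writable: {logs_dir}")
    return problems


for problem in preflight(
    r"C:\Program Files\Oracle\VirtualBox\VBoxManage.exe", "tshark", "logs/"
):
    print("WARNING:", problem)
```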

Command Reference

Collect Malware

python collect_malware.py --sample SAMPLE_PATH [--config CONFIG_PATH]

  --sample SAMPLE_PATH       Malware to execute
  --config CONFIG_PATH       Optional: custom config

Support

Getting Started

  1. Follow the Installation section
  2. Configure config.yaml (update VM name, paths)
  3. Test with benign sample: python collect_malware.py --sample samples/test.exe
  4. Review logs/ directory for output files

For Questions

  • Check the Troubleshooting section
  • Review config.yaml comments
  • Verify Procmon/tshark setup
  • Test with benign file first

License

This project is for educational and authorized malware analysis only. Users must comply with all applicable laws and regulations in their jurisdiction.

About

Sandboxed malware analysis system with VM injection capabilities for safe dynamic analysis and behavioral tracing.
