A real-time network intrusion detection system with a live forensics dashboard.
Capture traffic, score every flow through layered ML and behavioural detection,
map hits to MITRE ATT&CK, and watch alerts stream in over WebSockets.
NetGuard does not rely on a single classifier. It layers three detectors so that each one covers the others' blind spots:
| Layer | Technique | What it catches |
|---|---|---|
| Per-flow ML | XGBoost binary gate, then a calibrated multiclass classifier | Known attack signatures: DoS, DDoS, brute force, port scan, web attacks, botnet, infiltration |
| Anomaly | K-Means distance from the benign cluster centroids | Novel or zero-day traffic that does not match any known class |
| Behavioural | Streaming MongoDB aggregation rules over rolling windows | Multi-flow patterns one flow alone cannot reveal, such as distributed scans, volumetric DDoS, and credential spraying |
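
To make the behavioural layer concrete, here is a minimal sketch of one rolling-window rule written as a MongoDB aggregation pipeline. The field names (`ts`, `src_ip`, `dst_port`), the 60-second window, and the 50-port threshold are assumptions for illustration, not NetGuard's actual schema or tuning.

```python
from datetime import datetime, timedelta, timezone


def port_scan_pipeline(now, window_s=60, min_ports=50):
    """Build a pipeline that flags sources touching many distinct ports.

    All field names (ts, src_ip, dst_port) are hypothetical.
    """
    since = now - timedelta(seconds=window_s)
    return [
        # Keep only flows inside the rolling window.
        {"$match": {"ts": {"$gte": since}}},
        # Collect the distinct destination ports each source contacted.
        {"$group": {"_id": "$src_ip", "ports": {"$addToSet": "$dst_port"}}},
        # Count the ports, then keep sources at or above the threshold.
        {"$project": {"port_count": {"$size": "$ports"}}},
        {"$match": {"port_count": {"$gte": min_ports}}},
    ]


pipeline = port_scan_pipeline(datetime(2026, 1, 1, tzinfo=timezone.utc))
```

With PyMongo, a list like this would be passed to `collection.aggregate(pipeline)`. No single flow in the window looks suspicious on its own; only the grouped count does.
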
Every alert is tagged with a MITRE ATT&CK technique (for example T1110
Brute Force, T1046 Network Service Discovery, T1498 Network Denial of
Service), so the output reads like an analyst report rather than a bare label.
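
The tagging step can be sketched as a plain lookup from predicted label to technique. Only the three technique IDs and names come from the text above; the label strings, the `MITRE_BY_LABEL` table, and the `tag_alert` helper are hypothetical.

```python
# Hypothetical label-to-technique table. Only the technique IDs and names
# are taken from the README; the label strings are assumptions.
MITRE_BY_LABEL = {
    "brute_force": ("T1110", "Brute Force"),
    "port_scan": ("T1046", "Network Service Discovery"),
    "dos": ("T1498", "Network Denial of Service"),
    "ddos": ("T1498", "Network Denial of Service"),
}


def tag_alert(label: str) -> dict:
    """Attach a MITRE technique to a predicted label, if one is mapped."""
    alert = {"label": label}
    technique = MITRE_BY_LABEL.get(label)
    if technique is not None:
        alert["mitre_id"], alert["mitre_name"] = technique
    return alert
```

Labels without a mapping pass through untagged rather than raising, so an unmapped class never blocks an alert.
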
- Live Monitor shows the real-time flow feed, KPIs, the alert stream, and confidence trends.
- Forensics gives a per-IP timeline, drill-down on any flow, and an analyst feedback loop.
- World Map plots geolocated source and destination arcs with on-demand IP enrichment.
- Models is a registry of trained models with their metrics and version history.
Capture runs in an isolated subprocess, so a crash in the packet layer never takes the API down with it. Alerts reach the UI through MongoDB change streams into a FastAPI WebSocket, giving end-to-end alert latency of roughly 50 ms.
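
The isolation idea can be shown with the standard library alone. This is a sketch under stated assumptions: the worker here is a stand-in one-liner that exits abnormally, not NetGuard's capture code, and `run_capture_worker` is a hypothetical helper.

```python
import subprocess
import sys

# Stand-in for the capture worker: a child interpreter that exits abnormally.
CRASHING_WORKER = "import sys; sys.exit(3)"


def run_capture_worker(code: str) -> int:
    """Run the worker in a separate process and return its exit code."""
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode


exit_code = run_capture_worker(CRASHING_WORKER)
# The parent process is still running here and can choose to restart the worker.
worker_crashed = exit_code != 0
```

Because the failure surfaces as an exit code in the parent rather than an exception inside it, the API process can log the crash and respawn the worker without going down itself.
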
A flow is reduced to about 70 CICFlowMeter-style features, imputed, and scaled. It then passes through a binary gate that decides benign versus attack. Flows the gate marks as attack go to a calibrated multiclass model that assigns the attack type. Flows it marks as benign go through a K-Means distance check, which flags anything far from the benign clusters as an anomaly. In parallel, a periodic task runs MongoDB aggregation rules over rolling windows to catch distributed patterns. Thresholds are tuned per class and stored next to the models, so changing sensitivity is a config edit rather than a code change.
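
The routing can be sketched in a few lines, following the flowchart (attack goes to the multiclass model, benign goes to the K-Means check). Everything named here is an assumption for illustration: the `THRESHOLDS` values, the class names, the `unclassified` fallback, and the idea that the gate and class probabilities arrive precomputed.

```python
import math

# Hypothetical thresholds, as they might be stored beside the models.
THRESHOLDS = {"gate": 0.5, "anomaly_distance": 3.0, "DoS": 0.6, "PortScan": 0.7}


def nearest_centroid_distance(x, centroids):
    """Euclidean distance from x to the closest benign centroid."""
    return min(math.dist(x, c) for c in centroids)


def score_flow(x, gate_proba, class_probas, centroids, thresholds=THRESHOLDS):
    """Route one scaled feature vector through gate, classifier, or anomaly check.

    gate_proba: attack probability from the binary gate (assumed precomputed).
    class_probas: calibrated per-class probabilities from the multiclass model.
    """
    if gate_proba >= thresholds["gate"]:
        # Attack branch: take the most likely class, then apply its own threshold.
        label, p = max(class_probas.items(), key=lambda kv: kv[1])
        if p < thresholds.get(label, 0.5):
            label = "unclassified"  # assumed fallback, not from the README
        return {"verdict": "attack", "label": label, "confidence": p}
    # Benign branch: still check distance to the benign clusters.
    d = nearest_centroid_distance(x, centroids)
    verdict = "anomaly" if d > thresholds["anomaly_distance"] else "benign"
    return {"verdict": verdict, "distance": d}
```

Keeping the thresholds in a plain mapping is what makes sensitivity a config edit: raising `anomaly_distance` changes behaviour without touching `score_flow`.
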
flowchart LR
A["Live packets / replay CSV"] --> B["NFStream flow assembly"]
B --> C["~70 flow features"]
C --> D{"Binary gate:<br/>benign or attack?"}
D -->|attack| F["Multiclass classifier<br/>(calibrated XGBoost)"]
D -->|benign| E["K-Means anomaly check"]
F --> I["Attack type + MITRE technique"]
E -->|"far from cluster"| G["Anomaly alert"]
E -->|normal| H["Benign flow"]
I --> DB[("MongoDB")]
G --> DB
H --> DB
DB --> K["Behavioural rules<br/>over rolling windows"]
K -->|"distributed scan / DDoS / brute force"| DB
DB -->|"change stream"| L["FastAPI WebSocket"]
L --> M["React dashboard"]
style D fill:#1e3a5f,color:#cfe3ff,stroke:#3b82f6
style F fill:#7c2d12,color:#fde6d5,stroke:#ea580c
style G fill:#7f1d1d,color:#fee2e2,stroke:#ef4444
style DB fill:#14532d,color:#dcfce7,stroke:#22c55e
style M fill:#3b0764,color:#efe1ff,stroke:#a855f7
| Layer | Technology |
|---|---|
| Backend | Python 3.13, FastAPI, Uvicorn, Motor and PyMongo |
| ML | scikit-learn, XGBoost, imbalanced-learn |
| Capture | NFStream, Npcap |
| Database | MongoDB 8.x with change streams and aggregation pipelines |
| Frontend | React 19, Vite, Tailwind CSS 4, Zustand, Recharts, react-simple-maps |
| Geo | MaxMind GeoLite2, IPinfo, RDAP, reverse DNS |
The production models ship with the repo, so live capture works right after install. You only need the datasets for replay mode or retraining.
| Requirement | Notes |
|---|---|
| Python 3.13 | On PATH. The capture layer uses os.add_dll_directory (3.8 and newer). |
| Node.js 18+ | For the Vite dashboard. |
| MongoDB 8.x | Running as a Windows service named MongoDB. |
| Npcap | Needed for live capture. Install it with the "WinPcap API-compatible mode" option checked. Live capture also needs an elevated (admin) terminal. |
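
A quick way to check part of this table from Python is a small preflight script. This is a hypothetical helper, not something shipped in the repo, and it only covers what the standard library can see: it does not detect the MongoDB service or Npcap.

```python
import os
import shutil
import sys


def preflight() -> dict:
    """Report which prerequisites are visible to this Python process."""
    return {
        # The README targets 3.13; os.add_dll_directory itself needs 3.8+.
        "python_3_8_plus": sys.version_info >= (3, 8),
        "python_3_13": sys.version_info[:2] == (3, 13),
        # os.add_dll_directory only exists on Windows builds of Python.
        "add_dll_directory": hasattr(os, "add_dll_directory"),
        # Node.js is needed for the Vite dashboard.
        "node_on_path": shutil.which("node") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```
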
# from the repo root
pip install -r setup/requirements.txt
# frontend dependencies (start.ps1 also does this automatically on first run)
cd frontend; npm install; cd ..

Copy-Item backend/.env.example backend/.env

Every value has a working default, so you can run with the file as is. Optionally add a free IPinfo token to backend/.env for precise map geolocation.
.\start.ps1 # launches backend (elevated) and frontend, then opens the dashboard

- Dashboard: http://localhost:5173
- API health: http://127.0.0.1:8000/api/health
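
To confirm the backend is up from a script rather than a browser, a probe of the health URL could look like the sketch below. It assumes only that the endpoint answers with HTTP 200 when healthy; the response body is not inspected because its shape is not documented here, and `api_is_up` is a hypothetical helper.

```python
import urllib.request


def api_is_up(url="http://127.0.0.1:8000/api/health", timeout=2.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, refused connections, and timeouts all land here.
        return False


if __name__ == "__main__":
    print("backend up" if api_is_up() else "backend not reachable")
```
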
.\start.ps1 -NoElevate # skip the UAC prompt; dashboard works, live capture will not
.\stop.ps1 # stop everything

The datasets and GeoIP database are not committed, because they are large or license-restricted. You only need them for replay mode, retraining, or richer geolocation.
Training and replay data, CICIDS2017 and CIC-IDS2018, via the Kaggle API:
# place your Kaggle token at ~/.kaggle/kaggle.json first
python setup/download_dataset.py

This pulls the CSVs into backend/data/cicids2017/ and backend/data/cicids2018/ (about 0.8 GB and 6.9 GB extracted).
GeoIP: for offline geolocation, download a free MaxMind GeoLite2-City .mmdb and drop it into backend/data/. Without it, and without an IPinfo token, the World Map still renders but geolocation is limited.
NetGuard/
├── backend/
│ ├── app/
│ │ ├── api/ FastAPI routes, WebSocket, capture subprocess
│ │ ├── capture/ NFStream adapter and feature plugins
│ │ ├── ml/ inference engine, training pipelines, models_c and models_d
│ │ ├── processing/ flow processor and behavioural detector
│ │ ├── security/ MITRE ATT&CK mapping
│ │ ├── geo/ GeoIP, IPinfo, and RDAP enrichment
│ │ └── db/ MongoDB connection, stores, aggregations, schema
│ ├── scripts/ training, benchmarking, and data-audit utilities
│ └── .env.example
├── frontend/ React and Vite dashboard
├── setup/ requirements.txt and the dataset downloader
├── img/ README images
├── start.ps1 / stop.ps1 launchers
└── _journal/ engineering log, one entry per build session
| Symptom | Fix |
|---|---|
| MongoDB service not found | Install MongoDB and make sure the service is named MongoDB. |
| Live capture shows no flows | Run the backend as admin and confirm Npcap is installed in WinPcap-compatible mode. |
| World Map geolocation is sparse | Add a GeoLite2 .mmdb to backend/data/, or set an IPINFO_TOKEN. |
| Track C models missing | The backend/app/ml/models_c/ artifacts must be present. They ship with the repo. |
- CICIDS2017 and CIC-IDS2018, Canadian Institute for Cybersecurity, University of New Brunswick
- MITRE ATT&CK®, the threat-technique taxonomy
- MaxMind GeoLite2 and IPinfo, IP geolocation
Released under the MIT License. See LICENSE. Bundled datasets, GeoIP databases, and the CICIDS-derived models carry their own upstream licenses, so review those before redistribution or commercial use.
Semester Project - NetGuard
Spring 2026

