BrAtUkA/NetGaurd

NetGuard

A real-time network intrusion detection system with a live forensics dashboard.
Capture traffic, score every flow through layered ML and behavioural detection,
map hits to MITRE ATT&CK, and watch alerts stream in over WebSockets.

NetGuard world map view

Features  ·  Architecture  ·  Quick start  ·  How it works


✨ What it does

NetGuard does not rely on a single classifier. It layers three detectors so that each one covers the others' blind spots:

| Layer | Technique | What it catches |
| --- | --- | --- |
| Per-flow ML | XGBoost binary gate, then a calibrated multiclass classifier | Known attack signatures: DoS, DDoS, brute force, port scan, web attacks, botnet, infiltration |
| Anomaly | K-Means distance from the benign cluster centroids | Novel or zero-day traffic that does not match any known class |
| Behavioural | Streaming MongoDB aggregation rules over rolling windows | Multi-flow patterns one flow alone cannot reveal, such as distributed scans, volumetric DDoS, and credential spraying |

Every alert is tagged with a MITRE ATT&CK technique (for example T1110 Brute Force, T1046 Network Service Discovery, T1498 Network Denial of Service), so the output reads like an analyst report rather than a bare label.
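
As a rough illustration of the tagging step, the sketch below attaches a technique to a predicted class. The three technique IDs and names are the ones quoted above; the class labels used as dictionary keys and the `TaggedAlert` and `tag_alert` names are hypothetical and may not match NetGuard's actual code.

```python
from dataclasses import dataclass
from typing import Optional

# Technique IDs and names are the three examples quoted in this README.
# The class labels used as keys are assumptions made for this sketch.
MITRE_BY_CLASS = {
    "brute_force": ("T1110", "Brute Force"),
    "port_scan": ("T1046", "Network Service Discovery"),
    "dos": ("T1498", "Network Denial of Service"),
}


@dataclass(frozen=True)
class TaggedAlert:
    label: str
    technique_id: Optional[str]
    technique_name: Optional[str]


def tag_alert(label: str) -> TaggedAlert:
    """Attach a MITRE ATT&CK technique to a predicted class, if one is mapped."""
    technique = MITRE_BY_CLASS.get(label)
    if technique is None:
        # Unmapped classes keep their label and carry no technique.
        return TaggedAlert(label, None, None)
    return TaggedAlert(label, *technique)
```

Keeping the mapping in one table means a new attack class only needs a new entry, not a change to the alerting code.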

Dashboard views

  • Live Monitor shows the real-time flow feed, KPIs, the alert stream, and confidence trends.
  • Forensics gives a per-IP timeline, drill-down on any flow, and an analyst feedback loop.
  • World Map plots geolocated source and destination arcs with on-demand IP enrichment.
  • Models is a registry of trained models with their metrics and version history.

🏗️ Architecture

NetGuard system architecture

Capture runs in an isolated subprocess, so a crash in the packet layer never takes the API down with it. Alerts reach the UI through MongoDB change streams into a FastAPI WebSocket, giving end-to-end alert latency of roughly 50 ms.
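
The isolation idea can be sketched with the standard library alone. This is not NetGuard's capture code: `run_capture_supervised` is a hypothetical helper, and a real API process would run the supervision in a background thread or task rather than blocking. It only shows that a child crash surfaces as an exit code in the parent instead of an exception.

```python
import subprocess
import sys


def run_capture_supervised(cmd: list, max_restarts: int = 3) -> list:
    """Run `cmd` as a child process and restart it after abnormal exits.

    Returns the exit codes observed. A crash in the child never raises in
    the parent, which is the point of running capture out of process.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        result = subprocess.run(cmd)
        exit_codes.append(result.returncode)
        if result.returncode == 0:
            break  # clean shutdown, nothing to restart
    return exit_codes


if __name__ == "__main__":
    # Stand-in for a capture process that crashes immediately.
    crashing_child = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_capture_supervised(crashing_child, max_restarts=2))
```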


🔎 How detection works

A flow is reduced to about 70 CICFlowMeter-style features, imputed, and scaled. It then passes through a binary gate that decides benign versus attack. Flows the gate marks as attack go to a calibrated multiclass model that assigns the attack type, while flows it marks as benign get a K-Means distance check that flags anything far from the benign clusters as an anomaly. In parallel, a periodic task runs MongoDB aggregation rules over rolling windows to catch distributed patterns. Thresholds are tuned per class and stored next to the models, so changing sensitivity is a config edit rather than a code change.
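
The routing described here can be sketched in plain Python. Everything in the block is an assumption made for illustration: the function names, the threshold keys, the `attack_unclassified` fallback, and the use of callables in place of the real XGBoost models. It shows the control flow only, not the project's inference engine.

```python
import math
from typing import Callable, Mapping, Sequence

Vector = Sequence[float]


def nearest_centroid_distance(x: Vector, centroids: Sequence[Vector]) -> float:
    """Euclidean distance from x to the closest benign centroid."""
    return min(math.dist(x, c) for c in centroids)


def score_flow(
    x: Vector,
    gate_prob: Callable[[Vector], float],
    class_probs: Callable[[Vector], Mapping[str, float]],
    benign_centroids: Sequence[Vector],
    thresholds: Mapping[str, float],
) -> str:
    # Stage 1: the binary gate decides benign versus attack.
    if gate_prob(x) >= thresholds["gate"]:
        # Stage 2: the multiclass model names the attack type.
        probs = class_probs(x)
        label = max(probs, key=probs.get)
        # Assumed behaviour: a per-class threshold must be met, otherwise
        # the flow is reported as an attack of unknown type.
        if probs[label] >= thresholds.get(label, 0.5):
            return label
        return "attack_unclassified"
    # Stage 3: benign-looking flows still get the anomaly check.
    if nearest_centroid_distance(x, benign_centroids) > thresholds["anomaly_distance"]:
        return "anomaly"
    return "benign"
```

Because the thresholds arrive as a plain mapping, they could be loaded from a file stored beside the models, which is what makes a sensitivity change a config edit.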

flowchart LR
    A["Live packets / replay CSV"] --> B["NFStream flow assembly"]
    B --> C["~70 flow features"]
    C --> D{"Binary gate:<br/>benign or attack?"}
    D -->|attack| F["Multiclass classifier<br/>(calibrated XGBoost)"]
    D -->|benign| E["K-Means anomaly check"]
    F --> I["Attack type + MITRE technique"]
    E -->|"far from cluster"| G["Anomaly alert"]
    E -->|normal| H["Benign flow"]
    I --> DB[("MongoDB")]
    G --> DB
    H --> DB
    DB --> K["Behavioural rules<br/>over rolling windows"]
    K -->|"distributed scan / DDoS / brute force"| DB
    DB -->|"change stream"| L["FastAPI WebSocket"]
    L --> M["React dashboard"]

    style D fill:#1e3a5f,color:#cfe3ff,stroke:#3b82f6
    style F fill:#7c2d12,color:#fde6d5,stroke:#ea580c
    style G fill:#7f1d1d,color:#fee2e2,stroke:#ef4444
    style DB fill:#14532d,color:#dcfce7,stroke:#22c55e
    style M fill:#3b0764,color:#efe1ff,stroke:#a855f7
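
A behavioural rule of the kind shown in the diagram could be expressed as a MongoDB aggregation pipeline built in Python. The sketch below is hypothetical: the field names `ts`, `src_ip`, and `dst_port`, the default window, and the port threshold are all assumptions, and the rule covers one simple case (a single source touching many distinct ports across many flows).

```python
from datetime import datetime, timedelta, timezone


def port_scan_pipeline(now: datetime, window_seconds: int = 60, min_ports: int = 100) -> list:
    """Build a pipeline that finds sources hitting many distinct ports in a window."""
    window_start = now - timedelta(seconds=window_seconds)
    return [
        # Keep only flows inside the rolling window.
        {"$match": {"ts": {"$gte": window_start}}},
        # Collect the distinct destination ports seen per source address.
        {"$group": {"_id": "$src_ip", "ports": {"$addToSet": "$dst_port"}}},
        # Turn the port set into a count.
        {"$project": {"distinct_ports": {"$size": "$ports"}}},
        # Keep sources that exceed the rule's threshold.
        {"$match": {"distinct_ports": {"$gte": min_ports}}},
    ]


if __name__ == "__main__":
    pipeline = port_scan_pipeline(datetime.now(timezone.utc))
    print(len(pipeline), "stages")
```

The returned list can be passed to a collection's `aggregate()` method in PyMongo or Motor. A periodic task would rebuild it with a fresh `now` on each run so the window keeps rolling.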

🧰 Tech stack

| Layer | Technology |
| --- | --- |
| Backend | Python 3.13, FastAPI, Uvicorn, Motor and PyMongo |
| ML | scikit-learn, XGBoost, imbalanced-learn |
| Capture | NFStream, Npcap |
| Database | MongoDB 8.x with change streams and aggregation pipelines |
| Frontend | React 19, Vite, Tailwind CSS 4, Zustand, Recharts, react-simple-maps |
| Geo | MaxMind GeoLite2, IPinfo, RDAP, reverse DNS |

🚀 Quick start (Windows)

The production models ship with the repo, so live capture works right after install. You only need the datasets for replay mode or retraining.

1. Prerequisites

| Requirement | Notes |
| --- | --- |
| Python 3.13 | On PATH. The capture layer uses os.add_dll_directory (3.8 and newer). |
| Node.js 18+ | For the Vite dashboard. |
| MongoDB 8.x | Running as a Windows service named MongoDB. |
| Npcap | Needed for live capture. Install it with the "WinPcap API-compatible mode" option checked. Live capture also needs an elevated (admin) terminal. |
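
Some of these prerequisites can be checked before installing. The helper below is a hypothetical sketch, not part of the repo. It covers only what the standard library can see (the Python version and whether `node` and `npm` are on PATH) and does not check the MongoDB service or Npcap.

```python
import shutil
import sys


def preflight() -> dict:
    """Return a pass/fail flag for each prerequisite this sketch can detect."""
    return {
        "python>=3.13": sys.version_info >= (3, 13),
        "node on PATH": shutil.which("node") is not None,
        "npm on PATH": shutil.which("npm") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```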

2. Install

# from the repo root
pip install -r setup/requirements.txt

# frontend dependencies (start.ps1 also does this automatically on first run)
cd frontend; npm install; cd ..

3. Configure

Copy-Item backend/.env.example backend/.env

Every value has a working default, so you can run with the file as is. Optionally add a free IPinfo token to backend/.env for precise map geolocation.

4. Run

.\start.ps1          # launches backend (elevated) and frontend, then opens the dashboard
.\start.ps1 -NoElevate   # skip the UAC prompt; dashboard works, live capture will not
.\stop.ps1               # stop everything

📦 Datasets and geolocation (optional)

These are not committed, because they are large or license-restricted. You only need them for replay mode, retraining, or richer geolocation.

Training and replay data (CICIDS2017 and CIC-IDS2018) are downloaded through the Kaggle API:

# place your Kaggle token at  ~/.kaggle/kaggle.json  first
python setup/download_dataset.py

This pulls the CSVs into backend/data/cicids2017/ and backend/data/cicids2018/ (about 0.8 GB and 6.9 GB extracted).

GeoIP: for offline geolocation, download a free MaxMind GeoLite2-City .mmdb and drop it into backend/data/. Without it, and without an IPinfo token, the World Map still renders but geolocation is limited.
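
The fallback described above can be pictured as a small selection function. This is a sketch under assumptions: `pick_geo_source` is a hypothetical name, the precedence (local database first, then IPinfo) is a guess rather than documented behaviour, and `IPINFO_TOKEN` is the variable name used in the troubleshooting notes.

```python
import os
from pathlib import Path


def pick_geo_source(data_dir, env=None) -> str:
    """Choose a geolocation source from what is available locally."""
    env = os.environ if env is None else env
    data_dir = Path(data_dir)
    # Assumed precedence: an offline .mmdb database wins when present.
    if data_dir.is_dir() and any(data_dir.glob("*.mmdb")):
        return "geolite2"
    # Otherwise fall back to IPinfo when a token is configured.
    if env.get("IPINFO_TOKEN"):
        return "ipinfo"
    # Neither source: the map still renders, with limited geolocation.
    return "limited"
```

For example, `pick_geo_source("backend/data")` would report which of the three situations applies on the current machine.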


🗂️ Project structure

NetGuard/
├── backend/
│   ├── app/
│   │   ├── api/          FastAPI routes, WebSocket, capture subprocess
│   │   ├── capture/      NFStream adapter and feature plugins
│   │   ├── ml/           inference engine, training pipelines, models_c and models_d
│   │   ├── processing/   flow processor and behavioural detector
│   │   ├── security/     MITRE ATT&CK mapping
│   │   ├── geo/          GeoIP, IPinfo, and RDAP enrichment
│   │   └── db/           MongoDB connection, stores, aggregations, schema
│   ├── scripts/          training, benchmarking, and data-audit utilities
│   └── .env.example
├── frontend/             React and Vite dashboard
├── setup/                requirements.txt and the dataset downloader
├── img/                  README images
├── start.ps1 / stop.ps1  launchers
└── _journal/             engineering log, one entry per build session

🛠️ Troubleshooting

| Symptom | Fix |
| --- | --- |
| MongoDB service not found | Install MongoDB and make sure the service is named MongoDB. |
| Live capture shows no flows | Run the backend as admin and confirm Npcap is installed in WinPcap-compatible mode. |
| World Map geolocation is sparse | Add a GeoLite2 .mmdb to backend/data/, or set an IPINFO_TOKEN. |
| Track C models missing | The backend/app/ml/models_c/ artifacts must be present. They ship with the repo. |

Acknowledgements

  • CICIDS2017 and CIC-IDS2018, Canadian Institute for Cybersecurity, University of New Brunswick
  • MITRE ATT&CK®, the threat-technique taxonomy
  • MaxMind GeoLite2 and IPinfo, IP geolocation

📄 License

Released under the MIT License. See LICENSE. The datasets and GeoIP databases, which are downloaded separately, and the bundled CICIDS-derived models carry their own upstream licenses, so review those before redistribution or commercial use.


BrAtUkA

Semester Project - NetGuard
Spring 2026
