abhinav-phi/nids

🛡️ The Sentinel: Network Intrusion Detection System

A production-grade, ML-powered NIDS with real-time packet capture, SHAP explainability, and a live threat intelligence dashboard.

Python FastAPI React TypeScript scikit-learn XGBoost License: MIT


📌 What is This?

The Sentinel is a complete, end-to-end Network Intrusion Detection System that:

  • Captures live network packets using Scapy and assembles them into bidirectional flows
  • Extracts 52 CICIDS2017-compatible features per flow (IATs, packet lengths, TCP flags, etc.)
  • Classifies flows using an ML ensemble (9 models trained, best saved automatically)
  • Explains every prediction using SHAP values, reporting the top 5 most influential features per alert
  • Streams alerts in real time to a React dashboard via WebSocket
  • Simulates attack traffic for demo and testing without a real adversary

Built for hackathons, research, and production prototyping. No black box: every prediction is explainable.


🏗️ System Architecture

┌──────────────────────────────────────────────────────────────────┐
│                         LIVE NETWORK                             │
│                   (Wi-Fi / Ethernet traffic)                     │
└──────────────────────┬───────────────────────────────────────────┘
                       │  Raw packets
                       ▼
┌──────────────────────────────────────────────────────────────────┐
│                    NetworkSniffer  (Scapy)                       │
│  • Captures IP/TCP/UDP/ICMP packets on auto-detected interface   │
│  • Groups packets into flows via 5-tuple key                     │
│    (src_ip, dst_ip, src_port, dst_port, protocol)                │
│  • Closes flows on TCP FIN/RST or after 30s timeout              │
└──────────────────────┬───────────────────────────────────────────┘
                       │  Flow packet dicts
                       ▼
┌──────────────────────────────────────────────────────────────────┐
│                 FlowExtractor  (CICIDS2017)                      │
│  • Computes 52 exact CICIDS2017 features per flow                │
│  • Statistical: packet length mean/std/min/max                   │
│  • Temporal: IATs, flow duration, active/idle periods            │
│  • Protocol: TCP flags (FIN/PSH/ACK), window sizes               │
│  • Rates: Flow Bytes/s, Flow Packets/s, Fwd/Bwd Packets/s        │
└──────────────────────┬───────────────────────────────────────────┘
                       │  Feature dict (52 keys)
                       ▼
┌──────────────────────────────────────────────────────────────────┐
│                FastAPI Backend  (port 8000)                      │
│                                                                  │
│  POST /api/predict ──► StandardScaler ──► ML Model              │
│                                     ├──► SHAP TreeExplainer      │
│                                     └──► SQLite / PostgreSQL DB  │
│                                                                  │
│  GET  /api/stats          - Total flows, attack counts by type   │
│  GET  /api/alerts         - Paginated alert history              │
│  GET  /api/ip-leaderboard - Top attacker IPs                     │
│  WS   /ws/live            - Real-time alert stream               │
│  POST /api/sniffer/start  - Start packet capture                 │
│  POST /api/sniffer/stop   - Stop packet capture                  │
│  GET  /health             - System health (DB, model, sniffer)   │
└──────────────────────┬───────────────────────────────────────────┘
                       │  WebSocket / REST
                       ▼
┌──────────────────────────────────────────────────────────────────┐
│              React Dashboard  (port 5173)                        │
│                                                                  │
│  • KPI Cards - total flows, attacks, detection rate, uptime      │
│  • Live Traffic Chart - alerts per minute over time              │
│  • Attack Pie Chart - distribution by attack type                │
│  • Alert Feed - scrollable list of real-time alerts              │
│  • IP Leaderboard - top source IPs by attack count               │
│  • Attack Timeline - temporal heat map of attack events          │
│  • SHAP Explainer - per-alert feature importance bars            │
└──────────────────────────────────────────────────────────────────┘
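
The flow grouping done in the NetworkSniffer stage can be sketched roughly as below. This is an illustration rather than the code in sniffer.py: the helper name flow_key and the trick of sorting the two endpoints so that both directions share one key are assumptions, since the README only says packets are grouped into bidirectional flows by 5-tuple.

```python
from collections import defaultdict


def flow_key(pkt: dict) -> tuple:
    """Hypothetical helper: build a direction-independent 5-tuple key."""
    a = (pkt["src_ip"], pkt["src_port"])
    b = (pkt["dst_ip"], pkt["dst_port"])
    # Sorting the endpoints makes A->B and B->A map to the same key.
    lo, hi = sorted([a, b])
    return (lo[0], hi[0], lo[1], hi[1], pkt["protocol"])


# Toy packets: a TCP request, its reply, and an unrelated UDP packet.
packets = [
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "src_port": 51000, "dst_port": 80, "protocol": 6},
    {"src_ip": "10.0.0.9", "dst_ip": "10.0.0.5", "src_port": 80, "dst_port": 51000, "protocol": 6},
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "src_port": 51001, "dst_port": 53, "protocol": 17},
]

flows = defaultdict(list)
for pkt in packets:
    flows[flow_key(pkt)].append(pkt)
```

With this key, the request and its reply land in the same bucket, while the UDP packet opens a second flow.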

🧠 ML Pipeline

Dataset

The model is trained on the CICIDS2017 dataset, a widely used benchmark containing normal traffic and 14 attack categories including DDoS, DoS, PortScan, Brute Force, Bot, and Web Attacks.

Training Pipeline (src/model/train.py)

| Step | Description |
| --- | --- |
| A. Load | Reads all CSVs from data/raw/ (up to 400,000 rows via load_data(RAW_DIR, n_samples=400_000)) |
| B. Clean | Removes NaN, Inf, and duplicate rows |
| C. Feature Engineering | Adds 7 domain-specific ratio features (flow bytes/packet, fwd/bwd ratios, IAT jitter, etc.) |
| D. Split | Separates features and label column (Attack Type) |
| E. Encode | LabelEncoder → numeric class indices, saved as label_encoder.pkl |
| F. Stratified Split | 80/20 train/test with stratify=y |
| G. Class Balancing | SMOTE disabled (USE_SMOTE = False); class imbalance handled natively via class_weight="balanced" on supported classifiers |
| H. Dual Scalers | Fits StandardScaler and RobustScaler; best scaler saved as scaler.pkl |
| I. PCA Analysis | Experimental comparison at 90/95/99% variance (not used in production) |
| J. Train 9 Models | See table below |
| K. Cross-Validation | 5-fold stratified CV on top 2 models, run on the full standard-scaled training dataset (X_tr_std) |
| L. Compare | Sorted leaderboard by Macro F1 |
| M. Save Best | Best model saved as model.pkl |
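
To make Step G concrete: scikit-learn's class_weight="balanced" weights each class by n_samples / (n_classes * class_count), so rare classes count more in the loss. The sketch below reproduces that formula in plain Python; the function name and the toy label counts are made up for illustration and are not taken from train.py.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Per-class weights using the 'balanced' heuristic:
    n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}


# Toy, heavily imbalanced label column (counts are illustrative only).
labels = ["BENIGN"] * 900 + ["DDoS"] * 90 + ["Bot"] * 10
weights = balanced_class_weights(labels)

for cls, w in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{cls:>7}: {w:.3f}")
```

The rarest class receives the largest weight, which is how the classifiers compensate for imbalance without generating synthetic samples.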

Model Suite (9 Classifiers)

| # | Model | Search Strategy | Notes |
| --- | --- | --- | --- |
| 1 | Logistic Regression | Fixed params | Baseline; class_weight="balanced" |
| 2 | Decision Tree | GridSearchCV | max_depth, criterion; class_weight="balanced" (both Standard & Robust scaler instances) |
| 3 | Random Forest | RandomizedSearchCV | n_estimators, max_features; class_weight="balanced" (both Standard & Robust scaler instances) |
| 4 | XGBoost | RandomizedSearchCV | learning_rate, subsample |
| 5 | LightGBM | Fixed params | Standard dependency (no longer optional fallback) |
| 6 | SVM (RBF kernel) | Fixed params | Deterministic first-15k slice (X_tr_std[:15000]); no CV; class_weight="balanced" |
| 7 | Neural Network (MLP) | Fixed params | 256→128→64 ReLU, Adam, early stopping |
| 8 | Voting Ensemble | Soft voting | RF + XGB + LGBM/MLP |
| 9 | Stacking Ensemble | LR meta-learner | RF + XGB + LGBM/MLP → LR |

Both StandardScaler and RobustScaler are compared for tree models. RobustScaler handles DDoS-induced outliers (e.g. 10⁶ pkt/s) better because it uses median and IQR instead of mean/std.
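
The effect of a single extreme rate on the two scaling schemes can be seen in a small standard-library sketch. It mimics what StandardScaler (mean and population standard deviation) and RobustScaler (median and interquartile range) compute for one feature; the sample values are invented for illustration.

```python
import statistics


def standard_scale(xs):
    """(x - mean) / std, using the population standard deviation."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)
    return [(x - mean) / std for x in xs]


def robust_scale(xs):
    """(x - median) / IQR, with quartiles from linear interpolation."""
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return [(x - median) / (q3 - q1) for x in xs]


# Four ordinary packet rates and one flood-like outlier (illustrative).
rates = [100.0, 120.0, 140.0, 160.0, 1_000_000.0]

std_scaled = standard_scale(rates)
rob_scaled = robust_scale(rates)

print("standard:", [round(v, 5) for v in std_scaled])
print("robust:  ", [round(v, 5) for v in rob_scaled])
```

Under standard scaling the outlier inflates both mean and standard deviation, so the four ordinary rates collapse to nearly the same value. Under robust scaling they stay spread out (-1, -0.5, 0, 0.5) and only the outlier is pushed far away.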

Neural Network Architecture

Input (52 features)
       │
   Dense(256, relu)
       │
   Dense(128, relu)
       │
    Dense(64, relu)
       │
   Dense(n_classes, softmax)

Optimizer: Adam (lr=1e-3, adaptive)
L2 Reg:    α = 1e-4
Batch:     512
Early Stop: 15 no-improve epochs on 10% validation split

SHAP Explainability

A TreeExplainer is cached once at server startup. For every non-benign prediction, the top 5 features by absolute SHAP value are returned alongside the prediction. The React SHAPExplainer component visualizes these as a horizontal bar chart per alert.
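
The "top 5 by absolute SHAP value" selection can be sketched without the SHAP library itself. The function name, the "feature" key, and the sample values below are assumptions for illustration; only the "value" field name appears elsewhere in this README.

```python
def top_k_features(shap_values: dict, k: int = 5) -> list:
    """Rank features by |SHAP value| and keep the k largest.

    The sign is preserved in the output so the UI can still show whether
    a feature pushed the prediction towards or away from the class.
    """
    ranked = sorted(shap_values.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return [{"feature": name, "value": value} for name, value in ranked[:k]]


# Hypothetical per-feature SHAP values for one flagged flow.
example = {
    "Flow Packets/s": 0.42,
    "Flow Duration": -0.31,
    "ACK Flag Count": 0.05,
    "Fwd IAT Mean": -0.18,
    "Destination Port": 0.27,
    "Idle Mean": 0.01,
    "Bwd Packets/s": -0.09,
}

top5 = top_k_features(example)
for item in top5:
    print(f'{item["feature"]:<18} {item["value"]:+.2f}')
```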


🌐 Feature Engineering

The FlowExtractor (src/features/extractor.py) converts raw Scapy packet dicts into the exact 52-feature CICIDS2017 vector expected by the model.

Feature Categories

| Category | Features |
| --- | --- |
| Basic | Destination Port, Flow Duration, Total Fwd/Bwd Packets |
| Packet Length | Fwd/Bwd Packet Length (Max, Min, Mean, Std), Min/Max/Mean/Std/Variance overall |
| Flow Rates | Flow Bytes/s, Flow Packets/s, Fwd Packets/s, Bwd Packets/s |
| IAT (Inter-Arrival Time) | Flow IAT (Mean/Std/Max/Min), Fwd IAT (Total/Mean/Std/Max/Min), Bwd IAT (same) |
| Headers | Fwd Header Length, Bwd Header Length |
| TCP Flags | FIN Flag Count, PSH Flag Count, ACK Flag Count |
| Window / Segment | Init_Win_bytes_forward, Init_Win_bytes_backward, min_seg_size_forward |
| Subflow | Subflow Fwd Bytes, act_data_pkt_fwd, Average Packet Size |
| Active / Idle | Active Mean/Max/Min, Idle Mean/Max/Min |

IATs are computed in microseconds (matching CICIDS2017 scale). Active/Idle periods are classified by a 5-second inter-packet gap threshold.
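
As a rough illustration of these two rules, the sketch below derives inter-arrival times in microseconds from packet timestamps (in seconds) and splits a flow into active and idle periods at the 5-second gap threshold. The function names and the exact segmentation logic are assumptions, not a copy of extractor.py.

```python
ACTIVE_TIMEOUT_S = 5.0  # gap threshold between active and idle, in seconds


def iats_microseconds(timestamps):
    """Inter-arrival times between consecutive packets, in microseconds."""
    return [(b - a) * 1_000_000 for a, b in zip(timestamps, timestamps[1:])]


def split_active_idle(timestamps, threshold_s=ACTIVE_TIMEOUT_S):
    """Return (active_durations, idle_durations) in seconds.

    A gap larger than the threshold ends the current active period and
    is itself recorded as an idle period.
    """
    active, idle = [], []
    start = prev = timestamps[0]
    for t in timestamps[1:]:
        gap = t - prev
        if gap > threshold_s:
            active.append(prev - start)
            idle.append(gap)
            start = t
        prev = t
    active.append(prev - start)  # close the final active period
    return active, idle


ts = [0.0, 0.5, 1.0, 8.0, 8.25]  # toy packet arrival times
iats = iats_microseconds(ts)
active, idle = split_active_idle(ts)
```

For these timestamps, the 7-second gap between the third and fourth packets is the only idle period, and it separates two active periods.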


🔄 Real-Time Data Flow

1. NetworkSniffer captures packet on interface 'eth0' / 'Wi-Fi'
         ↓
2. Packet parsed → 12-field dict (src_ip, dst_ip, ports, protocol,
   size, payload_len, header_len, time, tcp_flags, window_size, ttl)
         ↓
3. Packet added to flow bucket (keyed by 5-tuple)
   Flow closes on: TCP FIN/RST | 30s timeout | 500-packet cap
         ↓
4. FlowExtractor.extract_from_dicts() → {52 CICIDS features}
         ↓
5. POST http://localhost:8000/api/predict (non-blocking thread)
         ↓
6. Backend:
   a. Strips metadata (_source_ip, _dst_port, ...)
   b. Scales 52 features with StandardScaler
   c. model.predict() → class index
   d. model.predict_proba() → confidence score
   e. LabelEncoder.inverse_transform() → human-readable label
   f. SHAP TreeExplainer → top-5 feature importances
   g. Severity mapping (CRITICAL/HIGH/MEDIUM/LOW/NONE)
   h. Saves Alert to database
   i. If not BENIGN → broadcast via WebSocket
         ↓
7. Dashboard WebSocket client receives alert JSON
   → Updates KPICards, AlertFeed, TrafficChart, PieChart in real-time
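
Step 6a can be pictured with a short sketch. It assumes that metadata keys are exactly those starting with an underscore, as the examples _source_ip and _dst_port suggest; the function name and sample payload are hypothetical.

```python
def split_metadata(payload: dict):
    """Separate underscore-prefixed metadata from model features."""
    meta = {k: v for k, v in payload.items() if k.startswith("_")}
    features = {k: v for k, v in payload.items() if not k.startswith("_")}
    return meta, features


# Shortened example payload; a real request would carry all 52 features.
payload = {
    "_source_ip": "192.168.1.20",
    "_dst_port": 443,
    "Flow Duration": 1200.0,
    "ACK Flag Count": 3,
}

meta, features = split_metadata(payload)
```

Only the features dict would then be scaled and passed to the model, while the metadata is kept for the stored alert.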

🖥️ Frontend Dashboard

Built with React 18 + TypeScript + Vite + Tailwind CSS + shadcn/ui + Recharts.

| Component | Description |
| --- | --- |
| StatusBar | WS connection indicator, live clock, system status |
| KPICards | Total flows, attacks detected, detection rate, uptime |
| TrafficChart | Recharts LineChart: attacks per minute, last 30 points |
| AttackPieChart | Recharts PieChart: attack type distribution |
| AlertFeed | Real-time scrollable alert list with severity color coding |
| IPLeaderboard | Top 10 most aggressive source IPs |
| AttackTimeline | Temporal bar chart of attack events over time |
| SHAPExplainer | Per-alert SHAP feature importance bar chart |
| Sidebar | Navigation rail with system overview |

WebSocket Hook (useWebSocket.ts)

  • Connects to ws://localhost:8000/ws/live
  • On connect: receives batch of last 50 alerts for history seeding
  • Auto-reconnects on disconnect (3s delay)
  • Normalizes both new (value) and legacy (impact) SHAP field names

🔌 REST API Reference

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/predict | Submit a 52-feature flow dict for classification |
| GET | /api/alerts | Paginated alert history |
| GET | /api/stats | Total flows, attacks by type/severity, uptime |
| GET | /api/ip-leaderboard | Top N attacker IPs |
| POST | /api/sniffer/start | Start live packet capture |
| POST | /api/sniffer/stop | Stop live packet capture |
| GET | /api/sniffer/stats | Sniffer stats (packets, flows, alerts) |
| GET | /health | System health (DB, model, sniffer, WS clients) |
| WS | /ws/live | Real-time alert stream (WebSocket) |

Interactive API docs: http://localhost:8000/docs
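
A client call to the prediction endpoint might look like the following standard-library sketch. It assumes the request body is a flat JSON object keyed by feature name; the two-feature payload is only a placeholder (the real endpoint expects the full 52-feature dict), and the helper name is hypothetical. The interactive docs remain the authoritative schema.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"


def build_predict_request(features: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST /api/predict request."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_predict_request({"Destination Port": 80, "Flow Duration": 1200.0})

# Sending requires the backend to be running, so it is left commented out:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(json.load(resp))
```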


🚀 Quick Start

Prerequisites

| Tool | Version | Purpose |
| --- | --- | --- |
| Python | 3.10+ | Backend & ML |
| Node.js | 18+ | Frontend |
| Npcap | Latest | Packet capture on Windows |
| Git | Any | Clone repo |

Windows note: Install Npcap with "WinPcap API-compatible mode" enabled. Run the backend as Administrator for packet capture.


1. Clone the Repository

git clone https://github.com/abhinav-phi/nids.git
cd nids

2. Backend Setup

cd nids-backend

# Create a virtual environment (recommended)
python -m venv venv

# Activate it
# Windows:
venv\Scripts\activate
# Linux/Mac:
source venv/bin/activate

# Install all dependencies
pip install -r requirements.txt

Optional (PostgreSQL): Create a .env file if you want to use PostgreSQL instead of the default SQLite:

DATABASE_URL=postgresql://nids:password@localhost:5432/nids_db

3. Train the ML Model

Download the CICIDS2017 dataset CSVs from https://www.unb.ca/cic/datasets/ids-2017.html and place them directly in nids-backend/data/raw/.

# From inside nids-backend/
python src/model/train.py

This will:

  1. Load and clean the CSV data (up to 400,000 samples)
  2. Engineer 7 additional ratio features
  3. Handle class imbalance natively via class_weight="balanced" (SMOTE disabled)
  4. Train and compare 9 ML models (takes roughly 10 to 30 minutes depending on hardware)
  5. Save the best model, scaler, and label encoder:
    • nids-backend/model.pkl
    • nids-backend/scaler.pkl
    • nids-backend/label_encoder.pkl

Shortcut: If you have pre-trained artifacts, place them in nids-backend/ and skip this step.


4. Start the Backend Server

# From inside nids-backend/
# Standard mode (no auto-start of packet capture):
uvicorn src.api.main:app --reload --port 8000

# With automatic packet capture on startup, set NIDS_CAPTURE=1 first.
# Windows (cmd):
set NIDS_CAPTURE=1
# Linux/Mac:
export NIDS_CAPTURE=1
uvicorn src.api.main:app --port 8000

Verify it's running:

curl http://localhost:8000/health

Expected response:

{
  "status": "ok",
  "db": "ok",
  "model": "ok",
  "sniffer": "stopped",
  "uptime_seconds": 3.2,
  "ws_clients": 0
}

5. Start the Frontend

cd nids-frontend
npm install
npm run dev

Open http://localhost:5173 in your browser.


6. Start Packet Capture (Live Mode)

Option A (via the API):

curl -X POST http://localhost:8000/api/sniffer/start

Option B (environment variable): set NIDS_CAPTURE=1 before starting the server (see Step 4).

Option C (standalone sniffer):

# From inside nids-backend/
python src/capture/sniffer.py --interface auto

🎭 Attack Simulation (Demo Mode)

No real adversary? Use the built-in simulators to generate attack traffic for demonstration.

cd nids-backend

# Simulate a DDoS UDP flood (300 packets, ~1000 pps)
python src/simulation/sim_ddos.py --target 127.0.0.1 --count 300

# Simulate a port scan
python src/simulation/sim_portscan.py --target 127.0.0.1

# Simulate a brute force attack
python src/simulation/sim_bruteforce.py --target 127.0.0.1

# Simulate a mixed attack scenario
python src/simulation/sim_mixed.py

The sniffer will pick up the generated packets, extract features, and send them to the prediction API. Alerts will appear on the dashboard in real time.


🧪 Running Tests

cd nids-backend

# Feature extraction sanity check
python test_pipeline.py

# API integration tests (requires server running)
python test_api.py

# ML prediction test
python check.py

# Run the full pytest suite
pytest tests/

📁 Project Structure

nids/
├── nids-backend/
│   ├── src/
│   │   ├── api/
│   │   │   ├── main.py          # FastAPI app, WebSocket, sniffer control
│   │   │   ├── database.py      # SQLAlchemy engine + session factory
│   │   │   ├── models.py        # Alert ORM model
│   │   │   ├── schemas.py       # Pydantic request/response schemas
│   │   │   └── routes/
│   │   │       ├── predict.py   # POST /api/predict - ML inference endpoint
│   │   │       ├── alerts.py    # GET /api/alerts - alert history
│   │   │       └── stats.py     # GET /api/stats, /api/ip-leaderboard
│   │   ├── capture/
│   │   │   └── sniffer.py       # Live packet capture, flow assembly
│   │   ├── features/
│   │   │   └── extractor.py     # 52-feature CICIDS2017 extractor
│   │   ├── model/
│   │   │   ├── train.py         # Full ML training pipeline (9 models)
│   │   │   ├── predict.py       # Inference wrapper + SHAP
│   │   │   └── evaluate.py      # Model evaluation utilities
│   │   └── simulation/
│   │       ├── sim_ddos.py      # UDP DDoS flood simulator
│   │       ├── sim_portscan.py  # Port scan simulator
│   │       ├── sim_bruteforce.py # Brute force simulator
│   │       └── sim_mixed.py     # Mixed attack scenario
│   ├── notebooks/
│   │   ├── 01_eda.ipynb         # Exploratory Data Analysis
│   │   ├── 02_training.ipynb    # Training walkthrough
│   │   └── 02_training_with_nn.ipynb # Neural Network training
│   ├── model.pkl                # ← Trained model (generated by train.py)
│   ├── scaler.pkl               # ← StandardScaler (generated by train.py)
│   ├── label_encoder.pkl        # ← LabelEncoder (generated by train.py)
│   └── requirements.txt
│
└── nids-frontend/
    └── src/
        ├── pages/
        │   └── Index.tsx        # Main dashboard page layout
        ├── components/
        │   ├── StatusBar.tsx    # Connection & system status bar
        │   ├── KPICards.tsx     # Key performance indicator cards
        │   ├── TrafficChart.tsx # Live traffic line chart
        │   ├── AttackPieChart.tsx # Attack type pie chart
        │   ├── AlertFeed.tsx    # Real-time alert list
        │   ├── IPLeaderboard.tsx # Top attacker IP table
        │   ├── AttackTimeline.tsx # Temporal attack timeline
        │   ├── SHAPExplainer.tsx # SHAP feature importance bars
        │   └── Sidebar.tsx      # Navigation sidebar
        └── hooks/
            └── useWebSocket.ts  # WebSocket connection + alert normalization

🛠️ Tech Stack

Backend

| Technology | Version | Role |
| --- | --- | --- |
| Python | 3.10+ | Core language |
| FastAPI | 0.110 | REST API + WebSocket server |
| Uvicorn | 0.29 | ASGI web server |
| SQLAlchemy | 2.0 | ORM + database layer |
| scikit-learn | 1.4 | ML models, scalers, class weighting |
| XGBoost | 2.0 | Gradient-boosted tree classifier |
| LightGBM | 4.x | Fast gradient boosting |
| SHAP | 0.45 | Model explainability |
| Scapy | 2.5 | Live packet capture |
| pandas / numpy | 2.2 / 1.26 | Data processing |
| matplotlib / seaborn | - | Evaluation plots & notebook visualizations |
| requests | - | API connection testing & sniffer scripts |
| SQLite / PostgreSQL | - | Alert persistence |

Frontend

| Technology | Version | Role |
| --- | --- | --- |
| React | 18.3 | UI framework |
| TypeScript | 5.8 | Type-safe JavaScript |
| Vite | 7.3.2 | Build tool + dev server |
| Tailwind CSS | 3.4 | Utility-first styling |
| shadcn/ui | - | Accessible component primitives |
| Recharts | 2.15 | Charts (line, pie, bar) |
| TanStack Query | 5 | Server state management |
| React Router | 6 | Client-side routing |
| Lucide React | - | Icon library |

⚙️ Configuration

Backend Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| DATABASE_URL | sqlite:///./nids.db | Database connection string |
| NIDS_CAPTURE | 0 | Set to 1 to auto-start sniffer on startup |
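
Reading these two variables with their documented defaults could look like the sketch below. The function name load_settings and the returned keys are hypothetical; the backend may well load its configuration differently (for example through a .env loader).

```python
import os


def load_settings(env=os.environ) -> dict:
    """Resolve backend settings, falling back to the documented defaults."""
    return {
        "database_url": env.get("DATABASE_URL", "sqlite:///./nids.db"),
        # Only the exact string "1" enables auto-capture in this sketch.
        "auto_capture": env.get("NIDS_CAPTURE", "0") == "1",
    }


# With nothing set, the defaults from the table apply.
defaults = load_settings({})

# Overriding both values, as one would via a .env file or the shell.
custom = load_settings({
    "DATABASE_URL": "postgresql://nids:password@localhost:5432/nids_db",
    "NIDS_CAPTURE": "1",
})
```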

Key Constants (sniffer.py)

| Constant | Default | Description |
| --- | --- | --- |
| FLOW_TIMEOUT_SECONDS | 30 | Close inactive flows after N seconds |
| MAX_PACKETS_PER_FLOW | 500 | Safety cap per flow before early processing |
| ACTIVE_TIMEOUT | 5.0s | IAT threshold for active/idle classification |
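
How the first two constants could drive the decision to close a flow is sketched below. It assumes each packet dict stores its TCP flags as an integer bitmask under tcp_flags and its capture time in seconds under time; the function name and that representation are assumptions, not the actual sniffer.py logic.

```python
FLOW_TIMEOUT_SECONDS = 30
MAX_PACKETS_PER_FLOW = 500

# Standard TCP header flag bits.
TCP_FIN = 0x01
TCP_RST = 0x04


def should_close(flow_packets: list, now: float) -> bool:
    """Decide whether a flow is finished and ready for feature extraction."""
    last = flow_packets[-1]
    # 1. Explicit teardown: the latest packet carries FIN or RST.
    if last.get("tcp_flags", 0) & (TCP_FIN | TCP_RST):
        return True
    # 2. Safety cap: process very long flows early.
    if len(flow_packets) >= MAX_PACKETS_PER_FLOW:
        return True
    # 3. Inactivity: no packet seen for the timeout window.
    return now - last["time"] >= FLOW_TIMEOUT_SECONDS


fin_ack = [{"time": 0.0, "tcp_flags": 0x11}]   # FIN + ACK
ack_only = [{"time": 0.0, "tcp_flags": 0x10}]  # ACK
long_flow = [{"time": 0.0, "tcp_flags": 0x10}] * MAX_PACKETS_PER_FLOW
```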

🔐 Severity Classification

| Level | Attack Types |
| --- | --- |
| 🔴 CRITICAL | DDoS, DoS Hulk, DoS GoldenEye, DoS Slowloris, DoS SlowHTTPTest, Heartbleed |
| 🟠 HIGH | Bot, FTP-Patator, SSH-Patator, Infiltration |
| 🟡 MEDIUM | PortScan, Web Attack (Brute Force, XSS, SQL Injection) |
| 🟢 LOW | Brute Force (generic), unknown attack types |
| ⚪ NONE | BENIGN (normal traffic) |
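
The table translates naturally into a lookup with a LOW fallback for unknown labels. In the sketch below, the exact label strings (which depend on how the dataset's Attack Type column is spelled) and the prefix match for web attacks are assumptions for illustration.

```python
SEVERITY_BY_LABEL = {
    "DDoS": "CRITICAL",
    "DoS Hulk": "CRITICAL",
    "DoS GoldenEye": "CRITICAL",
    "DoS Slowloris": "CRITICAL",
    "DoS SlowHTTPTest": "CRITICAL",
    "Heartbleed": "CRITICAL",
    "Bot": "HIGH",
    "FTP-Patator": "HIGH",
    "SSH-Patator": "HIGH",
    "Infiltration": "HIGH",
    "PortScan": "MEDIUM",
    "Brute Force": "LOW",
}


def severity_for(label: str) -> str:
    """Map a predicted label to a severity level."""
    if label == "BENIGN":
        return "NONE"
    # All web attack variants share one level in the table.
    if label.startswith("Web Attack"):
        return "MEDIUM"
    # Unknown attack types fall back to LOW, as the table specifies.
    return SEVERITY_BY_LABEL.get(label, "LOW")
```

For example, severity_for("DDoS") gives "CRITICAL", while a label the mapping has never seen gives "LOW".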

📊 Model Performance (Typical on CICIDS2017)

Exact metrics will vary depending on the CSV files and sample sizes used. Run python src/model/train.py to reproduce.

| Model | Accuracy | Macro F1 | Notes |
| --- | --- | --- | --- |
| XGBoost | ~99%+ | ~97%+ | Usually best single model |
| Random Forest | ~99% | ~96% | Very close to XGBoost |
| Voting Ensemble | ~99%+ | ~97%+ | RF + XGB + LGBM |
| Neural Network | ~98% | ~93% | 256→128→64 MLP |
| Decision Tree | ~98% | ~90% | After GridSearchCV |
| Logistic Regression | ~95% | ~75% | Baseline |
| SVM (RBF) | ~97% | ~85% | 15k subset only |
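
The gap between the Accuracy and Macro F1 columns comes from class imbalance: accuracy is dominated by the majority class, whereas macro F1 averages per-class F1 with equal weight. The toy example below (invented labels, plain Python rather than scikit-learn) shows the effect.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# 95 benign flows and 5 bot flows; the classifier misses 4 of the 5 bots.
y_true = ["BENIGN"] * 95 + ["Bot"] * 5
y_pred = ["BENIGN"] * 95 + ["BENIGN"] * 4 + ["Bot"] * 1

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = macro_f1(y_true, y_pred)

print(f"accuracy = {accuracy:.2f}, macro F1 = {f1:.2f}")
```

Accuracy stays at 96% even though most bot flows are missed, while macro F1 drops to about 0.66, which is why the leaderboard is sorted by Macro F1.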

🧩 Hackathon Demo Flow

Open 3 separate terminals. Terminals 2 & 3 must be run as Administrator.

1️⃣ Frontend (Terminal 1)

cd nids-frontend
# First time only:
npm install
npm run dev

Dashboard live at http://localhost:5173 (keep this terminal open)


2️⃣ Backend (Terminal 2, Admin)

cd nids-backend
# First time only: python -m venv venv
venv\Scripts\activate
# First time only:
pip install -r requirements.txt
uvicorn src.api.main:app --reload --port 8000

The model is already trained, so do not re-run train.py. (keep this terminal open)


3️⃣ Trigger Attacks (Terminal 3, Admin)

cd nids-backend

# Start the packet sniffer
curl -X POST http://localhost:8000/api/sniffer/start
# Expected: { "running": true }

# Send attack traffic
python send_attacks.py

Live alerts will appear on the dashboard in real time.


4️⃣ Changing Attack Type (send_attacks.py)

Edit the skiprows value inside send_attacks.py to sample different attack traffic from the dataset:

| Attack Type | skiprows value |
| --- | --- |
| Brute Force | 500_000 |
| DoS | 100_000 |
| Port Scan | 2_000_000 |
| DDoS / Web Attack | 1_500_000 |

⚠️ Important Notes

  • Place send_attacks.py inside the nids-backend/ folder
  • Terminals 2 and 3 must be run in Administrator mode (packet capture requires elevated privileges)
  • Do not close any terminal while the demo is running

🛡️ Security Notes

  • The sniffer requires administrator / root privileges to capture raw packets
  • On Windows, Npcap must be installed (download from https://npcap.com)
  • CORS is configured for localhost dev servers; update allow_origins for production
  • The simulation scripts send real packets on your network; use 127.0.0.1 as the target in demos

📝 Changelog

Latest Changes

ML Pipeline (src/model/train.py)

Class Imbalance Strategy

  • SMOTE has been disabled (USE_SMOTE = False) to eliminate synthetic sample noise and excessive memory overhead during training
  • class_weight="balanced" introduced natively on LogisticRegression, DecisionTreeClassifier (both Standard and Robust Scaler GridSearchCV instances), RandomForestClassifier (both Standard and Robust Scaler RandomizedSearchCV instances), and SVC

SVM Optimization

  • Replaced randomized sampling index with a deterministic first-15k slice (X_tr_std[:15000], y_tr[:15000]) for fully reproducible SVM runs
  • Cross-validation and Grid Search removed from SVM to keep its computational footprint minimal given its O(n²) complexity

Data Loading & Validation

  • Base sample size increased to 400,000 (load_data(RAW_DIR, n_samples=400_000)) for a richer training distribution
  • cross_validate_top() now runs 5-Fold Stratified CV on the full standard-scaled dataset (X_tr_std) instead of a random 30k subsample, producing more robust evaluation metrics

Dependencies

Backend (requirements.txt)

  • Added matplotlib (used in evaluation scripts and notebooks)
  • Added seaborn (used for notebook visualizations)
  • Added lightgbm (promoted from optional fallback to standard required dependency)
  • Added requests (used in API connection testing and sniffer scripts)
  • Removed imbalanced-learn (no longer needed following SMOTE removal)

Frontend (package-lock.json)

  • axios: 1.13.6 → 1.15.0
  • proxy-from-env: 1.1.0 → 2.1.0
  • lodash: 4.17.23 → 1.18.1
  • vite: 7.3.1 → 7.3.2

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.


Built with ❤️ for network security research and hackathon competition.

The Sentinel: See every packet. Understand every threat.
