A comprehensive comparison of computational efficiency and predictive performance for credit card fraud detection.
This project benchmarks two popular machine learning algorithms—Random Forest and XGBoost—on the challenging task of credit card fraud detection. The dataset is highly imbalanced (~0.17% fraud), making it an excellent test case for evaluating model performance beyond simple accuracy.
- Comprehensive Benchmarking: Training time, inference speed, memory usage, and model size
- Fraud-Focused Metrics: AUPRC, F1-Score, Recall, Precision (not just accuracy!)
- Visual Comparisons: ROC curves, Precision-Recall curves, confusion matrices
- Imbalance Handling: Proper use of `class_weight` and `scale_pos_weight`
- GPU Support: Automatic GPU detection for XGBoost acceleration
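Both imbalance knobs can be derived from the label vector alone. A minimal sketch with toy labels (not the project's code) mirroring the dataset's ~0.17% fraud rate:

```python
import numpy as np

# Toy labels mimicking the dataset's ~0.17% fraud rate (1 = fraud)
y = np.zeros(10_000, dtype=int)
y[:17] = 1

# scikit-learn: class_weight='balanced' weights each class as
# n_samples / (n_classes * class_count)
n, counts = len(y), np.bincount(y)
balanced_weights = {c: n / (2 * counts[c]) for c in (0, 1)}

# XGBoost: scale_pos_weight = n_negative / n_positive
scale_pos_weight = (y == 0).sum() / (y == 1).sum()
```

With the real dataset (284,315 normal / 492 fraud) the same ratio gives the `scale_pos_weight=577` used below.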
```bash
git clone https://github.com/cauegrassi7/fraud-detection-benchmark.git
cd fraud-detection-benchmark
pip install -r requirements.txt
```

Download the Credit Card Fraud Detection dataset from Kaggle:
👉 https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud
Place the creditcard.csv file in the data/ directory:
```
fraud-detection-benchmark/
└── data/
    └── creditcard.csv
```
```bash
python main.py
```

Or with a custom data path:

```bash
python main.py --data-path /path/to/your/creditcard.csv
```

```
╔══════════════════════════════════════════════════════════════════╗
║       FRAUD DETECTION BENCHMARK: Random Forest vs XGBoost        ║
╚══════════════════════════════════════════════════════════════════╝

┌──────────────────────────────────────────────────────────────────┐
│              PREDICTIVE PERFORMANCE (Higher is Better)           │
├────────────────────┬────────────────────┬────────────────────────┤
│ Metric             │ Random Forest      │ XGBoost                │
├────────────────────┼────────────────────┼────────────────────────┤
│ AUPRC              │ 0.8523             │ 0.8701 ★               │
│ F1-Score           │ 0.8234             │ 0.8456 ★               │
│ Recall             │ 0.7891             │ 0.8123 ★               │
│ Precision          │ 0.8612             │ 0.8823 ★               │
└────────────────────┴────────────────────┴────────────────────────┘

┌──────────────────────────────────────────────────────────────────┐
│             COMPUTATIONAL EFFICIENCY (Lower is Better)           │
├────────────────────┬────────────────────┬────────────────────────┤
│ Metric             │ Random Forest      │ XGBoost                │
├────────────────────┼────────────────────┼────────────────────────┤
│ Training Time      │ 45.23 s            │ 12.87 s ★              │
│ Inference/1k       │ 0.023 s            │ 0.008 s ★              │
│ Peak Memory        │ 1,234 MB           │ 567 MB ★               │
│ Model Size         │ 89.5 MB            │ 12.3 MB ★              │
└────────────────────┴────────────────────┴────────────────────────┘
```
```
fraud-detection-benchmark/
├── src/
│   ├── __init__.py            # Package initialization
│   ├── data_processing.py     # Data loading, scaling, splitting
│   ├── models.py              # Model factory functions
│   ├── benchmark.py           # Timing and memory measurement
│   └── visualization.py       # Plot generation
├── data/
│   └── creditcard.csv         # Dataset (user provides)
├── outputs/                   # Generated plots and saved models
├── main.py                    # Main orchestrator script
├── requirements.txt           # Python dependencies
└── README.md
```
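The kind of measurement `benchmark.py` performs can be sketched in a few lines; the `benchmark_fit` helper below is hypothetical (not the project's actual API) and approximates model size via pickle serialization:

```python
import io
import pickle
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def benchmark_fit(model, X, y):
    """Return (fitted model, training seconds, pickled size in MB)."""
    start = time.perf_counter()
    model.fit(X, y)
    elapsed = time.perf_counter() - start
    buf = io.BytesIO()
    pickle.dump(model, buf)  # model size ~ serialized footprint
    return model, elapsed, buf.getbuffer().nbytes / 1e6

# Small imbalanced toy problem (~99% negative class)
X, y = make_classification(n_samples=2_000, weights=[0.99], random_state=42)
model, secs, size_mb = benchmark_fit(
    RandomForestClassifier(n_estimators=10, random_state=42), X, y
)
```

Peak memory is trickier to capture portably; the stdlib `tracemalloc` module or an external sampler are common choices.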
| Property | Value |
|---|---|
| Total Transactions | 284,807 |
| Fraudulent | 492 (0.17%) |
| Normal | 284,315 (99.83%) |
| Features | 30 (V1-V28 PCA, Time, Amount) |
- StandardScaler applied to `Amount` and `Time` only
- V1-V28 already normalized via PCA transformation
- Stratified split (80/20) to maintain class distribution
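The preprocessing steps above can be sketched as follows; a synthetic DataFrame stands in for `creditcard.csv` here (the real file adds the V1-V28 PCA columns):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for creditcard.csv (real data adds V1-V28)
rng = np.random.default_rng(42)
n = 10_000
df = pd.DataFrame({
    "Time": rng.uniform(0, 172_000, n),
    "Amount": rng.exponential(88.0, n),
    "Class": (rng.random(n) < 0.0017).astype(int),  # ~0.17% fraud
})

X, y = df.drop(columns=["Class"]), df["Class"]

# Scale only Amount and Time; V1-V28 come pre-normalized from PCA
X[["Amount", "Time"]] = StandardScaler().fit_transform(X[["Amount", "Time"]])

# Stratified 80/20 split keeps the fraud rate equal across train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```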
Random Forest:

```python
RandomForestClassifier(
    n_estimators=100,
    n_jobs=-1,                # All CPU cores
    class_weight='balanced',  # Auto-balance classes
    random_state=42
)
```

XGBoost:

```python
XGBClassifier(
    n_estimators=100,
    scale_pos_weight=577,     # n_negative / n_positive
    device='cuda',            # GPU if available
    eval_metric='aucpr',
    random_state=42
)
```

| Metric | Why It Matters for Fraud Detection |
|---|---|
| AUPRC | Most informative for imbalanced data; focuses on minority class |
| Recall | Catching fraud is critical—we want to minimize false negatives |
| Precision | High precision reduces false positives (annoying legitimate users) |
| F1-Score | Harmonic mean balancing precision and recall |
⚠️ Accuracy is misleading for this dataset! A model predicting "Normal" for everything would achieve 99.83% accuracy but catch zero fraud.
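All four metrics come straight from scikit-learn; a small sketch with toy labels and scores (note AUPRC uses the raw scores, while precision/recall/F1 need a threshold):

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_fscore_support

# Toy labels (1 = fraud) and model scores
y_true  = np.array([0, 0, 0, 0, 1, 0, 1, 0, 0, 1])
y_score = np.array([0.10, 0.20, 0.05, 0.30, 0.90, 0.15, 0.25, 0.20, 0.10, 0.80])
y_pred  = (y_score >= 0.5).astype(int)  # threshold at 0.5

auprc = average_precision_score(y_true, y_score)  # threshold-free, PR-based
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary", zero_division=0
)
```

Here the 0.5 threshold misses the fraud scored 0.25, so recall drops even though precision is perfect — exactly the trade-off the table describes.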
After running the benchmark, check the outputs/ folder for:
| File | Description |
|---|---|
| `roc_curves.png` | ROC curves with AUC scores |
| `pr_curves.png` | Precision-Recall curves with AUPRC |
| `time_comparison.png` | Training and inference time barplots |
| `memory_comparison.png` | Peak memory and model size barplots |
| `confusion_matrices.png` | Side-by-side confusion matrices |
| `metrics_comparison.png` | Grouped barplot of all metrics |
- Python 3.10+
- pandas >= 2.0.0
- numpy >= 1.24.0
- scikit-learn >= 1.3.0
- xgboost >= 2.0.0
- matplotlib >= 3.7.0
- seaborn >= 0.12.0
- joblib >= 1.3.0
This project is open source and available under the MIT License.
Contributions are welcome! Feel free to open issues or submit pull requests.