Predicting Customer Churn in Online Retail

Overview

This project builds a machine-learning pipeline to predict customer churn in an online retail setting, using the Online Retail II dataset from the UCI Machine Learning Repository.

The pipeline covers data cleaning, exploratory data analysis, RFM-based feature engineering with a leakage-free churn label, K-Means customer segmentation, model training and hyperparameter tuning (Logistic Regression, Random Forest, XGBoost), and SHAP-based explainability.
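To make "leakage-free churn label" concrete, here is a minimal sketch, not the notebook's actual code. It assumes the column names of the Online Retail II spreadsheet (`Customer ID`, `Invoice`, `InvoiceDate`, `Quantity`, `Price`). The function name, the cutoff date, and the 90-day horizon are hypothetical choices made for illustration. The idea is that RFM features are computed only from transactions before a cutoff date, while the churn label is derived only from the window after it, so no future information reaches the features.

```python
import pandas as pd


def build_rfm_with_churn_label(tx, cutoff, horizon_days=90):
    """Return per-customer RFM features plus a churn label.

    Hypothetical helper: features use only rows before `cutoff`,
    the label uses only rows in [cutoff, cutoff + horizon_days).
    """
    cutoff = pd.Timestamp(cutoff)
    horizon_end = cutoff + pd.Timedelta(days=horizon_days)

    # Feature window: strictly before the cutoff.
    history = tx[tx["InvoiceDate"] < cutoff]
    # Label window: from the cutoff up to the end of the horizon.
    future = tx[(tx["InvoiceDate"] >= cutoff) & (tx["InvoiceDate"] < horizon_end)]

    history = history.assign(Revenue=history["Quantity"] * history["Price"])
    rfm = history.groupby("Customer ID").agg(
        recency_days=("InvoiceDate", lambda d: (cutoff - d.max()).days),
        frequency=("Invoice", "nunique"),
        monetary=("Revenue", "sum"),
    )

    # A customer is labelled churned if they made no purchase in the label window.
    rfm["churned"] = (~rfm.index.isin(future["Customer ID"])).astype(int)
    return rfm


# Tiny made-up transaction log, for illustration only.
tx = pd.DataFrame({
    "Customer ID": [1, 1, 2, 2, 3],
    "Invoice": ["A1", "A2", "B1", "B2", "C1"],
    "InvoiceDate": pd.to_datetime(
        ["2011-01-10", "2011-05-20", "2011-03-01", "2011-07-15", "2011-08-01"]
    ),
    "Quantity": [2, 1, 5, 1, 3],
    "Price": [10.0, 20.0, 4.0, 8.0, 5.0],
})

rfm = build_rfm_with_churn_label(tx, cutoff="2011-06-01", horizon_days=90)
print(rfm)
```

In this toy data, customer 3 first appears after the cutoff and therefore has no feature row. Customer 1 has no purchase in the label window and is marked as churned, while customer 2 is not.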

Repository structure

.
├── churn_analysis.ipynb          # Full analysis notebook
├── data/
│   ├── raw/                      # Raw Excel dataset
│   └── processed/                # Cleaned dataframes (pickle)
├── models/
│   └── artifacts/                # Trained models, scalers (joblib)
├── reports/
│   ├── figures/                  # Generated plots (PNG)
│   └── tables/                   # Summary statistics (CSV)
├── Predicting Customer Churn...  # Final report (PDF + DOCX)
├── requirements.txt              # Python dependencies
└── README.md

How to run

1. Install dependencies:
```
pip install -r requirements.txt
```
2. Check that online_retail_II.xlsx is in data/raw/ (it is already included in the repository).
3. Open churn_analysis.ipynb and run it from top to bottom.

All figures, tables, and model artifacts are saved automatically.

Key results

| Model | Test AUC |
|---|---|
| Logistic Regression | 0.810 |
| Random Forest | 0.818 |
| XGBoost | 0.822 |

XGBoost, which had the highest test AUC (0.822), was selected as the final model. SHAP feature importance shows that recency, frequency, and recent purchase momentum are the strongest churn predictors.
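For readers unfamiliar with the metric, the sketch below shows what a test AUC measures: the probability that a randomly chosen churner receives a higher predicted score than a randomly chosen non-churner, with ties counted as half. It is an illustrative pure-Python version with made-up labels and scores, not the notebook's evaluation code. The function name `roc_auc` is hypothetical, and the pairwise loop is quadratic, so it is only suitable for small examples.

```python
def roc_auc(y_true, scores):
    """Pairwise (Mann-Whitney) estimate of ROC AUC for binary labels 0/1."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Count the positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = churned) and predicted churn probabilities.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.1]

auc = roc_auc(y_true, scores)
print(f"AUC = {auc:.3f}")
```

Here 8 of the 9 churner/non-churner pairs are ordered correctly. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.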

Requirements

Python 3.10+
See requirements.txt for package versions
