A full-stack application for performing Customer Segmentation on retail transaction data. This project uses RFM Analysis (Recency, Frequency, Monetary) and K-Means Clustering to group customers into actionable segments (e.g., "Whales", "Loyal Customers", "At Risk").
- Data Pipeline: Automated ETL pipeline to load, clean, and process transaction data.
- RFM Analysis: Calculates Recency, Frequency, and Monetary values for each customer.
- Auto-K Detection: Automatically determines the optimal number of clusters using the Elbow Method.
- Interactive Dashboard: React frontend for visualizing clusters and exploring customer segments.
- API: FastAPI backend serving segmentation results and analytics.
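The RFM step listed above can be sketched with pandas. This is a minimal illustration that assumes the UCI column names (InvoiceNo, InvoiceDate, Quantity, UnitPrice, CustomerID). The function name `compute_rfm` and the snapshot date (one day after the last invoice) are assumptions for the example, and the project's own pipeline may compute these values differently.

```python
import pandas as pd


def compute_rfm(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per CustomerID with Recency, Frequency, and Monetary."""
    # Rows without a customer cannot be attributed to anyone, so drop them.
    df = df.dropna(subset=["CustomerID"]).copy()
    df["InvoiceDate"] = pd.to_datetime(df["InvoiceDate"])
    df["Amount"] = df["Quantity"] * df["UnitPrice"]

    # Assumption: recency is counted in days from the day after the last invoice.
    snapshot = df["InvoiceDate"].max() + pd.Timedelta(days=1)

    return df.groupby("CustomerID").agg(
        Recency=("InvoiceDate", lambda s: (snapshot - s.max()).days),
        Frequency=("InvoiceNo", "nunique"),   # number of distinct invoices
        Monetary=("Amount", "sum"),           # total spend
    )


# Tiny made-up sample, for illustration only.
sample = pd.DataFrame(
    {
        "CustomerID": [1, 1, 2],
        "InvoiceNo": ["A1", "A2", "B1"],
        "InvoiceDate": ["2011-01-01", "2011-01-10", "2011-01-05"],
        "Quantity": [2, 1, 3],
        "UnitPrice": [5.0, 10.0, 2.0],
    }
)
print(compute_rfm(sample))
```

Lower Recency means a more recent purchase, so a customer with low Recency, high Frequency, and high Monetary would be a candidate for a "Whales" or "Loyal Customers" style segment.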
- Python (3.10+)
- FastAPI: REST API framework.
- Pandas / NumPy: Data manipulation.
- Scikit-Learn: Machine learning (K-Means, Scalers).
- Uvicorn: ASGI server.
- React (Vite)
- TypeScript
- Tailwind CSS: Styling.
- Recharts: Data visualization.
- Lucide React: Icons.
The project is built to work with the UCI Online Retail Dataset.
- Location: online+retail Dataset/Online Retail.xlsx
- Size: ~23 MB
- Columns: InvoiceNo, StockCode, Description, Quantity, InvoiceDate, UnitPrice, CustomerID, Country
Note: The dataset file is included in the repo. If you update it, ensure it maintains the same schema.
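If you do replace the file, a quick column check before running the pipeline can catch schema drift early. The sketch below is a hypothetical helper (`missing_columns` is not part of the repo) that compares a file's column names against the list above.

```python
from typing import Iterable

# Column names the pipeline is documented to expect.
EXPECTED_COLUMNS = [
    "InvoiceNo",
    "StockCode",
    "Description",
    "Quantity",
    "InvoiceDate",
    "UnitPrice",
    "CustomerID",
    "Country",
]


def missing_columns(columns: Iterable[str]) -> list[str]:
    """Return the expected columns that are absent from `columns`."""
    present = set(columns)
    return [name for name in EXPECTED_COLUMNS if name not in present]


# Example use with pandas (reading .xlsx also requires openpyxl):
#   import pandas as pd
#   df = pd.read_excel("online+retail Dataset/Online Retail.xlsx")
#   problems = missing_columns(df.columns)
#   if problems:
#       raise ValueError(f"Dataset is missing columns: {problems}")
```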
Navigate to the backend directory, create a virtual environment, and install dependencies.
cd backend
# Create virtual environment (optional but recommended)
python -m venv venv
# Windows
venv\Scripts\activate
# Mac/Linux
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
Navigate to the frontend directory and install Node.js dependencies.
cd frontend
npm install
Before starting the app, you must process the raw data to generate clusters.
cd backend
python run_pipeline.py
- This will clean data, compute RFM, find optimal K, and save results to backend/data/.
- Optionally, specify the number of clusters manually:
python run_pipeline.py --k 5
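For context on what automatic K selection can involve: the Elbow Method fits K-Means for a range of K and looks for the point where inertia stops dropping sharply. The sketch below picks the K whose point lies farthest from the straight line joining the first and last inertia values. This particular heuristic, the function names, and the scaling step are assumptions for illustration; the project's own detection logic may differ.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def inertias_for(X, k_values, seed=42):
    """Fit K-Means on standardized features for each K and collect inertia."""
    X_scaled = StandardScaler().fit_transform(X)
    return [
        KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X_scaled).inertia_
        for k in k_values
    ]


def pick_elbow(k_values, inertias):
    """Return the K farthest from the line joining the curve's endpoints.

    Assumes inertia decreases overall, so the first and last values differ.
    """
    ks = np.asarray(k_values, dtype=float)
    ys = np.asarray(inertias, dtype=float)
    # Normalize both axes to [0, 1] so neither dominates the distance.
    ks_n = (ks - ks[0]) / (ks[-1] - ks[0])
    ys_n = (ys - ys[-1]) / (ys[0] - ys[-1])
    # The endpoints are now (0, 1) and (1, 0), i.e. the line x + y - 1 = 0.
    # Distance to that line is proportional to |x + y - 1|.
    distance = np.abs(ks_n + ys_n - 1.0)
    return int(k_values[int(np.argmax(distance))])


# Made-up inertia values with a visible bend at K = 3.
example_ks = [1, 2, 3, 4, 5, 6]
example_inertias = [1000.0, 400.0, 100.0, 90.0, 85.0, 80.0]
print(pick_elbow(example_ks, example_inertias))  # prints 3
```

Passing `--k` skips this kind of detection and uses the given number of clusters directly, which is useful when the inertia curve has no clear bend.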
cd backend
python main.py
- Server runs at: http://localhost:8000
- API Docs: http://localhost:8000/docs
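FastAPI serves an OpenAPI schema at /openapi.json by default, so one quick way to see which endpoints the backend exposes is to read that schema. The sketch below assumes the server is running at the address above and that the default schema URL has not been changed; `list_routes` and `fetch_spec` are hypothetical helpers, not part of the repo.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_routes(spec: dict) -> list[str]:
    """Return sorted 'METHOD /path' strings from an OpenAPI schema dict."""
    routes = []
    for path, operations in spec.get("paths", {}).items():
        for method in operations:
            # Path items can also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                routes.append(f"{method.upper()} {path}")
    return sorted(routes)


def fetch_spec(base_url: str = "http://localhost:8000") -> dict:
    """Download the OpenAPI schema from a running FastAPI server."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as resp:
        return json.load(resp)


# With the backend running:
#   for route in list_routes(fetch_spec()):
#       print(route)
```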
Open a new terminal:
cd frontend
npm run dev
- App runs at: http://localhost:5173 (or port shown in terminal).
├── backend/
│ ├── data/ # Generated JSON files (clusters, personas)
│ ├── routes/ # FastAPI routes
│ ├── src/ # Core logic (clustering, data_loader, etc.)
│ ├── main.py # API Entry point
│ ├── run_pipeline.py # ETL Script
│ └── requirements.txt
├── frontend/
│ ├── src/ # React components and pages
│ ├── vite.config.ts # Vite config (proxy setup)
│ └── package.json
├── online+retail Dataset/ # Raw Data
│ └── Online Retail.xlsx
└── README.md