Bbn08/Retail-Customer-Segmentation-Sample-

Retail Customer Segmentor

A full-stack application for performing Customer Segmentation on retail transaction data. This project uses RFM Analysis (Recency, Frequency, Monetary) and K-Means Clustering to group customers into actionable segments (e.g., "Whales", "Loyal Customers", "At Risk").

🚀 Features

  • Data Pipeline: Automated ETL pipeline to load, clean, and process transaction data.
  • RFM Analysis: Calculates Recency, Frequency, and Monetary values for each customer.
  • Auto-K Detection: Automatically determines the optimal number of clusters using the Elbow Method.
  • Interactive Dashboard: Value-driven React frontend to visualize clusters and explore customer segments.
  • API: FastAPI backend serving segmentation results and analytics.
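As a rough illustration of the RFM step, the sketch below computes the three values with pandas, using the UCI Online Retail column names. The function name `compute_rfm`, the `Amount` helper column, and the snapshot date (one day after the latest invoice) are assumptions made for this example and may differ from what the project's pipeline does.

```python
import pandas as pd


def compute_rfm(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per customer with Recency (days), Frequency, Monetary.

    Hypothetical helper for illustration; not taken from the project's code.
    """
    # Rows without a customer ID cannot be attributed to anyone.
    df = df.dropna(subset=["CustomerID"]).copy()
    df["Amount"] = df["Quantity"] * df["UnitPrice"]

    # Assumption: measure recency against the day after the last invoice.
    snapshot = df["InvoiceDate"].max() + pd.Timedelta(days=1)

    return df.groupby("CustomerID").agg(
        Recency=("InvoiceDate", lambda s: (snapshot - s.max()).days),
        Frequency=("InvoiceNo", "nunique"),  # distinct invoices
        Monetary=("Amount", "sum"),          # total spend
    )


# Tiny made-up sample, only to show the shape of the input.
sample = pd.DataFrame({
    "InvoiceNo": ["A1", "A1", "A2", "B1"],
    "CustomerID": [1, 1, 1, 2],
    "InvoiceDate": pd.to_datetime(
        ["2011-01-01", "2011-01-01", "2011-01-10", "2011-01-05"]
    ),
    "Quantity": [2, 1, 3, 5],
    "UnitPrice": [10.0, 5.0, 2.0, 1.0],
})
rfm = compute_rfm(sample)
print(rfm)
```

In this sample, customer 1 has two distinct invoices and customer 2 has one, so their Frequency values are 2 and 1.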

🛠 Tech Stack

Backend

  • Python (3.10+)
  • FastAPI: REST API framework.
  • Pandas / NumPy: Data manipulation.
  • Scikit-Learn: Machine learning (K-Means, Scalers).
  • Uvicorn: ASGI server.

Frontend

  • React (Vite)
  • TypeScript
  • Tailwind CSS: Styling.
  • Recharts: Data visualization.
  • Lucide React: Icons.

📂 Dataset

The project is built to work with the UCI Online Retail Dataset.

  • Location: online+retail Dataset/Online Retail.xlsx
  • Size: ~23 MB
  • Columns: InvoiceNo, StockCode, Description, Quantity, InvoiceDate, UnitPrice, CustomerID, Country.

Note: The dataset file is included in the repo. If you update it, ensure it maintains the same schema.
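If you do replace the dataset, a quick schema check along these lines can catch a missing column before the pipeline runs. This is a minimal sketch: `EXPECTED_COLUMNS` simply mirrors the column list above, and `missing_columns` is a hypothetical helper, not part of the project.

```python
import pandas as pd

# Column names copied from the dataset description above.
EXPECTED_COLUMNS = [
    "InvoiceNo", "StockCode", "Description", "Quantity",
    "InvoiceDate", "UnitPrice", "CustomerID", "Country",
]


def missing_columns(df: pd.DataFrame) -> list[str]:
    """Return the expected columns that are absent from df (hypothetical helper)."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


# Usage against the real file (assumes an Excel engine such as openpyxl is installed):
#   df = pd.read_excel("online+retail Dataset/Online Retail.xlsx")
#   print(missing_columns(df))

# Made-up frame that lacks the Country column.
partial = pd.DataFrame(columns=EXPECTED_COLUMNS[:-1])
print(missing_columns(partial))
```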


🔧 Setup & Installation

1. Backend Setup

Navigate to the backend directory, create a virtual environment, and install dependencies.

cd backend

# Create virtual environment (optional but recommended)
python -m venv venv
# Windows
venv\Scripts\activate
# Mac/Linux
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

2. Frontend Setup

Navigate to the frontend directory and install Node.js dependencies.

cd frontend
npm install

🏃‍♂️ Usage

1. Run the Data Pipeline

Before starting the app, you must process the raw data to generate clusters.

cd backend
python run_pipeline.py
  • This cleans the data, computes RFM values, finds the optimal K, and saves the results to backend/data/.
  • Optionally, specify the number of clusters manually:
    python run_pipeline.py --k 5
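
For intuition about the "find optimal K" step, here is a minimal sketch of one common way to automate the Elbow Method with scikit-learn: fit K-Means for a range of k, then pick the k whose inertia point lies farthest from the straight line joining the first and last points. The function `elbow_k`, the k range, and this particular elbow heuristic are assumptions for illustration; the project's own detection logic may work differently.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def elbow_k(X, k_min=1, k_max=6, seed=0):
    """Pick k at the 'elbow' of the inertia curve (hypothetical helper)."""
    # Scale features so no single one (e.g. Monetary) dominates the distances.
    X_scaled = StandardScaler().fit_transform(X)

    ks = np.arange(k_min, k_max + 1)
    inertias = np.array([
        KMeans(n_clusters=int(k), n_init=10, random_state=seed)
        .fit(X_scaled)
        .inertia_
        for k in ks
    ])

    # Normalise both axes to [0, 1] so the distance is scale-free.
    x = (ks - ks[0]) / (ks[-1] - ks[0])
    y = (inertias - inertias.min()) / (inertias.max() - inertias.min())

    # Unnormalised distance of each point from the line through the first
    # and last points; the constant denominator does not affect argmax.
    dist = np.abs((y[-1] - y[0]) * x - y + y[0])
    return int(ks[np.argmax(dist)])


# Synthetic data: three tight, well-separated blobs (not real customer data).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.3, size=(50, 2)) for c in centers])

best_k = elbow_k(X)
print(best_k)
```

On real RFM data the curve is usually much less sharp than in this synthetic case, which is one reason a manual override such as `--k` is useful.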

2. Start the Backend API

cd backend
python main.py
  • Server runs at: http://localhost:8000
  • API Docs: http://localhost:8000/docs
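
To see which endpoints the backend exposes without opening the browser, you can read the OpenAPI schema that FastAPI serves at `/openapi.json` by default (the `/docs` page is built from it). The sketch below assumes the default schema URL has not been changed; `list_routes` and `fetch_schema` are hypothetical helpers, and the `/example` paths in the demo are placeholders, not the project's real routes.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_routes(schema: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    routes = []
    for path, item in schema.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key in HTTP_METHODS:
                routes.append((key.upper(), path))
    return sorted(routes)


def fetch_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the schema from a running backend (assumes the default URL)."""
    with urlopen(f"{base_url}/openapi.json") as resp:
        return json.load(resp)


# Placeholder schema so the example runs without a server.
demo_schema = {
    "paths": {
        "/example": {"get": {}, "post": {}, "parameters": []},
        "/example/{item_id}": {"get": {}},
    }
}
for method, path in list_routes(demo_schema):
    print(method, path)

# With the backend running:
#   for method, path in list_routes(fetch_schema()):
#       print(method, path)
```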

3. Start the Frontend

Open a new terminal:

cd frontend
npm run dev
  • App runs at: http://localhost:5173 (or port shown in terminal).

📊 Project Structure

├── backend/
│   ├── data/                 # Generated JSON files (clusters, personas)
│   ├── routes/               # FastAPI routes
│   ├── src/                  # Core logic (clustering, data_loader, etc.)
│   ├── main.py               # API Entry point
│   ├── run_pipeline.py       # ETL Script
│   └── requirements.txt
├── frontend/
│   ├── src/                  # React components and pages
│   ├── vite.config.ts        # Vite config (proxy setup)
│   └── package.json
├── online+retail Dataset/    # Raw Data
│   └── Online Retail.xlsx
└── README.md

About

This is an on-the-job training (OJT) project. It aims to group customer data into segments for practical use cases.
