Dimensionality reduction and multi-class classification on the UCI Wine dataset using Linear Discriminant Analysis.
Reduces 13 chemical features of wine samples down to 2 linear discriminants using LDA, then trains a classifier to predict customer segments — with decision-boundary visualizations for both training and test sets.
- Load the Wine dataset (178 samples, 13 features, 3 classes)
- 80/20 train/test split + standard feature scaling
- Apply LDA → project onto 2 discriminant components
- Train a classifier (Logistic Regression in Python, SVM in R)
- Evaluate with a confusion matrix + accuracy score
- Plot 2D decision boundaries for train and test sets
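
The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the contents of `lda.py`: it assumes scikit-learn's bundled copy of the UCI Wine data (`load_wine`) as a stand-in for `Wine.csv`, and `random_state=0` is an arbitrary choice made only for reproducibility. With 3 classes, LDA can produce at most 2 discriminants, which is why the projection targets 2 components.

```python
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Wine data stands in for Wine.csv.
X, y = load_wine(return_X_y=True)

# 80/20 train/test split (random_state is an arbitrary, illustrative choice).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the scaler on the training set only, then reuse it on the test set.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# LDA is supervised: fitting needs the class labels as well as the features.
lda = LinearDiscriminantAnalysis(n_components=2)
X_train_lda = lda.fit_transform(X_train, y_train)
X_test_lda = lda.transform(X_test)

# Train the classifier on the 2 discriminant components.
clf = LogisticRegression(random_state=0)
clf.fit(X_train_lda, y_train)

# Evaluate on the held-out test set.
y_pred = clf.predict(X_test_lda)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
acc = accuracy_score(y_test, y_pred)
print(cm)
print(f"accuracy: {acc:.3f}")
```

Fitting the scaler and LDA on the training split alone keeps information from the test set out of the model, so the reported accuracy reflects unseen data.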
Wine.csv — 178 samples across 3 customer segments, with 13 chemical analysis features (Alcohol, Malic Acid, Ash, Magnesium, Phenols, Flavanoids, etc.). Based on the UCI Wine dataset.
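
The figures quoted above can be checked with a few lines of Python. This is a sketch under an assumption: it inspects scikit-learn's bundled copy of the UCI Wine data rather than `Wine.csv` itself, whose column names are not stated here.

```python
import numpy as np
from sklearn.datasets import load_wine

# Assumption: the bundled UCI Wine data matches the samples in Wine.csv.
wine = load_wine()
X, y = wine.data, wine.target

print(X.shape)                 # (samples, features)
print(wine.feature_names[:3])  # first few chemical features
class_counts = np.bincount(y)  # number of samples per class
print(class_counts)
```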
- Python: `pip install numpy matplotlib pandas scikit-learn`, then `python lda.py`
- R: `install.packages(c("caTools", "MASS", "e1071", "ElemStatLearn"))`, then `source("lda.R")`

| | Tool | Purpose |
|---|---|---|
| 🐍 | scikit-learn | LDA, Logistic Regression, StandardScaler |
| 📊 | matplotlib | Decision-boundary visualization |
| 🧮 | pandas / numpy | Data loading and manipulation |
| 📈 | MASS (R) | LDA implementation |
| 🤖 | e1071 (R) | SVM classifier |
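
The decision-boundary visualization done with matplotlib might look like the sketch below. It is illustrative rather than the plotting code in `lda.py`: the helper `plot_decision_boundary`, the synthetic `X_demo`/`y_demo` data, and the output file name are all hypothetical, and the demo data merely stands in for the 2D LDA projection.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive windows
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LogisticRegression


def plot_decision_boundary(clf, X, y, title, path):
    """Shade the predicted class over a grid and overlay the 2D samples."""
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(
        np.arange(x_min, x_max, 0.02), np.arange(y_min, y_max, 0.02)
    )
    # Predict a class for every grid point, then restore the grid shape.
    zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    fig, ax = plt.subplots()
    ax.contourf(xx, yy, zz, alpha=0.3)
    ax.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
    ax.set_xlabel("LD1")
    ax.set_ylabel("LD2")
    ax.set_title(title)
    fig.savefig(path)
    plt.close(fig)
    return zz


# Hypothetical 2D data standing in for the LDA-projected training set.
rng = np.random.default_rng(0)
centers = np.array([[-4.0, 0.0], [0.0, 3.0], [4.0, 0.0]])
X_demo = np.vstack([rng.normal(c, 0.5, size=(30, 2)) for c in centers])
y_demo = np.repeat([0, 1, 2], 30)

clf_demo = LogisticRegression().fit(X_demo, y_demo)
grid_labels = plot_decision_boundary(
    clf_demo, X_demo, y_demo, "Logistic Regression (demo)", "boundary_demo.png"
)
```

Calling the same helper once with the training projection and once with the test projection would give the two plots described earlier.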
- R: `ElemStatLearn` has been removed from CRAN. The package, used for visualization in `lda.R`, is archived; install it from the CRAN archive or use an alternative plotting approach.
MIT — see LICENSE
Kaustabh Ganguly (@stabgan)