Machine Learning Classifier - Iris Dataset

A comprehensive MATLAB implementation comparing multiple machine learning classification algorithms on the famous Iris flower dataset.

🎯 Project Overview

This project implements and compares five different machine learning classifiers to predict iris flower species based on petal and sepal measurements. The project achieves 100% accuracy using Support Vector Machine with Gaussian kernel.

🏆 Results Summary

| Algorithm | Accuracy | Training Time | Notes |
| --- | --- | --- | --- |
| SVM (Gaussian) | 100.00% | 92.2 ms | 🥇 Best overall |
| KNN (k=1) | 97.78% | 56.4 ms | Very fast, highly accurate |
| SVM (Linear) | 97.78% | 132.7 ms | Excellent generalization |
| KNN (k=5) | 95.56% | 22.6 ms | Good balance |
| Decision Tree | 95.56% | 21.0 ms | ⚡ Fastest, most interpretable |

📊 Key Features

Algorithms Implemented

  • K-Nearest Neighbors (KNN) - Multiple k values tested
  • Support Vector Machine (SVM) - Three kernel types (Linear, Gaussian, Polynomial)
  • Decision Tree - Multiple depth configurations tested

Evaluation Metrics

  • Accuracy
  • Precision, Recall, F1-Score (per class)
  • Confusion matrices
  • Training time comparison
  • Feature importance analysis

Visualizations

  • Data distribution scatter plots
  • Confusion matrices for all models
  • Performance comparison charts
  • Feature importance plots
  • Hyperparameter tuning results

🚀 Quick Start

Prerequisites

  • MATLAB R2020a or later
  • Statistics and Machine Learning Toolbox

Running the Project

% Navigate to project directory
cd MLClassifier/examples

% Run individual classifiers
first_classifier      % K-Nearest Neighbors
svm_classifier        % Support Vector Machine
tree_classifier       % Decision Tree

% Compare all algorithms
compare_classifiers   % Complete comparison

📁 Project Structure

MLClassifier/
├── src/
│   ├── algorithms/           # ML algorithm implementations
│   ├── preprocessing/        # Data preprocessing functions
│   ├── evaluation/          # Evaluation metrics
│   └── visualization/       # Plotting functions
├── data/
│   └── sample_datasets/     # Iris and other datasets
├── examples/
│   ├── first_classifier.m   # KNN implementation
│   ├── svm_classifier.m     # SVM implementation
│   ├── tree_classifier.m    # Decision Tree implementation
│   └── compare_classifiers.m # Complete comparison
├── results/
│   └── plots/               # Generated visualizations
├── docs/                    # Documentation
└── README.md

🔬 Detailed Results

Dataset Information

  • Samples: 150 iris flowers
  • Features: 4 (Sepal Length, Sepal Width, Petal Length, Petal Width)
  • Classes: 3 (Setosa, Versicolor, Virginica)
  • Split: 70% training (105 samples), 30% testing (45 samples)
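A 70/30 split like the one above can be reproduced with `cvpartition`, which stratifies by class when given the label vector. This is a sketch, not necessarily the exact split used in the results (the random seed here is an assumption):

```matlab
% Stratified 70/30 hold-out split of the built-in Iris data.
load fisheriris                      % meas (150x4), species (150x1 cell)
rng(1);                              % fixed seed for reproducibility (assumed)
cv = cvpartition(species, 'HoldOut', 0.3);
XTrain = meas(training(cv), :);  yTrain = species(training(cv));
XTest  = meas(test(cv), :);      yTest  = species(test(cv));
```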

Best Model Performance (SVM Gaussian)

Per-Class Metrics:

| Class | Precision | Recall | F1-Score |
| --- | --- | --- | --- |
| Setosa | 1.0000 | 1.0000 | 1.0000 |
| Versicolor | 1.0000 | 1.0000 | 1.0000 |
| Virginica | 1.0000 | 1.0000 | 1.0000 |
| Average | 1.0000 | 1.0000 | 1.0000 |

Perfect classification on all 45 test samples!

Feature Importance (Decision Tree Analysis)

| Feature | Importance Score |
| --- | --- |
| Petal Length | 0.1189 |
| Petal Width | 0.0991 |
| Sepal Length | 0.0000 |
| Sepal Width | 0.0000 |

Key Insight: Petal measurements are sufficient for classification; sepal measurements don't contribute to the decision tree model.
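Importance scores like these can be read off a fitted tree with `predictorImportance`. A minimal sketch (predictor names are spelled here as an assumption):

```matlab
% Feature importance from a fitted decision tree.
load fisheriris
tree = fitctree(meas, species, 'PredictorNames', ...
    {'SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth'});
imp = predictorImportance(tree);     % one score per predictor
bar(imp);
xticklabels(tree.PredictorNames);
```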

🎓 What I Learned

Technical Skills

  • Implementation of multiple ML algorithms from scratch
  • Hyperparameter tuning and optimization
  • Model evaluation and comparison methodologies
  • Data visualization best practices
  • Professional code documentation

Key Insights

  • SVM with Gaussian kernel achieves perfect separation for this dataset
  • Decision trees are fastest but may slightly sacrifice accuracy
  • Feature importance analysis reveals that not all features contribute equally
  • Different algorithms have different strengths (speed vs. accuracy vs. interpretability)

🛠️ Technical Details

K-Nearest Neighbors

  • Tested k values: 1, 3, 5, 7, 9, 11, 15, 20
  • Best k: 1 (97.78% accuracy)
  • Trade-off: Lower k = higher variance, higher k = higher bias
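The k sweep can be sketched as a loop over `fitcknn` models, assuming `XTrain`/`yTrain`/`XTest`/`yTest` come from the 70/30 split described above:

```matlab
% Sweep candidate k values and record test accuracy for each.
ks  = [1 3 5 7 9 11 15 20];
acc = zeros(size(ks));
for i = 1:numel(ks)
    mdl    = fitcknn(XTrain, yTrain, 'NumNeighbors', ks(i));
    acc(i) = mean(strcmp(predict(mdl, XTest), yTest));
end
[bestAcc, idx] = max(acc);
bestK = ks(idx);
```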

Support Vector Machine

  • Kernels tested: Linear, Gaussian (RBF), Polynomial (degree 3)
  • Best kernel: Gaussian (100% accuracy)
  • Linear kernel also performed excellently (97.78%)
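Because `fitcsvm` is binary-only, a three-class SVM in MATLAB is typically built through `fitcecoc` with an SVM learner template. A hedged sketch of the Gaussian-kernel variant, again assuming the split variables from above:

```matlab
% Multiclass SVM via error-correcting output codes (ECOC).
t   = templateSVM('KernelFunction', 'gaussian', 'Standardize', true);
mdl = fitcecoc(XTrain, yTrain, 'Learners', t);
acc = mean(strcmp(predict(mdl, XTest), yTest));
```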

Decision Tree

  • Max depths tested: 2, 3, 5, 10, 20
  • Best depth: 3 (95.56% accuracy)
  • Deeper trees didn't improve test accuracy; the extra depth only risks overfitting
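`fitctree` has no direct max-depth option; depth is bounded indirectly through `'MaxNumSplits'` (a binary tree of depth d has at most 2^d − 1 splits). A sketch, assuming the split variables from above:

```matlab
% Limit tree depth indirectly via the maximum number of splits.
depth = 3;
mdl   = fitctree(XTrain, yTrain, 'MaxNumSplits', 2^depth - 1);
acc   = mean(strcmp(predict(mdl, XTest), yTest));
```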

💡 Future Enhancements

  • Add more algorithms (Random Forest, Naive Bayes, Neural Networks)
  • Implement cross-validation for more robust evaluation
  • Create GUI application for interactive model selection
  • Add support for custom dataset upload
  • Implement ensemble methods
  • Add hyperparameter grid search automation
  • Export trained models for deployment

📚 References

  • Fisher, R. A. (1936). "The use of multiple measurements in taxonomic problems"
  • MATLAB Documentation: Statistics and Machine Learning Toolbox
  • UCI Machine Learning Repository - Iris Dataset

👨‍💻 Author

Vignesh Pai B

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built with MATLAB R2025b
  • Dataset from UCI Machine Learning Repository
  • Inspired by the need to compare classical ML algorithms
  • Created as part of learning journey in machine learning

⭐ If you found this project helpful, please consider giving it a star!

Last Updated: January 2026
