An image classification model that identifies 5 types of flowers using transfer learning with MobileNet V2. Built as a capstone project for ML Zoomcamp 2025.
Flower identification is a common challenge for gardeners, botanists, and nature enthusiasts. This project builds an AI-powered classifier that can identify flowers from photographs, making it easier to:
- Identify unknown flowers while hiking or gardening
- Assist in botanical research and cataloging
- Power mobile apps for plant identification
The model classifies images into 5 flower categories:
- 🌼 Daisy
- 🌻 Sunflower
- 🌷 Tulip
- 🌹 Rose
- 🌾 Dandelion
Source: TensorFlow Flowers Dataset
- Total Images: 3,670 labeled photos
- Classes: 5 flower types
- Image Format: JPEG, various sizes
- Train/Val Split: 80/20
| Class | Total | Training | Validation |
|---|---|---|---|
| Daisy | 633 | 526 | 107 |
| Dandelion | 898 | 707 | 191 |
| Roses | 641 | 522 | 119 |
| Sunflowers | 699 | 564 | 135 |
| Tulips | 799 | 617 | 182 |
| Total | 3,670 | 2,936 | 734 |
The dataset is relatively balanced, with a class imbalance ratio of ~1.42 (dandelion has the most images, daisy the fewest).
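The balance claim above is easy to verify from the class counts; a quick sketch of the arithmetic (counts taken from the table, noting that the actual per-class splits differ slightly from an exact 80%):

```python
# Per-class image counts taken from the table above.
class_counts = {
    "daisy": 633,
    "dandelion": 898,
    "roses": 641,
    "sunflowers": 699,
    "tulips": 799,
}

total = sum(class_counts.values())
ratio = max(class_counts.values()) / min(class_counts.values())
print(f"total images: {total}")         # 3670
print(f"imbalance ratio: {ratio:.2f}")  # 1.42

# Roughly 80% of each class goes to training.
train_counts = {name: round(n * 0.8) for name, n in class_counts.items()}
```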
```
flower-classification-capstone/
├── README.md                    # Project documentation
├── notebooks/
│   └── eda_and_training.ipynb   # EDA + model experiments
├── src/
│   ├── download_data.py         # Dataset download script
│   ├── train.py                 # Model training script
│   └── predict.py               # Flask prediction service
├── models/                      # Saved model artifacts
│   ├── flower_classifier.keras
│   └── class_names.txt
├── data/                        # Dataset (downloaded separately)
├── docker/
│   └── Dockerfile               # Container definition
├── tests/
│   └── test_service.py          # API test script
├── requirements.txt             # Python dependencies
└── Pipfile                      # Pipenv dependencies
```
A basic convolutional neural network built from scratch:
- 3 Conv2D + MaxPooling blocks
- Dense layer with dropout
- Result: Severe overfitting (96% train, 67% validation)
Using a pre-trained MobileNet V2 (ImageNet weights) as a feature extractor:
- Frozen base model + custom classification head
- Data augmentation (flip, rotation, zoom)
- GlobalAveragePooling + Dropout + Dense(5)
- Result: 88% validation accuracy
Unfreezing the last 30 layers of MobileNet V2 for fine-tuning:
- Lower learning rate (1e-5)
- 5 additional epochs
- Result: 88.3% validation accuracy (best model)
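The two transfer-learning stages above can be sketched in Keras as follows. This is a minimal sketch, not the project's `train.py`: the dropout rate, softmax head, and the 1e-3 learning rate for the frozen stage are assumptions, and `weights=None` keeps the sketch runnable offline where the real training loads ImageNet weights.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(num_classes: int = 5, fine_tune_last: int = 0) -> keras.Model:
    """MobileNetV2 feature extractor + small classification head."""
    # weights=None keeps this sketch runnable offline; the project
    # loads ImageNet weights (weights="imagenet").
    base = keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, weights=None
    )
    if fine_tune_last:
        base.trainable = True
        for layer in base.layers[:-fine_tune_last]:
            layer.trainable = False  # freeze all but the last N layers
    else:
        base.trainable = False  # pure feature extraction

    # Data augmentation: flip, rotation, zoom (as described above).
    augment = keras.Sequential([
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
    ])

    inputs = keras.Input(shape=(224, 224, 3))
    x = augment(inputs)
    x = base(x, training=False)  # keep BatchNorm layers in inference mode
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)   # dropout rate is an assumption
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        # lower learning rate (1e-5) when fine-tuning, per the setup above
        optimizer=keras.optimizers.Adam(1e-5 if fine_tune_last else 1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

`build_model()` gives the frozen-base model; `build_model(fine_tune_last=30)` corresponds to the fine-tuning stage with the last 30 layers unfrozen.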
| Model | Train Accuracy | Val Accuracy | Notes |
|---|---|---|---|
| Baseline CNN | 96% | 67% | Overfitting |
| MobileNetV2 (frozen) | 90% | 88% | Transfer learning |
| MobileNetV2 (fine-tuned) | 90% | 88.3% | Selected model |
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Daisy | 0.83 | 0.95 | 0.89 | 107 |
| Dandelion | 0.95 | 0.91 | 0.93 | 191 |
| Roses | 0.77 | 0.92 | 0.84 | 119 |
| Sunflowers | 0.93 | 0.84 | 0.89 | 135 |
| Tulips | 0.90 | 0.82 | 0.86 | 182 |
| Overall | 0.88 | 0.88 | 0.88 | 734 |
- Dandelion has the highest precision (95%); its distinctive yellow color and shape produce few false positives
- Roses has the lowest precision (77%); similarly colored tulips are sometimes labeled as roses
- Daisy has the highest recall (95%); white petals around a yellow center are easy to identify
- Tulips has the lowest recall (82%); their varied colors cause some confusion with roses
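Per-class tables like the one above can be produced with scikit-learn's `classification_report`; a toy sketch (the labels here are illustrative, not the project's actual predictions):

```python
from sklearn.metrics import classification_report

classes = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]

# Toy ground-truth and predicted label indices, for illustration only.
y_true = [0, 0, 1, 1, 2, 3, 4, 4]
y_pred = [0, 1, 1, 1, 2, 3, 4, 2]

# output_dict=True returns per-class precision/recall/F1 plus support.
report = classification_report(
    y_true, y_pred, target_names=classes, output_dict=True, zero_division=0
)
print(f"daisy recall: {report['daisy']['recall']:.2f}")
```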
```
git clone https://github.com/HighviewOne/flower-classification-capstone.git
cd flower-classification-capstone
```

Option A: Using Conda (recommended for this project)

```
conda activate MLZoomCamp_env
pip install -r requirements.txt
```

Option B: Using Pipenv

```
pip install pipenv
pipenv install
pipenv shell
```

Option C: Using pip with venv

```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

Download the dataset:

```
python src/download_data.py
```

Or manually:

```
cd data
curl -O http://download.tensorflow.org/example_images/flower_photos.tgz
tar -xzf flower_photos.tgz
```

The trained model is already included. To retrain:
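The manual download steps above can be sketched in Python with only the standard library (the helper name is illustrative, not necessarily what `src/download_data.py` does):

```python
import tarfile
import urllib.request
from pathlib import Path

DATA_URL = "http://download.tensorflow.org/example_images/flower_photos.tgz"

def download_and_extract(url: str, dest: str = "data") -> Path:
    """Download a .tgz archive into dest/ and extract it there."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / Path(url).name
    if not archive.exists():  # skip re-downloading on repeat runs
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest_dir)
    return dest_dir
```

Calling `download_and_extract(DATA_URL)` reproduces the `curl` + `tar` steps, leaving the images under `data/flower_photos/`.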
```
python src/train.py
```

This will:
- Load and preprocess the dataset
- Train the MobileNetV2 transfer learning model
- Save the model to `models/flower_classifier.keras`

Run the prediction service:

```
python src/predict.py
```

The API will start at http://localhost:9696
Using an image URL:

```
curl -X POST http://localhost:9696/predict \
  -H "Content-Type: application/json" \
  -d '{"image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/4/40/Sunflower_sky_backdrop.jpg/800px-Sunflower_sky_backdrop.jpg"}'
```

Using a local file:

```
curl -X POST http://localhost:9696/predict \
  -F "image=@path/to/flower.jpg"
```

Expected response:
```json
{
  "prediction": "sunflowers",
  "confidence": 0.9823,
  "probabilities": {
    "daisy": 0.0012,
    "dandelion": 0.0034,
    "roses": 0.0045,
    "sunflowers": 0.9823,
    "tulips": 0.0086
  }
}
```

Run the API tests:

```
python tests/test_service.py
```

Build and run with Docker:

```
docker build -t flower-classifier -f docker/Dockerfile .
docker run -it -p 9696:9696 flower-classifier
```

Then send a test request:

```
curl -X POST http://localhost:9696/predict \
  -H "Content-Type: application/json" \
  -d '{"image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/4/40/Sunflower_sky_backdrop.jpg/800px-Sunflower_sky_backdrop.jpg"}'
```

Deploy the service to a local Kubernetes cluster using kind.
```
# Install kind (Kubernetes in Docker)
winget install Kubernetes.kind

# Install kubectl
winget install Kubernetes.kubectl
```

One-step deployment:

```
kubernetes\deploy-k8s.bat
```

Or step by step:

```
# 1. Create a kind cluster
kind create cluster --name flower-cluster

# 2. Load the Docker image into the cluster
kind load docker-image flower-classifier:latest --name flower-cluster

# 3. Apply Kubernetes manifests
kubectl apply -f kubernetes/deployment.yaml
kubectl apply -f kubernetes/service.yaml
kubectl apply -f kubernetes/hpa.yaml

# 4. Wait for deployment to be ready
kubectl rollout status deployment/flower-classifier

# 5. Check pod status
kubectl get pods -l app=flower-classifier
```

```
# Port forward to access the service
kubectl port-forward service/flower-classifier 9696:80
```

Then test:

```
curl http://localhost:9696/health
python tests/test_service.py
```

- Deployment: Manages pod lifecycle with rolling updates
- Service: LoadBalancer exposes the API on port 80
- HPA: Horizontal Pod Autoscaler scales from 1 to 3 replicas based on CPU usage

```
# Delete the cluster when done
kind delete cluster --name flower-cluster
```

| Endpoint | Method | Description |
|---|---|---|
| /predict | POST | Classify a flower image |
| /health | GET | Health check (returns 200 OK) |
The /predict endpoint accepts three input formats:

- JSON with image URL:

  ```json
  {"image_url": "https://example.com/flower.jpg"}
  ```

- JSON with base64-encoded image:

  ```json
  {"image_base64": "iVBORw0KGgo..."}
  ```

- Multipart form with file upload:

  ```
  curl -F "image=@flower.jpg" http://localhost:9696/predict
  ```
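A small client-side sketch matching the two JSON request shapes above, built with the standard library (the helper name is illustrative; send the returned dict with any HTTP client):

```python
import base64
from pathlib import Path
from typing import Optional

def build_predict_payload(image_url: Optional[str] = None,
                          image_path: Optional[str] = None) -> dict:
    """Build the JSON body for POST /predict (URL or base64 variant)."""
    if image_url is not None:
        return {"image_url": image_url}
    if image_path is not None:
        raw = Path(image_path).read_bytes()
        return {"image_base64": base64.b64encode(raw).decode("ascii")}
    raise ValueError("provide image_url or image_path")

# URL variant, matching the first format above.
payload = build_predict_payload(image_url="https://example.com/flower.jpg")
```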
- EDA and Training: Complete exploratory data analysis, model experimentation, and hyperparameter tuning
- Image sizes vary from 240×180 to 4000×3000 pixels (resized to 224×224 for training)
- Aspect ratios are mostly close to 1.0 (square-ish images)
- Data augmentation (random flip, rotation, zoom) helps prevent overfitting
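The resize step mentioned above can be sketched with Pillow. Note the scaling to [0, 1] here is illustrative only; the project's pipeline resizes inside TensorFlow, and MobileNetV2's own `preprocess_input` maps pixels to [-1, 1]:

```python
import numpy as np
from PIL import Image

def preprocess(path_or_img, size=(224, 224)) -> np.ndarray:
    """Resize an image to the model's input size and scale to [0, 1]."""
    img = path_or_img if isinstance(path_or_img, Image.Image) else Image.open(path_or_img)
    img = img.convert("RGB").resize(size)
    return np.asarray(img, dtype=np.float32) / 255.0
```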
- Python 3.11
- TensorFlow 2.20 / Keras - Deep learning framework
- MobileNet V2 - Pre-trained CNN for transfer learning
- Flask - Web service framework
- Docker - Containerization
- NumPy, Pandas - Data manipulation
- Matplotlib, Seaborn - Visualization
- scikit-learn - Metrics and evaluation
- Model trained on only 5 flower types
- Performance may vary with low-quality, blurry, or unusual angle images
- Flowers with similar colors (roses/tulips) can be confused
- Expand to more flower species (10-20 classes)
- Add confidence thresholding to reject uncertain predictions
- Implement model versioning and A/B testing
- Deploy to cloud (AWS/GCP) with auto-scaling
- Build a mobile app with camera integration
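The confidence-thresholding idea above could look like this: a hypothetical helper over the service's response format, with the 0.7 cutoff chosen arbitrarily for illustration:

```python
UNCERTAIN = "uncertain"

def apply_threshold(response: dict, threshold: float = 0.7) -> str:
    """Return the predicted class, or 'uncertain' below the threshold."""
    if response["confidence"] < threshold:
        return UNCERTAIN
    return response["prediction"]

# Example, using the response shape shown earlier.
resp = {"prediction": "sunflowers", "confidence": 0.9823}
print(apply_threshold(resp))  # sunflowers
```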
Michael - ML Zoomcamp 2025 Capstone Project
- GitHub: @HighviewOne
This project is licensed under the MIT License.
- DataTalks.Club for the ML Zoomcamp course
- TensorFlow team for the flowers dataset and MobileNet V2
- MobileNetV2 paper - Sandler et al., 2018