- Test Mean Absolute Error (MAE): ~0.28
- Best Validation MAE: ~0.24
- Dataset Used: Of ~5000 CSV entries, ~1064 had valid images available for training/testing (the rest were missing).
The training and validation loss (MSE) curves over epochs are saved in `training_plots.png`.
## Data Preprocessing
- Used the provided `focus_labels.csv`, which contained `image_path` and `focus_score`.
- Cleaned the dataset by dropping entries with missing image paths.
- Grouped images by their stack folder and performed GroupShuffleSplit to avoid data leakage between train, validation, and test sets.
- Applied image augmentations (random flip, brightness, contrast) to improve generalization.
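The group-aware split described above can be sketched as follows. This is a minimal illustration, not the project's `train.py`: the miniature DataFrame and the `stacks/stack_N/img_N.png` path layout are assumptions standing in for the real `focus_labels.csv`.

```python
import os
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical miniature label table standing in for focus_labels.csv.
df = pd.DataFrame({
    "image_path": [f"stacks/stack_{i // 3}/img_{i}.png" for i in range(12)],
    "focus_score": [0.1 * i for i in range(12)],
})

# Group key: the stack folder each image belongs to, so frames from the
# same stack never land in both train and test (no data leakage).
df["stack"] = df["image_path"].apply(os.path.dirname)

splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=42)
train_idx, test_idx = next(splitter.split(df, groups=df["stack"]))

# Every stack appears in exactly one split.
assert set(df.loc[train_idx, "stack"]).isdisjoint(df.loc[test_idx, "stack"])
```

The same splitter would be applied a second time on the training portion to carve out the validation set, again grouped by stack folder.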
## Model Architectures

Implemented two options:
- Simple CNN: 3 Conv layers + Dense layers for baseline.
- MobileNetV2 (Pretrained): Transfer learning with ImageNet weights, frozen base, followed by:
- Global Average Pooling
- Dropout (0.3)
- Dense (64, ReLU)
- Dense (1, regression output)
Final experiments used MobileNetV2 as it provided better performance.
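A minimal sketch of the MobileNetV2 head described above. The 224×224 input size is an assumption (MobileNetV2's default); the frozen base, pooling, dropout rate, and dense layers follow the list above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Pretrained base with ImageNet weights, classifier top removed.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # frozen base for transfer learning

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(64, activation="relu"),
    layers.Dense(1),  # linear output for focus-score regression
])
```

The single linear output unit (no activation) is what makes this a regression head rather than a classifier.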
## Training Setup
- Optimizer: Adam (learning rate 1e-4)
- Loss: Mean Squared Error (MSE)
- Metric: Mean Absolute Error (MAE)
- Batch size: 32
- Epochs: 25 (with early stopping)
- Callbacks used:
  - `ModelCheckpoint` (best model saved)
  - `EarlyStopping` (patience 7, restore best weights)
  - `ReduceLROnPlateau` (factor 0.5, patience 3)
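The setup above corresponds roughly to the following Keras configuration. The tiny stand-in model and the `train_ds`/`val_ds` names are placeholders for illustration; the real pipeline uses the MobileNetV2 regressor and the grouped datasets.

```python
import tensorflow as tf

# Tiny stand-in model so this snippet is self-contained; the real
# pipeline compiles the MobileNetV2 regressor instead.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(1),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="mse",
    metrics=["mae"],
)

callbacks = [
    tf.keras.callbacks.ModelCheckpoint(
        "outputs/best_model.keras", save_best_only=True),
    tf.keras.callbacks.EarlyStopping(
        patience=7, restore_best_weights=True),
    tf.keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=3),
]

# Training call (placeholder dataset names):
# history = model.fit(train_ds, validation_data=val_ds,
#                     epochs=25, callbacks=callbacks)
```

`EarlyStopping` with `restore_best_weights=True` means the 25-epoch budget is an upper bound; the weights kept are those from the best validation epoch.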
## Results
- Achieved Validation MAE ~0.24 and Test MAE ~0.28.
- The final best model was saved in Keras format (`best_model.keras`) and converted to TFLite float16 (`focus_model.tflite`) for efficient deployment.
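The float16 TFLite conversion can be sketched as below. A tiny inline model keeps the snippet self-contained; the real pipeline would load `outputs/best_model.keras` via `tf.keras.models.load_model(...)` instead.

```python
import tensorflow as tf

# Stand-in model; the real pipeline loads outputs/best_model.keras here.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(1),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]  # float16 weights
tflite_model = converter.convert()

with open("focus_model.tflite", "wb") as f:
    f.write(tflite_model)
```

Float16 quantization roughly halves the model size with minimal accuracy impact, which is why it suits deployment.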
## Files

- `train.py` – Training pipeline script
- `focus_labels.csv` – Input CSV with labels
- `outputs/` – Directory containing:
  - `best_model.keras` – Saved best model
  - `focus_model.tflite` – Converted TFLite model
  - `training_plots.png` – Training/Validation loss plot
✅ This project demonstrates an end-to-end deep learning pipeline for predicting focus score from cell stack images, with dataset preprocessing, training, evaluation, and deployment-ready model conversion.
