This project implements a lightweight convolutional neural network (CNN) for binary face classification (`face` vs `noface`) and a sliding-window detection pipeline. The detector builds an image pyramid, scans it with a fixed-size window, classifies each window, and applies Non-Maximum Suppression (NMS) to produce the final bounding boxes. The repo includes training and test scripts, classification inference, a detection entrypoint, a submission-ready report notebook, a simple web app for face detection, and demo images with results.
- `train.py` — training script.
- `test.py` — test script.
- `predict.py` — classification inference for a single image/patch.
- `face_detection.py` — sliding-window detection entrypoint.
- `net.py` — CNN model definition (PyTorch).
- `load_data.py` — dataset loader and preprocessing.
- `app.py` — Streamlit web app for face detection.
- `model.pth` — model checkpoint.
- `face_detection_report.ipynb` — Jupyter report for submission.
- `demo_images/` — demo inputs and detection outputs (`*_result.jpg`).
- `results/` — training/validation accuracy charts (`train_accuracy.png`, `validation_accuracy.png`).
- `utils/gif2jpg.py` — utility for GIF → JPG conversion.
- `torchsampler/` — imbalanced-sampler helpers.
- Install dependencies:

  ```
  pip install -r requirements.txt
  ```

- If you encounter NumPy/PyTorch compatibility errors (e.g., “compiled against NumPy 1.x” or “Numpy is not available”), align versions by choosing ONE of the following:

  ```
  pip install 'numpy<2'
  pip install --upgrade torch torchvision torchaudio
  ```

- Train (optional): `python train.py`
- Test the model on the test set: `python test.py`
- Classify a single image/patch: `python predict.py /path/to/image.jpg`
- Detect faces on a full image: `python face_detection.py /path/to/image.jpg`
- Run the Streamlit web app: `streamlit run app.py`

- Task: Binary classification (`face` vs `noface`), used as the backbone for detection.
- Architecture: Compact CNN with convolution, ReLU, pooling, and fully connected layers ending in 2 logits for `CrossEntropyLoss`.
- Input Size Consistency: Keep the inference resize consistent with training (e.g., `36×36`) to avoid `fc1` input-dimension mismatches.
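The actual model lives in `net.py`; the sketch below is only an illustrative compact CNN of this shape (layer count and channel sizes are assumptions, not the real `net.py` values). It shows why the `fc1` input dimension is tied to the training-time resize: with a `36×36` input and two 2×2 poolings, the flattened feature size is fixed at `32 * 9 * 9`, so feeding a differently sized image at inference breaks the matrix shape.

```python
import torch
import torch.nn as nn

class FaceNet(nn.Module):
    """Illustrative compact CNN; layer sizes are assumptions, not net.py's exact ones."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # 36x36 -> 36x36
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 18x18
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # -> 18x18
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 9x9
        )
        # in_features is baked in at construction time: 32 channels * 9 * 9 spatial.
        # A different inference resize changes the 9x9 spatial size and triggers
        # exactly the fc1 input-dimension mismatch warned about above.
        self.fc1 = nn.Linear(32 * 9 * 9, 2)  # 2 logits for CrossEntropyLoss

    def forward(self, x):
        x = self.features(x)
        return self.fc1(x.flatten(1))
```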
- Command: `python train.py`
- Typical setup:
  - Loss: `CrossEntropyLoss`
  - Optimizer: Adam (e.g., learning rate `1e-3`)
  - Logging: print per-epoch loss/accuracy; log to TensorBoard
- Checkpoints: saved to `model.pth`. Ensure inference uses the same preprocessing as training.
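A minimal sketch of the training setup described above (CrossEntropyLoss, Adam at `1e-3`, per-epoch loss/accuracy printing, checkpoint to `model.pth`). The function signature and defaults are illustrative, not `train.py`'s actual code:

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=10, lr=1e-3, device="cpu", ckpt_path="model.pth"):
    """Illustrative loop: CrossEntropyLoss + Adam, per-epoch loss/accuracy."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device).train()
    stats = None
    for epoch in range(epochs):
        running_loss, correct, total = 0.0, 0, 0
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            logits = model(images)            # (batch, 2) raw scores
            loss = criterion(logits, labels)  # expects class indices, not one-hot
            loss.backward()
            optimizer.step()
            running_loss += loss.item() * labels.size(0)
            correct += (logits.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
        stats = (running_loss / total, correct / total)
        print(f"epoch {epoch + 1}: loss={stats[0]:.4f} acc={stats[1]:.4f}")
    torch.save(model.state_dict(), ckpt_path)  # checkpoint for later inference
    return stats
```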
- Steps: Image Pyramid → Sliding Window → CNN Classification → NMS
- Thresholds: tune the probability threshold to balance precision vs. recall.
- Output: bounding boxes overlaid on the input image.
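The four steps above can be sketched end-to-end. This is a self-contained illustration, not `face_detection.py` itself: `score_fn` stands in for the CNN's face probability, the downscaling is crude nearest-neighbour indexing (a real pipeline would use `cv2.resize` or PIL), and the window size, stride, scale factor, and thresholds are assumed values.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.3):
    """Greedy Non-Maximum Suppression. boxes: (N, 4) as [x1, y1, x2, y2]."""
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # IoU of the top box against all remaining candidates
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        order = rest[iou <= iou_thresh]  # drop heavily overlapping boxes
    return keep

def detect(image, score_fn, window=36, stride=8, scale=0.8, prob_thresh=0.9):
    """Image pyramid -> sliding window -> score_fn -> NMS.
    score_fn(patch) returns a face probability for a window-sized patch."""
    boxes, scores = [], []
    img, factor = image, 1.0  # factor = current level's scale vs the original
    while min(img.shape[:2]) >= window:
        h, w = img.shape[:2]
        for y in range(0, h - window + 1, stride):
            for x in range(0, w - window + 1, stride):
                p = score_fn(img[y:y + window, x:x + window])
                if p >= prob_thresh:
                    # map the window back to original-image coordinates
                    boxes.append([x / factor, y / factor,
                                  (x + window) / factor, (y + window) / factor])
                    scores.append(p)
        # next pyramid level: crude nearest-neighbour downscale
        nh, nw = int(h * scale), int(w * scale)
        ys = (np.arange(nh) / scale).astype(int)
        xs = (np.arange(nw) / scale).astype(int)
        img = img[ys][:, xs]
        factor *= scale
    if not boxes:
        return []
    boxes, scores = np.array(boxes), np.array(scores)
    return [(boxes[i], scores[i]) for i in nms(boxes, scores)]
```

Raising `prob_thresh` trades recall for precision, as noted above; `iou_thresh` controls how aggressively overlapping detections are merged.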
This project is intended for learning and experimentation. No specific license is provided.