This project implements a lightweight convolutional neural network (CNN) for binary face classification (`face` vs `noface`) and a sliding-window detection pipeline. The detector builds an image pyramid, scans it with a fixed-size window, classifies each window, and applies Non-Maximum Suppression (NMS) to produce the final bounding boxes. The repo includes training and test scripts, classification inference, a detection entrypoint, a submission-ready report notebook, a simple web app for face detection, and demo images with results.
- `train.py` — training script.
- `test.py` — test script.
- `predict.py` — classification inference for a single image/patch.
- `face_detection.py` — sliding-window detection entrypoint.
- `net.py` — CNN model definition (PyTorch).
- `load_data.py` — dataset loader and preprocessing.
- `app.py` — Streamlit web app for face detection.
- `model.pth` — model checkpoint.
- `face_detection_report.ipynb` — Jupyter report for submission.
- `demo_images/` — demo inputs and detection outputs (`*_result.jpg`).
- `results/` — training/validation accuracy charts (`train_accuracy.png`, `validation_accuracy.png`).
- `utils/gif2jpg.py` — utility for GIF → JPG conversion.
- `torchsampler/` — imbalanced-sampler helpers.
- Install dependencies:

  ```
  pip install -r requirements.txt
  ```

- If you encounter NumPy/PyTorch compatibility errors (e.g., “compiled against NumPy 1.x” or “Numpy is not available”), align versions by choosing ONE of the following:

  ```
  pip install 'numpy<2'
  pip install --upgrade torch torchvision torchaudio
  ```

- Train (optional): `python train.py`
- Test the model on the test set: `python test.py`
- Classify a single image/patch: `python predict.py /path/to/image.jpg`
- Detect faces on a full image: `python face_detection.py /path/to/image.jpg`
- Run the Streamlit web app: `streamlit run app.py`

- Task: Binary classification (`face` vs `noface`), used as the backbone for detection.
- Architecture: Compact CNN with convolution, ReLU, pooling, and fully connected layers ending in 2 logits for `CrossEntropyLoss`.
- Input Size Consistency: Keep the inference resize consistent with training (e.g., `36×36`) to avoid `fc1` input-dimension mismatches.
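The actual model lives in `net.py`; the sketch below is only an illustrative compact CNN of this shape (layer count and channel sizes are assumptions, not the real `net.py` values). It shows why the `fc1` input dimension is tied to the training-time resize: with a `36×36` input and two 2×2 poolings, the flattened feature size is fixed at `32 * 9 * 9`, so feeding a differently sized image at inference breaks the matrix shape.

```python
import torch
import torch.nn as nn

class FaceNet(nn.Module):
    """Illustrative compact CNN; layer sizes are assumptions, not net.py's exact ones."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # 36x36 -> 36x36
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 18x18
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # -> 18x18
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 9x9
        )
        # in_features is baked in at construction time: 32 channels * 9 * 9 spatial.
        # A different inference resize changes the 9x9 spatial size and triggers
        # exactly the fc1 input-dimension mismatch warned about above.
        self.fc1 = nn.Linear(32 * 9 * 9, 2)  # 2 logits for CrossEntropyLoss

    def forward(self, x):
        x = self.features(x)
        return self.fc1(x.flatten(1))
```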
- Command: `python train.py`
- Typical setup:
  - Loss: `CrossEntropyLoss`
  - Optimizer: Adam (e.g., learning rate `1e-3`)
  - Logging: print per-epoch loss/accuracy; log to TensorBoard
- Checkpoints: saved to `model.pth`. Ensure inference uses the same preprocessing as training.
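A minimal sketch of the training setup described above (CrossEntropyLoss, Adam at `1e-3`, per-epoch loss/accuracy printing, checkpoint to `model.pth`). The function signature and defaults are illustrative, not `train.py`'s actual code:

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=10, lr=1e-3, device="cpu", ckpt_path="model.pth"):
    """Illustrative loop: CrossEntropyLoss + Adam, per-epoch loss/accuracy."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device).train()
    stats = None
    for epoch in range(epochs):
        running_loss, correct, total = 0.0, 0, 0
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            logits = model(images)            # (batch, 2) raw scores
            loss = criterion(logits, labels)  # expects class indices, not one-hot
            loss.backward()
            optimizer.step()
            running_loss += loss.item() * labels.size(0)
            correct += (logits.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
        stats = (running_loss / total, correct / total)
        print(f"epoch {epoch + 1}: loss={stats[0]:.4f} acc={stats[1]:.4f}")
    torch.save(model.state_dict(), ckpt_path)  # checkpoint for later inference
    return stats
```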
- Steps: Image Pyramid → Sliding Window → CNN Classification → NMS
- Thresholds: tune the probability threshold to balance precision vs. recall.
- Output: bounding boxes overlaid on the input image.
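The four steps above can be sketched end-to-end. This is a self-contained illustration, not `face_detection.py` itself: `score_fn` stands in for the CNN's face probability, the downscaling is crude nearest-neighbour indexing (a real pipeline would use `cv2.resize` or PIL), and the window size, stride, scale factor, and thresholds are assumed values.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.3):
    """Greedy Non-Maximum Suppression. boxes: (N, 4) as [x1, y1, x2, y2]."""
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # IoU of the top box against all remaining candidates
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        order = rest[iou <= iou_thresh]  # drop heavily overlapping boxes
    return keep

def detect(image, score_fn, window=36, stride=8, scale=0.8, prob_thresh=0.9):
    """Image pyramid -> sliding window -> score_fn -> NMS.
    score_fn(patch) returns a face probability for a window-sized patch."""
    boxes, scores = [], []
    img, factor = image, 1.0  # factor = current level's scale vs the original
    while min(img.shape[:2]) >= window:
        h, w = img.shape[:2]
        for y in range(0, h - window + 1, stride):
            for x in range(0, w - window + 1, stride):
                p = score_fn(img[y:y + window, x:x + window])
                if p >= prob_thresh:
                    # map the window back to original-image coordinates
                    boxes.append([x / factor, y / factor,
                                  (x + window) / factor, (y + window) / factor])
                    scores.append(p)
        # next pyramid level: crude nearest-neighbour downscale
        nh, nw = int(h * scale), int(w * scale)
        ys = (np.arange(nh) / scale).astype(int)
        xs = (np.arange(nw) / scale).astype(int)
        img = img[ys][:, xs]
        factor *= scale
    if not boxes:
        return []
    boxes, scores = np.array(boxes), np.array(scores)
    return [(boxes[i], scores[i]) for i in nms(boxes, scores)]
```

Raising `prob_thresh` trades recall for precision, as noted above; `iou_thresh` controls how aggressively overlapping detections are merged.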
This project is intended for learning and experimentation. No specific license is provided.