A machine learning project that implements gender classification from facial images using two distinct approaches: a Neural Network Perceptron and a Support Vector Machine (SVM). The project was developed and evaluated on a carefully curated and balanced dataset comprising 2,307 facial images (1,173 men and 1,134 women), with an 80-20 split for training and testing.
Both models demonstrate robust performance: the Perceptron achieves 94.96% accuracy on training and 90.48% on testing, while the SVM reaches 95.18% on training and 89.83% on testing. This performance is driven by an extensive feature extraction pipeline that combines:
- Basic color analysis (RGB channels, statistical measures, and histograms)
- Advanced texture analysis (Gray Level Co-occurrence Matrix properties)
- Shape and gradient information (Histogram of Oriented Gradients)
- Geometric and statistical features (entropy, moments, and spatial relationships)
The project includes a user-friendly GUI application built with CustomTkinter that enables real-world testing and model comparison. Users can:
- Load images from local storage or web sources
- Select face regions using an intuitive rectangle selection tool
- Process images through both models simultaneously
- Compare prediction results and model confidence
- Overview
- Project Structure
- Installation
- Usage
- GUI Application
- Models
- Dataset
- Real-World Testing
- License
This project implements and compares two different machine learning approaches for gender classification from facial images: a Neural Network Perceptron and a Support Vector Machine. Both models were tuned with a grid search over their key hyperparameters, summarized below.
Perceptron Grid Search Parameters:
- Learning Rate (alpha): [0.0001, 0.001, 0.01]
- Max Iterations: [100, 1000, 10000]
- Stopping Tolerance: [1e-3, 1e-4, 1e-5]
Top 5 models
| Learning Rate | Max Iterations | Tolerance | Train Accuracy | Train Precision | Train Recall | Train F1 | Test Accuracy | Test Precision | Test Recall | Test F1 | Accuracy Diff |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.01 | 1000 | 1e-4 | 94.96% | 95.63% | 94.05% | 94.83% | 90.48% | 91.78% | 88.55% | 90.13% | 4.48% |
| 0.001 | 10000 | 1e-5 | 94.80% | 95.72% | 93.61% | 94.65% | 90.26% | 92.52% | 87.22% | 89.80% | 4.54% |
| 0.01 | 10000 | 1e-4 | 94.69% | 95.50% | 93.61% | 94.54% | 89.83% | 91.67% | 87.22% | 89.39% | 4.86% |
| 0.01 | 1000 | 1e-5 | 95.93% | 96.64% | 95.04% | 95.83% | 89.61% | 91.63% | 86.78% | 89.14% | 6.32% |
| 0.01 | 100 | 1e-5 | 90.84% | 91.84% | 89.31% | 90.55% | 88.96% | 92.31% | 84.58% | 88.28% | 1.88% |
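For illustration, the sketch below reproduces this kind of hyperparameter sweep using scikit-learn's `Perceptron` (its `eta0` playing the role of the learning rate alpha) on random placeholder data; the project itself uses a custom PyTorch Perceptron, so treat this only as an outline of the search loop.

```python
# Illustrative grid search over the Perceptron hyperparameters listed above.
# Assumptions: scikit-learn's Perceptron stands in for the project's custom
# PyTorch implementation, and the random arrays stand in for real features.
from itertools import product

import numpy as np
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score

X_train, X_test = np.random.rand(200, 50), np.random.rand(50, 50)
y_train, y_test = np.random.randint(0, 2, 200), np.random.randint(0, 2, 50)

results = []
for lr, max_iter, tol in product([0.0001, 0.001, 0.01],
                                 [100, 1000, 10000],
                                 [1e-3, 1e-4, 1e-5]):
    clf = Perceptron(eta0=lr, max_iter=max_iter, tol=tol).fit(X_train, y_train)
    results.append((lr, max_iter, tol,
                    accuracy_score(y_train, clf.predict(X_train)),
                    accuracy_score(y_test, clf.predict(X_test))))

# Rank configurations by test accuracy, as in the table above
results.sort(key=lambda r: r[-1], reverse=True)
print(results[:5])
```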
SVM Grid Search Parameters:
- C (Regularization): [0.0001, 0.001, 0.01, 0.1]
- Kernel: ['linear', 'rbf']
- Gamma: ['scale', 'auto', 0.1, 1]
Top 5 models
| C | Kernel | Gamma | Train Accuracy | Train Precision | Train Recall | Train F1 | Test Accuracy | Test Precision | Test Recall | Test F1 | Accuracy Diff | Support Vectors |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.001 | linear | scale | 95.18% | 96.06% | 94.05% | 95.04% | 89.83% | 91.67% | 87.22% | 89.39% | 5.35% | 665 |
| 0.001 | linear | auto | 95.18% | 96.06% | 94.05% | 95.04% | 89.83% | 91.67% | 87.22% | 89.39% | 5.35% | 665 |
| 0.001 | linear | 0.1 | 95.18% | 96.06% | 94.05% | 95.04% | 89.83% | 91.67% | 87.22% | 89.39% | 5.35% | 665 |
| 0.001 | linear | 1 | 95.18% | 96.06% | 94.05% | 95.04% | 89.83% | 91.67% | 87.22% | 89.39% | 5.35% | 665 |
| 0.01 | linear | scale | 99.08% | 99.33% | 98.79% | 99.06% | 88.31% | 89.14% | 86.78% | 87.95% | 10.77% | 518 |
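A comparable SVM search can be expressed directly with scikit-learn's `GridSearchCV`, sketched below on placeholder data; note that `GridSearchCV` scores by cross-validation, whereas the table above reports a single 80-20 train/test split.

```python
# Hedged sketch of the SVM grid search; the random arrays are placeholders
# for the extracted feature vectors, and cross-validation replaces the
# project's single train/test evaluation.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X_train = np.random.rand(200, 50)
y_train = np.random.randint(0, 2, 200)

param_grid = {
    "C": [0.0001, 0.001, 0.01, 0.1],
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", "auto", 0.1, 1],
}
search = GridSearchCV(SVC(), param_grid, scoring="accuracy", cv=5, n_jobs=-1)
search.fit(X_train, y_train)

print(search.best_params_)                        # best C, kernel, gamma
print(search.best_estimator_.n_support_.sum())    # total support vectors
```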
All images are preprocessed by resizing to 128x128 pixels before feature extraction. The feature vector combines both basic color properties and advanced image analysis techniques:
| Feature Category | Feature Name | Description | Dimensionality |
|---|---|---|---|
| Basic Color Features | RGB Channels | Separation of image into Red, Green, and Blue channels | 3 channels × (128×128) |
| | RGB Mean | Average intensity value for each color channel | 3 values |
| | RGB Mode | Most frequent intensity value in each channel | 3 values |
| | RGB Variance | Spread of intensity values in each channel | 3 values |
| | RGB Standard Deviation | Square root of variance for each channel | 3 values |
| | Color Histogram | Distribution of pixel intensities (256 bins per channel) | 256 × 3 values |
| Texture Analysis | GLCM Contrast | Measures intensity contrast between pixel pairs | 1 value |
| | GLCM Dissimilarity | Measures how different each pixel pair is | 1 value |
| | GLCM Homogeneity | Measures closeness of element distribution | 1 value |
| | GLCM Energy | Measures textural uniformity | 1 value |
| | GLCM Correlation | Measures linear dependency of gray levels | 1 value |
| Shape Features | HOG (Histogram of Oriented Gradients) | Captures edge directions and gradients | 64 values |
| | Peak Local Max | Identifies local maximum intensity points | 10 × 2 values |
| | Hu Moments | Shape descriptors invariant to translation, rotation, and scale | 7 values |
| | Edge Density | Ratio of edge pixels to total pixels | 1 value |
| Statistical Features | Image Entropy | Measures randomness in pixel intensity distribution | 1 value |
| | Laplacian Mean | Average of second-order derivatives | 1 value |
| | Laplacian Standard Deviation | Spread of second-order derivatives | 1 value |
| | Aspect Ratio | Width to height ratio of the image | 1 value |
| Geometric Features | Circularity | Measure of how circular the face region is | 1 value |
- Image Preprocessing
  - Resize to 128×128 pixels
  - Convert to appropriate color spaces (RGB/Grayscale)
  - Apply necessary filters and transformations
- Feature Extraction
  - Extract all features independently
  - Normalize histograms and distributions
  - Calculate statistical measures
- Feature Vector Generation
  - Concatenate all features into a single vector
  - Standardize features using StandardScaler
  - Final vector dimensionality: ~1000 features
- Feature Importance
  - Color features capture skin tone and lighting variations
  - Texture features identify facial patterns
  - Shape features capture facial structure
  - Statistical features provide overall image characteristics
This comprehensive feature set enables the models to capture various aspects of facial characteristics that may be indicative of gender, leading to robust classification performance.
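As a rough illustration of this pipeline, the sketch below combines the color statistics, GLCM properties, and HOG descriptor using OpenCV and scikit-image; the parameter choices and the `extract_features` name are assumptions, not the project's exact implementation in `customTools.py`.

```python
# Minimal sketch of the feature-extraction pipeline described above.
# Assumptions: OpenCV + scikit-image, illustrative HOG/GLCM parameters,
# and a random array standing in for a cropped face image.
import cv2
import numpy as np
from skimage.feature import graycomatrix, graycoprops, hog

def extract_features(bgr_image):
    img = cv2.resize(bgr_image, (128, 128))               # 1. preprocessing
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    features = []
    # 2a. basic color statistics and a normalized 256-bin histogram per channel
    for ch in cv2.split(img):
        features += [ch.mean(), ch.var(), ch.std()]
        hist = cv2.calcHist([ch], [0], None, [256], [0, 256]).flatten()
        features += list(hist / hist.sum())

    # 2b. GLCM texture properties
    glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    for prop in ("contrast", "dissimilarity", "homogeneity",
                 "energy", "correlation"):
        features.append(graycoprops(glcm, prop)[0, 0])

    # 2c. HOG gradient descriptor (coarse cells keep the vector small)
    features += list(hog(gray, orientations=8, pixels_per_cell=(32, 32),
                         cells_per_block=(1, 1)))

    # 3. concatenate into a single vector
    return np.array(features, dtype=np.float32)

vector = extract_features(np.random.randint(0, 256, (200, 180, 3), np.uint8))
print(vector.shape)
```

In the project, the concatenated vectors are additionally standardized with scikit-learn's StandardScaler (persisted as the `perceptron_scaler.pkl` and `svm_scaler.pkl` files) before being fed to either model.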
The GUI application provides an intuitive interface for testing both models on real-world images. Below are some examples of the application in action:
Key Features:
- Real-time feature extraction and classification
- Support for both local and web images
- Manual face selection for precise testing
- Side-by-side model comparison
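As a taste of the framework used, the snippet below is a minimal CustomTkinter window with an image-load button; the widget layout and callback are purely illustrative and far simpler than the actual `app.py`.

```python
# Minimal CustomTkinter sketch; widget names and layout are illustrative only.
import customtkinter as ctk
from tkinter import filedialog

app = ctk.CTk()
app.title("Face Gender Classifier")

def load_image():
    # Open a file dialog and report the chosen path in the label below
    path = filedialog.askopenfilename()
    result_label.configure(text=f"Loaded: {path}")

ctk.CTkButton(app, text="Load Image", command=load_image).pack(pady=10)
result_label = ctk.CTkLabel(app, text="No image loaded")
result_label.pack(pady=10)

app.mainloop()
```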
Comprehensive evaluation of both models' performance through various metrics and visualizations:
Perceptron:
- Best Configuration:
  - Learning Rate: 0.01
  - Max Iterations: 1000
  - Tolerance: 1e-4
- Results:
  - Training Accuracy: 94.96%
  - Testing Accuracy: 90.48%
  - Minimal overfitting (4.48% difference)
- Strengths:
  - Consistent performance across different hyperparameters
  - Good generalization capabilities
  - Fast inference time

SVM:
- Best Configuration:
  - C: 0.001
  - Kernel: linear
  - Gamma: scale
- Results:
  - Training Accuracy: 95.18%
  - Testing Accuracy: 89.83%
  - Moderate overfitting (5.35% difference)
- Strengths:
  - More stable predictions
  - Better handling of outliers
  - Fewer support vectors needed (665)
- Accuracy Comparison
  - Perceptron slightly outperforms SVM in test accuracy
  - Both models show similar training performance
  - Perceptron shows better generalization
- Training Efficiency
  - SVM training is faster
  - Perceptron requires more iterations but achieves better final results
  - Both models show good convergence properties
- Model Complexity
  - SVM: 665 support vectors in the best model
  - Perceptron: single layer with direct mapping
  - Trade-off between model complexity and performance
- Real-world Performance
  - Both models show robust performance on unseen data
  - Similar confusion patterns in misclassifications
  - Complementary strengths in different scenarios
The analysis shows that while both models achieve comparable performance, the Perceptron demonstrates slightly better generalization capabilities, making it the preferred choice for this specific gender classification task.
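The accuracy, precision, recall, and F1 figures quoted throughout this section can be reproduced with standard scikit-learn metrics, as in the sketch below; the label and prediction arrays are placeholders for the project's real test data and model outputs.

```python
# Hedged sketch of the metric computation; y_test and the predictions are
# placeholders standing in for the real test labels and model outputs.
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

y_test = np.random.randint(0, 2, 462)             # 462 test images (placeholder)
perceptron_preds = np.random.randint(0, 2, 462)   # placeholder predictions

def report(y_true, y_pred, label):
    print(f"--- {label} ---")
    print("Accuracy :", accuracy_score(y_true, y_pred))
    print("Precision:", precision_score(y_true, y_pred))
    print("Recall   :", recall_score(y_true, y_pred))
    print("F1 score :", f1_score(y_true, y_pred))
    print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))

report(y_test, perceptron_preds, "Perceptron")
```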
```
Face-Gender-Classifier/
├── Data/
│   ├── Man/
│   │   ├── face_0.jpg
│   │   ├── face_1.jpg
│   │   ├── face_2.jpg
│   │   └── ...
│   │
│   ├── Woman/
│   │   ├── face_0.jpg
│   │   ├── face_1.jpg
│   │   ├── face_2.jpg
│   │   └── ...
│   │
│   ├── Data.csv                    # All data in a CSV (image paths only) and other specifications
│   ├── test_dataset.csv            # Test set (20%)
│   └── train_dataset.csv           # Train set (80%)
│
├── Models/                         # Saved trained models
│   ├── best_perceptron_model.pth   # Trained Perceptron model
│   ├── perceptron_scaler.pkl       # Scaler for Perceptron features
│   ├── best_svm_model.pkl          # Trained SVM model
│   └── svm_scaler.pkl              # Scaler for SVM features
│
├── Results/                        # Training results and visualizations
│   ├── Perceptron/
│   │   ├── confusion_matrix_perceptron.png
│   │   ├── hyperparameters_metrics_perceptron.png
│   │   ├── loss_accuracy_perceptron.png
│   │   └── perceptron_results.txt
│   │
│   ├── SVM/
│   │   ├── confusion_matrix_svm.png
│   │   ├── hyperparameters_metrics_svm.png
│   │   ├── correlation_analysis_svm.png
│   │   └── svm_results.txt
│   │
│   └── App/
│       ├── app.png
│       ├── perceptronResult1.png
│       ├── svmResult1.png
│       ├── perceptronResult2.png
│       └── svmResult2.png
│
├── Scripts/                        # Source code
│   ├── customTools.py              # Feature extraction and model classes
│   ├── class Image                 # Feature extraction implementation
│   └── class Perceptron            # Neural network implementation
│
├── app.py                          # GUI application
├── Perceptron.ipynb                # Detailed notebook for perceptron
├── SVM.ipynb                       # Detailed notebook for svm
├── requirements.txt                # Project dependencies
└── README.md
```
- Models Directory
  - Contains trained models and their corresponding scalers
  - Models are saved in their native formats (PyTorch for the Perceptron, pickle for the SVM)
  - Scalers ensure consistent feature scaling during inference (see the loading sketch after this list)
- Results Directory
  - Organized by model type
  - Contains visualizations of model performance
  - Includes detailed metric reports
  - Stores hyperparameter analysis results
  - Contains real-world results from both models
- Scripts Directory
  - Contains core implementation files
  - Includes all the functions and implementations used in the notebooks and the app
- Application
  - Main GUI application for real-world testing
  - Implements both models in a user-friendly interface
  - Provides real-time feature extraction and classification
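The sketch below shows one plausible way to load the saved SVM model and scaler from `Models/` for a single prediction; the feature vector is a placeholder for the output of the extraction pipeline, and the prediction call assumes a scikit-learn estimator, as suggested by the pickle format.

```python
# Hedged inference sketch using the artifacts listed in the project tree.
# The random feature vector is a placeholder for a real extracted vector.
import pickle
import numpy as np
import torch

with open("Models/best_svm_model.pkl", "rb") as f:
    svm_model = pickle.load(f)
with open("Models/svm_scaler.pkl", "rb") as f:
    svm_scaler = pickle.load(f)

features = np.random.rand(1, svm_scaler.n_features_in_)   # placeholder vector
print("SVM prediction:", svm_model.predict(svm_scaler.transform(features)))

# The Perceptron weights are stored in PyTorch format
perceptron_state = torch.load("Models/best_perceptron_model.pth")
```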
- Python 3.10.15 (recommended for CUDA support)
- CUDA-capable GPU (the Perceptron was built for CUDA using PyTorch; the SVM can run on CPU)
- Git
- Clone the repository:
```bash
git clone https://github.com/yourusername/face-gender-classifier.git
cd face-gender-classifier
```
- Install dependencies:
```bash
pip install -r requirements.txt
```
GPU Support
- Check your CUDA version:
```bash
nvidia-smi
```
- Install the appropriate PyTorch version:
```bash
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# Or
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```
- Start the application:
```bash
python app.py
```
- Using the interface:
- Click "Load Image" to select an image
- Draw a rectangle around the face using the mouse
- Use "Predict (SVM)" or "Predict (Perceptron)" buttons to classify
- View results in the prediction panel
- Compare results between models
The Neural Network Perceptron achieved excellent results in gender classification, demonstrating strong generalization capabilities:
Best Model Performance:
- Training Accuracy: 94.96%
- Testing Accuracy: 90.48%
- Training/Testing Difference: 4.48%
Key Characteristics:
- Learning Rate: 0.01
- Max Iterations: 1000
- Tolerance: 1e-4
- Convergence achieved before maximum iterations
- Minimal overfitting despite high model capacity
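A minimal PyTorch sketch of a single-layer perceptron with tolerance-based early stopping, reflecting the characteristics above, is shown below; the architecture, loss, and placeholder tensors are assumptions rather than the project's exact training code.

```python
# Minimal sketch: single linear layer trained with SGD, stopping early when
# the loss improvement drops below the tolerance. Illustrative only.
import torch
import torch.nn as nn

def train_perceptron(X, y, lr=0.01, max_iter=1000, tol=1e-4):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    X, y = X.to(device), y.to(device)
    model = nn.Linear(X.shape[1], 1).to(device)     # single layer, direct mapping
    criterion = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    prev_loss = float("inf")
    for _ in range(max_iter):
        optimizer.zero_grad()
        loss = criterion(model(X).squeeze(1), y)
        loss.backward()
        optimizer.step()
        if abs(prev_loss - loss.item()) < tol:      # converged before max_iter
            break
        prev_loss = loss.item()
    return model

# Placeholder tensors standing in for the standardized feature vectors
model = train_perceptron(torch.rand(200, 50), torch.randint(0, 2, (200,)).float())
```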
The Support Vector Machine showed robust performance with excellent stability:
Best Model Performance:
- Training Accuracy: 95.18%
- Testing Accuracy: 89.83%
- Training/Testing Difference: 5.35%
Key Characteristics:
- Linear kernel
- C: 0.001
- Gamma: scale
- Support Vectors: 665
- Good balance between model complexity and performance
Both models demonstrated strong performance, with some key differences:
- Accuracy
  - Perceptron slightly better in test accuracy (90.48% vs 89.83%)
  - SVM slightly better in training accuracy (95.18% vs 94.96%)
- Generalization
  - Perceptron: 4.48% accuracy difference
  - SVM: 5.35% accuracy difference
  - Both show good generalization capabilities
- Practical Considerations
  - Perceptron requires GPU for optimal performance
  - SVM works efficiently on CPU
  - Both suitable for real-time applications
The dataset used in this project was obtained from the Gender Classification Dataset on Kaggle. It consists of facial images carefully selected to maintain balance between classes.
| Class | Training Set | Testing Set | Total Images | Percentage |
|---|---|---|---|---|
| Man | 938 | 235 | 1,173 | 50.8% |
| Woman | 907 | 227 | 1,134 | 49.2% |
| Total | 1,845 | 462 | 2,307 | 100% |
- Image Format: JPG
- Original Dimensions: Variable sizes
- Preprocessed Dimensions: 128x128 pixels
- Color Space: RGB
- Class Balance: Nearly perfect (50.8% men, 49.2% women)
- Train/Test Split: 80/20 ratio maintaining class distribution
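The 80/20 split that preserves this class balance can be produced with a stratified `train_test_split`, sketched below; the column names in `Data.csv` are assumptions for illustration.

```python
# Hedged sketch of the stratified 80/20 split; "label" is an assumed column
# name in Data.csv, which the project describes as holding image paths and
# other specifications.
import pandas as pd
from sklearn.model_selection import train_test_split

data = pd.read_csv("Data/Data.csv")
train_df, test_df = train_test_split(
    data,
    test_size=0.20,
    stratify=data["label"],    # keep ~50.8% / 49.2% class ratio in both splits
    random_state=42,
)
train_df.to_csv("Data/train_dataset.csv", index=False)
test_df.to_csv("Data/test_dataset.csv", index=False)
```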
To evaluate the models' performance in real-world scenarios, we tested both the Perceptron and SVM on various facial images outside the training dataset. Below are examples of the classification results:
Distributed under the MIT License. See LICENSE for more information.