A machine learning project for gender classification from facial images using Perceptron and SVM. Trained on a dataset of 2,307 images, both models achieved ~90% accuracy on test data. The project implements comprehensive feature extraction (RGB analysis, HOG, GLCM, and more) and an interactive GUI application for real-world testing on any image.

AlvaroVasquezAI/Face_Gender_Classifier

Face Gender Classifier

Python PyTorch Scikit-learn Pandas NumPy OpenCV CustomTkinter

A machine learning project that implements gender classification from facial images using two distinct approaches: a Neural Network Perceptron and a Support Vector Machine (SVM). The project was developed and evaluated on a carefully curated and balanced dataset comprising 2,307 facial images (1,173 men and 1,134 women), with an 80-20 split for training and testing.

[Figures: Man Classification Example, Woman Classification Example]

Both models perform comparably: the Perceptron reaches 94.96% accuracy on training and 90.48% on testing, while the SVM reaches 95.18% on training and 89.83% on testing. These results rely on a feature extraction pipeline that combines:

  • Basic color analysis (RGB channels, statistical measures, and histograms)
  • Advanced texture analysis (Gray Level Co-occurrence Matrix properties)
  • Shape and gradient information (Histogram of Oriented Gradients)
  • Geometric and statistical features (entropy, moments, and spatial relationships)

The project includes a user-friendly GUI application built with CustomTkinter that enables real-world testing and model comparison. Users can:

  • Load images from local storage or web sources
  • Select face regions using an intuitive rectangle selection tool
  • Process images through both models simultaneously
  • Compare prediction results and model confidence

Overview

This project implements and compares two different machine learning approaches for gender classification from facial images. The main objectives are:

1. Model Implementation and Comparison

Perceptron Neural Network

Grid Search Parameters:

  • Learning Rate (alpha): [0.0001, 0.001, 0.01]
  • Max Iterations: [100, 1000, 10000]
  • Stopping Tolerance: [1e-3, 1e-4, 1e-5]
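
As a rough illustration of the search space, the grid above can be expanded into individual configurations. This is a minimal sketch using only the standard library; the dictionary keys and the `expand_grid` helper are hypothetical names, not taken from the notebooks.

```python
from itertools import product

# Values copied from the grid listed above; the key names are illustrative.
param_grid = {
    "learning_rate": [0.0001, 0.001, 0.01],
    "max_iterations": [100, 1000, 10000],
    "tolerance": [1e-3, 1e-4, 1e-5],
}


def expand_grid(grid):
    """Return one dict per combination of the listed values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


configs = expand_grid(param_grid)
print(len(configs))  # 3 * 3 * 3 = 27 candidate models
print(configs[0])
```

Each of the 27 configurations would be trained and scored separately, which is where the ranked list of models comes from.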

Top 5 models

| Learning Rate | Max Iterations | Tolerance | Train Accuracy | Train Precision | Train Recall | Train F1 | Test Accuracy | Test Precision | Test Recall | Test F1 | Accuracy Diff |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.01 | 1000 | 1e-4 | 94.96% | 95.63% | 94.05% | 94.83% | 90.48% | 91.78% | 88.55% | 90.13% | 4.48% |
| 0.001 | 10000 | 1e-5 | 94.80% | 95.72% | 93.61% | 94.65% | 90.26% | 92.52% | 87.22% | 89.80% | 4.54% |
| 0.01 | 10000 | 1e-4 | 94.69% | 95.50% | 93.61% | 94.54% | 89.83% | 91.67% | 87.22% | 89.39% | 4.86% |
| 0.01 | 1000 | 1e-5 | 95.93% | 96.64% | 95.04% | 95.83% | 89.61% | 91.63% | 86.78% | 89.14% | 6.32% |
| 0.01 | 100 | 1e-5 | 90.84% | 91.84% | 89.31% | 90.55% | 88.96% | 92.31% | 84.58% | 88.28% | 1.88% |
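
For readers who want to see how the four reported metrics relate, here is a small sketch that derives them from confusion-matrix counts. The counts are an assumption: they were back-calculated from the reported test percentages of the best Perceptron (treating Woman as the positive class on the 462-image test set) and were not read from the repository's confusion matrix. The function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Assumed counts (back-calculated, see note above): 227 positives, 235 negatives.
metrics = classification_metrics(tp=201, fp=18, fn=26, tn=217)
percent = {name: round(100 * value, 2) for name, value in metrics.items()}
print(percent)
# {'accuracy': 90.48, 'precision': 91.78, 'recall': 88.55, 'f1': 90.13}
```

The Accuracy Diff column is simply train accuracy minus test accuracy, for example 94.96 - 90.48 = 4.48 percentage points.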

Support Vector Machine

Grid Search Parameters:

  • C (Regularization): [0.0001, 0.001, 0.01, 0.1]
  • Kernel: ['linear', 'rbf']
  • Gamma: ['scale', 'auto', 0.1, 1]

Top 5 models

| C | Kernel | Gamma | Train Accuracy | Train Precision | Train Recall | Train F1 | Test Accuracy | Test Precision | Test Recall | Test F1 | Accuracy Diff | Support Vectors |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.001 | linear | scale | 95.18% | 96.06% | 94.05% | 95.04% | 89.83% | 91.67% | 87.22% | 89.39% | 5.35% | 665 |
| 0.001 | linear | auto | 95.18% | 96.06% | 94.05% | 95.04% | 89.83% | 91.67% | 87.22% | 89.39% | 5.35% | 665 |
| 0.001 | linear | 0.1 | 95.18% | 96.06% | 94.05% | 95.04% | 89.83% | 91.67% | 87.22% | 89.39% | 5.35% | 665 |
| 0.001 | linear | 1 | 95.18% | 96.06% | 94.05% | 95.04% | 89.83% | 91.67% | 87.22% | 89.39% | 5.35% | 665 |
| 0.01 | linear | scale | 99.08% | 99.33% | 98.79% | 99.06% | 88.31% | 89.14% | 86.78% | 87.95% | 10.77% | 518 |
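
The first four rows are identical because, assuming the models are scikit-learn `SVC` instances, `gamma` is a coefficient for the rbf, poly and sigmoid kernels and has no effect on a linear kernel. The sketch below counts how many grid points are genuinely distinct; the variable names are illustrative.

```python
from itertools import product

# Values copied from the SVM grid above.
C_values = [0.0001, 0.001, 0.01, 0.1]
kernels = ["linear", "rbf"]
gammas = ["scale", "auto", 0.1, 1]

nominal = list(product(C_values, kernels, gammas))

# gamma does not influence a linear kernel, so collapse it to None there.
effective = {
    (C, kernel, gamma if kernel != "linear" else None)
    for C, kernel, gamma in nominal
}

print(len(nominal), len(effective))  # 32 20
```

Under that assumption, the top five rows represent only two distinct models: linear with C = 0.001 and linear with C = 0.01.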

2. Feature Engineering

All images are preprocessed by resizing to 128x128 pixels before feature extraction. The feature vector combines both basic color properties and advanced image analysis techniques:

| Feature Category | Feature Name | Description | Dimensionality |
|---|---|---|---|
| Basic Color Features | RGB Channels | Separation of image into Red, Green, and Blue channels | 3 channels × (128×128) |
| | RGB Mean | Average intensity value for each color channel | 3 values |
| | RGB Mode | Most frequent intensity value in each channel | 3 values |
| | RGB Variance | Spread of intensity values in each channel | 3 values |
| | RGB Standard Deviation | Square root of variance for each channel | 3 values |
| | Color Histogram | Distribution of pixel intensities (256 bins per channel) | 256 × 3 values |
| Texture Analysis | GLCM Contrast | Measures intensity contrast between pixel pairs | 1 value |
| | GLCM Dissimilarity | Measures how different each pixel pair is | 1 value |
| | GLCM Homogeneity | Measures closeness of element distribution | 1 value |
| | GLCM Energy | Measures textural uniformity | 1 value |
| | GLCM Correlation | Measures linear dependency of gray levels | 1 value |
| Shape Features | HOG (Histogram of Oriented Gradients) | Captures edge directions and gradients | 64 values |
| | Peak Local Max | Identifies local maximum intensity points | 10 × 2 values |
| | Hu Moments | Shape descriptors invariant to translation, rotation, and scale | 7 values |
| | Edge Density | Ratio of edge pixels to total pixels | 1 value |
| Statistical Features | Image Entropy | Measures randomness in pixel intensity distribution | 1 value |
| | Laplacian Mean | Average of second-order derivatives | 1 value |
| | Laplacian Standard Deviation | Spread of second-order derivatives | 1 value |
| | Aspect Ratio | Width to height ratio of the image | 1 value |
| Geometric Features | Circularity | Measure of how circular the face region is | 1 value |
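
Summing the Dimensionality column gives a quick sanity check on the size of the feature vector. The sketch assumes the raw 128×128 channel arrays are an intermediate step and are not concatenated into the vector; the table does not say so explicitly, and the dictionary keys are illustrative names.

```python
# Dimensionalities copied from the table above (raw RGB channels excluded).
feature_dims = {
    "rgb_mean": 3,
    "rgb_mode": 3,
    "rgb_variance": 3,
    "rgb_std": 3,
    "color_histogram": 256 * 3,
    "glcm_contrast": 1,
    "glcm_dissimilarity": 1,
    "glcm_homogeneity": 1,
    "glcm_energy": 1,
    "glcm_correlation": 1,
    "hog": 64,
    "peak_local_max": 10 * 2,
    "hu_moments": 7,
    "edge_density": 1,
    "image_entropy": 1,
    "laplacian_mean": 1,
    "laplacian_std": 1,
    "aspect_ratio": 1,
    "circularity": 1,
}

raw_channel_values = 3 * 128 * 128  # 49152 values if raw pixels were included
total = sum(feature_dims.values())
print(total)  # 882
```

A total of 882 is in line with the roughly 1000 features quoted for the final vector, whereas including the raw pixels would push it past 50,000. The color histogram alone contributes 768 of the 882 values.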

Feature Processing Pipeline

  1. Image Preprocessing

    • Resize to 128×128 pixels
    • Convert to appropriate color spaces (RGB/Grayscale)
    • Apply necessary filters and transformations
  2. Feature Extraction

    • Extract all features independently
    • Normalize histograms and distributions
    • Calculate statistical measures
  3. Feature Vector Generation

    • Concatenate all features into a single vector
    • Standardize features using StandardScaler
    • Final vector dimensionality: ~1000 features
  4. Feature Importance

    • Color features capture skin tone and lighting variations
    • Texture features identify facial patterns
    • Shape features capture facial structure
    • Statistical features provide overall image characteristics

This comprehensive feature set enables the models to capture various aspects of facial characteristics that may be indicative of gender, leading to robust classification performance.
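
To make the extraction and concatenation steps concrete, here is a minimal NumPy sketch of the basic color features and the image entropy. It is an illustration under stated assumptions, not the code in `customTools.py`: the function names are hypothetical, and the input is assumed to be a `uint8` array in RGB order.

```python
import numpy as np


def basic_color_features(image):
    """Per-channel mean, mode, variance, std and normalized 256-bin histograms.

    `image` is assumed to be a uint8 array of shape (H, W, 3) in RGB order.
    """
    channels = image.reshape(-1, 3)
    hist = np.stack([np.bincount(channels[:, c], minlength=256) for c in range(3)])
    mean = channels.mean(axis=0)
    mode = hist.argmax(axis=1)  # most frequent intensity per channel
    variance = channels.var(axis=0)
    std = channels.std(axis=0)
    hist_norm = hist / hist.sum(axis=1, keepdims=True)  # each row sums to 1
    return np.concatenate([mean, mode, variance, std, hist_norm.ravel()])


def image_entropy(gray):
    """Shannon entropy (in bits) of a uint8 grayscale image."""
    counts = np.bincount(gray.ravel(), minlength=256)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())


# Synthetic stand-in for a resized face crop.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8)
features = basic_color_features(image)
print(features.shape)  # (780,)

half = np.zeros((4, 4), dtype=np.uint8)
half[:2] = 255
print(image_entropy(half))  # 1.0 (two equally likely gray levels)
```

The texture and shape features would be appended to the same vector in the same way before standardization.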

3. Real-World Application

The GUI application provides an intuitive interface for testing both models on real-world images. Below are some examples of the application in action:

[Figures: Man Classification Example, Woman Classification Example, Woman Classification Complex Example, Man Classification Complex Example]

Key Features:

  • Real-time feature extraction and classification
  • Support for both local and web images
  • Manual face selection for precise testing
  • Side-by-side model comparison

4. Performance Analysis

Comprehensive evaluation of both models' performance through various metrics and visualizations:

Perceptron Analysis

[Figures: Perceptron Confusion Matrix, Perceptron Results]

SVM Analysis

[Figures: SVM Confusion Matrix, SVM Results]

Key Findings

Perceptron Performance
  • Best Configuration:
    • Learning Rate: 0.01
    • Max Iterations: 1000
    • Tolerance: 1e-4
  • Results:
    • Training Accuracy: 94.96%
    • Testing Accuracy: 90.48%
    • Minimal overfitting (4.48% difference)
  • Strengths:
    • Consistent performance across different hyperparameters
    • Good generalization capabilities
    • Fast inference time
SVM Performance
  • Best Configuration:
    • C: 0.001
    • Kernel: linear
    • Gamma: scale
  • Results:
    • Training Accuracy: 95.18%
    • Testing Accuracy: 89.83%
    • Moderate overfitting (5.35% difference)
  • Strengths:
    • More stable predictions
    • Better handling of outliers
    • 665 support vectors, about 36% of the 1,845 training images

Comparative Analysis

  1. Accuracy Comparison

    • Perceptron slightly outperforms SVM in test accuracy
    • Both models show similar training performance
    • Perceptron shows better generalization
  2. Training Efficiency

    • SVM training is faster
    • Perceptron requires more iterations but achieves better final results
    • Both models show good convergence properties
  3. Model Complexity

    • SVM: 665 support vectors in best model
    • Perceptron: Single layer with direct mapping
    • Trade-off between model complexity and performance
  4. Real-world Performance

    • Both models show robust performance on unseen data
    • Similar confusion patterns in misclassifications
    • Complementary strengths in different scenarios

The analysis shows that while both models achieve comparable performance, the Perceptron demonstrates slightly better generalization capabilities, making it the preferred choice for this specific gender classification task.

Project Structure

Face-Gender-Classifier/
├── Data/                          
│   ├── Man/
│   │   ├── face_0.jpg
│   │   ├── face_1.jpg
│   │   ├── face_2.jpg
│   │   └── ...
│   │
│   ├── Woman/
│   │   ├── face_0.jpg
│   │   ├── face_1.jpg
│   │   ├── face_2.jpg
│   │   └── ...
│   │
│   ├── Data.csv                    # Full dataset index (image paths and other metadata)
│   ├── test_dataset.csv            # Test set (20%)
│   └── train_dataset.csv           # Train set (80%)
│
├── Models/                         # Saved trained models
│   ├── best_perceptron_model.pth   # Trained Perceptron model
│   ├── perceptron_scaler.pkl       # Scaler for Perceptron features
│   ├── best_svm_model.pkl          # Trained SVM model
│   └── svm_scaler.pkl              # Scaler for SVM features
│
├── Results/                         # Training results and visualizations
│   ├── Perceptron/
│   │   ├── confusion_matrix_perceptron.png
│   │   ├── hyperparameters_metrics_perceptron.png
│   │   ├── loss_accuracy_perceptron.png
│   │   └── perceptron_results.txt
│   │
│   ├── SVM/
│   │   ├── confusion_matrix_svm.png
│   │   ├── hyperparameters_metrics_svm.png
│   │   ├── correlation_analysis_svm.png
│   │   └── svm_results.txt
│   │
│   └── App/
│       ├── app.png
│       ├── perceptronResult1.png
│       ├── svmResult1.png
│       ├── perceptronResult2.png
│       └── svmResult2.png
│
├── Scripts/                         # Source code
│   ├── customTools.py              # Feature extraction and model classes
│       ├── class Image            # Feature extraction implementation
│       └── class Perceptron      # Neural network implementation
│
├── app.py                          # GUI application
├── Perceptron.ipynb                # Detailed notebook for perceptron
├── SVM.ipynb                       # Detailed notebook for svm
├── requirements.txt                # Project dependencies
└── README.md

Key Components

  1. Models Directory

    • Contains trained models and their corresponding scalers
    • Models are saved in their native formats (PyTorch for Perceptron, pickle for SVM)
    • Scalers ensure consistent feature scaling during inference
  2. Results Directory

    • Organized by model type
    • Contains visualizations of model performance
    • Includes detailed metric reports
    • Stores hyperparameter analysis results
    • Contains real-world results from both models
  3. Scripts Directory

    • Contains core implementation files
    • Includes all the functions and implementations used in the notebooks and the app
  4. Application

    • Main GUI application for real-world testing
    • Implements both models in a user-friendly interface
    • Provides real-time feature extraction and classification
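
The reason each model ships with its own scaler is that inference must reuse the statistics fitted on the training set. The following NumPy sketch mirrors what scikit-learn's `StandardScaler` does (population standard deviation, constant features left unscaled); the helper names and the toy data are hypothetical.

```python
import numpy as np


def fit_scaler(train_features):
    """Learn per-feature mean and scale from the training set only."""
    mean = train_features.mean(axis=0)
    scale = train_features.std(axis=0)
    scale[scale == 0] = 1.0  # avoid dividing by zero for constant features
    return mean, scale


def apply_scaler(features, mean, scale):
    """Standardize new samples with the statistics learned at training time."""
    return (features - mean) / scale


# Toy training set: the second feature is constant.
train_features = np.array([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
mean, scale = fit_scaler(train_features)

# A new image's features are scaled with the stored statistics, never refitted.
new_sample = np.array([[4.0, 10.0]])
scaled_sample = apply_scaler(new_sample, mean, scale)
print(scaled_sample)
```

Refitting the scaler on a single inference image would map every feature to zero, which is why the fitted scalers are saved alongside the models.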

Installation

Prerequisites

  • Python 3.10.15 (recommended for CUDA)
  • CUDA-capable GPU (the Perceptron was built for CUDA using PyTorch; the SVM runs on CPU)
  • Git

Environment Setup

  1. Clone the repository:
git clone https://github.com/yourusername/face-gender-classifier.git
cd face-gender-classifier
  2. Install dependencies:
pip install -r requirements.txt

GPU Support

  1. Check your CUDA version:
nvidia-smi
  2. Install the matching PyTorch build (the example targets CUDA 11.8):
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# Or, if your environment uses pip instead of pip3:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
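
After installation, a short check can confirm whether PyTorch sees the GPU. This is a minimal sketch; `pick_device` is a hypothetical helper, and it falls back to CPU when PyTorch is not installed or no CUDA device is visible.

```python
def pick_device():
    """Return "cuda" when PyTorch can use a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed in this environment
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch wheel probably does not match the CUDA version reported by `nvidia-smi`.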

Usage

GUI Application

  1. Start the application:
python app.py
  2. Using the interface:
    • Click "Load Image" to select an image
    • Draw a rectangle around the face using the mouse
    • Use "Predict (SVM)" or "Predict (Perceptron)" buttons to classify
    • View results in the prediction panel
    • Compare results between models
[Figures: App, Select image, Main screen, Draw Rectangle, Perceptron result, SVM result]
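
A rectangle drawn with the mouse can be dragged in any direction and may extend past the image, so the selection has to be normalized before cropping. The sketch below shows one way to do that; `normalize_box` is a hypothetical helper and is not taken from `app.py`.

```python
def normalize_box(start, end, image_width, image_height):
    """Turn two drag points into a (left, top, right, bottom) box inside the image."""
    (x0, y0), (x1, y1) = start, end
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))
    # Clamp the box to the image bounds.
    left, right = max(0, left), min(image_width, right)
    top, bottom = max(0, top), min(image_height, bottom)
    if right - left < 1 or bottom - top < 1:
        raise ValueError("selection is empty")
    return left, top, right, bottom


# Dragging from bottom-right to top-left still yields a valid box.
print(normalize_box((50, 40), (10, 20), image_width=100, image_height=80))
# (10, 20, 50, 40)
```

The resulting box would then be cropped and resized to 128×128 pixels before feature extraction, matching the preprocessing used in training.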

Models

Perceptron Results

The Neural Network Perceptron achieved excellent results in gender classification, demonstrating strong generalization capabilities:

[Figures: Perceptron Result Metrics, Loss and Accuracy Curves, Hyperparameter Analysis, Random Test]

Best Model Performance:

  • Training Accuracy: 94.96%
  • Testing Accuracy: 90.48%
  • Training/Testing Difference: 4.48%

Key Characteristics:

  • Learning Rate: 0.01
  • Max Iterations: 1000
  • Tolerance: 1e-4
  • Convergence achieved before maximum iterations
  • Limited overfitting (4.48 percentage point gap between training and testing accuracy)
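
To show how the learning rate, maximum iterations and tolerance interact, here is a NumPy sketch of a single-layer classifier that stops early once the loss change drops below the tolerance. It assumes a sigmoid output trained by gradient descent on binary cross-entropy; the repository's PyTorch `Perceptron` class may differ, and the function name and toy data are hypothetical.

```python
import numpy as np


def train_single_layer(X, y, lr=0.01, max_iter=1000, tol=1e-4):
    """Gradient descent on binary cross-entropy with a tolerance-based stop."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    prev_loss = np.inf
    eps = 1e-12  # keeps log() finite
    for iteration in range(1, max_iter + 1):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        if abs(prev_loss - loss) < tol:
            break  # improvement is below the tolerance: treat as converged
        prev_loss = loss
        error = p - y
        w -= lr * (X.T @ error) / n_samples
        b -= lr * error.mean()
    return w, b, iteration


# Tiny linearly separable toy set (not the face features).
X = np.array([[-2.0, -2.0], [-1.0, -2.0], [-2.0, -1.0],
              [2.0, 2.0], [1.0, 2.0], [2.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

w, b, iterations = train_single_layer(X, y)
predictions = (X @ w + b > 0).astype(float)
accuracy = float((predictions == y).mean())
print(iterations, accuracy)
```

A looser tolerance stops training earlier, while a tighter one lets training run closer to the iteration cap.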

SVM Results

The Support Vector Machine showed robust performance with excellent stability:

[Figures: SVM Result Metrics, Hyperparameter Analysis, Correlation Analysis, Random Test]

Best Model Performance:

  • Training Accuracy: 95.18%
  • Testing Accuracy: 89.83%
  • Training/Testing Difference: 5.35%

Key Characteristics:

  • Linear kernel
  • C: 0.001
  • Gamma: scale
  • Support Vectors: 665
  • Good balance between model complexity and performance

Model Comparison

Both models demonstrated strong performance, with some key differences:

  1. Accuracy

    • Perceptron slightly better in test accuracy (90.48% vs 89.83%)
    • SVM slightly better in training accuracy (95.18% vs 94.96%)
  2. Generalization

    • Perceptron: 4.48% accuracy difference
    • SVM: 5.35% accuracy difference
    • Both show good generalization capabilities
  3. Practical Considerations

    • Perceptron requires GPU for optimal performance
    • SVM works efficiently on CPU
    • Both suitable for real-time applications
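
To put the test-accuracy gap in context, the percentages can be converted back to image counts on the 462-image test set. The sketch below also estimates a rough standard error under a simple binomial approximation that treats test images as independent; this is an assumption for illustration, and a paired comparison on per-image predictions would be more rigorous.

```python
import math

n_test = 462  # size of the test set
reported = {"Perceptron": 0.9048, "SVM": 0.8983}

# Number of correctly classified test images implied by each accuracy.
correct = {name: round(acc * n_test) for name, acc in reported.items()}
gap_images = correct["Perceptron"] - correct["SVM"]

# Binomial standard error of each accuracy estimate.
standard_error = {
    name: math.sqrt(acc * (1 - acc) / n_test) for name, acc in reported.items()
}

print(correct)     # {'Perceptron': 418, 'SVM': 415}
print(gap_images)  # 3
print({name: round(100 * se, 2) for name, se in standard_error.items()})
```

The 0.65-point difference corresponds to three test images, which is smaller than the standard error of roughly 1.4 points on either estimate.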

Dataset

The dataset used in this project was obtained from the Gender Classification Dataset on Kaggle. It consists of facial images carefully selected to maintain balance between classes.

Dataset Distribution

| Class | Training Set | Testing Set | Total Images | Percentage |
|---|---|---|---|---|
| Man | 938 | 235 | 1,173 | 50.8% |
| Woman | 907 | 227 | 1,134 | 49.2% |
| Total | 1,845 | 462 | 2,307 | 100% |

Dataset Characteristics

  • Image Format: JPG
  • Original Dimensions: Variable sizes
  • Preprocessed Dimensions: 128x128 pixels
  • Color Space: RGB
  • Class Balance: Nearly perfect (50.8% men, 49.2% women)
  • Train/Test Split: 80/20 ratio maintaining class distribution
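
A split that maintains class distribution is usually called a stratified split. The standard-library sketch below reproduces the class counts in the distribution table; it is an illustration only, since the README does not say how the split was produced (scikit-learn's `train_test_split` with `stratify` is a common choice). The function name and the seed are hypothetical.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps about the same share in both sets."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


# Class sizes taken from the dataset distribution table.
labels = ["Man"] * 1173 + ["Woman"] * 1134
train_idx, test_idx = stratified_split(labels)

test_men = sum(labels[i] == "Man" for i in test_idx)
print(len(train_idx), len(test_idx), test_men)  # 1845 462 235
```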

Sample Images

[Sample images: Man Sample 1, Woman Sample 1, Man Sample 2, Woman Sample 2]

Real-World Testing

To evaluate the models' performance in real-world scenarios, we tested both the Perceptron and SVM on various facial images outside the training dataset. Below are examples of the classification results:

Perceptron Results

[Figures: Perceptron Test 1 through Perceptron Test 25]

SVM Results

[Figures: SVM Test 1 through SVM Test 25]

License

Distributed under the MIT License. See LICENSE for more information.
