This project implements a real-time sign language recognition system using computer vision and deep learning. It can detect American Sign Language (ASL) gestures through a webcam and convert them to text and speech.
🔗 Live demo: https://sign-language-recognitiontool.vercel.app (Note: backend runs on a free-tier host and may take ~30s to wake up on first request)
- Real-time hand gesture detection using MediaPipe
- CNN-based sign language classification for A-Z letters
- Text-to-speech conversion
- GUI interface with Tkinter
- Data collection tools for training
- final_pred.py - Main GUI application with full features
- prediction_wo_gui.py - Console-based prediction without GUI
- data_collection_binary.py - Tool for collecting binary image data
- data_collection_final.py - Tool for collecting skeleton hand data
- cnn8grps_rad1_model.h5 - Pre-trained CNN model
- white.jpg - White background image for skeleton drawing
- AtoZ_3.1/ - Directory structure for training data
Install the required packages:
pip install -r requirements.txt

Required packages:
- opencv-python==4.8.1.78
- cvzone==1.6.1
- tensorflow==2.13.0
- keras==2.13.1
- numpy==1.24.3
- pyttsx3==2.90
- pillow==10.0.1
- pyenchant==3.2.2
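
To confirm that an environment matches the pinned versions above, a small check along the following lines may help. This is only a sketch: the `PINNED` dictionary and the `find_mismatches` helper are illustrative names, not part of the project, and the check relies on the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Versions copied from the package list above.
PINNED = {
    "opencv-python": "4.8.1.78",
    "cvzone": "1.6.1",
    "tensorflow": "2.13.0",
    "keras": "2.13.1",
    "numpy": "1.24.3",
    "pyttsx3": "2.90",
    "pillow": "10.0.1",
    "pyenchant": "3.2.2",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every package
    whose installed version differs from the pinned one."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems[name] = found
    return problems


if __name__ == "__main__":
    for name, found in find_mismatches(PINNED).items():
        print(f"{name}: expected {PINNED[name]}, found {found}")
```

An empty result means every listed package is installed at the expected version.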
Run the scripts:
# GUI application
python final_pred.py
# Console version without GUI
python prediction_wo_gui.py
# For binary images
python data_collection_binary.py
# For skeleton data
python data_collection_final.py

Keyboard controls:
- ESC - Exit application
- 'a' - Start/stop data collection (in data collection scripts)
- 'n' - Next letter (in data collection scripts)
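
The key bindings above could be handled roughly as follows. This is a hedged sketch rather than the code in the data collection scripts: `CollectionState` and `handle_key` are hypothetical names, and wrapping from Z back to A is an assumption. The `key` argument is expected to be the integer returned by OpenCV's `cv2.waitKey`, where ESC has the code 27.

```python
import string
from dataclasses import dataclass

ESC_KEY = 27  # key code reported for ESC


@dataclass
class CollectionState:
    letter: str = "A"         # letter currently being recorded
    collecting: bool = False  # whether frames are being saved
    running: bool = True      # set to False to leave the main loop


def handle_key(key, state):
    """Update the state for one key press and return it."""
    key &= 0xFF  # keep the low byte; "no key" (-1) becomes 255 and matches nothing
    if key == ESC_KEY:
        state.running = False
    elif key == ord("a"):
        state.collecting = not state.collecting
    elif key == ord("n"):
        letters = string.ascii_uppercase
        # Assumption: advancing past Z wraps around to A.
        state.letter = letters[(letters.index(state.letter) + 1) % len(letters)]
    return state
```

In a capture loop this would typically be called once per frame, for example `state = handle_key(cv2.waitKey(1), state)`, with the loop ending when `state.running` is false.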
- Hand Detection: Uses CVZone HandTrackingModule to detect hand landmarks
- Feature Extraction: Converts hand landmarks to skeleton representation
- Classification: CNN model predicts the sign language letter
- Post-processing: Applies rules to improve accuracy and handle gestures
- Output: Displays predicted text and converts to speech
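
As an illustration of the feature extraction step, the sketch below turns 21 hand landmarks into the line segments of a skeleton centred on a square canvas. It is a minimal example under stated assumptions, not the project's actual code: the function name `skeleton_segments` and the 400-pixel canvas size are hypothetical, while the connection list follows MediaPipe's 21-point hand landmark layout (0 is the wrist).

```python
# Connections between MediaPipe's 21 hand landmarks.
HAND_CONNECTIONS = [
    (0, 1), (0, 5), (5, 9), (9, 13), (13, 17), (0, 17),  # palm
    (1, 2), (2, 3), (3, 4),                              # thumb
    (5, 6), (6, 7), (7, 8),                              # index finger
    (9, 10), (10, 11), (11, 12),                         # middle finger
    (13, 14), (14, 15), (15, 16),                        # ring finger
    (17, 18), (18, 19), (19, 20),                        # pinky
]


def skeleton_segments(landmarks, canvas_size=400):
    """Centre the hand on a canvas_size x canvas_size canvas and return
    the line segments to draw as ((x1, y1), (x2, y2)) pixel pairs.

    landmarks: sequence of 21 (x, y) integer pixel coordinates.
    canvas_size: assumed canvas edge length in pixels.
    """
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    # Offsets that move the hand's bounding box to the canvas centre.
    off_x = (canvas_size - (max(xs) - min(xs))) // 2 - min(xs)
    off_y = (canvas_size - (max(ys) - min(ys))) // 2 - min(ys)
    shifted = [(x + off_x, y + off_y) for x, y in landmarks]
    return [(shifted[a], shifted[b]) for a, b in HAND_CONNECTIONS]
```

Each returned segment could then be drawn onto a white background image with a line-drawing call such as OpenCV's `cv2.line`, and the resulting image passed to the classifier.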
The CNN model (cnn8grps_rad1_model.h5) is trained on 8 groups of similar gestures:
- Group 0: A, E, M, N, S, T
- Group 1: B, D, F, I, K, R, U, V, W
- Group 2: C, O
- Group 3: G, H
- Group 4: L
- Group 5: P, Q, Z
- Group 6: X
- Group 7: Y, J
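
The grouping above can be written as a lookup table, which makes it easy to see which letters remain as candidates once the CNN has predicted a group. The names `GESTURE_GROUPS`, `LETTER_TO_GROUP`, and `candidates_for` are illustrative only; how the project narrows a group down to a single letter is not shown here.

```python
# Group index -> letters in that group, as listed above.
GESTURE_GROUPS = {
    0: "AEMNST",
    1: "BDFIKRUVW",
    2: "CO",
    3: "GH",
    4: "L",
    5: "PQZ",
    6: "X",
    7: "YJ",
}

# Inverse lookup: letter -> group index.
LETTER_TO_GROUP = {
    letter: group
    for group, letters in GESTURE_GROUPS.items()
    for letter in letters
}


def candidates_for(group_index):
    """Return the letters that a predicted group could stand for."""
    return list(GESTURE_GROUPS[group_index])
```

For instance, a prediction of group 3 leaves G and H as the candidates that later rules would need to tell apart.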
- Ensure good lighting for optimal hand detection
- Keep hand within camera frame
- The model works best with clear, distinct gestures
- Press ESC to exit any running script