This project implements a gloss translation system using the mBART model. It is designed to translate spoken or written text into corresponding ASL gloss sequences, which are used in sign language modeling and generation systems.
```
text2gloss/
├── data/              # Input CSV file (gloss.csv)
├── models/            # mBART model loader
├── t2g_datasets/      # Custom dataset class
├── training/          # Training loop
├── evaluation/        # Evaluation using BLEU and ROUGE
├── utils/             # Configs and helpers
├── checkpoints/       # Saved models (.pkl)
├── main.py            # Entry point for training and evaluation
├── requirements.txt   # Python dependencies
└── README.md          # Project documentation
```
Make sure you have Python 3.8+. Then run:
```bash
git clone https://github.com/abdullaharifx/text2gloss.git
cd text2gloss

# (Optional) Create a virtual environment
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows

pip install -r requirements.txt
```

Place a `gloss.csv` file inside the `data/` directory with the following columns:
| SENTENCE | GLOSSES |
|---|---|
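As a rough sketch of how such a file can be read and split (the repository's own dataset class may differ; `load_gloss_pairs`, the split ratios, and the seed below are illustrative, not the project's actual API):

```python
import csv
import random

def load_gloss_pairs(path):
    """Read (SENTENCE, GLOSSES) pairs from a gloss.csv with those two columns."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    return [(r["SENTENCE"], r["GLOSSES"]) for r in rows]

def train_val_test_split(pairs, seed=0, val=0.1, test=0.1):
    """Shuffle deterministically and split into train/val/test portions."""
    rng = random.Random(seed)
    pairs = pairs[:]          # copy so the caller's list is untouched
    rng.shuffle(pairs)
    n = len(pairs)
    n_val, n_test = int(n * val), int(n * test)
    return (pairs[n_val + n_test:],      # train
            pairs[:n_val],               # validation
            pairs[n_val:n_val + n_test]) # test
```

A fixed seed keeps the split reproducible across runs, which matters when comparing checkpoint scores.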
```bash
python main.py
```

This will:
- Load and split the dataset
- Fine-tune mBART for gloss translation
- Evaluate using BLEU-4 and ROUGE
- Save checkpoints to `checkpoints/`
The model fine-tunes mBART-large-50 with the Adafactor optimizer for 5 epochs. Evaluation scores (BLEU-4 and ROUGE-L) are printed after training.
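As a rough illustration of what the BLEU-4 score measures over gloss tokens, here is a simplified sentence-level version with add-one smoothing. The project itself uses the Hugging Face `evaluate` library, which differs in details (corpus-level aggregation, smoothing strategy), so treat this as a sketch of the idea rather than the exact metric:

```python
import math
from collections import Counter

def bleu4(reference, hypothesis):
    """Simplified sentence-level BLEU-4 over whitespace-split gloss tokens."""
    ref, hyp = reference.split(), hypothesis.split()
    precisions = []
    for n in range(1, 5):
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
        overlap = sum((hyp_ngrams & ref_ngrams).values())  # clipped n-gram matches
        total = max(sum(hyp_ngrams.values()), 1)
        precisions.append((overlap + 1) / (total + 1))     # add-one smoothing
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / 4)
    bp = min(1.0, math.exp(1 - len(ref) / max(len(hyp), 1)))  # brevity penalty
    return bp * geo_mean

print(round(bleu4("IX-1 LIKE BOOK", "IX-1 LIKE BOOK"), 3))  # → 0.841
```

Even a perfect 3-token match scores below 1.0 here because the sentence has no 4-grams and smoothing penalizes the empty bucket; real implementations handle short sentences with more refined smoothing.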
If you use this work, please consider citing the following paper:
```bibtex
@article{zuo2024spoken2sign,
  title={A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars},
  author={Zuo, Ronglai and Wei, Fangyun and Chen, Zenggui and Mak, Brian and Yang, Jiaolong and Tong, Xin},
  journal={arXiv preprint arXiv:2401.04730},
  year={2024},
  note={Accepted at ECCV 2024},
  url={https://arxiv.org/abs/2401.04730}
}
```
- Hugging Face Transformers
- Hugging Face Evaluate (BLEU, ROUGE)
- PyTorch and its ecosystem
Feel free to open an issue or contact the maintainer: [business.abdullah.arif@gmail.com](mailto:business.abdullah.arif@gmail.com)