This repository contains the experimental code used in the EMNLP 2025 paper "Language Still Left Behind: Toward a Better Multilingual Machine Translation Benchmark". The repository is organized into two parts:
assessment: Data and code related to the manual re-evaluation of FLORES+.
jinghpaw-mt: Data and code related to the Jinghpaw machine translation experiment.
The provided Jinghpaw data in this repository, except for the FLORES+ data, is under the CC-BY-SA-NC (Creative Commons Attribution Share-Alike Non-Commercial) license. If you are using the Jinghpaw machine translation data released in this repository, please cite the following:
@book{kurabe-2020-jinghpaw-reader,
author = {Kurabe, Keita},
title = {Jinghpaw Reader},
publisher = {The Research Institute for Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies},
year = {2020}
}
@book{kurabe-2020-jinghpaw-dictionary,
author = {Kurabe, Keita},
title = {A Dictionary of {J}inghpaw Usage},
publisher = {The Research Institute for Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies},
year = {2020}
}
@book{kurabe-2020-jinghpaw-grammar,
author = {Kurabe, Keita},
title = {An Introduction to {J}inghpaw Grammar},
publisher = {The Research Institute for Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies},
year = {2020}
}
@misc{kurabe-2013-kachin-folktales,
author = {Kurabe, Keita},
title = {Kachin folktales told in {J}inghpaw},
year = {2013},
doi = {10.4225/72/59888e8ab2122}
}
@misc{kurabe-2017-kachin-culture-history,
author = {Kurabe, Keita},
title = {Kachin culture and history told in {J}inghpaw},
year = {2017},
doi = {10.26278/5fa1707c5e77c}
}
If you are using the FLORES+ data, please follow the original license given by FLORES+ (https://huggingface.co/datasets/openlanguagedata/flores_plus) and cite them accordingly.
The citation for the paper itself is to be added.
This material is based upon work supported by the National Science Foundation (NSF) under grants BCS-2109709 and IIS-2137396 and by the Japan Society for the Promotion of Science (JSPS) under KAKENHI grant JP24K03887.