Bug Report: Language Code Incompatibility with WhisperLiveKit
Summary
NLLW does not recognize zh as a valid language code for Chinese, causing integration issues with WhisperLiveKit.
Environment
- NLLW version: 0.1.4
- WhisperLiveKit: latest
- Python: 3.10+
Steps to Reproduce
- Install WhisperLiveKit and NLLW:
  pip install whisperlivekit nllw
- Run the translation server:
  python -m whisperlivekit.basic_server --lan zh --target-language eng_Latn
- Observe the error.
Expected Behavior
The server should start successfully and translate Chinese speech to English.
Actual Behavior
ValueError: Unknown language identifier: zh
Root Cause
WhisperLiveKit passes the --lan parameter to both Whisper and NLLW:
- Whisper expects zh for Chinese
- NLLW only accepts zh-CN for Chinese
This creates a conflict where no single value works for both systems:
| Parameter | Whisper | NLLW |
| --- | --- | --- |
| --lan zh | ✅ Works | ❌ Error |
| --lan zh-CN | ❌ Error | ✅ Works |
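The failure can be illustrated with a minimal sketch. It assumes NLLW resolves identifiers by exact match against a single language_code string per entry; the function name lookup_exact and the one-entry table are hypothetical stand-ins, not NLLW's actual code. Only the entry shape and the error message are taken from this report.

```python
# Hypothetical, simplified stand-in for a language table with one code per entry.
LANGUAGES = [
    {"name": "Chinese (Simplified)", "nllb": "zho_Hans", "language_code": "zh-CN"},
]


def lookup_exact(identifier):
    """Return the entry whose language_code equals identifier exactly."""
    for entry in LANGUAGES:
        if entry["language_code"] == identifier:
            return entry
    # Assumed to mirror the error reported above.
    raise ValueError(f"Unknown language identifier: {identifier}")


# "zh-CN" resolves, but Whisper's "zh" has no exact match and raises.
entry = lookup_exact("zh-CN")
```

Under this assumption, the value WhisperLiveKit forwards for Whisper (zh) can never match the entry, which is consistent with the error shown under Actual Behavior.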
Proposed Solution
Support multiple language codes per entry by allowing language_code to be a list:
# Before
{"name": "Chinese (Simplified)", "nllb": "zho_Hans", "language_code": "zh-CN"}
# After
{"name": "Chinese (Simplified)", "nllb": "zho_Hans", "language_code": ["zh-CN", "zh"]}
This approach:
- Maintains backward compatibility (zh-CN still works)
- Adds support for Whisper's zh code
- Avoids duplicate entries in the language list
- Is extensible for other languages with similar issues
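A rough sketch of how a lookup could consume the list form is shown here. The names build_code_index and resolve are hypothetical and not part of NLLW; the sketch also accepts a plain string so that entries not yet converted to a list keep working.

```python
# Hypothetical table using the proposed list form for language_code.
LANGUAGES = [
    {"name": "Chinese (Simplified)", "nllb": "zho_Hans", "language_code": ["zh-CN", "zh"]},
]


def build_code_index(languages):
    """Map every accepted code to its entry; accepts a str or a list of str."""
    index = {}
    for entry in languages:
        codes = entry["language_code"]
        if isinstance(codes, str):
            codes = [codes]  # legacy single-code entries still work
        for code in codes:
            if code in index:
                # Two entries claiming the same code would make lookups ambiguous.
                raise ValueError(f"Duplicate language code: {code}")
            index[code] = entry
    return index


CODE_INDEX = build_code_index(LANGUAGES)


def resolve(identifier):
    """Return the entry for identifier, or raise ValueError if it is unknown."""
    try:
        return CODE_INDEX[identifier]
    except KeyError:
        raise ValueError(f"Unknown language identifier: {identifier}") from None
```

With this index, resolve("zh") and resolve("zh-CN") return the same entry, so a single --lan value could satisfy both Whisper and NLLW. Building the index once also surfaces accidental code collisions between entries at load time.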
Workaround
There is currently no workaround for NLLW translation mode with WhisperLiveKit when the source language is Chinese.
The only alternative is to bypass NLLW and use Whisper's direct English translation mode, which supports English as the only target language:
python -m whisperlivekit.basic_server --lan zh --direct-english-translation