
Add MiniMax as LLM translation and TTS provider #197

Open
octo-patch wants to merge 1 commit into R3gm:main from octo-patch:feature/add-minimax-provider

Summary

  • Add MiniMax M2.5 / M2.7 as LLM translation providers (sequential & batch modes) via the OpenAI-compatible API at https://api.minimax.io/v1
  • Add MiniMax TTS (speech-2.8-hd) with 12 verified voice presets as a new text-to-speech option
  • Reuse existing gpt_sequential() / gpt_batch() by accepting an optional client parameter, minimizing code duplication
  • Strip <think> tags and markdown code fences from MiniMax model responses before JSON parsing
  • Add MINIMAX_API_KEY environment variable checks in both media and docs conversion flows
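
The think-tag and code-fence stripping described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the helper name `clean_llm_json` and both regular expressions are assumptions made for illustration, and the sample reply is invented.

```python
import json
import re

# Hypothetical helper; the function name and regexes used in the PR may differ.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)
# Matches a reply that is wrapped in a single markdown fence, with an optional language tag.
_FENCE_RE = re.compile(r"^`{3}[\w-]*\s*\n(.*?)\n?`{3}\s*$", re.DOTALL)


def clean_llm_json(raw: str) -> str:
    """Remove <think>...</think> blocks and one wrapping markdown code fence."""
    text = _THINK_RE.sub("", raw).strip()
    match = _FENCE_RE.match(text)
    if match:
        text = match.group(1).strip()
    return text


fence = "`" * 3  # built at runtime so this example does not nest literal fences
raw_reply = (
    "<think>Plan the translation.</think>\n"
    f"{fence}json\n"
    '{"0": "Hola", "1": "Mundo"}\n'
    f"{fence}"
)
parsed = json.loads(clean_llm_json(raw_reply))
print(parsed)
```

Replies that are already plain JSON pass through unchanged, so the same cleaning step can sit in front of `json.loads` regardless of which model produced the text.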

Changes

| File | What |
| --- | --- |
| soni_translate/translate_segments.py | _create_minimax_client(), match-case routing for MiniMax models, think-tag & code-fence stripping |
| soni_translate/text_to_speech.py | segments_minimax_tts() function, regex pattern matching, 7-element return value |
| soni_translate/language_configuration.py | MINIMAX_TTS_MODELS list (12 voices) |
| app_rvc.py | Import TTS models, API key validation, Gradio dropdown integration |
| README.md | MiniMax setup instructions & credits |
| tests/ | 46 tests (18 unit translation + 19 unit TTS + 9 integration) |

Test plan

  • All 46 tests pass (python -m pytest tests/ -v)
  • Set MINIMAX_API_KEY and run integration tests: python -m pytest tests/test_minimax_integration.py -v
  • Select a MiniMax translation model in the Gradio UI and verify end-to-end translation
  • Select a MiniMax TTS voice and verify audio generation

How to test locally

export MINIMAX_API_KEY='your-key'
python -m pytest tests/ -v                           # unit tests (no key needed)
python -m pytest tests/test_minimax_integration.py -v  # integration tests (key needed)

Add MiniMax (MiniMax-M2.5, MiniMax-M2.7) as an alternative cloud provider
for both text translation and text-to-speech, alongside existing OpenAI
and Google Translate providers.

Translation:
- MiniMax-M2.5 and MiniMax-M2.7 models via OpenAI-compatible API
- Sequential and batch translation modes (same as GPT workflow)
- Think-tag and markdown code fence stripping for robust JSON parsing
- Automatic fallback to Google Translate on failure
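
The fallback behavior listed above can be sketched as a small wrapper. This is an illustration under assumptions, not the PR's implementation: `translate_with_fallback` and the two stand-in translators are hypothetical names, and the real code may detect failure differently.

```python
from typing import Callable, List

Translator = Callable[[List[str]], List[str]]


def translate_with_fallback(
    texts: List[str], primary: Translator, fallback: Translator
) -> List[str]:
    """Try the primary translator; use the fallback if it raises or drops segments."""
    try:
        result = primary(texts)
        if len(result) != len(texts):
            raise ValueError("segment count mismatch")
        return result
    except Exception as error:  # broad on purpose: any failure triggers the fallback
        print(f"Primary translator failed ({error}); using fallback")
        return fallback(texts)


# Stand-ins for the MiniMax and Google Translate calls (hypothetical).
def failing_llm(texts: List[str]) -> List[str]:
    raise RuntimeError("simulated API error")


def simple_backup(texts: List[str]) -> List[str]:
    return [f"[backup] {text}" for text in texts]


translated = translate_with_fallback(["Hello", "World"], failing_llm, simple_backup)
```

Checking the segment count as well as exceptions guards against a reply that parses as JSON but is missing entries.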

TTS:
- MiniMax speech-2.8-hd model with 12 voice options
- Includes English, Chinese, and multilingual voices
- Pattern-based TTS provider selection (MiniMax-TTS suffix)
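
Suffix-based provider selection, as described above, might be sketched like this. The regex, the `select_tts_provider` helper, and the sample voice names are assumptions for illustration; only the `MiniMax-TTS` suffix comes from this commit message.

```python
import re
from typing import Tuple

# Assumed dropdown format: "<voice name> MiniMax-TTS". The real entries may differ.
MINIMAX_TTS_PATTERN = re.compile(r"^(?P<voice>.+?)[\s_-]*MiniMax-TTS$")


def select_tts_provider(tts_name: str) -> Tuple[str, str]:
    """Return (provider, voice) based on the suffix of the selected TTS entry."""
    match = MINIMAX_TTS_PATTERN.match(tts_name)
    if match:
        return ("minimax", match.group("voice"))
    return ("other", tts_name)


# Hypothetical voice names, used only to exercise the pattern.
minimax_choice = select_tts_provider("English_Narrator MiniMax-TTS")
other_choice = select_tts_provider("en-US-JennyNeural-Female")
```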

Integration:
- MINIMAX_API_KEY environment variable for authentication
- API key validation before processing
- Gradio UI dropdown includes MiniMax TTS voices
- README documentation for setup
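
The API key validation step could be approximated as follows. This is a sketch under assumptions: `require_minimax_key` is a hypothetical helper, and matching on the substring "minimax" is a simplification of however the PR decides that a MiniMax option was selected.

```python
import os
from typing import Iterable, Mapping, Optional


def require_minimax_key(
    selected_options: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> None:
    """Raise early if a MiniMax option is selected but MINIMAX_API_KEY is unset."""
    env = os.environ if env is None else env
    uses_minimax = any("minimax" in option.lower() for option in selected_options)
    if uses_minimax and not env.get("MINIMAX_API_KEY"):
        raise ValueError(
            "MINIMAX_API_KEY is not set; export it before selecting a MiniMax "
            "translation model or TTS voice."
        )


# No MiniMax option selected, so no key is required.
require_minimax_key(["google_translator"], env={})

# A MiniMax model with a key present passes silently.
require_minimax_key(["MiniMax-M2.5"], env={"MINIMAX_API_KEY": "your-key"})
```

Accepting the environment as a parameter keeps the check testable without touching the real process environment.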

Tests: 37 unit tests + 9 integration tests (46 total, all passing)