FlowTTS: A next-generation low-latency speech synthesis system with voice cloning and human-like expression capabilities. It naturally presents filler words, emotions, and paralinguistic details, making AI in dialogue scenarios "sound like a real person".
- Ultra-Low Latency: Streaming SSE API with Keep-Alive connection
- Voice Cloning: Create a custom voice by submitting audio samples
- Human-like Expression: Natural filler words, emotions, and paralinguistic details
- Multi-language Support: Chinese/English/Japanese/Korean/Cantonese/Arabic/Indonesian/Thai
| Model | Use Case | Features |
|---|---|---|
| flow_02_turbo | Conversational (Latest) | Ultra-low latency, high quality, supports Chinese/English/Japanese/Korean/Cantonese/Arabic/Indonesian/Thai |
| flow_01_turbo | Conversational | Ultra-low latency, high quality, supports Chinese/English/Japanese/Korean/Cantonese/Arabic/Indonesian/Thai |

Recommended: Pass an empty string `""` for the `Model` field to automatically use the latest model without specifying a version.
FlowTTS is built on the TRTC AI Conversation solution. You need to enable one of the following:
- AI Recognition Package (Lite/Premium)
- TRTC Monthly Plus Plan
Python
cd examples/python
pip install -r requirements.txt

Note: Please ensure you install the latest version of the Tencent Cloud SDK (>=3.0.1200) for full TTS feature support.
Node.js
cd examples/nodejs
npm install

Requires Node.js >= 18.
cp .env.example .env

Edit .env with your Tencent Cloud credentials:
TENCENTCLOUD_SECRET_ID=your_secret_id_here
TENCENTCLOUD_SECRET_KEY=your_secret_key_here
TENCENTCLOUD_SDK_APP_ID=1400000000

Get credentials from the Tencent Cloud Console.
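
Before running an example, it can help to confirm that all three variables are actually set. The following is a minimal sketch, not part of the shipped examples: `load_credentials` is a hypothetical helper, it assumes the values from `.env` have already been exported into the process environment (it does not parse the file itself), and it assumes the SDK app ID is an integer, as the sample value suggests.

```python
import os

# Variable names taken from the .env template above.
REQUIRED_VARS = (
    "TENCENTCLOUD_SECRET_ID",
    "TENCENTCLOUD_SECRET_KEY",
    "TENCENTCLOUD_SDK_APP_ID",
)


def load_credentials(env=None):
    """Return (secret_id, secret_key, sdk_app_id) from an environment mapping.

    Hypothetical helper: raises RuntimeError if a variable is missing or empty,
    or if the app ID is not an integer (an assumption based on the sample value).
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    try:
        sdk_app_id = int(env["TENCENTCLOUD_SDK_APP_ID"])
    except ValueError:
        raise RuntimeError("TENCENTCLOUD_SDK_APP_ID must be an integer") from None
    return env["TENCENTCLOUD_SECRET_ID"], env["TENCENTCLOUD_SECRET_KEY"], sdk_app_id
```

Failing early with the list of missing names is usually easier to diagnose than an authentication error returned by the API.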
# Streaming TTS
python examples/python/example_streaming.py
# Non-streaming TTS
python examples/python/example_non_streaming.py
# Voice cloning
python examples/python/example_voice_clone.py
# WebSocket bidirectional streaming
python examples/python/example_ws_bidirection.py

cd examples/nodejs
# Streaming TTS
node example_streaming.js
# Non-streaming TTS
node example_non_streaming.js
# Voice cloning
node example_voice_clone.js
# WebSocket bidirectional streaming
node example_ws_bidirection.js

# 1. Prepare audio sample (16kHz mono WAV, 10-180 seconds)
cp your_voice.wav test_data/clone_sample.wav
# 2. Clone voice and get voice_id
python examples/python/example_voice_clone.py
# 3. Use the returned voice_id in example_streaming.py for TTS
# Update VOICE_CONFIG["VoiceId"] with the cloned voice_id
python examples/python/example_streaming.py

| Parameter | Range | Description |
|---|---|---|
| Speed | 0.5 ~ 2.0 | Speech speed |
| Volume | 0.01 ~ 10 | Volume level (must be > 0) |
| Pitch | -12 ~ 12 | Pitch adjustment |
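
The ranges in the table above can be checked on the client before a request is sent. This is a minimal sketch under stated assumptions: `PARAM_RANGES` and `validate_voice_params` are hypothetical names, not part of the SDK, and the bounds are treated as inclusive, which the table does not state explicitly.

```python
# Ranges copied from the parameter table; inclusive bounds are an assumption.
PARAM_RANGES = {
    "Speed": (0.5, 2.0),
    "Volume": (0.01, 10.0),
    "Pitch": (-12.0, 12.0),
}


def validate_voice_params(params):
    """Return a list of error messages; an empty list means every value is in range.

    Keys that are not in PARAM_RANGES (for example VoiceId) are ignored.
    """
    errors = []
    for name, value in params.items():
        if name not in PARAM_RANGES:
            continue
        low, high = PARAM_RANGES[name]
        if not (low <= value <= high):
            errors.append(f"{name}={value} is outside {low} ~ {high}")
    return errors
```

For example, `validate_voice_params({"Speed": 1.0, "Volume": 0, "Pitch": 0})` reports one error for `Volume`, since the volume must be greater than 0.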
| API Type | Formats | Sample Rates (Hz) |
|---|---|---|
| Streaming (SSE) | pcm | 16000, 24000 |
| Non-streaming | pcm, wav, mp3 | 16000, 24000 |
Default format: pcm, default sample rate: 24000 Hz
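
Because the streaming API returns raw pcm only, the audio needs a container before most players will open it. The sketch below wraps PCM bytes in a WAV header using Python's standard `wave` module. `pcm_to_wav_bytes` is a hypothetical helper, and the 16-bit mono sample layout is an assumption; this README does not specify the PCM sample format, so check it against the API documentation.

```python
import io
import wave

SUPPORTED_SAMPLE_RATES = (16000, 24000)  # from the table above


def pcm_to_wav_bytes(pcm, sample_rate=24000, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV container and return the WAV file bytes.

    Assumes 16-bit (sample_width=2) mono audio unless told otherwise.
    """
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"Unsupported sample rate: {sample_rate}")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

In a streaming client, the PCM chunks would typically be concatenated first and wrapped once at the end, since the WAV header records the total data length.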
Different APIs use different endpoints:
| API | Endpoint |
|---|---|
| Streaming SSE (TextToSpeechSSE) | trtc.ai.tencentcloudapi.com |
| Non-streaming (TextToSpeech) | trtc.tencentcloudapi.com |
| Voice Clone (VoiceClone) | trtc.tencentcloudapi.com |

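
Since the streaming API uses a different host from the other two, selecting the endpoint by action name helps avoid sending a request to the wrong one. A minimal sketch: `ENDPOINTS` and `endpoint_for` are hypothetical names introduced here for illustration, and the hosts are copied from the table above.

```python
# Action name -> endpoint host, copied from the endpoint table.
ENDPOINTS = {
    "TextToSpeechSSE": "trtc.ai.tencentcloudapi.com",
    "TextToSpeech": "trtc.tencentcloudapi.com",
    "VoiceClone": "trtc.tencentcloudapi.com",
}


def endpoint_for(action):
    """Return the endpoint host for an API action, or raise ValueError if unknown."""
    try:
        return ENDPOINTS[action]
    except KeyError:
        known = ", ".join(sorted(ENDPOINTS))
        raise ValueError(f"Unknown action {action!r}; expected one of: {known}") from None
```

With the Tencent Cloud Python SDK, the returned host would typically be assigned to the `endpoint` attribute of an `HttpProfile` before the client is created.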
The SDK supports HTTP Keep-Alive to reuse TCP connections and reduce latency:
Python
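from tencentcloud.common.profile.http_profile import HttpProfile

http_profile = HttpProfile()
http_profile.keepAlive = True  # Enable Keep-Alive to reuse TCP connections
http_profile.pre_conn_pool_size = 3  # Number of connections to pre-establish

| Parameter | Description |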
|---|---|
| keepAlive | Reuses TCP connections, avoiding repeated handshakes and reducing latency for subsequent requests |
| pre_conn_pool_size | Size of the pre-established connection pool; connections are ready before the first request |

With Keep-Alive enabled, consecutive requests save approximately 50-100 ms of connection establishment time.
Node.js
The Node.js HTTP agent supports connection reuse by default; no additional configuration is needed.
Add the TTS configuration to the TRTC AI Conversation settings under TTSConfig:
{
"TTSType": "flow",
"VoiceId": "your_voice_id",
"Model": "",
"Speed": 1.0,
"Volume": 1.0,
"Pitch": 0,
"Language": "zh"
}

| Language | Code |
|---|---|
| Chinese | zh |
| English | en |
| Japanese | ja |
| Korean | ko |
| Cantonese | yue |
| Arabic | ar |
| Indonesian | id |
| Thai | th |
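
The TTSConfig object and the language codes above can be combined in a small builder that rejects unsupported codes. This is a sketch under stated assumptions: `build_tts_config` is a hypothetical helper, the field names and defaults mirror the JSON example in this README, and `Model` defaults to an empty string so that the latest model is selected.

```python
import json

# Language name -> code, copied from the table above.
LANGUAGE_CODES = {
    "Chinese": "zh",
    "English": "en",
    "Japanese": "ja",
    "Korean": "ko",
    "Cantonese": "yue",
    "Arabic": "ar",
    "Indonesian": "id",
    "Thai": "th",
}


def build_tts_config(voice_id, language="zh", model="", speed=1.0, volume=1.0, pitch=0):
    """Return a TTSConfig dict shaped like the JSON example in this README."""
    if language not in LANGUAGE_CODES.values():
        supported = ", ".join(sorted(LANGUAGE_CODES.values()))
        raise ValueError(f"Unsupported language code {language!r}; expected one of: {supported}")
    return {
        "TTSType": "flow",
        "VoiceId": voice_id,
        "Model": model,  # empty string selects the latest model
        "Speed": speed,
        "Volume": volume,
        "Pitch": pitch,
        "Language": language,
    }


if __name__ == "__main__":
    # Print the config as JSON, ready to paste into the conversation settings.
    print(json.dumps(build_tts_config("your_voice_id", language="en"), indent=2))
```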
MIT License - see LICENSE for details.