feat(tts): add Microsoft Edge TTS provider by hagope · Pull Request #168 · grinev/opencode-telegram-bot

hagope · 2026-06-21T22:50:45Z

Summary

Adds Microsoft Edge TTS as a fourth TTS provider (TTS_PROVIDER=edge). It uses Microsoft Edge's online Read Aloud service via WebSocket — no API key, account, or TTS_API_URL required; only outbound HTTPS/WebSocket access to speech.platform.bing.com.

Implementation ports the protocol from the Python edge-tts reference.

How it works

DRM token (Sec-MS-GEC): SHA256 of Windows file-time ticks (rounded to 5 min) + trusted client token. Includes a single 403 retry that corrects clock skew from the server's Date header.
SSML framing: prosody rate/volume/pitch, with 4096-byte UTF-8-safe chunking that never splits multi-byte characters or XML entities.
WebSocket protocol: binary audio frames use a 2-byte big-endian header-length prefix (length value includes the trailing \r\n), so audio starts at offset 2 + headerLength; text frames signal turn.end per chunk.
Output format: audio-24khz-48kbitrate-mono-mp3.
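
The Sec-MS-GEC derivation described above might look roughly like the following. This is a minimal sketch, not the code in src/app/services/edge-tts.ts: the function names and the `clockSkewSeconds` parameter are hypothetical, and rounding down plus uppercase hex output are assumptions modeled on the Python edge-tts reference. The trusted client token is passed in as an argument rather than hard-coded.

```typescript
import { createHash } from "node:crypto";

// Seconds between 1601-01-01 (Windows file-time epoch) and 1970-01-01 (Unix epoch).
const WINDOWS_EPOCH_OFFSET_SECONDS = 11644473600;
// Tokens are assumed to be valid for a 5-minute window.
const WINDOW_SECONDS = 300;

// Returns the rounded file time as a decimal string of 100 ns ticks.
// One second is 10,000,000 ticks, so appending seven zeros to the whole
// seconds avoids exceeding Number.MAX_SAFE_INTEGER.
export function roundedFileTimeTicks(unixSeconds: number): string {
  const seconds = Math.floor(unixSeconds) + WINDOWS_EPOCH_OFFSET_SECONDS;
  const rounded = seconds - (seconds % WINDOW_SECONDS);
  return String(rounded) + "0000000";
}

// Hypothetical helper: SHA-256 over "<ticks><trustedClientToken>".
// clockSkewSeconds would be set after a 403, from the server's Date header.
export function secMsGecSketch(
  trustedClientToken: string,
  unixSeconds: number,
  clockSkewSeconds = 0,
): string {
  const ticks = roundedFileTimeTicks(unixSeconds + clockSkewSeconds);
  return createHash("sha256")
    .update(ticks + trustedClientToken, "ascii")
    .digest("hex")
    .toUpperCase();
}
```

On a 403, a caller could compute `clockSkewSeconds` as the difference between the server's Date header and the local clock, then retry once with the corrected value.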
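
The UTF-8-safe chunking mentioned above can be illustrated as follows. This is a simplified sketch with a hypothetical function name (`splitUtf8Safe`), assuming the input text is already XML-escaped, so every `&` starts an entity that ends in `;`. Unlike the Python reference, it does not try to prefer whitespace boundaries.

```typescript
// Splits text into chunks of at most maxBytes UTF-8 bytes without cutting
// through a multi-byte character or an XML entity such as "&amp;".
export function splitUtf8Safe(text: string, maxBytes = 4096): string[] {
  const bytes = Buffer.from(text, "utf8");
  const chunks: string[] = [];
  let start = 0;

  while (bytes.length - start > maxBytes) {
    let end = start + maxBytes;

    // If the next chunk would begin on a continuation byte (10xxxxxx),
    // back up until the cut sits on a character boundary.
    while (end > start && (bytes.readUInt8(end) & 0xc0) === 0x80) {
      end--;
    }

    // If the last '&' in the chunk has no ';' before the cut, the entity
    // would be split, so cut just before the '&' instead.
    const amp = bytes.lastIndexOf(0x26, end - 1);
    if (amp > start) {
      const semi = bytes.indexOf(0x3b, amp);
      if (semi === -1 || semi >= end) {
        end = amp;
      }
    }

    if (end <= start) {
      throw new Error("maxBytes is too small to make progress");
    }
    chunks.push(bytes.subarray(start, end).toString("utf8"));
    start = end;
  }

  if (start < bytes.length) {
    chunks.push(bytes.subarray(start).toString("utf8"));
  }
  return chunks;
}
```

An entity longer than `maxBytes` cannot be kept whole by this sketch; with the 4096-byte limit from the PR that case should not arise for ordinary escaped text.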

Changes

New: src/app/services/edge-tts.ts + tests/app/services/edge-tts.test.ts
Wire 'edge' provider into src/config.ts and src/app/services/tts-service.ts
isTtsConfigured() returns true for edge (no credentials needed)
Document provider in .env.example and PRODUCT.md
Adds ws as a runtime dependency (Node 20+ target; the global WebSocket is only stable from Node 22+)

Configuration

TTS_PROVIDER=edge
TTS_VOICE=en-US-EmmaMultilingualNeural
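
For illustration, the "no credentials needed" behavior could be checked along these lines. This is a hypothetical sketch, not the actual isTtsConfigured() in src/app/services/tts-service.ts: the `TtsEnvSketch` shape, the `TTS_API_KEY` name, and the rule for other providers are assumptions made only to show the edge special case.

```typescript
// Hypothetical view of the relevant environment variables.
interface TtsEnvSketch {
  TTS_PROVIDER?: string;
  TTS_VOICE?: string;
  TTS_API_URL?: string;
  TTS_API_KEY?: string; // assumed name, not confirmed by the PR
}

// Edge needs no URL or key, so selecting it is enough to count as configured.
// Other providers are assumed here to need both a URL and a key.
export function isTtsConfiguredSketch(env: TtsEnvSketch): boolean {
  if (env.TTS_PROVIDER === "edge") {
    return true;
  }
  return Boolean(env.TTS_API_URL && env.TTS_API_KEY);
}
```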

Verification

npm run build ✓
npm run lint ✓ (zero warnings)
npm test — all new tests pass (16 in edge-tts.test.ts + 4 edge cases in tts-service.test.ts)
- Note: 1 pre-existing test failure in tests/config.test.ts (env-var leakage of TTS_MODEL) confirmed to also fail on main before this PR
Verified end-to-end against the real Edge TTS service: produced a valid 2.5s MP3 (48 kbps, 24 kHz mono)

Adds a fourth TTS provider that uses Microsoft Edge's online Read Aloud service via WebSocket. No API key, account, or TTS_API_URL required; only outbound HTTPS/WebSocket access to speech.platform.bing.com.

Implementation ports the protocol from the Python edge-tts reference (https://github.com/rany2/edge-tts):

- Sec-MS-GEC DRM token: SHA256 of Windows file-time ticks (rounded to 5 min) + trusted client token, with single 403 retry that corrects clock skew from the server Date header
- SSML framing with prosody rate/volume/pitch and 4096-byte UTF-8-safe chunking that never splits multi-byte chars or XML entities
- WebSocket message handling: binary frames use a 2-byte big-endian header-length prefix (length includes the trailing CRLF), audio starts at offset 2 + headerLength; text frames signal turn.end per chunk

Adds ws as a runtime dependency (Node 20+ target; global WebSocket is only stable from Node 22+).

- New: src/app/services/edge-tts.ts + tests
- Wire 'edge' provider into config.ts and tts-service.ts
- isTtsConfigured() returns true for edge (no credentials needed)
- Document provider in .env.example and PRODUCT.md
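
The binary frame layout described in this commit message can be sketched as follows. The function and type names are hypothetical, and the `Key:value` header lines separated by CRLF are an assumption based on the Python reference; only the 2-byte big-endian prefix and the `2 + headerLength` audio offset come from the description above.

```typescript
export interface EdgeAudioFrameSketch {
  headers: Record<string, string>;
  audio: Buffer;
}

// Parses one binary WebSocket message into its header block and audio payload.
export function parseBinaryFrame(frame: Buffer): EdgeAudioFrameSketch {
  if (frame.length < 2) {
    throw new Error("frame too short for the 2-byte length prefix");
  }

  // The prefix counts the header bytes, including the trailing \r\n.
  const headerLength = frame.readUInt16BE(0);
  const audioStart = 2 + headerLength;
  if (audioStart > frame.length) {
    throw new Error("header length exceeds frame size");
  }

  // Assumed header format: "Key:value" lines separated by \r\n.
  const headers: Record<string, string> = {};
  const headerText = frame.subarray(2, audioStart).toString("utf8");
  for (const line of headerText.split("\r\n")) {
    const sep = line.indexOf(":");
    if (sep > 0) {
      headers[line.slice(0, sep)] = line.slice(sep + 1).trim();
    }
  }

  // Everything after the header block is MP3 data for this chunk.
  return { headers, audio: frame.subarray(audioStart) };
}
```

A receiver would append each frame's `audio` to an output buffer until a text frame signals turn.end for the current chunk.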

hagope wants to merge 1 commit into grinev:main from hagope:feat/edge-tts-provider
