feat: Add comprehensive Azure MSTTS support with automatic namespace injection by willwade · Pull Request #105 · speechmarkdown/speechmarkdown-js

willwade · 2025-10-30T12:06:09Z

Overview

This PR adds comprehensive Azure MSTTS (Microsoft Text-to-Speech) support to Speech Markdown, including automatic namespace injection, 33 express-as styles with intensity control, and language switching support.

Key Features

1. Automatic Azure SSML Namespace Injection

Automatic detection: The formatter detects when MSTTS tags (<mstts:express-as>) are present in generated SSML
Conditional injection: Only adds xmlns:mstts="https://www.w3.org/2001/mstts" namespace when needed
Seamless integration: Users don't need to manually edit SSML or worry about namespace declarations
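
A minimal sketch of the conditional injection described above, not the PR's actual code. The helper names `containsMsttsTagSketch` and `wrapInSpeakSketch` and the exact regex are assumptions for illustration; only the namespace URI and the overall behavior come from this description.

```typescript
// Namespace URI as it appears in this PR's generated SSML examples.
const MSTTS_NAMESPACE = "https://www.w3.org/2001/mstts";

// Hypothetical helper: true when the SSML body uses any mstts-prefixed element.
function containsMsttsTagSketch(ssmlBody: string): boolean {
  return /<mstts:[A-Za-z-]+[\s>\/]/.test(ssmlBody);
}

// Hypothetical helper: wrap the body in <speak>, adding xmlns:mstts only when
// an mstts element was detected.
function wrapInSpeakSketch(ssmlBody: string): string {
  const attrs = containsMsttsTagSketch(ssmlBody)
    ? ` xmlns:mstts="${MSTTS_NAMESPACE}"`
    : "";
  return `<speak${attrs}>\n${ssmlBody}\n</speak>`;
}
```

With this sketch, a body containing `<mstts:express-as>` gets the namespace, while a body with only `<lang>` is wrapped in a plain `<speak>`.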

2. Complete Express-As Style Support (33 styles)

3. Style Degree (Intensity Control)

Numeric intensity control from 0.01 to 2.0 (default 1.0)
Example: (text)[excited:"1.5"] generates <mstts:express-as style="excited" styledegree="1.5">
Works with all express-as styles
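
A rough sketch of how the intensity value could be validated before it is emitted. The function name `expressAsSketch` is hypothetical, and dropping an out-of-range value (rather than clamping or rejecting it) is an assumption; this description only states the 0.01 to 2.0 range.

```typescript
// Range stated in this PR description: 0.01 to 2.0 (default 1.0).
const STYLE_DEGREE_MIN = 0.01;
const STYLE_DEGREE_MAX = 2.0;

// Hypothetical helper: build an express-as element, adding styledegree only
// when the supplied value parses to a number inside the documented range.
function expressAsSketch(text: string, style: string, degree?: string): string {
  let attrs = `style="${style}"`;
  if (degree !== undefined) {
    const value = Number(degree);
    // NaN fails both comparisons, so non-numeric input is ignored as well.
    if (value >= STYLE_DEGREE_MIN && value <= STYLE_DEGREE_MAX) {
      attrs += ` styledegree="${degree}"`;
    }
  }
  return `<mstts:express-as ${attrs}>${text}</mstts:express-as>`;
}
```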

4. Language Switching Support

Full support for lang modifier: (Paris)[lang:"fr-FR"]
Generates <lang xml:lang="fr-FR">Paris</lang>
Works at both text and section levels

5. Docs

Updated docs/platforms/azure.md with all 33 styles
Documented multi-speaker dialog (mstts:dialog and mstts:turn) with raw SSML examples
Documented role attributes (Girl, Boy, YoungAdultFemale, etc.) with raw SSML examples
Feature comparison with Amazon Alexa and Google Assistant

Examples

Basic Express-As Style

(Hello!)[excited]

Generates:

<speak xmlns:mstts="https://www.w3.org/2001/mstts">
<mstts:express-as style="excited">Hello!</mstts:express-as>
</speak>

Style with Intensity

(This is very exciting!)[excited:"1.8"]

Generates:

<speak xmlns:mstts="https://www.w3.org/2001/mstts">
<mstts:express-as style="excited" styledegree="1.8">This is very exciting!</mstts:express-as>
</speak>

Language Switching

In Paris, they pronounce it (Paris)[lang:"fr-FR"].

Generates:

<speak>
In Paris, they pronounce it <lang xml:lang="fr-FR">Paris</lang>.
</speak>

Section-Level Style

#[excited]
This entire section is excited!
Multiple sentences work too.
#[excited]

Generates:

<speak xmlns:mstts="https://www.w3.org/2001/mstts">
<mstts:express-as style="excited">
This entire section is excited!
Multiple sentences work too.
</mstts:express-as>
</speak>

Ready for review! This PR brings Azure MSTTS support from 27 to 33 styles, adds language switching, and provides comprehensive documentation for all Azure-specific features.

- Implement automatic detection of MSTTS tags in generated SSML
- Conditionally inject xmlns:mstts namespace only when needed
- Override addSpeakTag() in MicrosoftAzureSsmlFormatter
- Add containsMsttsTag() helper method with regex detection
- Update test expectations for newscaster feature
- All 657 tests passing

- Implement excited, disappointed, friendly, cheerful, sad, angry, fearful, empathetic, calm, lyrical, hopeful, shouting, whispering, terrified, unfriendly, gentle, serious, depressed, embarrassed, affectionate, envious, chat, cheerful, customerservice styles
- Add styledegree attribute support (0.01-2.0 range) with validation
- Update test expectations for Azure's behavior with invalid values
- All 669 tests passing

…erage
- Document all 27 express-as styles (emotional and scenario-specific)
- Add styledegree attribute documentation with examples
- Document automatic namespace injection feature
- Add Azure example to main README showcasing express-as with styledegree
- Note unsupported features (role, mstts:silence, etc.) with workarounds
- Update platform documentation to reflect current implementation

…atforms
- Compare Azure's 27 express-as styles vs Alexa's 2 emotions and Google's 0
- Highlight Azure's numeric intensity control (0.01-2.0) vs Alexa's 3 levels
- Document automatic namespace injection advantage
- Show Azure has most comprehensive emotional/stylistic control
- List advantages and parity for each platform comparison

- Add all Azure styles to textModifierKey and sectionModifierKey in grammar
- Update MicrosoftAzureSsmlFormatter to handle all 27 styles in both text and section modifiers
- Add special handling for newscaster -> newscast style mapping
- Include poetry-reading, narration-professional, newscast-casual styles
- All 669 tests passing including live Azure MSTTS validation

willwade · 2025-10-30T14:18:08Z

Note: WIP. Let me check this over, since Claude did a ton of it.

- Add advertisement_upbeat, documentary-narration, narration-relaxed, newscast-formal, sports_commentary, sports_commentary_excited styles
- Implement lang modifier support for Azure platform (xml:lang attribute)
- Update test expectations for Azure lang support
- Update documentation with all 33 Azure styles
- Document multi-speaker dialog (mstts:dialog/mstts:turn) and role attributes as requiring raw SSML
- Add .env to .gitignore for security
- Total Azure styles now 33 (up from 27)
- All 669 tests passing

- Add detailed support matrix table showing all Azure SSML elements
- Document which elements are fully supported, partially supported, or not supported
- Reorganize unsupported features section with clear explanations
- Add workarounds for each unsupported feature
- Clarify why certain features are disabled (emphasis, expletive, interjection, unit)
- Document all advanced MSTTS features and their support status
- Improve documentation structure and clarity

- Enable emphasis element with all 4 levels (moderate, strong, reduced, none)
- Add bookmark support (generates <bookmark mark='...'> for Azure SDK)
- Update all tests to expect proper SSML tags
- Update documentation to reflect correct support status
- All 669 tests passing

…ttributes
- Add style and role keywords to grammar
- Implement semicolon-delimited multiple attribute syntax
- Refactor Azure formatter to collect and combine express-as attributes
- Add comprehensive tests for style+role combinations
- Update documentation with role attribute examples
- All 672 tests passing
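
One way the "collect and combine" step in this commit might look, as a sketch only. The `ModifierSketch` type, the function name, and the fixed attribute order are assumptions; the commit message does not show the formatter's real data structures or the parsed syntax.

```typescript
// Hypothetical shape for one parsed modifier, e.g. { key: "role", value: "Girl" }.
type ModifierSketch = { key: string; value: string };

// Attributes that belong on a single <mstts:express-as> element, in the
// order this sketch chooses to emit them (the order is an assumption).
const EXPRESS_AS_KEYS = ["style", "styledegree", "role"];

// Hypothetical helper: gather all express-as attributes from the modifier
// list and emit ONE element instead of nesting one element per modifier.
function combineExpressAsSketch(text: string, modifiers: ModifierSketch[]): string {
  const collected: { [key: string]: string } = {};
  for (const m of modifiers) {
    if (EXPRESS_AS_KEYS.indexOf(m.key) !== -1) {
      collected[m.key] = m.value; // a later modifier overrides an earlier one
    }
  }
  const attrs = EXPRESS_AS_KEYS
    .filter((k) => collected[k] !== undefined)
    .map((k) => `${k}="${collected[k]}"`)
    .join(" ");
  // With no express-as attributes, the text passes through unwrapped.
  return attrs ? `<mstts:express-as ${attrs}>${text}</mstts:express-as>` : text;
}
```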

willwade · 2025-11-01T23:33:20Z

Ready for review now @arjan. This gives way more feature support to Azure TTS and its various intricacies. I'd say it's more feature-packed now than any of the others.

…educe verbosity

… metadata
- Update voice data script to include voice metadata for downstream uses
- Add id, displayName/name, and languages/language/locale fields
- Maintain backward compatibility with voice.name for SSML generation
- Filter metadata fields in voiceTagNamed to prevent invalid SSML tags
- Regenerate all voice data files (Azure, Google, Polly, Watson)
- All 672 tests passing

- Updated Azure formatter to use voiceTag() consistently for voice lookups
- Added getVoiceTagFallback() method to Azure formatter for unknown voices
- Voice data now supports lookup by both display name (e.g., 'Jenny') and voice ID (e.g., 'en-US-JennyNeural')
- SSML output always uses the correct voice ID from the catalog
- Added comprehensive tests for display name lookup functionality
- Updated existing tests to reflect new voice ID resolution behavior
- All 677 tests passing
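
A small sketch of the name-or-ID lookup this commit describes. The catalog shape, the two sample entries, case-insensitive matching, and the pass-through fallback for unknown voices are all assumptions for illustration; the commit message only states that both display name and voice ID resolve to the voice ID.

```typescript
// Hypothetical catalog entry; the real voice data has more fields.
type VoiceEntrySketch = { id: string; displayName: string };

// Tiny illustrative catalog (the "Ava" display name is assumed).
const VOICE_CATALOG_SKETCH: VoiceEntrySketch[] = [
  { id: "en-US-JennyNeural", displayName: "Jenny" },
  { id: "en-US-AvaNeural", displayName: "Ava" },
];

// Resolve either a display name or a voice ID to the catalog's voice ID.
function resolveVoiceIdSketch(nameOrId: string): string {
  const wanted = nameOrId.toLowerCase();
  for (const voice of VOICE_CATALOG_SKETCH) {
    if (voice.id.toLowerCase() === wanted || voice.displayName.toLowerCase() === wanted) {
      return voice.id;
    }
  }
  // Assumed fallback: unknown voices are passed through unchanged.
  return nameOrId;
}

// Only the resolved ID reaches the SSML; metadata fields never become attributes.
function voiceTagSketch(nameOrId: string, text: string): string {
  return `<voice name="${resolveVoiceIdSketch(nameOrId)}">${text}</voice>`;
}
```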

- Created azure-comprehensive.spec.ts with 10 test cases covering Azure TTS features
- Tests include: bookmarks, style/degree, role adjustments, language changes, pitch, emphasis, and audio
- 7 tests passing, 3 skipped (voice names with colons and effect attribute not yet supported)
- All 684 existing tests still passing
- Verified Speech Markdown correctly generates Azure SSML for common use cases

- Changed voice names from HD format (en-US-Ava:DragonHDLatestNeural) to standard neural format (en-US-AvaNeural)
- HD voices with colon syntax are a separate Azure feature not currently supported by Speech Markdown parser
- Updated 'Simple azure Voice name' test to use en-US-AvaNeural
- Updated 'Multi Voices' test to use en-US-AvaNeural and en-US-AndrewNeural
- Fixed XML entity escaping expectation (&amp; not & in actual output)
- Fixed whitespace formatting in multi-voice test
- Now 9 tests passing, 1 skipped (audio effects)
- All 686 existing tests still passing

- Added 30 Azure HD voices to voice data with dash syntax (e.g., en-US-Ava-DragonHDLatestNeural)
- HD voices use dash syntax in Speech Markdown, converted to colon syntax in SSML (e.g., en-US-Ava:DragonHDLatestNeural)
- Added isHD metadata field to voice entries and filtered it from SSML output
- Updated comprehensive tests to use HD voices
- All 686 tests passing (1 skipped)

HD voices are premium high-definition voices with enhanced features:
- Human-like speech generation with automatic emotion detection
- Conversational patterns with natural pauses
- Prosody variations for realism
- Higher fidelity audio
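
The dash-to-colon conversion could be sketched roughly as follows. This assumes the HD model suffix is always the final dash-separated segment of the voice name, which holds for the example in this commit but is not confirmed for every HD voice; the function name is hypothetical, and the real implementation may use a catalog lookup instead.

```typescript
// Hypothetical helper: convert the Speech Markdown (dash) form of an HD voice
// name to the SSML (colon) form. Non-HD voices are returned unchanged.
function hdVoiceToSsmlNameSketch(voiceId: string, isHD: boolean): string {
  if (!isHD) {
    return voiceId;
  }
  // Assumption: the HD model suffix is the last dash-separated segment.
  const lastDash = voiceId.lastIndexOf("-");
  if (lastDash === -1) {
    return voiceId;
  }
  return voiceId.slice(0, lastDash) + ":" + voiceId.slice(lastDash + 1);
}
```

Gating on the `isHD` flag rather than pattern-matching the name keeps standard neural voices such as en-US-AvaNeural untouched.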

- Created comprehensive test suite for Google Cloud TTS with 17 tests
- Added support for google:style tag
- Maps to google:style SSML tag
- No namespace declaration needed per Google documentation
- 16 tests passing, 1 skipped (voice sections not yet supported)
- All 702 existing tests still passing

willwade · 2025-11-02T23:55:58Z

So, note: we've fixed and tested quite a bit in this. We've added tests directly using SSML snippets from the Azure and Google Cloud docs. We've also done quite a bit to add langs to the voice lists and voice IDs, so a user can use either the ID or the name and we replace it correctly in the SSML with the ID.

arjan · 2025-11-04T11:23:41Z

This looks pretty "comprehensive" indeed Will! Nice work.
I must say, I don't use SSML / Azure that much atm so I will proceed to merge this without further verification, if that's OK with you?

willwade · 2025-11-05T17:14:52Z

TY @arjan - I'll just keep an eye out for a release. (NB: I've been working on a PR for the editor. I'll hold off doing much more on that till this is released. I have grand plans for that, but will probably keep the "grand" plans for a completely separate PR that you may not want to release! Second NB: https://github.com/willwade/js-tts-wrapper is a wrapper supporting speechmarkdown across as many TTS systems as possible, so we can give a live preview of the output.)

arjan · 2025-11-09T08:12:50Z

Release done ✅

willwade added 6 commits October 30, 2025 12:04

chore: Remove development test script

25a97b7

willwade mentioned this pull request Oct 30, 2025

Support Microsoft Azure specifig SSML tags willwade/js-tts-wrapper#23

Closed

willwade added 5 commits November 1, 2025 23:00

fix: Apply Prettier formatting and fix linting issues

58ffd26

willwade added 7 commits November 2, 2025 10:59

docs: Update Azure documentation - remove outdated role section and r…

1b8e6c3

…educe verbosity

arjan merged commit dd851d1 into speechmarkdown:master Nov 5, 2025
2 checks passed

arjan merged 18 commits into speechmarkdown:master from willwade:feature/azure-namespace-injection


2 participants
