[RD-567] Add TTS evals #7

Open
dbrkn wants to merge 5 commits into main from berkin/tts-evals

dbrkn (Owner) commented Feb 13, 2026

  1. Adds a speech_generation pipeline that generates audio from text prompts via whisperkit-cli tts, transcribes the output with the WhisperKitPro engine, and computes WER against the original prompt. Includes a text-only dataset, configurable TTS/transcription params, and a registered alias.
  2. Adds a generic --pipeline-config key=value CLI flag for alias-mode overrides (mainly to set speakers and language for TTS generation).
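For reviewers, here is a minimal sketch of how repeated key=value overrides from item 2 could be turned into a config dict. It is an illustration under stated assumptions, not the code in this PR: the function name `parse_pipeline_overrides` and the example keys `speaker` and `language` are hypothetical, and values are kept as plain strings with no type coercion.

```python
def parse_pipeline_overrides(pairs: list[str]) -> dict[str, str]:
    """Parse ["key=value", ...] strings into a dict of overrides.

    Hypothetical helper: the real flag handling in this PR may differ,
    for example by coercing values to int, float, or bool.
    """
    overrides: dict[str, str] = {}
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {pair!r}")
        overrides[key.strip()] = value.strip()
    return overrides


# Example keys are assumptions; the PR only says the flag is mainly
# used to set speakers and language.
example = parse_pipeline_overrides(["speaker=demo", "language=en"])
```

Splitting on the first "=" only keeps the parser usable for values such as paths or query strings that contain "=" themselves.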

Sample command:

export WHISPERKIT_CLI_PATH="/path/to/whisperkit-cli"
export WHISPERKITPRO_CLI_PATH="/path/to/whisperkitpro-cli"
uv run openbench-cli evaluate \
  --pipeline whisperkit-speech-generation \
  --dataset customer-service-tts-prompts-vocalized \
  --metrics wer \
  --verbose
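For context on the wer metric requested in the command above, the following is a minimal sketch of word error rate between the original prompt and the transcript of the generated audio. It assumes lowercasing and whitespace tokenization as the only normalization; the PR does not state which normalization the pipeline applies, and `word_error_rate` is a hypothetical name, not a function from this repository.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of four reference words.
score = word_error_rate("please reset my password", "please reset password")
```

In this pipeline the reference would be the text prompt fed to TTS and the hypothesis would be the transcription of the generated audio, so a low score indicates that the synthesized speech was intelligible to the transcription engine.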

Sample Result:

[Image: Screenshot 2026-02-13 at 9 23 02 PM]

dberkin1 and others added 5 commits February 13, 2026 20:52
Adds a new TTS evaluation pipeline using ElevenLabs' API to generate
audio from text prompts, then transcribes with WhisperKitPro for WER.

Made-with: Cursor

Co-authored-by: dberkin1 <berkin@argmax.com>