Add batching support for Fish Speech S2 Pro #675

Open

lucasnewman wants to merge 2 commits into Blaizzy:main from lucasnewman:fish-speech-s2-pro-batching

Collaborator

lucasnewman commented Apr 24, 2026 (edited)

This gets a >2x RTFx improvement with larger batch sizes on my M5 Max (RTF drops from 0.63 to 0.29):

1x: [done] 1 result segment(s), 4.55s audio, 2.88s wall, RTF=0.63

8x: [done] 8 result segment(s), 69.29s audio, 19.83s wall, RTF=0.29
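
For readers checking the numbers: the logged RTF values are consistent with wall-clock time divided by audio duration, so lower is better. A minimal sketch of that arithmetic, assuming this definition (the helper name `rtf` is made up for illustration and is not part of mlx-audio):

```python
def rtf(audio_seconds, wall_seconds):
    """Real-time factor: wall-clock seconds spent per second of generated audio."""
    return wall_seconds / audio_seconds


# Figures copied from the two log lines above.
single = rtf(audio_seconds=4.55, wall_seconds=2.88)
batched = rtf(audio_seconds=69.29, wall_seconds=19.83)

# The ratio of the two RTFs is the throughput gain from batching.
print(f"1x RTF={single:.2f}, 8x RTF={batched:.2f}, speedup={single / batched:.2f}x")
# prints: 1x RTF=0.63, 8x RTF=0.29, speedup=2.21x
```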


Add batching support for Fish Speech S2 Pro.

198f4c1

lucasnewman requested a review from Blaizzy

April 24, 2026 18:08

Blaizzy reviewed

mlx_audio/tts/models/fish_qwen3_omni/fish_speech.py Outdated

Blaizzy reviewed

mlx_audio/tts/models/fish_qwen3_omni/fish_speech.py

+                      )
+                      mx.eval(normal)
+                      normal_tokens = normal.tolist()

Owner

Blaizzy Apr 24, 2026

This will force the graph to evaluate here and slow generation down.

We want to evaluate once 👌🏽

In mlx-vlm this cost 10-30 tok/s of generation speed.

Collaborator Author

lucasnewman Apr 24, 2026

We're forced to evaluate in this path because we need to validate the actual token ids to know if we need to resample with high temperature, so there's no real way around this.
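
The constraint described here can be sketched in plain Python (deliberately not MLX). The branch that decides whether to resample depends on concrete token ids, so the sampled values have to be materialized before the check; in the diff above that is the `mx.eval(normal)` followed by `normal.tolist()`. Everything below is an assumption for illustration: the `is_valid` predicate, the temperature values, and the retry limit are hypothetical, and the actual validation rule in `fish_speech.py` is not shown in this thread.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Draw one token id from softmax(logits / temperature)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def sample_batch_with_resample(batch_logits, is_valid, rng,
                               normal_temp=0.7, high_temp=1.5, max_retries=8):
    """Sample one token per row, then resample only the rows that fail validation.

    `is_valid`, both temperatures, and `max_retries` are hypothetical.
    """
    tokens = [sample_token(row, normal_temp, rng) for row in batch_logits]
    for i, row in enumerate(batch_logits):
        retries = 0
        # This check needs the actual integer id, which is why a lazily
        # evaluated framework has to synchronize before reaching it.
        while not is_valid(tokens[i]) and retries < max_retries:
            tokens[i] = sample_token(row, high_temp, rng)
            retries += 1
    return tokens
```

A call such as `sample_batch_with_resample(batch_logits, is_valid=lambda t: t != 0, rng=random.Random(0))` would resample any row whose first draw was token `0`.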

Blaizzy requested changes

Owner

Blaizzy left a comment

Overall LGTM!

We just need to adjust the evaluation point and how often we evaluate (avoid calling eval in every iteration of the for loop).

After that we can merge 🚀
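
As a rough illustration of the request above, here is a toy model in plain Python rather than MLX itself. `Lazy` and `evaluate` are hypothetical stand-ins for lazily evaluated arrays and a call like `mx.eval`; the sketch only counts how often evaluation is forced, comparing a sync in every loop iteration with a single sync after the loop.

```python
class Lazy:
    """Toy stand-in for a lazily evaluated array (not MLX's API)."""

    sync_count = 0  # number of times evaluation was forced via evaluate()

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = None
        self._done = False

    def __add__(self, other):
        # Build a new deferred node; nothing is computed yet.
        return Lazy(lambda: self._force() + other._force())

    def _force(self):
        # Compute on first use and cache the result.
        if not self._done:
            self._value = self._thunk()
            self._done = True
        return self._value


def evaluate(*values):
    """Force a group of lazy values at a single synchronization point."""
    Lazy.sync_count += 1
    return [v._force() for v in values]


def run_eval_every_step(steps):
    Lazy.sync_count = 0
    acc, out = Lazy(lambda: 0), []
    for i in range(steps):
        acc = acc + Lazy(lambda i=i: i)
        evaluate(acc)  # one sync per iteration
        out.append(acc)
    return [v._force() for v in out], Lazy.sync_count


def run_eval_once(steps):
    Lazy.sync_count = 0
    acc, out = Lazy(lambda: 0), []
    for i in range(steps):
        acc = acc + Lazy(lambda i=i: i)  # only extends the deferred graph
        out.append(acc)
    return evaluate(*out), Lazy.sync_count  # a single sync after the loop
```

Both functions return the same values; only the number of forced evaluations differs (`steps` versus `1`). In a real lazy framework each forced evaluation is a point where the host waits on the device, which is the cost the review comment is asking to reduce.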


Cleanup eval handling.

0b6e59b

lucasnewman requested a review from Blaizzy

April 25, 2026 16:31
