Add batching support for Fish Speech S2 Pro #675

Open
lucasnewman wants to merge 2 commits into Blaizzy:main from lucasnewman:fish-speech-s2-pro-batching

Collaborator

@lucasnewman lucasnewman commented Apr 24, 2026

This gives a >2x improvement in RTFx when using larger batch sizes on my M5 Max:

1x: [done] 1 result segment(s), 4.55s audio, 2.88s wall, RTF=0.63

8x: [done] 8 result segment(s), 69.29s audio, 19.83s wall, RTF=0.29
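The arithmetic behind the ">2x" figure can be checked directly from the two log lines. The sketch below assumes RTF is wall time divided by audio duration (the logged values are consistent with that) and that RTFx is its inverse; the function names are illustrative and not taken from the PR.

```python
def rtf(audio_s: float, wall_s: float) -> float:
    """Real-time factor: seconds of compute per second of audio (lower is better)."""
    return wall_s / audio_s


def rtfx(audio_s: float, wall_s: float) -> float:
    """Inverse real-time factor: seconds of audio per second of compute (higher is better)."""
    return audio_s / wall_s


# (audio seconds, wall seconds) copied from the two log lines above.
single = (4.55, 2.88)
batch_8 = (69.29, 19.83)

# Ratio of throughputs between the batched run and the single run.
speedup = rtfx(*batch_8) / rtfx(*single)

print(f"1x: RTF={rtf(*single):.2f}, RTFx={rtfx(*single):.2f}")
print(f"8x: RTF={rtf(*batch_8):.2f}, RTFx={rtfx(*batch_8):.2f}")
print(f"throughput gain: {speedup:.2f}x")
```

Under those assumptions the batch of 8 comes out at roughly 2.2x the throughput of the single run.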

@lucasnewman lucasnewman requested a review from Blaizzy April 24, 2026 18:08
Comment thread mlx_audio/tts/models/fish_qwen3_omni/fish_speech.py Outdated
)
mx.eval(normal)

normal_tokens = normal.tolist()
Owner

This will force the graph to evaluate and slow generation down.

We want to evaluate once 👌🏽

In mlx-vlm this cost 10-30 tok/s of generation speed.

Collaborator Author

We're forced to evaluate in this path because we need to validate the actual token ids to know if we need to resample with high temperature, so there's no real way around this.
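To illustrate why this path is data-dependent, here is a minimal pure-Python sketch of a validate-then-resample step. It is not the PR's code: `sample`, `sample_with_fallback`, the `is_valid` predicate, and both temperature values are hypothetical stand-ins. The point is only that the branch depends on the concrete token id, so the sampled value has to be materialized before the decision can be made.

```python
import math
import random


def sample(logits, temperature, rng):
    """Draw one token id from softmax(logits / temperature)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def sample_with_fallback(logits, is_valid, rng, normal_temp=0.7, high_temp=1.5):
    """Sample at the normal temperature; resample hotter if the token is rejected.

    Returns (token_id, resampled). The temperatures and the validity rule
    are assumptions made for this illustration.
    """
    token = sample(logits, normal_temp, rng)
    # This check needs the actual integer token id, which is why a lazy
    # framework would have to evaluate the sampled array at this point.
    if is_valid(token):
        return token, False
    return sample(logits, high_temp, rng), True


rng = random.Random(0)
print(sample_with_fallback([2.0, 1.0, 0.5], lambda t: t != 0, rng))
```

In a batched setting the same check can in principle be done for all rows of a step after a single evaluation, rather than once per row, which keeps the number of forced evaluations per step constant.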

Owner

@Blaizzy Blaizzy left a comment

Overall LGTM!

We just need to adjust the evaluation point and how often we evaluate (avoid calling eval in every iteration of the for loop).

After that we can merge 🚀

@lucasnewman lucasnewman requested a review from Blaizzy April 25, 2026 16:31