Cosyvoice2 GRPO RL training recipe #1463

@yuekaizhang

Hi, we would like to share a GRPO RL training recipe based on the CosyVoice2 LLM.

Recipe: https://github.com/nvidia-china-sae/mair-hub/tree/main/rl-tutorial/cosyvoice_llm

Here are the initial training results:

| Model | Seed-TTS test_zh CER | CosyVoice3 zero_shot_zh | Comment |
| --- | --- | --- | --- |
| Official CosyVoice2 LLM | 1.45% | 4.08% | See the paper |
| + GRPO | 1.37% | 3.36% | |
| SFT (initialized from Qwen2-0.5B-Instruct) | 1.81% | 4.83% | See PR #1887 |
| + GRPO | 1.06% | 4.03% | |
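
To make the size of the gains easier to compare across rows, here is a minimal sketch that computes the relative error reduction from the numbers in the table. It is not part of the recipe. The function name `relative_reduction` and the dictionary layout are illustrative, and it assumes both columns report an error rate in percent, with each "+ GRPO" row paired with the model listed directly above it.

```python
# Error rates (in percent) copied from the results table.
# Each tuple is (before GRPO, after GRPO). The pairing of each "+ GRPO" row
# with the row above it is an assumption based on the table layout.
results = {
    "Official CosyVoice2 LLM": {
        "Seed-TTS test_zh": (1.45, 1.37),
        "CosyVoice3 zero_shot_zh": (4.08, 3.36),
    },
    "SFT (initialized from Qwen2-0.5B-Instruct)": {
        "Seed-TTS test_zh": (1.81, 1.06),
        "CosyVoice3 zero_shot_zh": (4.83, 4.03),
    },
}


def relative_reduction(before: float, after: float) -> float:
    """Return the relative reduction of an error rate, in percent."""
    return (before - after) / before * 100.0


for model, test_sets in results.items():
    for test_set, (before, after) in test_sets.items():
        gain = relative_reduction(before, after)
        print(f"{model} | {test_set}: {before:.2f}% -> {after:.2f}% "
              f"({gain:.1f}% relative reduction)")
```

On these numbers, GRPO gives roughly a 5.5% and 17.6% relative reduction for the official LLM, and roughly 41.4% and 16.6% for the SFT model.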

We will add more experimental results as we continue refining the recipe.
