Document audio models using Mimi neural codec#1

Draft
Copilot wants to merge 2 commits into main from copilot/check-audio-models-moshi-inheritance

Copilot AI commented Feb 11, 2026

This PR documents an investigation of where Mimi (Kyutai's neural audio codec) is used across the audio models in transformers.

Findings

Direct Mimi Integration:

  • Moshi: Uses Mimi as default audio encoder (model_type="mimi")
  • CSM: Uses MimiModel as codec in conversion pipeline

Architecture Inheritance:

  • KyutaiSpeechToText: Inherits MoshiPreTrainedModel, MoshiModel; uses MimiConv1dPaddingCache
  • Qwen3OmniMoe: Inherits MimiLayerScale for code2wav component
  • VibeVoiceAcousticTokenizer: Inherits MimiConv1dPaddingCache for causal convolution
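
One way to reproduce findings like these is to scan the model source files for imports from the Mimi and Moshi modules. The following is a minimal sketch, not the method used for the report: the function name `find_mimi_imports` is hypothetical, and it assumes a layout with one folder per model (as in `src/transformers/models`) and that reuse shows up as `from ...mimi.<module> import ...` or `from ...moshi.<module> import ...` statements.

```python
import re
from pathlib import Path

# Matches single-line and parenthesized imports from a mimi/moshi module, e.g.
#   from ..mimi.modeling_mimi import MimiLayerScale
#   from ..moshi.modeling_moshi import (MoshiModel, MoshiPreTrainedModel)
IMPORT_RE = re.compile(
    r"from\s+[\w.]*\b(?:mimi|moshi)\.\w+\s+import\s+(\([^)]*\)|[^\n]+)"
)


def find_mimi_imports(models_dir):
    """Map each model folder name to the sorted Mimi/Moshi names it imports.

    Hypothetical helper: assumes one sub-folder per model under models_dir.
    """
    results = {}
    for path in sorted(Path(models_dir).rglob("*.py")):
        model = path.relative_to(models_dir).parts[0]
        if model in ("mimi", "moshi"):
            continue  # skip the defining packages themselves
        text = path.read_text(encoding="utf-8", errors="ignore")
        for match in IMPORT_RE.finditer(text):
            # Keep only identifiers that look like Mimi*/Moshi* classes.
            names = re.findall(r"\b(?:Mimi|Moshi)\w+", match.group(1))
            if names:
                results.setdefault(model, set()).update(names)
    return {model: sorted(names) for model, names in results.items()}
```

A scan like this only detects import-based reuse. Code duplicated via "Copied from" comments, or a codec loaded only inside a conversion script, would need a separate text search.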

Reusable Components

MimiConv1dPaddingCache → KyutaiSpeechToText, VibeVoiceAcousticTokenizer
MimiLayerScale → Qwen3OmniMoe  
MimiSdpaAttention → Copied to Moshi (with replacements)
MimiModel → Moshi (audio encoder), CSM (codec)

Impact: 5 of 25+ audio models use Mimi/Moshi components. KyutaiSpeechToText shows the deepest integration, with full architecture inheritance.
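
The count of five can be cross-checked by inverting the component mapping listed under Reusable Components. The sketch below hard-codes that mapping exactly as written in this description; the names `COMPONENT_USERS` and `invert_component_map` are illustrative and not part of transformers.

```python
# Component -> models that use it, transcribed from the list in this description.
COMPONENT_USERS = {
    "MimiConv1dPaddingCache": ["KyutaiSpeechToText", "VibeVoiceAcousticTokenizer"],
    "MimiLayerScale": ["Qwen3OmniMoe"],
    "MimiSdpaAttention": ["Moshi"],
    "MimiModel": ["Moshi", "CSM"],
}


def invert_component_map(component_users):
    """Return model -> list of Mimi components it uses (hypothetical helper)."""
    usage = {}
    for component, models in component_users.items():
        for model in models:
            usage.setdefault(model, []).append(component)
    return usage


usage = invert_component_map(COMPONENT_USERS)
print(len(usage))  # number of distinct models in the mapping
for model, components in sorted(usage.items()):
    print(f"{model}: {', '.join(components)}")
```

The mapping covers only the Mimi components listed here. The Moshi classes that KyutaiSpeechToText inherits (MoshiPreTrainedModel, MoshiModel) are not included in it.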

Report: MIMI_MOSHI_AUDIO_MODELS_INVESTIGATION.md

Original prompt

I want you to check the audio models in transformers and figure out which models use or inherit Mimi from Moshi.


Co-authored-by: MHRDYN7 <113298714+MHRDYN7@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Investigate audio models using or inheriting from mimi in transformers" to "Document audio models using Mimi neural codec" Feb 11, 2026
Copilot AI requested a review from MHRDYN7 February 11, 2026 13:29