Model request: NVIDIA Parakeet TDT 0.6B v3 (multilingual streaming ASR) #7

### Model

[nvidia/parakeet-tdt-0.6b-v3](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3) (NVIDIA Parakeet TDT 0.6B v3, CC-BY-4.0)

### Why this model

The current audio catalog covers Whisper (large-v3 and large-v3-turbo), which is excellent for batch transcription but not streaming-native. Parakeet would complement it well for live use cases:

- **Streaming-friendly architecture.** TDT (Token-and-Duration Transducer) is an RNN-T-style transducer, and transducer decoding is designed for incremental, low-latency output. That suits live captioning much better than sliding-window re-transcription with Whisper.
- **On-device friendly size.** At 0.6B parameters it fits comfortably in memory on iPhone-class devices, leaving headroom for the host app.
- **Multilingual.** v3 supports 25 European languages, including smaller ones (Dutch, in my case) that are poorly served by on-device alternatives.
- **Proven demand on Apple silicon.** Community Core ML ports already exist (e.g. FluidAudio), but a first-party Core AI export recipe, with its ahead-of-time compilation and instant-load benefits, would be a significant step up in reliability.
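
For a rough sense of the size point above, here is a back-of-the-envelope sketch in Python. It assumes the nominal 0.6B parameter count from the model name and counts weights only; activations, decoder state, and runtime overhead are not included. The precision names are illustrative, and I am not claiming that any particular quantized export exists.

```python
# Back-of-the-envelope weight-memory estimate for a 0.6B-parameter model.
# Assumption: nominal parameter count taken from the model name; the real
# checkpoint may differ slightly.
NOMINAL_PARAMS = 0.6e9

# Bytes needed to store one parameter at each (illustrative) precision.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def estimate_table(params: float = NOMINAL_PARAMS) -> dict:
    """Map each precision name to its estimated weight size in GB."""
    return {name: weight_gb(params, b) for name, b in BYTES_PER_PARAM.items()}


if __name__ == "__main__":
    for precision, gb in estimate_table().items():
        print(f"{precision:>8}: {gb:.1f} GB")
```

At 16-bit precision this works out to about 1.2 GB of weights, which is the figure I have in mind when saying the model leaves headroom for the host app.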

### Use case

I build a live-captioning app for deaf and hard-of-hearing users (real-time, fully on-device, privacy-sensitive). A streaming-capable multilingual ASR model in the Core AI catalog would directly improve accessibility apps in this category.

Thanks for considering it.

