Skip to content
15 changes: 10 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# FluidVoice

[![Supported Models](https://img.shields.io/badge/Models-Parakeet%20v3%20%26%20v2%20%7C%20Cohere%20Transcribe%20%7C%20Apple%20Speech%20%7C%20Whisper-blue)](https://github.com/altic-dev/Fluid-oss)
[![Supported Models](https://img.shields.io/badge/Models-Parakeet%20Flash%20%7C%20Parakeet%20v3%20%26%20v2%20%7C%20Cohere%20Transcribe%20%7C%20Apple%20Speech%20%7C%20Whisper-blue)](https://huggingface.co/nvidia/parakeet_realtime_eou_120m-v1)

Fully open source voice-to-text dictation app for macOS with AI enhancement.

Expand All @@ -11,9 +11,9 @@ Fully open source voice-to-text dictation app for macOS with AI enhancement.

## Latest Update

- Added **[Parakeet Flash (Beta)](https://huggingface.co/nvidia/parakeet_realtime_eou_120m-v1)** for low-latency live English dictation on Apple Silicon
- Added **Cohere Transcribe** support for higher-accuracy multilingual dictation on Apple Silicon
- Expanded the offline voice engine lineup with **Parakeet v3/v2, Cohere, Apple Speech, and Whisper**
- Continued improving the multilingual dictation experience for users who switch between languages often
- Expanded the voice engine lineup with **Parakeet Flash, Parakeet v3/v2, Cohere, Apple Speech, and Whisper**

## Star History

Expand Down Expand Up @@ -55,7 +55,7 @@ https://github.com/user-attachments/assets/c57ef6d5-f0a1-4a3f-a121-637533442c24

## Features
- **Live Preview Mode**: Real-time transcription preview in overlay
- **Multiple Speech Models**: Parakeet TDT v3 & v2, Cohere Transcribe, Apple Speech, and Whisper
- **Multiple Speech Models**: Parakeet Flash, Parakeet TDT v3 & v2, Cohere Transcribe, Apple Speech, and Whisper
- **Real-time transcription** with extremely low latency
- **AI enhancement** with OpenAI, Groq, and custom providers
- **Global hotkey** for instant voice capture
Expand All @@ -68,14 +68,19 @@ https://github.com/user-attachments/assets/c57ef6d5-f0a1-4a3f-a121-637533442c24

| Model | Best for | Language support | Download size | Hardware |
| --- | --- | --- | --- | --- |
| [Parakeet Flash (Beta)](https://huggingface.co/nvidia/parakeet_realtime_eou_120m-v1) | Lowest-latency live English dictation | English only | ~250 MB | Apple Silicon |
| Parakeet TDT v3 | Fast default multilingual dictation | [25 languages](#parakeet-tdt-v3-languages) | ~500 MB | Apple Silicon |
| Parakeet TDT v2 | Fastest English-only dictation | [English only](#parakeet-tdt-v2-languages) | ~500 MB | Apple Silicon |
| Cohere Transcribe | High-accuracy multilingual dictation | [14 languages](#cohere-transcribe-languages) | ~1.4 GB | Apple Silicon |
| Apple Speech | Zero-download native macOS speech recognition | [System languages](#apple-speech-languages) | Built-in | Apple Silicon + Intel |
| Whisper Tiny / Base / Small / Medium / Large | Broad compatibility, including Intel Macs | [99 languages](#whisper-language-support) | ~75 MB to ~2.9 GB | Apple Silicon + Intel |

Notes:
Parakeet TDT v3 is the default on Apple Silicon. Cohere is the stronger pick if you want a larger multilingual model with higher displayed accuracy. Whisper remains the fallback for Intel Macs and the widest language coverage.
Parakeet Flash is the best pick when you want words to appear live with the lowest latency. Parakeet TDT v3 remains the default multilingual Apple Silicon model. Cohere is the stronger pick if you want a larger multilingual model with higher displayed accuracy. Whisper remains the fallback for Intel Macs and the widest language coverage.

### Parakeet Flash Languages

English.

### Parakeet TDT v3 Languages

Expand Down
6 changes: 5 additions & 1 deletion Sources/Fluid/Analytics/AnalyticsService.swift
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,10 @@ final class AnalyticsService {

let settings = SettingsStore.shared

let anyDictationShortcutUsesAI =
settings.dictationPromptSelection(for: .primary) != .off ||
settings.dictationPromptSelection(for: .secondary) != .off

var properties: [String: Any] = [
"app_version": version,
"app_build": build,
Expand All @@ -82,7 +86,7 @@ final class AnalyticsService {
"environment": environment,

// Low-cardinality settings snapshot
"ai_processing_enabled": !settings.isDictationPromptOff,
"ai_processing_enabled": anyDictationShortcutUsesAI,
"streaming_preview_enabled": settings.enableStreamingPreview,
"press_and_hold_mode": settings.pressAndHoldMode,
"copy_to_clipboard_enabled": settings.copyTranscriptionToClipboard,
Expand Down
7 changes: 6 additions & 1 deletion Sources/Fluid/AppDelegate.swift
Original file line number Diff line number Diff line change
Expand Up @@ -77,7 +77,9 @@ class AppDelegate: NSObject, NSApplicationDelegate {
}

private func forceFrontOnLaunch() {
for delay in [0.0, 0.12, 0.35] {
// Login-item launches can take longer before SwiftUI's main window exists.
// Keep retrying for a few seconds so the existing ContentView startup path runs.
for delay in [0.0, 0.12, 0.35, 1.0, 2.0, 4.0] {
DispatchQueue.main.asyncAfter(deadline: .now() + delay) { [weak self] in
guard let self else { return }
self.bringMainWindowToFront()
Expand All @@ -95,6 +97,9 @@ class AppDelegate: NSObject, NSApplicationDelegate {
}) {
mainWindow.orderFrontRegardless()
mainWindow.makeKeyAndOrderFront(nil)
DebugLogger.shared.debug("Brought main window to front", source: "AppDelegate")
} else {
DebugLogger.shared.debug("Main window not ready yet during launch-front retry", source: "AppDelegate")
}
}

Expand Down
Loading
Loading