Add ExecuWhisper macOS dictation example (Parakeet + LFM2.5 formatter)#237

Open
seyeong-han wants to merge 8 commits into
meta-pytorch:mainfrom
seyeong-han:execuwhisper-app-with-formatter

Contributor

@seyeong-han seyeong-han commented Apr 30, 2026

Fully On-Device Free Dictation App

clean-reformatting.mp4

What you hear in the clip:

"Uh can we, can we move the meeting uh to Friday, actually no, let's let's make it Monday at 10 in the morning."

What gets pasted:

Can we move the meeting to Friday? Actually, no, let's make it Monday at 10 AM in the morning.

execuwhisper-overlay-dictation.mp4
architecture

Summary

ExecuWhisper is a native macOS dictation app that runs fully on-device using ExecuTorch. ASR runs on NVIDIA Parakeet-TDT (Metal backend), and a fine-tuned LiquidAI LFM2.5-350M cleans up disfluencies, casing, and punctuation (MLX delegate). No cloud, no API keys, no telemetry.

This PR adds the source under execuwhisper/macos/, mirroring the existing voxtral_realtime/macos/ layout.

What's in this diff

  • New execuwhisper/macos/ directory:
    • ExecuWhisper/ — Swift app source.
    • ExecuWhisperTests/ — XCTest target.
    • docs/ — demo script, support runbook, release-QA checklist.
    • scripts/build.sh, create_dmg.sh, sign_release.sh, verify_*, probe_formatter.py, benchmark_helper.py.
    • project.yml — xcodegen spec (no DEVELOPMENT_TEAM hard-coded; user supplies via env var).
    • README.md, CHANGELOG.md, THIRD_PARTY_NOTICES.md, .gitignore.
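
The "user supplies via env var" arrangement for DEVELOPMENT_TEAM could be wired up roughly as follows. This is a sketch under assumptions, not the contents of scripts/build.sh: the helper name xcodebuild_args is hypothetical, failing fast when the variable is unset is a choice made here, and the only details taken from the PR are the DEVELOPMENT_TEAM setting and the ExecuWhisper scheme. It relies on xcodebuild accepting NAME=value build-setting overrides on the command line.

```python
import os


def xcodebuild_args(env=None, scheme="ExecuWhisper"):
    """Build an xcodebuild argument list, reading the signing team from the environment.

    Hypothetical helper for illustration; the real build script may differ.
    """
    env = os.environ if env is None else env
    team = env.get("DEVELOPMENT_TEAM", "").strip()
    if not team:
        # Assumption of this sketch: refuse to build rather than guess a team ID.
        raise RuntimeError("Set DEVELOPMENT_TEAM to your Apple Developer team ID before building.")
    # xcodebuild accepts build-setting overrides as trailing NAME=value arguments.
    return ["xcodebuild", "-scheme", scheme, "build", f"DEVELOPMENT_TEAM={team}"]
```

The returned list can be handed to subprocess.run on a machine with Xcode installed. Keeping the team ID out of project.yml means no contributor's identity is committed to the repository.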

Upstream dependencies (in review)

The helper binaries this app embeds live in three PRs against pytorch/executorch that are still in review. Reviewers/users have two paths:

  • A. Wait for them to land on pytorch/executorch:main and build everything from a single source checkout.
  • B. Use the prebuilt arm64 helpers attached to this PR's GitHub Release and only build the Swift app locally.

| PR | What it adds |
| --- | --- |
| pytorch/executorch#18861 | parakeet_helper (ASR runtime, Metal) + make parakeet-metal |
| pytorch/executorch#19195 | LFM2.5 MLX export pipeline + lfm2_5_350m model class + lfm_2_5-mlx Makefile target |
| pytorch/executorch#19562 | lfm25_formatter_helper (formatter runtime, MLX) + make lfm_2_5_formatter-mlx |

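Models

  • younghan-meta/Parakeet-TDT-ExecuTorch-Metal (ASR runtime)
  • younghan-meta/LFM2.5-350M-ExecuWhisper-Formatter (formatter runtime + fp32)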

Prebuilt binaries

GitHub Release execuwhisper-v0.1.0 on the fork (until this PR merges) ships:

  • lfm25_formatter_helper-arm64-darwin.tar.gz
  • parakeet_helper-arm64-darwin.tar.gz
  • mlx.metallib
  • helpers-arm64-darwin.tar.gz (one-shot bundle of the above)

Helpers are pre-signed with the hardened runtime and the disable-library-validation and allow-dyld-environment-variables entitlements. libomp.dylib (third-party LLVM OpenMP) is not redistributed; users obtain it via brew install libomp. The .app / .dmg is not redistributed either (legal review pending).

Validation

Release-gate eval for the formatter (AMI corpus subset):

| Metric | Value | Gate | Status |
| --- | --- | --- | --- |
| Forbidden | 0.030 | ≤ 0.10 | pass |
| Coverage | 0.874 | ≥ 0.85 | pass |

Verdict: RELEASE-READY
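
The gate logic behind this verdict can be sketched in a few lines. The thresholds (forbidden ≤ 0.10, coverage ≥ 0.85) and the RELEASE-READY label come from the numbers reported here; the names check_release_gate and GateResult and the "BLOCKED" label are hypothetical, and the two metrics are treated as opaque scores because their definitions live in the model card, not in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    forbidden_ok: bool
    coverage_ok: bool

    @property
    def verdict(self):
        # "BLOCKED" is an illustrative label; only RELEASE-READY appears in the PR.
        return "RELEASE-READY" if self.forbidden_ok and self.coverage_ok else "BLOCKED"


def check_release_gate(forbidden, coverage, max_forbidden=0.10, min_coverage=0.85):
    """Compare the two formatter eval metrics against their release thresholds."""
    return GateResult(
        forbidden_ok=forbidden <= max_forbidden,
        coverage_ok=coverage >= min_coverage,
    )


result = check_release_gate(forbidden=0.030, coverage=0.874)
print(result.verdict)
```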

In-app smoke tests: dictation overlay, hotkey, replacements, snippets, session export, long-input chunking, helper warm reuse — all passing on macOS 14.x and 15.x, MacBook Pro M3 / Mac Studio M2.

Known limitations

  • macOS-only; Apple Silicon required (M1+); macOS 14+.
  • Three upstream ExecuTorch PRs still in review (linked above).
  • Formatter occasionally over-summarizes self-corrections ("actually no — make it tomorrow"); see model card for full eval breakdown.
  • 30-word chunker is naive on word boundaries; smarter chunking deferred.
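
To illustrate why a fixed-size chunker is naive, here is a minimal sketch. It assumes the chunker splits on whitespace and cuts every 30 words; the PR does not show the actual implementation, and chunk_words is a hypothetical name.

```python
def chunk_words(text, max_words=30):
    """Split text into consecutive chunks of at most max_words whitespace-separated words.

    Cuts purely by word count, so a chunk can end mid-sentence or mid-clause.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]
```

Because the cut ignores punctuation, a sentence that straddles word 30 is handed to the formatter in two halves. A smarter chunker might prefer the nearest sentence boundary, which is the improvement the PR defers.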

What changed since the previous push of this branch

  • Moved from ExecuWhisper/ (top-level) to execuwhisper/macos/ (mirrors voxtral_realtime/macos/ convention).
  • Dropped the committed .xcodeproj/; added .gitignore and an xcodegen generate build step.
  • Removed hard-coded DEVELOPMENT_TEAM; users supply via env var.
  • Sanitized requirements_et-mlx.txt / requirements_et-metal.txt (removed @ file:///Users/younghan/... editable installs; documented pip install -e <executorch_path> step in README).
  • Updated HF repo references from younghan-meta/LFM2.5-ExecuTorch-MLX to the new dedicated younghan-meta/LFM2.5-350M-ExecuWhisper-Formatter.
  • Removed internal handover docs (FINETUNING_HANDOVER.md, FINETUNING_DATASET_HANDOFF.md).
  • Added THIRD_PARTY_NOTICES.md, ASCII architecture diagram, acknowledgements.
  • Replaced personal-name test fixture in TextPipelineTests.swift with a generic example.
  • Reworded "internal DMG" language across CHANGELOG.md, RELEASE_QA_CHECKLIST.md, and create_dmg.sh.
  • Added BSD copyright headers to probe_formatter.py, sign_release.sh, verify_project_settings.sh, verify_release.sh.
  • Eval results published to the HF model card under eval/.
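
The requirements sanitization above (removing "@ file:///..." installs) lends itself to an automated check. The following is a sketch of such a check, not a script from this PR: find_local_references is a hypothetical name, and the regular expression only targets direct references to local file URLs.

```python
import re

# Matches direct references to a local file, e.g. "pkg @ file:///tmp/example/pkg".
LOCAL_REF = re.compile(r"@\s*file://", re.IGNORECASE)


def find_local_references(requirements_text):
    """Return (line_number, line) pairs for requirement lines that point at local paths."""
    hits = []
    for number, line in enumerate(requirements_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments; only real requirement lines matter.
        if not stripped or stripped.startswith("#"):
            continue
        if LOCAL_REF.search(stripped):
            hits.append((number, stripped))
    return hits
```

Running this over requirements_et-mlx.txt and requirements_et-metal.txt before each push would flag any machine-specific path that slips back in.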

Reviewer guidance

  • Focus on: the Swift app (ExecuWhisper/Services/, ExecuWhisper/Views/), the build wiring (project.yml, scripts/build.sh), and the README's prebuilt-vs-source story.
  • Auto-tested: xcodebuild -scheme ExecuWhisper test exercises the formatter prompt builder, replacement pipeline, session compatibility, and helper bridge reuse.
  • Out of scope for this PR: Windows/Linux ports, DMG distribution (legal review pending), CI for native macOS apps in this repo (no precedent).

meta-cla (bot) added the CLA Signed label Apr 30, 2026. This label is managed by the Meta Open Source bot.
seyeong-han force-pushed the execuwhisper-app-with-formatter branch 2 times, most recently from fdcbab8 to 29caabb, Apr 30, 2026 21:23
seyeong-han force-pushed the execuwhisper-app-with-formatter branch from 7451fc8 to 3b4a4db, May 15, 2026 21:46
seyeong-han changed the title from "Add ExecuWhisper macOS dictation app with LFM2.5 smart formatter" to "Add ExecuWhisper macOS dictation example (Parakeet + LFM2.5 formatter)" May 15, 2026
Add a native macOS dictation app that runs fully on-device using ExecuTorch:
NVIDIA Parakeet-TDT for ASR (Metal backend) plus a fine-tuned LiquidAI
LFM2.5-350M for cleaning up disfluencies, casing, and punctuation (MLX
delegate).

Layout follows the voxtral_realtime/macos/ convention:
  execuwhisper/
    macos/
      ExecuWhisper/        Swift app source
      ExecuWhisperTests/   XCTest target
      docs/                Demo script, support runbook, release QA checklist
      scripts/             Build / DMG / sign / verify / probe scripts
      project.yml          xcodegen spec (no DEVELOPMENT_TEAM hard-coded;
                           supply via env var)
      README.md            Public README with prebuilt + from-source paths
      THIRD_PARTY_NOTICES  Upstream component attribution
      CHANGELOG.md         v0.1.0 initial open-source release notes
      .gitignore           xcodeproj/, build/, DMG, etc.

Models live in two Hugging Face repos:
  younghan-meta/Parakeet-TDT-ExecuTorch-Metal      (ASR runtime)
  younghan-meta/LFM2.5-350M-ExecuWhisper-Formatter (formatter runtime + fp32)

Helper binaries depend on three upstream ExecuTorch PRs in review:
  pytorch/executorch#18861 - parakeet_helper (ASR runtime)
  pytorch/executorch#19195 - LFM2.5 MLX export pipeline
  pytorch/executorch#19562 - lfm25_formatter_helper (formatter runtime)

Until those land, build via the README from-source path or use the
prebuilt arm64 helpers attached to the GitHub Release on this PR.

Eval: AMI release-gate run for the formatter shows forbidden 0.030 (gate
0.10) and coverage 0.874 (gate 0.85). Full eval reports in the formatter
HF repo under eval/.

No telemetry. The only network call is the first-launch model download
from huggingface.co.
seyeong-han force-pushed the execuwhisper-app-with-formatter branch from cc889c3 to aec1d11, May 15, 2026 22:10