Ship-42 Local Voice Studio (Qwen3-TTS Desktop, Electron + MLX)

An open, local AI solution by Ship-42.

Local-first desktop app with a minimal ElevenLabs-like workflow:

Studio for free text + PDF jobs
Automatic language detection
Streaming playback via chunk events
PDF reader with word-level highlight
Voice library with encrypted local storage (AES-GCM)
MP3 export (192k)
MP4 export (1080p30) with karaoke word highlighting
Dedicated model download controls in Settings
No cloud login required

Stack

Electron (main/preload)
React + TypeScript + Vite (renderer)
Python FastAPI service (localhost)
Queue worker with concurrency=1
FFmpeg-based export pipeline
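
The queue worker with concurrency=1 means jobs are processed strictly one at a time, in submission order. The snippet below is a minimal sketch of that idea using asyncio. It is not the code in python_service/app/manager.py; the names run_jobs and fake_synthesize are hypothetical and exist only for illustration.

```python
import asyncio


async def run_jobs(jobs, synthesize):
    """Process jobs one at a time, in submission order (hypothetical helper)."""
    queue: asyncio.Queue = asyncio.Queue()
    for job in jobs:
        queue.put_nowait(job)

    results = []

    async def worker():
        # A single worker task means at most one job runs at any moment.
        while True:
            job = await queue.get()
            try:
                results.append(await synthesize(job))
            except Exception as exc:
                # A failed job is recorded and must not stop the worker.
                results.append(exc)
            finally:
                queue.task_done()

    task = asyncio.create_task(worker())
    await queue.join()   # wait until every queued job has been handled
    task.cancel()        # stop the now idle worker
    await asyncio.gather(task, return_exceptions=True)
    return results


async def fake_synthesize(job):
    # Stand-in for real synthesis; assumption for this sketch only.
    await asyncio.sleep(0)
    return f"done:{job}"


if __name__ == "__main__":
    print(asyncio.run(run_jobs(["intro", "chapter-1"], fake_synthesize)))
```

A single consumer keeps memory use predictable, which matters when each job loads a model of this size.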

Model Strategy

By default, each model ID maps to an 8-bit repo from mlx-community:

base -> mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit
customvoice -> mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit
voicedesign -> mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-8bit

On first use, the backend attempts to download the model into the local cache automatically.
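
The default mapping can be written as a plain lookup table. This is a sketch, not the code in python_service/app/tts_engine.py: resolve_repo and its overrides parameter are hypothetical, and whether the app exposes overrides in this form is an assumption.

```python
from __future__ import annotations

# Default model ID -> Hugging Face repo, as listed in this README.
DEFAULT_MODEL_REPOS = {
    "base": "mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit",
    "customvoice": "mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit",
    "voicedesign": "mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-8bit",
}


def resolve_repo(model_id: str, overrides: dict[str, str] | None = None) -> str:
    """Return the repo for a model ID; overrides (hypothetical) win over defaults."""
    repos = {**DEFAULT_MODEL_REPOS, **(overrides or {})}
    key = model_id.strip().lower()
    if key not in repos:
        raise ValueError(f"unknown model id: {model_id!r}")
    return repos[key]


if __name__ == "__main__":
    print(resolve_repo("customvoice"))
```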

Important Runtime Note

MLX is the primary runtime. The app verifies that mlx-audio supports qwen3_tts.

If compatible: synthesis runs on MLX.
If incompatible and fallback is disabled (default): jobs fail with a fix hint.
If incompatible and fallback is enabled in Settings: jobs fall back to macOS say.
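
The three cases above reduce to a small decision function. The sketch below only mirrors the rules stated in this README; BackendChoice, choose_backend, and the error text are hypothetical names, not the app's actual API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BackendChoice:
    backend: str | None        # "mlx", "say", or None when the job must fail
    error: str | None = None   # fix hint shown to the user on failure


def choose_backend(mlx_supports_qwen3: bool, fallback_enabled: bool = False) -> BackendChoice:
    """Pick the synthesis backend; fallback is disabled by default."""
    if mlx_supports_qwen3:
        return BackendChoice("mlx")
    if fallback_enabled:
        # macOS `say` is used only when the user opted in via Settings.
        return BackendChoice("say")
    return BackendChoice(
        None,
        "qwen3_tts not supported by mlx-audio; reinstall the pinned MLX runtime packages",
    )


if __name__ == "__main__":
    print(choose_backend(False))
```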

If your FFmpeg build does not include ass subtitle filters, MP4 export falls back to an image-based karaoke renderer using the same word timeline.
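
To see which MP4 path your FFmpeg build is likely to take, you can look for the ass filter in the output of ffmpeg -filters. The helper below is a sketch of such a check under the assumption that the filter name appears as the second column of each listing row; it is not necessarily how the app detects the filter, and both function names are hypothetical.

```python
import shutil
import subprocess


def has_ass_filter(filters_output: str) -> bool:
    """Return True if the `ffmpeg -filters` listing contains the ass filter."""
    for line in filters_output.splitlines():
        parts = line.split()
        # Filter rows look like: "<flags> <name> <in->out> <description>".
        if len(parts) >= 2 and parts[1] == "ass":
            return True
    return False


def ffmpeg_supports_ass(ffmpeg: str = "ffmpeg") -> bool:
    """Run ffmpeg and check for the ass filter; False if ffmpeg is missing."""
    if shutil.which(ffmpeg) is None:
        return False
    proc = subprocess.run(
        [ffmpeg, "-hide_banner", "-filters"],
        capture_output=True,
        text=True,
        check=False,
    )
    return has_ass_filter(proc.stdout)


if __name__ == "__main__":
    print("ass filter available:", ffmpeg_supports_ass())
```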

Qwen3 runtime support in mlx-audio follows the upstream implementation; see the qwen3_tts README in the Blaizzy mlx-audio repository.

Local-only Behavior

No external TTS inference API is used.
Synthesis runs locally on your Mac (MLX).
Hugging Face is used only for model file downloads (first run / missing cache).
After models are downloaded, generation is offline-first.

For Voice Clone, add a Reference text in the Voices page when possible. This avoids automatic STT transcription downloads during cloning.

Runtime packages (pinned)

Install pinned MLX runtime packages from this repo:

source runtime/.venv/bin/activate
pip uninstall -y mlx-lm mlx-audio
pip install --upgrade --force-reinstall -r python_service/requirements-mlx.txt

mlx-lm is intentionally removed for this runtime because it currently conflicts with the mlx-audio Qwen3 dependency set.

Word alignment uses local WhisperX forced alignment. Alignment models are stored in runtime/models/whisperx and can be preloaded from Settings. If runtime/.venv-align is missing, the Alignment model (WhisperX) download action will try to bootstrap that runtime automatically. The alignment worker loads whisperx.alignment and whisperx.audio only (no VAD/diarization path).

Prerequisites

macOS ARM64 (Apple Silicon)
Node.js >= 20
Python >= 3.11
ffmpeg + ffprobe

Install

# Run from the current project folder
npm install
python3 -m venv --clear runtime/.venv
source runtime/.venv/bin/activate
pip install -U pip
pip install -r python_service/requirements.txt
python -m pip uninstall -y mlx-lm mlx-audio
python -m pip install --upgrade --force-reinstall -r python_service/requirements-mlx.txt
python -m pip check
python -c "import importlib.metadata as m; print('mlx-audio', m.version('mlx-audio'))"
python -c "import pkgutil, mlx_audio.tts.models as mm; print('qwen3_tts' in [mod.name for mod in pkgutil.iter_modules(mm.__path__)])"
deactivate

python3 -m venv --clear runtime/.venv-align
source runtime/.venv-align/bin/activate
pip install -U pip
pip install --upgrade --force-reinstall -r python_service/requirements-align.txt
deactivate

If you had older experiments in the same .venv, keep --clear to avoid stale dependency conflicts.

Electron prefers runtime/.venv/bin/python3 automatically (falls back to python3 if missing). Alignment uses runtime/.venv-align/bin/python3.
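
The interpreter preference for the main runtime can be illustrated as follows. The real lookup happens in the Electron main process; this Python version is only a sketch of the same rule, and pick_python is a hypothetical name.

```python
from pathlib import Path


def pick_python(project_root, fallback: str = "python3") -> str:
    """Prefer runtime/.venv/bin/python3; fall back to python3 on PATH if missing."""
    candidate = Path(project_root) / "runtime" / ".venv" / "bin" / "python3"
    return str(candidate) if candidate.is_file() else fallback


if __name__ == "__main__":
    print(pick_python("."))
```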

Self-contained storage

Runtime data is stored inside this repository folder:

runtime/models (model cache/downloads, including WhisperX alignment models)
runtime/outputs (jobs, assets, exports, voices)
runtime/config (local app config + encrypted voice-secret blob)
runtime/tmp (temporary render/synthesis files)

Sandbox rule: everything is intentionally kept inside the current project folder.
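
One way to picture the sandbox rule is a path check that rejects anything resolving outside the project folder. This is a sketch under that assumption, not the app's own enforcement code; runtime_dirs and is_inside_project are hypothetical helpers.

```python
from __future__ import annotations

from pathlib import Path

# Subfolders of runtime/ as listed in this README.
RUNTIME_SUBDIRS = ("models", "outputs", "config", "tmp")


def runtime_dirs(project_root) -> dict[str, Path]:
    """Return the runtime data folders for a given project root."""
    root = Path(project_root).resolve()
    return {name: root / "runtime" / name for name in RUNTIME_SUBDIRS}


def is_inside_project(path, project_root) -> bool:
    """True if path resolves to the project root or a location beneath it."""
    root = Path(project_root).resolve()
    target = Path(path).resolve()
    return target == root or root in target.parents


if __name__ == "__main__":
    for name, folder in runtime_dirs(".").items():
        print(name, folder, is_inside_project(folder, "."))
```

Resolving both paths first means a path containing `..` segments is judged by where it actually points.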

Run (development)

npm run dev

This starts:

Vite renderer on http://127.0.0.1:5173 (or the next free local port)
Electron desktop shell
Python API service on http://127.0.0.1:8765 (spawned by Electron main process)
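
If you script against the local service, it can help to wait until the port accepts connections before sending requests. The following is a minimal sketch using only the standard library; wait_for_service is a hypothetical helper, and the default port is taken from the address above.

```python
import socket
import time


def wait_for_service(host: str = "127.0.0.1", port: int = 8765,
                     timeout: float = 10.0, interval: float = 0.2) -> bool:
    """Poll until a TCP connection succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or refused); retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    print("service reachable:", wait_for_service(timeout=2.0))
```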

Run (built app)

npm run build
npm run start

npm run build only builds the renderer. It does not launch the app by itself.

API (local service)

POST /v1/jobs/text
POST /v1/jobs/pdf
GET /v1/jobs
GET /v1/jobs/{jobId}
GET /v1/jobs/{jobId}/events (SSE)
GET /v1/assets/{assetId}
GET /v1/voices
POST /v1/voices
PATCH /v1/voices/{voiceId}
DELETE /v1/voices/{voiceId}
POST /v1/voices/preview
GET /v1/runtime
POST /v1/jobs/{jobId}/language
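
The events endpoint streams Server-Sent Events. Below is a sketch of a small client-side parser built on the standard library. The event name and JSON payload in the example are assumptions, since this README does not document the event schema; iter_sse_events and stream_job_events are hypothetical helpers.

```python
import urllib.request


def iter_sse_events(lines):
    """Yield (event, data) tuples from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def stream_job_events(job_id: str, base_url: str = "http://127.0.0.1:8765"):
    """Connect to GET /v1/jobs/{jobId}/events and yield parsed events."""
    url = f"{base_url}/v1/jobs/{job_id}/events"
    with urllib.request.urlopen(url) as resp:
        yield from iter_sse_events(line.decode("utf-8") for line in resp)
```

With the app running, `for event, data in stream_job_events(job_id): print(event, data)` would print events as they arrive, where job_id comes from a previously created job.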

MLX Runtime Verify / Troubleshooting

Open Settings and click Verify runtime.
If you see qwen3_tts not supported:

# Run from the current project folder
source runtime/.venv/bin/activate
python -m pip uninstall -y mlx-lm mlx-audio
python -m pip install --upgrade --force-reinstall -r python_service/requirements-mlx.txt
python -c "import importlib.metadata as m; print('mlx-audio', m.version('mlx-audio'))"
python -c "import pkgutil, mlx_audio.tts.models as mm; print('qwen3_tts' in [mod.name for mod in pkgutil.iter_modules(mm.__path__)])"

Restart npm run dev and verify runtime again.
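
The two one-liners above can also be run as a small script that does not crash when mlx-audio is missing. This is a sketch of the same check, not the app's Verify runtime implementation; has_submodule and installed_version are hypothetical helper names.

```python
from __future__ import annotations

import importlib
import importlib.metadata
import pkgutil


def has_submodule(package: str, name: str) -> bool:
    """True if `package` is importable and contains a submodule called `name`."""
    try:
        pkg = importlib.import_module(package)
    except ImportError:
        return False
    path = getattr(pkg, "__path__", None)
    if path is None:
        return False  # a plain module has no submodules
    return any(mod.name == name for mod in pkgutil.iter_modules(path))


def installed_version(dist: str) -> str | None:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return importlib.metadata.version(dist)
    except importlib.metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("mlx-audio", installed_version("mlx-audio"))
    print("qwen3_tts supported:", has_submodule("mlx_audio.tts.models", "qwen3_tts"))
```

Run it with the runtime/.venv interpreter so it inspects the same environment the app uses.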

Alignment runtime troubleshooting

If alignment fails because the WhisperX runtime is missing:

# Run from the current project folder
source runtime/.venv-align/bin/activate
python -m pip install --upgrade --force-reinstall -r python_service/requirements-align.txt
python -m pip check

Then open Settings and download Alignment model (WhisperX). If it still fails, use the exact Alignment reason / Probe error shown in Settings runtime status for diagnosis.

You can also trigger this from Settings directly: the app attempts to prepare runtime/.venv-align and then downloads WhisperX alignment models.

UI Pages

Studio
PDF Reader
Voice Clone
Voice Design
Exports
Settings

Tests

source runtime/.venv/bin/activate
pytest python_service/tests

GitHub metadata (suggested)

Owner: Ship-42
Name: local-voice-studio (or your preferred repo name)
Description: Local-first Text-to-Speech Studio for Apple Silicon (Electron + MLX + Qwen3 + WhisperX). Voice clone, voice design, PDF reader, MP3/MP4 karaoke export.
Topics: local-ai, text-to-speech, qwen3, mlx, whisperx, electron, apple-silicon, pdf, karaoke, voice-clone
Website (optional): link to your Ship-42 profile or docs page

Publish checklist

# Run in this project folder
npm run build
source runtime/.venv/bin/activate
pytest python_service/tests

Then create/push your GitHub repo and make sure local runtime data is not committed (runtime/, local venvs, caches, outputs are ignored by .gitignore).

License

MIT. See LICENSE.

Security

Voice reference files are encrypted at rest with AES-GCM.
The encryption key is generated by Electron and stored using safeStorage when available.

Project Layout

electron/main.cjs Electron lifecycle, secure IPC, Python service launcher
electron/preload.cjs context bridge for renderer
src/ React renderer pages/components/state
python_service/app/main.py FastAPI entry
python_service/app/manager.py queue worker and job orchestration
python_service/app/tts_engine.py model handling + synthesis backend adapter
python_service/app/exporters.py mp3/mp4/alignment exports

