Applaud is a free, private, and open-source audio transcription and summarization tool. It takes audio recordings and produces a transcription, along with common AI use-cases like flashcards, questions, and summaries.
The goal is to provide a self-hosted alternative to https://www.plaud.ai/. Some people don't like the idea of uploading their audio recordings to a third-party service. With Applaud, you can host it yourself and keep your data private. Applaud even supports local LLM models using Ollama.
NOTE: This is not intended to be hosted on a public server. It is designed to be run locally on your own machine. There is no authentication or authorization built into the application.
- Automatically syncs audio recordings (e.g. iCloud Drive, Google Drive, etc.)
- Transcribes audio recordings into a JSON file using `insanely-fast-whisper`, with full CUDA and MPS (Apple Silicon) support
- Summarizes the transcript using the models of your choice
- Generates flashcards, questions, and answers from the transcript
✅ OpenAI (.env `OPENAI_API_KEY`)
✅ Anthropic (.env `ANTHROPIC_API_KEY`)
✅ Google (.env `GOOGLE_API_KEY`)
✅ Ollama (.env `OLLAMA_BASE_URL` and `OLLAMA_API_KEY`)
✅ OpenRouter (.env `OPENROUTER_API_KEY`)
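Each provider is enabled by its key in the project's `.env` file. A minimal sketch (the placeholder values are not real keys; set only the providers you actually use):

```shell
# LLM provider credentials -- fill in only the ones you need
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
OPENROUTER_API_KEY=sk-or-...

# Ollama (local models); the API key usually doesn't matter unless
# your Ollama instance is secured
OLLAMA_BASE_URL=http://localhost:11434/v1
OLLAMA_API_KEY=ollama
```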
Once the frontend and backend are deployed, visiting http://localhost:3000 for the first time will prompt you to set up your LLM provider. You can change models at any time by selecting the cog icon in the top right corner. If you wish to use Ollama, there are a few additional steps to take.
- Install `ollama` and start the server: `ollama serve`
- Create a 120k context window model:
  - `ollama pull llama3.2`
  - `ollama show --modelfile llama3.2 > Modelfile`
  - Set `PARAMETER num_ctx 120000` in the Modelfile
  - `ollama create -f Modelfile llama3.2-120k`
- Configure the `.env` `OLLAMA_BASE_URL` to point to your Ollama instance (typically `http://localhost:11434/v1` unless you're hosting it on an external server)
- Configure the `.env` `OLLAMA_API_KEY` to your Ollama API key (usually doesn't matter, but if you have secured your Ollama instance, you may need to set this)
- Visit the frontend and choose `ollama` as the provider and `llama3.2-120k` as the model
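After editing, the Modelfile looks roughly like this (the `FROM` line and any other parameters come from the `ollama show --modelfile` output, so the exact contents vary by model; only the `num_ctx` line is added by hand):

```
FROM llama3.2
PARAMETER num_ctx 120000
```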
This has been tested on macOS. Linux should also work. Windows support is not guaranteed.
- `docker compose`
- `python@3.11`
- `ffmpeg`
- Clone the repository: `git clone https://github.com/landoncrabtree/applaud.git && cd applaud`
- Prepare the environment variables: `cp .env.example .env`
- Refer to `watcher/README.md` for instructions on how to set up and start the watcher service
- Modify `.env` with any API keys for different LLM providers
- Start the frontend and backend services: `docker compose up`
  - The frontend will be available at `http://localhost:3000`
  - The backend will be available at `http://localhost:8080`
Using `insanely-fast-whisper` with `distil-whisper/large-v2` and `pyannote/speaker-diarization-3.1` on an M2 Max MacBook Pro (32GB RAM, 30 GPU cores):
| Audio Duration | File Size | Transcription Time | Segmentation Time (speaker diarization) |
|---|---|---|---|
| 54:56 | 67.6MB | 09:48 | 02:36 |
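From the table, a quick back-of-the-envelope calculation puts end-to-end throughput at roughly 4.4x real time (transcription alone is about 5.6x):

```python
# Real-time factors derived from the benchmark row above.
def to_seconds(mmss: str) -> int:
    """Convert an MM:SS string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

audio = to_seconds("54:56")          # 3296 s of audio
transcription = to_seconds("09:48")  # 588 s to transcribe
diarization = to_seconds("02:36")    # 156 s to diarize

print(round(audio / transcription, 1))                  # 5.6x (transcription only)
print(round(audio / (transcription + diarization), 1))  # 4.4x (end to end)
```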
- Fix watcher sometimes duplicating files
- UI/UX improvements
- Add more AI tools
- Chat side panel with conversation history per transcript
- Change all hardcoded references to `localhost:8080` to be dynamic
- Fix delete foreign key constraint (just delete all references first, then delete the transcript)





