raibid-labs/scribe

Scribe

A small Linux/CUDA utility for turning audio files into verbatim Markdown transcripts plus timestamped JSON sidecars. Built around whisper.cpp and ffmpeg. Part of the raibid-labs ecosystem.

What it does

Take an audio file → run it through ffmpeg (resample to 16 kHz mono PCM) → run it through whisper.cpp (CUDA-accelerated on NVIDIA GPUs) → write two files to a configurable output directory:

  • <slug>-<timestamp>.md — verbatim transcript wrapped in YAML frontmatter
  • <slug>-<timestamp>.json — segment-level timestamps for downstream tooling
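The first and last steps of this pipeline can be sketched as below. This is only an illustration, since no implementation exists yet: the function names, the slug rule, and the timestamp format are all assumptions, not Scribe's actual behavior. The ffmpeg options used (`-ar 16000`, `-ac 1`, `-c:a pcm_s16le`) are the usual way to request 16 kHz mono 16-bit PCM.

```python
import re
from datetime import datetime, timezone
from pathlib import Path


def slugify(name: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs into single hyphens.

    Hypothetical slug rule; the README does not define one.
    """
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def ffmpeg_resample_args(src: Path, wav: Path) -> list[str]:
    """Build an ffmpeg argv that converts src to 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",        # overwrite the output file if it exists
        "-i", str(src),        # input audio in any format ffmpeg can decode
        "-ar", "16000",        # resample to 16 kHz
        "-ac", "1",            # downmix to mono
        "-c:a", "pcm_s16le",   # 16-bit little-endian PCM
        str(wav),
    ]


def output_paths(src: Path, out_dir: Path, when: datetime) -> tuple[Path, Path]:
    """Return the (.md, .json) pair sharing one <slug>-<timestamp> base name.

    The compact UTC timestamp format is an assumption.
    """
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    base = f"{slugify(src.stem)}-{stamp}"
    return out_dir / f"{base}.md", out_dir / f"{base}.json"
```

The argv list from `ffmpeg_resample_args` could be passed to `subprocess.run(..., check=True)`, with the resulting WAV then handed to whisper.cpp. Both output paths are derived from a single base name so the transcript and its sidecar stay paired.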

Design contract

  • Verbatim only. No summarization, no paraphrase, no condensation, no LLM in the path. If you wanted Snipd, this isn't it.
  • Single-purpose. Audio in, faithful text out. Annotation, link insertion, search, summarization — those are downstream concerns the user (or another tool) handles.
  • Two-file output. Markdown for humans, JSON for tooling. Same base name, same directory.
  • Configurable. Output directory, model choice, and whisper.cpp flags surface via CLI flags and (eventually) .fsx config.

See docs/01-architecture.md for the full design rationale.
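As a rough illustration of the two-file contract, the sketch below renders one set of segments into both outputs. The frontmatter keys, the segment shape (`start`, `end`, `text`), and the function names are hypothetical; the actual schema is not yet defined.

```python
import json


def render_markdown(meta: dict[str, str], segments: list[dict]) -> str:
    """Wrap the transcript in YAML frontmatter.

    Segment text is passed through as-is apart from trimming surrounding
    whitespace; nothing is summarized or reworded.
    """
    # json.dumps yields a double-quoted string, which is also a valid YAML scalar.
    front = "\n".join(f"{key}: {json.dumps(value)}" for key, value in meta.items())
    body = "\n\n".join(seg["text"].strip() for seg in segments)
    return f"---\n{front}\n---\n\n{body}\n"


def render_sidecar(meta: dict[str, str], segments: list[dict]) -> str:
    """Serialize the same metadata plus segment-level timestamps as JSON."""
    return json.dumps({"meta": meta, "segments": segments}, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    meta = {"source": "podcast.mp3", "model": "base.en"}  # hypothetical fields
    segments = [
        {"start": 0.0, "end": 2.5, "text": " Hello there."},
        {"start": 2.5, "end": 4.0, "text": " Welcome back."},
    ]
    print(render_markdown(meta, segments))
    print(render_sidecar(meta, segments))
```

Rendering both files from the same in-memory segments is one way to keep the human-readable transcript and the machine-readable sidecar from drifting apart.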

Status

Phase 0 — repository init. No implementation yet. See issues for the work plan.

Quick start

(Forthcoming once the implementation lands. The planned invocation looks like this.)

scribe transcribe --input podcast.mp3 --out-dir ~/transcripts

Documentation

  • docs/01-architecture.md: full design rationale
  • docs/03-related-tools.md: related tools

Related raibid-labs projects

  • Scryforge — Fusabi-powered TUI information rolodex (RSS, email, YouTube, Spotify, Reddit, bookmarks). Will eventually invoke Scribe as the engine behind a transcribe-to-vault action on podcast and YouTube items.
  • voice-stuff (legacy, local) — Python push-to-talk dictation prototype. Shares the underlying whisper.cpp build with Scribe. Slated for a Rust rewrite as Murmur (dictation) plus a separate voice-agent project.
  • Phage — context composition engine. Pattern reference for Scribe's eventual .fsx config layer (Fusabi-as-config-DSL).
  • gudpkm n8n stack — the voice_memo workflow has a documented TODO to call Scribe for the audio-upload branch of its webhook.

See docs/03-related-tools.md for the full picture.

License

Dual-licensed under either of:

  • Apache License, Version 2.0 (LICENSE-APACHE)
  • MIT license (LICENSE-MIT)

at your option.
