eliran17e/VideoEditor

Highlight Extractor

A local CLI that turns a source video into a single concatenated highlight reel in the source's original orientation (16:9). Pick the moments manually (type timestamp ranges) or automatically (transcribe the audio, then let Gemini rank the most engaging moments).

How it works (staged pipeline)

Each stage is a single-task module: it reads one input artifact and writes one output artifact, knowing nothing about the other stages. The pivot artifact is segments.json — an ordered list of Segment objects. Whether those segments are typed by a human or ranked by AI, everything downstream is identical.
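As a rough illustration of the pivot artifact, a Segment and its JSON round trip might look like the sketch below. The field names (start, end, label) and the helper names are assumptions made for the example; the actual definitions live in models.py and may differ.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Segment:
    start: float     # seconds from the start of the source (assumed field name)
    end: float       # seconds from the start of the source (assumed field name)
    label: str = ""  # optional description, e.g. the AI's label (assumed field name)


def segments_to_json(segments: list[Segment]) -> str:
    """Serialize an ordered list of segments to a JSON array."""
    return json.dumps([asdict(s) for s in segments], indent=2)


def segments_from_json(text: str) -> list[Segment]:
    """Rebuild the ordered list of segments from a JSON array."""
    return [Segment(**item) for item in json.loads(text)]


def save_segments(segments: list[Segment], path: Path) -> None:
    path.write_text(segments_to_json(segments), encoding="utf-8")


def load_segments(path: Path) -> list[Segment]:
    return segments_from_json(path.read_text(encoding="utf-8"))
```

Because both the manual and the auto selectors would emit the same structure, the clip and concat stages never need to know where the segments came from.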

Stage              Status     What it does
probe                         video → metadata (duration, resolution, fps)
extract_audio                 video → 16 kHz mono wav
transcribe                    wav → transcript.json (Groq Whisper)
audio_energy       🔲 stub    wav → peaks.json
build_candidates   🔲 stub    merge signals → candidates.json
select                        → segments.json (manual ranges or Gemini auto)
clip                          video + segments.json → individual clips
concat                        clips → reel.mp4
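The ffmpeg-backed stages (extract_audio, clip, concat) can be sketched as small functions that only build argument lists. This is a minimal sketch, not the project's actual wrappers: the function names are hypothetical, and the exact flag choices are assumptions chosen to match the parameters stated in this README (16 kHz mono wav, libx264 / yuv420p, AAC 44.1 kHz stereo, concat demuxer with stream copy).

```python
from pathlib import Path


def extract_audio_cmd(video: Path, wav: Path) -> list[str]:
    # -vn drops the video stream, -ac 1 downmixes to mono, -ar 16000 resamples to 16 kHz.
    return ["ffmpeg", "-y", "-i", str(video), "-vn",
            "-ac", "1", "-ar", "16000", "-c:a", "pcm_s16le", str(wav)]


def clip_cmd(video: Path, start: float, end: float, out: Path,
             width: int, height: int, fps: str) -> list[str]:
    # Re-encode (no stream copy) so the cut does not have to land on a keyframe.
    return ["ffmpeg", "-y", "-ss", f"{start:.3f}", "-i", str(video),
            "-t", f"{end - start:.3f}",
            "-vf", f"scale={width}:{height},fps={fps}",
            "-c:v", "libx264", "-pix_fmt", "yuv420p",
            "-c:a", "aac", "-ar", "44100", "-ac", "2", str(out)]


def concat_list(clips: list[Path]) -> str:
    # Contents of the list file read by the concat demuxer; single quotes are escaped.
    lines = []
    for clip in clips:
        escaped = str(clip).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def concat_cmd(list_file: Path, out: Path) -> list[str]:
    # Stream copy is acceptable here only because every clip was encoded identically.
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", str(list_file), "-c", "copy", str(out)]
```

Each list can be passed to subprocess.run(cmd, check=True). Keeping command construction separate from execution makes the stages easy to inspect and test without touching a real video.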

scene_detect, vertical_reframe, external_context, and music are future layers and are not present yet.

Requirements

  • Python 3.11+
  • ffmpeg and ffprobe on your PATH (checked at startup with a clear error if missing).
  • Software encoding only (libx264). No GPU encoders.
  • API keys (auto mode only): a Groq key for transcription and a Gemini key for selection. Manual mode needs neither.
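A startup check along these lines would produce the clear error mentioned above. It is a sketch using only the standard library; the function names and the error wording are assumptions, not the project's actual code.

```python
import shutil


def missing_tools(tools: tuple[str, ...] = ("ffmpeg", "ffprobe")) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def require_tools(tools: tuple[str, ...] = ("ffmpeg", "ffprobe")) -> None:
    """Exit early with a readable message instead of failing mid-render."""
    missing = missing_tools(tools)
    if missing:
        raise SystemExit(
            f"Missing required tool(s) on PATH: {', '.join(missing)}. "
            "Install ffmpeg (which also provides ffprobe) and try again."
        )
```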

Setup

cd highlight-extractor
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

For auto mode, copy .env.example to .env and fill in your keys:

GROQ_API_KEY=...
GEMINI_API_KEY=...

.env is gitignored — keys never get committed.

Run

python cli.py "C:\path\to\source.mp4"

You can also omit the path and you'll be prompted for it. The first question is what to make:

  • Highlight reel — cut clips (manual or AI), optionally subtitled (below).
  • Subtitle a whole video — generate learning subtitles over the entire video. Outputs a .ass sidecar by default (load it in VLC/mpv, no re-encode, toggleable) or burns it in. Romaji-only is fully offline/free.

For a highlight reel, the remaining questions are:

  1. Source video path (or pass it as the CLI arg above)
  2. Selection mode: manual or auto
       • (manual) Ranges, e.g. 00:30-00:45, 01:10-01:25, or one per line. Accepts MM:SS-MM:SS and HH:MM:SS-HH:MM:SS.
       • (auto) How many highlights, an optional steer for the AI (e.g. "focus on funny moments"), and a max seconds per highlight (0 = no cap).
  3. Padding seconds added around each cut (default 0.5)
  4. Subtitles? yes/no — burn Japanese-learning subtitles onto the reel. If yes, you're also asked whether the source already shows its own subtitles; if so, ours move to the top and show only Japanese + romaji (so they don't collide with the source's bottom subs).
  5. Output filename (default reel.mp4, written next to the source)
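Parsing the manual ranges could look roughly like the sketch below. It accepts the two documented formats, separated by commas or newlines; the function names and error messages are assumptions, and the real parser may be stricter or more lenient.

```python
import re


def parse_timestamp(text: str) -> float:
    """Convert MM:SS or HH:MM:SS to seconds."""
    parts = text.strip().split(":")
    if len(parts) not in (2, 3) or not all(p.isdecimal() for p in parts):
        raise ValueError(f"expected MM:SS or HH:MM:SS, got {text!r}")
    seconds = 0
    for part in parts:
        seconds = seconds * 60 + int(part)
    return float(seconds)


def parse_ranges(text: str) -> list[tuple[float, float]]:
    """Parse 'A-B, C-D' or one range per line into (start, end) pairs in seconds."""
    ranges = []
    for chunk in re.split(r"[,\n]+", text):
        chunk = chunk.strip()
        if not chunk:
            continue
        start_text, sep, end_text = chunk.partition("-")
        if not sep:
            raise ValueError(f"expected START-END, got {chunk!r}")
        ranges.append((parse_timestamp(start_text), parse_timestamp(end_text)))
    return ranges
```

For example, parse_ranges("00:30-00:45, 01:10-01:25") yields [(30.0, 45.0), (70.0, 85.0)]. Empty or reversed ranges are deliberately left for the validation step to handle.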

Subtitles (Japanese-learning aid)

Built for studying Japanese. Romaji is always shown, spaced by word, not syllable (kudasai, not ku da sai), so you can hear where words begin and end. You choose which other lines appear:

  • Romaji + English
  • Japanese + romaji + English
  • Japanese + romaji
  • Romaji only

The English line only applies to Japanese audio. For audio in other languages, the Japanese line is a translation of what is said (so you learn how to say it in Japanese), shown with its romaji.

How it's produced:

  • Romaji is generated offline (Janome word-segmentation + pykakasi), so the romaji line is free, instant, unlimited, and never depends on the network — ideal for subtitling whole episodes. Romaji-only on Japanese audio needs no API at all.
  • Translation uses an LLM with fallback: Gemini first, then Groq (llama-3.3-70b-versatile) if Gemini is overloaded — so a Gemini 503 spike won't stop you. Both use keys you already have.
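The provider fallback described above can be sketched independently of any SDK. Here each provider is a plain callable supplied by the caller; the function name, the provider tuples, and the choice to fall through on any exception are assumptions for illustration, not the tool's actual error handling.

```python
from typing import Callable, Sequence

Provider = tuple[str, Callable[[str], str]]


def translate_with_fallback(text: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, translation) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(text)
        except Exception as exc:  # e.g. an overloaded-service error from the first provider
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all translation providers failed: " + "; ".join(errors))
```

In practice the list would hold a Gemini-backed callable first and a Groq-backed one second, so a temporary failure of the first provider only costs one extra request.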

For clean output, use a raw source without burned-in subtitles; for sources that already show subs, answer "yes" to the top-position prompt so ours don't collide.

In auto mode the tool extracts audio, transcribes it, and asks Gemini to pick the highlights — then echoes the resolved config and the final segment list (with the AI's labels) and asks you to confirm before rendering.

Flags

Flag          Default     Meaning
--padding     0.5         Seconds added around each cut
--max-clip    0           Auto mode: cap each highlight's length in seconds (0 = no cap)
--output      reel.mp4    Output filename
--keep-temp   off         Preserve workdir/<task_id>/ for debugging
--fresh       off         Auto mode: ignore the cached transcript and re-transcribe
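Expressed with argparse, the flags above might be declared as follows. Whether cli.py actually uses argparse is an assumption; the sketch only mirrors the names and defaults in the table, plus the optional positional source path.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cli.py", description="Build a highlight reel.")
    # The source path is optional; the CLI prompts for it when omitted.
    parser.add_argument("source", nargs="?", help="path to the source video")
    parser.add_argument("--padding", type=float, default=0.5,
                        help="seconds added around each cut")
    parser.add_argument("--max-clip", type=float, default=0.0,
                        help="auto mode: cap each highlight's length in seconds (0 = no cap)")
    parser.add_argument("--output", default="reel.mp4", help="output filename")
    parser.add_argument("--keep-temp", action="store_true",
                        help="preserve workdir/<task_id>/ for debugging")
    parser.add_argument("--fresh", action="store_true",
                        help="auto mode: ignore the cached transcript and re-transcribe")
    return parser
```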

Transcript caching (auto mode)

The first auto run on a video transcribes it and caches the result under cache/ (gitignored), keyed on the file's path + size + modified-time. Later runs on the same video reuse the transcript — so you can re-run with a different highlight count, steer, or --max-clip without spending Groq quota or waiting on transcription again. Edit/replace the video and the key changes automatically; pass --fresh to force a re-transcribe.
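A cache key built from path, size, and modified time could be derived as in this sketch. The hashing scheme, the 16-character truncation, and the function names are assumptions; only the three inputs come from the description above.

```python
import hashlib
from pathlib import Path


def make_key(path: str, size: int, mtime_ns: int) -> str:
    """Hash the identifying facts about a file into a short, filename-safe key."""
    raw = f"{path}|{size}|{mtime_ns}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


def transcript_cache_path(video: Path, cache_dir: Path = Path("cache")) -> Path:
    """Where the transcript for this exact file state would be cached."""
    stat = video.stat()
    key = make_key(str(video.resolve()), stat.st_size, stat.st_mtime_ns)
    return cache_dir / f"{key}.json"
```

Editing or replacing the video changes its size or modified time, so the key changes and the stale transcript is simply never looked up again. The key does not depend on the highlight count, steer, or --max-clip, which is why those can change without re-transcribing.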

Rendering correctness

  • Frame-accurate cuts: every segment is re-encoded (never -c copy). Stream copy can only cut on keyframes, so clip boundaries can drift by up to several seconds.
  • Glitch-free concat: every clip is normalized to identical parameters (libx264 / yuv420p, source resolution + fps, AAC 44.1kHz stereo), then joined with the concat demuxer (-c copy, safe because the clips already match).
  • Validation: ranges are clamped to [0, duration], empty ranges dropped, and after padding, overlapping/adjacent segments are merged.
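The validation rules above can be sketched as one small function. Two details are assumptions rather than documented behavior: padded edges are clamped back into [0, duration], and segments are sorted chronologically before merging.

```python
def normalize_ranges(ranges: list[tuple[float, float]], duration: float,
                     padding: float = 0.5) -> list[tuple[float, float]]:
    """Clamp, drop empty ranges, pad, then merge overlapping or adjacent ranges."""
    padded = []
    for start, end in ranges:
        # Clamp both ends to the source duration.
        start = max(0.0, min(start, duration))
        end = max(0.0, min(end, duration))
        if end <= start:
            continue  # empty (or reversed) after clamping
        # Pad, keeping the padded range inside the video (assumption).
        padded.append((max(0.0, start - padding), min(duration, end + padding)))

    padded.sort()  # chronological order (assumption)
    merged: list[tuple[float, float]] = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Merging after padding matters: two ranges that are half a second apart before padding overlap afterwards, and rendering them separately would repeat the shared footage in the reel.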

Working directory

Each run uses workdir/<task_id>/ for intermediate clips and segments.json. It is removed at the end unless you pass --keep-temp. The folder is gitignored.
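The per-run directory lifecycle could be handled with a context manager like the one below. This is a sketch: the function name and the use of a random hex string as the task id are assumptions, and the real tool may name its run folders differently.

```python
import shutil
import uuid
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def run_workdir(root: Path = Path("workdir"), keep_temp: bool = False):
    """Create workdir/<task_id>/ for one run and remove it afterwards unless keep_temp is set."""
    task_dir = root / uuid.uuid4().hex[:8]  # hypothetical task id format
    task_dir.mkdir(parents=True, exist_ok=False)
    try:
        yield task_dir
    finally:
        # Clean up even if rendering fails, unless the caller asked to keep the files.
        if not keep_temp:
            shutil.rmtree(task_dir, ignore_errors=True)
```

Used as `with run_workdir(keep_temp=args.keep_temp) as work:`, the intermediate clips and segments.json would be written under `work` and disappear when the block exits.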

Layout

highlight-extractor/
  cli.py                 # interactive config + orchestration
  config.py              # Config dataclass
  models.py              # Segment dataclass + segments.json I/O
  media/
    ffmpeg.py            # subprocess wrappers, presence check, probe
  pipeline/
    probe.py             # ✅
    extract_audio.py     # ✅
    transcribe.py        # ✅ Groq Whisper
    audio_energy.py      # 🔲 stub
    build_candidates.py  # 🔲 stub
    select.py            # ✅ manual ranges or Gemini auto
    clip.py              # ✅
    concat.py            # ✅
  workdir/               # per-run temp (gitignored)
