A Claude Code skill that merges, auto-edits, and polishes screen + camera recordings into finished, still-editable videos. It manufactures Screen Studio's unreleased multi-clip merge, cuts your footage with real editorial judgment (not just silence trimming), cleans the audio to studio quality, and can even render branded intros with code.
Tip: add a docs/demo.gif of a real merge + auto-cut to make this README sing.
Contents: Why · Features · Quick start · Editorial brain · How it works · Requirements · Troubleshooting · Roadmap · License
Note
Talk to it in plain English. Once installed, you just say "merge these three takes and tighten it up" or "enhance the audio on this demo", Claude drives the pipeline. The commands below are what it runs under the hood (and what you can run yourself).
Recording a good product demo or tutorial in one clean take is nearly impossible. You stop, restart, ramble, leave dead air. The fix has always been hours in an editor. video-editor turns that into a conversation with your coding agent. It stitches your separate takes into one timeline, makes genuine editorial cuts informed by film-editing craft, levels and de-noises your voice, and hands you back a project you can still fine-tune, or a finished MP4 ready to upload.
It's built around a discovery: Screen Studio stores its projects as open JSON, and its "merge multiple recordings" feature is on the roadmap but unreleased. This skill reverse-engineered the format and manufactures that feature today, while keeping every cursor-zoom, camera bubble, and effect intact.
- 🎞️ Merge unlimited takes into one editable project: losslessly fused, arranged on a single timeline, fully editable in Screen Studio afterwards.
- 🧠 Editorial AI cutting, not mechanical de-silencing: cuts dead air, filler, and redundancy; leads with the hook; keeps emphasis pauses. Grounded in Walter Murch's Rule of Six and a per-video-type playbook (demo, tutorial, talking-head, launch, conference, developer).
- 🎙️ Studio-quality voice in one pass: gentle ML denoise + volume leveling + broadcast loudness (−16 LUFS). Tuned to not sound robotic on already-clean mics.
- 📺 Two output targets: keep it editable in Screen Studio, or bake a finished
.mp4for ScreenFlow, Cap, YouTube, or anywhere. - 🪄 Code-rendered branded intros/outros: the agent writes Remotion (React → MP4) for titles, lower-thirds, and motion graphics, themed and stitched on automatically.
- 🔒 Local & private: ffmpeg, a local denoise binary, and local Whisper. No uploads, no API keys, no per-minute cloud fees.
# 1. Install as a Claude Code skill
git clone git@github.com:noeltock/video-editor.git ~/.claude/skills/video-editor
cd ~/.claude/skills/video-editor && ./scripts/setup.sh
# 2. In Claude Code, just ask:
# "merge my last two Screen Studio takes and tighten the edit"
# "enhance the audio on <project>"
# "auto-edit this demo and bake an mp4"That's it. Claude reads SKILL.md, runs the pipeline, and opens the result.
Prefer to drive it yourself? The raw commands ↓
# Fuse N takes into one editable project (+ studio audio inline)
python3 engine/fuse.py --out "My Demo" "Take 1.screenstudio" "Take 2.screenstudio"
# Enhance audio only, in place (denoise + leveling + loudness)
python3 -c "import sys; sys.path.insert(0,'engine'); import audio; \
audio.enhance('in.m4a','out.m4a', level='moderate')"
# AI auto-edit: transcribe → (Claude decides keep-ranges) → apply as slices
.venv/bin/python engine/analyse.py "<project>/recording/channel-2-microphone-0.m4a" --out analysis.json
python3 engine/apply_cuts.py "<project>.screenstudio" --keep-ranges keep.json
# Bake a finished MP4 (for non-editable tools / upload)
python3 engine/render.py display.mp4 --keep-ranges keep.json --audio voice.m4a --out final.mp4
# Branded intro via Remotion, then stitch onto the demo
cd engine/remotion && npx remotion render Intro intro.mp4 --props='{"title":"My Demo"}'
python3 engine/compose.py --out final.mp4 intro.mp4 demo.mp4 outro.mp4Most "AI video editors" just delete silence. This one edits like a person. Before cutting, it understands the recording (type, audience, goal, narrative arc), then runs a multi-pass process:
| Pass | What it does |
|---|---|
| Understand | Classify the video type, map the Hook → Context → Tension → Payoff arc, find where the value lands |
| Structure | Cut redundancy (keep only the strongest articulation), false starts, meta-commentary, dead weight |
| Editorial | Apply Murch's Rule of Six (emotion > story > rhythm); lead with the hook; keep deliberate emphasis pauses |
| Review | Self-check for over-cutting / choppiness before anything is applied |
The full craft reference (per-type playbooks, retention principles, the cut rubric) lives in references/editorial-playbook.md.
N recordings ──▶ FUSE ──▶ one Screen Studio session (lossless)
(.screenstudio) │ display + webcam + audio concatenated
│ cursor data merged & time-shifted
▼
AUDIO denoise → level → loudness
▼
ANALYSE Whisper word-times + silence
▼
EDITORIAL CUT (Claude decides keep-ranges)
▼
┌──────────────┴───────────────┐
editable Screen Studio baked .mp4
project (effects intact) (ScreenFlow/Cap/YouTube)
Screen Studio is one project = one session: the only arrangement primitive that works is slices. So everything fuses into a single session and is arranged/cut with slices. The reverse-engineered format spec is in references/screen-studio-format.md.
| Dependency | For | Install |
|---|---|---|
| ffmpeg | everything | brew install ffmpeg |
| deep-filter | audio denoise | auto-installed by setup.sh (releases) |
| Python 3.10+ / faster-whisper | AI auto-edit transcription | setup.sh creates a .venv |
| Node 18+ / Remotion | branded intros/outros (optional) | setup.sh runs npm install |
| Screen Studio | the editable output path | screen.studio (macOS) |
Important
The editable output path needs the macOS Screen Studio app. The baked-MP4 path and the audio/Remotion tooling are cross-platform (the prebuilt deep-filter binary covers macOS + Linux).
Researched and ready to build when there's demand:
- OTIO / FCPXML export: make AI cut-decisions portable to DaVinci Resolve / Final Cut / Premiere
- Shorts extraction: long-form → vertical clips via transcript scoring
- B-roll auto-insertion: transcript keyword → stock footage → overlay (an open gap in the ecosystem)
- YouTube publish: titles, description, chapters, voiceover, upload
The full landscape scan of agentic-video tooling is in references/agentic-video-landscape.md.
Common questions & gotchas
- The audio sounds robotic. It shouldn't, that's the whole point. The default is light processing for already-clean mics. If a recording is genuinely noisy, raise
--atten-lim-db; if it's clean, the defaults are correct. Avoid theresemblemode on clean audio (it over-processes). - Edited project didn't update in Screen Studio. Screen Studio doesn't hot-reload an open project, quit and reopen it.
- A cut "jumps" on a screen recording. Cuts land on narration boundaries, but a screen demo can still jump if you cut mid-interaction. Nudge that keep-range boundary by a few hundred ms.
- Merge is slow. Clips from the same machine take the instant lossless path. Mismatched resolutions/fps force a re-encode (slower, still visually lossless). Record takes with identical capture settings to stay on the fast path.
deep-filter/faster-whispernot found. Re-run./scripts/setup.sh, and ensure~/.local/binis on yourPATH.- Is my data uploaded anywhere? No. Denoise, transcription, and rendering all run locally. No API keys, no cloud.
PRs and issues welcome. The engine is small, dependency-light Python (engine/*.py) plus a Remotion project; SKILL.md is the entry point Claude reads. Please keep changes surgical and the docs (references/) in sync.
Stands on the shoulders of ffmpeg, DeepFilterNet, faster-whisper, Remotion, auto-editor, and the editing wisdom of Walter Murch (In the Blink of an Eye).
MIT © 2026 Noel Tock