FFmpeg can do almost anything with video. Its docs explain almost nothing. This cookbook fixes that.
The official FFmpeg documentation is a reference manual -- comprehensive but impenetrable for anyone trying to accomplish a specific task. You know what you want to do ("compress this video for the web"), but the docs give you a wall of flags, codecs, and filter syntax with no guidance on which combination actually works. This cookbook bridges that gap with tested, copy-paste-ready recipes organized by real-world tasks.
Most FFmpeg tutorials online are either outdated, incomplete, or solve one narrow problem. This cookbook provides:
- Practical recipes organized by what you are trying to accomplish, not by FFmpeg's internal architecture
- Tested commands that work with modern FFmpeg (6.x and 7.x)
- GPU acceleration recipes for NVIDIA GPUs, because encoding speed matters when you have hundreds of files
- A Python wrapper so you can integrate FFmpeg into automated workflows without string-concatenating shell commands
- Explanations of why, not just what. Understanding CRF, presets, and filter graphs makes you dangerous with FFmpeg instead of just copying commands
These five commands cover 80% of what most people need from FFmpeg.
ffmpeg -i input.mp4 -c:v libx264 -crf 23 -preset medium -c:a copy output.mp4CRF 23 is the sweet spot for quality vs size. Lower CRF = better quality, larger file. The audio is copied without re-encoding.
ffmpeg -i input.mov -c:v libx264 -crf 23 -c:a aac -movflags +faststart output.mp4The -movflags +faststart flag is critical for web playback -- it moves metadata to the beginning of the file so browsers can start playing before the full download completes.
ffmpeg -i input.mp4 -vn -c:a libmp3lame -b:a 320k output.mp3The -vn flag drops the video stream. You can swap libmp3lame for aac, flac, or pcm_s16le depending on your needs.
ffmpeg -ss 00:01:00 -i input.mp4 -t 30 -c copy clip.mp4Extracts 30 seconds starting at the 1-minute mark. Stream copy means this is nearly instant regardless of file size.
ffmpeg -i input.mp4 -vf "fps=10,scale=480:-1:flags=lanczos,palettegen" palette.png
ffmpeg -i input.mp4 -i palette.png -lavfi "fps=10,scale=480:-1:flags=lanczos [x]; [x][1:v] paletteuse" output.gifThe two-pass palette method produces dramatically better GIF quality than single-pass conversion.
The foundation. Cutting, trimming, format conversion, audio extraction, compression, and GIF creation. Start here if you are new to FFmpeg.
- Cut and Trim -- Input vs output seeking, keyframe-accurate cuts
- Convert Format -- Container vs codec, MP4/WebM/MKV/AVI conversion
- Extract Audio -- MP3, AAC, FLAC, WAV extraction with bitrate control
- Compress Video -- CRF, presets, two-pass encoding, target file size
- Create GIF -- Palette generation, fps control, optimization
Encoding on NVIDIA GPUs with NVENC and processing with CUDA filters. If you have an NVIDIA GPU and process video regularly, this chapter will save you hours.
- NVENC Basics -- h264_nvenc, hevc_nvenc setup and presets
- CUDA Filters -- scale_cuda, overlay_cuda, hwupload/hwdownload
- CPU vs GPU Benchmark -- Speed, quality, and size tradeoffs
- Hardware Decode -- Full GPU pipeline from decode to encode
Audio normalization, mixing, subtitles, and synchronization. Essential for podcast production, voice-over work, and fixing audio issues.
- Normalize Audio -- EBU R128 loudnorm, dynaudnorm, volume
- Mix Audio Tracks -- amix, amerge, adelay, per-stream volume
- Add Subtitles -- SRT burn-in, ASS styling, soft subs with movtext
- Sync Audio/Video -- itsoffset, atempo, aresample drift correction
The real power of FFmpeg. Filter graphs, overlays, concatenation, color correction, and stabilization.
- Filter Graph Intro -- Understanding -filter_complex, labels, chains
- Overlay/Watermark -- Image and text overlay with positioning
- Concatenate Videos -- Concat demuxer, concat filter, crossfades
- Color Correction -- eq, curves, colorbalance, LUT application
- Stabilize Video -- vidstabdetect + vidstabtransform workflow
Batch processing, live streaming, thumbnail generation, and two-pass encoding for production workflows.
- Batch Processing -- Shell loops, xargs, find, inotifywait
- RTMP Streaming -- Twitch, YouTube, re-streaming, low-latency
- Thumbnail Generation -- Keyframes, scene detection, contact sheets
- Two-Pass Encoding -- Target bitrate, target file size
If you own an NVIDIA GPU (GTX 600 series or newer), you can encode video 5-10x faster than CPU-only encoding. The tradeoff is slightly larger files at equivalent visual quality, but for most workflows the speed gain is worth it.
Quick comparison on a 5-minute 1080p clip (RTX 3070):
| Encoder | Speed | File Size | VMAF Quality |
|---|---|---|---|
| libx264 (CPU, medium) | 45 fps | 48 MB | 95.2 |
| h264_nvenc (GPU, p4) | 350 fps | 62 MB | 93.1 |
| h264_nvenc (GPU, p7) | 180 fps | 55 MB | 94.2 |
NVENC at preset p7 gets within 1 VMAF point of CPU encoding while still being 4x faster. For batch processing hundreds of files, this difference is transformative.
Check if your system supports NVENC:
ffmpeg -encoders | grep nvencThe scripts/ffmpeg_wrapper.py module provides a clean Python interface for common operations:
from pathlib import Path
from scripts.ffmpeg_wrapper import compress_video, extract_audio, create_gif, batch_convert
# Compress with CPU
result = compress_video(Path("input.mp4"), Path("output.mp4"), crf=23)
print(f"Success: {result.success}, Time: {result.duration_seconds}s")
# Compress with GPU (NVENC)
result = compress_video(Path("input.mp4"), Path("output.mp4"), use_gpu=True)
# Extract audio
result = extract_audio(Path("video.mp4"), Path("audio.mp3"), codec="mp3", bitrate="320k")
# Create optimized GIF
result = create_gif(Path("input.mp4"), Path("output.gif"), fps=10, width=480)
# Batch convert entire directory
results = batch_convert(Path("raw/"), Path("converted/"), input_ext=".avi", output_ext=".mp4")Every function returns an FFmpegResult dataclass with success, output_path, duration_seconds, command, and stderr fields.
Run scripts/benchmark_gpu.py to compare CPU vs GPU encoding on your hardware. The script automatically detects NVENC availability and runs both encoders on the same source file, measuring speed and output size.
Typical results across GPU generations:
| GPU | 1080p Speed | 4K Speed | Relative to CPU |
|---|---|---|---|
| GTX 1060 | 200 fps | 55 fps | 4-5x faster |
| RTX 2070 | 280 fps | 90 fps | 6-7x faster |
| RTX 3070 | 350 fps | 120 fps | 7-8x faster |
| RTX 4070 | 420 fps | 160 fps | 9-10x faster |
These numbers are for h264_nvenc at preset p4. Slower presets (p5-p7) reduce speed but improve quality.
Contributions are welcome. See CONTRIBUTING.md for guidelines. The short version:
- Fork and clone
- Install dev dependencies:
pip install -e ".[dev]" - Add your recipe or script
- Run quality checks:
ruff check scripts/ tests/ && pytest tests/ -v - Submit a pull request
Every recipe should include at least 3 practical examples with explanations, and every Python function needs tests with mocked subprocess calls (no real FFmpeg required for testing).
MIT License. See LICENSE for details.