PgsToSrtPlus

Extract PGS (Blu-ray) subtitles from MKV files and convert them to SRT using PaddleOCR and Ollama VLMs.

How It Works

PgsToSrtPlus decodes PGS subtitle bitmaps, preprocesses and splits them into individual text lines, then runs a two-stage OCR pipeline followed by an italic-detection pass:

1. PaddleOCR performs a fast first-pass recognition on every line. Lines above a confidence threshold are accepted as-is.
2. Lines below the threshold fall back to a Vision Language Model (via Ollama) for a more accurate read.
3. A second VLM pass performs italic detection by comparing the original subtitle bitmap against a synthetically rendered upright reference image, classifying each token as italic or roman.
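
The confidence-gated fallback between PaddleOCR and the VLM can be sketched roughly as follows. This is an illustration only, not the project's actual code: `fast_ocr` and `vlm_ocr` are hypothetical callables standing in for PaddleOCR and the Ollama model, and the default of 0.97 simply mirrors the documented `--verify-threshold` value.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins: the fast recognizer returns (text, confidence),
# the slower VLM returns text only.
FastOcr = Callable[[bytes], Tuple[str, float]]
VlmOcr = Callable[[bytes], str]


def recognize_lines(
    line_images: List[bytes],
    fast_ocr: FastOcr,
    vlm_ocr: VlmOcr,
    threshold: float = 0.97,
) -> List[str]:
    """Accept confident fast-pass results; re-read the rest with the VLM."""
    results = []
    for image in line_images:
        text, confidence = fast_ocr(image)
        if confidence < threshold:
            # Below the threshold: fall back to the slower model.
            text = vlm_ocr(image)
        results.append(text)
    return results
```

Because only low-confidence lines reach the VLM, raising the threshold trades speed for more VLM reads, and lowering it does the opposite.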

The result is an SRT file with accurate text and <i> markup.
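
To make the output format concrete, here is a minimal sketch of how per-token italic flags could be turned into an SRT cue. The `(word, is_italic)` token representation and the function names are assumptions made for this example; the project may represent and join tokens differently.

```python
from typing import List, Tuple


def format_timestamp(ms: int) -> str:
    """Render milliseconds in the SRT timestamp form HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02}:{minutes:02}:{seconds:02},{millis:03}"


def mark_italics(tokens: List[Tuple[str, bool]]) -> str:
    """Join tokens with spaces, wrapping each run of italic tokens in <i>...</i>."""
    parts: List[str] = []
    run: List[str] = []  # italic tokens collected for the current run
    for word, italic in tokens:
        if italic:
            run.append(word)
            continue
        if run:
            parts.append("<i>" + " ".join(run) + "</i>")
            run = []
        parts.append(word)
    if run:
        parts.append("<i>" + " ".join(run) + "</i>")
    return " ".join(parts)


def format_cue(index: int, start_ms: int, end_ms: int, text: str) -> str:
    """Build one SRT cue: index, time range, text, then a blank line."""
    times = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
    return f"{index}\n{times}\n{text}\n\n"
```

Grouping adjacent italic tokens into one run keeps the markup compact: one `<i>` pair per italic phrase rather than one per word.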

Why PaddleOCR?

PaddleOCR generally outperforms Tesseract in accuracy, particularly on complex layouts, low-quality images, and scene text.

PaddleOCR's recognition dictionary also covers more characters than Tesseract's, so it correctly recognizes non-ASCII symbols that sometimes appear in subtitles, such as music notes (♪ ♫ ♬).

Supported Languages

English (en) — default
Japanese (ja)

Other languages may work but will use a generic fallback prompt. Language-specific OCR prompts, fonts, and post-processing steps are configurable per language.

Requirements

Docker
Ollama running separately and accessible from the Docker container (used for low-confidence OCR fallback and italic detection). Default model: qwen3-vl:32b-instruct
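
Before a long conversion it can help to confirm that Ollama is reachable and that the model has been pulled. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models; the helper names are made up for this example, and the response is assumed to have the shape `{"models": [{"name": ...}, ...]}`.

```python
import json
import urllib.request
from typing import List


def list_models(tags_payload: dict) -> List[str]:
    """Extract model names from a decoded /api/tags response."""
    return [m["name"] for m in tags_payload.get("models", [])]


def fetch_tags(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/api/tags; raises an exception if Ollama is unreachable."""
    url = base_url.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def model_available(base_url: str, model: str) -> bool:
    """True if `model` appears in the server's local model list."""
    return model in list_models(fetch_tags(base_url))
```

From the host, `model_available("http://127.0.0.1:11434", "qwen3-vl:32b-instruct")` checks the default model. This only shows that the host can reach Ollama; the container still needs its own route, such as the `host.docker.internal` mapping used in Quick Start.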

Quick Start

Pull the image:

```bash
# CPU
docker pull ebette1/pgs-to-srt-plus:latest

# GPU (NVIDIA)
docker pull ebette1/pgs-to-srt-plus-gpu:latest
```

Run:

```bash
docker run --rm --add-host=host.docker.internal:host-gateway \
  -v /path/to/media:/media \
  ebette1/pgs-to-srt-plus:latest \
  "/media/movie.mkv" \
  --ollama http://host.docker.internal:11434
```

For GPU acceleration (requires NVIDIA Container Toolkit):

```bash
docker run --rm --gpus all --add-host=host.docker.internal:host-gateway \
  -v /path/to/media:/media \
  ebette1/pgs-to-srt-plus-gpu:latest \
  "/media/movie.mkv" \
  --ollama http://host.docker.internal:11434
```

By default, the SRT file is written next to the input file. To write it elsewhere, pass `-o /path` and bind-mount that directory into the container.

Options

| Option | Default | Description |
| --- | --- | --- |
| `--ollama` | `http://127.0.0.1:11434` | Ollama endpoint URL |
| `--language`, `-l` | `en` | Subtitle language (`en`, `ja`) |
| `--track` | auto-detect | PGS track index |
| `-o`, `--output` | same as input | Output directory |
| `--model` | `qwen3-vl:32b-instruct` | Ollama VLM model |
| `--device` | `cpu` | PaddleOCR device (`cpu`, `gpu`) |
| `--verify-threshold` | `0.97` | PaddleOCR confidence below which to fall back to VLM |
| `--paddle-model` | `PP-OCRv5_server_rec` | PaddleOCR recognition model |
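
For batch jobs, the Quick Start invocation can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of the project; it uses only the image names and flags documented in this README and leaves every other option at its default.

```python
from pathlib import PurePosixPath
from typing import List


def build_command(
    media_dir: str,
    filename: str,
    ollama_url: str = "http://host.docker.internal:11434",
    language: str = "en",
    gpu: bool = False,
) -> List[str]:
    """Return the docker argv for converting one file under media_dir."""
    image = "ebette1/pgs-to-srt-plus-gpu:latest" if gpu else "ebette1/pgs-to-srt-plus:latest"
    cmd = ["docker", "run", "--rm"]
    if gpu:
        cmd += ["--gpus", "all"]  # requires NVIDIA Container Toolkit
    cmd += [
        "--add-host=host.docker.internal:host-gateway",
        "-v", f"{media_dir}:/media",
        image,
        str(PurePosixPath("/media") / filename),  # path as seen inside the container
        "--ollama", ollama_url,
        "--language", language,
    ]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, once per MKV file in the mounted directory.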

Acknowledgments

Tentacule/PgsToSrt — the original inspiration for this project
SubtitleEdit / libse — PGS parsing and Matroska container support

Docker Images

ebette1/pgs-to-srt-plus (CPU)
ebette1/pgs-to-srt-plus-gpu (NVIDIA GPU)
