A self-hosted PDF OCR API that converts scanned documents to Markdown. Powered by PaddleOCR-VL; runs on a GPU via Docker.
api docker pdf ocr pdf-extractor ocr-api paddleocr document-ai document-ocr document-parsing vision-language-model pdf-ocr pdf-to-markdown scanned-pdf paddleocr-vl multilingual-ocr
Updated Apr 19, 2026 - Python