ocr-tool is an installable Agent Skill for extracting text from PDFs with pdfocr.
It provides one workflow:
- extracts text from full PDFs or selected page ranges
- caches OCR output one page at a time so overlapping requests reuse prior work
- installs pdfocr when needed
- writes extracted text to .ocr-tool-cache/output.txt or a chosen output file
This skill is intentionally limited to extraction. Use another skill after OCR if you want notes, flashcards, quizzes, or other transformed outputs.
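The per-page caching described above can be pictured with a small sketch. This is not the skill's actual implementation: the cache file naming, the `extract_pages` and `parse_page_range` helpers, and the `ocr_page` callback (a stand-in for whatever invokes pdfocr on a single page) are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable


def parse_page_range(spec: str) -> list[int]:
    """Expand a spec such as "8-20" or "3" into 1-based page numbers."""
    start, _, end = spec.partition("-")
    first = int(start)
    last = int(end) if end else first
    if first < 1 or last < first:
        raise ValueError(f"invalid page range: {spec!r}")
    return list(range(first, last + 1))


def extract_pages(
    pdf_name: str,
    pages: list[int],
    cache_dir: Path,
    ocr_page: Callable[[int], str],
) -> str:
    """Return text for the requested pages, running OCR only on cache misses.

    ocr_page is a hypothetical callback that OCRs one page; the real skill
    delegates this work to pdfocr.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    texts = []
    for page in pages:
        # Hypothetical naming scheme: one cache file per page of each PDF.
        cached = cache_dir / f"{pdf_name}.page-{page:04d}.txt"
        if cached.exists():
            texts.append(cached.read_text(encoding="utf-8"))
            continue
        text = ocr_page(page)
        cached.write_text(text, encoding="utf-8")
        texts.append(text)
    return "\n".join(texts)
```

With this layout, a request for pages 8-10 followed by a request for pages 9-12 would run OCR on pages 11 and 12 only during the second call, since pages 9 and 10 are already on disk.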
- pdfocr: Required for PDF-to-text extraction.
- DeepInfra API Key: Required by pdfocr.
  - Set it via DEEPINFRA_API_KEY (recommended).
  - Or provide it via config.json next to the pdfocr executable.
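To check which of the two key sources is available before running the skill, something like the following sketch may help. It makes several assumptions that the text above does not confirm: that the environment variable takes precedence over the file, that config.json sits in the same directory as the executable found on PATH, and that the key is stored under a field named `api_key` (a hypothetical name; check the pdfocr documentation for the real one).

```python
import json
import os
import shutil
from pathlib import Path
from typing import Optional


def resolve_api_key(
    executable: str = "pdfocr",
    config_field: str = "api_key",  # hypothetical field name, not confirmed
) -> Optional[str]:
    """Look for a DeepInfra API key in the environment, then in config.json."""
    # Assumption: the environment variable is checked first.
    key = os.environ.get("DEEPINFRA_API_KEY")
    if key:
        return key

    # Fall back to config.json in the directory of the executable on PATH.
    exe_path = shutil.which(executable)
    if exe_path is None:
        return None
    config_path = Path(exe_path).parent / "config.json"
    if not config_path.is_file():
        return None

    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None
    value = config.get(config_field) if isinstance(config, dict) else None
    return value if isinstance(value, str) and value else None
```

A `None` result means neither source was found under these assumptions, so setting DEEPINFRA_API_KEY is the simplest fix.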
Codex recommends installing non-built-in skills using the $skill-installer.
Prompt Codex with:
$skill-installer install the skill from repo planetis-m/study-assistant with path ocr-tool
Clone or copy ocr-tool into your agent's scanned skills path.
Invoke the skill explicitly using $ocr-tool in your prompts:
Use $ocr-tool to extract text from lecture1.pdf.
Use $ocr-tool to OCR pages 8-20 of lecture1.pdf and return only the cleaned text.
Use $ocr-tool on this PDF, then use $study-assistant in study-notes mode on the extracted text.