This repo contains:
- `image-text.py`: send an image to a local or remote Ollama API (llama3.2-vision) and print the streamed text.
- `handwrite.py`: TrOCR (`microsoft/trocr-base-handwritten`) for offline handwritten-text recognition (PyTorch + Hugging Face).
- `image-to-text.ipynb`: OpenCV + Tesseract experiments on sample images under `image/`.
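The core of the Ollama call can be sketched as follows. This is a minimal illustration, not the script's actual code: it assumes Ollama's standard `/api/chat` endpoint, which accepts base64-encoded images and streams one JSON object per line; the function names are made up for this sketch.

```python
import base64
import json
import urllib.request

def build_payload(image_bytes: bytes, prompt: str,
                  model: str = "llama3.2-vision") -> dict:
    """Build an Ollama /api/chat request carrying one base64-encoded image."""
    return {
        "model": model,
        "stream": True,
        "messages": [{
            "role": "user",
            "content": prompt,
            "images": [base64.b64encode(image_bytes).decode("ascii")],
        }],
    }

def stream_chat(base_url: str, payload: dict) -> None:
    """POST the payload and print the streamed response as it arrives."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # Ollama streams one JSON object per line
            chunk = json.loads(line)
            print(chunk.get("message", {}).get("content", ""), end="", flush=True)
```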
Requirements:

- Python 3.10+ recommended.
- Tesseract OCR installed and on your `PATH` (needed by the notebook and `pytesseract`). On Windows, install Tesseract and, if needed, uncomment the `tesseract_cmd` line in the notebook.
- For `image-text.py`: Ollama running with `llama3.2-vision` (or a compatible vision model), reachable at your API URL.
Setup:

```
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
```

Optional: copy `.env.example` to `.env` and set `OLLAMA_URL`, or export the variable in your shell.
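Resolving the variable in a script is a one-liner; this sketch falls back to Ollama's standard local port, which is an assumption about your setup.

```python
import os

def resolve_ollama_url(default: str = "http://localhost:11434") -> str:
    """Prefer the OLLAMA_URL environment variable, falling back to a default."""
    return os.environ.get("OLLAMA_URL", default)
```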
Usage:

```
# From this directory; uses image/ and OLLAMA_URL (default in .env.example)
python image-text.py
python handwrite.py
```

Open `image-to-text.ipynb` in Jupyter or VS Code. Run the notebook from this project folder so paths like `image/handnote-10.png` resolve. The notebook is kept without stored outputs for cleaner Git diffs.
Do not commit `.env`, API keys, or downloaded model weights. See `.gitignore`.