Machine-Learning-ML/handwriting-image-to-text
Image to text (Llama 3.2 Vision & OCR)

This repo contains:

  • image-text.py — send an image to a local or remote Ollama API (llama3.2-vision) and print streamed text.
  • handwrite.py — TrOCR (microsoft/trocr-base-handwritten) for offline handwritten-text recognition (PyTorch + Hugging Face).
  • image-to-text.ipynb — OpenCV + Tesseract experiments on sample images under image/.

Prerequisites

  1. Python 3.10+ recommended.
  2. Tesseract OCR installed and on your PATH (for the notebook and pytesseract). On Windows, install Tesseract and uncomment the tesseract_cmd line in the notebook if needed.
  3. For image-text.py: Ollama running with llama3.2-vision (or compatible vision model), reachable at your API URL.
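The request that image-text.py sends can be sketched as follows, using only the standard library. This is a minimal sketch of Ollama's /api/chat streaming interface with an image attached; the prompt text and sample path are illustrative, not taken from the script itself.

```python
import base64
import json
import os
import urllib.request

# Falls back to the default local Ollama port if OLLAMA_URL is not set.
OLLAMA_URL = os.environ.get("OLLAMA_URL", "http://localhost:11434")

def encode_image(path: str) -> str:
    """Read an image file and return it as a base64 string (Ollama expects base64 images)."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

def build_payload(image_b64: str, prompt: str = "Transcribe the text in this image.") -> dict:
    """Build a chat request for a vision model with one attached image."""
    return {
        "model": "llama3.2-vision",
        "messages": [{"role": "user", "content": prompt, "images": [image_b64]}],
        "stream": True,
    }

def stream_transcription(image_path: str) -> None:
    """Send the image and print the model's reply as it streams in."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/chat",
        data=json.dumps(build_payload(encode_image(image_path))).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # Ollama streams one JSON object per line
            chunk = json.loads(line)
            print(chunk.get("message", {}).get("content", ""), end="", flush=True)

# Requires a running Ollama server with llama3.2-vision pulled, e.g.:
# stream_transcription("image/handnote-10.png")
```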

Setup

python -m venv .venv
.venv\Scripts\activate        # Windows
# source .venv/bin/activate   # macOS/Linux
pip install -r requirements.txt

Optional: copy .env.example to .env and set OLLAMA_URL, or export the variable in your shell.
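Loading the variable can be sketched like this, assuming python-dotenv is used for the .env file (the fallback URL shown is Ollama's default local port, not necessarily the value in .env.example):

```python
import os

try:
    from dotenv import load_dotenv  # pip install python-dotenv
    load_dotenv()  # reads .env from the current directory, if present
except ImportError:
    pass  # no dotenv installed; rely on the shell environment instead

OLLAMA_URL = os.environ.get("OLLAMA_URL", "http://localhost:11434")
print(f"Using Ollama at {OLLAMA_URL}")
```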

Run

# From this directory; uses image/ and OLLAMA_URL (default in .env.example)
python image-text.py
python handwrite.py
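handwrite.py's TrOCR pipeline can be sketched as below, assuming transformers, torch, and Pillow are installed. Function and variable names here are illustrative; the model ID matches the one listed above. The first call downloads the weights to the Hugging Face cache.

```python
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

def transcribe(path: str, model_id: str = "microsoft/trocr-base-handwritten") -> str:
    """Run TrOCR on a single image of a handwritten line and return the text."""
    # Loaded lazily; weights are cached under ~/.cache/huggingface after the first run.
    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)

    image = Image.open(path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

# Example (downloads ~1.3 GB of weights on first use):
# print(transcribe("image/handnote-10.png"))
```

Note that TrOCR is trained on single text lines, so multi-line notes generally need line segmentation first.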

Open image-to-text.ipynb in Jupyter or VS Code. Run the notebook from this project folder so paths like image/handnote-10.png resolve. The notebook is kept without stored outputs for cleaner Git diffs.

Git hygiene

Do not commit .env, API keys, or downloaded model weights. See .gitignore.

About

Experiments to turn handwritten note images into text: Ollama (llama3.2-vision) API client, Hugging Face TrOCR, and OpenCV + Tesseract in a Jupyter notebook. Sample images included; configure OLLAMA_URL locally.
