A RAG (Retrieval-Augmented Generation) tool that lets you ask questions about PDF books using natural language.
- Load a PDF — Text is extracted and split into ~1000-character chunks (sketched after this list)
- Store in ChromaDB — Chunks are converted to embeddings and stored in a vector database
- Ask questions — Your question is matched against the chunks to find relevant passages
- Get answers — The relevant passages are sent to Claude, which generates an answer
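To make the chunking step concrete, here is a minimal sketch of a fixed-size splitter. The `chunk_text` helper and its `overlap` parameter are illustrative assumptions, not the app's actual code (the app may well split without overlap):

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into ~size-character chunks.

    The overlap is an assumption: a common trick so sentences cut at a
    chunk boundary still appear whole in the neighboring chunk.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```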
First, make sure you have Python 3.8+ installed. You can check with:

```bash
python --version
```

Create and activate a virtual environment:
macOS/Linux:

```bash
python -m venv venv
source venv/bin/activate
```

Windows (Command Prompt):

```
python -m venv venv
venv\Scripts\activate.bat
```

Windows (PowerShell):

```
python -m venv venv
venv\Scripts\Activate.ps1
```

Note: On Windows, if you get an execution policy error in PowerShell, first run:

```
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```
With your virtual environment activated:

```bash
pip install chromadb pymupdf anthropic python-dotenv
```

Create a `.env` file in the project directory:

```
ANTHROPIC_API_KEY=your-key-here
```

Get a key at https://console.anthropic.com/settings/keys.
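If you want to confirm the key is being picked up, a minimal check with `python-dotenv` looks like this (assuming you run it from the directory containing `.env`):

```python
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory into the environment
assert os.getenv("ANTHROPIC_API_KEY"), "ANTHROPIC_API_KEY is not set"
```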
```bash
python ask_book.py load path/to/book.pdf
```

This only needs to be done once per book. The data is stored in `./chroma_db/`.
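Under the hood, the load step amounts to something like the following sketch. The function name and the collection name `book` are assumptions for illustration; the PyMuPDF and ChromaDB calls themselves are standard:

```python
import fitz  # PyMuPDF
import chromadb

def load_book(pdf_path: str) -> None:
    # Extract the full text of the PDF, page by page
    doc = fitz.open(pdf_path)
    text = "".join(page.get_text() for page in doc)

    # Simple non-overlapping ~1000-character split for illustration
    chunks = [text[i:i + 1000] for i in range(0, len(text), 1000)]

    # Store chunks in a persistent ChromaDB collection; embeddings are
    # computed automatically by the collection's embedding function
    client = chromadb.PersistentClient(path="./chroma_db")
    collection = client.get_or_create_collection("book")
    collection.add(documents=chunks, ids=[str(i) for i in range(len(chunks))])
```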
```bash
python ask_book.py ask "Who is the main character?"
python ask_book.py ask "What happens at the end?"
python ask_book.py ask "Describe the relationship between Alice and Bob"
```

ChromaDB stores text as high-dimensional vectors (embeddings). When you ask a question, it converts your question to a vector and finds the most similar chunks using cosine similarity. This is much more powerful than keyword search — it understands meaning, not just words.
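The retrieval half of `ask` maps onto ChromaDB's query API roughly as below; `n_results=5` is an assumed value, not the app's verified setting:

```python
import chromadb

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("book")

# Embed the question and return the most similar stored chunks
results = collection.query(
    query_texts=["Who is the main character?"], n_results=5
)
passages = results["documents"][0]  # top-matching chunk texts
```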
- Runs locally, no external service needed
- Uses `all-MiniLM-L6-v2` for embeddings by default (see the sketch below)
- Data persists in `./chroma_db/`
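If you want to pin or swap the embedding model, ChromaDB lets you attach an explicit embedding function when creating the collection. This sketch uses `SentenceTransformerEmbeddingFunction`, which requires the separate `sentence-transformers` package (not in the install list above):

```python
import chromadb
from chromadb.utils import embedding_functions

# Explicitly pin the embedding model instead of relying on the default
ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("book", embedding_function=ef)
```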
Claude (by Anthropic) is the LLM that reads the retrieved passages and generates answers. The app uses `claude-sonnet-4-20250514` for a good balance of speed and quality.
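Putting the pieces together, the generation step looks roughly like this sketch; the prompt wording and function name are illustrative, not the app's exact code:

```python
import anthropic

def answer(question: str, passages: list[str]) -> str:
    context = "\n\n".join(passages)
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"Using only these passages from the book:\n\n{context}\n\n"
                       f"Answer this question: {question}",
        }],
    )
    return response.content[0].text
```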
The API is pay-per-use. Typical costs for this app:
- ~$0.003 per question (input tokens for context)
- ~$0.015 per answer (output tokens)
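At those rates, a typical question-and-answer round costs roughly $0.003 + $0.015 ≈ $0.018, so 100 questions come to about $1.80.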
- Only stores one book at a time (loading a new book replaces the old one)
- Very long books may have less accurate retrieval
- Answers are only as good as the relevant chunks found