woook/ever-rag
Personal Knowledge Base RAG

Local RAG tool that indexes and searches across Evernote (Yarle) and Obsidian note collections. Supports markdown, PDFs, and images.

Setup

Python 3.12+ on Debian/Ubuntu uses an externally-managed environment, so install into a virtual environment:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

You'll need to source .venv/bin/activate at the start of each terminal session, or prefix commands with .venv/bin/python3.

Local models (Ollama)

Requires Ollama running locally with:

  • qwen3:8b — answer generation (search and web UI)

Cloud vision models (image indexing)

export GEMINI_API_KEY=...           # Gemini 2.5 Flash — primary OCR/description model
export AWS_ACCESS_KEY_ID=...        # Claude Sonnet via Bedrock — secondary interpretation pass
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION_NAME=eu-west-2

Files

| File | Purpose |
| --- | --- |
| config.py | Paths, model names, chunking parameters |
| index.py | Indexing pipeline (markdown, PDFs, images) |
| search.py | CLI search interface |
| web.py | Web search interface (localhost:5000) |
| chroma_db/ | Persisted vector database (auto-created) |

Indexing

Daily update

# Text + images (obsidian only, last 30 days, ≥20KB images)
python3 -u index.py

# Images only
python3 -u index.py --only-images

# Text only, skip images
python3 -u index.py --skip-images

Backfill (all sources, all ages)

# Process all images across both sources with rate limiting
python3 -u index.py --backfill

# Backfill and replace old glm-ocr chunks as Gemini succeeds
python3 -u index.py --backfill --replace-models glm-ocr

# Backfill images only
python3 -u index.py --backfill --only-images

Testing and maintenance

# Test end-to-end with one image
python3 -u index.py --only-images --max-images 1

# Index one source only
python3 -u index.py --only-source yarle

# Reset and rebuild from scratch (text only)
python3 -u index.py --reset --skip-images

# Verify what's indexed (obsidian only by default; checks against primary vision model)
python3 -u index.py --verify
python3 -u index.py --verify --only-source yarle
python3 -u index.py --verify --backfill   # check all sources

# Verify and re-index any missing text files
python3 -u index.py --verify --fix --skip-images

Indexing is resumable — stop anytime with Ctrl+C and re-run to continue. If Gemini hits its daily quota, the secondary (Sonnet) pass still runs on images already processed, and re-running the next day continues from where it stopped.

Flags

| Flag | Description |
| --- | --- |
| --skip-images | Skip image processing |
| --only-images | Only process images, skip markdown/PDFs |
| --only-source yarle\|obsidian | Index one collection only |
| --backfill | Process all images regardless of age (for bulk initial indexing). Retries on 429s using the Retry-After header with exponential backoff. The default run processes the last 30 days only |
| --vision-models MODEL … | Vision models in order. The first runs on all new images; subsequent models run only on images the first succeeded on. Default: gemini/gemini-2.5-flash then bedrock/eu.anthropic.claude-sonnet-4-6 |
| --replace-models MODEL … | Delete chunks from these models when the primary model succeeds (e.g. --replace-models glm-ocr) |
| --min-image-kb N | Minimum image size in KB (default: 20) |
| --max-images N | Limit images processed per pass per source |
| --reset | Delete existing index before rebuilding |
| --verify | Audit indexed files against disk; no indexing performed |
| --fix | Use with --verify to re-index any missing files |
| --non-interactive | Skip the credential prompt; continue even if some vision models are unavailable (required for automated/scheduled runs) |

Automated daily indexing

A systemd user timer runs index.py --non-interactive daily, sending desktop notifications on start and completion. Image indexing runs opportunistically — it works when credentials are valid and degrades gracefully when they are not (e.g. expired AWS SSO tokens).

One-time setup

0. Optional — desktop notifications:

sudo apt install libnotify-bin

Without this, indexing runs silently. With it, you get a toast on start and finish.

1. Store your Gemini API key:

mkdir -p ~/.config/ever
cat > ~/.config/ever/credentials.env <<'EOF'
GEMINI_API_KEY=your-key-here
EOF

AWS credentials are read automatically from ~/.aws/ — nothing extra needed there.

2. Install the systemd units:

mkdir -p ~/.config/systemd/user

# run_index.sh already lives in the repo at /home/wook/Documents/ever/run_index.sh
# Create ~/.config/systemd/user/ever-index.service and ever-index.timer
# (see the files in the repo root for their content)

Then reload systemd and enable the timer:

systemctl --user daemon-reload
systemctl --user enable --now ever-index.timer
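For reference, the two units look roughly like this (a minimal sketch only; the actual files are in the repo root, and the schedule shown here is an assumption):

```ini
# ~/.config/systemd/user/ever-index.service (sketch)
[Unit]
Description=Index personal knowledge base

[Service]
Type=oneshot
ExecStart=/home/wook/Documents/ever/run_index.sh

# ~/.config/systemd/user/ever-index.timer (sketch)
[Unit]
Description=Daily knowledge base indexing

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

Persistent=true is what makes a missed run fire on next boot, as described below.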

3. Verify:

systemctl --user list-timers ever-index.timer

Manual trigger

systemctl --user start ever-index.service
journalctl --user -u ever-index.service -f   # watch output

How it works

  • run_index.sh sources ~/.config/ever/credentials.env, fires a notify-send toast at start and finish, then calls index.py --non-interactive.
  • With --non-interactive, if any vision model credentials are missing, the script prints a warning and continues rather than waiting for user input.
  • Persistent=true in the timer means a missed run (machine was off) is triggered on next boot.

MCP Server (Claude Code integration)

mcp_server.py exposes the RAG index as a tool that Claude can call directly during conversations, without needing the web UI or CLI.

Setup

fastmcp is included in requirements.txt. Register with Claude Code (user scope — available in all projects):

claude mcp add --transport stdio personal-notes -- python3 /home/wook/Documents/ever/mcp_server.py

Or project-scoped (saves to .mcp.json in the repo):

claude mcp add --scope project --transport stdio personal-notes -- python3 /home/wook/Documents/ever/mcp_server.py

Verify it's registered:

claude mcp list

Tool: search_notes

Once registered, Claude can call search_notes during any conversation. Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| query | string | Natural-language search query |
| source | string (optional) | Limit to "obsidian" or "yarle" |
| content_type | string (optional) | Limit to "md", "pdf", or "image" |
| top_k | int (optional) | Number of chunks to return (default 5) |
| date_after | string (optional) | ISO date filter, e.g. "2025-03-01" |

Returns raw chunks with collection, file type, date, similarity score, and filename so Claude can synthesise an answer and cite specific notes.

Notes

  • The server loads the embedding model and ChromaDB index once at startup — first call may be slow.
  • Requires the index to be built first (index.py). If the collection doesn't exist, the server exits with an error.

Search

CLI

# One-shot query with LLM answer
python3 search.py "What is the FHIR standard?"

# Interactive mode
python3 search.py

# Filter by source or content type
python3 search.py --source obsidian "WGS validation"
python3 search.py --type pdf "sequencing quality"

# Raw chunks without LLM synthesis
python3 search.py --no-llm "pipeline architecture"

# More context chunks
python3 search.py --top-k 10 "variant calling"

Web UI

python3 web.py
# Open http://localhost:5000

Features: search box, source/type/top-k filters, "chunks only" mode, similarity scores, source file paths.

Architecture

[Markdown/PDF/Image files] → [Indexer] → [ChromaDB vectors]
                                              ↓
                            [User query] → [Retriever] → [Top chunks] → [Qwen3 8B] → [Answer]
  • Embedding model: all-MiniLM-L6-v2 (384-dim, ~80MB)
  • Vector store: ChromaDB with cosine similarity
  • Chunking: 500 chars with 50 char overlap
  • Images: .png, .jpg, .jpeg, .webp; processed newest-first by modification time; skips files < 20KB; default age filter ≤ 30 days (overridden by --backfill)
  • Two-pass vision: Pass 1 (Gemini 2.5 Flash) runs on all new images; Pass 2 (Claude Sonnet) runs only on images Pass 1 succeeded on. Each model produces separate chunks that coexist in the index.
  • PDF OCR fallback: if a PDF has no text layer, each page is rendered at 150 DPI and passed to the primary vision model; disabled by --skip-images
  • Chunk IDs: include the vision model name so multiple models can index the same image independently, and so re-runs skip already-indexed files
  • Date metadata: every chunk stores a note_date field (YYYY-MM-DD) extracted by priority:
    1. Obsidian — date prefix in filename (e.g. 2026-02-27 Epic.md)
    2. Yarle — Created at: line in the first 15 lines of the file
    3. Fallback — file modification time

  This field enables date-range filtering in both the CLI (search.py) and the MCP search_notes tool (date_after parameter).
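The extraction order above can be sketched roughly as follows (the function name and regexes are illustrative, not the actual index.py code):

```python
import os
import re
from datetime import date

def extract_note_date(path: str, text: str, source: str) -> str:
    """Best-effort YYYY-MM-DD for a note, by priority (illustrative sketch)."""
    # 1. Obsidian: date prefix in the filename, e.g. "2026-02-27 Epic.md"
    if source == "obsidian":
        m = re.match(r"(\d{4}-\d{2}-\d{2})", os.path.basename(path))
        if m:
            return m.group(1)
    # 2. Yarle: a "Created at:" line within the first 15 lines of the file
    if source == "yarle":
        for line in text.splitlines()[:15]:
            m = re.search(r"Created at:\s*(\d{4}-\d{2}-\d{2})", line)
            if m:
                return m.group(1)
    # 3. Fallback: file modification time
    return date.fromtimestamp(os.path.getmtime(path)).isoformat()
```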

Cost Estimates

Search is free — answer generation uses qwen3:8b running locally via Ollama.

Costs only apply to image indexing via cloud vision models:

| Model | Pricing | Notes |
| --- | --- | --- |
| Gemini 2.5 Flash (AI Studio key) | Free with rate limits | Primary OCR/description pass |
| Gemini 2.5 Flash (paid API key) | $0.30 / 1M input tokens, $2.50 / 1M output tokens | |
| Claude Sonnet 4.6 (Bedrock) | $3 / 1M input tokens, $15 / 1M output tokens | Secondary interpretation pass; images tokenised at ~1,334 tokens per 1000×1000 px (~$0.004/image) |

Gemini is free when using an AI Studio API key. Rate limits apply (requests per minute/day), which is why backfill indexing is slow — the code retries automatically using the Retry-After header with exponential backoff. Paid Google Cloud keys use token-based billing instead.

Bedrock/Claude is a paid AWS service billed per token. The secondary pass only runs on images where Gemini succeeded.

Approximate one-time backfill cost (based on this collection of ~3,600 images):

  • Gemini pass: free (AI Studio key), or ~$2–5 (paid key, depending on description length)
  • Bedrock pass (~1,400 obsidian images): $6 ($0.004/image × 1,400)
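The per-image figure follows from Anthropic's approximate image tokenisation rule (tokens ≈ width × height / 750), counting input tokens only; a quick check:

```python
# Approximate Bedrock cost for the secondary Sonnet pass (input tokens only;
# the generated description adds a small amount of output-token cost on top).
INPUT_PRICE_PER_TOKEN = 3 / 1_000_000  # $3 per 1M input tokens

def image_cost(width_px: int, height_px: int) -> float:
    tokens = width_px * height_px / 750  # Anthropic's rough tokenisation rule
    return tokens * INPUT_PRICE_PER_TOKEN

per_image = image_cost(1000, 1000)  # ~1,333 tokens, about $0.004
backfill = per_image * 1400         # ~1,400 obsidian images, about $5.60
```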

Ongoing daily indexing processes only new images (last 30 days, ≥20KB), so typical Bedrock costs are cents or less per day.

To check AWS spend: AWS Console → Bedrock → Usage.

Sources

| Collection | Path | Content |
| --- | --- | --- |
| yarle | /home/wook/Documents/evern/yarle1 | ~1,715 Markdown files, ~263 PDFs, ~2,203 images |
| obsidian | /home/wook/Documents/obsidiangit | ~600 Markdown files, ~20 PDFs, ~1,373 images |

Further Details

See technical_walkthrough.md for a deeper dive into the implementation, including:

  • Live-executed examples of the indexing pipeline, chunk ID design, and two-pass cloud vision
  • ChromaDB metadata schema with real chunk samples
  • MCP server internals and search_notes parameter reference
  • Note date extraction logic and migration script walkthrough
  • Dependency versions (verified at document build time via showboat)
