woook/ever-rag
Personal Knowledge Base RAG

Local RAG tool that indexes and searches across Evernote (Yarle) and Obsidian note collections. Supports markdown, PDFs, and images.

Setup

Python 3.12+ on Debian/Ubuntu uses an externally-managed environment, so install into a virtual environment:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

You'll need to source .venv/bin/activate at the start of each terminal session, or prefix commands with .venv/bin/python3.

Local models (Ollama)

Requires Ollama running locally with:

  • qwen3:8b — answer generation (search and web UI)

Cloud vision models (image indexing)

export GEMINI_API_KEY=...           # Gemini 2.5 Flash — primary OCR/description model
export AWS_ACCESS_KEY_ID=...        # Claude Sonnet via Bedrock — secondary interpretation pass
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION_NAME=eu-west-2

Files

| File | Purpose |
| --- | --- |
| config.py | Paths, model names, chunking parameters |
| index.py | Indexing pipeline (markdown, PDFs, images) |
| search.py | CLI search interface |
| web.py | Web search interface (localhost:5000) |
| chroma_db/ | Persisted vector database (auto-created) |

Indexing

Daily update

# Text + images (obsidian only, last 30 days, ≥20KB images)
python3 -u index.py

# Images only
python3 -u index.py --only-images

# Text only, skip images
python3 -u index.py --skip-images

Backfill (all sources, all ages)

# Process all images across both sources with rate limiting
python3 -u index.py --backfill

# Backfill and replace old glm-ocr chunks as Gemini succeeds
python3 -u index.py --backfill --replace-models glm-ocr

# Backfill images only
python3 -u index.py --backfill --only-images

Testing and maintenance

# Test end-to-end with one image
python3 -u index.py --only-images --max-images 1

# Index one source only
python3 -u index.py --only-source yarle

# Reset and rebuild from scratch (text only)
python3 -u index.py --reset --skip-images

# Verify what's indexed (obsidian only by default; checks against primary vision model)
python3 -u index.py --verify
python3 -u index.py --verify --only-source yarle
python3 -u index.py --verify --backfill   # check all sources

# Verify and re-index any missing text files
python3 -u index.py --verify --fix --skip-images

Indexing is resumable — stop anytime with Ctrl+C and re-run to continue. If Gemini hits its daily quota, the secondary (Sonnet) pass still runs on images already processed, and re-running the next day continues from where it stopped.

Flags

| Flag | Description |
| --- | --- |
| --skip-images | Skip image processing |
| --only-images | Only process images, skip markdown/PDFs |
| --only-source yarle\|obsidian | Index one collection only |
| --backfill | Process all images regardless of age (for bulk initial indexing). Retries on 429s using the Retry-After header with exponential backoff. The default run processes the last 30 days only |
| --vision-models MODEL … | Vision models in order. The first runs on all new images; subsequent models run only on images the first succeeded on. Default: gemini/gemini-2.5-flash then bedrock/eu.anthropic.claude-sonnet-4-6 |
| --replace-models MODEL … | Delete chunks from these models when the primary model succeeds (e.g. --replace-models glm-ocr) |
| --min-image-kb N | Minimum image size in KB (default: 20) |
| --max-images N | Limit images processed per pass per source |
| --reset | Delete existing index before rebuilding |
| --verify | Audit indexed files against disk; no indexing performed |
| --fix | Use with --verify to re-index any missing files |
| --non-interactive | Skip the credential prompt; continue even if some vision models are unavailable (required for automated/scheduled runs) |

Automated daily indexing

A systemd user timer runs index.py --non-interactive daily, sending desktop notifications on start and completion. Image indexing runs opportunistically — it works when credentials are valid and degrades gracefully when they are not (e.g. expired AWS SSO tokens).

One-time setup

0. Optional — desktop notifications:

sudo apt install libnotify-bin

Without this, indexing runs silently. With it, you get a toast on start and finish.

1. Store your Gemini API key:

mkdir -p ~/.config/ever
cat > ~/.config/ever/credentials.env <<'EOF'
GEMINI_API_KEY=your-key-here
EOF

AWS credentials are read automatically from ~/.aws/ — nothing extra needed there.

2. Install the systemd units:

mkdir -p ~/.config/systemd/user

# run_index.sh already lives in the repo at /home/wook/Documents/ever/run_index.sh
# Create ~/.config/systemd/user/ever-index.service and ever-index.timer
# (see the files in the repo root for their content)

Then reload systemd and enable the timer:

systemctl --user daemon-reload
systemctl --user enable --now ever-index.timer
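For reference, the two units look roughly like this (a minimal sketch only; the actual files are in the repo root, and the schedule shown here is an assumption):

```ini
# ~/.config/systemd/user/ever-index.service (sketch)
[Unit]
Description=Index personal knowledge base

[Service]
Type=oneshot
ExecStart=/home/wook/Documents/ever/run_index.sh

# ~/.config/systemd/user/ever-index.timer (sketch)
[Unit]
Description=Daily knowledge base indexing

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

Persistent=true is what makes a missed run fire on next boot, as described below.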

3. Verify:

systemctl --user list-timers ever-index.timer

Manual trigger

systemctl --user start ever-index.service
journalctl --user -u ever-index.service -f   # watch output

How it works

  • run_index.sh sources ~/.config/ever/credentials.env, fires a notify-send toast at start and finish, then calls index.py --non-interactive.
  • With --non-interactive, if any vision model credentials are missing, the script prints a warning and continues rather than waiting for user input.
  • Persistent=true in the timer means a missed run (machine was off) is triggered on next boot.

MCP Server (Claude Code integration)

mcp_server.py exposes the RAG index as a tool that Claude can call directly during conversations, without needing the web UI or CLI.

Setup

fastmcp is included in requirements.txt. Register with Claude Code (user scope — available in all projects):

claude mcp add --transport stdio personal-notes -- python3 /home/wook/Documents/ever/mcp_server.py

Or project-scoped (saves to .mcp.json in the repo):

claude mcp add --scope project --transport stdio personal-notes -- python3 /home/wook/Documents/ever/mcp_server.py

Verify it's registered:

claude mcp list

Tool: search_notes

Once registered, Claude can call search_notes during any conversation. Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| query | string | Natural-language search query |
| source | string (optional) | Limit to "obsidian" or "yarle" |
| content_type | string (optional) | Limit to "md", "pdf", or "image" |
| top_k | int (optional) | Number of chunks to return (default 5) |
| date_after | string (optional) | ISO date filter, e.g. "2025-03-01" |

Returns raw chunks with collection, file type, date, similarity score, and filename so Claude can synthesise an answer and cite specific notes.

Notes

  • The server loads the embedding model and ChromaDB index once at startup — first call may be slow.
  • Requires the index to be built first (index.py). If the collection doesn't exist, the server exits with an error.

Search

CLI

# One-shot query with LLM answer
python3 search.py "What is the FHIR standard?"

# Interactive mode
python3 search.py

# Filter by source or content type
python3 search.py --source obsidian "WGS validation"
python3 search.py --type pdf "sequencing quality"

# Raw chunks without LLM synthesis
python3 search.py --no-llm "pipeline architecture"

# More context chunks
python3 search.py --top-k 10 "variant calling"

Web UI

python3 web.py
# Open http://localhost:5000

Features: search box, source/type/top-k filters, "chunks only" mode, similarity scores, source file paths.

Architecture

[Markdown/PDF/Image files] → [Indexer] → [ChromaDB vectors]
                                              ↓
                            [User query] → [Retriever] → [Top chunks] → [Qwen3 8B] → [Answer]
  • Embedding model: all-MiniLM-L6-v2 (384-dim, ~80MB)
  • Vector store: ChromaDB with cosine similarity
  • Chunking: 500 chars with 50 char overlap
  • Images: .png, .jpg, .jpeg, .webp; processed newest-first by modification time; skips files < 20KB; default age filter ≤ 30 days (overridden by --backfill)
  • Two-pass vision: Pass 1 (Gemini 2.5 Flash) runs on all new images; Pass 2 (Claude Sonnet) runs only on images Pass 1 succeeded on. Each model produces separate chunks that coexist in the index.
  • PDF OCR fallback: if a PDF has no text layer, each page is rendered at 150 DPI and passed to the primary vision model; disabled by --skip-images
  • Chunk IDs: include the vision model name so multiple models can index the same image independently, and so re-runs skip already-indexed files
  • Date metadata: every chunk stores a note_date field (YYYY-MM-DD) extracted by priority:
    1. Obsidian — date prefix in filename (e.g. 2026-02-27 Epic.md)
    2. Yarle — Created at: line in the first 15 lines of the file
    3. Fallback — file modification time

  This field enables date-range filtering in both the CLI (search.py) and the MCP search_notes tool (date_after parameter).
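The extraction order above can be sketched roughly as follows (the function name and regexes are illustrative, not the actual index.py code):

```python
import os
import re
from datetime import date

def extract_note_date(path: str, text: str, source: str) -> str:
    """Best-effort YYYY-MM-DD for a note, by priority (illustrative sketch)."""
    # 1. Obsidian: date prefix in the filename, e.g. "2026-02-27 Epic.md"
    if source == "obsidian":
        m = re.match(r"(\d{4}-\d{2}-\d{2})", os.path.basename(path))
        if m:
            return m.group(1)
    # 2. Yarle: a "Created at:" line within the first 15 lines of the file
    if source == "yarle":
        for line in text.splitlines()[:15]:
            m = re.search(r"Created at:\s*(\d{4}-\d{2}-\d{2})", line)
            if m:
                return m.group(1)
    # 3. Fallback: file modification time
    return date.fromtimestamp(os.path.getmtime(path)).isoformat()
```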

Cost Estimates

Search is free — answer generation uses qwen3:8b running locally via Ollama.

Costs only apply to image indexing via cloud vision models:

| Model | Pricing | Notes |
| --- | --- | --- |
| Gemini 2.5 Flash (AI Studio key) | Free with rate limits | Primary OCR/description pass |
| Gemini 2.5 Flash (paid API key) | $0.30 / 1M input tokens, $2.50 / 1M output tokens | |
| Claude Sonnet 4.6 (Bedrock) | $3 / 1M input tokens, $15 / 1M output tokens | Secondary interpretation pass; images tokenised at ~1,334 tokens per 1000×1000 px (~$0.004/image) |

Gemini is free when using an AI Studio API key. Rate limits apply (requests per minute/day), which is why backfill indexing is slow — the code retries automatically using the Retry-After header with exponential backoff. Paid Google Cloud keys use token-based billing instead.

Bedrock/Claude is a paid AWS service billed per token. The secondary pass only runs on images where Gemini succeeded.

Approximate one-time backfill cost (based on this collection of ~3,600 images):

  • Gemini pass: free (AI Studio key), or ~$2–5 (paid key, depending on description length)
  • Bedrock pass (~1,400 obsidian images): $6 ($0.004/image × 1,400)
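The per-image figure follows from Anthropic's approximate image tokenisation rule (tokens ≈ width × height / 750), counting input tokens only; a quick check:

```python
# Approximate Bedrock cost for the secondary Sonnet pass (input tokens only;
# the generated description adds a small amount of output-token cost on top).
INPUT_PRICE_PER_TOKEN = 3 / 1_000_000  # $3 per 1M input tokens

def image_cost(width_px: int, height_px: int) -> float:
    tokens = width_px * height_px / 750  # Anthropic's rough tokenisation rule
    return tokens * INPUT_PRICE_PER_TOKEN

per_image = image_cost(1000, 1000)  # ~1,333 tokens, about $0.004
backfill = per_image * 1400         # ~1,400 obsidian images, about $5.60
```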

Ongoing daily indexing processes only new images (last 30 days, ≥20KB), so typical Bedrock costs are cents or less per day.

To check AWS spend: AWS Console → Bedrock → Usage.

Sources

| Collection | Path | Content |
| --- | --- | --- |
| yarle | /home/wook/Documents/evern/yarle1 | ~1,715 Markdown files, ~263 PDFs, ~2,203 images |
| obsidian | /home/wook/Documents/obsidiangit | ~600 Markdown files, ~20 PDFs, ~1,373 images |

Further Details

See technical_walkthrough.md for a deeper dive into the implementation, including:

  • Live-executed examples of the indexing pipeline, chunk ID design, and two-pass cloud vision
  • ChromaDB metadata schema with real chunk samples
  • MCP server internals and search_notes parameter reference
  • Note date extraction logic and migration script walkthrough
  • Dependency versions (verified at document build time via showboat)
