RagWiki is a standalone Wikipedia RAG stack with metadata-gated retrieval.
Instead of searching all chunks globally first, it tries to identify likely pages, then runs chunk vector search against those pages. It can refresh from Wikipedia on demand and returns answer + citations + routing diagnostics.
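
The routing idea above can be illustrated with a small, self-contained sketch. It assumes pages are ranked by cosine similarity against a per-page vector and that chunk search is then restricted to the selected pages. The function name `route_then_search`, the toy vectors, and the cutoffs are hypothetical. The actual router in `rag-api` may use other metadata signals and thresholds.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route_then_search(query_vec, page_vectors, chunks, top_pages=1, top_chunks=3):
    """Two-stage retrieval sketch (hypothetical helper, not the project's API).

    page_vectors: {page_title: vector}
    chunks: list of (page_title, text, vector)
    """
    # Stage 1: pick the most likely pages for the query.
    ranked = sorted(
        page_vectors,
        key=lambda title: cosine(query_vec, page_vectors[title]),
        reverse=True,
    )
    selected = ranked[:top_pages]

    # Stage 2: search only chunks that belong to the selected pages.
    allowed = set(selected)
    candidates = [c for c in chunks if c[0] in allowed]
    candidates.sort(key=lambda c: cosine(query_vec, c[2]), reverse=True)
    return selected, candidates[:top_chunks]


# Toy data: the "Java" chunk is the closest chunk globally,
# but it is gated out because its page was not selected.
page_vectors = {
    "Python (programming language)": [1.0, 0.0],
    "Monty Python": [0.6, 0.8],
    "Java": [0.0, 1.0],
}
chunks = [
    ("Python (programming language)", "Python was created by Guido van Rossum.", [0.9, 0.1]),
    ("Java", "Java was created by James Gosling.", [0.95, 0.05]),
    ("Monty Python", "Monty Python is a British comedy troupe.", [0.5, 0.5]),
]

pages, hits = route_then_search([1.0, 0.0], page_vectors, chunks)
print(pages)
print([text for _, text, _ in hits])
```

In this toy setup a global chunk search would rank the Java chunk first. Gating by page keeps the result on the page the router selected.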
- `web/`: Next.js UI and `/api/query` proxy endpoint.
- `rag-api/`: FastAPI service for ingest, query orchestration, routing, and generation.
- `infra/`: local Docker Compose stack (Qdrant + `rag-api` + `web`).
- `.env.example`: required environment variables and routing thresholds.
- Create env file in repo root:
  `cp .env.example .env` (PowerShell: `Copy-Item .env.example .env`)
- Set real values for:
  `GEMINI_API_KEY`, `RAG_API_KEY`, `RAG_ADMIN_KEY`
- Start all services:
  `cd infra`
  `docker compose --env-file ../.env up --build`
- Open:
  - Website: http://localhost:3000
  - API docs: http://localhost:8000/docs
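
Before starting the stack, it can help to confirm that the three required keys are set. The sketch below parses simple `KEY=VALUE` text and reports keys that are missing or still look like placeholders. The helper names are hypothetical. Treating values that start with `change-this` as placeholders is an assumption based on the `change-this-api-key` value used in this README's sample requests; check `.env.example` for the real defaults.

```python
REQUIRED_KEYS = ("GEMINI_API_KEY", "RAG_API_KEY", "RAG_ADMIN_KEY")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_or_placeholder(values, required=REQUIRED_KEYS):
    """Return required keys that are unset, empty, or look like placeholders."""
    # Assumption: placeholder values start with "change-this".
    return [
        key
        for key in required
        if not values.get(key) or values[key].startswith("change-this")
    ]


# Inline sample instead of a real file, so the sketch runs anywhere.
sample_env = """
# local settings
GEMINI_API_KEY=abc123
RAG_API_KEY=change-this-api-key
"""
problems = missing_or_placeholder(parse_env(sample_env))
print(problems)
```

To check a real file, pass the contents of `.env` to `parse_env`, for example `parse_env(open(".env").read())`.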
- `GET /v1/health`
- `POST /v1/query` with `x-api-key`
- `POST /v1/ingest` with `x-api-key`
- `POST /v1/admin/reindex` with `x-admin-key`
- `POST /v1/admin/rebuild_page_vectors` with `x-admin-key`
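
For callers that prefer Python over curl, the endpoints above can be reached with the standard library alone. This is a minimal sketch: `build_request` and `send` are hypothetical helper names, the JSON response shape is assumed rather than documented here, and the 90-second timeout simply mirrors the proxy timeout mentioned in this README. Admin request bodies are not specified in this README, so only the query payload is shown.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_request(path, payload, key, header_name="x-api-key"):
    """Build a JSON POST request; use header_name="x-admin-key" for admin routes."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"content-type": "application/json", header_name: key},
        method="POST",
    )


def send(request, timeout=90):
    """Send the request and decode the response body as JSON (assumed format)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; send() does.
query_request = build_request(
    "/v1/query",
    {"query": "Who created Python?", "debug": True},
    "change-this-api-key",
)
print(query_request.get_method(), query_request.full_url)
```

With the stack running, `send(query_request)` would perform the call. `urllib` raises `urllib.error.HTTPError` for responses such as 401 or 429.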
curl -X POST http://localhost:8000/v1/query \
-H "content-type: application/json" \
-H "x-api-key: change-this-api-key" \
-d "{\"query\": \"Who created Python?\", \"debug\": true}"

curl -X POST http://localhost:8000/v1/ingest \
-H "content-type: application/json" \
-H "x-api-key: change-this-api-key" \
-d "{\"titles\": [\"Python (programming language)\"]}"

Deep architecture, pipeline, data model, policy, and test mapping are documented in:
- `401 Invalid API key`: verify `RAG_API_KEY`/`RAG_ADMIN_KEY` in `.env` and request headers.
- `429 Rate limit exceeded`: both `web` and `rag-api` have in-memory rate limiting.
- Empty answers/no citations: ingest pages first, then retry query with `debug: true`.
- Timeout from web: UI waits 30s; server proxy waits up to 90s for `rag-api`.
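
To make the 429 behavior easier to reason about, here is one common shape for an in-memory limiter: a sliding window per client. This README does not describe how the limiters in `web` and `rag-api` are implemented, so the class name, limits, and window length below are all hypothetical. Because the state lives in process memory, counters of this kind reset on restart and are not shared between processes.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the demo can control time
        self._hits = defaultdict(deque)  # client_id -> request timestamps

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # a server would answer 429 here
        hits.append(now)
        return True


# Demo with a fake clock: 2 requests per 60 seconds (illustrative limits).
fake_now = [0.0]
limiter = SlidingWindowLimiter(2, 60, clock=lambda: fake_now[0])
results = [limiter.allow("client-a") for _ in range(3)]
fake_now[0] = 61.0  # move past the window
after_window = limiter.allow("client-a")
print(results, after_window)
```

A client that receives 429 can wait for the window to pass and retry, rather than resending immediately.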
Run canned query evaluation against the local stack:
cd rag-api
python tests/run_eval.py