notGax/ragwiki

RagWiki

RagWiki is a standalone Wikipedia RAG (retrieval-augmented generation) stack with metadata-gated retrieval.
Instead of running a global search over all chunks first, it tries to identify the likely pages and then runs chunk vector search against only those pages. It can refresh pages from Wikipedia on demand, and each response includes the answer, citations, and routing diagnostics.
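
The two-stage idea can be illustrated with a toy, in-memory sketch. This is not the code in rag-api/: the names `Chunk`, `route_pages`, and `gated_search`, the cosine scoring, the threshold value, and the fall-back to a global search are all assumptions made for illustration.

```python
from dataclasses import dataclass
import math


@dataclass
class Chunk:
    page: str            # title of the page this chunk belongs to
    text: str
    vector: list[float]  # toy embedding; a real stack would use a model


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route_pages(query_vec, page_vectors, threshold, max_pages):
    """Stage 1: keep the best-scoring pages that clear the routing threshold."""
    scored = sorted(
        ((cosine(query_vec, vec), title) for title, vec in page_vectors.items()),
        reverse=True,
    )
    return [title for score, title in scored[:max_pages] if score >= threshold]


def gated_search(query_vec, chunks, page_vectors, threshold=0.5, max_pages=2, top_k=3):
    """Stage 2: rank only the chunks that belong to the routed pages."""
    pages = set(route_pages(query_vec, page_vectors, threshold, max_pages))
    # Hypothetical fallback: with no confident page, search every chunk.
    candidates = [c for c in chunks if c.page in pages] if pages else list(chunks)
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c.vector), reverse=True)
    return ranked[:top_k]


page_vectors = {
    "Python (programming language)": [1.0, 0.0],
    "Monty Python": [0.0, 1.0],
}
chunks = [
    Chunk("Python (programming language)", "Created by Guido van Rossum.", [0.9, 0.1]),
    Chunk("Monty Python", "A British comedy troupe.", [0.1, 0.9]),
    Chunk("Monty Python", "The troupe inspired the language's name.", [0.8, 0.2]),
]
hits = gated_search([1.0, 0.0], chunks, page_vectors)
```

In this toy data the third chunk is close to the query vector, so a global search would return it, but gating drops it because its page did not clear the routing threshold.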

Repository layout

  • web/: Next.js UI and /api/query proxy endpoint.
  • rag-api/: FastAPI service for ingest, query orchestration, routing, and generation.
  • infra/: local Docker Compose stack (Qdrant + rag-api + web).
  • .env.example: required environment variables and routing thresholds.

Quick start

  1. Create a .env file in the repo root:
    • cp .env.example .env (PowerShell: Copy-Item .env.example .env)
  2. Set real values for:
    • GEMINI_API_KEY
    • RAG_API_KEY
    • RAG_ADMIN_KEY
  3. Start all services:
    • cd infra
    • docker compose --env-file ../.env up --build
  4. Open:
    • Website: http://localhost:3000
    • API docs: http://localhost:8000/docs
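
Before starting the stack, it can help to confirm that step 2 was actually done. The following is a minimal sketch of such a check, not a script shipped with the repository. The helper names are hypothetical, and treating any value that starts with `change-this` as a placeholder is an assumption based on the sample key used in the curl examples.

```python
REQUIRED_KEYS = ("GEMINI_API_KEY", "RAG_API_KEY", "RAG_ADMIN_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def unset_keys(text: str, placeholder_prefix: str = "change-this") -> list[str]:
    """Return required keys that are missing, empty, or still a placeholder."""
    values = parse_env(text)
    return [
        key for key in REQUIRED_KEYS
        if not values.get(key) or values[key].startswith(placeholder_prefix)
    ]


sample = "GEMINI_API_KEY=\nRAG_API_KEY=change-this-api-key\nRAG_ADMIN_KEY=s3cret\n"
problems = unset_keys(sample)
```

To check a real file, pass its contents instead of `sample`, for example `unset_keys(open(".env").read())`.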

API quick reference

  • GET /v1/health
  • POST /v1/query with x-api-key
  • POST /v1/ingest with x-api-key
  • POST /v1/admin/reindex with x-admin-key
  • POST /v1/admin/rebuild_page_vectors with x-admin-key

Query example

curl -X POST http://localhost:8000/v1/query \
  -H "content-type: application/json" \
  -H "x-api-key: change-this-api-key" \
  -d "{\"query\": \"Who created Python?\", \"debug\": true}"

Ingest example

curl -X POST http://localhost:8000/v1/ingest \
  -H "content-type: application/json" \
  -H "x-api-key: change-this-api-key" \
  -d "{\"titles\": [\"Python (programming language)\"]}"
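
The two curl examples above can also be issued from Python using only the standard library. This is a sketch under assumptions: `build_request` and `post_json` are hypothetical helper names, the response is assumed to be JSON, and the 90-second default timeout simply mirrors the proxy budget mentioned under Troubleshooting.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"
API_KEY = "change-this-api-key"  # replace with the RAG_API_KEY value from .env


def build_request(path: str, payload: dict, api_key: str = API_KEY) -> urllib.request.Request:
    """Build a JSON POST carrying the x-api-key header, without sending it."""
    return urllib.request.Request(
        url=BASE_URL.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def post_json(request: urllib.request.Request, timeout: float = 90.0) -> dict:
    """Send the request and decode the body, assuming the API replies with JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


ingest_req = build_request("/v1/ingest", {"titles": ["Python (programming language)"]})
query_req = build_request("/v1/query", {"query": "Who created Python?", "debug": True})
```

With the stack running, call `post_json(ingest_req)` and then `post_json(query_req)`. `urllib` raises `urllib.error.HTTPError` for 401 and 429 responses.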

Internals documentation

Deep architecture, pipeline, data model, policy, and test mapping are documented in:

Troubleshooting

  • 401 Invalid API key: verify that RAG_API_KEY / RAG_ADMIN_KEY in .env match the x-api-key / x-admin-key request headers.
  • 429 Rate limit exceeded: both web and rag-api apply in-memory rate limiting; wait briefly and retry.
  • Empty answers or no citations: ingest the relevant pages first, then retry the query with debug: true.
  • Timeout from web: the UI waits 30s, while the server proxy waits up to 90s for rag-api.
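
For scripted clients that hit the 429 case, one option is to retry with exponential backoff. The sketch below is illustrative only: `call_with_backoff` is a hypothetical helper, the delays are arbitrary, and nothing here reflects how RagWiki's own rate limiter is configured. The `send` callable is expected to return a `(status, body)` pair.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-429 status or attempts run out."""
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        sleep(base_delay * 2 ** (attempt - 1))  # wait 1s, 2s, 4s, ...
        status, body = send()
    return status, body


# Simulated server: two rate-limited replies, then success.
responses = iter([(429, "Rate limit exceeded"), (429, "Rate limit exceeded"), (200, "ok")])
delays = []  # records the requested waits instead of actually sleeping
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
```

Passing `sleep=delays.append` keeps the demonstration instant. In real use, leave the default `time.sleep` in place.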

Evaluation harness

Run the canned query evaluation against the local stack:

cd rag-api
python tests/run_eval.py
