
hanhanwu/RagDoctor


RAG Doctor

Making RAG Reliable for Enterprise Use

The Challenges

  • Most teams don't fail at building RAG. They fail at making it reliable.
  • Teams adopt automated evaluation, but critical decisions still rely heavily on manual review, which is slow, costly, and inconsistent.
  • Existing evaluation and observability tools struggle to adapt as enterprise knowledge continuously evolves.

What RAG Doctor Does

  • ⚡ Provides high-quality evaluation while cutting manual evaluation by up to 90%, and finishes RAG A/B tests in minutes, not days.
  • 🔍 Traces failures back to their root causes: retrieval, chunking, prompts, grounding, or outdated knowledge.
  • 📈 Detects when knowledge base changes invalidate existing ground truth data.
  • 🎯 Delivers insights for both AI teams and business teams in one click.
  • 🚀 Turns RAG evaluation into a fast, iterative loop.
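
To make the stale ground truth idea above concrete, here is a minimal sketch of one possible approach. It is not RAG Doctor's actual implementation: the names `GroundTruthEntry`, `content_hash`, and `find_stale_entries` are hypothetical, and the sketch assumes each ground truth entry records a hash of the source documents it was labeled against.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


def content_hash(text: str) -> str:
    """Fingerprint a document's text so later edits can be detected."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class GroundTruthEntry:
    # Hypothetical record layout, assumed for illustration only.
    question: str
    expected_answer: str
    # doc_id -> hash of the document text at labeling time
    source_hashes: Dict[str, str]


def find_stale_entries(
    entries: List[GroundTruthEntry], knowledge_base: Dict[str, str]
) -> List[Tuple[GroundTruthEntry, List[str]]]:
    """Return (entry, changed_doc_ids) for entries whose sources changed or were removed."""
    stale = []
    for entry in entries:
        changed = [
            doc_id
            for doc_id, old_hash in entry.source_hashes.items()
            if doc_id not in knowledge_base
            or content_hash(knowledge_base[doc_id]) != old_hash
        ]
        if changed:
            stale.append((entry, changed))
    return stale


# Toy knowledge base, version 1, and ground truth labeled against it.
kb_v1 = {"policy": "Refunds within 30 days.", "faq": "Support is open 9-5."}
entries = [
    GroundTruthEntry(
        "What is the refund window?", "30 days",
        {"policy": content_hash(kb_v1["policy"])},
    ),
    GroundTruthEntry(
        "When is support open?", "9-5",
        {"faq": content_hash(kb_v1["faq"])},
    ),
]

# Version 2 edits the refund policy, so the first entry should be flagged.
kb_v2 = dict(kb_v1, policy="Refunds within 14 days.")
stale = find_stale_entries(entries, kb_v2)
for entry, changed in stale:
    print(entry.question, changed)
```

Hashing whole documents is deliberately coarse: any edit flags every entry that cites the document, even when the answer itself is unaffected. A finer-grained variant could hash the cited chunks instead.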


Why This Matters

  • Built from experience running production RAG systems where evaluation became the bottleneck.
  • Auto Evaluation + Root Cause Analysis will become the foundation of enterprise AI.
  • RAG Doctor is the first step toward:
    • automated knowledge base optimization
    • auto-generated ground truth datasets
    • self-improving AI systems

🚀 Roadmap (what we're building next)

  • Iterative improvement loop
  • Improve auto evaluation by adding context such as query quality, data input stats, and the system prompt
  • More precise root cause analysis
  • Automatic golden dataset generation and update
  • Automatic knowledge base selection

🤝 Join Me

This is not just a tool; it helps shape the reliability of future AI products.

Join me if you:

  • care about making AI products reliable
  • want to shape an emerging standard

Ways to contribute:

  • Share feedback
  • Contribute to system design and development
  • Contribute to core algorithms

👉 Project Setup Guide
