This project demonstrates how to build an Agentic Retrieval-Augmented Generation (RAG) pipeline using LangChain, where an LLM-powered agent can decide when and how to retrieve information from your knowledge base, instead of blindly fetching context every time. The agent dynamically chooses retrieval, reasoning, and tool usage, allowing more efficient and context-aware responses.
Some dependencies (like `onnxruntime`) can be tricky to install depending on your OS. Follow the steps for your platform before running `pip install`.
```shell
conda install onnxruntime -c conda-forge
```

See the onnxruntime GitHub issue for extra help.
Follow this guide to install Microsoft C++ Build Tools, making sure to set the environment variable path.
```shell
pip install -r requirements.txt
pip install "unstructured[md]"
```

This step ingests your documents into a Chroma vector database so the agent can retrieve relevant chunks when needed.
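Under the hood, an ingestion step like this typically splits each document into overlapping chunks before embedding them into Chroma. Here is a dependency-free sketch of the chunking idea only; the `chunk_size` and `overlap` values are illustrative, and the actual script most likely uses LangChain's text splitters rather than this function:

```python
def chunk_text(text, chunk_size=300, overlap=50):
    """Split text into overlapping character chunks, as done before embedding.

    Overlap keeps sentence fragments from being cut off at chunk boundaries,
    which helps retrieval return coherent context.
    """
    chunks = []
    start = 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

Each chunk is then embedded and stored in the vector database, so a query can later be matched against chunks rather than whole documents.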
```shell
python create_database.py
```

Instead of a simple retrieval pipeline, here the agent:
- Reads your query.
- Decides whether retrieval is needed.
- Queries Chroma DB only if relevant.
- Combines retrieved context with its own reasoning.
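The steps above can be sketched as a small control loop. The LLM call and Chroma search are stubbed out with placeholder functions — all names here are illustrative, not the project's actual API:

```python
# Sketch of the agentic decision flow: decide, optionally retrieve, answer.

def needs_retrieval(query: str) -> bool:
    # In the real pipeline an LLM makes this call; a trivial keyword
    # heuristic stands in here purely for illustration.
    return "alice" in query.lower()

def retrieve(query: str) -> list:
    # Stand-in for a Chroma similarity search over the ingested chunks.
    return ["Alice first meets the Mad Hatter at the tea party."]

def agentic_answer(query: str) -> str:
    # Retrieval happens only when the agent judges it relevant.
    context = retrieve(query) if needs_retrieval(query) else []
    # The real agent would pass `context` plus the query to the LLM;
    # here we just label which path was taken.
    prefix = "With context: " if context else "Without retrieval: "
    return prefix + query

agentic_answer("How does Alice meet the Mad Hatter?")
# → "With context: How does Alice meet the Mad Hatter?"
```

The key point is the branch: queries that do not need the knowledge base skip the vector search entirely, saving latency and tokens.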
Example query:
```shell
python agentic_query.py "How does Alice meet the Mad Hatter?"
```

You’ll need an OpenAI API key set in your environment variables:
```shell
# Mac/Linux
export OPENAI_API_KEY="your_api_key_here"

# Windows
setx OPENAI_API_KEY "your_api_key_here"
```

While this project is based on traditional RAG examples like Pixegami’s LangChain RAG tutorial, the code here has been adapted for Agentic RAG, allowing more intelligent, context-aware querying.
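Before running the agent, it can save a confusing failure later to confirm that the key from the step above is actually visible to Python. A minimal check (the function name is illustrative, not part of the project):

```python
import os

def require_api_key(env=os.environ):
    """Return the OpenAI API key, or fail with a clear message if unset."""
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; see the commands above.")
    return key
```

Note for Windows users: `setx` affects new terminal sessions only, so open a fresh shell after setting the variable.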
| Feature | Traditional RAG | Agentic RAG |
|---|---|---|
| Context retrieval | ✅ Always | 🔄 Only when needed |
| Reasoning before retrieval | ❌ No | ✅ Yes |
| Multi-tool orchestration | ❌ Limited | ✅ Yes |
| Efficiency | ⚠️ Retrieves even when unnecessary | 🚀 Optimized |