This is a lightweight Retrieval-Augmented Generation (RAG) app that lets users upload a PDF document and ask questions about its content. It's designed for domain-specific question answering using OpenAI's LLMs and ChromaDB for local vector search.
- Upload any PDF (e.g. academic papers, contracts, manuals)
- Ask natural language questions about the document
- Uses SentenceTransformer embeddings and OpenAI's gpt-3.5-turbo for answers
- Built with Streamlit for a quick, interactive UI
- Chunking with overlap to preserve semantic context
- Stores and queries vector representations locally via ChromaDB
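To illustrate the chunking-with-overlap feature listed above, here is a minimal character-based sketch. The function name `chunk_text` and the default sizes are assumptions made for this example; the actual logic in processor.py may split on tokens or sentences instead.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks that share `overlap` characters.

    Hypothetical helper for illustration only; not taken from processor.py.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

Because each chunk repeats the tail of the previous one, a sentence that straddles a boundary still appears whole in at least one chunk, provided it is shorter than the overlap.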
pip install -r requirements.txt
1. Clone and enter the repo:
   git clone https://github.com/your-username/RAG-Based-Domain-Specific-Q-A-System.git
   cd RAG-Based-Domain-Specific-Q-A-System
2. Create a virtual environment:
   python3 -m venv venv
   source venv/bin/activate  # For Windows: venv\Scripts\activate
3. Install dependencies:
   pip install -r requirements.txt
4. Set up the .env file:
   OPENAI_API_KEY=your-openai-api-key-here
5. Run the app:
   streamlit run app/main.py
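If the app fails at startup, a common cause is a missing or misnamed key in the .env file. The sketch below is one way to check that OPENAI_API_KEY is available before launching. It uses only the standard library; the helper names `load_env_file` and `get_openai_key` are hypothetical, and the app itself may load the file differently (for example with a dotenv package).

```python
import os
from pathlib import Path
from typing import Dict


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines from an env file (hypothetical helper)."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_openai_key(path: str = ".env") -> str:
    """Return the key from the process environment, falling back to the env file."""
    key = os.environ.get("OPENAI_API_KEY") or load_env_file(path).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set in the environment or in " + path)
    return key
```

Calling `get_openai_key()` from the repo root either returns the key or raises a clear error, which is easier to diagnose than an authentication failure on the first question.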
Once you upload a PDF, try asking:
- "What is the main contribution of this paper?"
- "Which algorithm is used?"
- "What are the limitations discussed?"
- "Explain the methodology"
- "List the datasets used"
- Chunk size and overlap can be tweaked in processor.py
- Embedding model can be changed in vector_store.py
- Local vector DB (ChromaDB) is reset on reruns unless persisted in vector_db/
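For the persistence note above, a minimal sketch of opening an on-disk ChromaDB collection is shown here. It assumes a chromadb release that provides `PersistentClient` (0.4 or later); the function name `open_persistent_collection` and the collection name `pdf_chunks` are made up for this example and may not match vector_store.py.

```python
def open_persistent_collection(path: str = "vector_db", name: str = "pdf_chunks"):
    """Open (or create) a collection whose data is stored on disk under `path`.

    Hypothetical helper; the collection name is an assumption for illustration.
    """
    # Imported here so this module can be loaded even where chromadb is absent.
    import chromadb

    # PersistentClient writes to `path`, so stored vectors survive app reruns.
    client = chromadb.PersistentClient(path=path)
    return client.get_or_create_collection(name=name)


if __name__ == "__main__":
    collection = open_persistent_collection()
    print(collection.count())  # number of stored items in the collection
```

An in-memory client, by contrast, starts empty on every rerun, which matches the reset behavior described in the note.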
Feel free to reach out if you want to deploy or scale this further.
- Name: Vraj Desai
- Email: vrajdhar@usc.edu
- Contact: +1 2136915656