RAG Chatbot for Company Policy

A Retrieval-Augmented Generation (RAG) chatbot that allows you to upload company policy documents (PDF, TXT) and ask questions about them. The chatbot uses Groq, LangChain, FAISS, and Streamlit for a fast, interactive experience.

Features

  • Document Upload: Upload multiple PDF and TXT files via Streamlit UI
  • Vector Search: Uses FAISS and HuggingFace embeddings for semantic search
  • LLM Integration: Powered by Groq's Llama 3.3 70B model
  • Web UI: Interactive Streamlit frontend with chat history
  • REST API: FastAPI backend with LangServe for extensibility
  • LLMOps tooling: LangServe for serving LangChain chains as REST endpoints

Prerequisites

  • Python 3.13 or later
  • Pipenv (for dependency management)
  • Groq API Key (get one at console.groq.com)

Quick Start

1. Install Dependencies

pipenv install

2. Set Up Environment Variables

Create a .env file in the project root:

GROQ_API_KEY=your_groq_api_key_here
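The backend reads this key from the environment at startup. As a rough illustration of what a .env loader does, here is a stdlib-only sketch; the project itself most likely uses the python-dotenv package's load_dotenv() instead, which also handles quoting and variable expansion:

```python
import os

def load_env(path=".env"):
    """Minimal .env parser: copies KEY=value pairs into os.environ.

    Stdlib-only sketch for illustration; python-dotenv's load_dotenv()
    is the usual, more robust choice.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blank lines, comments, and lines without a '='.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault: a key already set in the shell wins over the file.
            os.environ.setdefault(key.strip(), value.strip())

if os.path.exists(".env"):
    load_env()  # after this, os.getenv("GROQ_API_KEY") returns the key
```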

3. Run the Backend (in one terminal)

pipenv run python -m uvicorn main:app --reload

The backend will start at http://127.0.0.1:8000. You can explore the API at http://127.0.0.1:8000/docs.

4. Run the Frontend (in another terminal)

pipenv run streamlit run client.py

The Streamlit app will open at http://localhost:8501 in your browser.

Usage

  1. Upload Documents: Use the sidebar to upload PDF or TXT files containing your company policies.
  2. Process Files: Click "Process Files" to embed and index the documents.
  3. Ask Questions: Once processed, type your question in the chat input and get answers with source references.
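"Process Files" performs the standard RAG indexing step: split each document into overlapping chunks, embed each chunk, and store the vectors. A sketch of the chunking part is below; the chunk sizes are illustrative, and the project likely uses a LangChain text splitter rather than this hand-rolled version:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size chunks that overlap, so a sentence cut
    at one chunk's boundary still appears whole in the next chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Advance by chunk_size minus overlap so consecutive chunks share text.
        start += chunk_size - overlap
    return chunks
```

Each chunk is then embedded and added to the FAISS index, so questions can be matched against policy passages rather than whole documents.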

Project Structure

Rag_Chatbot_for_company_policy/
├── main.py              # FastAPI backend with RAG chain
├── client.py            # Streamlit frontend UI
├── Pipfile              # Pipenv dependencies
├── Pipfile.lock         # Locked dependency versions
├── .env                 # Environment variables (not in git)
├── .gitignore           # Git ignore file
├── faiss_index/         # FAISS vector store (auto-created)
└── README.md            # This file

Architecture

Backend (main.py)

  • FastAPI server with CORS enabled
  • LangChain chains for RAG logic
  • FAISS vector store for document retrieval
  • Groq LLM for response generation
  • LangServe integration for LLM as a service
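Conceptually, the retrieval step scores every stored chunk's embedding against the question's embedding and returns the closest matches; FAISS does this efficiently at scale, but the idea reduces to a nearest-neighbour search. A toy stdlib illustration (not the project's actual code, and with hand-made 2-dimensional "embeddings"):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, index, k=2):
    """Return the k chunk texts whose embeddings are most similar to the
    query. `index` is a list of (chunk_text, embedding) pairs."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

The retrieved chunks are then stuffed into the prompt alongside the user's question, and the Groq LLM generates an answer grounded in them.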

Frontend (client.py)

  • Streamlit interactive UI
  • File uploader for documents
  • Chat interface with message history
  • Source attribution for retrieved documents

Configuration

Change the LLM Model

Edit main.py and modify the ChatGroq initialization:

llm = ChatGroq(
    model_name="llama-3.3-70b-versatile",  # Change this
    groq_api_key=os.getenv("GROQ_API_KEY")
)

Change the Embedding Model

Edit main.py and modify:

EMBEDDING_MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"  # Change this

Change Streamlit Port

pipenv run streamlit run client.py --server.port 8502

Dependencies

Key packages (see Pipfile for full list):

  • FastAPI: Web framework
  • Streamlit: UI framework
  • LangChain: RAG orchestration
  • Groq: LLM API
  • FAISS: Vector database
  • Uvicorn: ASGI server
  • Pydantic: Data validation

Troubleshooting

Backend Won't Start

  • Ensure GROQ_API_KEY is set in .env
  • Check if port 8000 is already in use: lsof -i :8000 (Mac/Linux) or netstat -ano | findstr :8000 (Windows)

Streamlit Can't Connect to Backend

  • Verify backend is running at http://127.0.0.1:8000
  • Check BACKEND_URL in client.py matches your backend address
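To rule out a network problem quickly, you can probe the backend from Python. The /docs path is FastAPI's built-in docs endpoint; substitute whatever BACKEND_URL points at:

```python
import urllib.request
import urllib.error

def backend_reachable(url="http://127.0.0.1:8000/docs", timeout=2.0):
    """Return True if the backend answers at `url` within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # Any non-server-error response means the server is up.
            return resp.status < 500
    except (urllib.error.URLError, OSError):
        return False
```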

No Documents Found When Asking Questions

  • Ensure you uploaded and processed documents first via the Streamlit sidebar
  • Check browser console for errors

Deploy to Streamlit Community Cloud

  1. Push your code to a GitHub repository:

    git add .
    git commit -m "Initial commit"
    git push origin main
  2. Go to share.streamlit.io

  3. Click "New app" and connect your GitHub repository

  4. Set the main file to client.py

  5. Add GROQ_API_KEY as a secret in Streamlit Cloud settings

Note: Streamlit Community Cloud hosts only the frontend. You'll need to deploy the FastAPI backend separately (e.g., on Heroku, AWS, or Google Cloud) and update BACKEND_URL in client.py to point at it.

Notes

  • The FAISS index is stored locally in the faiss_index/ directory
  • Chat history is stored in Streamlit session state and persists during the session
  • The app uses CORS middleware to allow requests from any origin

License

This project is open source and available under the MIT License.

Support

For issues or feature requests, please open an issue or contact the development team.
