prantik2003/webdoc-ai

WebDoc AI 🧠📄

WebDoc AI is a full-stack, enterprise-grade Retrieval-Augmented Generation (RAG) application. It allows users to upload various documents (PDFs, Word documents, text files) or provide web URLs, and instantly chat with an intelligent assistant to extract insights, summarize information, and answer complex questions based strictly on the provided context.


πŸ—οΈ Architecture & Tech Stack

The application is built using a modern three-tier architecture to ensure security, modularity, and high performance.

1. Frontend: React Interface (/frontend-react)

  • Tech: React 18, Vite, CSS (Glassmorphism UI), Lucide React, React Markdown.
  • Role: The user-facing interface. It manages chat sessions, handles file selection, and beautifully renders AI responses in Markdown format.

2. API Proxy: Node.js Server (/backend-node)

  • Tech: Express.js, Multer, Axios.
  • Role: Acts as a secure middleware layer. It intercepts file uploads from the frontend, temporarily stores them using Multer, and formats the payloads before securely proxying them to the core Python engine.

3. Core RAG Engine: Python/FastAPI (/rag-engine-python)

  • Tech: FastAPI, LangChain, Pinecone (Vector DB), Google Gemini API, HuggingFace, FlashRank.
  • Role: The "brain" of the application. It handles text extraction, chunking, vector embedding, semantic retrieval, and LLM generation.

βš™οΈ System Workflow

Here is exactly what happens under the hood when you use WebDoc AI:

sequenceDiagram
    actor User
    participant Frontend as Frontend (React)
    participant Proxy as Proxy (Node.js)
    participant Engine as Engine (FastAPI)
    participant Pinecone as Pinecone (Vector DB)
    participant LLM as LLM (Gemini)

    %% Document Upload Phase
    Note over User, LLM: Phase 1: Document/URL Ingestion
    User->>Frontend: Uploads Document / Enters URL
    Frontend->>Proxy: POST /api/upload
    Proxy->>Engine: Forwards Document
    Engine->>Engine: Extract Text & Split into Chunks
    Engine->>Engine: Generate Vector Embeddings
    Engine->>Pinecone: Store Vectors in isolated Session Namespace
    Pinecone-->>Engine: Acknowledge
    Engine-->>Frontend: Return unique Session ID

    %% Chat Phase
    Note over User, LLM: Phase 2: Q&A Retrieval
    User->>Frontend: Asks a Question
    Frontend->>Proxy: POST /api/ask (Query + History + Session ID)
    Proxy->>Engine: Forward Query
    Engine->>LLM: Rewrite query based on Chat History
    Engine->>Pinecone: Semantic Search for relevant Chunks
    Pinecone-->>Engine: Return top matching context
    Engine->>Engine: Re-rank results for maximum accuracy
    Engine->>LLM: Generate answer using retrieved Context
    LLM-->>Engine: Streaming / Final Answer
    Engine-->>Frontend: Display Markdown Answer & Sources
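The two phases in the diagram can be sketched as a toy, in-memory pipeline. This is a minimal illustration under stated assumptions, not the project's code: the dictionary `_STORE` stands in for Pinecone, the bag-of-words `embed` function stands in for a real embedding model, and the names `ingest` and `retrieve` are hypothetical. What it does show is the isolation idea: each upload gets its own UUID namespace, and a query only ever searches that namespace.

```python
import math
import uuid
from collections import Counter
from typing import Dict, List, Tuple

# In-memory stand-in for a vector database: namespace -> list of (vector, chunk).
_STORE: Dict[str, List[Tuple[Counter, str]]] = {}


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (a real engine would call an embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def ingest(chunks: List[str]) -> str:
    """Phase 1: store chunk vectors under a fresh session namespace and return its ID."""
    session_id = str(uuid.uuid4())
    _STORE[session_id] = [(embed(c), c) for c in chunks]
    return session_id


def retrieve(session_id: str, query: str, top_k: int = 2) -> List[str]:
    """Phase 2: rank only the chunks stored under this session's namespace."""
    q = embed(query)
    scored = sorted(_STORE.get(session_id, []), key=lambda item: cosine(q, item[0]), reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


if __name__ == "__main__":
    session_a = ingest(["Pinecone stores vectors", "React renders the chat"])
    session_b = ingest(["Unrelated notes about cooking"])
    print(retrieve(session_a, "where are vectors stored", top_k=1))
```

Because `retrieve` looks up a single namespace, chunks ingested in `session_b` can never appear in answers for `session_a`, which mirrors the per-session isolation described in the workflow.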

Detailed Breakdown:

  1. Ingestion: When a file or URL is uploaded, specialized Python loaders extract the raw text. The text is split into overlapping chunks to preserve context.
  2. Embedding & Storage: These chunks are converted into numerical representations (embeddings) and stored in a Pinecone Serverless database. Each upload gets a unique UUID to isolate data across different chat sessions.
  3. Contextual Retrieval: When a user asks a question, the system uses the previous chat history to rewrite it into a standalone query, so follow-ups such as "what about the second one?" still resolve correctly. It then searches the session's Pinecone namespace for the most semantically similar chunks and re-ranks them before they are passed on.
  4. Generation: An LLM (Large Language Model) reads the retrieved context and formulates a human-readable response, so that the answer is grounded in the uploaded document or URL rather than in the model's general knowledge.
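The overlapping split from step 1 can be illustrated with a small character-window splitter. This is a sketch only: the function name `split_into_chunks` and the default sizes are assumptions for illustration, and the actual engine may well use a LangChain text splitter with different settings.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = "abcdefghijklmnopqrstuvwxyz"
    for chunk in split_into_chunks(sample, chunk_size=10, overlap=3):
        print(chunk)
```

Each chunk repeats the last `overlap` characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.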

🚀 How to Run Locally

Since this is a three-tier app, you will need three separate terminal windows to run it.

Prerequisites

  1. Rename .env.example to .env (or create a new .env) in rag-engine-python and add your API keys:
    PINECONE_API_KEY=your_key
    GEMINI_API_KEY=your_key
    # Add any other required keys (e.g. HuggingFace)
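Before starting the engine, it can help to confirm that the keys are actually filled in. The following is a hedged, standard-library sketch: `parse_env` and `missing_keys` are hypothetical helpers, the required key names are the two listed above, and treating the literal placeholder `your_key` as "not set" is an assumption.

```python
from typing import Dict, Iterable, List, Mapping

REQUIRED_KEYS = ("PINECONE_API_KEY", "GEMINI_API_KEY")
PLACEHOLDER = "your_key"  # assumption: the example value means "not filled in yet"


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return required names that are absent, blank, or still the placeholder."""
    return [name for name in required if env.get(name, "").strip() in ("", PLACEHOLDER)]


if __name__ == "__main__":
    example = "PINECONE_API_KEY=abc123\nGEMINI_API_KEY=your_key\n"
    print(missing_keys(parse_env(example)))
```

To check a real file, read rag-engine-python/.env and pass its contents to `parse_env`.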

Step 1: Start the Python Engine

cd rag-engine-python
python -m venv venv
# Windows: .\venv\Scripts\Activate.ps1
# Mac/Linux: source venv/bin/activate
pip install -r requirements.txt
python -m uvicorn main:app --reload

(Runs on port 8000)

Step 2: Start the Node.js Proxy

cd backend-node
npm install
node server.js

(Runs on port 5000)

Step 3: Start the Frontend UI

cd frontend-react
npm install
npm run dev

(Runs on port 5173)

Navigate to http://localhost:5173 in your browser to start chatting with your documents!
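If the page does not load, a quick way to see which tier is down is to probe the three ports from the steps above. This is a minimal sketch using only the standard library; the `SERVICES` labels and the helper names are illustrative, and a successful TCP connection only shows that something is listening, not that the service is healthy.

```python
import socket
from typing import Dict

# Ports taken from the steps above; the labels are illustrative.
SERVICES = {"rag-engine-python": 8000, "backend-node": 5000, "frontend-react": 5173}


def is_listening(port: int, host: str = "localhost", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services: Dict[str, int] = SERVICES) -> Dict[str, bool]:
    """Map each service label to whether its port accepts connections."""
    return {name: is_listening(port) for name, port in services.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'up' if up else 'not reachable'}")
```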
