Hybrid RAG Pipeline, Semantic Search, and Multi-Document Intelligence.
As document collections grow, extracting precise answers from large PDF archives becomes critical. InvenioAI is a high-performance Document Q&A system that implements a state-of-the-art Hybrid RAG pipeline.
It transforms static PDF documents into a searchable, intelligent knowledge base, allowing users to ask complex questions and receive answers grounded in retrieved context with verifiable source citations.
- Hugging Face Space: https://felixhrdyn-invenioai.hf.space
- Hybrid RAG Pipeline: Combines dense semantic retrieval (MultiQuery + MMR) with lexical BM25 search, fused via weighted Reciprocal Rank Fusion (RRF).
- Advanced Reranking: Utilizes Cross-Encoder models to re-evaluate top candidates, ensuring the most relevant context is provided to the LLM.
- Async Job Orchestration: Background indexing and query execution with real-time status polling for a smooth user experience.
- Deep Analytics Dashboard: Built-in metrics tracking for retrieval accuracy (nDCG, HitRate), latency, and API usage.
- Cloud-Ready Architecture: Ships with an all-in-one Docker configuration optimized for Hugging Face Spaces and Azure Container Apps.
- Flexible UI: Premium Streamlit interface featuring a custom design system, glassmorphism aesthetics, and interactive chat history.
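The weighted Reciprocal Rank Fusion step mentioned above can be sketched in a few lines; the function name, weights, and `k` constant here are illustrative assumptions, not the project's actual code.

```python
def weighted_rrf(dense_ranked, lexical_ranked, w_dense=0.6, w_lexical=0.4, k=60):
    """Fuse two ranked lists of document IDs via weighted Reciprocal Rank Fusion.

    Each document scores weight / (k + rank) per list it appears in; the fused
    ranking sorts by the summed score, so documents ranked well by both
    retrievers rise to the top.
    """
    scores = {}
    for weight, ranking in ((w_dense, dense_ranked), (w_lexical, lexical_ranked)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = weighted_rrf(["a", "b", "c"], ["b", "d", "a"])
```

Note that "b" outranks "a" after fusion even though "a" wins the dense list alone: appearing near the top of both lists beats a single first place.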
**Backend**
- Framework: FastAPI
- RAG Engine: LangChain
- Models: Google Gemini 3.1 Flash Lite Preview, all-MiniLM-L6-v2 (Local Embedding)
- Reranker: Cross-Encoder (MS-MARCO MiniLM)
- Search: BM25 (Lexical) + Qdrant (Dense)
**Frontend**
- Framework: Streamlit
- Visualization: Plotly, Pandas
- Styling: Vanilla CSS (Custom Design System)
- Icons: Lucide (SVG)
**Infrastructure**
- Vector Database: Qdrant (Local / Server / Cloud)
- Deployment: Docker, GitHub Actions (CI/CD)
- Environment: Python 3.10+
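The dense retriever's MMR (Maximal Marginal Relevance) selection trades query relevance against redundancy among already-selected chunks. A minimal sketch with made-up vectors and a hypothetical `lam` trade-off parameter, not the project's implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mmr(query_vec, doc_vecs, k=2, lam=0.5):
    """Greedily pick k docs maximizing lam*relevance - (1-lam)*redundancy."""
    selected, candidates = [], list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a low `lam`, the second pick favors a dissimilar document over a slightly more relevant near-duplicate; with `lam=1.0` the selection degenerates to pure top-k relevance.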
```mermaid
graph TD
    subgraph Data_Layer [Ingestion Layer]
        PDF[PDF Documents] -->|Upload| API[FastAPI Backend]
        API -->|Chunking| Split[Text Splitter]
    end

    subgraph Intelligence_Layer [Processing & RAG]
        Split -->|Dense| QDR[Qdrant Vector DB]
        Split -->|Lexical| BM25[BM25 Index]
        API -->|Query| RAG[Hybrid Retriever]
        RAG -->|RRF Fusion| Fuse[Candidate Fusion]
        Fuse -->|Reranking| Rerank[Cross-Encoder]
        Rerank -->|Context| LLM[Gemini LLM]
    end

    subgraph Presentation_Layer [UI & Analytics]
        UI[Streamlit Dashboard] -->|REST API| API
        LLM -->|Answer| UI
        API -->|Log| Metrics[Local Metrics Store]
        Metrics -->|Visualize| Dashboard[Analytics Page]
    end
```
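The ingestion path above feeds a text splitter whose overlapping chunks populate both the dense and lexical indexes. A minimal character-based sketch; the chunk size and overlap values are illustrative, not the project's settings:

```python
def split_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks that overlap by `overlap`.

    The overlap keeps sentences that straddle a boundary retrievable from
    at least one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

In practice a recursive splitter that prefers paragraph and sentence boundaries gives cleaner chunks, but the overlap mechanics are the same.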
InvenioAI is optimized for speed and retrieval precision while maintaining low operational costs.
| Parameter | Value | Description |
|---|---|---|
| Retrieval Mode | Hybrid | Dense (MMR) + Lexical (BM25) |
| Fusion Limit | Top 20 | Candidates kept after RRF fusion |
| QA Latency | ~10-15s | Average end-to-end response time |
| Indexing Speed | ~32 chunks/batch | Optimized for memory-constrained runtimes |
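The dashboard's retrieval-accuracy metrics (nDCG, HitRate) can be computed per query from the retrieved ranking and a relevance set. A minimal sketch assuming binary relevance; these helpers are illustrative, not the project's metrics code:

```python
import math

def hit_rate(retrieved, relevant, k=10):
    """1.0 if any relevant doc appears in the top-k results, else 0.0."""
    return 1.0 if any(doc in relevant for doc in retrieved[:k]) else 0.0

def ndcg(retrieved, relevant, k=10):
    """Binary-relevance nDCG@k: log-discounted gain over the ideal ordering."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc in enumerate(retrieved[:k]) if doc in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal else 0.0
```

Averaging these over logged queries yields the dashboard-level numbers.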
- Python 3.10+
- Google Gemini API Key
- Qdrant Instance (Optional, defaults to local storage)
Step 1: Environment Setup

```bash
python -m venv venv
source venv/bin/activate  # venv\Scripts\activate on Windows
pip install -r requirements.txt
cp .env.example .env
```

Step 2: Run Application

```bash
# Terminal 1: Backend API
uvicorn app.main:app --reload

# Terminal 2: Streamlit UI
streamlit run frontend/streamlit_app.py
```

Step 3: Docker (Production)

```bash
docker build -t invenioai .
docker run -p 7860:7860 invenioai
```

The application is configured via `.env`. Key variables include:
- `GEMINI_API_KEY`: Required for the LLM and query rewriting.
- `QDRANT_URL`: Optional server URL (defaults to local `./qdrant_storage`).
- `INVENIOAI_ENABLE_HYBRID_SEARCH`: Toggle dense+lexical mode (default: `1`).
- `INVENIOAI_DELETE_UPLOADED_PDFS`: Clean up storage after indexing (default: `0`).
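Clients consume the async job orchestration described in the features list by polling a status endpoint until the job reaches a terminal state. A generic polling loop; the response shape and any endpoint path it would wrap are assumptions for illustration:

```python
import time

def poll_job(fetch_status, interval=1.0, timeout=60.0):
    """Call fetch_status() until it reports a terminal state or timeout elapses.

    fetch_status is any callable returning a dict like {"status": ..., ...};
    in practice it would wrap an HTTP GET against a job-status endpoint
    (hypothetical path, e.g. /jobs/{job_id}).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_status()
        if job["status"] in ("completed", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")
```

Injecting `fetch_status` as a callable keeps the loop testable without a running backend.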
Felix Hardyan