---
title: FinRAG
emoji: 📊
colorFrom: blue
colorTo: gray
sdk: docker
app_port: 7860
pinned: false
---
Production-style retrieval-augmented generation system for SEC 10-K analysis.
FinRAG ingests annual filings, parses them into structured markdown, builds a hybrid retrieval index and serves grounded answers through a FastAPI backend and React frontend. The project is designed to answer filing-specific questions with traceable source passages rather than free-form financial commentary.
- Ingests 10-K PDF filings with LlamaCloud parsing.
- Normalizes parsed markdown into section-aware chunks with metadata.
- Stores embeddings in a persistent Chroma collection.
- Combines dense retrieval, BM25 sparse retrieval, RRF fusion, and cross-encoder reranking.
- Serves answers with retrieved source passages through an API and web UI.
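Each chunk carries section-aware metadata so answers can be traced back to the filing. A minimal sketch of what one chunk record might look like (the field names are illustrative, not the exact schema in `rag/ingestion.py`):

```python
# Illustrative chunk record; the real schema lives in rag/ingestion.py and may differ.
chunk = {
    "doc_id": "tesla-2025-10k",            # assumed document identifier
    "source": "Tesla 2025 10-K",
    "section": "Item 1A. Risk Factors",    # filing header detected during normalization
    "page": 14,                            # page from the LlamaCloud parse
    "text": "We face significant competition in ...",
}
```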
```
PDF upload / local filing
   |
   v
LlamaCloud parse -> page markdown -> normalized sections -> rechunked chunks
   |
   v
Embeddings (BAAI/bge-large-en-v1.5)
   |
   v
Chroma vector store + BM25 index
   |
   v
RRF fusion -> cross-encoder reranker
   |
   v
LLM answer generation with source passages
   |
   v
FastAPI API + React frontend
```
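Chunks are embedded with `BAAI/bge-large-en-v1.5`. A minimal sketch using `sentence-transformers` (the project wraps the model in `rag/embeddings.py`, possibly through a different interface):

```python
from sentence_transformers import SentenceTransformer

# Assumes the sentence-transformers package; rag/embeddings.py may load the model differently.
model = SentenceTransformer("BAAI/bge-large-en-v1.5")

chunks = [
    "Item 1A. Risk Factors: We face significant competition ...",
    "Item 7. Management's Discussion and Analysis ...",
]
vectors = model.encode(chunks, normalize_embeddings=True)  # bge models are typically used with normalized vectors
print(vectors.shape)  # (2, 1024): bge-large-en-v1.5 produces 1024-dimensional embeddings
```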
FinRAG uses a multi-stage retrieval stack rather than a single vector search.
- Parse the 10-K into markdown with page-level structure.
- Split by filing headers, then rechunk long sections while preserving tables.
- Embed chunks with `BAAI/bge-large-en-v1.5`.
- Retrieve candidates from:
  - Chroma dense similarity search
  - BM25 sparse keyword search
- Merge candidate lists with Reciprocal Rank Fusion (sketched below).
- Rerank the merged shortlist with `BAAI/bge-reranker-base`.
- Generate the final answer from retrieved evidence only.
This design improves recall on both semantic and keyword-heavy questions, which matters for filings containing exact legal phrasing, section titles, and numeric disclosures.
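A minimal sketch of the Reciprocal Rank Fusion step over the dense and sparse candidate lists (k=60 is the common default; `rag/retriever.py` may weight or filter candidates differently):

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge ranked lists of chunk IDs (best first) into one fused ranking."""
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, chunk_id in enumerate(ranked, start=1):
            scores[chunk_id] += 1.0 / (k + rank)  # lower-ranked hits contribute less
    return sorted(scores, key=scores.get, reverse=True)

# Example: fuse Chroma (dense) and BM25 (sparse) candidates.
dense_hits = ["c12", "c07", "c31", "c02"]
sparse_hits = ["c07", "c44", "c12"]
print(reciprocal_rank_fusion([dense_hits, sparse_hits]))
# Chunks found by both retrievers (c07, c12) rise to the top of the fused list.
```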
- Backend: FastAPI, Python
- Retrieval: ChromaDB, BM25, RRF fusion, cross-encoder reranking
- Embeddings: `BAAI/bge-large-en-v1.5`
- Parsing: LlamaCloud
- LLM serving: Groq via `langchain_groq`
- Frontend: React + Vite
- Packaging: Docker
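A minimal sketch of answer generation with Groq via `langchain_groq` (the model name and prompt are assumptions for illustration; the real prompting and structured output live in `rag/pipeline.py`):

```python
from langchain_groq import ChatGroq  # reads GROQ_API_KEY from the environment

# Model name is an assumption; the configured model lives in config/settings.py.
llm = ChatGroq(model="llama-3.1-8b-instant", temperature=0)

context = "Item 1A. Risk Factors: We face significant competition ..."
question = "What are the main risk factors disclosed in this filing?"
prompt = (
    "Answer the question using only the excerpts below.\n\n"
    f"Excerpts:\n{context}\n\nQuestion: {question}"
)
print(llm.invoke(prompt).content)
```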
```
FinRAG/
├─ app/
│  ├─ app.py           # FastAPI application
│  └─ main.py          # CLI entrypoint for ingestion and query
├─ config/
│  └─ settings.py      # Central configuration
├─ rag/
│  ├─ ingestion.py     # Parse, normalize, rechunk, persist artifacts
│  ├─ embeddings.py    # Embedding model wrapper
│  ├─ retriever.py     # Dense + BM25 + RRF + reranker
│  ├─ pipeline.py      # Prompting and structured query output
│  └─ vectorstore.py   # Chroma collection management
├─ frontend/
│  └─ src/             # React client
├─ data/
│  ├─ raw/             # Uploaded / local 10-K PDFs
│  ├─ processed/       # Saved parse artifacts and chunk manifests
│  └─ vectorstore/     # Persistent Chroma data
├─ Dockerfile
└─ requirements.txt
```
- Python 3.10+
- Node.js 18+
- Groq API key
- LlamaCloud API key
Create a `.env` file:

```
GROQ_API_KEY=...
LLAMA_CLOUD_API_KEY=...
```

Set up and install the backend:

```bash
cd FinRAG
python -m venv .venv
.venv\Scripts\activate
python -m pip install --upgrade pip
pip install -r requirements.txt
```

Run ingestion:

```bash
python -m app.main --ingest
```

This will:
- parse the most recent PDF in `data/raw/` if no explicit path is wired in,
- create processed parse artifacts under `data/processed/<doc_id>/`,
- build or extend the Chroma collection.
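A minimal sketch of how the persistent Chroma collection can be built or extended (the collection name and ID scheme are assumptions; the real logic lives in `rag/vectorstore.py`):

```python
import chromadb

# Collection name and ID format are illustrative; see rag/vectorstore.py for the real setup.
client = chromadb.PersistentClient(path="data/vectorstore")
collection = client.get_or_create_collection("finrag_chunks")

collection.upsert(                      # upsert lets repeated ingestion extend, not duplicate
    ids=["tesla-2025-10k:0001"],
    embeddings=[[0.01] * 1024],         # 1024-dim vectors from BAAI/bge-large-en-v1.5
    documents=["Item 1A. Risk Factors: We face significant competition ..."],
    metadatas=[{"source": "Tesla 2025 10-K", "section": "Item 1A. Risk Factors"}],
)
print(collection.count())
```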
Run the API:

```bash
uvicorn app.app:app --reload --port 8000
```

Run the frontend:

```bash
cd frontend
npm install
npm run dev
```

The frontend runs on http://localhost:5173 and calls the backend on http://localhost:8000.
The backend exposes these endpoints:

- `GET /health` - liveness check plus current vector-store document count
- `POST /api/upload` - upload a PDF and trigger ingestion
- `POST /api/query` - submit a natural-language question against the indexed filings
- `POST /api/reset-index` - clear the vector index for a fresh dev session
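A minimal sketch of querying the API from Python (the `question` field name is an assumption; check the request model in `app/app.py`):

```python
import requests

# The "question" key is assumed for illustration; the actual schema is defined in app/app.py.
resp = requests.post(
    "http://localhost:8000/api/query",
    json={"question": "What are the main risk factors disclosed in this filing?"},
    timeout=60,
)
resp.raise_for_status()
payload = resp.json()
print(payload["answer"])
for src in payload.get("sources", []):
    print(src["section"], src["score"])
```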
You can also query from the CLI:

```bash
python -m app.main --query "What are the main risk factors disclosed in this filing?"
```

Expected response shape:
```json
{
  "answer": "...",
  "top_retrieval_score": 0.61,
  "sources": [
    {
      "source": "Tesla 2025 10-K",
      "section": "Item 1A. Risk Factors",
      "score": 0.61,
      "preview": "..."
    }
  ]
}
```