RAG Knowledge Bot

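An enterprise knowledge-base support bot built on a RAG (Retrieval-Augmented Generation) architecture. Python + FastAPI + LangChain + ChromaDB, with document upload, vector retrieval, source citations, and streaming replies.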

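📌 Why this project exists

An LLM that answers questions directly will hallucinate and make claims about things it does not know. RAG is currently the most common remedy in industry:

"Look up the answer in the knowledge base first, then have the LLM answer using what was found."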

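This demo implements the full RAG pipeline:

Document processing: upload PDF / Markdown / TXT, with automatic chunking and embedding
Vector retrieval: ChromaDB + sentence-transformers, millisecond-level retrieval
Answer generation: the retrieved chunks are inserted into the prompt so the LLM answers and cites its sources
Streaming: the answer is pushed to the frontend incrementally over SSE

Works with any LLM (OpenAI / Anthropic / Ollama local models); the default is Ollama, for a local demo.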

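🎯 Core features

Feature	Description
📄 Document upload	Accepts PDF, Markdown, plain text, and HTML; text is extracted automatically
✂️ Smart chunking	Recursive character splitter that keeps chunks semantically intact
🧮 Vector index	sentence-transformers embeddings stored in ChromaDB
🔍 Semantic retrieval	Top-k vector similarity search with MMR for diversity
💬 Generative answers	Answers with cited sources (cite as [doc:1])
📡 Streaming SSE	Answer pushed incrementally for a smooth UX
🔄 Swappable LLM	Choose OpenAI, Anthropic, or Ollama
📊 Retrieval evaluation	Built-in test set with automated Recall@k evaluation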
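
To illustrate the "top-k + MMR" retrieval row, here is a minimal pure-Python sketch of maximal marginal relevance re-ranking. It is not the project's retriever (which presumably delegates to LangChain / ChromaDB); the names `cosine`, `mmr_select`, and `lambda_mult` are assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(query_vec, doc_vecs, k=4, lambda_mult=0.5):
    """Return indices of up to k documents picked by maximal marginal relevance.

    lambda_mult=1.0 ranks purely by relevance to the query;
    lower values penalise similarity to documents already selected.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < k:

        def score(i):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a low `lambda_mult`, a near-duplicate of the top hit is skipped in favour of a less similar chunk, which is the diversity effect the table refers to.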

🏗 System architecture

                ┌─────────────────────────┐
        upload  │   /api/docs/upload      │
        ──────► │   parse → split → embed │
                └────────────┬────────────┘
                             ▼
                    ┌──────────────────┐
                    │  ChromaDB        │
                    │  (vector store)  │
                    └─────────▲────────┘
                              │ retrieve
                ┌─────────────┴──────────────┐
        query   │   /api/chat (SSE)          │
        ──────► │   retrieve → prompt → LLM  │
                └──────────────┬─────────────┘
                               ▼
                       ┌────────────────┐
                       │  OpenAI /      │
                       │  Anthropic /   │
                       │  Ollama        │
                       └────────────────┘

🚀 Quick start

Prerequisites

Python 3.11+
(Recommended) Ollama for running a local LLM, which avoids API fees

Local

git clone https://github.com/LeoTang0127/rag-knowledge-bot.git
cd rag-knowledge-bot

python -m venv .venv
source .venv/bin/activate          # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# Start Ollama and pull the models (or use OpenAI instead; see .env.example)
ollama pull llama3.2:3b
ollama pull nomic-embed-text

# Start the API
uvicorn app.main:app --reload --port 8000

Open http://localhost:8000/docs to see the Swagger UI.

Docker

docker compose up

📚 Usage examples

1. Upload a document

curl -X POST http://localhost:8000/api/docs/upload \
  -F "file=@sample-docs/sop.pdf" \
  -F "tag=manufacturing"

{
  "docId": "doc_a1b2c3",
  "title": "SOP Standard Operating Procedure",
  "chunks": 42,
  "tag": "manufacturing"
}

2. Query (SSE streaming)

curl -N -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"question": "How often should the solder paste printer be cleaned?", "topK": 4}'

event: chunk
data: {"text": "According to "}

event: chunk
data: {"text": "the SOP document, "}

...

event: sources
data: [{"docId":"doc_a1b2c3","title":"SOP Standard Operating Procedure","page":7,"score":0.89}]

event: done
data: {}
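
On the client side, the stream above can be consumed by grouping `event:` / `data:` lines into messages. The following is a simplified sketch of that parsing, assuming only the three event names shown in the sample output (`chunk`, `sources`, `done`) and JSON payloads; `parse_sse` and `collect_answer` are hypothetical helper names, not part of this repo.

```python
import json


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines.

    A blank line terminates one message; lines starting with ':' are comments.
    """
    event, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith(":"):
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops a single leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    if data_lines:  # stream ended without a trailing blank line
        yield event, json.loads("\n".join(data_lines))


def collect_answer(lines):
    """Concatenate chunk texts and return (answer, sources)."""
    parts, sources = [], []
    for event, data in parse_sse(lines):
        if event == "chunk":
            parts.append(data["text"])
        elif event == "sources":
            sources = data
        elif event == "done":
            break
    return "".join(parts), sources
```

In practice `lines` would come from a streaming HTTP response (for example, iterating over the response body line by line); a plain list of strings works the same way for testing.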

3. Document management

GET    /api/docs              # List all documents
DELETE /api/docs/{docId}      # Delete (including the vector index)
GET    /api/docs/{docId}/chunks  # Inspect the chunking result

Full API spec: docs/API.md

🗂 Project structure

app/
├── main.py                 # FastAPI app initialization
├── api/                    # Routes
│   ├── chat.py             # Chat (SSE)
│   ├── docs.py             # Document management
│   └── health.py
├── domain/                 # Business logic (pure Python)
│   ├── splitter.py         # Chunking strategy
│   ├── retriever.py        # Retrieval logic
│   └── prompt.py           # Prompt templates
└── infrastructure/         # External dependencies
    ├── vectorstore.py      # ChromaDB wrapper
    ├── embedder.py         # Embedding model
    └── llm.py              # OpenAI / Anthropic / Ollama
tests/                      # pytest
docs/
├── API.md                  # API spec
├── RAG_DESIGN.md           # RAG architecture design and trade-offs
└── DEVELOPMENT.md          # Development guidelines
sample-docs/                # Sample documents (usable for the demo)

🧪 Tests and evaluation

pytest                              # Run all tests
pytest --cov=app                    # With coverage
python -m app.eval.run              # Run the retrieval quality evaluation

Evaluation report (sample):

Recall@1:  0.78
Recall@3:  0.92
Recall@5:  0.96
MRR:       0.84
Average retrieval time: 12ms
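
For readers unfamiliar with the metrics in the report, here is a rough sketch of how Recall@k and MRR can be computed. It assumes each test question has exactly one gold chunk id, so Recall@k reduces to a hit rate; the actual `app.eval.run` module may define things differently, and `evaluate` is a hypothetical name.

```python
def evaluate(results, ks=(1, 3, 5)):
    """Compute Recall@k and MRR.

    results: list of (ranked_ids, gold_id) pairs, one per test question,
    where ranked_ids is the retriever's output in rank order.
    """
    if not results:
        raise ValueError("results must not be empty")
    n = len(results)
    report = {}
    for k in ks:
        # A question counts as a hit if the gold id appears in the top k.
        hits = sum(1 for ranked, gold in results if gold in ranked[:k])
        report[f"recall@{k}"] = hits / n
    reciprocal_ranks = 0.0
    for ranked, gold in results:
        if gold in ranked:
            reciprocal_ranks += 1.0 / (ranked.index(gold) + 1)
    report["mrr"] = reciprocal_ranks / n
    return report
```

A question whose gold chunk is never retrieved contributes 0 to both metrics.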

⚙️ Configuration

Copy .env.example to .env and edit it:

# LLM provider (ollama / openai / anthropic)
LLM_PROVIDER=ollama
LLM_MODEL=llama3.2:3b
OLLAMA_BASE_URL=http://localhost:11434

# To use OpenAI
# LLM_PROVIDER=openai
# OPENAI_API_KEY=sk-...
# LLM_MODEL=gpt-4o-mini

# Embedding (using the same provider as the LLM is recommended, to reduce dependencies)
EMBED_PROVIDER=ollama
EMBED_MODEL=nomic-embed-text

# Vector store
CHROMA_PATH=./data/chroma

# Chunking parameters
CHUNK_SIZE=800
CHUNK_OVERLAP=100

⚠️ Never commit .env to Git; it is already excluded in .gitignore.
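
To show what CHUNK_SIZE and CHUNK_OVERLAP control, here is a simplified sketch that reads the two settings and splits text with a fixed sliding window. This is deliberately not the recursive character splitter the project uses (that one also tries to break on separators); it assumes sizes are measured in characters, and `load_chunk_settings` and `split_text` are hypothetical names.

```python
import os


def load_chunk_settings(env=None):
    """Read CHUNK_SIZE / CHUNK_OVERLAP, defaulting to the values in the sample .env."""
    env = os.environ if env is None else env
    size = int(env.get("CHUNK_SIZE", "800"))
    overlap = int(env.get("CHUNK_OVERLAP", "100"))
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need CHUNK_SIZE > 0 and 0 <= CHUNK_OVERLAP < CHUNK_SIZE")
    return size, overlap


def split_text(text, size, overlap):
    """Split text into windows of `size` characters sharing `overlap` characters."""
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means the end of one chunk is repeated at the start of the next, so a sentence cut at a boundary still appears whole in at least one chunk when it is shorter than the overlap.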

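🛠 Tech stack

Category	Choice	Why
Runtime	Python 3.11	The mainstream language for AI work in industry
Web	FastAPI	async + automatic OpenAPI
LLM abstraction	LangChain (core)	Unified interface, active community
Vector DB	ChromaDB	Embedded, zero deployment, well suited to a demo
Embedding	nomic-embed-text (Ollama)	Open source, good quality, runs locally
Test	pytest + httpx	async-friendly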

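📐 Design trade-offs

Why ChromaDB rather than Pinecone / Weaviate?

ChromaDB is embedded and needs no deployment, which suits a demo or a small internal knowledge base. To scale up (>100M chunks), switching to Qdrant or pgvector is recommended.

Why is Ollama the default?

So that anyone who clones the repo can run it without an API key. Recommendations for production:

Privacy-sensitive: Ollama on-prem
Quality first: OpenAI gpt-4o-mini or Anthropic claude-3.5-haiku
Cost-sensitive: a self-hosted small model + RAG (which this demo already demonstrates)

Why are citations mandatory in answers?

LLM hallucination also occurs in RAG, where the model says things that are not in the knowledge base. Mandatory citations, combined with an explicit prompt instruction to "say you don't know if you can't answer", can significantly reduce hallucination.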
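
The citation requirement could be expressed roughly as follows. This is an illustrative sketch, not the template in `app/domain/prompt.py`: the wording, the chunk dict shape (`title`, `text`), and the helper names `build_prompt` and `invalid_citations` are all assumptions. Only the `[doc:N]` marker format comes from the feature list.

```python
import re


def build_prompt(question, chunks):
    """Build a prompt that numbers each retrieved chunk as [doc:N].

    chunks: list of dicts with 'title' and 'text' keys (hypothetical shape).
    """
    context = "\n\n".join(
        f"[doc:{i}] ({c['title']})\n{c['text']}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n"
        "Cite every claim with its marker, e.g. [doc:1].\n"
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def invalid_citations(answer, n_chunks):
    """Return cited indices that do not correspond to any supplied chunk."""
    cited = {int(m) for m in re.findall(r"\[doc:(\d+)\]", answer)}
    return sorted(i for i in cited if not 1 <= i <= n_chunks)
```

Checking the generated answer with something like `invalid_citations` gives a cheap signal that the model cited a source it was never shown.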

For the detailed design, see docs/RAG_DESIGN.md.

📝 License

MIT

👤 Author

Leo Tang — @LeoTang0127 · mrleowin@gmail.com
