API Documentation
Version: v1
Last Updated: 2026-02-01
Base URL: http://localhost:8004/api/v1
- Overview
- Authentication
- Response Format
- Error Handling
- API Endpoints
- Data Models
- WebSocket Support
- Rate Limiting
The Knowledge Base Platform API provides a comprehensive set of endpoints for managing knowledge bases, documents, embeddings, and conversational AI interactions. The API follows RESTful principles and returns JSON responses.
- Knowledge Base Management: Create, update, and manage multiple knowledge bases
- Document Processing: Upload, process, and index documents with various chunking strategies
- RAG (Retrieval-Augmented Generation): Query knowledge bases with context-aware responses
- Multiple Embedding Providers: Support for OpenAI, Voyage AI, and Ollama embeddings
- Flexible Chunking: Fixed-size, semantic, and paragraph-based chunking strategies
- Hybrid Search: Combine dense vector search with BM25 lexical search
- Conversation Management: Track and manage multi-turn conversations
- Progress Tracking: Real-time document processing progress with detailed stages
Current Status: No authentication required (MVP phase)
Future Implementation: JWT-based authentication with user management
Authorization: Bearer <token>
{
"id": "uuid",
"status": "success",
"data": { ... }
}
List endpoints support pagination:
{
"items": [...],
"total": 100,
"page": 1,
"page_size": 20,
"pages": 5
}
Query Parameters:
- page: Page number (default: 1)
- page_size: Items per page (default: 20, max: 100)
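The pagination envelope can be consumed with a small client-side loop. A minimal Python sketch, assuming a `fetch_page(page, page_size)` callable that wraps the actual HTTP GET (the helper name and the fake fetcher are illustrative, not part of the API):

```python
import math

def iter_pages(fetch_page, page_size=20):
    """Yield every item from a paginated list endpoint.

    fetch_page(page, page_size) is assumed to return the pagination
    envelope shown above: {"items": [...], "total": N, ...}.
    """
    page = 1
    while True:
        body = fetch_page(page, page_size)
        yield from body["items"]
        pages = math.ceil(body["total"] / page_size)  # same math as the "pages" field
        if page >= pages:
            break
        page += 1

# Fake fetcher standing in for an HTTP call, to show the shape:
data = list(range(45))

def fake_fetch(page, page_size):
    start = (page - 1) * page_size
    return {"items": data[start:start + page_size], "total": len(data)}

assert list(iter_pages(fake_fetch, page_size=20)) == data  # pages of 20, 20, 5
```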
{
"detail": "Error message",
"path": "/api/v1/endpoint",
"suggestion": "Helpful suggestion"
}
| Code | Meaning | Description |
|---|---|---|
| 200 | OK | Request successful |
| 201 | Created | Resource created successfully |
| 204 | No Content | Successful deletion |
| 400 | Bad Request | Invalid request parameters |
| 404 | Not Found | Resource not found |
| 422 | Unprocessable Entity | Validation error |
| 500 | Internal Server Error | Server error |
| 503 | Service Unavailable | Dependency unavailable |
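Clients can map this table onto a single guard. A hedged sketch (the `APIError` class is an illustration, not something the API ships):

```python
class APIError(Exception):
    """Client-side wrapper for the error envelope shown above."""
    def __init__(self, status, detail, suggestion=None):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail
        self.suggestion = suggestion

def raise_for_status(status_code, body):
    """Raise APIError for any non-2xx response; body is the parsed error JSON."""
    if 200 <= status_code < 300:
        return
    raise APIError(status_code, body.get("detail", "Unknown error"),
                   body.get("suggestion"))
```

Surfacing the `suggestion` field in client error messages makes 404s and 422s much easier to debug.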
Check API health status.
Response:
{
"status": "healthy",
"timestamp": "2026-02-01T12:00:00Z"
}
Check if all dependencies are ready.
Response:
{
"ready": true,
"checks": {
"database": true,
"vector_store": true,
"lexical_store": true
}
}
Status Codes:
- 200: All services ready
- 503: One or more services unavailable
Get API information and configuration.
Response:
{
"version": "1.0.0",
"environment": "development",
"features": {
"async_processing": true,
"cache": false,
"metrics": false
},
"integrations": {
"opensearch_available": true
},
"limits": {
"max_file_size_mb": 50,
"max_chunk_size": 2000,
"chunk_overlap": 200
},
"supported_formats": ["txt", "md", "fb2"]
}
Create a new knowledge base.
Request Body:
{
"name": "Technical Documentation",
"description": "Product documentation and guides",
"embedding_model": "text-embedding-3-small",
"chunk_size": 1000,
"chunk_overlap": 200,
"chunking_strategy": "semantic",
"upsert_batch_size": 100,
"bm25_match_mode": "best_fields",
"bm25_min_should_match": 1,
"bm25_use_phrase": false,
"bm25_analyzer": "standard",
"contextual_description_enabled": true
}
Required Fields:
- name: Knowledge base name (string)
Optional Fields:
- description: Description (string, nullable)
- embedding_model: Embedding model name (default: from settings)
- chunk_size: Chunk size in characters (default: 1000)
- chunk_overlap: Overlap between chunks (default: 200)
- chunking_strategy: One of fixed_size, semantic, paragraph (default: fixed_size)
- upsert_batch_size: Batch size for vector insertion (default: 100)
- bm25_match_mode: BM25 match mode (default: "best_fields")
- bm25_min_should_match: Minimum matches for BM25 (default: 1)
- bm25_use_phrase: Use phrase matching (default: false)
- bm25_analyzer: Text analyzer (default: "standard")
- contextual_description_enabled: KB-level contextual description toggle (true/false/null)
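How chunk_size and chunk_overlap interact under the fixed_size strategy can be sketched as follows; this illustrates the sliding-window idea only, not the server's actual chunker:

```python
def fixed_size_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Illustrative fixed_size chunking: consecutive chunks share
    chunk_overlap characters, so each new chunk starts
    chunk_size - chunk_overlap characters after the previous one."""
    step = chunk_size - chunk_overlap
    if step <= 0:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# A 2500-character text with the defaults yields chunks starting at
# offsets 0, 800, 1600, 2400 (the last one is short):
text = "".join(chr(65 + i % 26) for i in range(2500))
chunks = fixed_size_chunks(text)
assert len(chunks) == 4
assert chunks[0][-200:] == chunks[1][:200]  # the shared overlap
```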
Response: 201 Created
{
"id": "uuid",
"name": "Technical Documentation",
"description": "Product documentation and guides",
"collection_name": "kb_abc123",
"embedding_model": "text-embedding-3-small",
"embedding_provider": "openai",
"embedding_dimension": 1536,
"chunk_size": 1000,
"chunk_overlap": 200,
"chunking_strategy": "semantic",
"upsert_batch_size": 100,
"document_count": 0,
"total_chunks": 0,
"created_at": "2026-02-01T12:00:00Z",
"updated_at": "2026-02-01T12:00:00Z"
}
List all knowledge bases with pagination.
Query Parameters:
- page: Page number (default: 1)
- page_size: Items per page (default: 20)
Response: 200 OK
{
"items": [
{
"id": "uuid",
"name": "Technical Documentation",
"description": "Product documentation",
"document_count": 15,
"total_chunks": 450,
"embedding_model": "text-embedding-3-small",
"chunking_strategy": "semantic",
"created_at": "2026-02-01T12:00:00Z"
}
],
"total": 1,
"page": 1,
"page_size": 20,
"pages": 1
}
Get knowledge base details.
Path Parameters:
- kb_id: Knowledge base UUID
Response: 200 OK
{
"id": "uuid",
"name": "Technical Documentation",
"description": "Product documentation and guides",
"collection_name": "kb_abc123",
"embedding_model": "text-embedding-3-small",
"embedding_provider": "openai",
"embedding_dimension": 1536,
"chunk_size": 1000,
"chunk_overlap": 200,
"chunking_strategy": "semantic",
"document_count": 15,
"total_chunks": 450,
"bm25_match_mode": "best_fields",
"bm25_min_should_match": 1,
"bm25_use_phrase": false,
"bm25_analyzer": "standard",
"contextual_description_enabled": true,
"created_at": "2026-02-01T12:00:00Z",
"updated_at": "2026-02-01T12:00:00Z"
}
Update knowledge base configuration.
Path Parameters:
- kb_id: Knowledge base UUID
Request Body:
{
"name": "Updated Name",
"description": "Updated description",
"chunk_size": 1200,
"chunk_overlap": 250,
"bm25_match_mode": "cross_fields",
"contextual_description_enabled": false
}
Note: Cannot change embedding_model or chunking_strategy after creation.
Response: 200 OK (same format as GET)
Delete a knowledge base (soft delete).
Path Parameters:
- kb_id: Knowledge base UUID
Response: 204 No Content
Reprocess all documents in a knowledge base.
Path Parameters:
- kb_id: Knowledge base UUID
Query Parameters:
- detect_duplicates: Recompute duplicate chunk summary (true/false, default: false)
- contextual_description_enabled: Optional per-request override (true/false); if omitted, KB/global defaults are used
Response: 200 OK
{
"queued": 15,
"knowledge_base_id": "uuid"
}
Clean up orphaned vector chunks from deleted documents.
Path Parameters:
- kb_id: Knowledge base UUID
Response: 200 OK
{
"message": "Cleaned up orphaned chunks",
"deleted_documents": 5,
"chunks_removed": 150
}
Upload and process a document.
Request: multipart/form-data
Form Fields:
- file: File to upload (required)
- knowledge_base_id: Target knowledge base UUID (required)
- detect_duplicates: Compute duplicate chunk summary (true/false, optional)
- contextual_description_enabled: Optional per-request override (true/false) for contextual description generation
Example with curl:
curl -X POST http://localhost:8004/api/v1/documents/ \
-F "file=@document.pdf" \
-F "knowledge_base_id=uuid" \
-F "detect_duplicates=false" \
-F "contextual_description_enabled=true"Response: 201 Created
{
"id": "uuid",
"filename": "document.pdf",
"file_size": 1024000,
"file_type": "pdf",
"status": "pending",
"embeddings_status": "pending",
"bm25_status": "pending",
"chunk_count": 0,
"knowledge_base_id": "uuid",
"processing_stage": null,
"progress_percentage": 0,
"error_message": null,
"created_at": "2026-02-01T12:00:00Z",
"processed_at": null
}
File Size Limit: 50 MB (configurable)
Supported Formats:
- PDF (.pdf)
- Word Documents (.docx)
- Text (.txt)
- Markdown (.md)
- HTML (.html)
List documents with filtering and pagination.
Query Parameters:
- knowledge_base_id: Filter by knowledge base (required)
- status: Filter by status (pending, processing, completed, failed)
- page: Page number (default: 1)
- page_size: Items per page (default: 20)
Response: 200 OK
{
"items": [
{
"id": "uuid",
"filename": "document.pdf",
"file_size": 1024000,
"file_type": "pdf",
"status": "completed",
"embeddings_status": "completed",
"bm25_status": "completed",
"chunk_count": 45,
"processing_stage": "Completed",
"progress_percentage": 100,
"created_at": "2026-02-01T12:00:00Z",
"processed_at": "2026-02-01T12:05:00Z"
}
],
"total": 15,
"page": 1,
"page_size": 20,
"pages": 1
}
Get document details with content.
Path Parameters:
- doc_id: Document UUID
Query Parameters:
- detect_duplicates: Recompute duplicate chunk summary (true/false, default: false)
- contextual_description_enabled: Optional per-request override (true/false); if omitted, KB/global defaults are used
Response: 200 OK
{
"id": "uuid",
"filename": "document.pdf",
"file_size": 1024000,
"file_type": "pdf",
"status": "completed",
"embeddings_status": "completed",
"bm25_status": "completed",
"chunk_count": 45,
"knowledge_base_id": "uuid",
"content": "Full document text content...",
"content_hash": "sha256hash",
"processing_stage": "Completed",
"progress_percentage": 100,
"error_message": null,
"created_at": "2026-02-01T12:00:00Z",
"updated_at": "2026-02-01T12:05:00Z",
"processed_at": "2026-02-01T12:05:00Z"
}
Get document processing status (optimized for polling).
Path Parameters:
- doc_id: Document UUID
Response: 200 OK
{
"id": "uuid",
"filename": "document.pdf",
"status": "processing",
"embeddings_status": "processing",
"bm25_status": "pending",
"chunk_count": 0,
"processing_stage": "Generating embeddings (100/301)",
"progress_percentage": 48,
"error_message": null
}
Processing Stages:
- "Loading document..." (5%)
- "Preparing to chunk..." (15%)
- "Chunking completed (N chunks)" (30%)
- "Generating embeddings (X/N)" (35-75%, updates per batch)
- "Embeddings created (N)" (75%)
- "Indexing in Qdrant..." (80%)
- "Qdrant indexing completed" (85%)
- "Indexing BM25..." (90%)
- "BM25 indexing completed" (95%)
- "Completed" (100%)
Polling Recommendation: Poll every 1-2 seconds while status is processing or pending.
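That recommendation can be wrapped in a small helper. A sketch assuming a `get_status()` callable that performs the HTTP GET and returns the status payload above (injecting `sleep` just makes the loop testable):

```python
import time

def wait_until_processed(get_status, poll_interval=2.0, timeout=600.0,
                         sleep=time.sleep):
    """Poll until the document leaves 'pending'/'processing', or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        body = get_status()
        if body["status"] not in ("pending", "processing"):
            return body  # 'completed' or 'failed'
        sleep(poll_interval)
    raise TimeoutError("document did not finish processing in time")

# Exercising the loop with canned responses instead of real HTTP:
states = iter([{"status": "pending"}, {"status": "processing"},
               {"status": "completed", "chunk_count": 45}])
result = wait_until_processed(lambda: next(states), sleep=lambda s: None)
assert result["status"] == "completed"
```

Remember to branch on `"failed"` as well: the loop above returns the failed payload (with `error_message` set) rather than spinning forever.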
Delete a document and its vectors.
Path Parameters:
- doc_id: Document UUID
Response: 204 No Content
Note: This performs a soft delete (sets is_deleted=true), removes all associated vectors from Qdrant and OpenSearch, and deletes the original uploaded file from disk (./uploads/).
Reprocess a document with current KB settings. Deletes old vectors and re-indexes from scratch.
For DOCX and PDF files, also re-extracts heading structure and page map from the original uploaded file before chunking. This means improvements to the parsing logic take effect without re-uploading the document.
Path Parameters:
- doc_id: Document UUID
Response: 200 OK
{
"id": "uuid",
"filename": "document.pdf",
"status": "pending",
"message": "Document reprocessing started"
}
Analyze document structure using LLM.
Path Parameters:
- doc_id: Document UUID
Response: 200 OK
{
"document_id": "uuid",
"filename": "document.pdf",
"document_type": "technical_guide",
"description": "Software installation and configuration guide",
"total_sections": 12,
"sections": [
{
"title": "Introduction",
"level": 1,
"order": 0,
"chunk_indices": [0, 1, 2],
"summary": "Overview of the system"
}
]
}
Apply analyzed structure to document.
Path Parameters:
- doc_id: Document UUID
Request Body:
{
"document_type": "technical_guide",
"description": "Installation guide",
"sections": [...]
}
Response: 200 OK
{
"message": "Structure applied successfully",
"document_id": "uuid"
}
Get document structure if available.
Path Parameters:
- doc_id: Document UUID
Response: 200 OK
{
"has_structure": true,
"document_id": "uuid",
"document_type": "technical_guide",
"description": "Installation guide",
"sections": [...]
}
Query a knowledge base with RAG.
Request Body:
{
"question": "How do I install the software?",
"knowledge_base_id": "uuid",
"conversation_id": "uuid",
"conversation_history": [
{
"role": "user",
"content": "Previous question"
},
{
"role": "assistant",
"content": "Previous answer"
}
],
"top_k": 5,
"temperature": 0.7,
"retrieval_mode": "hybrid",
"lexical_top_k": 10,
"hybrid_dense_weight": 0.7,
"hybrid_lexical_weight": 0.3,
"bm25_match_mode": "best_fields",
"bm25_min_should_match": 1,
"bm25_use_phrase": false,
"bm25_analyzer": "standard",
"max_context_chars": 4000,
"score_threshold": 0.5,
"llm_model": "gpt-4o",
"llm_provider": "openai",
"use_structure": true,
"use_mmr": false,
"mmr_diversity": 0.5,
"context_expansion": ["window"],
"context_window": 1
}
Required Fields:
- question: User question (string)
- knowledge_base_id: Target KB UUID
Optional Fields:
- conversation_id: Continue existing conversation (UUID)
- conversation_history: Previous messages (array)
- top_k: Number of chunks to retrieve (default: 5)
- temperature: LLM temperature (default: 0.7)
- retrieval_mode: "dense" or "hybrid" (default: "dense")
- lexical_top_k: BM25 top-k (default: 10)
- hybrid_dense_weight: Dense weight in hybrid (default: 0.7)
- hybrid_lexical_weight: Lexical weight in hybrid (default: 0.3)
- bm25_match_mode: BM25 match mode (default: from KB settings)
- bm25_min_should_match: Minimum matches (default: from KB settings)
- bm25_use_phrase: Use phrase matching (default: from KB settings)
- bm25_analyzer: Text analyzer (default: from KB settings)
- max_context_chars: Max context size (default: 4000)
- score_threshold: Minimum similarity score (default: 0.0)
- llm_model: LLM model name (default: from settings)
- llm_provider: LLM provider (default: from settings)
- use_structure: Use document structure (default: false)
- use_mmr: Use Maximal Marginal Relevance (default: false)
- mmr_diversity: MMR diversity factor 0-1 (default: 0.5)
- context_expansion: Context expansion modes (e.g., ["window"])
- context_window: Window size (chunks on each side) for windowed retrieval (default: 0)
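The two hybrid weights combine the per-retriever scores linearly. A sketch of the fusion step, assuming both scores have already been normalised to [0, 1] (the backend's exact normalisation is an implementation detail and may differ):

```python
import math

def hybrid_score(dense_score, lexical_score,
                 dense_weight=0.7, lexical_weight=0.3):
    """Weighted linear fusion of dense and lexical relevance scores."""
    return dense_weight * dense_score + lexical_weight * lexical_score

# With the default weights, a chunk scoring 0.87 on dense retrieval and
# 0.61 on BM25 fuses to 0.7 * 0.87 + 0.3 * 0.61 = 0.792:
assert math.isclose(hybrid_score(0.87, 0.61), 0.792)
```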
Response: 200 OK
{
"answer": "To install the software, follow these steps...",
"sources": [
{
"text": "Installation instructions...",
"score": 0.92,
"document_id": "uuid",
"filename": "install-guide.pdf",
"chunk_index": 5,
"metadata": {
"section_heading": "Installation",
"section_path": "Chapter 2 > Installation",
"section_level": 2,
"page_number": 42,
"page_number_physical": 3,
"source_type": "dense",
"dense_score_raw": 0.87,
"lexical_score_raw": 0.61,
"combined_score": 0.78,
"rerank_applied": true,
"rerank_provider": "voyage",
"rerank_model": "rerank-2",
"rerank_score": 0.91,
"pre_rerank_score": 0.78,
"contextual_description": "Section describing step-by-step installation on Linux."
}
}
],
"query": "How do I install the software?",
"confidence_score": 0.85,
"model": "gpt-4o",
"knowledge_base_id": "uuid",
"conversation_id": "uuid",
"use_mmr": false,
"mmr_diversity": 0.5
}
Get chat statistics for a knowledge base.
Path Parameters:
- kb_id: Knowledge base UUID
Response: 200 OK
{
"total_conversations": 150,
"total_messages": 450,
"avg_messages_per_conversation": 3.0,
"last_activity": "2026-02-01T12:00:00Z"
}
List all conversations.
Query Parameters:
- knowledge_base_id: Filter by KB (optional)
- page: Page number (default: 1)
- page_size: Items per page (default: 20)
Response: 200 OK
[
{
"id": "uuid",
"knowledge_base_id": "uuid",
"title": "Software Installation Questions",
"created_at": "2026-02-01T12:00:00Z",
"updated_at": "2026-02-01T12:15:00Z"
}
]
Get conversation details.
Path Parameters:
- conversation_id: Conversation UUID
Response: 200 OK
{
"id": "uuid",
"knowledge_base_id": "uuid",
"title": "Software Installation Questions",
"settings": {
"top_k": 5,
"temperature": 0.7,
"llm_model": "gpt-4o"
},
"created_at": "2026-02-01T12:00:00Z",
"updated_at": "2026-02-01T12:15:00Z"
}
Update conversation settings.
Path Parameters:
- conversation_id: Conversation UUID
Request Body:
{
"top_k": 10,
"temperature": 0.5,
"use_structure": true,
"context_expansion": ["window"],
"context_window": 1
}
Response: 200 OK (same format as GET)
Get conversation message history.
Path Parameters:
- conversation_id: Conversation UUID
Response: 200 OK
[
{
"id": "uuid",
"role": "user",
"content": "How do I install?",
"timestamp": "2026-02-01T12:00:00Z",
"message_index": 0
},
{
"id": "uuid",
"role": "assistant",
"content": "To install...",
"sources": [...],
"timestamp": "2026-02-01T12:00:05Z",
"message_index": 1
}
]
Delete a conversation.
Path Parameters:
- conversation_id: Conversation UUID
Response: 204 No Content
List all available embedding models.
Response: 200 OK
[
{
"model": "text-embedding-3-small",
"provider": "openai",
"dimension": 1536,
"description": "OpenAI small embedding model - fast and cost-effective",
"cost_per_million_tokens": 0.02
},
{
"model": "voyage-4",
"provider": "voyage",
"dimension": 1024,
"description": "Voyage standard model - balanced performance",
"cost_per_million_tokens": 0.06
}
]
Get details for a specific embedding model.
Path Parameters:
- model_name: Model name (e.g., text-embedding-3-small)
Response: 200 OK
{
"model": "text-embedding-3-small",
"provider": "openai",
"dimension": 1536,
"description": "OpenAI small embedding model - fast and cost-effective",
"cost_per_million_tokens": 0.02
}
List all embedding providers.
Response: 200 OK
["openai", "voyage", "ollama"]Get models for a specific provider.
Path Parameters:
- provider: Provider name (openai, voyage, ollama)
Response: 200 OK
[
"text-embedding-3-small",
"text-embedding-3-large"
]
List all available LLM models.
Response: 200 OK
{
"openai": [
"gpt-4o",
"gpt-4o-mini",
"gpt-4-turbo",
"gpt-3.5-turbo"
],
"ollama": [
"llama3.1:8b",
"mistral:7b"
]
}
List all LLM providers.
Response: 200 OK
["openai", "ollama"]Check Ollama server status.
Response: 200 OK
{
"available": true,
"url": "http://localhost:11434",
"version": "0.1.0"
}
Response: 503 Service Unavailable (if Ollama is down)
{
"available": false,
"error": "Connection refused"
}
List all Ollama models.
Response: 200 OK
{
"models": [
{
"name": "llama3.1:8b",
"size": 4700000000,
"modified_at": "2026-02-01T10:00:00Z"
}
]
}
List Ollama embedding models.
Response: 200 OK
[
"nomic-embed-text",
"mxbai-embed-large",
"all-minilm"
]
List Ollama LLM models.
Response: 200 OK
[
"llama3.1:8b",
"mistral:7b",
"qwen2.5:7b"
]
Get current application settings.
Response: 200 OK
{
"id": 1,
"llm_model": "gpt-4o",
"llm_provider": "openai",
"temperature": 0.7,
"top_k": 5,
"max_context_chars": 4000,
"score_threshold": 0.0,
"retrieval_mode": "dense",
"lexical_top_k": 10,
"hybrid_dense_weight": 0.7,
"hybrid_lexical_weight": 0.3,
"bm25_match_mode": "best_fields",
"bm25_min_should_match": 1,
"bm25_use_phrase": false,
"bm25_analyzer": "standard",
"contextual_description_enabled": false,
"kb_chunk_size": 1000,
"kb_chunk_overlap": 200,
"kb_upsert_batch_size": 100,
"created_at": "2026-01-01T00:00:00Z",
"updated_at": "2026-02-01T12:00:00Z"
}
Update application settings.
Request Body:
{
"llm_model": "gpt-4o-mini",
"temperature": 0.5,
"top_k": 10,
"contextual_description_enabled": true
}
Response: 200 OK (same format as GET)
Reset settings to defaults.
Response: 200 OK (returns default settings)
Get settings metadata (available options).
Response: 200 OK
{
"bm25_match_modes": [
"best_fields",
"most_fields",
"cross_fields",
"phrase"
],
"bm25_analyzers": [
"standard",
"simple",
"whitespace",
"keyword"
]
}
interface KnowledgeBase {
id: string
name: string
description: string | null
collection_name: string
embedding_model: string
embedding_provider: string
embedding_dimension: number
chunk_size: number
chunk_overlap: number
chunking_strategy: 'fixed_size' | 'semantic' | 'paragraph'
upsert_batch_size: number
document_count: number
total_chunks: number
bm25_match_mode: string | null
bm25_min_should_match: number | null
bm25_use_phrase: boolean | null
bm25_analyzer: string | null
contextual_description_enabled: boolean | null
created_at: string
updated_at: string
}
interface Document {
id: string
filename: string
file_size: number
file_type: string
status: 'pending' | 'processing' | 'completed' | 'failed'
embeddings_status: 'pending' | 'processing' | 'completed' | 'failed'
bm25_status: 'pending' | 'processing' | 'completed' | 'failed'
chunk_count: number
knowledge_base_id: string
content?: string // Only in detailed response
content_hash: string
processing_stage: string | null
progress_percentage: number
error_message: string | null
created_at: string
updated_at: string
processed_at: string | null
}
interface ChatMessage {
role: 'user' | 'assistant'
content: string
sources?: SourceChunk[]
timestamp: string
}
interface SourceChunk {
text: string
score: number
document_id: string
filename: string
chunk_index: number
metadata?: Record<string, unknown>
}
Status: Not implemented yet
Planned: Real-time progress updates for document processing
Current Status: No rate limiting (MVP phase)
Planned Implementation:
- Per-user rate limits
- Per-endpoint rate limits
- Configurable limits via settings
# 1. Create a knowledge base
KB_ID=$(curl -X POST http://localhost:8004/api/v1/knowledge-bases/ \
-H "Content-Type: application/json" \
-d '{
"name": "Product Docs",
"chunking_strategy": "semantic",
"embedding_model": "text-embedding-3-small"
}' | jq -r '.id')
# 2. Upload a document
DOC_ID=$(curl -X POST http://localhost:8004/api/v1/documents/ \
-F "file=@manual.pdf" \
-F "knowledge_base_id=$KB_ID" | jq -r '.id')
# 3. Poll document status
while true; do
STATUS=$(curl -s http://localhost:8004/api/v1/documents/$DOC_ID/status | jq -r '.status')
if [ "$STATUS" = "completed" ]; then break; fi
sleep 2
done
# 4. Query the knowledge base
curl -X POST http://localhost:8004/api/v1/chat/ \
-H "Content-Type: application/json" \
-d '{
"question": "How do I configure the system?",
"knowledge_base_id": "'$KB_ID'",
"top_k": 5,
"retrieval_mode": "hybrid"
}' | jq .
| Version | Date | Changes |
|---|---|---|
| v1.0 | 2026-02-01 | Initial API documentation |
For API issues or questions:
- GitHub Issues: [Repository Link]
- Documentation: /docs (Swagger UI)
- ReDoc: /redoc