Flux29/RAG

Agent-First RAG Platform

Five PydanticAI governance-domain agents with a shared MCP server pool, Pydantic AI Gateway, and AG-UI streaming — built for industrial control system documentation (FANUC, Siemens, PLC).

Architecture

Control Plane                         Data Plane
┌─────────────────────────┐   ┌──────────────────────────┐
│  Pydantic AI Gateway    │   │  Vespa (hybrid search)   │
│  (provider routing,     │   │  Ollama (embeddings/LLM) │
│   cost limits, OTel)    │   │  Docling Serve (parsing) │
├─────────────────────────┤   │  PostgreSQL + pgvector   │
│  5 Governance Agents    │   │  OTel Collector → Jaeger │
│  ┌───────┬───────┐      │   └──────────────────────────┘
│  │Query  │Ingest │      │
│  ├───────┼───────┤      │   UI
│  │Eval   │Memory │      │   ┌──────────────────────────┐
│  ├───────┴───────┤      │   │  CopilotKit + Next.js    │
│  │    System     │      │   │  (AG-UI SSE streaming)   │
│  └───────────────┘      │   └──────────────────────────┘
└─────────────────────────┘

Agents

Agent            Port   Domain                   Primary Endpoint
QueryAgent       8010   Runtime Intelligence     /search
IngestionAgent   8011   Knowledge Construction   /ingest
EvaluationAgent  8012   Quality Assurance        /score
MemoryAgent      8013   State Governance         /store
SystemAgent      8014   Infrastructure Control   /execute

Deployment Modes

  • In-process (default): Gateway loads all 5 agents via httpx.ASGITransport — single container
  • Container: Each agent runs as a separate service with HTTP routing

Set AGENT_DEPLOY_MODE=container to run each agent as a separate service; without it, the gateway uses the default in-process mode.
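
The two modes amount to a small routing decision per agent. The sketch below is illustrative only and is not the gateway's actual code: the ports and endpoints are copied from the Agents table, while `AgentSpec`, `resolve_target`, the `in-process` fallback string, the `localhost` host name, and every agent key other than `query-agent` are assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentSpec:
    port: int
    endpoint: str


# Ports and primary endpoints from the Agents table.
# Only the "query-agent" key appears in this README; the other keys are hypothetical.
AGENTS = {
    "query-agent": AgentSpec(8010, "/search"),
    "ingestion-agent": AgentSpec(8011, "/ingest"),
    "evaluation-agent": AgentSpec(8012, "/score"),
    "memory-agent": AgentSpec(8013, "/store"),
    "system-agent": AgentSpec(8014, "/execute"),
}


def resolve_target(
    agent: str,
    env: Optional[Mapping[str, str]] = None,
    host: str = "localhost",  # assumed host name
) -> Optional[str]:
    """Return the agent's HTTP URL in container mode, or None for in-process calls."""
    env = os.environ if env is None else env
    spec = AGENTS[agent]
    if env.get("AGENT_DEPLOY_MODE", "in-process") == "container":
        return f"http://{host}:{spec.port}{spec.endpoint}"
    # In-process: the caller would dispatch to the agent's ASGI app directly
    # (for example through httpx.ASGITransport) instead of opening a socket.
    return None
```

A `None` result here stands for "call the agent in the same process"; in container mode the same lookup yields a URL such as `http://localhost:8010/search`.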

Quick Start

Prerequisites

  • Docker Engine 20.10+ (as bundled with Docker Desktop) with Docker Compose 2.0+
  • Python 3.12+, uv package manager
  • 16 GB+ RAM (GPU recommended)

1. Start Core Services

# Core stack (Gateway, Vespa, Ollama, Docling, PostgreSQL, UI)
docker compose -f infra/compose/docker-compose.yml up -d

# With observability (adds OTel Collector + Jaeger)
docker compose -f infra/compose/docker-compose.yml --profile observability up -d

# With GPU acceleration
docker compose -f infra/compose/docker-compose.yml -f infra/compose/docker-compose.gpu.yml up -d

2. Install Dependencies

uv sync

3. Run Tests

uv run python -m pytest tests/ -v

4. Start Gateway (local dev)

uv run uvicorn services.gateway.app.main:create_app --factory --reload --port 8002
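
The `--factory` flag tells uvicorn to call `create_app()` and serve the ASGI application it returns, rather than importing an application object directly. As a rough picture of that pattern, here is a dependency-free sketch of an app factory that answers `GET /health`. It is not the project's `create_app`; the response payloads are assumptions.

```python
import json


def create_app():
    """Build and return a minimal ASGI application (illustrative only)."""

    async def app(scope, receive, send):
        if scope["type"] != "http":
            return  # ignore lifespan/websocket scopes in this sketch

        if scope["method"] == "GET" and scope["path"] == "/health":
            status, payload = 200, {"status": "ok"}  # assumed payload shape
        else:
            status, payload = 404, {"detail": "not found"}

        body = json.dumps(payload).encode("utf-8")
        await send({
            "type": "http.response.start",
            "status": status,
            "headers": [
                (b"content-type", b"application/json"),
                (b"content-length", str(len(body)).encode("ascii")),
            ],
        })
        await send({"type": "http.response.body", "body": body})

    return app
```

Because the factory is called at startup, configuration (such as the deployment mode) can be read inside `create_app()` instead of at import time.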

5. Query

curl -X POST http://localhost:8002/v1/run \
  -H "Content-Type: application/json" \
  -d '{"agent": "query-agent", "prompt": "FANUC alarm codes", "context": {"hits": 5}}'
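
The same request can be issued from Python. This is a minimal standard-library sketch that mirrors the curl call above; `build_run_request` and `run_agent` are hypothetical helper names, and decoding the reply as JSON is an assumption, since the response schema is not documented here.

```python
import json
import urllib.request


def build_run_request(agent, prompt, hits=5, base_url="http://localhost:8002"):
    """Build the POST /v1/run request without sending it."""
    payload = {"agent": agent, "prompt": prompt, "context": {"hits": hits}}
    return urllib.request.Request(
        f"{base_url}/v1/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_agent(agent, prompt, hits=5, timeout=30.0):
    """Send the request and decode the reply (assumed to be JSON)."""
    request = build_run_request(agent, prompt, hits)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the gateway from step 4 to be running on port 8002.
    print(run_agent("query-agent", "FANUC alarm codes"))
```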

Project Structure

RAG/
├── agents/                      # 5 governance-domain agents
│   ├── query/                  # QueryAgent (retrieval + ranking)
│   ├── ingestion/              # IngestionAgent (Docling pipeline)
│   ├── evaluation/             # EvaluationAgent (quality scoring)
│   ├── memory/                 # MemoryAgent (session + long-term)
│   └── system/                 # SystemAgent (git/shell/MCP)
├── libs/rag_common/            # Shared libraries
│   ├── clients/                # Vespa, Ollama, agent_interface
│   ├── models/                 # Pydantic DTOs
│   └── embeddings.py           # Two-tier embedding cache
├── services/gateway/           # Pydantic AI Gateway (PAIG)
├── infra/                      # Docker Compose, initdb, OTel
│   └── migrations/             # Database migration files (V001__*.sql)
├── mcp/                        # MCP server configurations
├── servers/                    # MCP tool files (glossary, etc.)
├── skills/                     # Reusable skill functions
├── ui/                         # CopilotKit + Next.js frontend
├── tests/                      # Unit, integration, retrieval, benchmarks
├── Docs/                       # Governance documents
│   ├── ADRs/                   # Architecture Decision Records
│   ├── PDR.md                  # Project Design Record
│   └── MEMORY.md               # Memory systems analysis
├── vespa-app/                  # Vespa application package
├── reports/                    # Parity matrix, refactor analysis
└── Obsolete/                   # Retired legacy code (preserved)

Service Endpoints

Service      Port    URL                       Purpose
Gateway      8002    http://localhost:8002     Agent orchestration + AG-UI
Vespa        8081    http://localhost:8081     Hybrid search engine
Ollama       11434   http://localhost:11434    Embeddings + LLM
Docling      5001    http://localhost:5001     Document parsing
PostgreSQL   5432    localhost:5432            Agent memory + pgvector
Jaeger       16686   http://localhost:16686    Trace visualization
UI           3000    http://localhost:3000     CopilotKit frontend
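
After bringing the stack up, a quick way to confirm the ports in the table are reachable is a plain TCP connect. The sketch below uses only the standard library; `SERVICES`, `is_port_open`, and `check_services` are hypothetical names, and an open port only shows that something is listening, not that the service is healthy.

```python
import socket

# Service names and ports from the Service Endpoints table.
SERVICES = {
    "Gateway": 8002,
    "Vespa": 8081,
    "Ollama": 11434,
    "Docling": 5001,
    "PostgreSQL": 5432,
    "Jaeger": 16686,
    "UI": 3000,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost"):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, reachable in check_services().items():
        print(f"{name:<12}{'up' if reachable else 'down'}")
```

Jaeger is expected to show as down unless the stack was started with the observability profile.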

Gateway Endpoints

  • GET /health — Health check
  • GET /v1/providers — Available LLM providers
  • GET /v1/agents — Registered agents
  • POST /v1/run — Execute agent action (JSON)
  • POST /v1/ag-ui — AG-UI SSE streaming
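
Unlike `/v1/run`, the `/v1/ag-ui` endpoint streams its reply as server-sent events (SSE). The following sketch shows how a client might decode such a stream, assuming each event carries a JSON object in its `data:` field. `iter_sse_data` is a hypothetical helper and the sample payloads are placeholders, not the AG-UI event schema.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per SSE event.

    Only `data:` fields are handled; `event:`, `id:` and `retry:` are ignored.
    As in the SSE format, an event is dispatched on a blank line, so a trailing
    event without a terminating blank line is dropped.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        if line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            buffer.append(value)
        elif line == "" and buffer:
            # Multiple data lines in one event are joined with newlines.
            yield json.loads("\n".join(buffer))
            buffer = []


if __name__ == "__main__":
    sample = ['data: {"type": "example", "text": "hello"}\n', "\n"]
    for event in iter_sse_data(sample):
        print(event["type"], event["text"])
```

The function takes any iterable of text lines, so it can be fed from a file, a test fixture, or the line iterator of a streaming HTTP response.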

Documentation

Sources of Truth

  • ADR — Architecture Decision Records
  • PDR — Project Design Record
  • Repo Manifest — File-level source of truth

Guides

  • Memory Systems — Memory architecture analysis and data-flow documentation
  • GPU Acceleration — Use docker-compose.gpu.yml overlay (see Quick Start above)

Development

Running Tests

# All tests
uv run python -m pytest tests/ -v

# Gateway tests only
uv run python -m pytest tests/test_gateway.py tests/integration/ -v

# Retrieval tests
uv run python -m pytest tests/retrieval/ -v

Governance Checks

uv run python governance_check.py
uv run python parity_check.py

Technology Stack

  • Agents: PydanticAI with multi-agent delegation (Levels 2-5)
  • Gateway: Pydantic AI Gateway (AGPL-3.0, self-hosted)
  • Search: Vespa.ai (hybrid BM25 + dense retrieval)
  • Embeddings: Ollama qwen3-embedding:0.6b (1024 dims)
  • Memory: PostgreSQL + pgvector + Vespa agent memory
  • Observability: OpenTelemetry → Jaeger
  • UI: CopilotKit + Next.js (AG-UI protocol)
  • MCP: Shared server pool with role-based access
  • Containers: Docker Compose with lockfile-based builds (non-root)
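
To make "hybrid BM25 + dense retrieval" concrete, here is a sketch of a Vespa query body that combines a text match with a nearest-neighbor clause. It is not taken from this repository: the `embedding` field name, the `q` query tensor, the `hybrid` rank profile, and the `target_hits` default are assumptions that would have to match the schema in vespa-app/. Only the 1024-dimension check comes from the stack listed above.

```python
from typing import Sequence

EMBEDDING_DIMS = 1024  # qwen3-embedding:0.6b, per the Technology Stack list


def build_hybrid_query(
    text: str,
    embedding: Sequence[float],
    hits: int = 5,
    target_hits: int = 100,  # assumed candidate count for the ANN clause
) -> dict:
    """Build a JSON body for Vespa's query API (field and profile names assumed)."""
    if len(embedding) != EMBEDDING_DIMS:
        raise ValueError(f"expected {EMBEDDING_DIMS} dims, got {len(embedding)}")

    # userQuery() matches the text in "query" (the lexical/BM25 side);
    # nearestNeighbor matches on the dense vector passed as query tensor q.
    yql = (
        "select * from sources * where userQuery() or "
        f"({{targetHits:{target_hits}}}nearestNeighbor(embedding, q))"
    )
    return {
        "yql": yql,
        "query": text,
        "input.query(q)": list(embedding),
        "ranking": "hybrid",  # assumed rank profile that blends both signals
        "hits": hits,
    }
```

How the two scores are blended is decided by the rank profile on the Vespa side, which this README does not describe; the body above only retrieves candidates from both branches.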


Version: 1.0.0 (Agent-First Platform)
Status: Operational