Agent-First RAG Platform

Five PydanticAI governance-domain agents with a shared MCP server pool, Pydantic AI Gateway, and AG-UI streaming — built for industrial control system documentation (FANUC, Siemens, PLC).

Architecture

Control Plane                         Data Plane
┌─────────────────────────┐   ┌──────────────────────────┐
│  Pydantic AI Gateway    │   │  Vespa (hybrid search)   │
│  (provider routing,     │   │  Ollama (embeddings/LLM) │
│   cost limits, OTel)    │   │  Docling Serve (parsing) │
├─────────────────────────┤   │  PostgreSQL + pgvector   │
│  5 Governance Agents    │   │  OTel Collector → Jaeger │
│  ┌───────┬───────┐      │   └──────────────────────────┘
│  │Query  │Ingest │      │
│  ├───────┼───────┤      │   UI
│  │Eval   │Memory │      │   ┌──────────────────────────┐
│  ├───────┴───────┤      │   │  CopilotKit + Next.js    │
│  │    System     │      │   │  (AG-UI SSE streaming)   │
│  └───────────────┘      │   └──────────────────────────┘
└─────────────────────────┘

Agents

Agent	Port	Domain	Primary Endpoint
QueryAgent	8010	Runtime Intelligence	`/search`
IngestionAgent	8011	Knowledge Construction	`/ingest`
EvaluationAgent	8012	Quality Assurance	`/score`
MemoryAgent	8013	State Governance	`/store`
SystemAgent	8014	Infrastructure Control	`/execute`

Deployment Modes

In-process (default): Gateway loads all 5 agents via httpx.ASGITransport — single container
Container: Each agent runs as a separate service with HTTP routing

Set AGENT_DEPLOY_MODE=container to run each agent as its own service; leave it unset to keep the in-process default.

Quick Start

Prerequisites

Docker Engine 20.10+ (bundled with Docker Desktop) with Docker Compose 2.0+
Python 3.12+, uv package manager
16GB+ RAM (GPU recommended)

1. Start Core Services

# Core stack (Gateway, Vespa, Ollama, Docling, PostgreSQL, UI)
docker compose -f infra/compose/docker-compose.yml up -d

# With observability (adds OTel Collector + Jaeger)
docker compose -f infra/compose/docker-compose.yml --profile observability up -d

# With GPU acceleration
docker compose -f infra/compose/docker-compose.yml -f infra/compose/docker-compose.gpu.yml up -d

2. Install Dependencies

uv sync

3. Run Tests

uv run python -m pytest tests/ -v

4. Start Gateway (local dev)

uv run uvicorn services.gateway.app.main:create_app --factory --reload --port 8002
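
The `--factory` flag tells uvicorn that `create_app` is a callable returning the application, not the application object itself. As a rough illustration of that pattern, here is a dependency-free ASGI factory. It is only a sketch: the real `services.gateway.app.main:create_app` is not shown in this README, and the `{"status": "ok"}` payload and the `call` helper are assumptions made for the example.

```python
import asyncio
import json


def create_app():
    """Return a fresh ASGI application each time it is called."""

    async def app(scope, receive, send):
        if scope["type"] != "http":
            return  # this sketch only answers plain HTTP requests
        if scope["path"] == "/health":
            status, payload = 200, {"status": "ok"}  # assumed response shape
        else:
            status, payload = 404, {"detail": "not found"}
        await send({
            "type": "http.response.start",
            "status": status,
            "headers": [(b"content-type", b"application/json")],
        })
        await send({"type": "http.response.body", "body": json.dumps(payload).encode()})

    return app


def call(app, path):
    """Drive the app once without a server; returns (status, decoded JSON body)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    asyncio.run(app(scope, receive, send))
    return sent[0]["status"], json.loads(sent[1]["body"])
```

Because the factory is called at startup, each reload builds the application from scratch, which is convenient for local development with `--reload`.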

5. Query

curl -X POST http://localhost:8002/v1/run \
  -H "Content-Type: application/json" \
  -d '{"agent": "query-agent", "prompt": "FANUC alarm codes", "context": {"hits": 5}}'
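
The same request can be issued from Python with only the standard library. This is a minimal sketch that mirrors the curl command above; `build_run_request` and `run` are illustrative helper names, and the shape of the JSON reply is not documented here, so the code simply decodes whatever comes back.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8002"  # Gateway port from the Service Endpoints table


def build_run_request(prompt: str, hits: int = 5, agent: str = "query-agent") -> urllib.request.Request:
    """Build the POST /v1/run request from the curl example without sending it."""
    payload = {"agent": agent, "prompt": prompt, "context": {"hits": hits}}
    return urllib.request.Request(
        url=f"{GATEWAY_URL}/v1/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run(prompt: str, hits: int = 5, timeout: float = 30.0):
    """Send the request to a running gateway and decode the JSON reply."""
    with urllib.request.urlopen(build_run_request(prompt, hits), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the gateway running, `run("FANUC alarm codes")` sends the same body as the curl command.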

Project Structure

RAG/
├── agents/                      # 5 governance-domain agents
│   ├── query/                  # QueryAgent (retrieval + ranking)
│   ├── ingestion/              # IngestionAgent (Docling pipeline)
│   ├── evaluation/             # EvaluationAgent (quality scoring)
│   ├── memory/                 # MemoryAgent (session + long-term)
│   └── system/                 # SystemAgent (git/shell/MCP)
├── libs/rag_common/            # Shared libraries
│   ├── clients/                # Vespa, Ollama, agent_interface
│   ├── models/                 # Pydantic DTOs
│   └── embeddings.py           # Two-tier embedding cache
├── services/gateway/           # Pydantic AI Gateway (PAIG)
├── infra/                      # Docker Compose, initdb, OTel
├── mcp/                        # MCP server configurations
├── servers/                    # MCP tool files (glossary, etc.)
├── skills/                     # Reusable skill functions
├── ui/                         # CopilotKit + Next.js frontend
├── tests/                      # Unit, integration, retrieval, benchmarks
├── infra/migrations/           # Database migration files (V001__*.sql)
├── Docs/                       # Governance documents
│   ├── ADRs/                   # Architecture Decision Records
│   ├── PDR.md                  # Project Design Record
│   └── MEMORY.md               # Memory systems analysis
├── vespa-app/                  # Vespa application package
├── reports/                    # Parity matrix, refactor analysis
└── Obsolete/                   # Retired legacy code (preserved)

Service Endpoints

Service	Port	URL	Purpose
Gateway	8002	http://localhost:8002	Agent orchestration + AG-UI
Vespa	8081	http://localhost:8081	Hybrid search engine
Ollama	11434	http://localhost:11434	Embeddings + LLM
Docling	5001	http://localhost:5001	Document parsing
PostgreSQL	5432	localhost:5432	Agent memory + pgvector
Jaeger	16686	http://localhost:16686	Trace visualization
UI	3000	http://localhost:3000	CopilotKit frontend
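
After `docker compose up`, a quick way to see which services are reachable is to probe the ports in the table. The helper below is a sketch, not part of the repository: `port_open` and `check_services` are hypothetical names, and a successful TCP connect only shows that something is listening, not that the service is healthy. Jaeger will show as down unless the observability profile is enabled.

```python
import socket

# Host ports copied from the Service Endpoints table.
SERVICES = {
    "Gateway": 8002,
    "Vespa": 8081,
    "Ollama": 11434,
    "Docling": 5001,
    "PostgreSQL": 5432,
    "Jaeger": 16686,
    "UI": 3000,
}


def port_open(port: int, host: str = "localhost", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(probe=port_open) -> dict[str, bool]:
    """Map each service name to whether its port accepted a connection."""
    return {name: probe(port) for name, port in SERVICES.items()}
```

Printing `check_services()` gives a one-line overview of the stack; the `probe` argument exists so the check can be exercised without a running stack.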

Gateway Endpoints

GET /health — Health check
GET /v1/providers — Available LLM providers
GET /v1/agents — Registered agents
POST /v1/run — Execute agent action (JSON)
POST /v1/ag-ui — AG-UI SSE streaming
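
Unlike `/v1/run`, the `/v1/ag-ui` endpoint answers with a Server-Sent Events stream, so a client reads events line by line instead of waiting for one JSON document. The parser below is a generic SSE sketch that assumes each event's `data:` field carries a JSON object; `iter_sse_events` is a hypothetical helper, and the exact event payloads emitted by this gateway are not documented in this README.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per SSE event, using the `data:` fields only."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # The SSE format drops a single leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    if data_parts:
        # Design choice in this sketch: still emit an event cut off at end of stream.
        yield json.loads("\n".join(data_parts))
```

Feeding the function the decoded lines of a streaming HTTP response yields events as they arrive, which is what allows a UI to render partial output.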

Documentation

Sources of Truth

ADR — Architecture Decision Records
PDR — Project Design Record
Repo Manifest — File-level source of truth

Guides

Memory Systems — Memory architecture analysis and data-flow documentation
GPU Acceleration — Use docker-compose.gpu.yml overlay (see Quick Start above)

Development

Running Tests

# All tests
uv run python -m pytest tests/ -v

# Gateway tests plus the integration suite
uv run python -m pytest tests/test_gateway.py tests/integration/ -v

# Retrieval tests
uv run python -m pytest tests/retrieval/ -v

Governance Checks

uv run python governance_check.py
uv run python parity_check.py

Technology Stack

Agents: PydanticAI with multi-agent delegation (Levels 2-5)
Gateway: Pydantic AI Gateway (AGPL-3.0, self-hosted)
Search: Vespa.ai (hybrid BM25 + dense retrieval)
Embeddings: Ollama qwen3-embedding:0.6b (1024 dims)
Memory: PostgreSQL + pgvector + Vespa agent memory
Observability: OpenTelemetry → Jaeger
UI: CopilotKit + Next.js (AG-UI protocol)
MCP: Shared server pool with role-based access
Containers: Docker Compose with lockfile-based builds (non-root)
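
To make "hybrid BM25 + dense retrieval" concrete, the sketch below builds a Vespa query-API body that matches documents either lexically (`userQuery()`) or by nearest-neighbor search over an embedding. It is an illustration under stated assumptions, not this project's schema: the field name `embedding`, the query tensor `q`, and the rank profile `hybrid` are hypothetical, and how the two signals are combined is defined by the rank profile in the Vespa application package. The 1024-dimension check reflects the embedding size listed above.

```python
def build_hybrid_query(text: str, embedding: list[float], hits: int = 5) -> dict:
    """Build a Vespa query body combining lexical and nearest-neighbor matching."""
    if len(embedding) != 1024:
        raise ValueError("expected a 1024-dimensional embedding")
    return {
        # Match on the user's text OR on vector proximity (field/tensor names assumed).
        "yql": (
            "select * from sources * where userQuery() or "
            f"({{targetHits:{hits}}}nearestNeighbor(embedding, q))"
        ),
        "query": text,
        "hits": hits,
        "ranking.profile": "hybrid",  # hypothetical rank profile name
        "input.query(q)": embedding,
    }
```

The resulting dictionary would be sent as JSON to Vespa's `/search/` endpoint on port 8081.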

References

PydanticAI: https://ai.pydantic.dev/
Pydantic AI Gateway: https://ai.pydantic.dev/gateway/
AG-UI Protocol: https://docs.ag-ui.com/
Vespa: https://docs.vespa.ai/
CopilotKit: https://docs.copilotkit.ai/
MCP: https://modelcontextprotocol.io/

Version: 1.0.0 (Agent-First Platform)
Status: Operational
