Production-grade framework for building scalable autonomous multi-agent systems
Orchestration engine · Event mesh · Vector memory · Multi-LLM routing · Live dashboard
Agentic AI Framework is a production-ready platform for orchestrating fleets of autonomous AI agents. Agents communicate via an event mesh (Kafka or NATS), share long-term vector memory (Qdrant), coordinate through a DAG-based workflow engine, and are managed through a real-time Next.js dashboard.
User goal → Planner agent decomposes → DAG scheduled → Agents execute in parallel
→ Results synthesised by Coordinator → Report delivered
Click "Launch Demo" on the dashboard to watch all four agents activate simultaneously and process tasks in real time.
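The goal-to-report flow above depends on scheduling a task DAG so that independent tasks run in parallel. The following is a minimal sketch of that idea using only the Python standard library. It is not the framework's own scheduler; `run_dag`, `fake_agent`, and the task names are hypothetical.

```python
import asyncio
from graphlib import TopologicalSorter


async def run_dag(tasks, run_task):
    """Run tasks in dependency order, starting each one once its dependencies finish.

    tasks: dict mapping a task id to the set of task ids it depends on.
    run_task: async callable(task_id, dependency_results) -> result.
    """
    sorter = TopologicalSorter(tasks)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    results = {}
    running = {}  # asyncio.Task -> task id

    while sorter.is_active():
        # Launch every task whose dependencies are all complete.
        for task_id in sorter.get_ready():
            deps = {dep: results[dep] for dep in tasks.get(task_id, ())}
            running[asyncio.create_task(run_task(task_id, deps))] = task_id
        # Wait until at least one running task finishes, then unlock its dependants.
        done, _ = await asyncio.wait(running.keys(), return_when=asyncio.FIRST_COMPLETED)
        for finished in done:
            task_id = running.pop(finished)
            results[task_id] = finished.result()
            sorter.done(task_id)
    return results


async def demo():
    # Hypothetical plan: two independent research tasks feed an analysis, then a report.
    workflow = {
        "research": set(),
        "benchmarks": set(),
        "analyse": {"research", "benchmarks"},
        "report": {"analyse"},
    }
    started = []

    async def fake_agent(task_id, deps):
        started.append(task_id)
        await asyncio.sleep(0.01)  # stands in for an LLM or tool call
        return f"{task_id} done (inputs: {sorted(deps)})"

    results = await run_dag(workflow, fake_agent)
    return started, results


if __name__ == "__main__":
    order, outputs = asyncio.run(demo())
    print(order)
    print(outputs["report"])
```

In this sketch the two research tasks start together, and `analyse` starts only after both have finished. Error handling, priorities, and human-approval checkpoints are deliberately left out.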
┌───────────────────────────────────────────────────────────────────────────┐
│                           Agentic AI Framework                            │
├────────────────────┬──────────────────────┬───────────────────────────────┤
│ Next.js Dashboard  │ FastAPI REST / WS    │ Orchestration Engine          │
│ Light + Dark UI    │ JWT · RBAC · OTel    │ DAG Scheduler                 │
│ Real-time agents   │ OpenAPI docs         │ Parallel task execution       │
│ Workflow builder   │ SSE streaming chat   │ Human-in-the-loop             │
│ Memory inspector   │ WebSocket chat       │ State persistence (Redis)     │
├────────────────────┴──────────────────────┴───────────────────────────────┤
│                                 Agent SDK                                 │
│      PlannerAgent · ExecutorAgent · ResearchAgent · CoordinatorAgent      │
│         AgentRegistry · Capability discovery · Health monitoring          │
├───────────────────────────────────────────────────────────────────────────┤
│                                Event Mesh                                 │
│               Kafka / NATS JetStream: pub/sub, DLQ, replay                │
├──────────────────┬──────────────────────┬─────────────────────────────────┤
│ Memory System    │ LLM Abstraction      │ Tool Registry                   │
│ Redis (STM)      │ OpenAI GPT-4o        │ Web search                      │
│ Qdrant (LTM)     │ Anthropic Claude     │ Python execution                │
│ Semantic RAG     │ Ollama (local)       │ HTTP / API calls                │
│ Auto-summarise   │ Cost-aware router    │ Plugin system                   │
├──────────────────┴──────────────────────┴─────────────────────────────────┤
│                              Infrastructure                               │
│        PostgreSQL · Redis · Kafka · Qdrant · Prometheus · Grafana         │
│             Docker Compose · Kubernetes + Helm · GitHub CI/CD             │
└───────────────────────────────────────────────────────────────────────────┘
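The Event Mesh layer in the diagram lists pub/sub, a dead-letter queue (DLQ), and replay. As a rough illustration of how those three ideas fit together, here is an in-memory sketch. It does not talk to Kafka or NATS, and `InMemoryBus`, the topic name, and the handlers are all hypothetical.

```python
from collections import defaultdict


class InMemoryBus:
    """Toy stand-in for a message broker: append-only log, retries, DLQ."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.log = defaultdict(list)       # topic -> every event ever published
        self.handlers = defaultdict(list)  # topic -> subscriber callbacks
        self.dead_letters = []             # (topic, event, error message)

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def publish(self, topic, event):
        self.log[topic].append(event)
        for handler in self.handlers[topic]:
            self._deliver(topic, event, handler)

    def _deliver(self, topic, event, handler):
        # Retry until the handler succeeds or attempts run out, then dead-letter.
        last_error = None
        for _ in range(self.max_attempts):
            try:
                handler(event)
                return
            except Exception as exc:
                last_error = exc
        self.dead_letters.append((topic, event, str(last_error)))

    def replay(self, topic, handler, from_offset=0):
        """Re-deliver stored events to a handler, for example after a restart."""
        for event in self.log[topic][from_offset:]:
            self._deliver(topic, event, handler)


bus = InMemoryBus(max_attempts=2)
seen = []
attempts = {"count": 0}


def flaky_consumer(event):
    attempts["count"] += 1
    if event["task_id"] == "t2":
        raise ValueError("cannot process t2")
    seen.append(event["task_id"])


bus.subscribe("tasks.completed", flaky_consumer)
bus.publish("tasks.completed", {"task_id": "t1"})
bus.publish("tasks.completed", {"task_id": "t2"})  # fails twice, lands in the DLQ

replayed = []
bus.replay("tasks.completed", replayed.append)  # a new consumer catches up from offset 0
```

Because a handler may be retried or replayed, consumers in an at-least-once system need to tolerate seeing the same event more than once.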
| Category | Capabilities |
|---|---|
| Multi-Agent | Planner, Executor, Research, Coordinator, Validator, Reporter agents; dynamic registration; capability discovery |
| Orchestration | DAG execution, parallel tasks, priority queuing, adaptive routing, human-in-the-loop checkpoints |
| Memory | Redis short-term window, Qdrant vector long-term, semantic RAG retrieval, automatic summarisation |
| Event Mesh | Kafka / NATS with at-least-once delivery, dead-letter queue, event replay, distributed fan-out |
| LLM Support | OpenAI, Anthropic, Ollama — cost-aware routing, circuit breakers, streaming, fallback chains |
| Tools | Web search, Python exec, HTTP requests, dynamic plugin registration, per-tool rate limiting |
| Observability | OpenTelemetry tracing, Prometheus metrics, Grafana dashboards, structured logging |
| Security | JWT auth, API key auth, RBAC, tenant isolation, audit logging, secret management |
| Frontend | Real-time dashboard, agent grid, workflow visualiser, streaming chat, memory inspector, light + dark themes |
| Infrastructure | Docker Compose, Kubernetes manifests, Helm chart, HPA (2→20 replicas), rolling deploys |
| CI/CD | GitHub Actions — lint, type-check, test, security scan, Docker build, deploy to staging/prod |
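The LLM Support row mentions cost-aware routing and fallback chains. One simple way to combine the two, shown here as a sketch under stated assumptions rather than the framework's actual router, is to try providers from cheapest to most expensive and move on when one fails. The `Provider` type, the `route` function, and the cost figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Provider:
    name: str
    cost_per_1k_tokens: float  # illustrative number, not a real price
    call: Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when no provider within budget returned an answer."""


def route(prompt: str, providers: List[Provider], max_cost: float = float("inf")) -> Tuple[str, str]:
    """Try affordable providers from cheapest to most expensive, falling back on errors."""
    candidates = sorted(
        (p for p in providers if p.cost_per_1k_tokens <= max_cost),
        key=lambda p: p.cost_per_1k_tokens,
    )
    errors = []
    for provider in candidates:
        try:
            return provider.name, provider.call(prompt)
        except Exception as exc:
            errors.append(f"{provider.name}: {exc}")
    raise AllProvidersFailed("; ".join(errors) or "no provider within budget")


def local_model(prompt: str) -> str:
    # Simulates a local model server that is not running.
    raise ConnectionError("local server not running")


def hosted_model(prompt: str) -> str:
    return f"echo: {prompt}"


providers = [
    Provider("hosted", 5.0, hosted_model),
    Provider("local", 0.0, local_model),
]

# The free local provider is tried first; its failure triggers the fallback.
chosen, answer = route("hello", providers)
```

A circuit breaker would extend this by skipping a provider for a cooling-off period after repeated failures; that part is omitted here.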
agentic-ai-framework/
├── apps/
│   ├── api/  # FastAPI backend
│   │   ├── main.py  # App factory, lifespan, middleware
│   │   ├── routers/  # agents · workflows · chat · memory · monitoring · demo
│   │   ├── middleware/auth.py  # JWT + API-key auth, RBAC
│   │   └── services/container.py  # Dependency injection root
│   └── web/  # Next.js 15 frontend
│       ├── app/  # App Router pages
│       │   ├── page.tsx  # Dashboard (KPIs, agent grid, health)
│       │   ├── agents/  # Agent management + creation
│       │   ├── workflows/  # Workflow builder + execution timeline
│       │   ├── chat/  # Streaming AI chat (REST / SSE / WebSocket)
│       │   ├── memory/  # Memory store + semantic search
│       │   ├── events/  # Event mesh live feed
│       │   └── monitoring/  # Metrics, agent performance, tool stats
│       ├── components/
│       │   ├── layout/sidebar.tsx  # Collapsible sidebar + theme toggle
│       │   ├── agents/  # AgentStatusGrid with live status badges
│       │   ├── monitoring/  # MetricCard, SystemHealthPanel
│       │   └── providers/  # QueryProvider, ThemeProvider (light/dark)
│       └── lib/api.ts  # Typed API client (axios + interceptors)
├── packages/
│   ├── agents/
│   │   ├── base_agent.py  # Abstract base: lifecycle, memory, tools, LLM
│   │   ├── planner_agent.py  # Recursive goal decomposition → task DAGs
│   │   ├── executor_agent.py  # Code, tool, and reasoning execution
│   │   ├── research_agent.py  # Multi-source information retrieval
│   │   ├── coordinator_agent.py  # Swarm orchestration + multi-agent debate
│   │   └── registry.py  # Capability-based discovery, health tracking
│   ├── orchestrator/engine.py  # DAGScheduler + WorkflowOrchestrator
│   ├── memory/memory_manager.py  # Unified STM/LTM facade with embeddings
│   ├── eventmesh/
│   │   ├── kafka_bus.py  # Kafka producer/consumer, DLQ, replay
│   │   └── nats_bus.py  # NATS JetStream alternative
│   ├── llm/providers.py  # OpenAI / Anthropic / Ollama + cost router
│   ├── tools/registry.py  # Tool registration, permissions, rate limiting
│   └── shared/
│       ├── models.py  # 30+ Pydantic domain models
│       ├── config.py  # Pydantic settings from environment
│       └── utils.py  # Retry, CircuitBreaker, TokenBucket, tracing
├── infrastructure/
│   ├── docker/
│   │   ├── docker-compose.yml  # Full local stack (all services)
│   │   ├── Dockerfile.api  # Multi-stage Python build
│   │   ├── Dockerfile.web  # Standalone Next.js output
│   │   ├── prometheus.yml  # Scrape config
│   │   └── otel-collector-config.yaml
│   └── k8s/
│       ├── namespace.yaml  # NS, ServiceAccount, RBAC, ConfigMap, Secrets
│       ├── deployments/  # API + Web Deployments + HorizontalPodAutoscaler
│       ├── services/  # ClusterIP Services + Ingress
│       └── helm/agentic-ai/  # Helm chart (Chart.yaml + values.yaml)
├── examples/
│   ├── demos/sales_analysis_demo.py  # End-to-end 5-agent pipeline demo
│   └── workflows/sales_analysis_workflow.yaml
├── .github/workflows/ci.yml  # Full CI/CD pipeline
├── scripts/start_dev.sh  # One-command dev environment
├── pyproject.toml  # Python deps (uv / pip)
├── .env.example
└── README.md
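The tree lists a `TokenBucket` helper in `packages/shared/utils.py` and rate limiting in the tool registry. The repository's implementation is not shown here, so the following is only a sketch of the general token-bucket technique; the constructor arguments and the `allow` method are assumptions.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self._clock = clock       # injectable so tests can control time
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if enough are available."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Usage with a fake clock: 2 calls per second, bursts of at most 2.
fake_now = [0.0]
bucket = TokenBucket(rate=2, capacity=2, clock=lambda: fake_now[0])
first_burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 0.5  # half a second later, one token has been refilled
after_wait = [bucket.allow(), bucket.allow()]
```

A per-tool limiter could keep one bucket per tool name and reject or delay a call when `allow()` returns False.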
| Tool | Version | Notes |
|---|---|---|
| Docker | 24+ | For infrastructure services |
| Python | 3.12+ | Backend |
| Node.js | 20+ | Frontend |
| uv | latest | Fast Python package manager |
git clone https://github.com/YOUR_USERNAME/agentic-ai-framework.git
cd agentic-ai-framework
cp .env.example .env

Edit .env and add your API keys:

OPENAI_API_KEY=sk-... # For GPT-4o + embeddings
ANTHROPIC_API_KEY=sk-ant-... # For Claude (optional)

No API keys? Demo mode works without any LLM keys: agents simulate tasks locally.

docker compose -f infrastructure/docker/docker-compose.yml up -d \
  postgres redis qdrant

pip install uv
uv venv && uv pip install -r pyproject.toml
uvicorn apps.api.main:app --reload --port 8010

cd apps/web
npm install
npm run dev -- --port 3002

Open http://localhost:3002. The dashboard loads with 4 agents registered.
Click "Launch Demo" to activate all agents and watch them process tasks in real time.

bash scripts/start_dev.sh

This checks dependencies, starts Docker services, launches the API and frontend, and prints all URLs.
docker compose -f infrastructure/docker/docker-compose.yml up --build
# Endpoints:
# Dashboard → http://localhost:3002
# API → http://localhost:8010
# API Docs → http://localhost:8010/api/v1/docs
# Grafana → http://localhost:3001 (admin / admin)
# Prometheus → http://localhost:9090
# Qdrant UI → http://localhost:6333/dashboard

from typing import Any, Dict, Optional

from packages.agents.base_agent import BaseAgent, AgentContext
from packages.shared.models import AgentConfig, AgentType, TaskDefinition


class MyAgent(BaseAgent):
    def __init__(self, context: Optional[AgentContext] = None):
        config = AgentConfig(
            name="my-agent",
            type=AgentType.EXECUTOR,
            system_prompt="You are specialised in...",
            model_provider="openai",
            model_name="gpt-4o",
        )
        super().__init__(config, context)

    async def plan(self, task: TaskDefinition) -> Dict[str, Any]:
        resp = await self.chat(f"Plan: {task.description}")
        return {"steps": resp.content}

    async def execute(self, task: TaskDefinition, plan: Dict[str, Any]) -> Any:
        resp = await self.chat(f"Execute: {plan['steps']}")
        return resp.content

    # Optional: self-critique pass
    async def reflect(self, task: TaskDefinition, output: Any) -> Optional[Any]:
        return output

Register it:

import asyncio

from packages.agents.registry import AgentRegistry

# Run this inside an async function; `context` is an AgentContext you have already built.
registry = AgentRegistry()
agent = MyAgent(context=context)
await registry.register(agent)
# Keep a reference so the background task is not garbage-collected.
agent_task = asyncio.create_task(agent.run())

name: sales-analysis
tasks:
  - task_id: t1
    name: Research
    agent_type: research
    input_data:
      topic: "industry benchmarks"
    allow_parallel: true
  - task_id: t2
    name: Analyse
    agent_type: executor
    dependencies: [t1]
  - task_id: t3
    name: Report
    agent_type: coordinator
    dependencies: [t1, t2]
    requires_human_approval: false

Execute via API:

curl -X POST http://localhost:8010/api/v1/workflows/execute \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"workflow_id": "...", "input_data": {}}'

curl -X POST http://localhost:8010/api/v1/auth/token \
  -d '{"username": "admin", "password": "demo"}'
# → {"access_token": "eyJ...", "token_type": "bearer"}

| Method | Path | Description |
|---|---|---|
| POST | /api/v1/auth/token | Get JWT token |
| GET | /api/v1/agents | List all agents |
| POST | /api/v1/agents | Create a new agent |
| DELETE | /api/v1/agents/{id} | Terminate agent |
| POST | /api/v1/agents/{id}/tasks | Submit task to agent |
| POST | /api/v1/workflows | Create workflow |
| POST | /api/v1/workflows/execute | Execute workflow |
| GET | /api/v1/workflows/executions | List executions |
| POST | /api/v1/chat/ | Non-streaming chat |
| POST | /api/v1/chat/stream | SSE streaming chat |
| WS | /api/v1/chat/ws/{session} | WebSocket chat |
| POST | /api/v1/memory/store | Store a memory |
| POST | /api/v1/memory/query | Semantic memory search |
| GET | /api/v1/monitoring/health | System health |
| POST | /api/v1/demo/launch | Start demo tasks |
| POST | /api/v1/demo/reset | Reset agents to idle |
Interactive Swagger UI: http://localhost:8010/api/v1/docs
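The token and workflow endpoints listed above can also be called from Python. The sketch below uses only the standard library and mirrors the curl examples. It assumes that the token endpoint accepts a JSON body and that responses are JSON; the helper names (`build_request`, `post_json`, `run_workflow`) are hypothetical, and the shape of the execute response is not specified here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8010/api/v1"  # local default from this README


def build_request(path, payload, token=None):
    """Build a JSON POST request, adding a Bearer token when one is given."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def post_json(path, payload, token=None, timeout=10):
    """Send the request and decode the JSON response body."""
    request = build_request(path, payload, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def run_workflow(workflow_id, username="admin", password="demo"):
    """Fetch a JWT with the demo credentials, then execute a workflow."""
    auth = post_json("/auth/token", {"username": username, "password": password})
    return post_json(
        "/workflows/execute",
        {"workflow_id": workflow_id, "input_data": {}},
        token=auth["access_token"],
    )
```

Calling `run_workflow("<your workflow id>")` requires the API to be running locally; `build_request` can be inspected without any network access.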
| Layer | Technology | Purpose |
|---|---|---|
| Backend | Python 3.12 + FastAPI | REST / WebSocket API |
| Frontend | Next.js 15 + TypeScript | Real-time dashboard |
| Styling | TailwindCSS + shadcn/ui | Light + dark themes |
| Messaging | Apache Kafka / NATS | Event mesh |
| Cache | Redis 7 | Short-term memory, state |
| Database | PostgreSQL 16 | Persistent storage |
| Vector DB | Qdrant | Long-term vector memory |
| LLM APIs | OpenAI, Anthropic, Ollama | Multi-model support |
| Observability | Prometheus + Grafana + OTel | Metrics + tracing |
| Container | Docker + Kubernetes + Helm | Deployment |
| CI/CD | GitHub Actions | Build + deploy pipeline |
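The Cache and Vector DB rows reflect a two-tier memory design: a short-term window (Redis) and long-term vector memory searched by similarity (Qdrant). The sketch below imitates that split in plain Python to show the idea. It uses neither Redis nor Qdrant, the `TwoTierMemory` class is hypothetical, and the two-dimensional embeddings are hand-written stand-ins for real model embeddings.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TwoTierMemory:
    def __init__(self, window=3):
        self.short_term = deque(maxlen=window)  # stands in for the Redis window
        self.long_term = []                     # stands in for a vector collection

    def remember(self, text, embedding):
        self.short_term.append(text)            # oldest entry drops off when full
        self.long_term.append((text, embedding))

    def search(self, query_embedding, top_k=2):
        """Return the top_k stored texts most similar to the query vector."""
        ranked = sorted(
            self.long_term,
            key=lambda item: cosine(query_embedding, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:top_k]]


memory = TwoTierMemory(window=2)
memory.remember("Q1 revenue rose", [1.0, 0.0])
memory.remember("Office moved", [0.0, 1.0])
memory.remember("Q2 revenue flat", [0.9, 0.1])

recent = list(memory.short_term)        # only the last two entries remain
hits = memory.search([1.0, 0.0], top_k=2)  # nearest neighbours of a "revenue" query
```

Retrieval-augmented generation would then paste the retrieved texts into the prompt alongside the short-term window.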
Copy .env.example to .env. Key variables:
# LLM providers (at least one required for non-demo mode)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Infrastructure (defaults work with Docker Compose)
DATABASE_URL=postgresql+asyncpg://agentic:agentic_secret@localhost:5432/agentic_db
REDIS_URL=redis://localhost:6379/0
QDRANT_URL=http://localhost:6333
KAFKA_BOOTSTRAP_SERVERS=localhost:9092
# Auth
JWT_SECRET=change-me-32-chars-minimum
SECRET_KEY=change-me-in-production
# Frontend
NEXT_PUBLIC_API_URL=http://localhost:8010/api/v1

See .env.example for the full list.
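Since the placeholder secrets above must be changed before production use, a startup check can catch a forgotten value early. The tree describes `config.py` as Pydantic settings; the sketch below uses a standard-library dataclass instead so that it runs without dependencies. The `Settings` fields, `load_settings`, and the 32-character minimum (inferred from the placeholder text) are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping

MIN_JWT_SECRET_LENGTH = 32  # inferred from the placeholder "change-me-32-chars-minimum"


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    qdrant_url: str
    kafka_bootstrap_servers: str
    jwt_secret: str
    openai_api_key: str = ""
    anthropic_api_key: str = ""

    @property
    def demo_mode(self) -> bool:
        # With no LLM key configured, only demo mode can work.
        return not (self.openai_api_key or self.anthropic_api_key)


def load_settings(env: Mapping[str, str] = os.environ, production: bool = False) -> Settings:
    """Read settings from a mapping, falling back to the Docker Compose defaults."""
    settings = Settings(
        database_url=env.get(
            "DATABASE_URL",
            "postgresql+asyncpg://agentic:agentic_secret@localhost:5432/agentic_db",
        ),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379/0"),
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        kafka_bootstrap_servers=env.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092"),
        jwt_secret=env.get("JWT_SECRET", ""),
        openai_api_key=env.get("OPENAI_API_KEY", ""),
        anthropic_api_key=env.get("ANTHROPIC_API_KEY", ""),
    )
    if production:
        secret = settings.jwt_secret
        if secret.startswith("change-me") or len(secret) < MIN_JWT_SECRET_LENGTH:
            raise ValueError(
                "JWT_SECRET must be a non-placeholder value of at least 32 characters"
            )
    return settings


# Example: a production-style load with a real-looking secret and no LLM keys.
example = load_settings({"JWT_SECRET": "x" * 32}, production=True)
```

Passing the environment as a mapping keeps the check easy to test; in normal use `load_settings()` reads `os.environ`.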
# Create namespace and apply manifests
kubectl apply -f infrastructure/k8s/namespace.yaml
kubectl apply -f infrastructure/k8s/deployments/
kubectl apply -f infrastructure/k8s/services/
# Or install with Helm
helm install agentic infrastructure/k8s/helm/agentic-ai \
--namespace agentic-ai \
--set secrets.openaiApiKey="sk-..." \
  --set ingress.host="agentic.yourdomain.com"

Push to develop → deploys to staging.
Push to main → deploys to production.
See .github/workflows/ci.yml for the full pipeline.
- LangGraph stateful agent graphs
- Drag-and-drop visual workflow builder
- GraphQL API layer
- Reinforcement learning agent training
- Swarm intelligence protocols
- Federated agent clusters (cross-region)
- AI marketplace for community plugins
- Carbon-aware model routing
# Fork → clone → branch
git checkout -b feat/your-feature
# Install dev deps
uv sync --all-extras
# Run checks
uv run ruff check .
uv run mypy packages apps/api
uv run pytest tests/ -v
# Submit PR

MIT. See LICENSE.