AgentField — The AI Backend

Build and scale AI agents like APIs. Deploy, observe, and prove.

AI has outgrown chatbots and prompt orchestrators. Backend agents need backend infrastructure.

Docs · Quick Start · Python SDK · Go SDK · TypeScript SDK · REST API · Examples · Discord

Now includes Harness Orchestration — multi-turn coding agents with Claude Code, Codex, Gemini CLI, and OpenCode

AgentField is an open-source control plane that lets you build AI agents callable by any service in your stack - frontends, backends, other agents, cron jobs - just like any other API. You write agent logic in Python, Go, or TypeScript. AgentField turns it into production infrastructure: routing, coordination, memory, async execution, and cryptographic audit trails. Every function becomes a REST endpoint. Every agent gets a cryptographic identity. Every decision is traceable.

Demo video: agentfield-quick-start.mp4

_{One prompt → a running, containerized, production-ready multi-agent backend. No glue code: start using the agent API right away.}

Build production agents with a prompt.

Describe the system in one line. Get a production-ready multi-agent backend. Works in Claude Code, Codex, Gemini CLI, OpenCode, Aider, Windsurf, and Cursor.

curl -fsSL https://agentfield.ai/install.sh | bash

Then, in your coding agent, paste any spec after /agentfield:

/agentfield Build a claims processor with risk scoring, pattern detection,
and human approval for low-confidence decisions.

You get a Docker Compose stack wired up end to end: the agent, the control plane, and a production-ready REST API endpoint that you can curl from a terminal right away. See it in action →

The DX you get

Best-in-class Python (or Go / TypeScript) DX with the least intrusive abstraction: no DSL, no YAML, no graph wiring.

from agentfield import Agent, AIConfig
from pydantic import BaseModel

app = Agent(
    node_id="claims-processor",
    version="2.1.0",  # Canary deploys, A/B testing, blue-green rollouts
    ai_config=AIConfig(model="anthropic/claude-sonnet-4-20250514"),
)

class Decision(BaseModel):
    action: str  # "approve", "deny", "escalate"
    confidence: float
    reasoning: str

@app.reasoner(tags=["insurance", "critical"])
async def evaluate_claim(claim: dict) -> dict:
    # Structured AI judgment - returns typed Pydantic output
    decision = await app.ai(
        system="Insurance claims adjuster. Evaluate and decide.",
        user=f"Claim #{claim['id']}: {claim['description']}",
        schema=Decision,
    )

    if decision.confidence < 0.85:
        # Human approval - suspends execution, notifies via webhook, resumes when approved
        await app.pause(
            approval_request_id=f"claim-{claim['id']}",
            approval_request_url=f"https://internal.acme.com/approvals/claim-{claim['id']}",
            expires_in_hours=48,
        )

    # Route to the next agent - traced through the control plane
    await app.call("notifier.send_decision", input={
        "claim_id": claim["id"],
        "decision": decision.model_dump(),
    })

    return decision.model_dump()

app.run()
# This single line exposes: POST /api/v1/execute/claims-processor.evaluate_claim
# The agent auto-registers with the control plane, gets a cryptographic identity, and every
# execution produces a verifiable, tamper-proof audit trail.

What you just saw: app.ai() calls an LLM and returns structured output. app.pause() suspends for human approval. app.call() routes to other agents through the control plane. app.run() auto-exposes everything as REST. Read the full docs →

Prefer to scaffold by hand? (Python / Go / TypeScript / Docker)

# Python
af init my-agent --defaults  # Scaffold agent
cd my-agent && pip install -r requirements.txt
af server          # Terminal 1 → Dashboard at http://localhost:8080
python main.py     # Terminal 2 → Agent auto-registers

# Call your agent
curl -X POST http://localhost:8080/api/v1/execute/my-agent.demo_echo \
  -H "Content-Type: application/json" \
  -d '{"input": {"message": "Hello!"}}'

# Go
af init my-agent --defaults --language go && cd my-agent && go run .

# TypeScript
af init my-agent --defaults --language typescript && cd my-agent && npm install && npm run dev

# Docker (control plane only)
docker run -p 8080:8080 agentfield/control-plane:latest

Deployment guide → for Docker Compose, Kubernetes, and production setups.
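
The curl call above can also be issued from Python with only the standard library. This is a minimal sketch, assuming the control plane is at the default `af server` address and that the request body uses the same `{"input": ...}` envelope as the curl example. The helper name `build_execute_request` is hypothetical, and the response shape is not shown because this README does not document it.

```python
import json
import urllib.request

CONTROL_PLANE = "http://localhost:8080"  # default `af server` address from this README


def build_execute_request(target: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a sync execution request for `agent.function`."""
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url=f"{CONTROL_PLANE}/api/v1/execute/{target}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_execute_request("my-agent.demo_echo", {"message": "Hello!"})
print(req.get_method(), req.full_url)
# → POST http://localhost:8080/api/v1/execute/my-agent.demo_echo

# Once the control plane and agent are running, sending is one more call:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```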

How AgentField fits in your stack

Most agent tools help you write agent logic. AgentField is what runs it in production — the operating layer that makes agents callable by software, durable across failures, governed by policy, and provable by audit.

	Frameworks _{LangChain · CrewAI · PydanticAI · OpenAI Agents SDK}	Workflow engines _{Temporal · Airflow}	Visual builders _{n8n · Zapier}	AgentField
Build agent logic (prompts, tools, structured output)	●	—	—	●
Callable production ready REST APIs out-of-box	—	◐	●	●
Async + retries + webhooks	—	●	◐	●
Memory scopes (global · agent · session · run)	◐	—	—	●
Service discovery + cross-agent calls	—	—	—	●
Distributed agents	—	—	—	●
Tamper-proof, verifiable audit per execution	—	—	—	●
Harness orchestration (Claude Code · Codex · CLI)	—	—	—	●
Identity and Access Management (IAM) for agents	—	—	—	●
Fleet observability (DAGs · metrics · traces)	—	◐	—	●
Multi-language SDKs (Python · Go · TypeScript)	◐	●	—	●

_{● full · ◐ partial · — not the focus}

Use a framework when you're proving behavior. Use AgentField when agents need to be production systems — callable by software, coordinating across services, surviving failures, and governed under audit.

Full comparison & decision guide →

What You Get

Build - Python, Go, or TypeScript. Every function becomes a REST endpoint.

Reasoners & Skills - @app.reasoner() for AI judgment, @app.skill() for deterministic code
Structured AI - app.ai(schema=MyModel) → typed Pydantic/Zod output from any LLM
Harness - app.harness("Fix the bug") dispatches multi-turn tasks to Claude Code, Codex, Gemini CLI, or OpenCode
Cross-Agent Calls - app.call("other-agent.func") routes through the control plane with full tracing
Discovery - app.discover(tags=["ml*"]) finds agents and capabilities across the mesh. tools="discover" lets LLMs auto-invoke them.
Memory - app.memory.set() / .get() / .search() - KV + vector search, four scopes, no Redis needed

Run - Production infrastructure for non-deterministic AI.

Async Execution - Fire-and-forget with webhooks, SSE streaming, retries. No timeout limits - agents run for hours or days.
Human-in-the-Loop - app.pause() suspends execution for human approval. Crash-safe, durable, audited.
Canary Deployments - Traffic weight routing, A/B testing, blue-green deploys. Roll out agent versions at 5% → 50% → 100%.
Observability - Automatic workflow DAGs, Prometheus /metrics, structured logs, execution timeline.

Govern - IAM for AI agents. Identity, access control, and audit trails - built in.

Cryptographic Identity - Every agent gets a W3C DID (decentralized identifier) - not a shared API key. Agents authenticate to each other the way services authenticate with mTLS, but with cryptographic signatures that travel with the agent.
Verifiable Credentials - Tamper-proof receipt for every execution. Offline-verifiable: af vc verify audit.json.
Policy Enforcement - Tag-based policy gates with cryptographic verification. "Only agents tagged 'finance' can call this" - enforced by infrastructure, not prompts.

See the full production-ready feature set →

Full capabilities

AI & LLM

Feature	How
Structured output (Pydantic/Zod)	`app.ai(schema=MyModel)`
Multi-turn coding agents	`app.harness("task", provider="claude-code")`
LLM auto-discovers agents and tools	`app.ai(tools="discover")`
Multimodal (text, image, audio)	`app.ai("Describe", image_url="...")`
Streaming responses	`app.ai("...", stream=True)`
100+ LLMs via LiteLLM	`AIConfig(model="anthropic/claude-sonnet-4-20250514")`
Temperature, max tokens, format	`app.ai(..., temperature=0.2)`

Agent Mesh & Discovery

Feature	How
Cross-agent calls with tracing	`app.call("agent.func", input={...})`
Discover agents by tag (wildcards)	`app.discover(tags=["ml*"])`
Discover by health status	`app.discover(health_status="active")`
Agent routers (namespacing)	`AgentRouter(prefix="billing")`
Auto context propagation	Workflow, session, actor IDs forwarded
Parallel agent execution	`asyncio.gather(app.call(...), ...)`
Auto-registration on startup	Service mesh with zero config
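
To make the tag-wildcard and health filters in the table above concrete, here is a small in-memory sketch. It is not AgentField's implementation: the `AgentRecord` type, the standalone `discover` function, and the use of shell-style glob matching for `"ml*"` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class AgentRecord:
    """Hypothetical registry entry; field names are illustrative."""
    node_id: str
    tags: list[str] = field(default_factory=list)
    health_status: str = "active"


def discover(registry, tags=None, health_status=None):
    """Return node IDs whose tags match any pattern, optionally filtered by health."""
    matches = []
    for agent in registry:
        if health_status is not None and agent.health_status != health_status:
            continue
        if tags is not None and not any(
            fnmatchcase(tag, pattern) for tag in agent.tags for pattern in tags
        ):
            continue
        matches.append(agent.node_id)
    return matches


registry = [
    AgentRecord("fraud-model", ["ml-scoring", "finance"]),
    AgentRecord("summarizer", ["ml-nlp"], health_status="degraded"),
    AgentRecord("notifier", ["messaging"]),
]

print(discover(registry, tags=["ml*"]))
# → ['fraud-model', 'summarizer']
print(discover(registry, tags=["ml*"], health_status="active"))
# → ['fraud-model']
```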

Execution Engine

Feature	How
Sync execution (REST)	`POST /api/v1/execute/{agent}.{func}`
Async (fire-and-forget)	`POST /api/v1/execute/async/{agent}.{func}`
Webhooks + HMAC-SHA256 signing	`AsyncConfig(webhook_url="...", secret="...")`
SSE streaming (real-time)	`/api/v1/execute/stream/{id}`
No timeout limits (hours/days)	Control plane allows unlimited duration
Execution polling	`GET /api/v1/executions/{id}`
Batch status checks	`POST /api/v1/executions/batch-status`
Progress updates mid-execution	Intermediate payloads during long tasks
Auto retries + exponential backoff	Transparent - control plane handles
Backpressure + queue depth limits	Fair scheduling, circuit breakers
Durable queue (PostgreSQL)	Atomic lease-based processing
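
A receiver of signed webhooks typically recomputes the HMAC over the raw request body and compares it in constant time. The sketch below shows that pattern with Python's standard library. It assumes the signature is a hex-encoded HMAC-SHA256 of the exact body bytes; the header that carries the signature and the exact encoding AgentField uses are not stated in this README, so check the docs before relying on either.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw body (encoding is an assumption)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_webhook(secret: str, body: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, received_signature)


secret = "example-webhook-secret"  # hypothetical value
body = b'{"execution_id": "exec-123", "status": "completed"}'  # hypothetical payload
signature = sign_payload(secret, body)

print(verify_webhook(secret, body, signature))
# → True
print(verify_webhook(secret, body + b" ", signature))
# → False
```

Verify against the raw bytes as received, before any JSON parsing, because re-serializing the payload can change whitespace and invalidate the signature.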

Memory (Distributed State)

Feature	How
Key-value storage	`app.memory.set(key, value)` / `.get(key)`
Vector search (semantic)	`app.memory.search(embedding, top_k=5)`
Four scopes	Global, agent, session, run
Reactive memory events	`@app.memory.on_change("order_*")`
Metadata filtering	Filter stored values by metadata
Zero dependencies	Built into control plane - no Redis
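
The four scopes can be pictured as nested namespaces over one key-value store. The sketch below is a toy, in-process model and not the control plane's storage. The `ScopedMemory` class is hypothetical, and the narrowest-scope-first fallback on reads (run, then session, then agent, then global) is an assumption; the README names the scopes but does not specify lookup order.

```python
SCOPES = ("run", "session", "agent", "global")  # narrowest first (assumed order)


class ScopedMemory:
    """Toy key-value store partitioned by (scope, scope_id)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, scope="global", scope_id=""):
        if scope not in SCOPES:
            raise ValueError(f"unknown scope: {scope}")
        self._data[(scope, scope_id, key)] = value

    def get(self, key, context, default=None):
        """`context` maps scope name to ID, e.g. {"run": "r1", "session": "s1"}."""
        for scope in SCOPES:
            scope_id = "" if scope == "global" else context.get(scope)
            if scope_id is None:
                continue  # caller has no ID for this scope
            entry = (scope, scope_id, key)
            if entry in self._data:
                return self._data[entry]
        return default


mem = ScopedMemory()
mem.set("tone", "formal")                                   # global default
mem.set("tone", "casual", scope="session", scope_id="s1")   # session override

print(mem.get("tone", {"session": "s1"}))
# → casual
print(mem.get("tone", {"session": "s2"}))
# → formal
```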

Human-in-the-Loop

Feature	How
Durable pause/resume	`await app.pause(reason="...")`
Approval workflows with UI	`approval_request_url` for reviewers
Configurable timeouts	`expires_in_hours=24` + auto-escalation
Crash-safe state	Survives agent restarts

Canary Deployments & Versioning

Feature	How
Traffic weight routing	5% → 50% → 100% rollouts
A/B testing	50/50 splits with `X-Routed-Version`
Blue-green deployments	Instant weight switch, zero downtime
Per-version health tracking	Unhealthy versions auto-removed
Agent lifecycle states	pending → starting → ready → degraded → offline
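
Weighted routing between versions can be sketched as mapping each request to a bucket and walking the cumulative weights. This example hashes a request ID so that routing is deterministic per request; that choice, the `pick_version` name, and the weight format are assumptions for illustration, not a description of how the control plane selects a version.

```python
import hashlib


def pick_version(request_id: str, weights: dict[str, int]) -> str:
    """Map a request to a version in proportion to integer weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one version needs a positive weight")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    cumulative = 0
    for version, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return version
    raise AssertionError("unreachable: bucket is always below total")


weights = {"2.0.0": 95, "2.1.0": 5}  # 5% canary
routed = [pick_version(f"req-{i}", weights) for i in range(10_000)]
canary_share = routed.count("2.1.0") / len(routed)
print(f"canary share: {canary_share:.1%}")  # close to 5%
```

Promoting the canary is then a matter of changing the weights, for example to `{"2.0.0": 50, "2.1.0": 50}` and finally `{"2.0.0": 0, "2.1.0": 100}`.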

Identity & Governance

Feature	How
Cryptographic identity per agent	Auto-generated W3C DID + Ed25519 keys
Verifiable Credentials	Tamper-proof receipt per execution
Offline VC verification	`af vc verify audit.json`
Tag-based access policies	ALLOW/DENY rules on caller → target tags
Cryptographically signed requests	Ed25519 signatures on cross-agent calls
VC hierarchy (3 tiers)	Platform → Node → Function control
Agent notes (audit log)	`app.note("Decision", tags=["critical"])`
Non-repudiation	Cryptographic proof of actions
Permission request workflows	Auto-created when access denied
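
The ALLOW/DENY rules on caller and target tags can be illustrated with a small evaluator. This is a sketch under stated assumptions: the `Rule` shape is hypothetical, a matching DENY overrides any ALLOW, and a call with no matching rule is denied by default. The README does not specify AgentField's precedence rules, and the cryptographic verification of the caller's identity is out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical policy rule keyed on one caller tag and one target tag."""
    effect: str       # "ALLOW" or "DENY"
    caller_tag: str
    target_tag: str


def is_allowed(rules, caller_tags, target_tags) -> bool:
    """Deny-overrides evaluation with default deny (both are assumptions)."""
    matched = [
        r for r in rules
        if r.caller_tag in caller_tags and r.target_tag in target_tags
    ]
    if any(r.effect == "DENY" for r in matched):
        return False
    return any(r.effect == "ALLOW" for r in matched)


rules = [
    Rule("ALLOW", caller_tag="finance", target_tag="payments"),
    Rule("DENY", caller_tag="experimental", target_tag="payments"),
]

print(is_allowed(rules, {"finance"}, {"payments"}))
# → True
print(is_allowed(rules, {"finance", "experimental"}, {"payments"}))
# → False
print(is_allowed(rules, {"marketing"}, {"payments"}))
# → False
```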

Observability & Fleet Management

Feature	How
Automatic DAG visualization	Workflow graphs in dashboard
Prometheus metrics	`/metrics` out of the box
Structured JSON logging	Automatic from SDK
Execution timeline	Chronological decision trace
Health checks (K8s-ready)	`/health`, `/ready` endpoints
Correlation IDs	`X-Workflow-ID`, `X-Execution-ID`
Workflow DAG API	`GET /api/v1/workflows/{id}/dag`
Agent heartbeat monitoring	Auto health status transitions

Harness (Multi-turn Coding Agents)

Feature	How
4 providers	Claude Code, Codex, Gemini CLI, OpenCode
Schema-constrained output	`schema=ResultModel` (Pydantic/Zod)
Cost capping	`max_budget_usd=3.0`
Turn limiting	`max_turns=100`
Tool access control	`tools=["Read", "Write", "Bash"]`
Environment injection	`env={"KEY": "value"}`
System prompt override	`system_prompt="..."`
Multi-layer output recovery	Cosmetic repair → retry → full retry
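
The idea behind layered output recovery is to try cheap fixes before paying for another model run. The sketch below illustrates that escalation for JSON output: parse as-is, then apply a cosmetic repair that trims any text around the outermost braces, then fall back to a retry. How AgentField implements each layer is not described in this README; `cosmetic_repair`, `recover`, and the `retry` callback are hypothetical names, and the two retry layers are collapsed into one callback for brevity.

```python
import json


def cosmetic_repair(raw: str) -> str:
    """Keep only the span from the first '{' to the last '}', if present."""
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        return raw[start:end + 1]
    return raw


def recover(raw: str, retry) -> dict:
    """Try the raw text, then the repaired text, then a fresh attempt."""
    for candidate in (raw, cosmetic_repair(raw)):
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    # Last resort: ask for a new output (stands in for retry and full retry).
    return json.loads(retry())


def fake_retry() -> str:
    """Stand-in for re-running the harness; returns well-formed JSON."""
    return '{"status": "retried"}'


print(recover('Here is the result:\n{"status": "fixed"}\nDone.', fake_retry))
# → {'status': 'fixed'}
print(recover("no structured output at all", fake_retry))
# → {'status': 'retried'}
```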

Connector API (Fleet Management)

Feature	How
Remote agent management	`/connector/reasoners`
Version traffic control	`/connector/.../weight`
Bearer token auth	`AGENTFIELD_CONNECTOR_TOKEN`
Air-gapped deployment	Outbound WebSocket only

Developer Experience

Feature	How
CLI scaffolding	`af init my-agent --defaults --language python\|go\|typescript`
Local dev with dashboard	`af server` → http://localhost:8080
Hot reload	`af dev` auto-detects changes
Auto-REST from decorators	Every `@app.reasoner()` → `POST /api/v1/execute/...`
Python, Go, TypeScript SDKs	Native patterns per language
MCP server integration	`af add --mcp --url <server>`
Config storage API	`POST /api/v1/configs/:key` - database-backed
Docker + Kubernetes ready	Stateless control plane, horizontal scaling

Explore all features in detail →

Built With AgentField

Autonomous Engineering Team _{One API call spins up PM, architect, coders, QA, reviewers - hundreds of coordinated agents that plan, build, test, and ship.} View project →	Deep Research Engine _{Recursive research backend. Spawns parallel agents, evaluates quality, generates deeper agents, and recurses - 10,000+ agents per query.} View project →
Reactive MongoDB Intelligence _{Atlas Triggers + agent reasoning. Documents arrive raw and leave enriched - risk scores, pattern detection, evidence chains.} View project →	Autonomous Security Audit _{250 coordinated agents trace every vulnerability source-to-sink and adversarially verify each finding. Confirmed exploits, not pattern flags.} View project →
CloudSecurity AF _{AI-native cloud infrastructure security scanner that performs shift-left attack path analysis directly from IaC, prioritizing the most dangerous risk chains before deployment.} View project →	Agentic PR Reviewer _{Builds a custom review strategy for every PR - spawns parallel reviewer agents with runtime-crafted prompts, adversarially challenges its own findings, and posts evidence-grounded inline comments.} View project →

See all examples →

Built something with AgentField? Submit your project to be featured on the examples page.

See It In Action

_{Real-time workflow DAGs · Execution traces · Agent fleet management · Audit trails}

Architecture

The control plane is a stateless Go service. Agents connect from anywhere - your laptop, Docker, Kubernetes. They register capabilities, the control plane routes calls between them, tracks execution as DAGs, and enforces policies. Full architecture docs →

Learn More

The thinking behind AgentField - essays on AI backends, harness orchestration, and the infrastructure production agents actually need.

What is harness orchestration? _{The atomic unit of intelligence is climbing from the model call to the autonomous harness - and what changes when it does.} Read post →	Part 1: The Black Box _{Treating harnesses like Claude Code and Codex as autonomous, embodied, persistent computational entities.} Read post →
Part 2: Engineering the Membrane _{Shaping the boundary surface of a harness across four engineerable dimensions: workspace, drift, verifier placement, and recovery budget.} Read post →	The AI Backend _{Our thesis: in five years every serious software company will run an AI backend - a reasoning layer that makes the decisions that used to be hardcoded.} Read post →
IAM for AI Backends _{Agents need identity, not API keys - how decentralized identifiers and verifiable credentials make agent-to-agent delegation auditable and accountable.} Read post →

Documentation

vs Agent Frameworks - How AgentField compares to LangChain, CrewAI, and workflow engines
Full Documentation

Community

GitHub Issues · Documentation · Examples

License

Apache 2.0
