# Agent Deployment Modes
Same agent, three ways to ship it. Each mode is one line of glue away from the next — you can start small and eject the agent into autonomy when it earns its own deployment.
```
                  ┌─────────────────────────┐
 library mode     │   agent { skills { } }  │   in your JVM
                  └────────────┬────────────┘
                               │  +1 line
                               ▼
                  ┌─────────────────────────┐
 hosted mode      │  McpServer.from(agent)  │   in your JVM, but addressable
                  └────────────┬────────────┘
                               │  +1 line
                               ▼
                  ┌─────────────────────────┐
 autonomous mode  │  McpRunner.serve(...)   │   its own process / JAR / image
                  └─────────────────────────┘
```
| Mode | Glue | Where it runs | Who can call it |
|---|---|---|---|
| Library | `agent<IN, OUT>("...") { skills { ... } }` | Inside your existing JVM, in-process | Your Kotlin code, with full type safety |
| Hosted | + `McpServer.from(agent) { expose("...") }.start()` | Inside your existing JVM, plus an MCP endpoint | Your Kotlin code (typed) AND any MCP client (over HTTP) |
| Autonomous | `fun main(args) = exitProcess(McpRunner.serve(agent, args))` | Its own process / JAR / Docker image / native binary | Any MCP client, anywhere on the network |
## Library mode

The default. An agent is just a Kotlin function with type-safe inputs and outputs:
```kotlin
val classifier = agent<String, Category>("classifier") {
    skills {
        skill<String, Category>("classify", "Classifies free-text input") {
            implementedBy { text -> Category.of(text) }
        }
    }
}

val result: Category = classifier("Order status request")
```

- Zero overhead. No serialization, no wire, no protocol. Direct Kotlin call.
- Type-safe at compile time. `classifier(42)` won't compile.
- Deploy boundary is the host app. The agent ships when your app ships.
Best for: embedding agent logic in an existing Kotlin service, batch jobs, scripts, libraries you publish to Maven Central.
## Hosted mode

Add `McpServer.from(agent).start()` inside your service. The agent stays callable internally (typed, zero overhead), and external MCP clients can also reach it over HTTP:
```kotlin
fun main() {
    val classifier = agent<String, Category>("classifier") { /* same as above */ }

    // Existing internal callers — unchanged, still typed:
    val result: Category = classifier("Order status request")

    // New: the same agent, also addressable over MCP
    val server = McpServer.from(classifier) {
        port = 8080
        expose("classify")
    }.start()

    Runtime.getRuntime().addShutdownHook(Thread { server.stop() })
    Thread.currentThread().join()
}
```

- Same agent instance serves both internal and external callers.
- Shared lifecycle with the host process — the agent dies when the host dies.
- Backward compatible — your existing internal call sites don't change.
Best for: existing services that want to also expose agent capabilities to Claude Code / Cursor / other MCP-aware tools, sidecars, internal platforms wanting to standardize on MCP for cross-team consumption.
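For orientation, here is roughly what an external caller's request to the hosted endpoint looks like. The JSON-RPC envelope (`tools/call` with `name` and `arguments`) comes from the MCP specification; the endpoint path `/mcp`, the argument field name `text`, and the helper `classifyRequest` are assumptions for this sketch, not part of this framework's documented surface:

```kotlin
import java.net.URI
import java.net.http.HttpRequest

// Builds the JSON-RPC 2.0 envelope MCP uses for tool calls.
// The "/mcp" path and "text" argument key are illustrative assumptions.
fun classifyRequest(baseUrl: String, input: String): HttpRequest {
    val body = """{"jsonrpc":"2.0","id":1,"method":"tools/call",""" +
        """"params":{"name":"classify","arguments":{"text":"$input"}}}"""
    return HttpRequest.newBuilder(URI.create("$baseUrl/mcp"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()
}

fun main() {
    val req = classifyRequest("http://localhost:8080", "Order status request")
    println(req.method())  // POST
    println(req.uri())
    // Send with java.net.http.HttpClient once the hosted server is running.
}
```

The request is only built here, not sent, so the sketch runs without a live server.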
## Autonomous mode

The agent IS its own deployable unit. One line of `main` is all you write — `McpRunner` does parse + start + shutdown hook + block:
```kotlin
val coder = agent<Spec, CodeBundle>("coder") {
    model { ollama("gpt-oss:120b-cloud") }
    skills {
        skill<Spec, CodeBundle>("write-code", "Generate Kotlin from a spec") {
            tools(/* ... */)
        }
    }
}

fun main(args: Array<String>) = exitProcess(McpRunner.serve(coder, args) {
    port = 8080
    expose("write-code")
})
```

`./gradlew shadowJar` produces a fat JAR. `java -jar coder.jar --port 9000 --expose write-code` runs it. Wrap that in a Dockerfile (one `FROM eclipse-temurin:21-jre` line) and you have a container. Or `--no-jre` via GraalVM native image (Phase 2) for a single binary.
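To make the CLI contract above concrete, here is a self-contained toy of the flag parsing a runner like `McpRunner` presumably performs for `--port` and `--expose`. The `RunnerConfig` type and `parseRunnerArgs` function are invented for this sketch; the framework's actual parsing may handle more flags and errors differently:

```kotlin
// Hypothetical sketch: only the two flags shown above are handled.
data class RunnerConfig(val port: Int, val exposed: List<String>)

fun parseRunnerArgs(args: Array<String>, defaultPort: Int = 8080): RunnerConfig {
    var port = defaultPort
    val exposed = mutableListOf<String>()
    var i = 0
    while (i < args.size) {
        when (args[i]) {
            "--port" -> port = args[++i].toInt()      // next token is the port
            "--expose" -> exposed += args[++i]        // repeatable: one skill per flag
            else -> error("Unknown flag: ${args[i]}")
        }
        i++
    }
    return RunnerConfig(port, exposed)
}

fun main() {
    // Mirrors: java -jar coder.jar --port 9000 --expose write-code
    val cfg = parseRunnerArgs(arrayOf("--port", "9000", "--expose", "write-code"))
    println(cfg)  // RunnerConfig(port=9000, exposed=[write-code])
}
```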
- Independent deploy/scale/restart. The agent is a service.
- Language-neutral consumers. Any MCP client speaks to it — Python LLM frameworks, IDEs, JS tools, other agents.
- Operational footprint. It's a process you run, monitor, version, and roll back.
Best for: production agent fleets, agent-as-a-service offerings, multi-tenant deployments, anywhere the deploy boundary should match the agent boundary.
## Mode comparison

| Concern | Library | Hosted | Autonomous |
|---|---|---|---|
| Per-call overhead | None (direct call) | None internally; HTTP for external | HTTP / serialization always |
| Compile-time type safety | ✅ everywhere | ✅ internal callers; schema-validated externally | Schema-validated only |
| Deploy unit | Host app | Host app | The agent itself |
| Independent scaling | Tied to host | Tied to host | Yes |
| Failure isolation | None (in-process) | None (in-process) | Yes (separate process) |
| Cross-language consumers | Kotlin / JVM only | Any MCP client | Any MCP client |
| Operational cost | Zero (rides host) | Zero (rides host) | Process to run/monitor |
## Beyond a single agent: the Swarm topology

When you want multiple agents to live in the same JVM but each carry its own `Agent<IN, OUT>` surface (prompt, knowledge, memory, hooks), ship them as separate JARs and let a captain `ServiceLoader`-discover and `absorb()` the rest. Each sibling becomes a tool the captain can call; personality is preserved end-to-end. See Swarm for the mechanism.
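A minimal toy of the absorb pattern, to show the shape of the idea. The `Sibling` interface, `Captain` class, and `absorb` signature here are invented for illustration and may not match the framework's Swarm API; real discovery would use `java.util.ServiceLoader` over separately shipped JARs, for which a direct call stands in below:

```kotlin
// Invented types for this sketch; the framework's Swarm API may differ.
interface Sibling {
    val name: String
    fun invoke(input: String): String
}

class Captain {
    private val tools = mutableMapOf<String, Sibling>()

    // Each absorbed sibling becomes a named tool the captain can call,
    // keeping its own behavior ("personality") intact.
    fun absorb(sibling: Sibling) { tools[sibling.name] = sibling }

    fun call(tool: String, input: String): String =
        tools.getValue(tool).invoke(input)
}

fun main() {
    val classifier = object : Sibling {
        override val name = "classifier"
        override fun invoke(input: String) = "category:$input"
    }
    val captain = Captain()
    // In a real swarm: ServiceLoader.load(Sibling::class.java).forEach(captain::absorb)
    captain.absorb(classifier)
    println(captain.call("classifier", "refund"))  // category:refund
}
```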
You don't have to pick once. The three modes are a path, not a partition:

- Start library. Wire the agent into the code that needs it. Iterate fast — no infra to think about, type errors caught at `./gradlew compileKotlin`.
- Add hosted when external callers appear. Someone wants to call your agent from Claude Code, or another team wants programmatic access. One `McpServer.from(agent).start()` and your existing internal call sites are unchanged.
- Eject to autonomous when the deploy unit needs to be the agent itself. Independent scaling, separate ops budget, language-neutral fleet. One `main` line, one `gradle shadowJar`, done.
The agent definition itself is the same Kotlin code in all three modes. Only the wiring around it changes.
See also: MCP Integration | Architecture Overview | Roadmap