skobeltsyn edited this page May 4, 2026 · 1 revision

Swarm — multi-agent JAR composition

A captain agent discovers sibling agents from JARs on its classpath via Java's ServiceLoader, then absorbs each one as a tool. Each sibling keeps its full Agent<IN, OUT> surface — prompt, skills, knowledge, memory bank, observability hooks — and the captain composes them through delegation. (See: #984)

import agents_engine.runtime.AgentProvider
import agents_engine.runtime.Swarm
import agents_engine.runtime.absorb
import agents_engine.runtime.LiveRunner

class CoderProvider : AgentProvider {
    override fun build(): Agent<*, *> = agent<String, String>("coder") { /* ... */ }
}

fun main(args: Array<String>) {
    val me = agent<String, String>("captain") {
        prompt("You route requests to the right tool.")
        skills { skill<String, String>("dispatch", "route") { tools() } }
    }
    Swarm.discover()
        .filterNot { it.name == me.name }
        .forEach { sibling -> me.absorb(sibling) }
    LiveRunner.serve(me, args)
}

Each sibling JAR ships META-INF/services/agents_engine.runtime.AgentProvider pointing at its provider class. ServiceLoader walks the classpath, instantiates each provider, and Swarm.discover() returns the resulting list of Agent instances.

How absorb works

Source: src/main/kotlin/agents_engine/runtime/Swarm.kt. Three steps:

1. Validation (fail-fast)

require(sibling.name != this.name)              // can't absorb self
require(sibling.name !in this.toolMap)           // no tool-name collision
require(firstSkill.inType == String::class)     // sibling must accept String input
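The three checks can be tried in isolation. The following is a minimal sketch, not the framework's code: MiniAgent, MiniSkill, validateAbsorb, and absorbError are hypothetical stand-ins, and taking firstSkill as the first entry of the sibling's skill list is an assumption. Only the three require conditions mirror the excerpt above.

```kotlin
import kotlin.reflect.KClass

// Hypothetical stand-ins; the real Agent and skill types live in agents_engine.
class MiniSkill(val name: String, val inType: KClass<*>)

class MiniAgent(val name: String, val skills: List<MiniSkill>) {
    // Tool name -> description; stands in for the captain's real tool registry.
    val toolMap = mutableMapOf<String, String>()

    // Mirrors the three fail-fast checks; each failure throws IllegalArgumentException.
    fun validateAbsorb(sibling: MiniAgent) {
        require(sibling.name != name) { "agent '$name' cannot absorb itself" }
        require(sibling.name !in toolMap) { "tool name '${sibling.name}' is already taken" }
        // Assumption: firstSkill is the first skill the sibling declares.
        val firstSkill = sibling.skills.first()
        require(firstSkill.inType == String::class) {
            "sibling '${sibling.name}' must accept String input, got ${firstSkill.inType.simpleName}"
        }
    }
}

// Returns the failure message, or null when validation passes.
fun absorbError(captain: MiniAgent, sibling: MiniAgent): String? =
    runCatching { captain.validateAbsorb(sibling) }.exceptionOrNull()?.message
```

Because require throws IllegalArgumentException, a rejected sibling surfaces at composition time, before the captain starts serving requests.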

2. Wrap the sibling as one tool on the captain

val tool = ToolDef(
    name = sibling.name,                                 // tool name = sibling agent name
    description = "Delegate to \"${sibling.name}\". Skills: " +
        sibling.skills.values.joinToString("; ") { "${it.name}${it.description}" },
) { args ->
    val query = args["query"]?.toString() ?: args.values.firstOrNull()?.toString() ?: ""
    @Suppress("UNCHECKED_CAST")
    val asString = sibling as Agent<String, *>
    asString.invoke(query)?.toString() ?: "null"        // FRESH invocation of the sibling
}

The tool's executor is a closure capturing the sibling reference. Every call invokes the whole sibling agent — its own LLM, its own prompt, its own tools, its own observability hooks.
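To make the argument handling concrete, here is a small sketch of the same closure pattern. MiniTool and wrapAsTool are hypothetical names, and a plain function stands in for the sibling's invoke. The query-extraction and null-handling expressions are copied from the excerpt above.

```kotlin
// Hypothetical stand-in for the framework's tool type: a name plus an executor closure.
class MiniTool(val name: String, val execute: (Map<String, Any?>) -> String)

// Wraps a String -> Any? function (standing in for a sibling agent's invoke) as one tool.
fun wrapAsTool(siblingName: String, invokeSibling: (String) -> Any?): MiniTool =
    MiniTool(siblingName) { args ->
        // Prefer the "query" argument, fall back to the first value, then to "".
        val query = args["query"]?.toString() ?: args.values.firstOrNull()?.toString() ?: ""
        // The closure captures invokeSibling, so every tool call is a new sibling invocation.
        invokeSibling(query)?.toString() ?: "null"
    }
```

With this shape, a call such as `wrapAsTool("echo") { it.uppercase() }.execute(mapOf("text" to "yo"))` still reaches the sibling even though the argument is not named query, because the first value is used as the fallback.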

3. Make the tool reachable from every captain skill

registerBuiltInTool(tool)        // unguarded path; bypasses agent.frozen check
enableAutoTool(tool.name)        // auto-include in every skill's tool list

enableAutoTool adds the name to the captain's autoToolNames set. The agentic loop builds each invocation's tool list as skill.toolNames + agent.autoToolNames, so the absorbed sibling appears as a callable tool on every skill the captain has.
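The tool-list rule can be sketched in a few lines. SkillSpec, CaptainTools, and toolsFor are hypothetical names for illustration; only the expression skill.toolNames + autoToolNames comes from the description above.

```kotlin
// Hypothetical stand-ins for the captain's skill and auto-tool bookkeeping.
class SkillSpec(val name: String, val toolNames: List<String>)

class CaptainTools {
    // Insertion-ordered, so absorbed siblings are listed in the order they were absorbed.
    val autoToolNames = linkedSetOf<String>()

    fun enableAutoTool(name: String) {
        autoToolNames += name
    }

    // Tool list for one invocation: the skill's own tools plus every auto tool.
    fun toolsFor(skill: SkillSpec): List<String> = skill.toolNames + autoToolNames
}
```

A skill that declares no tools of its own still receives every absorbed sibling, which is why the captain in the opening example can use an empty tools() block.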

Runtime flow

When the captain's LLM decides to call an absorbed sibling:

┌───────────────────────────────────────────────────────────────┐
│ Captain — its LLM sees:                                       │
│   tools = [own tools…, factor (absorbed), exit (absorbed)]    │
│   user input = "factor 30"                                    │
│   → decides: call factor(query="factor 30")                   │
└───────────────────────────────────────────────────────────────┘
                       │
                       ▼
┌───────────────────────────────────────────────────────────────┐
│ Captain's tool executor (the closure)                         │
│   args = {query: "factor 30"}                                 │
│   → sibling.invoke("factor 30")           [NEW agentic loop]  │
└───────────────────────────────────────────────────────────────┘
                       │
                       ▼
┌───────────────────────────────────────────────────────────────┐
│ Sibling — its LLM sees:                                       │
│   prompt = sibling's own prompt                               │
│   tools = [factor_number]                                     │
│   user input = "factor 30"                                    │
│   → calls factor_number(n=30) → "2, 3, 5"                     │
│   → its onToolUse fires:  [factor] factor_number(n=30) → 2,3,5│
│   → returns "2, 3, 5"                                         │
└───────────────────────────────────────────────────────────────┘
                       │
                       ▼
┌───────────────────────────────────────────────────────────────┐
│ Captain's tool executor returns "2, 3, 5"                     │
│   → captain's onToolUse fires:  [fib] factor(query=…) → 2,3,5 │
│   → captain's LLM renders: "The prime factors are 2 × 3 × 5"  │
└───────────────────────────────────────────────────────────────┘

Two layers of LLM, two onToolUse firings, two distinct agent personalities — observable independently.
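The ordering of the two hook firings can be reproduced without any LLM. The sketch below is a toy model under stated assumptions: ToyAgent, callTool, primeFactors, and delegationTrace are hypothetical, and the decision about which tool to call is hard-coded instead of coming from a model.

```kotlin
// Hypothetical toy agent: a name, named tools, and an onToolUse hook.
class ToyAgent(val name: String, val onToolUse: (String) -> Unit) {
    val tools = mutableMapOf<String, (String) -> String>()

    // Runs the tool first, then fires this agent's own hook with the result.
    fun callTool(tool: String, arg: String): String {
        val result = tools.getValue(tool)(arg)
        onToolUse("[$name] $tool($arg) -> $result")
        return result
    }
}

// Trial-division prime factorisation, standing in for the sibling's factor_number tool.
fun primeFactors(n: Int): String {
    var rest = n
    var d = 2
    val out = mutableListOf<Int>()
    while (rest > 1) {
        if (rest % d == 0) { out += d; rest /= d } else d++
    }
    return out.joinToString(", ")
}

fun delegationTrace(): List<String> {
    val trace = mutableListOf<String>()
    val factor = ToyAgent("factor") { trace += it }
    factor.tools["factor_number"] = { n -> primeFactors(n.toInt()) }

    val fib = ToyAgent("fib") { trace += it }
    // The absorbed sibling appears on the captain as one tool named after the sibling.
    fib.tools["factor"] = { query -> factor.callTool("factor_number", query.removePrefix("factor ")) }

    fib.callTool("factor", "factor 30")
    return trace
}
```

In this model the sibling's hook fires before the captain's, because the captain's tool call only completes once the sibling has returned. That matches the order of the boxes in the diagram.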

Properties this design preserves

  • Personality: each sibling keeps its full Agent surface (prompt, knowledge, memory bank, observability hooks). The captain sees each sibling only as a single named tool, but the sibling runs with its entire identity intact.
  • Independence: every sibling invocation is a fresh agentic loop. No state leak, no shared message history, no shared budget between captain and sibling.
  • Two-layer observability: each agent's onToolUse / onSkillChosen / onError fires independently. The trace shows the delegation hierarchy directly.
  • No type-erasure trickery: the sibling must be Agent<String, *> at runtime; the unchecked cast is guarded upfront by the firstSkill.inType == String::class check.
  • Single-JVM: siblings run in the same JVM and classloader as the captain, so delegation is an in-process call with no IPC overhead. For cross-language siblings, use MCP instead.

ServiceLoader registration

Each agent JAR ships:

META-INF/services/agents_engine.runtime.AgentProvider

with a single line naming the provider class:

com.example.coder.CoderProvider

That's the only contract. The JVM's ServiceLoader.load(...) walks the classpath at runtime and instantiates whatever's listed.
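The registration mechanism can be tried end to end with the standard library alone. This sketch uses hypothetical stand-in types (StubAgent, StubProvider, StubCoderProvider) instead of the framework's Agent and AgentProvider, and it simulates a sibling JAR by writing the registration file into a temporary directory. How Swarm.discover() is actually implemented is not shown on this page, so discoverFrom is only an approximation of it. Note that ServiceLoader requires the provider class to have a public no-argument constructor.

```kotlin
import java.io.File
import java.net.URLClassLoader
import java.nio.file.Files
import java.util.ServiceLoader

// Hypothetical stand-ins for the framework's Agent and AgentProvider types.
class StubAgent(val name: String)
interface StubProvider { fun build(): StubAgent }

class StubCoderProvider : StubProvider {
    override fun build() = StubAgent("coder")
}

// Roughly what a discover() could do: load every registered provider and build its agent.
fun discoverFrom(loader: ClassLoader): List<StubAgent> =
    ServiceLoader.load(StubProvider::class.java, loader).map { it.build() }

// Writes META-INF/services/<service interface name> containing the provider class name,
// then discovers through a class loader that can see that directory.
fun discoverFromTempRegistration(): List<String> {
    val root = Files.createTempDirectory("swarm-services").toFile()
    try {
        val services = File(root, "META-INF/services").apply { mkdirs() }
        File(services, StubProvider::class.java.name)
            .writeText(StubCoderProvider::class.java.name + "\n")
        val parent = StubProvider::class.java.classLoader
        return URLClassLoader(arrayOf(root.toURI().toURL()), parent).use { loader ->
            discoverFrom(loader).map { it.name }
        }
    } finally {
        root.deleteRecursively()
    }
}
```

The file name is the fully qualified name of the service interface and its content is the fully qualified name of the implementation, which is the same layout each agent JAR ships.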

Multi-captain setups

Any agent JAR can act as the captain if it has a main() that filters itself out of Swarm.discover() and absorbs the rest. In the swarm demo (./gradlew swarmDemo), fib.jar, factor.jar, and exit.jar each carry their own Main-Class manifest entry plus a Class-Path entry naming the other two, so the user picks the captain just by choosing which JAR to launch:

cd build/tmp/jars_swarm_demo
java -jar fib.jar          # fib captain
java -jar factor.jar       # factor captain
java -jar exit.jar         # exit captain

The fat-JAR approach (each JAR bundles the framework and its dependencies) makes every agent a self-contained, drop-in launchable artifact. At launch, the JVM adds the other JARs to the classpath through the manifest's Class-Path entry.
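For reference, the two manifest entries can be generated with java.util.jar.Manifest. This is only an illustration of the manifest format: manifestFor is a hypothetical helper, the main-class name in the usage note is a placeholder, and the demo's real build configuration is not shown on this page.

```kotlin
import java.io.ByteArrayOutputStream
import java.util.jar.Attributes
import java.util.jar.Manifest

// Builds the manifest text for one launchable agent JAR.
// Class-Path entries are space-separated and resolved relative to the JAR's own location.
fun manifestFor(mainClass: String, siblingJars: List<String>): String {
    val manifest = Manifest()
    manifest.mainAttributes.apply {
        // Manifest-Version must be set, otherwise write() omits the main attributes.
        put(Attributes.Name.MANIFEST_VERSION, "1.0")
        put(Attributes.Name.MAIN_CLASS, mainClass)
        put(Attributes.Name.CLASS_PATH, siblingJars.joinToString(" "))
    }
    val out = ByteArrayOutputStream()
    manifest.write(out)
    return out.toString(Charsets.UTF_8.name())
}
```

Calling `manifestFor("com.example.fib.MainKt", listOf("factor.jar", "exit.jar"))` yields a manifest whose Class-Path line names the two sibling JARs, so `java -jar fib.jar` can find them when they sit in the same directory.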

Constraints / non-goals

  • Sibling input must be String: absorbing a typed-input sibling (e.g. Agent<Specification, Code>) throws IllegalArgumentException at absorb time. Out of scope for v1; would require schema-driven invocation.
  • Skill-per-skill exposure: v1 exposes each sibling as one tool (named after the agent). Future work could split each of the sibling's skills into its own captain-side tool.
  • Static typing across JAR boundaries: not possible, because providers return Agent<*, *> and JVM type erasure drops the type arguments at runtime. absorb checks types at runtime, but the compiler cannot enforce them at composition sites.
  • Lifecycle management: sibling code runs in the same JVM as the captain; if a sibling crashes the JVM, the captain dies too. This is by design; for process isolation, use MCP-stdio rather than the swarm path.
