The Swarm Judgment Harness.
Define and compose swarms of LLM agents. Spawn an agent to do things, spawn a swarm to evaluate things, or hand a swarm a Docker sandbox — from the CLI, the SDKs, or your own agent.
SDKs are published to language-native registries. Pick the one for your stack:
| Language | Package | Install |
|---|---|---|
| Rust | objectiveai-sdk | cargo add objectiveai-sdk |
| TypeScript | @objectiveai/sdk | npm i @objectiveai/sdk |
| Python | objectiveai-sdk | pip install objectiveai-sdk |
| Go | objectiveai-sdk-go | go get github.com/ObjectiveAI/objectiveai/objectiveai-sdk-go |
Additional crates on crates.io: objectiveai-api, objectiveai-cli, objectiveai-mcp-cli, objectiveai-mcp-proxy, objectiveai-mcp-filesystem, objectiveai-sdk-macros. Additional PyPI package: objectiveai-cocoindex.
Install all four prebuilt binaries with one command:
curl -fsSL https://raw.githubusercontent.com/ObjectiveAI/objectiveai/main/install.sh | bash
. "$HOME/.objectiveai/env"

| Binary | What it does | Download |
|---|---|---|
| objectiveai | CLI + embedded viewer | latest |
| objectiveai-api | API server | latest |
| objectiveai-viewer | Standalone Tauri desktop app | latest |
| objectiveai-mcp | MCP server (streamable HTTP) | latest |
Supported platforms: Linux x86_64, Linux aarch64, macOS x86_64, macOS aarch64, Windows x86_64. See Binaries & self-hosting for install flags and per-binary detail.
ObjectiveAI is a harness for defining, composing, and running swarms of LLM agents. You define an Agent once — model, prompts, decoding parameters, output mode, tools, MCP servers. You compose Agents into a Swarm. You then run them in any of three execution modes: spawn a single agent to do work, spawn a whole swarm to collectively evaluate something, or hand a swarm a Docker sandbox to act in.
Agents, Swarms, Functions, and Profiles are all content-addressed Git-hosted resources. The same swarm.json that powers your CLI invocation tonight is the one your colleague pins by commit SHA next month and the one your trained Profile was fit against.
The brand promise is judgment: the system was built so that collective evaluation by a swarm — not a single sampled token — produces every score. The mechanism is swarms: reusable, composable, version-tracked collections of configured models. Everything else (the CLI, the API, the web app, the MCP server, the SDKs in five languages) exists to drive swarms in the ways that matter.
Three shapes. Each mode resolves the same Agents and Swarms but does something different with them.
| Mode | What it does | Returns | Reach for it when |
|---|---|---|---|
| Agent completion | Spawn a single Agent to do work — call tools, talk to MCP servers, execute multi-turn loops, generate artifacts | Whatever the Agent produces | You need one agent to perform a discrete task |
| Function execution | Spawn a swarm to evaluate something. Functions are composable, recursive evaluation pipelines | Scalar or vector of scores | You want a calibrated, trainable multi-model evaluation |
| Laboratory execution | Builder agents run in a Docker sandbox with persistent filesystem MCP; an optional evaluation agent scores the outputs | Builder outputs + evaluation result | You need agents to write code, files, or artifacts in isolation |
Function execution is the judgment mode — that's where the system's name comes from. Agent completions are the foundational orchestration layer; every other mode is built on top of them. Laboratory executions extend the system to workloads that produce files and artifacts, not just scores.
A single language model asked to score something hands back one sampled token and walks away from everything else it computed. The signal it had — how confident it really was, where it hedged, what it nearly chose instead — never leaves the model. ObjectiveAI is built to preserve that signal across an entire swarm.
Each agent in a swarm contributes a preference distribution over the candidates rather than a single sampled token. Those distributions combine across the swarm under learned weights to produce the final score. No discrete collapse. No lost signal.
        Function execution requested
                      │
                      ▼
               ┌─────────────┐
               │  Function   │  (composable, content-addressed, versioned)
               └──────┬──────┘
                      │  fans out to its swarm
                      ▼
    ┌──────────────────────────────────┐
    │              Swarm               │
    │ ┌────────┐  ┌────────┐  ┌──────┐ │
    │ │ Agent  │  │ Agent  │  │ ...  │ │
    │ └───┬────┘  └───┬────┘  └──┬───┘ │
    └─────┼───────────┼──────────┼─────┘
          │     evaluations      │
          ▼           ▼          ▼
     ┌────────────────────────────────┐
     │ weighted combination (Profile) │
     └────────────────────────────────┘
                      │
                      ▼
        scores: [0.61, 0.28, 0.11]  (sums to 1)
This matters twice over: once per model, and once across models. Different models have different failure modes, different training distributions, different calibration profiles. Combining them with learned weights — weights that can be trained against ground truth — is strictly more powerful than picking the one model that scores highest on average.
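The combination step in the diagram can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ObjectiveAI implementation: the helper name `combine_distributions`, the example distributions, and the weights are all hypothetical, and the real system may normalize or weight differently.

```python
from typing import Sequence


def combine_distributions(
    distributions: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> list:
    """Weighted average of per-agent preference distributions.

    Hypothetical helper: each inner sequence is one agent's distribution
    over the same candidates, and each weight is that agent's learned weight.
    """
    if not distributions or len(distributions) != len(weights):
        raise ValueError("need one weight per distribution")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    n_candidates = len(distributions[0])
    if any(len(d) != n_candidates for d in distributions):
        raise ValueError("all distributions must cover the same candidates")
    return [
        sum(w * d[i] for d, w in zip(distributions, weights)) / total_weight
        for i in range(n_candidates)
    ]


# Three agents scoring three candidates; the second agent is weighted most.
agent_distributions = [
    [0.70, 0.20, 0.10],
    [0.60, 0.30, 0.10],
    [0.50, 0.40, 0.10],
]
learned_weights = [1.0, 2.0, 1.0]  # illustrative values, not trained weights
scores = combine_distributions(agent_distributions, learned_weights)
print([round(s, 3) for s in scores])
```

Because every input distribution sums to 1 and the weights are normalized, the combined vector also sums to 1, and no agent's distribution is collapsed to a single choice before combining.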
Reusability across modes. Trainability where it counts. Content-addressing throughout:
- Reusable. An Agent is a 22-character ID — define one once and reference it from any swarm, any function, any lab. A Swarm is a sorted set of (agent_id, count) pairs. Define it once and run it for action, evaluation, or sandboxed work without re-defining anything.
- Trainable. Profiles learn weights over a Function's task tree against labeled data. The models stay fixed; the way the swarm's evaluations combine improves.
- Reproducible. Every resource reference is (owner, repo, commit). Pin a commit SHA, get the exact same agent / swarm / function / profile your evaluation ran against six months ago.
- Composable. Functions can call other Functions. Swarms compose into bigger swarms. The CLI dispatches plugins as unknown subcommands. The viewer surfaces plugin UIs as sandboxed iframe tabs.
- Polyglot. Rust, TypeScript, Python, Go, and (in-progress) .NET SDKs share the same generated JSON Schema corpus. Field names and shapes are identical across languages.
Install the CLI, API server, viewer, and MCP server from the latest release:
curl -fsSL https://raw.githubusercontent.com/ObjectiveAI/objectiveai/main/install.sh | bash
. "$HOME/.objectiveai/env"

Set your API key:

objectiveai api headers x-objectiveai-authorization config set "apk_your_key_here"

A Function describes the evaluation; a Profile names the swarm and its trained weights. Both are GitHub-hosted resources you reference by owner/repository. The example below calls a published code-safety evaluator function with its trained profile:
objectiveai functions executions create standard \
--function remote=github,owner=your-org,repository=safety-evaluator \
--profile remote=github,owner=your-org,repository=safety-evaluator-profile \
--input-inline '{"snippet":"eval(user_input)"}'

The streamed output ends with a notification containing the score vector:

{"Notification":{"value":{"execution":{"output":{"Vector":[0.91,0.07,0.01,0.01]}}}}}

Each number is the swarm's combined evaluation weight for that label, in the order declared by the function's response set. Values sum to 1.
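Because the CLI emits newline-delimited JSON, the final score vector can be pulled out with a short script. This is a sketch that assumes only the notification shape shown above. The helper name, the label list, and the first sample line (an invented placeholder, not real CLI output) are hypothetical.

```python
import json
from typing import Iterable, Optional


def last_score_vector(lines: Iterable[str]) -> Optional[list]:
    """Return the Vector from the last notification that carries one."""
    vector = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON line
        node = event
        # Walk the nested path shown in the example notification.
        for key in ("Notification", "value", "execution", "output", "Vector"):
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if isinstance(node, list):
            vector = node
    return vector


sample_stream = [
    '{"Placeholder":"an earlier line whose shape is not specified here"}',
    '{"Notification":{"value":{"execution":{"output":{"Vector":[0.91,0.07,0.01,0.01]}}}}}',
]
# Hypothetical labels, in the order a function's response set might declare them.
labels = ["unsafe", "risky", "probably safe", "safe"]
vector = last_score_vector(sample_stream)
ranked = sorted(zip(labels, vector), key=lambda pair: pair[1], reverse=True)
print(ranked[0])
```

In practice the `lines` argument could be `sys.stdin` when the CLI output is piped into the script.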
The same function execution from the TypeScript SDK:

import { ObjectiveAI, functionsExecutionsCreateFunctionExecution } from "@objectiveai/sdk";

const client = new ObjectiveAI({ authorization: process.env.OBJECTIVEAI_AUTHORIZATION });
const result = await functionsExecutionsCreateFunctionExecution(client, {
  function: { remote: "github", owner: "your-org", repository: "safety-evaluator" },
  profile: { remote: "github", owner: "your-org", repository: "safety-evaluator-profile" },
  input: { snippet: "eval(user_input)" },
  stream: false,
});
console.log(result.output); // { Vector: [0.91, 0.07, 0.01, 0.01] }

Spawn a single agent to do work:
objectiveai agents completions create standard \
--agent remote=github,owner=your-org,repository=writer-agent \
--messages-inline '[{"role":"user","content":"Write a haiku about ocean waves."}]'

Spawn builder agents in a Docker sandbox with persistent filesystem access:
objectiveai laboratories executions create \
--docker-image python:3.12-slim \
--builder-agent remote=github,owner=your-org,repository=builder-agent \
--builder-messages-inline '[{"role":"user","content":"Write a Python script that prints hello to /workspace/out.txt"}]'

Add a commit=<sha> segment to a remote reference to pin a specific version of any remote resource. See Core primitives for a full explanation of Agents, Swarms, Profiles, and the three execution modes, and SDKs for Python, Rust, Go, and .NET patterns including streaming.
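The comma-separated remote reference used in the commands above maps naturally onto a small dictionary. The parser below is a sketch for illustration only: the function name is hypothetical, the requirement that `remote`, `owner`, and `repository` are all present is an assumption, and the CLI's actual parsing rules may differ.

```python
def parse_remote_ref(spec: str) -> dict:
    """Parse 'remote=github,owner=...,repository=...[,commit=...]' into a dict."""
    ref = {}
    for segment in spec.split(","):
        key, sep, value = segment.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed segment: {segment!r}")
        if key in ref:
            raise ValueError(f"duplicate key: {key!r}")
        ref[key] = value
    # Assumption: these three keys are always required.
    missing = {"remote", "owner", "repository"} - ref.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return ref


unpinned = parse_remote_ref("remote=github,owner=your-org,repository=writer-agent")
# "0123abc" is a placeholder, not a real commit SHA.
pinned = parse_remote_ref(
    "remote=github,owner=your-org,repository=writer-agent,commit=0123abc"
)
print("commit" in unpinned, pinned["commit"])
```

An unpinned reference simply has no `commit` key; how the service resolves that case is not covered by this sketch.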
Three resources (Agents, Swarms, Profiles) define what's in the system; three execution modes (Agent completions, Function executions, Laboratory executions) define what you can do with them. Resources are content-addressed Git-hosted JSON; execution modes resolve resources at request time and stream typed results back. Everything ties together through a shared resource graph at the bottom of this section.
An Agent is a fully-specified configuration of a single upstream model: model identity, prompt structure, decoding parameters, output mode, tools, MCP servers, provider preferences. Agents are content-addressed via XXHash3-128 — the same configuration always produces the same 22-character base62 ID. IDs are deterministic because the serialized configuration is hashed after normalization (empty fields stripped, defaults canonicalized). Two Agents with identical effective settings are the same Agent.
Agents are stored as agent.json in Git repositories and referenced by owner/repo@commit everywhere a swarm, function, or laboratory needs an agent. Authoring agents lives in source control; calling them happens by reference.
{
  "description": "Skeptical evaluator",
  "upstream": "openrouter",
  "model": "openai/gpt-4o",
  "output_mode": "json_schema",
  "temperature": 0.2,
  "prefix_messages": [
    { "role": "system", "content": "You are a rigorous critic. Challenge assumptions." }
  ]
}

Each upstream (OpenRouter, Claude Agent SDK, Codex SDK) has its own agent type with its own parameter set. The same Agent can be driven in any of the three execution modes — running solo in an agent completion, contributing to the swarm's evaluation in a function execution, or working as a builder or evaluator in a laboratory.
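The content-addressing scheme described above (normalize, hash to 128 bits, encode as 22 base62 characters) can be illustrated with the standard library. This is a sketch with loud assumptions: XXHash3-128 is not in the Python standard library, so BLAKE2b truncated to 128 bits stands in for it, and the normalization rules and base62 alphabet here are guesses. The IDs it prints will therefore not match real ObjectiveAI IDs.

```python
import hashlib
import json

# Assumed alphabet order; the real encoding may order characters differently.
BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def normalize(value):
    """Recursively drop empty fields (None, "", [], {}) and sort object keys."""
    if isinstance(value, dict):
        cleaned = {k: normalize(v) for k, v in value.items()}
        return {k: v for k, v in sorted(cleaned.items()) if v not in (None, "", [], {})}
    if isinstance(value, list):
        return [normalize(v) for v in value]
    return value


def content_id(config: dict) -> str:
    """Hash the normalized config to 128 bits and encode it as 22 base62 chars."""
    canonical = json.dumps(normalize(config), sort_keys=True, separators=(",", ":"))
    # Stand-in hash: BLAKE2b with a 16-byte digest, NOT XXHash3-128.
    digest = hashlib.blake2b(canonical.encode("utf-8"), digest_size=16).digest()
    number = int.from_bytes(digest, "big")
    chars = []
    for _ in range(22):  # 62**22 > 2**128, so 22 digits always suffice
        number, remainder = divmod(number, 62)
        chars.append(BASE62[remainder])
    return "".join(reversed(chars))


agent_id = content_id(
    {"upstream": "openrouter", "model": "openai/gpt-4o", "temperature": 0.2}
)
# Same effective settings: different key order plus an empty field.
same_agent_id = content_id(
    {"temperature": 0.2, "model": "openai/gpt-4o", "upstream": "openrouter", "tools": []}
)
other_id = content_id(
    {"upstream": "openrouter", "model": "openai/gpt-4o", "temperature": 0.3}
)
print(agent_id == same_agent_id, agent_id != other_id, len(agent_id))
```

The point of normalizing before hashing is that two configurations with identical effective settings serialize to the same bytes and therefore receive the same ID.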
A Swarm is an ordered collection of Agents used together for collective judgment. Swarms are immutable and content-addressed — their ID is computed from the sorted (full_id, count) pairs of their constituent agents. Weights are not baked into the swarm definition; they are execution-time parameters supplied by a Profile or passed directly.
Each agent slot has a count (number of instances) and optional fallbacks. Duplicate agents are merged and their counts summed. The total agent count across all slots must be between 1 and 128.
{
  "description": "Balanced judgment panel",
  "agents": [
    {
      "upstream": "openrouter",
      "model": "openai/gpt-4o",
      "output_mode": "json_schema",
      "prefix_messages": [
        { "role": "system", "content": "You are a rational skeptic. Ground every choice in logic." }
      ],
      "count": 2
    },
    {
      "upstream": "openrouter",
      "model": "anthropic/claude-sonnet-4-20250514",
      "output_mode": "tool_call",
      "suffix_messages": [
        { "role": "system", "content": "You are an intuitive thinker. Trust your instincts." }
      ],
      "count": 1
    }
  ]
}

Swarms are stored as swarm.json in Git repositories and shared across functions. Because weights are external, the same swarm can be reused with different weight configurations without creating a new swarm.
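The slot rules stated above (duplicates merged with counts summed, a total between 1 and 128, identity derived from sorted pairs) can be sketched as a small normalization step. The function name and the placeholder agent IDs are hypothetical, and the per-slot minimum count of 1 is an assumption.

```python
from collections import Counter

MAX_TOTAL_AGENTS = 128  # upper bound stated in the text


def normalize_swarm(slots):
    """Merge duplicate agents, sum their counts, return sorted (agent_id, count) pairs."""
    merged = Counter()
    for agent_id, count in slots:
        if count < 1:  # assumption: every slot contributes at least one instance
            raise ValueError(f"count for {agent_id!r} must be at least 1")
        merged[agent_id] += count
    total = sum(merged.values())
    if not 1 <= total <= MAX_TOTAL_AGENTS:
        raise ValueError(f"total agent count {total} is outside 1..{MAX_TOTAL_AGENTS}")
    return sorted(merged.items())


# "agentA" and "agentB" stand in for real 22-character agent IDs.
pairs = normalize_swarm([("agentB", 1), ("agentA", 2), ("agentB", 3)])
print(pairs)
```

Because the result is sorted and merged, two swarm definitions that list the same agents in a different order or split across duplicate slots normalize to the same pairs, which is what makes a content-derived ID stable.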
ObjectiveAI does not fine-tune models. It learns weights.
A Profile is the result of training: given a dataset of (input, expected_output) pairs, ObjectiveAI executes a Function repeatedly, computes loss against expected outputs, and adjusts the weights over each task until they converge. The learned configuration — which tasks to trust more, which to discount — is stored as profile.json.
Profiles are GitHub-hosted and referenced by owner/repo@commit. Pinning a commit SHA is strongly recommended: the Profile's shape (number of tasks, their order) is tied to the function it was trained on, and that function may evolve. A mismatched Profile silently produces wrong weights.
{
  "owner": "ObjectiveAI",
  "repo": "quality-scorer",
  "commit": "a3f8c21d..."
}

At execution time, the Function and Profile are independent inputs. The retrieval system fetches both, resolves the resource graph, and applies the learned weights to combine task outputs.
An agent completion spawns a single Agent to do work. The Agent receives a task as a conversation and acts on it — calls tools, talks to MCP servers, executes a multi-turn loop, writes code, generates artifacts. Function executions and laboratory executions are built on top of agent completions; they're multi-agent or multi-step orchestrations of the same underlying primitive.
The Agent is supplied by remote reference. Messages can include images, audio, and files in addition to text. Tool calls are detected mid-stream and executed automatically; MCP servers attached to the Agent are dialed transparently. The response carries a Continuation that captures the conversation state so the next call can pick up where this one left off.
{
  "agent": { "remote": "github", "owner": "your-org", "repository": "writer-agent" },
  "messages": [
    { "role": "user", "content": "Rewrite this commit message as a conventional-commits changelog entry." }
  ]
}

CLI: objectiveai agents completions create standard --agent remote=github,owner=...,repository=... --messages-inline '...'. SDK: agentsCompletionsCreateAgentCompletion (JS) / create_agent_completion (Python) / agent::completions::http::create_agent_completion (Rust).
A function execution spawns a swarm to evaluate something. Functions are composable, recursive evaluation pipelines: data in, scores out. A Function is a list of tasks executed against an input. Each task is one of:
- A swarm evaluation step — hands the input plus a fixed set of candidate labels to the swarm and gets back a score across those candidates. Each Agent contributes a preference distribution over the candidates rather than a single sampled token; the distributions combine with the Profile's weights to produce the step's score. Larger candidate sets are handled transparently by internal machinery — evaluations span hundreds of candidates if needed.
- A nested function call — references another function.json by owner/repo@commit. Resolves and executes that function against the same (or a transformed) input.
- A mapped operation — runs a task N times over an indexed range, producing N outputs.
Functions are recursive: a function's tasks can themselves be functions, which can contain more nested functions. The composition is arbitrarily deep.
Functions produce either:
- Scalar — a single score in [0, 1].
- Vector — an array of scores summing to 1, one per output dimension.
The final output is the weighted combination of all task outputs, with weights supplied by a Profile. Tasks carry output expressions (JMESPath or Starlark) that transform raw task results into the function's output type before combining.
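As an illustration of the two steps in that paragraph, the sketch below first applies an output expression to each task's raw score vector and then combines the resulting scalars with profile weights. Everything here is hypothetical: the helper names, the four-candidate ordering from worst to best, the sample scores, and the use of a normalized weighted average are assumptions rather than the documented behavior.

```python
def top_two_mass(scores):
    """An output expression as a plain function: mass on the two best labels."""
    return scores[2] + scores[3]


def combine_task_outputs(outputs, weights):
    """Weighted average of per-task scalar outputs, each in [0, 1]."""
    if len(outputs) != len(weights):
        raise ValueError("profile weights must match the task count")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(o * w for o, w in zip(outputs, weights)) / total


# Raw score vectors for two tasks over four candidates ordered worst to best.
task_scores = [
    [0.05, 0.15, 0.50, 0.30],
    [0.10, 0.30, 0.40, 0.20],
]
scalars = [top_two_mass(s) for s in task_scores]
final_score = combine_task_outputs(scalars, [3.0, 1.0])  # illustrative weights
print(round(final_score, 3))
```

The length check mirrors the requirement that a Profile's shape match the Function it was trained on: a weight list of the wrong length is rejected instead of being applied silently.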
Functions are stored as function.json in Git repositories and referenced by owner/repo@commit. They are content-addressed via their task structure and input schema.
{
  "type": "alpha.scalar.leaf.function",
  "description": "Score response quality on a 0-1 scale",
  "input_schema": { "type": "object", "properties": { "response": { "type": "string" } } },
  "tasks": [
    {
      "type": "vector.completion",
      "messages": [{ "role": "user", "content": "Rate this response: {{input.response}}" }],
      "responses": ["poor", "mediocre", "good", "excellent"],
      "output": { "$starlark": "output['scores'][2] + output['scores'][3]" }
    }
  ]
}

A laboratory execution runs Agents in a Docker sandbox with persistent filesystem access. The container starts with objectiveai-mcp-filesystem injected as an MCP server, giving the running agents read/write tools over the workspace directory. One or more builder agents produce outputs (write files, run code, generate artifacts); an optional evaluation agent then runs against the resulting workspace and produces a schema-constrained verdict. Failed evaluations can trigger up to max_evaluation_retries builder re-runs.
This mode extends the system beyond text scoring into physical-output workloads: an agent that has to write working code, lay out a directory tree, or produce a binary artifact can do all of that inside a hermetic container, with the outputs available to the evaluator (and to you) afterwards.
objectiveai laboratories executions create \
--docker-image python:3.12-slim \
--builder-agent remote=github,owner=your-org,repository=builder-agent \
--builder-messages-inline '[{"role":"user","content":"Write tests for src/foo.py"}]' \
--evaluation-agent remote=github,owner=your-org,repository=evaluator-agent \
--evaluation-messages-inline '[{"role":"user","content":"Determine whether the tests pass."}]' \
--evaluation-output-schema-inline '{"type":"object","properties":{"passed":{"type":"boolean"}}}' \
--max-evaluation-retries 3

Builder messages, evaluation messages, and the evaluation output schema can be supplied inline as JSON, generated via inline Python, or loaded from a Python file (--builder-messages-python-inline, --builder-messages-python-file, and the corresponding flags for evaluation).
Laboratory executions are not learned-weighted or vote-aggregated — they're sequential agentic pipelines, scoped to a sandbox. Use them when you need agents to do something with files, not just judge between options.
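The build, evaluate, retry sequence can be sketched as plain control flow with stub callables in place of real agents. This is only a model of the behavior described above: the function names and stubs are hypothetical, the `passed` field is borrowed from the example output schema, and the real retry accounting may differ.

```python
from typing import Callable


def run_laboratory(
    run_builders: Callable[[int], None],
    evaluate: Callable[[], dict],
    max_evaluation_retries: int,
) -> tuple:
    """Run builders, evaluate, and re-run builders after a failed evaluation.

    Returns (verdict, builder_runs). Builders run at most
    1 + max_evaluation_retries times.
    """
    attempt = 0
    while True:
        run_builders(attempt)
        verdict = evaluate()
        if verdict.get("passed") or attempt >= max_evaluation_retries:
            return verdict, attempt + 1
        attempt += 1


# A dict stands in for the sandbox workspace.
workspace = {}


def fake_builder(attempt: int) -> None:
    # The stub "fixes" its output on the second attempt.
    workspace["out.txt"] = "hello" if attempt >= 1 else ""


def fake_evaluator() -> dict:
    return {"passed": workspace.get("out.txt") == "hello"}


verdict, builder_runs = run_laboratory(fake_builder, fake_evaluator, max_evaluation_retries=3)
print(verdict, builder_runs)
```

The loop is strictly sequential, which matches the description of laboratory executions as pipelines rather than weighted or vote-aggregated evaluations.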
All resources reference each other via (owner, repository, commit) triples. Content-addressing plus commit pinning makes the full graph reproducible from any entry point.
    agent.json  <-  swarm.json  <-  profile.json        function.json
                    (agents)        (swarms+weights)    (tasks + input_schema)

    At execution: function.json + profile.json -> scores
The Function and Profile are deliberately separate files. The same Function can be run with different Profiles (e.g. a domain-specific profile vs. a general-purpose profile). The same Profile cannot be applied to a structurally different Function — the task count and order must match.
Remote references resolve lazily: the retrieval system walks the graph starting from the execution request, fetching and caching each resource exactly once. Deduplication is by (owner, repo, commit) triple. All fetches are content-verified — a cached resource is never re-fetched if the commit SHA matches. Even deeply nested function graphs execute with minimal network overhead.
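The fetch-once behavior can be sketched as a graph walk with a cache keyed by the (owner, repo, commit) triple. The resolver below is an illustration, not the retrieval system itself: the function names, the in-memory fake remote, and the placeholder commit strings are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Ref = Tuple[str, str, str]  # (owner, repo, commit)


def resolve_graph(root: Ref, fetch: Callable[[Ref], List[Ref]]) -> Dict[Ref, List[Ref]]:
    """Walk the graph from root, fetching each (owner, repo, commit) at most once."""
    cache: Dict[Ref, List[Ref]] = {}
    pending = [root]
    while pending:
        ref = pending.pop()
        if ref in cache:
            continue  # already fetched: deduplicated by triple
        children = fetch(ref)
        cache[ref] = children
        pending.extend(children)
    return cache


# Fake remote: a root function with two children that share one swarm.
ROOT = ("your-org", "root-function", "c0")
SHARED = ("your-org", "shared-swarm", "c1")
FAKE_REMOTE = {
    ROOT: [("your-org", "child-a", "c2"), ("your-org", "child-b", "c3")],
    ("your-org", "child-a", "c2"): [SHARED],
    ("your-org", "child-b", "c3"): [SHARED],
    SHARED: [],
}
fetch_log = []


def fake_fetch(ref: Ref) -> List[Ref]:
    fetch_log.append(ref)  # record every simulated network call
    return FAKE_REMOTE[ref]


graph = resolve_graph(ROOT, fake_fetch)
print(len(graph), len(fetch_log))
```

Although the shared swarm is referenced by both children, it appears in the fetch log only once, which is the property that keeps deeply nested graphs cheap to resolve.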
Invention is specific to the Function execution mode — it generates function.json files that you then run as function executions. Agent completions and laboratory executions don't have an analogous generator; they're authored directly.
An agent that needs a new evaluation pipeline can ask ObjectiveAI to build one for it. Invention takes a natural-language description — a spec — and an optional set of examples, then runs a five-step agentic process (essay → input schema → essay tasks → tasks → description) that produces a complete, valid function.json: typed input schema, task tree with expressions, and description. The output is ready to commit, train against a dataset, and call immediately.
Input to invention:
- spec — plain text description of what the function should evaluate
- name — target repository name for publishing
- depth, min_branch_width, max_branch_width, min_leaf_width, max_leaf_width — tree shape constraints
- Optional: an agent to run the invention steps; a seed for reproducibility; a remote target (GitHub or local filesystem)
Output: a function.json with a JSON Schema input_schema, a tasks array of swarm evaluation steps (or nested function references), and a description. The file is published to the configured remote automatically.
Setting depth > 0 triggers recursive invention. The root function is invented first. Its task tree contains placeholder slots for child functions. The recursive client then spawns a concurrent child invention for each placeholder, resolving the full tree bottom-up. All streams are merged immediately — no waiting for siblings. The result is a multi-level decision tree where every node is itself an invented, deployable function.
Depth and width bounds control the shape: min_branch_width / max_branch_width govern non-leaf nodes; min_leaf_width / max_leaf_width govern leaves. A depth=2, max_branch_width=3 invention produces up to nine leaf functions under three branch functions under one root — all invented concurrently, all published independently.
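The arithmetic behind the depth=2, max_branch_width=3 example is a simple power series. The helper below is a hypothetical illustration that assumes every branch node uses the maximum width, so it gives an upper bound, not a prediction of what a given invention run produces.

```python
def max_functions_per_level(depth: int, max_branch_width: int) -> list:
    """Upper bound on invented functions at each level, root first."""
    if depth < 0 or max_branch_width < 1:
        raise ValueError("depth must be >= 0 and max_branch_width >= 1")
    return [max_branch_width ** level for level in range(depth + 1)]


levels = max_functions_per_level(depth=2, max_branch_width=3)
print(levels, sum(levels))
```

For depth 2 and width 3 this gives one root, three branch functions, and nine leaf functions, thirteen functions in total.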
objectiveai functions inventions recursive create alpha-scalar \
--name my-org/code-quality-scorer \
--spec "Score a pull request diff on correctness, readability, and test coverage" \
--depth 1 --min-branch-width 2 --max-branch-width 3

- Invent — agent describes what it needs to evaluate; invention generates function.json.
- Train — provide labeled examples; ObjectiveAI learns weights and writes profile.json.
- Deploy — push both files to a Git repository; reference by owner/repo@commit.
- Use — the same agent (or any agent) calls the function to evaluate future inputs.
Each cycle produces a reusable, versioned evaluation tool. An agent that executes this loop on demand gains evaluation infrastructure calibrated to its own criteria — not to a pre-defined rubric. The system does not fine-tune models; it learns weights over fixed agents. The infrastructure improves; the models stay stable.
Every SDK exposes the same three execution modes: Agent Completions (spawn a single Agent to do work — tools, MCP, multi-turn loops), Function Executions (spawn a swarm to evaluate something — composable evaluation pipelines), and Laboratory Executions (Docker-sandboxed builder + evaluator runs). All three support streaming via Server-Sent Events. The API emits incremental chunks; each SDK merges them into an accumulating object using an immutable merge system (TypeScript), a mutable push system (Python, Rust, Go), or equivalent. Types are generated from a shared JSON Schema corpus derived from the Rust SDK, so field names and shapes are identical across languages.
| Language | Package | Install | Runtime targets |
|---|---|---|---|
| Rust | objectiveai-sdk on crates.io | cargo add objectiveai-sdk | Any (async via reqwest + tokio) |
| TypeScript | @objectiveai/sdk on npm | npm i @objectiveai/sdk | Node.js, Deno, browser (CJS + ESM) |
| Python | objectiveai-sdk on PyPI | pip install objectiveai-sdk | CPython 3.10+ (includes PyO3 extension) |
| Go | github.com/ObjectiveAI/objectiveai/objectiveai-sdk-go | go get github.com/ObjectiveAI/objectiveai/objectiveai-sdk-go | Go 1.26+ |
| .NET | ObjectiveAI (NuGet — in progress) | not yet published | net10.0 |
The base URL defaults to https://api.objectiveai.dev in all SDKs. Auth is passed as OBJECTIVEAI_AUTHORIZATION (env var) or via the client constructor.
import {
  ObjectiveAI,
  functionsExecutionsCreateFunctionExecution,
  functionsExecutionsResponseStreamingFunctionExecutionChunkMerged,
} from "@objectiveai/sdk";

const client = new ObjectiveAI({ authorization: process.env.OBJECTIVEAI_AUTHORIZATION });
const stream = await functionsExecutionsCreateFunctionExecution(client, {
  stream: true,
  function: { remote: "github", owner: "your-org", repository: "safety-evaluator" },
  profile: { remote: "github", owner: "your-org", repository: "safety-evaluator-profile" },
  input: { snippet: "eval(user_input)" },
});

let acc: any = null;
for await (const chunk of stream) {
  acc = acc ? functionsExecutionsResponseStreamingFunctionExecutionChunkMerged(acc, chunk)[0] : chunk;
}
console.log("output:", acc?.output);

import asyncio
import os

from objectiveai_sdk.client import ObjectiveAI
from objectiveai_sdk.functions.executions.http import create_function_execution


async def main() -> None:
    client = ObjectiveAI(authorization=os.environ.get("OBJECTIVEAI_AUTHORIZATION"))
    params = {
        "stream": True,
        "function": {"remote": "github", "owner": "your-org", "repository": "safety-evaluator"},
        "profile": {"remote": "github", "owner": "your-org", "repository": "safety-evaluator-profile"},
        "input": {"snippet": "eval(user_input)"},
    }
    stream = await create_function_execution(client, params)
    acc = None
    async for chunk in stream:
        if acc is None:
            acc = chunk
        else:
            acc.push(chunk)
    print("output:", acc.output if acc else None)


asyncio.run(main())

use futures::StreamExt;
use objectiveai_sdk::{HttpClient, functions::executions};

#[tokio::main]
async fn main() -> Result<(), objectiveai_sdk::HttpError> {
    let client = HttpClient::builder()
        .authorization(std::env::var("OBJECTIVEAI_AUTHORIZATION").ok())
        .build();
    let mut stream = executions::http::create_function_execution_streaming(
        &client,
        executions::request::params(/* function: remote ref, profile: remote ref, input */),
    ).await?;
    let mut acc: Option<executions::response::streaming::FunctionExecutionChunk> = None;
    while let Some(Ok(chunk)) = stream.next().await {
        match &mut acc {
            Some(a) => a.push(&chunk),
            None => acc = Some(chunk),
        }
    }
    println!("output: {:?}", acc.map(|a| a.output));
    Ok(())
}

The Go SDK is fully auto-generated from the JSON Schema corpus. Types are strict-validated on unmarshal. The client exposes generic helpers PostUnary[T] / PostStreaming[T] / GetUnary[T] / DeleteUnary[T]; endpoint functions such as FunctionsExecutionsCreateFunctionExecutionStreaming wrap these. A wazero-hosted WASM binary (compiled from the Rust core) provides chunk-to-unary conversion and merge verification without CGO.
The .NET SDK (ObjectiveAI, targeting net10.0) is in active development. The NuGet publish workflow is not yet wired up, so it must be built from source for now.
curl -fsSL https://raw.githubusercontent.com/ObjectiveAI/objectiveai/main/install.sh | bash
. "$HOME/.objectiveai/env"

All four binaries land in ~/.objectiveai/ and are added to PATH. The CLI (objectiveai) self-updates on startup; re-run the installer to upgrade objectiveai-api, objectiveai-viewer, and objectiveai-mcp.
objectiveai is the primary user-facing binary. It is built with clap derive macros and emits newline-delimited JSON (NDJSON) on stdout. Top-level command groups: agents, swarms, functions, vector, laboratories, plugins, logs, instructions, schemas, api, viewer.
objectiveai agents list
objectiveai agents completions create standard --agent remote=github,owner=...,repository=... --messages-inline '...'
objectiveai functions executions create standard --function remote=github,owner=...,repository=... --profile remote=github,owner=...,repository=... --input-inline '{...}'
objectiveai laboratories executions create --docker-image ... --builder-agent remote=github,owner=...,repository=... --builder-messages-inline '...'
objectiveai plugins install github --owner ObjectiveAI --repository my-plugin

The default build embeds the Tauri viewer as a sidecar: running a streaming command opens a live viewer window backed by an in-process HTTP server. Pass --no-viewer at install time for a smaller build without the embedded viewer. JSON schemas for every public type are available via objectiveai schemas list and objectiveai schemas output <name>.
objectiveai-api is the standalone HTTP API server. Run it with:

objectiveai-api

Key environment variables (all optional):
| Variable | Default | Effect |
|---|---|---|
| ADDRESS | 0.0.0.0 | Bind address |
| PORT | 5000 | Bind port |
| OBJECTIVEAI_ADDRESS | https://api.objectiveai.dev | Upstream ObjectiveAI address when proxying |
| OBJECTIVEAI_AUTHORIZATION | — | Bearer token for the ObjectiveAI API |
| OPENROUTER_AUTHORIZATION | — | Bearer token for OpenRouter |
| GITHUB_AUTHORIZATION | — | GitHub token for resource retrieval |
| MCP_AUTHORIZATION | — | Bearer token for outbound MCP calls |
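When scripting a self-hosted deployment, it can help to compute the settings the server would see before launching it. The helper below is a hypothetical convenience that only mirrors the table above (names and defaults are copied from it). It is not part of objectiveai-api, and how the server itself parses these variables is an assumption.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the table above.
DEFAULTS = {
    "ADDRESS": "0.0.0.0",
    "PORT": "5000",
    "OBJECTIVEAI_ADDRESS": "https://api.objectiveai.dev",
}
# Variables with no default; they stay unset unless provided.
OPTIONAL_TOKENS = (
    "OBJECTIVEAI_AUTHORIZATION",
    "OPENROUTER_AUTHORIZATION",
    "GITHUB_AUTHORIZATION",
    "MCP_AUTHORIZATION",
)


def effective_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve documented variables against their defaults."""
    env = os.environ if env is None else env
    config = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    config["PORT"] = int(config["PORT"])  # fail early on a non-numeric port
    for name in OPTIONAL_TOKENS:
        config[name] = env.get(name)  # None when unset
    return config


local = effective_config({"PORT": "8080"})
print(local["ADDRESS"], local["PORT"])
```

Passing an explicit mapping instead of reading `os.environ` keeps the check easy to test and avoids printing real tokens by accident.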
The server is streaming-first: every layer (agent completions, function executions, laboratory executions, inventions) produces a typed stream of chunks and yields immediately to the HTTP response — nothing is buffered in the hot path.
objectiveai-viewer is the standalone Tauri desktop application. It presents the same UI that the CLI embeds as a sidecar, but runs as a first-class window manager process rather than being spawned in-process by a CLI command. Reach for it when you want the viewer always open and decoupled from CLI invocations.
objectiveai-mcp is the MCP (Model Context Protocol) server, built from objectiveai-mcp-cli. It exposes ObjectiveAI's tooling over the streamable-HTTP MCP transport so editors and agents (Claude, Cursor, etc.) can invoke it via MCP. It defaults to 0.0.0.0:3000; override with ADDRESS and PORT.
Three crates make up the MCP surface:
- objectiveai-mcp-cli — the binary shipped as objectiveai-mcp. Wraps the CLI as MCP tools over streamable-HTTP.
- objectiveai-mcp-proxy — a multiplexing sidecar of objectiveai-api. Terminates an MCP client connection and forwards tool calls to an upstream MCP server or to ObjectiveAI-native tools. Embedded inside objectiveai-api at runtime.
- objectiveai-mcp-filesystem — MCP filesystem helpers (read/write/list) adapting the SDK's filesystem layer to MCP tool calls. Docker-injected into laboratory executions so agents running in sandboxed containers can access the ObjectiveAI filesystem layer.
Pass flags to bash -s -- after the installer URL:
curl -fsSL https://raw.githubusercontent.com/ObjectiveAI/objectiveai/main/install.sh | bash -s -- --no-viewer

| Flag | Effect |
|---|---|
| --no-viewer | Skips the standalone objectiveai-viewer; installs the CLI variant without an embedded Tauri viewer (smaller binary). |
| --no-api | Skips objectiveai-api. |
| --no-mcp | Skips objectiveai-mcp. |
| --cli-only | Equivalent to --no-viewer --no-api --no-mcp. Only objectiveai is installed. |
Flags compose freely.
The hosted API at https://api.objectiveai.dev requires no setup and is the default for the CLI and all SDKs. Run your own objectiveai-api when you need total control over data routing — for example, to point agents at private upstream providers not available on OpenRouter, to meet on-prem or air-gapped requirements, or to run the full execution pipeline locally without network egress. Configure the CLI to point at your instance with objectiveai api mode set local and objectiveai api local address set http://localhost:5000.
Supported platforms: Linux x86_64, Linux aarch64, macOS x86_64, macOS aarch64 (Apple Silicon), Windows x86_64.
A plugin is a binary that adds new top-level subcommands to the ObjectiveAI CLI, optionally paired with a viewer UI tab. Plugins are described by an objectiveai.json manifest at the repository root. The CLI dispatches any unknown top-level subcommand to the matching installed plugin binary, communicating over a JSONL protocol on stdout. The viewer surfaces plugins with a declared UI source as sandboxed iframe tabs, isolated from the host DOM.
Install from a public GitHub repository:
# From the ObjectiveAI org (default whitelist — no extra flags needed).
objectiveai plugins install github --owner ObjectiveAI --repository my-plugin
# Pin to a specific commit.
objectiveai plugins install github --owner ObjectiveAI --repository my-plugin --commit-sha <sha>
# Third-party repository — requires explicit opt-in.
objectiveai plugins install github --owner third-party --repository my-plugin --allow-untrusted
# Replace an existing install (binary, viewer bundle, and manifest are rewritten).
objectiveai plugins install github --owner ObjectiveAI --repository my-plugin --upgrade

To print layout and manifest conventions for placing a plugin by hand in ~/.objectiveai/plugins/:
objectiveai plugins install filesystem

objectiveai.json at the repository root declares the plugin's metadata, platform binaries, and optional viewer source. All fields except description and version are optional.
| Field | Type | Notes |
|---|---|---|
| description | string | Required. One-line summary shown in listings. |
| version | string | Required. Used to construct release-asset URLs (releases/download/v<version>/<asset>). |
| author / homepage / license | string | Optional metadata. |
| binaries | object | Map of <os>_<arch> → release-asset filename. Supported keys: linux_x86_64, linux_aarch64, windows_x86_64, windows_aarch64, macos_x86_64, macos_aarch64. Declare only platforms you ship. |
| viewer_zip | string | Release-asset filename for the UI bundle (a zip with index.html at root). Mutually exclusive with viewer_url. |
| viewer_url | string | Remote URL loaded as the iframe src verbatim. Must be https:// or http://localhost. Mutually exclusive with viewer_zip. |
| viewer_routes | array | HTTP routes the viewer's embedded axum server exposes on behalf of the plugin. |
| mobile_ready | bool | Opt-in for iOS/Android viewer builds. Defaults to false. |
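The constraints in the table can be checked mechanically before a release is cut. What follows is a minimal sketch in Python, not the CLI's own validator: the function name `validate_manifest` and the error strings are hypothetical, and only rules stated in the table are enforced.

```python
import json

# Platform keys listed in the manifest table.
SUPPORTED_PLATFORMS = {
    "linux_x86_64", "linux_aarch64",
    "windows_x86_64", "windows_aarch64",
    "macos_x86_64", "macos_aarch64",
}


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means these checks passed.

    Hypothetical helper: the real CLI may check more, or differently.
    """
    problems = []

    # description and version are the only required fields.
    for field in ("description", "version"):
        if not isinstance(manifest.get(field), str) or not manifest[field]:
            problems.append(f"missing required string field: {field}")

    # binaries maps <os>_<arch> to a release-asset filename.
    for key in manifest.get("binaries", {}):
        if key not in SUPPORTED_PLATFORMS:
            problems.append(f"unsupported platform key: {key}")

    # viewer_zip and viewer_url are mutually exclusive.
    if "viewer_zip" in manifest and "viewer_url" in manifest:
        problems.append("viewer_zip and viewer_url are mutually exclusive")

    # viewer_url must be https:// or http://localhost.
    # A plain prefix test is a simplification; a stricter check would parse the URL.
    url = manifest.get("viewer_url")
    if url is not None and not (
        url.startswith("https://") or url.startswith("http://localhost")
    ):
        problems.append("viewer_url must be https:// or http://localhost")

    return problems


if __name__ == "__main__":
    # Illustrative manifest that declares both viewer sources.
    text = '{"description": "demo", "version": "1.0.0", "viewer_zip": "ui.zip", "viewer_url": "https://example.com"}'
    for problem in validate_manifest(json.loads(text)):
        print(problem)
```

Returning a list rather than raising on the first failure lets a release script report every problem in one pass.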
Example:
{
"description": "Run wave-physics simulations from the CLI.",
"version": "1.0.0",
"author": "Example Corp",
"license": "MIT",
"binaries": {
"linux_x86_64": "psyops-linux-x86_64",
"windows_x86_64": "psyops-windows-x86_64.exe",
"macos_aarch64": "psyops-macos-aarch64"
},
"viewer_zip": "psyops-viewer.zip"
}
A plugin binary reads its arguments from argv and writes JSONL to stdout. Each line must be one of three shapes: a notification, an error, or a command.
The host parses stdout line-by-line; unparseable lines are forwarded as string notifications rather than dropped.
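The protocol can be sketched from both sides. The Python below is illustrative only, since the wire format is language-independent and a shipped plugin is a compiled binary. The helper names are hypothetical, and returning the raw string for an unparseable line is an assumption about how the host represents a "string notification".

```python
import json

KNOWN_TYPES = {"notification", "error", "command"}


def notification(**data) -> str:
    """Plugin side: data to forward to the caller."""
    return json.dumps({"type": "notification", **data})


def error(message: str, level: str = "warn", fatal: bool = False) -> str:
    """Plugin side: report a problem, optionally marking it fatal."""
    return json.dumps(
        {"type": "error", "level": level, "fatal": fatal, "message": message}
    )


def command(cli_command: str) -> str:
    """Plugin side: ask the host to spawn a CLI command, fire-and-forget."""
    return json.dumps({"type": "command", "command": cli_command})


def parse_line(line: str):
    """Host side: parse one stdout line.

    Lines that are not a JSON object with a known type are kept as plain
    strings instead of being dropped (assumed representation).
    """
    try:
        value = json.loads(line)
    except json.JSONDecodeError:
        return line
    if isinstance(value, dict) and value.get("type") in KNOWN_TYPES:
        return value
    return line


if __name__ == "__main__":
    # One JSON object per line on stdout.
    print(notification(key="value"))
    print(error("..."))
    print(command("agents list"))
```

Because stray output is forwarded and not discarded, a plugin that prints a debug line by mistake still surfaces that text to the caller.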
For the viewer, produce a static dist/ with index.html at the root, zip it, and reference it in viewer_zip. For remote-hosted UIs, use viewer_url. The viewer posts events to the iframe via postMessage.
To iterate locally: place the binary at ~/.objectiveai/plugins/<name>/plugin[.exe] and the manifest at ~/.objectiveai/plugins/<name>.json, then invoke objectiveai <name> <args>. The objectiveai-cli/test-fixtures/hello-plugin/ fixture is the minimal example — a single main.rs that reads argv[1] and emits one notification line.
For distribution, cut a GitHub release tagged v<version>, upload binaries and the viewer zip as release assets named exactly as declared in the manifest, then install with plugins install github.
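Before tagging a release, it can help to list the asset URLs the manifest implies. The sketch below assumes the standard GitHub release download layout (`https://github.com/<owner>/<repo>/releases/download/<tag>/<asset>`) combined with the `v<version>` tag convention described above. The function names are hypothetical, and whether the CLI builds exactly these URLs is an assumption.

```python
def release_asset_url(owner: str, repository: str, version: str, asset: str) -> str:
    """Build the download URL for one release asset under the v<version> tag."""
    return (
        f"https://github.com/{owner}/{repository}"
        f"/releases/download/v{version}/{asset}"
    )


def expected_asset_urls(owner: str, repository: str, manifest: dict) -> dict:
    """Map every asset filename declared in the manifest to its expected URL."""
    names = list(manifest.get("binaries", {}).values())
    if "viewer_zip" in manifest:
        names.append(manifest["viewer_zip"])
    return {
        name: release_asset_url(owner, repository, manifest["version"], name)
        for name in names
    }


if __name__ == "__main__":
    manifest = {
        "version": "1.0.0",
        "binaries": {"linux_x86_64": "psyops-linux-x86_64"},
        "viewer_zip": "psyops-viewer.zip",
    }
    for name, url in expected_asset_urls("ObjectiveAI", "my-plugin", manifest).items():
        print(name, url)
```

An asset uploaded under a name that differs from the manifest entry would not appear at any of these URLs, which is why the filenames must match exactly.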
Full reference: PLUGINS.md.
objectiveai.dev is the production web interface, built with Next.js (App Router). The app provides browsing and detail views for the three core resource types: Functions (/functions, /{owner}/{repo}), Swarms (/swarms, /{id}), and Profiles (/profiles). From a function detail page, users can inspect the task tree, execute the function against a chosen swarm and profile, and view per-task vote breakdowns and aggregate scores. The profiles listing surfaces trained weight configurations available for reuse. A /demo route renders live component prototypes including the FunctionTree canvas visualization, vote matrices, decomposition views, and contribution waterfalls.
The examples/ directory collects real software built on top of ObjectiveAI, with links to full source repositories.
psychological-operations — an agentic X (Twitter) scraper and scoring pipeline (repo). It pairs human-driven Chrome automation with ObjectiveAI to rank scraped tweets along operator-defined axes. The project defines three primary objects: Scrapes (declarative search jobs that scroll and parse x.com into SQLite), PsyOps (scoring jobs that pull tagged posts and run them through an ObjectiveAI function using a chosen swarm, profile, and strategy — including Swiss System tournament-style ranking), and Inventions (wrappers around recursive function invention). A pilot study ranked tweets from 33 YC W22 CEO accounts along an unsettlingness axis using sub-functions invented by a Claude Opus agent; published artifacts are content-addressed and reproducible.
- objectiveai-claude-agent-sdk-runner: a long-lived Python stdio NDJSON server that runs concurrent Claude Agent SDK sessions on behalf of objectiveai-api. The Rust API caller spawns and multiplexes requests over a single stdin/stdout pair using a semaphore-backed FIFO queue; each request carries a string id for demultiplexing events from N concurrent streams.
- objectiveai-codex-sdk-runner: same architecture as the Claude runner but targets the OpenAI Codex SDK. Authentication is inherited from ~/.codex/auth.json; the runner shells out to the codex binary and streams ThreadEvent objects back to the Rust caller.
- objectiveai-function-tree: a TypeScript/React package that renders a 2D canvas visualization of ObjectiveAI function execution trees. Exposes a FunctionTree component plus a headless core export and CSS; peer-depends on React 18+. Used internally by objectiveai-web.
- objectiveai-cocoindex (PyPI): a Python integration that wraps ObjectiveAI function executions as memoized CocoIndex processing components. The memo key combines the bound (function, profile, strategy) triple with the per-call input, making it safe to drop into indexing pipelines.
- objectiveai-github-discord-notifier: a Python FastAPI webhook server (Docker-deployable) that validates GitHub webhook signatures and forwards pull-request and issue events to a configured Discord channel.
- objectiveai-json-schema: generated JSON Schema files for every public serializable type in the Rust SDK, named using dot-separated module paths (e.g. functions.executions.RetryToken.json). Several hundred schemas cover agents, swarms, functions, profiles, executions, CLI output, MCP types, and more. These files drive code generation for the Go SDK and .NET SDK and can be used by any downstream tooling that needs machine-readable type definitions.
- ObjectiveAI-claude-code-1: an autonomous Claude Code agent that invents and publishes ObjectiveAI Functions without human intervention. Uses the Agent SDK to create, test, and deploy new scoring pipelines, closing the loop on the invention system.
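The id-based demultiplexing that the runners rely on can be illustrated in a few lines. This is a sketch of the general technique, not the runners' code: only the string id field comes from the description above, while the `event` field and the function name are hypothetical.

```python
import json
from collections import defaultdict


def demultiplex(ndjson_lines):
    """Group events from one interleaved NDJSON stream by their string id."""
    streams = defaultdict(list)
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        event = json.loads(line)
        streams[event["id"]].append(event)
    return dict(streams)


if __name__ == "__main__":
    # Two concurrent requests ("a" and "b") sharing one stdout pipe.
    interleaved = [
        '{"id": "a", "event": "start"}',
        '{"id": "b", "event": "start"}',
        '{"id": "a", "event": "done"}',
    ]
    for request_id, events in demultiplex(interleaved).items():
        print(request_id, [e["event"] for e in events])
```

Each stream keeps its own event order even though lines from different requests arrive interleaved on the shared pipe.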
A single git repository contains the SDK core, server, clients, integrations, and tools.
objectiveai/
│
├── # SDK core (Rust)
│ ├── objectiveai-sdk-rs/ # Rust SDK — types, validation, compilation
│ ├── objectiveai-sdk-rs-macros/ # Procedural macros for the Rust SDK
│ ├── objectiveai-sdk-rs-cffi/ # C FFI bindings (expose SDK to C/C++)
│ ├── objectiveai-sdk-rs-pyo3/ # PyO3 bindings (Rust extension for Python)
│ └── objectiveai-sdk-rs-wasm-js/ # WASM bindings for browser / Node.js
│
├── # SDKs (other languages)
│ ├── objectiveai-sdk-js/ # TypeScript/JavaScript SDK (npm)
│ ├── objectiveai-sdk-py/ # Python SDK (PyPI)
│ ├── objectiveai-sdk-go/ # Go SDK
│ └── objectiveai-dotnet/ # .NET SDK (NuGet: ObjectiveAI)
│
├── # Server & binaries
│ ├── objectiveai-api/ # API server (self-hostable or importable)
│ ├── objectiveai-cli/ # Command-line interface
│ ├── objectiveai-viewer/ # Desktop viewer app (Tauri)
│ └── objectiveai-mcp-cli/ # MCP CLI binary (ships as objectiveai-mcp)
│
├── # MCP integration
│ ├── objectiveai-mcp-proxy/ # MCP proxy — multiplexes tool calls
│ └── objectiveai-mcp-filesystem/ # MCP filesystem helpers
│
├── # Runners
│ ├── objectiveai-claude-agent-sdk-runner/ # Concurrent Claude Agent SDK runner
│ └── objectiveai-codex-sdk-runner/ # Concurrent OpenAI Codex SDK runner
│
├── # Web & tools
│ ├── objectiveai-web/ # Next.js production web interface
│ ├── objectiveai-function-tree/ # 2D canvas function-tree visualizer
│ ├── objectiveai-cocoindex/ # CocoIndex integration (Python)
│ ├── objectiveai-github-discord-notifier/ # GitHub webhook → Discord notifier
│ └── objectiveai-json-schema/ # Generated JSON Schema files
│
└── # Other
├── examples/ # Usage examples
├── bin/ # Vendored build tool binaries
└── *.sh # Root scripts: build, install, publish, version
- Rust: stable toolchain via rustup. No pinned rust-toolchain.toml; use the current stable release. wasm-pack and maturin are installed automatically by build-bin.sh into ./bin/.
- Node.js + pnpm 10.25.0: the workspace packageManager field pins this version. Install pnpm via corepack enable or npm i -g pnpm@10.25.0.
- Python: required for objectiveai-sdk-py (PyO3/maturin extension build) and the Claude/Codex agent-SDK runners (PyInstaller).
- Docker: required for the objectiveai-mcp-filesystem musl cross-compilation step in build.sh.
pnpm install # JS workspace dependencies
cargo build --release # Rust crates
bash build.sh # full monorepo build in dependency order
bash build-bin.sh # (re)install pinned build tools into ./bin/
build.sh generates JSON schemas, compiles WASM and CFFI bindings, builds all language SDKs (.NET, Go, Python, JS), and produces viewer artifacts.
bash test.sh # all suites in parallel (spawns a local API server)
cargo test # Rust workspace tests
pnpm test # JS/TS tests
test.sh exports OBJECTIVEAI_TEST_PORT and runs per-package test.sh scripts concurrently across objectiveai-sdk-rs, objectiveai-api, objectiveai-json-schema, objectiveai-cli, objectiveai-mcp-proxy, objectiveai-sdk-js, objectiveai-sdk-py, objectiveai-sdk-go, and objectiveai-viewer. Tests must not hit the production API; use the local server, mocks, or fixtures.
- Package manager: use pnpm, never npm. Filter to a single workspace package with pnpm --filter <package-name> run <script>.
- No type re-exports in Rust. When an import path is wrong, fix it at the call site. Never add re-export aliases or shim pub use entries to paper over a broken import.
- mod.rs discipline. mod.rs files contain only module declarations and re-export globs: no functions, structs, enums, traits, or impls. Every entry must be either pub mod foo; or mod foo; pub use foo::*;.
- No network-hitting tests. Tests must not contact the production API. Mock responses or use local fixtures.
- Test failures are not pre-existing issues. Every failure must be investigated and fixed; never dismiss one to move on.
- Single shared version. All packages share one version number. Bump atomically across Cargo.toml, package.json, pyproject.toml, .csproj, and all inter-package dependency references with bash version.sh <new-version>.
- Publishing. bash publish.sh orchestrates the full release across crates.io, PyPI, npm, the Go module proxy, and GitHub Releases in dependency-order waves, polling each registry until the new version is live before proceeding.
MIT.
The three JSONL line shapes a plugin may write to stdout:
{"type": "notification", "key": "value"} // data to forward to the caller
{"type": "error", "level": "warn", "fatal": false, "message": "..."}
{"type": "command", "command": "agents list"} // spawn a CLI command, fire-and-forget