Invocation:
python -m agent_cap.agents [flags]
| Flag |
Effect |
--list-strategies |
Print built-in + plugin strategies, exit. |
--list-protocols |
Print registered LLM protocols, exit. |
| Flag |
Default |
Notes |
--strategy NAME |
plan-execute |
One of --list-strategies. |
--max-turns N |
8 |
Cap on tool-use turns per agent. |
--sequence r1,r2,... |
(config order) |
Only for --strategy sequential. |
Two YAML layouts are supported. See yaml.md for details. From the
command line:
| Flag |
Repeatable |
Notes |
--agent ROLE=k=v,k=v |
yes |
Inline. ROLE is the agent pool key AND the role name unless you also use --role. |
--agents-file PATH |
yes |
YAML file with only agents: (+ optional defaults:/include:). Later files override earlier. |
--agents-glob PATTERN |
yes |
Shell glob, expanded and sorted. |
--role ROLE=AGENT |
yes |
Bind strategy role to a pool key. Lets two roles share one endpoint. |
For the common case of "one agent named agent", these flags build the role
directly without --agent ROLE=k=v,k=v syntax. They are ignored if an agent
role already exists (from --config, --agent, --agents-file).
| Flag |
Endpoint field |
--model NAME |
name |
--base-url URL |
base_url |
--api-key KEY |
api_key |
--max-tokens N |
max_tokens |
--temperature T |
temperature |
--top-p P |
top_p |
--seed N |
seed |
--engine NAME |
engine (e.g. vllm, sglang) |
--protocol NAME |
protocol (openai, harmony, mock) |
--system-prompt TEXT |
system_prompt |
--use-streaming |
use_streaming=true |
Example without any YAML:
python -m agent_cap.agents --strategy single \
--model gpt-oss-120b --base-url http://localhost:30000 --api-key EMPTY \
--engine sglang --max-tokens 131072 --top-p 1.0 --seed 42 \
--dataset imo-answerbench --num-tasks 5 --max-turns 128 \
--tool-backend math-python --evaluator imo \
--judge "name=openrouter/elephant-alpha,base_url=https://openrouter.ai/api/v1,api_key=$OPENROUTER_API_KEY" \
--output-dir results/imo_gptoss_sglang --resume
Supported keys inside --agent and --role values:
name model id / served-model-name
base_url OpenAI-compatible base URL (must include /v1 for hosted)
api_key bearer; any non-empty string is fine for self-hosted
max_tokens int, default 16384
temperature float, default 0.0
top_p float, default 1.0 (harmony protocol only; openai uses provider default)
seed int, optional (harmony protocol only; openai client does not forward seed)
use_streaming bool, default false
protocol explicit override: openai | harmony | mock | ...
engine serving-engine variant for the protocol (e.g. vllm | sglang for harmony)
system_prompt optional system message
| Flag |
Notes |
--task "..." |
Single prompt. |
--task-file PATH.jsonl |
Each line: {"task_id": ..., "user_prompt": ..., ...}. |
--dataset NAME |
Uses agent_cap.runner.unified_runner._load_dataset_tasks. Known names: imo-answerbench, mcp-atlas, swe-bench-pro, medagentbench, financebench. |
--num-tasks N |
Cap dataset to N tasks (0 = all). |
| Flag |
Notes |
--demo-tools |
Built-in calc + echo for smoke tests. |
--tool-backend NAME |
Real backends: mcp, math-python, or demo. |
--mcp-server-url URL |
Required when --tool-backend mcp. |
--no-tools |
Force-disable tools even if config provides them. |
| Flag |
Notes |
--evaluator NAME |
One of --list-strategies-equivalent for evaluators: exact, gtfa, imo, imo-answerbench, llm-judge / judge, none. |
--judge "k=v,k=v" |
Config for llm-judge (or as fallback for other evaluators). Keys: name, base_url, api_key, temperature, max_tokens, timeout_s, system_prompt, user_template, decision_field, score_field, extract. |
| Flag |
Notes |
--output PATH.json |
Single-file dump when not using --output-dir. |
--output-dir DIR |
Writes results.jsonl (one row per task) and output-data.jsonl. |
--resume |
Skip task_ids already present in <output-dir>/results.jsonl. |
-v / -vv |
-v prints per-task summary; -vv adds per-turn trace. |
| Flag |
Notes |
--mock |
Deterministic offline LLM. No API key, no network. |
--load-module DOTTED.PATH |
Import a module so its @register_* decorators fire. Repeatable. |
--config PATH.yaml |
Main YAML config. CLI flags override fields in it. |
When a setting can be supplied in multiple places:
CLI flag > --config YAML > --agents-file / --agents-glob > YAML `include:`
For per-agent fields specifically:
CLI --agent (inline) > CLI --role (mapping) > --config agents: > --agents-file agents: > defaults: