Batty

Hierarchical agent teams for software development.

Define a team of AI agents in YAML. Batty runs them through a shim-based runtime, routes work and messages between roles, manages engineer worktrees, and keeps the team moving while tmux remains the display and persistence layer.

Batty is a control plane for AI coding teams. Instead of one agent doing everything badly, you define roles like architect, manager, and engineers; Batty launches each agent through a PTY-owning shim, isolates engineer work in git worktrees, routes messages, tracks the board, and uses tmux for visibility and session persistence.

Quick Start

cargo install batty-cli
cd my-project && batty init
batty start --attach
batty send architect "Build a REST API with JWT auth"

That gets you from zero to a live team session. For the full walkthrough, templates, and configuration details, see the Getting Started guide.

Quick Demo

Watch the full demo on YouTube

You
  |
  | batty send architect "Build a chess engine"
  v
Architect (Claude Code)
  | plans the approach
  v
Manager (Claude Code)
  | creates tasks, assigns work
  v
Engineers (Codex / Claude / Aider)
  eng-1-1   eng-1-2   eng-1-3   eng-1-4
   |          |          |          |
   +---- isolated git worktrees ----+

Batty keeps each role visible in its own tmux pane, while the shim handles PTY ownership, state detection, and structured message delivery. The daemon auto-dispatches board tasks, runs standups, and merges engineer branches back when they pass tests.

Install

1. Install kanban-md

kanban-md is a separate Go tool. Download a release binary from GitHub. The asset names below embed version 0.33.0, so update the file name if the latest release is newer:

# macOS (Apple Silicon)
curl -sL https://github.com/antopolskiy/kanban-md/releases/latest/download/kanban-md_0.33.0_darwin_arm64.tar.gz | tar xz
mv kanban-md /usr/local/bin/

# macOS (Intel)
curl -sL https://github.com/antopolskiy/kanban-md/releases/latest/download/kanban-md_0.33.0_darwin_amd64.tar.gz | tar xz
mv kanban-md /usr/local/bin/

# Linux (x86_64)
curl -sL https://github.com/antopolskiy/kanban-md/releases/latest/download/kanban-md_0.33.0_linux_amd64.tar.gz | tar xz
mv kanban-md ~/.local/bin/

Or with Go: go install github.com/antopolskiy/kanban-md@latest
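
The three download commands above differ only in the OS and architecture suffix of the asset name. The following is a minimal sketch that derives the asset name from `uname` values, assuming the naming pattern in those URLs holds. `kanban_asset` and `KANBAN_VERSION` are hypothetical names used for illustration; they are not part of kanban-md or Batty, and only the three platforms listed above are covered.

```shell
#!/bin/sh
# Hypothetical helper, not part of kanban-md or Batty.
# Version embedded in the asset name; 0.33.0 is the one shown above.
KANBAN_VERSION="${KANBAN_VERSION:-0.33.0}"

# kanban_asset <uname -s value> <uname -m value>
# Prints the release asset name for the platforms listed in this README.
kanban_asset() {
  case "$1/$2" in
    Darwin/arm64)  suffix=darwin_arm64 ;;
    Darwin/x86_64) suffix=darwin_amd64 ;;
    Linux/x86_64)  suffix=linux_amd64 ;;
    *) echo "no asset listed for $1/$2" >&2; return 1 ;;
  esac
  echo "kanban-md_${KANBAN_VERSION}_${suffix}.tar.gz"
}
```

Calling `kanban_asset "$(uname -s)" "$(uname -m)"` on a supported machine prints the file name to append to the release download URL.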

2. Install Batty

From crates.io:

cargo install batty-cli

From source:

git clone https://github.com/battysh/batty.git
cd batty
cargo install --path .

How It Works

team.yaml
   |
   v
batty start
   |
   +--> shim process per agent
   +--> PTY + screen classifier per shim
   +--> tmux panes tail shim PTY logs
   +--> engineer worktrees created when enabled
   +--> daemon loop watches shim state, inboxes, board, retries, standups
   |
   v
batty send / assign / board / status / merge

Batty does not embed a model. It orchestrates external agent CLIs, keeps state in files, uses shims as the execution boundary, and uses tmux plus git worktrees as the operator-facing runtime surface.

Built-in Templates

batty init --template <name> scaffolds a ready-to-run team:

| Template | Agents | Description |
|----------|--------|-------------|
| solo     | 1      | Single engineer, no hierarchy |
| pair     | 2      | Architect + 1 engineer |
| simple   | 6      | Human + architect + manager + 3 engineers |
| squad    | 7      | Architect + manager + 5 engineers |
| large    | 19     | Human + architect + 3 managers + 15 engineers |
| research | 10     | PI + 3 sub-leads + 6 researchers |
| software | 11     | Human + tech lead + 2 eng managers + 8 developers |
| batty    | 6      | Batty's own self-development team |

Highlights

  • Hierarchical agent teams instead of one overloaded coding agent
  • Shim-driven runtime with PTY ownership, state classification, and structured delivery
  • tmux-backed visibility with persistent panes and session resume
  • Agent-agnostic role assignment: Claude Code, Codex, Aider, Kiro, or similar — set the default with batty init --agent <backend>
  • Maildir inbox routing with explicit talks_to communication rules
  • Stable per-engineer worktrees with fresh task branches on each assignment
  • Kanban-driven task loop with auto-dispatch, retry tracking, and test gating
  • Scheduled tasks: scheduled_for delays dispatch until a future time, cron_schedule enables recurring tasks that auto-recycle from done back to todo (guide)
  • Intervention system: seven automated recovery mechanisms (triage, review, owned-task, dispatch-gap, utilization, board replenishment, idle nudge) with cooldowns, dedup, and escalation
  • Per-intervention runtime toggles via batty nudge to disable or re-enable specific daemon behaviors without restarting
  • Orchestrator automation for triage, review, owned-task recovery, dispatch-gap recovery, utilization recovery, standups, nudges, and retrospectives
  • Auto-merge policy engine with confidence scoring and configurable thresholds for safe unattended merges
  • Review timeout escalation: stale reviews are nudged and auto-escalated after configurable thresholds, with per-priority overrides
  • Failure pattern detection: rolling window analysis detects recurring failures and notifies when thresholds are exceeded
  • SQLite telemetry database: batty telemetry queries agent performance, task lifecycle, review pipeline metrics, and event history
  • Consolidated metrics dashboard: batty metrics shows tasks, cycle time, rates, and agent performance in one view
  • Run retrospectives: batty retro generates Markdown reports analyzing task throughput, review stall durations, rework rates, and failure patterns
  • Team template export/import: batty export-template saves your team config, batty init --from restores it
  • Bundled Grafana dashboard template with 21 panels and 6 alerts for monitoring agent sessions, pipeline health, and task lifecycle
  • Daemon restart recovery: dead agent panes are automatically respawned with task context and backoff
  • External senders: allow non-team sources (email routers, Slack bridges) to message any role
  • Graceful non-git-repo handling: git-dependent operations degrade cleanly when the project is not a repository
  • Session summary on batty stop: prints task counts, cycle times, and agent uptime before exiting
  • Daemon auto-archive: completed tasks are automatically archived when the board exceeds a threshold
  • Board health dashboard: batty board health shows per-status counts, stale tasks, and dependency issues
  • Team load and cost estimation: batty load shows team utilization, batty cost estimates session spending
  • Inbox purge with age filtering: batty inbox purge --older-than 7d cleans up delivered messages
  • batty validate --show-checks: individual pass/fail status for each config validation rule
  • batty doctor --fix: detect and clean up orphan worktrees and branches left by previous runs
  • batty board archive: move completed tasks to an archive directory to keep the active board fast
  • Error resilience: sentinel tests guard production error paths in daemon and task loop modules
  • Modular codebase: large modules (daemon, config, delivery, watcher, doctor, merge) are decomposed into focused submodules
  • Worktree reconciliation: auto-detect cherry-picked branches and reset stale worktrees so engineers always start clean
  • Pending delivery queue: messages sent to agents that are still starting are buffered and delivered automatically when the agent becomes ready
  • YAML config, Markdown boards, JSON/JSONL + SQLite logs: everything stays file-based

CLI Quick Reference

| Command | Purpose |
|---------|---------|
| batty init [--template NAME] [--agent BACKEND] | Scaffold .batty/team_config/ |
| batty start [--attach] | Launch the daemon and tmux session |
| batty stop / batty attach | Stop or reattach to the team session |
| batty send <role> <message> | Send a message to a role |
| batty assign <engineer> <task> | Queue work for an engineer and report delivery result |
| batty inbox <member> / read / ack | Inspect and manage inbox messages |
| batty board / board list / board summary | Open the kanban board or inspect it without a TTY |
| batty board health | Show board health dashboard (status counts, stale tasks, dependency issues) |
| batty board archive [--older-than DATE] | Move done tasks to archive directory |
| batty status [--json] | Show current team state |
| batty merge <engineer> | Merge an engineer worktree branch |
| batty review <id> <disposition> [feedback] | Record a review disposition (approve, request-changes, reject) |
| batty task review <id> --disposition <d> | Record a review disposition (workflow-level variant) |
| batty task schedule <id> [--at T] [--cron E] [--clear] | Set or clear scheduled dispatch time and cron recurrence |
| batty nudge disable/enable/status | Toggle specific daemon interventions at runtime |
| batty telemetry summary/agents/tasks/reviews/events | Query SQLite telemetry for agent, task, and review metrics |
| batty retro | Generate a run retrospective analyzing throughput and failure patterns |
| batty load | Estimate team load and show recent load history |
| batty cost | Estimate current run cost from agent session files |
| batty metrics | Show consolidated telemetry dashboard (tasks, cycle time, rates, agents) |
| batty doctor [--fix] | Dump diagnostic state; --fix cleans up orphan worktrees/branches |
| batty pause / resume / queue | Control automation and inspect queued dispatch work |
| batty inbox purge [--older-than DUR] | Purge delivered inbox messages, optionally by age |
| batty validate [--show-checks] | Validate config; --show-checks shows per-rule pass/fail |
| batty config / export-run | Show resolved config and export runtime state |
| batty telegram | Configure Telegram for human communication |
| batty completions <shell> | Generate shell completions |

Requirements

  • Rust toolchain, stable >= 1.85
  • tmux >= 3.1 (recommended >= 3.2)
  • kanban-md CLI: see Install for setup
  • At least one coding agent CLI such as Claude Code, Codex, or Aider
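
A preflight check can catch a missing or outdated tool before `batty start`. The helpers below are a minimal sketch, not something Batty ships: `missing_tools`, `version_ge`, and `tmux_version` are hypothetical names, and `version_ge` only understands plain MAJOR.MINOR strings.

```shell
#!/bin/sh
# Hypothetical preflight helpers; not part of Batty.

# Print each named command that is not on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Succeed when version $1 is at least version $2 (both MAJOR.MINOR).
version_ge() {
  have_major=${1%%.*}; have_minor=${1#*.}
  want_major=${2%%.*}; want_minor=${2#*.}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# `tmux -V` prints something like "tmux 3.3a"; keep only MAJOR.MINOR.
tmux_version() {
  tmux -V | sed 's/^[^0-9]*\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/'
}
```

For example, `missing_tools cargo tmux kanban-md` lists whichever of those commands are absent, and `version_ge "$(tmux_version)" 3.1` fails when the installed tmux is older than the minimum listed above.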

Engineer Worktrees

When use_worktrees: true is set for engineers, Batty keeps one stable worktree directory per engineer at .batty/worktrees/<engineer>.

Each new batty assign does not create a new worktree. Instead it:

  • reuses that engineer's existing worktree path
  • resets the engineer slot onto current main
  • creates a fresh task branch such as eng-1-2/task-123 or eng-1-2/task-say-hello-1633ae2d
  • launches the engineer in that branch

After merge, Batty resets the engineer back to the base branch eng-main/<engineer> so the next assignment starts clean.
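
To make the cycle concrete, the sketch below prints plain git commands that roughly approximate it. This is an illustration under assumptions, not Batty's implementation: the function names are hypothetical, the task branch is assumed to start from a local branch named main, and resetting the base branch to main after a merge is a guess at what "starts clean" means.

```shell
#!/bin/sh
# Hypothetical sketch: print git commands that approximate the cycle
# described above. Nothing here is run against a repository.

worktree_path() { echo ".batty/worktrees/$1"; }
task_branch()   { echo "$1/task-$2"; }

# plan_assign <engineer> <task-id>
# Reuse the engineer's worktree and start a fresh task branch from main.
plan_assign() {
  wt=$(worktree_path "$1")
  # checkout -B creates the branch at main, or resets it if it exists.
  echo "git -C $wt checkout -B $(task_branch "$1" "$2") main"
}

# plan_after_merge <engineer>
# Return the worktree to the engineer's base branch (assumed reset to main).
plan_after_merge() {
  wt=$(worktree_path "$1")
  echo "git -C $wt checkout -B eng-main/$1 main"
}
```

Running `plan_assign eng-1-2 123` prints `git -C .batty/worktrees/eng-1-2 checkout -B eng-1-2/task-123 main`. The worktree path stays the same across assignments; only the checked-out branch changes.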

Telegram Integration

Batty can expose a human endpoint over Telegram through a user role. This is useful when you want the team to keep running in tmux while you send direction or receive updates from your phone.

The fastest path is:

batty init --template simple
batty telegram
batty stop && batty start

batty telegram guides you through:

  • creating or reusing a bot token from @BotFather
  • discovering your numeric Telegram user ID
  • sending a verification message
  • updating .batty/team_config/team.yaml with the Telegram channel config

After setup, the user role in team.yaml will look like this:

- name: human
  role_type: user
  channel: telegram
  talks_to: [architect]
  channel_config:
    provider: telegram
    target: "123456789"
    bot_token: "<telegram-bot-token>"
    allowed_user_ids: [123456789]

Notes:

  • You must DM the bot first in Telegram before it can send you messages.
  • bot_token can also come from BATTY_TELEGRAM_BOT_TOKEN instead of being stored in team.yaml.
  • The built-in simple, large, software, and batty templates already include a Telegram-ready user role.
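
If the token is supplied through BATTY_TELEGRAM_BOT_TOKEN rather than team.yaml, a small guard can confirm that the variable is set before the team is restarted. This is a minimal sketch; `telegram_token_ready` is a hypothetical helper, and it only checks that the variable is non-empty, not that the token is valid.

```shell
#!/bin/sh
# Hypothetical guard; not part of Batty.
# Succeeds only when BATTY_TELEGRAM_BOT_TOKEN is set and non-empty.
telegram_token_ready() {
  [ -n "${BATTY_TELEGRAM_BOT_TOKEN:-}" ]
}

if telegram_token_ready; then
  echo "token found in environment"
else
  echo "set BATTY_TELEGRAM_BOT_TOKEN before starting the team" >&2
fi
```

With the variable exported in the shell that launches Batty, `batty stop && batty start` picks up the new setup as in the steps above.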

Built with Batty

[Screenshot: Batty team session in tmux]

This session shows Batty coordinating a live team in ~/mafia_solver: the architect sets direction, black-lead and red-lead turn that into lane-specific work, and the black-eng-* / red-eng-* panes are individual engineer agents running in separate worktrees inside one shared tmux layout.

  • chess_test: a chess engine built by a Batty team (architect + manager + engineers)

Grafana Monitoring

Batty includes a bundled Grafana dashboard template with 21 panels across 6 rows and 6 pre-configured alerts. The dashboard covers session overview, pipeline health, agent performance, delivery and communication, task lifecycle, and recent activity.

The dashboard JSON is available in the source tree at src/team/grafana/dashboard.json. Copy it and import it into your Grafana instance to monitor live team runs.

Pre-configured alerts:

| Alert | Detects |
|-------|---------|
| Agent Stall | Agent silent past threshold |
| Delivery Failure Spike | Message delivery failures climbing |
| Pipeline Starvation | Not enough work in the pipeline |
| High Failure Rate | Tasks failing above threshold |
| Context Exhaustion | Agent context window nearly full |
| Session Idle | Entire team idle too long |

License

MIT