macOS AI agent powered by an LLM-driven Plan-Execute-Verify loop, with native desktop automation, cross-run memory, and record-and-replay skills.
Axion is a Swift-based AI agent for macOS that takes natural language task descriptions and autonomously plans and executes actions. It combines core tools (Bash, file operations, web search) with 21 native desktop automation tools via MCP (Model Context Protocol), plus browser automation via Playwright. Use the built-in CLI directly, or integrate via HTTP API / MCP Server mode.
Key highlights:
- Versatile Tool Selection — Automatically picks the right tool: Bash for CLI tasks, MCP for GUI interactions, Playwright for browser automation, or Skills for specialized workflows
- SDK Skill System — Prompt skills, recorded skills, and built-in desktop skills with dual-track lookup and skill-scoped memory
- Record & Replay Skills — Record a workflow once, replay it instantly without LLM calls
- HTTP API Server — Integrate with CI/CD and external systems via REST + SSE
- MCP Server Mode — Act as a desktop plugin for external agents (Claude Code, Cursor, etc.), while also supporting CLI, file, and web tasks standalone
- User Takeover — Pause and resume when automation gets stuck
- Completion Notifications — macOS desktop notification with AI-generated summary when tasks finish
- Self-Evolution — Background review agent and intelligent curator automatically extract memory, evolve skills, and manage skill lifecycle after each run
┌───────────────────────────────────────────────────────────┐
│                         AxionCLI                          │
│   run / setup / doctor / server / mcp / record / skill    │
│      daemon / Agent Stream Loop · Memory · Takeover       │
│  Skill System · Built-in Skills · Skill + Memory Context  │
├──────────────────────┬────────────────────────────────────┤
│      AxionCore       │            AxionHelper             │
│  Models, Protocols,  │             MCP Server             │
│    Config, Errors    │  21 Native macOS Automation Tools  │
└──────────────────────┴────────────────────────────────────┘
- AxionCLI — CLI entry point with agent stream loop, memory, skill system (prompt + recorded + built-in), daemon management, server modes, and completion notifications
- AxionCore — Shared model layer (RunConfig, AxionConfig) and protocol definitions
- AxionHelper — MCP server process providing 21 native macOS automation tools via stdio
| Tool | Description |
|---|---|
| `launch_app` | Launch a macOS app by name (detects blocking dialogs) |
| `list_apps` | List all running applications |
| `quit_app` | Quit a running application |
| `activate_window` | Activate (bring to front) a specific window |

| Tool | Description |
|---|---|
| `list_windows` | List windows (filterable by process ID) |
| `get_window_state` | Get the state of a specific window |
| `move_window` | Move a window to a new position |
| `resize_window` | Move and/or resize a window |
| `validate_window` | Check if a window exists and is actionable |
| `arrange_windows` | Arrange multiple windows (tile, cascade) |

| Tool | Description |
|---|---|
| `click` | Click at coordinates or by AX selector |
| `click_element` | Click an element by title/role (no coordinate lookup needed) |
| `double_click` | Double-click at coordinates or by AX selector |
| `right_click` | Right-click at coordinates or by AX selector |
| `drag` | Drag from one point to another |
| `scroll` | Scroll by direction and amount |

| Tool | Description |
|---|---|
| `type_text` | Type text at the current cursor position |
| `press_key` | Press a single key |
| `hotkey` | Press a keyboard shortcut combination |

| Tool | Description |
|---|---|
| `screenshot` | Take a screenshot (full screen or specific window) |
| `get_accessibility_tree` | Get the accessibility tree of a window |
| `get_file_info` | Get file metadata (size, dates, permissions) |

| Tool | Description |
|---|---|
| `start_recording` | Start capturing user input events in listen-only mode |
| `stop_recording` | Stop recording and return captured events |
- macOS 14+
- Xcode 16+ (Swift 6.1)
- Accessibility and Screen Recording permissions
Homebrew (recommended):
brew tap terryso/tap
brew install axion
Build from source:
git clone https://github.com/terryso/axion.git
cd axion
swift build -c release
# Interactive setup (API Key, Provider, etc.)
axion setup
# Check environment status
axion doctor
# Execute a task (default — runs live)
axion run "Open Calculator and compute 123 + 456"
# CLI tasks use Bash directly — no GUI needed
axion run "Compress ~/Downloads/video.mp4 using ffmpeg"
axion run "Check disk usage of ~/Documents"
axion run "Search the web for Guangzhou weather today"
# Dry-run mode (generates a plan without executing)
axion run --dryrun "Open Calculator and compute 123 + 456"
# Fast mode — fewer LLM calls for simple tasks
axion run --fast "Open Calculator"
# Limit maximum steps
axion run --max-steps 10 "Create a new note in Notes"
# Disable post-run review and curator
axion run --no-review "Open Calculator"
When a task finishes, Axion sends a macOS desktop notification with three lines:
- Status — completed / failed / cancelled
- AI Summary — auto-generated one-line result summary (max 100 chars)
- Stats — elapsed time, LLM calls, estimated cost
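The three-line layout above can be sketched as a small formatting function. This is an illustration only, not Axion's implementation: the name `format_notification`, the wording of the stats line, and the `...` truncation style are assumptions. Only the three lines and the 100-character summary cap come from the text.

```python
def format_notification(status: str, summary: str, elapsed_s: float,
                        llm_calls: int, cost_usd: float) -> str:
    """Build a three-line notification body (hypothetical layout)."""
    max_len = 100  # summary cap stated in the README
    if len(summary) > max_len:
        # ASSUMPTION: truncate and mark the cut with "..." (total stays at 100).
        summary = summary[: max_len - 3] + "..."
    # ASSUMPTION: the exact stats wording is not specified by the README.
    stats = f"{elapsed_s:.1f}s, {llm_calls} LLM calls, ~${cost_usd:.2f}"
    return "\n".join([status, summary, stats])
```

For example, `format_notification("completed", "Computed 123 + 456 = 579", 12.3, 5, 0.03)` yields the status, the summary, and the stats on separate lines.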
If the task involved UI operations (desktop automation), Axion automatically brings the terminal window back to the foreground so you can immediately see the results.
Notifications are skipped in JSON mode for programmatic use.
When automation gets stuck, Axion pauses and lets you take over manually. Complete the action yourself, then press Enter to resume. Imperfect automation beats no automation.
Available options when paused:
- Press Enter — resume after manual fix
- Type skip — skip the current step
- Type abort — cancel the task
- Type a description — describe what you did (e.g., "used Cmd+Shift+G to enter path")
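The four options above amount to a small input parser. The sketch below is hypothetical (the names `TakeoverDecision` and `parse_takeover_input` are not from Axion), and it assumes that keywords are matched case-insensitively and that a free-text description also resumes the run.

```python
from dataclasses import dataclass


@dataclass
class TakeoverDecision:
    action: str      # "resume", "skip", or "abort"
    note: str = ""   # free-text description of the manual fix, if any


def parse_takeover_input(line: str) -> TakeoverDecision:
    """Map one line of user input at the takeover prompt to a decision."""
    text = line.strip()
    if text == "":
        return TakeoverDecision("resume")        # plain Enter
    lowered = text.lower()                       # ASSUMPTION: case-insensitive
    if lowered in ("skip", "abort"):
        return TakeoverDecision(lowered)
    # ASSUMPTION: any other text is a description and the run resumes afterwards.
    return TakeoverDecision("resume", note=text)
```

Keeping the description on the decision object makes it easy to hand the note to whatever component stores takeover experiences.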
Takeover experiences are automatically recorded as Memory, helping the Planner avoid similar blocks in the future. You can also manually record experiences:
axion memory learn-takeover --bundle-id com.apple.finder \
--issue "file dialog not accessible via AX" \
--summary "used Cmd+Shift+G to enter path directly"
Axion learns from every task execution. After each run, it automatically extracts app operation patterns (menu paths, control positions, operation sequences) and persists them. On subsequent runs involving the same app, the Planner injects this experience for more accurate plans.
# Memory is enabled by default — view accumulated knowledge
axion memory list
# Clear memory for a specific app
axion memory clear --app com.apple.calculator
# Disable memory for a single run
axion run --no-memory "Open Calculator"
After each run, Axion automatically triggers a background review that analyzes the conversation, extracts memory, and evolves skills — no user action required.
Review Agent — Automatically runs after axion run completes:
- Checks if review is needed based on message count and scheduling interval
- Forks a lightweight review agent (Haiku model) that inspects the conversation
- Extracts new memory facts and evolves skill definitions
- Runs in a detached task — does not block the terminal
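The first check in the list can be pictured as a simple gate. The README does not define the unit of the scheduling interval, so the sketch below assumes it counts runs since the last review. The function name and parameters are hypothetical. The defaults mirror the `reviewMinMessages` and `reviewMemoryInterval` values from the sample configuration.

```python
def should_review(message_count: int, runs_since_last_review: int,
                  min_messages: int = 4, interval: int = 10) -> bool:
    """Return True when a post-run review looks worthwhile.

    ASSUMPTION: both conditions must hold, and the interval is measured in
    completed runs; Axion's real scheduling rule may differ.
    """
    long_enough = message_count >= min_messages
    due = runs_since_last_review >= interval
    return long_enough and due
```

Under these assumptions, a short conversation never triggers a review, no matter how long it has been since the last one.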
# Review is enabled by default. Disable for a single run:
axion run --no-review "Open Calculator"
# Override the model used for review:
axion run --review-model claude-haiku-4-5-20251001 "Open Calculator"
Intelligent Curator — Periodically manages skill lifecycle:
- Mechanical curation — archives stale skills (>30 days unused), transitions skill states
- LLM curation — consolidates overlapping skills, prunes redundant ones
- Runs automatically when the configured interval elapses
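Mechanical curation is essentially threshold checks on how long a skill has gone unused. The sketch below assumes three states named `active`, `stale`, and `archived`, with thresholds taken from the `curatorStaleAfterDays` and `curatorArchiveAfterDays` values in the sample configuration. The state names and the function are illustrative, not Axion's actual model.

```python
from datetime import datetime, timedelta


def mechanical_state(last_used: datetime, now: datetime,
                     stale_after_days: int = 30,
                     archive_after_days: int = 90) -> str:
    """Classify a skill by idle time (hypothetical state names)."""
    idle = now - last_used
    if idle > timedelta(days=archive_after_days):
        return "archived"
    if idle > timedelta(days=stale_after_days):
        return "stale"
    return "active"
```

No model call is needed for this step, which is why it can run separately from the LLM-based consolidation pass.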
# View curator status and next run time
axion curator status
# Force-run curator immediately
axion curator run
# Dry-run (see what would change without modifying)
axion curator run --dry-run
Skill usage tracking — Every skill invocation via the Skill tool is automatically counted, providing data for curator decisions.
Review and curator results appear as trace events in ~/.axion/runs/<run-id>/review-trace.jsonl.
Run Axion as a service for external integrations:
# Start API server
axion server --port 4242
# With authentication
axion server --port 4242 --auth-key mysecret
# Limit concurrent tasks
axion server --port 4242 --max-concurrent 3
API endpoints:
| Method | Path | Description |
|---|---|---|
| GET | `/v1/health` | Health check |
| POST | `/v1/runs` | Submit a task (`{"task": "..."}`) |
| GET | `/v1/runs/{id}` | Query task status |
| GET | `/v1/runs/{id}/events` | SSE real-time event stream |
| GET | `/v1/skills` | List all skills |
| GET | `/v1/skills/{name}` | Get skill detail |
| POST | `/v1/skills/{name}/run` | Execute a skill |
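A client for these endpoints needs to build a `POST /v1/runs` request and read the SSE stream. The sketch below uses only the Python standard library. The README does not say how the `--auth-key` value is presented to the server, so the `Authorization: Bearer` header is an assumption. The event names in the usage note are likewise made up. The SSE parsing follows the standard wire format (`event:` and `data:` fields, blank line as terminator).

```python
import json
import urllib.request
from typing import Iterable, Iterator, Optional


def build_run_request(base_url: str, task: str,
                      auth_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/runs request."""
    headers = {"Content-Type": "application/json"}
    if auth_key:
        # ASSUMPTION: a Bearer token header; the README does not specify this.
        headers["Authorization"] = f"Bearer {auth_key}"
    body = json.dumps({"task": task}).encode("utf-8")
    return urllib.request.Request(f"{base_url}/v1/runs", data=body,
                                  headers=headers, method="POST")


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per server-sent event from an iterable of text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                      # a blank line terminates an event
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):          # comment / keep-alive line
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, drop a single leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
```

To actually submit a task, pass the request to `urllib.request.urlopen`. The response body of `GET /v1/runs/{id}/events` can then be decoded line by line and fed to `parse_sse`.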
Axion can act as an MCP server for external agents:
# Start as MCP stdio server
axion mcp
Add to your Claude Code MCP configuration:
{
"mcpServers": {
"axion": {
"command": "/path/to/axion",
"args": ["mcp"]
}
}
}
Record a workflow once, replay it anytime without LLM planning:
# Record your actions
axion record "open_calculator"
# ... perform desktop operations ...
# Press Ctrl-C to stop recording
# Compile recording into a reusable skill
axion skill compile open_calculator
# Run the skill (no LLM needed — fast and deterministic)
axion skill run open_calculator
# List all saved skills
axion skill list
# Delete a skill
axion skill delete open_calculator
Skills are stored as JSON in `~/.axion/skills/` and can be parameterized with `--param`.
Coordinate operations across multiple applications — copy data from browser to spreadsheet, extract attachments from mail to Finder, and chain end-to-end workflows across apps.
axion run "Copy the page title from Safari and paste it into TextEdit"
axion run "Put Safari and TextEdit side by side, Safari on the left"
The `arrange_windows` tool supports layouts: `tile-left-right`, `tile-top-bottom`, `cascade`.
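To make the three layout names concrete, here is one plausible way to compute window frames for each. It is a geometric sketch under stated assumptions, not the tool's actual algorithm: equal-sized tiles, a top-left origin, a 30-pixel cascade offset, and cascade windows sized to two thirds of the screen are all invented for illustration.

```python
from typing import List, Tuple

Frame = Tuple[int, int, int, int]  # x, y, width, height


def layout_frames(layout: str, count: int, screen_w: int, screen_h: int,
                  cascade_offset: int = 30) -> List[Frame]:
    """Return one frame per window for the given layout name."""
    if count <= 0:
        return []
    if layout == "tile-left-right":
        w = screen_w // count               # equal-width columns
        return [(i * w, 0, w, screen_h) for i in range(count)]
    if layout == "tile-top-bottom":
        h = screen_h // count               # equal-height rows
        return [(0, i * h, screen_w, h) for i in range(count)]
    if layout == "cascade":
        w, h = screen_w * 2 // 3, screen_h * 2 // 3
        return [(i * cascade_offset, i * cascade_offset, w, h)
                for i in range(count)]
    raise ValueError(f"unknown layout: {layout}")
```

With two windows on a 1440x900 screen, `tile-left-right` gives each window a 720-pixel-wide column, which matches the "side by side" request in the example task.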
Axion serves as the flagship reference implementation of OpenAgentSDK. Third-party developers can:
- Use the project template to scaffold new Agent apps
- Register custom tools via the `@Tool` macro
- Integrate with Axion's desktop capabilities via `axion mcp`
- Build on the same MCP + Agent Loop architecture
Axion integrates with OpenAgentSDK's Skill system, supporting two types of skills:
- Prompt Skills — Discovered from `~/.claude/skills/*/SKILL.md` files, each defining a `promptTemplate`, optional `toolRestrictions`, and `modelOverride`
- Recorded Skills — JSON files in `~/.axion/skills/` compiled from user recordings
Dual-track lookup — When a skill name is referenced, Axion checks prompt skills first, then falls back to recorded skills. Same-name skills always resolve to the prompt version.
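The lookup order described above fits in a few lines. The sketch below uses plain dictionaries as stand-ins for the two skill registries. The function name and the returned `(kind, definition)` tuple are illustrative choices rather than Axion's API.

```python
from typing import Dict, Optional, Tuple


def resolve_skill(name: str,
                  prompt_skills: Dict[str, dict],
                  recorded_skills: Dict[str, dict]) -> Optional[Tuple[str, dict]]:
    """Resolve a skill name: prompt skills win, recorded skills are the fallback."""
    if name in prompt_skills:
        return ("prompt", prompt_skills[name])
    if name in recorded_skills:
        return ("recorded", recorded_skills[name])
    return None  # unknown skill
```

Because the prompt registry is always consulted first, a recorded skill that shares a name with a prompt skill is effectively shadowed.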
Explicit triggering — Type /skill-name as the task prefix to invoke a specific skill directly:
# Trigger a prompt skill directly
axion run "/screenshot-analyze analyze the current screen layout"
# Trigger a recorded skill directly
axion run "/open-calculator"
# Or use the dedicated command
axion skill run open-calculator
Implicit triggering — Axion injects a curated list of available skills into the system prompt. The LLM can automatically invoke the right skill based on the user's intent without explicit mention.
Built-in desktop skills — Three skills are registered in code (no filesystem files needed):
| Skill | Aliases | Description |
|---|---|---|
| `screenshot-analyze` | `sa`, `analyze`, `screen` | Capture and analyze the current screen |
| `data-extract` | `extract`, `de` | Extract structured data from visible content |
| `form-fill` | `fill`, `ff` | Fill form fields automatically |
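The alias table can be combined with the `/skill-name` prefix convention in a small parser. Whether aliases are accepted after the slash is an assumption here, and `parse_explicit_trigger` is a hypothetical helper. The alias data itself is copied from the table.

```python
from typing import Optional, Tuple

BUILTIN_ALIASES = {
    "screenshot-analyze": ["sa", "analyze", "screen"],
    "data-extract": ["extract", "de"],
    "form-fill": ["fill", "ff"],
}
# Reverse index: alias -> canonical skill name.
ALIAS_TO_SKILL = {alias: skill
                  for skill, aliases in BUILTIN_ALIASES.items()
                  for alias in aliases}


def parse_explicit_trigger(task: str) -> Optional[Tuple[str, str]]:
    """Split '/skill-name rest of task' into (canonical skill, remaining text)."""
    if not task.startswith("/"):
        return None                      # no explicit trigger present
    head, _, rest = task[1:].partition(" ")
    if not head:
        return None
    # ASSUMPTION: aliases resolve to the canonical name; unknown names pass through.
    return ALIAS_TO_SKILL.get(head, head), rest.strip()
```

A task without a leading slash returns `None`, leaving skill selection to the implicit path.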
# List all available skills (prompt + recorded + built-in)
axion skill list
# Disable skill system for a single run
axion run --no-skills "Open Calculator"
Skill + Memory integration — Skills interact with the cross-run memory system:
- Successful skill execution records an `affordance` fact scoped to `skill:{name}`
- Failed execution records an `avoid` fact so the Planner learns from errors
- Before execution, up to 3 relevant skill-scoped memories are injected into the prompt
- Use `--no-memory` to skip both injection and recording
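The recording and injection rules above can be modelled with an in-memory list. This is a simplified sketch: the `Fact` type and function names are hypothetical, and "relevant" is approximated by "most recent" because the README does not describe the ranking.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fact:
    scope: str   # e.g. "skill:form-fill"
    kind: str    # "affordance" (success) or "avoid" (failure)
    text: str


def record_skill_outcome(store: List[Fact], skill: str, succeeded: bool,
                         text: str, use_memory: bool = True) -> None:
    """Append a skill-scoped fact unless memory is disabled (--no-memory)."""
    if not use_memory:
        return
    kind = "affordance" if succeeded else "avoid"
    store.append(Fact(f"skill:{skill}", kind, text))


def memories_for_skill(store: List[Fact], skill: str,
                       use_memory: bool = True, limit: int = 3) -> List[Fact]:
    """Pick up to `limit` facts to inject before running the skill."""
    if not use_memory:
        return []
    scoped = [f for f in store if f.scope == f"skill:{skill}"]
    # ASSUMPTION: the newest facts stand in for the "relevant" ones.
    return scoped[-limit:]
```

A single `use_memory` switch covers both directions, mirroring how `--no-memory` skips injection and recording together.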
HTTP API skill endpoints:
| Method | Path | Description |
|---|---|---|
| GET | `/v1/skills` | List all skills (merged prompt + recorded, with `type` field) |
| GET | `/v1/skills/{name}` | Get skill detail (`type`, `step_count`, `parameter_count`) |
| POST | `/v1/skills/{name}/run` | Execute a skill via API (`{"task": "..."}`) |
Run Axion as a persistent launchd daemon that survives reboots and auto-restarts on crashes. All running task state is persisted to disk, so in-flight tasks are automatically recovered after an unexpected server termination.
Daemon management:
# Install as a launchd agent (auto-start on login)
axion daemon install --port 4242
# With authentication
axion daemon install --port 4242 --auth-key mysecret
# Check daemon status
axion daemon status
# Uninstall (stops service and removes plist)
axion daemon uninstall
# Uninstall but keep log files
axion daemon uninstall --keep-logs
Key daemon properties:
- Auto-start — `RunAtLoad: true` starts on login
- Crash recovery — `KeepAlive: true` restarts on any exit
- Log files — stdout → `~/.axion/server.log`, stderr → `~/.axion/server.err.log`
- ThrottleInterval — 10s minimum between restart attempts
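These properties map directly onto standard launchd plist keys. The sketch below builds such a plist with Python's `plistlib`. The `Label` value and the `ProgramArguments` list are assumptions for illustration, since the README does not show the plist that `axion daemon install` actually writes. The remaining keys and values come from the list above.

```python
import plistlib
from pathlib import PurePosixPath


def daemon_plist(axion_path: str, port: int, home: str) -> bytes:
    """Serialize a launchd agent definition matching the listed properties."""
    axion_dir = PurePosixPath(home) / ".axion"
    config = {
        "Label": "com.example.axion.server",      # hypothetical label
        # ASSUMPTION: the daemon simply runs `axion server --port <port>`.
        "ProgramArguments": [axion_path, "server", "--port", str(port)],
        "RunAtLoad": True,                        # start on login
        "KeepAlive": True,                        # restart on any exit
        "ThrottleInterval": 10,                   # seconds between restarts
        "StandardOutPath": str(axion_dir / "server.log"),
        "StandardErrorPath": str(axion_dir / "server.err.log"),
    }
    return plistlib.dumps(config)                 # XML plist bytes
```

The result is an XML property list of the kind launchd loads from `~/Library/LaunchAgents/`.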
Task state persistence:
- All task state (`api-output.json`) and SSE events (`api-events.jsonl`) are written to `~/.axion/api-runs/` in real time
- On server restart, `RunRecoveryService` loads all persisted runs and:
  - Marks `running`/`queued`/`resuming`/`userTakeover` tasks as `failed` with error `"server interrupted"`
  - Preserves `intervention_needed`, `completed`, `failed`, and `cancelled` states unchanged
  - Restores SSE event history so late subscribers can replay past events
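The recovery rules reduce to a mapping over persisted run records. In the sketch below, the `status` and `error` field names are assumptions about the persisted JSON, and `recover_run` is a hypothetical stand-in for what `RunRecoveryService` does per run. The state names and the error string come from the list above.

```python
INTERRUPTED_STATES = {"running", "queued", "resuming", "userTakeover"}


def recover_run(run: dict) -> dict:
    """Return the run record as it should look after a server restart."""
    if run.get("status") in INTERRUPTED_STATES:
        # The run was in flight when the server stopped, so it cannot continue.
        return {**run, "status": "failed", "error": "server interrupted"}
    # intervention_needed, completed, failed, and cancelled stay unchanged.
    return dict(run)
```

Returning a new dictionary rather than mutating the input keeps the persisted record intact until the recovered state is written back.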
AxionHelper can run as a standalone MCP server for any MCP client:
# Start MCP stdio server
.build/release/AxionHelper
{
"mcpServers": {
"axion": {
"command": "/path/to/AxionHelper"
}
}
}
Config file located at `~/.config/axion/config.json`:
{
"provider": "anthropic",
"apiKey": "sk-...",
"model": "claude-sonnet-4-20250514",
"maxSteps": 20,
"maxModelCalls": 50,
"reviewModel": "claude-haiku-4-5-20251001",
"reviewMemoryInterval": 10,
"reviewSkillInterval": 15,
"reviewMinMessages": 4,
"curatorEnabled": true,
"curatorIntervalHours": 168,
"curatorStaleAfterDays": 30,
"curatorArchiveAfterDays": 90
}
Supports Anthropic and OpenAI Compatible providers. Config priority: defaults → config.json → environment variables → CLI flags.
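One common reading of this priority chain is a layered merge in which each later source overrides the earlier ones. The sketch below assumes that reading and treats `None` as "not set". The function name is hypothetical, and how Axion maps environment variables to keys is not covered here.

```python
from typing import Any, Dict


def resolve_config(defaults: Dict[str, Any], file_cfg: Dict[str, Any],
                   env_cfg: Dict[str, Any], cli_cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Merge config layers; later layers win (ASSUMED precedence order)."""
    merged: Dict[str, Any] = {}
    for layer in (defaults, file_cfg, env_cfg, cli_cfg):
        # Skip unset values so an absent CLI flag does not erase a lower layer.
        merged.update({k: v for k, v in layer.items() if v is not None})
    return merged
```

For instance, a `maxSteps` of 20 from the defaults would be replaced by `--max-steps 10` on the command line, while keys the CLI leaves unset keep their file or environment values.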
# Build
swift build
# Run unit tests (Swift Testing framework)
swift test --filter "AxionHelperTests.Tools" --filter "AxionHelperTests.Models" \
--filter "AxionHelperTests.MCP" --filter "AxionHelperTests.Services" \
--filter "AxionCoreTests" --filter "AxionCLITests"
# Run integration tests (requires macOS Accessibility permissions)
swift test --filter AxionHelperIntegrationTests
Sources/
├── AxionCLI/ # CLI entry point and commands
│ ├── Commands/ # run, setup, doctor, server, mcp, record, skill, daemon, curator subcommands
│ ├── Config/ # Configuration management
│ ├── Checks/ # Environment and permission checks
│ ├── Constants/ # CLI-specific constants
│ ├── IO/ # Output handlers and takeover I/O
│ ├── MCP/ # MCPServerRunner (Agent-as-MCP-Server)
│ ├── API/ # HTTP API server, SSE events
│ ├── Memory/ # MemoryContextProvider, RunMemoryProcessor
│ ├── Planner/ # PromptBuilder
│ ├── Skills/ # SkillRegistry, AxionBuiltInSkills
│ ├── Helper/ # HelperProcessManager (stdio lifecycle)
│ ├── Trace/ # TraceRecorder (review/curator trace events)
│ └── Services/ # RunOrchestrator, AgentBuilder, shared services
├── AxionCore/ # Shared core layer
│ ├── Models/ # RunConfig, AxionConfig, AppProfile
│ ├── Protocols/ # Service protocols
│ ├── Errors/ # Error types
│ └── Constants/ # ToolNames and shared constants
├── AxionHelper/ # MCP server (Helper process)
│ ├── MCP/ # MCPServer and ToolRegistrar (21 tools)
│ ├── Services/ # AccessibilityEngine, Screenshot, InputSimulation, EventRecorder
│ ├── Models/ # AppInfo, WindowInfo, AXElement, SelectorQuery
│ └── Protocols/ # Service protocol definitions
Tests/
├── AxionCoreTests/ # Core model unit tests
├── AxionCLITests/ # CLI command tests
├── AxionHelperTests/ # Helper tool and service tests
│ ├── Tools/ # Tool unit tests
│ ├── Models/ # Model tests
│ ├── Services/ # Service tests
│ ├── MCP/ # MCP protocol tests
│ └── Integration/ # Integration tests (requires real macOS environment)
- open-agent-sdk-swift — Agent SDK (Agent Loop, MCP Client, Memory Store, Hooks)
- swift-mcp — MCP protocol implementation
- swift-argument-parser — CLI argument parsing
MIT