Skip to content

kunchenguid/chrome-devtools-axi

Repository files navigation

chrome-devtools-axi

npm CI Release Platform X Discord

The most agent-ergonomic browser automation

chrome-devtools-axi wraps chrome-devtools-mcp with an AXI-compliant CLI.

  • Token-efficient — TOON-encoded output cuts token usage ~40% vs raw JSON
  • Combined operations — one command navigates, captures, and suggests next steps
  • Contextual suggestions — every response includes actionable next-step hints

Quick Start

$ chrome-devtools-axi open https://example.com
page: {title: "Example Domain", url: "https://example.com", refs: 1}
snapshot:
RootWebArea "Example Domain"
  heading "Example Domain"
  paragraph "This domain is for use in illustrative examples..."
  uid=1 link "More information..."
help[1]:
  Run `chrome-devtools-axi click @1` to click the "More information..." link

$ chrome-devtools-axi click @1
page: {title: "IANA — IANA-Managed Reserved Domains", refs: 12}
snapshot:
...

Install

Tell your agent:

Execute `npx -y chrome-devtools-axi` to get browser automation tools.

How It Works

┌───────────────────────┐
│  chrome-devtools-axi  │  CLI — parse args, format output
└──────────┬────────────┘
           │ HTTP (localhost:9224)
           ▼
┌───────────────────────┐
│     Bridge Server     │  Persistent process, manages MCP session
└──────────┬────────────┘
           │ stdio
           ▼
┌───────────────────────┐
│  chrome-devtools-mcp  │  Headless Chrome via DevTools Protocol
└───────────────────────┘
  • Persistent bridge — a detached process keeps the MCP session alive across commands, so Chrome doesn't restart every invocation
  • Auto-lifecycle — the bridge starts on first command and writes a PID file to ~/.chrome-devtools-axi/bridge.pid
  • Snapshot parsing — accessibility tree snapshots are extracted and analyzed for interactive elements (uid= refs)
  • TOON encoding — structured metadata uses TOON format for compact, token-efficient output

CLI Reference

Navigation

Command Description
open <url> Navigate to URL and snapshot
snapshot Capture current page state
screenshot <p> Save a screenshot to a file
scroll <dir> Scroll: up, down, top, bottom
back Navigate back
wait <ms|text> Wait for time or text to appear
eval <js> Evaluate a JavaScript expression or function
run Execute a multi-step script from stdin

eval wraps plain input as () => (<expr>) before sending it to DevTools. For multi-statement logic, pass an arrow function, function, or IIFE yourself.

chrome-devtools-axi eval "document.title"
chrome-devtools-axi eval "(() => { const rows = [...document.querySelectorAll('tr')]; return rows.map((row) => row.textContent) })()"

Interaction

Command Description
click @<uid> Click an element by ref
fill @<uid> <text> Fill a form field
type <text> Type text at current focus
press <key> Press a keyboard key
hover @<uid> Hover over an element
drag @<from> @<to> Drag an element onto another
fillform @<uid>=<val>... Fill multiple form fields
dialog <accept|dismiss> Handle a browser dialog
upload @<uid> <path> Upload a file through an input

Page Management

Command Description
pages List all open tabs
newpage <url> Open a new tab
selectpage <id> Switch to a tab by ID
closepage <id> Close a tab by ID
resize <w> <h> Resize the browser viewport

Emulation

Command Description
emulate Emulate device/network/viewport

DevTools Debugging

Command Description
console List console messages
console-get <id> Get a specific console message
network List network requests
network-get [id] Get a specific network request

Performance

Command Description
lighthouse Run a Lighthouse audit
perf-start Start a performance trace
perf-stop Stop the performance trace
perf-insight <set> <name> Analyze a performance insight
heap <path> Capture a heap snapshot

Bridge

Command Description
start Start the bridge server
stop Stop the bridge server

Running with no command shows the CLI home view. It prepends bin and description metadata, then includes the current snapshot when a browser session is active or the no-session status/help block when one is not.

Flags

Flag Description
--help Show usage information
-v, -V, --version Show the installed CLI version
--full Show complete output without truncation
--background Open new page in background (newpage)
--uid @<uid> Target a specific element (screenshot)
--full-page Capture entire scrollable page (screenshot)
--format <fmt> Image format: png, jpeg, webp (screenshot)
--viewport <spec> Viewport like "390x844x3,mobile" (emulate)
--color-scheme <value> dark, light, or auto (emulate)
--network <condition> Network throttle: Slow 3G, etc. (emulate)
--cpu <rate> CPU throttling rate 1-20 (emulate)
--geolocation <lat>x<lon> Set geolocation (emulate)
--user-agent <string> Custom user agent (emulate)
--type <type> Filter by type (console, network)
--limit <n> Max items to return (console, network)
--page <n> Pagination (console, network)
--device <device> desktop or mobile (lighthouse)
--mode <mode> navigation or snapshot (lighthouse)
--output-dir <path> Directory for reports (lighthouse)
--no-reload Skip page reload (perf-start)
--no-auto-stop Disable auto-stop (perf-start)
--file <path> Save trace data to file (perf-start/stop)
--response-file <path> Save response body (network-get)
--request-file <path> Save request body (network-get)

Configuration

The bridge server port defaults to 9224. Override it with an environment variable:

export CHROME_DEVTOOLS_AXI_PORT=9225

Connect to an existing Chrome instance instead of launching one:

export CHROME_DEVTOOLS_AXI_BROWSER_URL=http://127.0.0.1:9222

CHROME_DEVTOOLS_AXI_BROWSER_URL accepts both http:// or https:// URLs and ws:// or wss:// endpoints:

  • http(s):// uses --browserUrl and fetches /json/version to discover the WebSocket URL.
  • ws(s):// uses --wsEndpoint directly.

For authenticated ws:// or wss:// endpoints, pass JSON headers with CHROME_DEVTOOLS_AXI_WS_HEADERS:

export CHROME_DEVTOOLS_AXI_BROWSER_URL=wss://cluster.example/launch
export CHROME_DEVTOOLS_AXI_WS_HEADERS='{"Authorization":"Bearer token"}'

State is stored in ~/.chrome-devtools-axi/:

File Purpose
bridge.pid PID and port of the running bridge

Session Hooks

On supported agents, the packaged CLI also installs a SessionStart hook in ~/.claude/settings.json and ~/.codex/hooks.json, and enables codex_hooks in ~/.codex/config.toml.

Set CHROME_DEVTOOLS_AXI_DISABLE_HOOKS=1 to skip that auto-install behavior.

Development entrypoints such as npm run dev and bin/chrome-devtools-axi.ts do not modify those hook files.

Development

npm run build      # Compile TypeScript to dist/
npm run dev        # Run CLI directly with tsx
npm test           # Run tests with vitest
npm run test:watch # Run tests in watch mode

About

The most agent-ergonomic browser automation

Resources

License

Stars

Watchers

Forks

Contributors