CLI-first browser automation for AI agents
Features • Installation • Quick Start • Task And Automation Model • Testing
browser-cli is a browser automation tool for AI agents and developers who need
reliable browser control from the command line.
┌──────────────────────────────────────────────────────────────────────┐
│ Task/Automation Layer (task.py + task.meta.json + automation.toml) │
├──────────────────────────────────────────────────────────────────────┤
│ Browser Daemon ──► 60+ commands ──► Semantic Ref System │
│ ├─ read: one-shot page capture │
│ ├─ open/snapshot/click/fill: interactive control │
│ ├─ console/network/trace: observation & debugging │
│ ├─ verify-*: assertions │
│ └─ ... 60+ commands total │
├──────────────────────────────────────────────────────────────────────┤
│ Dual Backend: Playwright (default) ◄──► Chrome Extension (opt) │
└──────────────────────────────────────────────────────────────────────┘
| Component | Purpose |
|---|---|
| Browser Daemon | Long-lived browser instance with daemon-backed CLI commands |
| Semantic Refs | Stable element identifiers using bridgic-style reconstruction |
| Task Runtime | Reusable task.py execution through browser_cli.task_runtime |
| Automation Service | Persistent local service for published automation snapshots |
- Dual Backend Architecture: managed profile mode by default, extension mode when real Chrome is available
- Semantic Ref System: stable refs that survive many DOM re-renders
- Agent Isolation: X_AGENT_ID isolates visible tabs while sharing browser storage
- JSON-First API: daemon-backed commands return structured JSON
- Task Runtime: package browser logic as task.py + task.meta.json
- Automation Publish Layer: publish immutable task snapshots and operate them through a local Web UI
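As a rough illustration of the agent isolation feature, the sketch below shows one way a Python script might set X_AGENT_ID per agent before invoking the CLI. The helper names agent_env and run_as_agent are hypothetical and not part of Browser CLI; only the X_AGENT_ID variable and the browser-cli command come from this README.

```python
import os
import subprocess


def agent_env(agent_id, base_env=None):
    """Return a copy of the environment with X_AGENT_ID set for one agent.

    Hypothetical helper: the base environment is copied, never mutated.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["X_AGENT_ID"] = agent_id
    return env


def run_as_agent(agent_id, *args):
    """Run `browser-cli <args>` under the given agent identity.

    Returns the CompletedProcess so the caller can inspect returncode,
    stdout, and stderr. Requires browser-cli to be installed on PATH.
    """
    return subprocess.run(
        ["browser-cli", *args],
        env=agent_env(agent_id),
        capture_output=True,
        text=True,
        check=False,
    )
```

With browser-cli installed, run_as_agent("agent-a", "tabs") and run_as_agent("agent-b", "tabs") would each be expected to list only that agent's visible tabs.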
Requirements:
- Python 3.10+
- uv
- Stable Google Chrome
Install as a tool:
uv tool install browser-control-and-automation-cli
browser-cli doctor
browser-cli paths
browser-cli read https://example.com
The published package name is browser-control-and-automation-cli. The
installed command remains browser-cli.
Run without installing:
uvx --from browser-control-and-automation-cli browser-cli read https://example.com
Install from Git:
uv tool install git+https://github.com/hongv/browser-cli.git
browser-cli --help
Installed users should start with docs/installed-with-uv.md.
For removal and local cleanup guidance, see docs/uninstall.md.
Clone the repository and sync the managed development environment:
git clone https://github.com/hongv/browser-cli.git
cd browser-cli
uv sync --dev
The CLI targets stable Google Chrome. Playwright Chromium is mainly useful for local integration testing and is installed through the repo environment.
For real-Chrome execution:
- Open chrome://extensions
- Enable Developer mode
- Click Load unpacked
- Select browser-cli-extension/
Once connected, browser-cli status reports extension capability state and the
daemon can prefer the extension backend at safe idle points.
If you installed Browser CLI with uv, use the dedicated installed-user guide at
docs/installed-with-uv.md. The short version is:
browser-cli doctor
browser-cli paths
browser-cli read https://example.com

browser-cli read https://example.com
browser-cli read https://example.com --snapshot
browser-cli read https://example.com --scroll-bottom

browser-cli open https://example.com
browser-cli snapshot
browser-cli click @8d4b03a9
browser-cli fill @input_ref "value"
browser-cli html
browser-cli status
browser-cli reload

X_AGENT_ID=agent-a browser-cli open https://example.com
X_AGENT_ID=agent-a browser-cli tabs
X_AGENT_ID=agent-b browser-cli open https://example.org
X_AGENT_ID=agent-b browser-cli tabs
Browser CLI separates local authoring from durable publication:
- task is local editable source
- automation is a published immutable snapshot
Typical task layout:
tasks/
my_task/
task.py
task.meta.json
automation.toml
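The layout above can be pre-checked from a script before handing the directory to the CLI. This is a minimal sketch, assuming that task.py and task.meta.json are the required files and that automation.toml is optional (the publish notes in this README say it may be absent). The function check_task_layout is hypothetical and does not replace browser-cli task validate.

```python
from pathlib import Path

# Assumption: these two files make up the task itself.
REQUIRED_FILES = ("task.py", "task.meta.json")
# Assumption: automation.toml is optional at authoring time.
OPTIONAL_FILES = ("automation.toml",)


def check_task_layout(task_dir):
    """Report which expected files are missing from a task directory."""
    task_dir = Path(task_dir)
    missing = [name for name in REQUIRED_FILES if not (task_dir / name).is_file()]
    optional_present = [
        name for name in OPTIONAL_FILES if (task_dir / name).is_file()
    ]
    return {"missing": missing, "optional_present": optional_present}
```

An empty missing list only means the files exist; whether their contents are valid is still a question for the CLI's own validation.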
Validate and run a task directly:
browser-cli task validate tasks/my_task
browser-cli task run tasks/my_task --set url=https://example.com
Publish the current task directory into the automation service:
browser-cli automation publish tasks/my_task
browser-cli automation status
browser-cli automation ui
Publication semantics:
- automation publish snapshots task.py, task.meta.json, and automation.toml together under ~/.browser-cli/automations/<automation-id>/versions/<version>/
- if source automation.toml exists, Browser CLI uses it as the publish-time configuration truth
- if source automation.toml is absent, Browser CLI publishes generated defaults and reports that explicitly via manifest_source
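To make the snapshot location and the configuration rule concrete, here is a small sketch. The directory pattern is taken from the publication semantics above; the helper names snapshot_dir and config_origin are hypothetical, and the strings returned by config_origin are descriptive labels for illustration, not the actual values Browser CLI writes to manifest_source.

```python
from pathlib import Path


def snapshot_dir(automation_id, version, home=None):
    """Build the directory a published snapshot is expected to live in.

    `home` defaults to the current user's home directory.
    """
    root = Path.home() if home is None else Path(home)
    return (
        root / ".browser-cli" / "automations" / automation_id
        / "versions" / str(version)
    )


def config_origin(task_dir):
    """Describe which configuration a publish would be based on.

    The returned labels are illustrative only.
    """
    if (Path(task_dir) / "automation.toml").is_file():
        return "source automation.toml"
    return "generated defaults"
```

For example, snapshot_dir("my_task", 3, home="/home/u") yields /home/u/.browser-cli/automations/my_task/versions/3 on a POSIX system.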
Export a persisted automation back to automation.toml:
browser-cli automation export my_task --output /tmp/my_task.automation.toml
Included examples:
- Automation-packaged reference and real-site tasks:
- Additional real-site task examples:
- Additional usage notes:
Real-site publish example:
browser-cli task validate tasks/douyin_video_download
browser-cli automation publish tasks/douyin_video_download
browser-cli automation inspect douyin_video_download
browser-cli automation status
Inspect semantics:
- browser-cli automation inspect <automation-id> shows the current live automation-service configuration
- browser-cli automation inspect <automation-id> --version <n> shows snapshot_config for the immutable published version and live_config for the current service state
- latest_run remains a separate operational view
- stdout: final rendered result only
- stderr: diagnostics only
Exit codes:
- 0: success
- 1: unexpected internal error
- 2: usage error
- 66: empty content
- 69: browser unavailable
- 73: profile unavailable
- 75: temporary read failure
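A caller that wraps the CLI can turn these exit codes into a structured result. The sketch below assumes the codes and the stdout/stderr split listed above apply to browser-cli read. The function interpret_read_result is hypothetical, and treating only code 75 as retryable is an assumption made for illustration, not documented behavior.

```python
# Exit codes as listed in this README.
EXIT_CODES = {
    0: "success",
    1: "unexpected internal error",
    2: "usage error",
    66: "empty content",
    69: "browser unavailable",
    73: "profile unavailable",
    75: "temporary read failure",
}

# Assumption: only the "temporary" failure is worth retrying automatically.
RETRYABLE = frozenset({75})


def interpret_read_result(returncode, stdout, stderr):
    """Map a finished process to a small result dictionary."""
    ok = returncode == 0
    return {
        "ok": ok,
        "meaning": EXIT_CODES.get(returncode, "unknown exit code"),
        # stdout carries the rendered result only on success.
        "content": stdout if ok else None,
        # stderr carries diagnostics in both cases.
        "diagnostics": stderr,
        "retryable": returncode in RETRYABLE,
    }
```

The three arguments map directly onto the returncode, stdout, and stderr attributes of a subprocess.CompletedProcess.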
- success stdout: JSON only
- failure stderr: short error summary
- stable machine-readable error codes include:
  - NO_ACTIVE_TAB
  - AGENT_ACTIVE_TAB_BUSY
  - TAB_NOT_FOUND
  - NO_SNAPSHOT_CONTEXT
  - REF_NOT_FOUND
  - STALE_SNAPSHOT
  - AMBIGUOUS_REF
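Because the error codes are stable, an agent can branch on them rather than on message text. The following sketch groups the codes listed above by a plausible recovery step. The grouping and the recovery labels are assumptions made for illustration, and suggest_recovery is a hypothetical helper; the README does not prescribe a recovery strategy or say where the code appears in the failure output.

```python
# Assumption: these codes usually mean the ref came from an outdated or
# missing snapshot, so taking a fresh snapshot is a reasonable next step.
RESNAPSHOT_CODES = frozenset(
    {"NO_SNAPSHOT_CONTEXT", "STALE_SNAPSHOT", "REF_NOT_FOUND"}
)
# Assumption: these codes mean the agent has no usable tab.
OPEN_TAB_CODES = frozenset({"NO_ACTIVE_TAB", "TAB_NOT_FOUND"})


def suggest_recovery(error_code):
    """Return an illustrative recovery label for a known error code."""
    if error_code in RESNAPSHOT_CODES:
        return "take a new snapshot"
    if error_code in OPEN_TAB_CODES:
        return "open or select a tab"
    if error_code == "AGENT_ACTIVE_TAB_BUSY":
        return "wait and retry"
    if error_code == "AMBIGUOUS_REF":
        return "choose a more specific ref"
    return "no suggestion"
```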
- Managed profile mode is the default backend.
- Extension mode is the preferred real-Chrome backend when connected and healthy.
- Driver rebinding happens only at safe idle points and is reported as state_reset.
- runtime.timeout_seconds is the total wall-clock timeout for one automation run in the automation service.
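A total wall-clock timeout differs from per-step timeouts: every step draws on one shared budget. The sketch below illustrates that idea with a hypothetical RunDeadline class. It is not the automation service's implementation, only one way to picture how a single timeout_seconds value can bound a whole run.

```python
import time


class RunDeadline:
    """Track one wall-clock budget shared by all steps of a run."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        # A monotonic clock avoids jumps when the system time changes.
        self._clock = clock
        self._expires_at = clock() + timeout_seconds

    def remaining(self):
        """Seconds left in the budget, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def expired(self):
        """True once the whole budget has been used up."""
        return self.remaining() == 0.0
```

Each step would then use deadline.remaining() as its own upper bound, so a slow early step leaves less time for later ones.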
- Repo navigation and subsystem ownership: AGENTS.md
- Installed-user guide: docs/installed-with-uv.md
- Uninstall and cleanup guide: docs/uninstall.md
- Explore-to-task skill: skills/browser-cli-explore-delivery/SKILL.md
- Smoke checklist: docs/smoke-checklist.md
Run lint:
./scripts/lint.sh
Run tests:
./scripts/test.sh
Run guards:
./scripts/guard.sh
Run the full local validation flow:
./scripts/check.sh
Fast Python 3.10 compatibility check:
uv run python scripts/guards/python_compatibility.py
When the runtime behaves unexpectedly, use:
browser-cli status
browser-cli reload
browser-cli status
The integration coverage is fixture-driven and local-first. It exercises:
- navigation, tabs, and history
- snapshot and rendered HTML capture
- semantic ref reconstruction after DOM re-render
- stale and ambiguous ref failures
- iframe refs
- ref-driven element actions
- console, network, dialogs, trace, video, screenshot, and PDF
- cookies, storage save/load, and X_AGENT_ID isolation
- task runtime and automation publishing/service flows
This project is deeply inspired by bridgic-browser. Browser CLI keeps the semantic ref and daemon-backed strengths while pushing the product toward a CLI-first, agent-first surface with reusable task and automation layers.