browser-cli

CLI-first browser automation for AI agents

Features • Installation • Quick Start • Task And Automation Model • Testing

browser-cli is a browser automation tool for AI agents and developers who need reliable browser control from the command line.

Architecture

┌──────────────────────────────────────────────────────────────────────┐
│  Task/Automation Layer  (task.py + task.meta.json + automation.toml) │
├──────────────────────────────────────────────────────────────────────┤
│  Browser Daemon  ──►  60+ commands  ──►  Semantic Ref System         │
│  ├─ read: one-shot page capture                                      │
│  ├─ open/snapshot/click/fill: interactive control                    │
│  ├─ console/network/trace: observation & debugging                   │
│  ├─ verify-*: assertions                                             │
│  └─ ... 60+ commands total                                           │
├──────────────────────────────────────────────────────────────────────┤
│  Dual Backend: Playwright (default) ◄──► Chrome Extension (opt)      │
└──────────────────────────────────────────────────────────────────────┘

Component	Purpose
Browser Daemon	Long-lived browser instance with daemon-backed CLI commands
Semantic Refs	Stable element identifiers using bridgic-style reconstruction
Task Runtime	Reusable `task.py` execution through `browser_cli.task_runtime`
Automation Service	Persistent local service for published automation snapshots

Features

Dual Backend Architecture: managed profile mode by default, extension mode when real Chrome is available
Semantic Ref System: stable refs that survive many DOM re-renders
Agent Isolation: X_AGENT_ID isolates visible tabs while sharing browser storage
JSON-First API: daemon-backed commands return structured JSON
Task Runtime: package browser logic as task.py + task.meta.json
Automation Publish Layer: publish immutable task snapshots and operate them through a local Web UI

Installation

Requirements:

Python 3.10+
uv
Stable Google Chrome

Install as a tool:

uv tool install browser-control-and-automation-cli
browser-cli doctor
browser-cli paths
browser-cli read https://example.com

The published package name is browser-control-and-automation-cli. The installed command remains browser-cli.

Run without installing:

uvx --from browser-control-and-automation-cli browser-cli read https://example.com

Install from Git:

uv tool install git+https://github.com/hongv/browser-cli.git
browser-cli --help

Installed users should start with docs/installed-with-uv.md. For removal and local cleanup guidance, see docs/uninstall.md.

Development

Clone the repository and sync the managed development environment:

git clone https://github.com/hongv/browser-cli.git
cd browser-cli
uv sync --dev

The CLI targets stable Google Chrome. Playwright Chromium is mainly useful for local integration testing and is installed through the repo environment.

Optional: Extension Mode

For real-Chrome execution:

Open chrome://extensions
Enable Developer mode
Click Load unpacked
Select browser-cli-extension/

Once connected, browser-cli status reports extension capability state and the daemon can prefer the extension backend at safe idle points.

Quick Start

If you installed Browser CLI with uv, use the dedicated installed-user guide at docs/installed-with-uv.md. The short version is:

browser-cli doctor
browser-cli paths
browser-cli read https://example.com

One-Shot Read

browser-cli read https://example.com
browser-cli read https://example.com --snapshot
browser-cli read https://example.com --scroll-bottom

Interactive Control

browser-cli open https://example.com
browser-cli snapshot
browser-cli click @8d4b03a9
browser-cli fill @input_ref "value"
browser-cli html
browser-cli status
browser-cli reload

Multi-Agent Tabs

X_AGENT_ID=agent-a browser-cli open https://example.com
X_AGENT_ID=agent-a browser-cli tabs

X_AGENT_ID=agent-b browser-cli open https://example.org
X_AGENT_ID=agent-b browser-cli tabs
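
When agents are driven from Python rather than a shell, the same isolation can be reached by setting X_AGENT_ID per subprocess. The following is a minimal sketch, not part of Browser CLI itself: the helper names `agent_env` and `run_as_agent` are hypothetical, and it assumes `browser-cli` is on PATH.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional, Sequence


def agent_env(agent_id: str, base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with X_AGENT_ID set for one agent."""
    if not agent_id:
        raise ValueError("agent_id must be a non-empty string")
    env = dict(os.environ if base is None else base)
    env["X_AGENT_ID"] = agent_id
    return env


def run_as_agent(agent_id: str, args: Sequence[str]) -> subprocess.CompletedProcess:
    """Run `browser-cli <args>` under the given agent id.

    Hypothetical helper: requires browser-cli to be installed and on PATH.
    """
    return subprocess.run(
        ["browser-cli", *args],
        env=agent_env(agent_id),
        capture_output=True,
        text=True,
        check=False,  # let the caller inspect returncode and stderr
    )
```

For example, `run_as_agent("agent-a", ["open", "https://example.com"])` followed by `run_as_agent("agent-a", ["tabs"])` mirrors the first shell session above.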

Task And Automation Model

Browser CLI separates local authoring from durable publication:

a task is local, editable source
an automation is a published, immutable snapshot

Typical task layout:

tasks/
  my_task/
    task.py
    task.meta.json
    automation.toml
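
A quick pre-flight check of this layout can be sketched in Python. This is only an illustration and not a substitute for `browser-cli task validate`: the function name `check_task_layout` is hypothetical, and treating task.py and task.meta.json as required while automation.toml is optional is an assumption drawn from this README.

```python
from pathlib import Path
from typing import Dict, List, Union

# Assumption: these two files are always expected in a task directory.
REQUIRED_FILES = ("task.py", "task.meta.json")
# Assumption: automation.toml may be absent in the source directory.
OPTIONAL_FILES = ("automation.toml",)


def check_task_layout(task_dir: Union[str, Path]) -> Dict[str, List[str]]:
    """Report which expected files are absent from a task directory."""
    task_dir = Path(task_dir)
    return {
        "missing": [n for n in REQUIRED_FILES if not (task_dir / n).is_file()],
        "optional_absent": [n for n in OPTIONAL_FILES if not (task_dir / n).is_file()],
    }
```

An empty `missing` list means the directory has the files this sketch expects; it says nothing about whether their contents are valid.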

Validate and run a task directly:

browser-cli task validate tasks/my_task
browser-cli task run tasks/my_task --set url=https://example.com

Publish the current task directory into the automation service:

browser-cli automation publish tasks/my_task
browser-cli automation status
browser-cli automation ui

Publication semantics:

automation publish snapshots task.py, task.meta.json, and automation.toml together under ~/.browser-cli/automations/<automation-id>/versions/<version>/
if source automation.toml exists, Browser CLI uses it as the publish-time configuration truth
if source automation.toml is absent, Browser CLI publishes generated defaults and reports that explicitly via manifest_source
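
The snapshot location described above can be computed with a small helper when a script needs to find a published version on disk. This is a sketch under assumptions: `snapshot_dir` is a hypothetical name, and the exact format of `<version>` is not specified here, so the value is simply converted to a string.

```python
from pathlib import Path
from typing import Optional, Union


def snapshot_dir(
    automation_id: str,
    version: Union[int, str],
    home: Optional[Path] = None,
) -> Path:
    """Build ~/.browser-cli/automations/<automation-id>/versions/<version>/."""
    root = (home if home is not None else Path.home()) / ".browser-cli" / "automations"
    # Assumption: the version segment is the version identifier rendered as text.
    return root / automation_id / "versions" / str(version)
```

Passing `home` explicitly keeps the helper easy to test without touching the real home directory.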

Export a persisted automation back to automation.toml:

browser-cli automation export my_task --output /tmp/my_task.automation.toml

Included examples:

Automation-packaged reference and real-site tasks:
Additional real-site task examples:
- tasks/karpathy_nitter_latest_five/task.py
Additional usage notes:
- docs/examples/task-and-automation.md

Real-site publish example:

browser-cli task validate tasks/douyin_video_download
browser-cli automation publish tasks/douyin_video_download
browser-cli automation inspect douyin_video_download
browser-cli automation status

Inspect semantics:

browser-cli automation inspect <automation-id> shows the current live automation-service configuration
browser-cli automation inspect <automation-id> --version <n> shows snapshot_config for the immutable published version and live_config for the current service state
latest_run remains a separate operational view
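
Because a versioned inspect reports both snapshot_config and live_config, a script can compare the two to see where the running service has drifted from the published snapshot. The sketch below assumes the inspect output is a JSON object with those two names as top-level keys. That shape is an assumption, and `config_drift` is a hypothetical helper.

```python
from typing import Any, Dict, Tuple


def config_drift(report: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (snapshot_value, live_value)} for top-level keys that differ.

    Assumption: `report` is the parsed JSON from a versioned inspect call and
    carries `snapshot_config` and `live_config` objects. A key that is absent
    on one side is reported with None for that side.
    """
    snapshot = report.get("snapshot_config") or {}
    live = report.get("live_config") or {}
    drift: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(snapshot) | set(live)):
        before, after = snapshot.get(key), live.get(key)
        if before != after:
            drift[key] = (before, after)
    return drift
```

An empty result means the live configuration matches the published version at the top level.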

Output Contracts

`read`

stdout: final rendered result only
stderr: diagnostics only

Exit codes:

0: success
1: unexpected internal error
2: usage error
66: empty content
69: browser unavailable
73: profile unavailable
75: temporary read failure
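
A caller can turn these exit codes into a simple decision. The sketch below mirrors the table above in an enum. The retry policy is an assumption for illustration (only code 75 is treated as retryable), and the names `ReadExit` and `classify` are hypothetical.

```python
from enum import IntEnum


class ReadExit(IntEnum):
    """Exit codes listed above for `browser-cli read`."""

    SUCCESS = 0
    INTERNAL_ERROR = 1
    USAGE_ERROR = 2
    EMPTY_CONTENT = 66
    BROWSER_UNAVAILABLE = 69
    PROFILE_UNAVAILABLE = 73
    TEMPORARY_FAILURE = 75


# Assumption: only the "temporary" code is worth retrying automatically.
RETRYABLE = frozenset({ReadExit.TEMPORARY_FAILURE})


def classify(returncode: int) -> str:
    """Map a `read` exit status to 'ok', 'retry', or 'fail'."""
    try:
        code = ReadExit(returncode)
    except ValueError:
        return "fail"  # undocumented status: treat as a hard failure
    if code is ReadExit.SUCCESS:
        return "ok"
    return "retry" if code in RETRYABLE else "fail"
```

Typical use is `classify(subprocess.run([...]).returncode)` after invoking `browser-cli read`.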

Daemon-backed Commands

success stdout: JSON only
failure stderr: short error summary
stable machine-readable error codes include:
- NO_ACTIVE_TAB
- AGENT_ACTIVE_TAB_BUSY
- TAB_NOT_FOUND
- NO_SNAPSHOT_CONTEXT
- REF_NOT_FOUND
- STALE_SNAPSHOT
- AMBIGUOUS_REF

Runtime Notes

Managed profile mode is the default backend.
Extension mode is the preferred real-Chrome backend when connected and healthy.
Driver rebinding happens only at safe idle points and is reported as state_reset.
runtime.timeout_seconds is the total wall-clock timeout for one automation run in the automation service.

Documentation

Repo navigation and subsystem ownership: AGENTS.md
Installed-user guide: docs/installed-with-uv.md
Uninstall and cleanup guide: docs/uninstall.md
Explore-to-task skill: skills/browser-cli-explore-delivery/SKILL.md
Smoke checklist: docs/smoke-checklist.md

Testing

Run lint:

./scripts/lint.sh

Run tests:

./scripts/test.sh

Run guards:

./scripts/guard.sh

Run the full local validation flow:

./scripts/check.sh

Fast Python 3.10 compatibility check:

uv run python scripts/guards/python_compatibility.py

When the runtime behaves unexpectedly, use:

browser-cli status
browser-cli reload
browser-cli status

The integration coverage is fixture-driven and local-first. It exercises:

navigation, tabs, and history
snapshot and rendered HTML capture
semantic ref reconstruction after DOM re-render
stale and ambiguous ref failures
iframe refs
ref-driven element actions
console, network, dialogs, trace, video, screenshot, and PDF
cookies, storage save/load, and X_AGENT_ID isolation
task runtime and automation publishing/service flows

Acknowledgements

This project is deeply inspired by bridgic-browser. Browser CLI keeps the semantic ref and daemon-backed strengths while pushing the product toward a CLI-first, agent-first surface with reusable task and automation layers.
