GitHub - vericontext/snact: AI agent-optimized browser CLI — snap + act. Extreme token efficiency for LLM-driven browser automation.

snact
AI agent-optimized browser CLI — snap + act

snact lets AI agents control browsers with extreme token efficiency. One snap returns page structure, section content, and every actionable element — enough for an LLM to understand and act in a single turn.

$ snact snap https://www.apple.com/shop/buy-mac/macbook-pro

# Buy MacBook Pro

## Model. Choose your size.
> 14-inch — From $1,699 or $141.58/mo. | 16-inch — From $2,699 or $224.91/mo.
@e35 [input:radio] "14-inch" selected
@e36 [input:radio] "16-inch"

## Chip. Choose from these powerful options.
> M5 Pro — 12-core CPU, 16-core GPU | M5 Max — 16-core CPU, 40-core GPU
@e40 [link]

$ snact click @e36
ok
---
## Model. Choose your size.                    # ← auto re-snap included
> 16-inch — Available with M5 Pro or M5 Max
@e35 [input:radio] "14-inch"
@e36 [input:radio] "16-inch" selected

Every action automatically returns a fresh page snapshot — no manual re-snap needed.

Performance comparison

Task: Visit npmjs.com for 10 React state management libraries (zustand, jotai, recoil, valtio, mobx, redux, xstate, effector, nanostores, legend-state). Collect weekly downloads, last publish date, unpacked size, and dependencies for each.

Video (comparison-compressed.mp4): both sides played at 16x speed. Left: Playwright MCP (5m 17s real time). Right: snact CLI (2m 39s real time).

	snact CLI	Playwright CLI	Playwright MCP
Time	2m 39s	5m 10s	5m 17s
Total tokens	34.1K (17%)	35.4K (18%)	88K (44%)
Message tokens	18.8K	20.1K	73.4K
Data accuracy	Correct	Correct	Correct

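snact finished in about half the time of either Playwright approach. It used roughly the same number of tokens as Playwright CLI and well under half as many as Playwright MCP. All three produced identical data.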

Speed: Both Playwright approaches took about 5 minutes; snact finished in 2m 39s.

Token efficiency: snact and Playwright CLI used similar total tokens (about 34-35K), while Playwright MCP used roughly 2.5x as many (88K) because accessibility tree snapshots accumulate in context.

Answer quality: All three produced identical data with minor format differences.
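The ratios behind this comparison can be checked directly from the table. The sketch below is plain arithmetic on the reported times and total token counts; the helper name `to_seconds` is only illustrative.

```python
# Times and total tokens copied from the comparison table above.
def to_seconds(minutes: int, seconds: int) -> int:
    return minutes * 60 + seconds

times = {
    "snact CLI": to_seconds(2, 39),
    "Playwright CLI": to_seconds(5, 10),
    "Playwright MCP": to_seconds(5, 17),
}
total_tokens_k = {"snact CLI": 34.1, "Playwright CLI": 35.4, "Playwright MCP": 88.0}

for name in ("Playwright CLI", "Playwright MCP"):
    time_ratio = times["snact CLI"] / times[name]
    token_ratio = total_tokens_k[name] / total_tokens_k["snact CLI"]
    print(f"vs {name}: snact took {time_ratio:.0%} of the time; "
          f"{name} used {token_ratio:.2f}x the tokens")
```

On these numbers snact needs about half the wall-clock time of either Playwright setup, while the token gap is small against Playwright CLI (about 1.04x) and large against Playwright MCP (about 2.58x).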

Per-page token measurements

Measured with wc -c / 4 on actual snap output (1 token ≈ 4 chars):

Site	snact (full)	snact (`--focus`)
example.com	46	—
GitHub Login	172	60
GitHub Trending	2,152	614
Hacker News	2,670	—
Apple MacBook Pro	2,546	—
StackOverflow	4,363	—
NYTimes	2,417	—

Simple pages: 50-200 tokens. Typical pages: 2K-4.4K. With --focus: 60-600.
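The same estimate can be reproduced in a script. This is a minimal sketch of the `wc -c / 4` heuristic described above; the function name `estimate_tokens` is an assumption, and the result is a rough approximation rather than a real tokenizer count.

```python
def estimate_tokens(text: str) -> int:
    """Approximate tokens as UTF-8 byte length divided by 4, like `wc -c` / 4."""
    return len(text.encode("utf-8")) // 4

# Two element lines taken from the MacBook Pro snap shown earlier.
sample = '@e35 [input:radio] "14-inch" selected\n@e36 [input:radio] "16-inch"\n'
print(estimate_tokens(sample))
```

Because `wc -c` counts bytes, non-ASCII characters contribute more than one unit each, so the estimate is slightly pessimistic for pages with a lot of non-ASCII text.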

Playwright token estimates from scrolltest.medium.com (MCP ~114K per test session, CLI ~27K). snact numbers are directly measured.

Record & Replay

Record task: Use snact to record a workflow called "npm-react-state" that visits npmjs.com for these 10 libraries. For each, snap the page and read the sidebar stats.

Replay task: Replay npm-react-state and build me an updated comparison table.

Video (snact-replayx8-small.mp4): played at 8x speed. First: record (2m 18s real time). Then: replay (47s real time).

The replay skips all LLM reasoning — it re-executes the recorded commands directly against Chrome and returns fresh data.

	Record (first run)	Replay
Time	2m 18s	47s
LLM turns	~20+	1
Data	Fresh	Fresh (re-visits pages)

Why snact?

	Playwright MCP	Playwright CLI	snact
Architecture	Persistent MCP server	Daemon + CLI	Stateless CLI
After click/fill	Snapshot in response	Manual re-snapshot	Snapshot in response
Tokens per page	~3K-50K	~1K-13K	~50-4K (measured)
Repeated tasks	Full LLM call	Full LLM call	0 (workflow replay)
Session persistence	Config-based	`--persistent` flag	`session save/load`
Cron automation	Requires LLM API	Requires LLM API	Shell one-liner
Locale/Geo override	Via `run-code`	Via config	`--locale` / `--geo` flags
Install	npm + Playwright	npm + Playwright	Single binary (Rust)
Multi-browser	Chromium/FF/WebKit	Chromium/FF/WebKit	Chrome only

Installation

# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/vericontext/snact/main/install.sh | bash

# Windows (PowerShell) — experimental, see #3
irm https://raw.githubusercontent.com/vericontext/snact/main/install.ps1 | iex

# From source (all platforms)
cargo install --path crates/snact-cli

# Verify
snact --version

Quick start

snact browser launch --background          # 1. start Chrome
snact snap https://github.com/trending     # 2. page structure + elements
snact click @e28                           # 3. act (auto re-snap included)
snact browser stop                         # 4. done

snap — structure + content + elements

snact snap https://github.com/trending

# Trending

## NousResearch / hermes-agent
> The agent that grows with you | Python | Star
@e28 [link] href="/NousResearch/hermes-agent"

## microsoft / markitdown
> Python tool for converting files and office documents to Markdown. | Python | Star
@e37 [link] href="/microsoft/markitdown"

Section headings group elements. > lines summarize content. Each @eN reference is stable until the next snap.
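For scripts that consume snap text directly, the grouping described above can be recovered with a few lines of parsing. The following is a sketch only: the `Section` type and `parse_sections` function are hypothetical helpers, not part of snact, and they assume the text layout shown in the example.

```python
import re
from dataclasses import dataclass, field

REF_LINE = re.compile(r"^(@e\d+)\b")

@dataclass
class Section:
    heading: str
    summary: str = ""
    refs: list = field(default_factory=list)

def parse_sections(snap: str) -> list:
    """Group `>` summaries and @eN refs under the nearest preceding heading."""
    sections = []
    for line in snap.splitlines():
        line = line.strip()
        if line.startswith("#"):
            sections.append(Section(heading=line.lstrip("#").strip()))
        elif not sections:
            continue  # skip anything before the first heading (e.g. "ok", "---")
        elif line.startswith(">"):
            sections[-1].summary = line[1:].strip()
        elif (m := REF_LINE.match(line)):
            sections[-1].refs.append(m.group(1))
    return sections

snap = """# Trending

## NousResearch / hermes-agent
> The agent that grows with you | Python | Star
@e28 [link] href="/NousResearch/hermes-agent"

## microsoft / markitdown
> Python tool for converting files and office documents to Markdown. | Python | Star
@e37 [link] href="/microsoft/markitdown"
"""

for section in parse_sections(snap):
    print(section.heading, section.refs)
```

An agent does not need this, since it reads the text as is; the sketch is aimed at non-LLM tooling that wants to pick a ref by section name.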

act — actions return updated state

snact click @e28

ok
---
# NousResearch/hermes-agent
> The agent that grows with you. Build AI agents...
@e1 [link] "Code" href="/NousResearch/hermes-agent"
@e2 [link] "Issues" href="/NousResearch/hermes-agent/issues"
...

Every mutation (click, fill, type, select, scroll) returns a fresh snap. Use --no-snap to disable.

read — full text content

snact read https://example.com --focus="main"

# Example Domain
This domain is for use in documentation examples.
Learn more

snap = structure + elements + summaries. read = full text when you need more detail.

eval — custom JavaScript

When snap/read can't capture dynamic content (e.g. Amazon product cards):

snact eval "JSON.stringify(Array.from(document.querySelectorAll('.product')).map(p => ({
  title: p.querySelector('h2')?.textContent,
  price: p.querySelector('.price')?.textContent
})))"

session — persist browser state

snact session save github           # cookies + localStorage
snact session load github           # restore later

record & replay — zero LLM cost

snact record start login-flow
snact snap https://app.example.com/login
snact fill @e1 "user@example.com" --no-snap
snact click @e3 --no-snap
snact wait navigation
snact record stop

# Day 2, 3, 4... — no LLM, no tokens
snact replay login-flow
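A scheduled job can call the replay from a script instead of an agent. The sketch below wraps `snact replay` with `subprocess.run`; it assumes `snact` is on `PATH`, that Chrome was already started with `snact browser launch --background`, and that a failed replay exits with a non-zero status (the README does not state this). The `run` parameter is a hypothetical hook for testing.

```python
import subprocess

def replay_workflow(name: str, run=subprocess.run) -> str:
    """Run `snact replay <name>` and return its stdout.

    With check=True, subprocess raises CalledProcessError on a non-zero exit.
    """
    argv = ["snact", "replay", name]
    result = run(argv, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `replay_workflow("login-flow")` from cron or a CI job then involves no LLM at all; the returned text is the fresh output of the recorded commands.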

Commands

Command	Description
`snap [url]`	Page structure + section summaries + interactable elements
`read [url]`	Full visible text as structured markdown
`click <@ref>`	Click element (returns updated snap)
`fill <@ref> <value>`	Set input value (returns updated snap)
`type <@ref> <text>`	Type character by character (returns updated snap)
`select <@ref> <value>`	Select dropdown option (returns updated snap)
`scroll [direction]`	Scroll page (returns updated snap)
`eval <expression>`	Execute JavaScript on the page
`screenshot [--file]`	Capture page as PNG
`wait <condition>`	Wait for navigation, CSS selector, or timeout (ms)
`session save|load|list|delete`	Manage browser sessions
`record start|stop|list|delete`	Record command sequences
`replay <name>`	Replay a recorded workflow
`browser launch|stop|status`	Manage Chrome instance
`schema [command]`	JSON Schema introspection
`mcp`	Start MCP server (JSON-RPC over stdio)
`init`	Create AGENT.md for Claude Code skill discovery

Global flags

--port <PORT>       Chrome debugging port [default: 9222]
--output <FMT>      Output format: text, json, ndjson [default: text]
--dry-run           Preview action without executing
--no-snap           Skip automatic re-snap after actions
--profile <NAME>    Browser profile name [default: "default"] (browser launch)
--idle-timeout <MIN> Auto-stop Chrome after N minutes of inactivity (browser launch)
--lang <LANG>       Accept-Language header [default: en-US]
--locale <LOCALE>   JS navigator.language override (e.g. en-US, ja-JP)
--geo <LAT,LON>     Geolocation override (e.g. "37.7749,-122.4194")
--user-agent <UA>   Custom User-Agent string
--focus <SEL>       CSS selector to limit scope (snap/read)
--verbose           Debug logging
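When driving snact from a program, it helps to build the argument list in one place. This is a small sketch; `snact_argv` is a hypothetical helper, and it places flags after the positional arguments using the `--flag=value` form, as the README examples do (`--output=json`, `--focus="main"`).

```python
def snact_argv(command: str, *args: str, **flags: object) -> list:
    """Build an argv list such as ["snact", "snap", URL, "--output=json"].

    Keyword names map to flags: no_snap=True -> --no-snap, port=9333 -> --port=9333.
    Flags set to False or None are omitted.
    """
    argv = ["snact", command, *args]
    for key, value in flags.items():
        flag = "--" + key.replace("_", "-")
        if value is True:
            argv.append(flag)                 # boolean switch
        elif value is not False and value is not None:
            argv.append(f"{flag}={value}")    # valued option
    return argv

print(snact_argv("snap", "https://example.com", output="json"))
print(snact_argv("fill", "@e1", "user@example.com", no_snap=True))
```

Passing the result to `subprocess.run` as a list avoids shell quoting problems with values such as the `--geo` coordinates or a custom User-Agent string.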

AI agent integration

Claude Code

snact works as a native CLI tool — no MCP configuration needed:

snact browser launch --background
claude
# "Use snact to find the MacBook Pro M4 Pro price on apple.com"

Run snact init in your project directory to create an AGENT.md skill file for Claude Code.

MCP server

For Claude Desktop or any MCP client:

{
  "mcpServers": {
    "snact": {
      "command": "snact",
      "args": ["mcp"]
    }
  }
}

Piped / scripted

snact snap https://example.com --output=json | jq '.elements | keys[]'
snact snap https://example.com --output=ndjson
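The same extraction can be done without jq. The sketch below mirrors `jq '.elements | keys[]'` in Python. The only thing it takes from the README is a top-level `elements` key; the sample payload is hypothetical, and what each line of `ndjson` output represents is not specified here, so `parse_ndjson` just decodes one JSON value per line.

```python
import json

def element_keys(snap_json: str) -> list:
    """Python equivalent of `jq '.elements | keys[]'`."""
    elements = json.loads(snap_json)["elements"]
    if isinstance(elements, dict):
        return sorted(elements)            # jq returns object keys sorted
    return list(range(len(elements)))      # for arrays, jq returns the indices

def parse_ndjson(text: str) -> list:
    """Decode newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Hypothetical payload, for illustration only.
sample = '{"elements": {"@e2": {}, "@e1": {}}}'
print(element_keys(sample))
```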

Architecture

graph TD
    A["AI Agent (Claude, GPT, ...)"] -->|"CLI stdout/stdin"| B
    A -->|"JSON-RPC stdio"| M

    subgraph snact
        B["snact-cli<br/><small>Thin CLI shell (clap)</small>"]
        M["MCP Server<br/><small>JSON-RPC over stdio</small>"]
        B --> C
        M --> C

        subgraph core["snact-core"]
            C["Snap"] & D["Read"] & E["Action + snap"] & F["Record/Replay"]
            C --> G["Element Map<br/><small>@eN refs</small>"]
            E --> G
            H["Session Storage"]
        end

        core --> I

        I["snact-cdp<br/><small>WebSocket + ~30 hand-written CDP commands</small>"]
    end

    I -->|"WebSocket (CDP)"| J["Chrome"]

Three-crate workspace — cdp handles Chrome protocol, core is the library, cli is a thin shell. MCP server exposes the same core over JSON-RPC for Claude Desktop and other MCP clients.

How contextual snap works

1. DOMSnapshot.captureSnapshot: full flattened DOM, including Shadow DOM
2. Accessibility.getFullAXTree: semantic roles, names, descriptions, properties
3. Merge: join DOM nodes with AX nodes by backendNodeId
4. Extract context: headings, text blocks (DOM + JS fallback for SPAs)
5. Filter: keep only interactable elements, exclude hidden/aria-hidden
6. Compress: group by section headings, add content summaries, assign @eN refs
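The merge, filter, and ref-assignment steps above can be illustrated with a toy version. This is a sketch under stated assumptions, not snact's Rust implementation: the records are simplified stand-ins for CDP responses, the `INTERACTABLE_ROLES` set is invented for the example, and sequential numbering from `@e1` is an assumption. In CDP, accessibility nodes carry the DOM identifier in the `backendDOMNodeId` field.

```python
# Hypothetical, simplified records; real CDP responses are far richer.
dom_nodes = [
    {"backendNodeId": 11, "tag": "h2", "hidden": False},
    {"backendNodeId": 12, "tag": "input", "hidden": False},
    {"backendNodeId": 13, "tag": "input", "hidden": True},
    {"backendNodeId": 14, "tag": "div", "hidden": False},
]
ax_nodes = [
    {"backendDOMNodeId": 12, "role": "radio", "name": "14-inch"},
    {"backendDOMNodeId": 13, "role": "radio", "name": "16-inch"},
    {"backendDOMNodeId": 14, "role": "generic", "name": ""},
]

# Assumed set of roles treated as interactable in this sketch.
INTERACTABLE_ROLES = {"button", "link", "textbox", "radio", "checkbox", "combobox"}

def build_element_map(dom_nodes: list, ax_nodes: list) -> dict:
    """Join DOM and AX nodes by backend node id, filter, and assign @eN refs."""
    ax_by_id = {ax["backendDOMNodeId"]: ax for ax in ax_nodes}
    element_map = {}
    for node in dom_nodes:
        ax = ax_by_id.get(node["backendNodeId"])
        if ax is None or node.get("hidden"):
            continue                      # no AX info, or not visible
        if ax["role"] not in INTERACTABLE_ROLES:
            continue                      # keep only actionable elements
        ref = f"@e{len(element_map) + 1}"
        element_map[ref] = {
            "backendNodeId": node["backendNodeId"],
            "role": ax["role"],
            "name": ax["name"],
        }
    return element_map

print(build_element_map(dom_nodes, ax_nodes))
```

Only the visible radio button survives: the heading has no AX entry in this toy data, the second radio is hidden, and the `div` has a non-interactable role.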

Auto re-snap after actions

Every mutation action (click, fill, type, select, scroll) automatically:

1. Executes the action via CDP
2. Waits for settle: detects navigation (waits for page load, 3s timeout) or SPA mutation (300ms settle)
3. Takes a fresh snap on the same transport connection
4. Returns ok\n---\n{snap output} so the LLM sees updated state in one turn
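A script that consumes this response can separate the status line from the snapshot at the `---` divider. A minimal sketch follows; `split_action_response` is a hypothetical helper, and the assumption that `--no-snap` output contains only the status line is not confirmed by the README.

```python
def split_action_response(output: str) -> tuple:
    """Split an action response into (status, snap) at the first '---' line.

    Returns an empty snap when no divider is present.
    """
    status, sep, snap = output.partition("\n---\n")
    return status.strip(), (snap if sep else "")

response = 'ok\n---\n## Model. Choose your size.\n@e36 [input:radio] "16-inch" selected\n'
status, snap = split_action_response(response)
print(status)
print(snap)
```

Splitting only at the first divider keeps any later `---` inside the snapshot text intact.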

Snap output format reference

## Section Heading
> Content summary: prices, options, descriptions (up to 300 chars)
@e1 [role] "label" id="..." href="..." expanded desc="Opens in new tab"
@e2 [input:text] "Search" placeholder="..." required

Component	Purpose
`## Heading`	Page section structure (h1-h6)
`> summary`	Key text content from that section
`@eN`	Stable element reference for actions
`[role]`	Semantic role (button, link, textbox, etc.)
`"label"`	Accessible name
`id=`, `href=`	Key attributes
`expanded`, `collapsed`	Dropdown/accordion state
`selected`	Active tab/option
`required`, `readonly`	Form field constraints
`desc="..."`	Accessibility description
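The element line format in the table above is regular enough to parse with two regular expressions. This sketch is based only on the lines shown in this README; `parse_element` is a hypothetical helper, and it assumes labels and attribute values contain no embedded double quotes.

```python
import re

ELEMENT = re.compile(
    r'^(?P<ref>@e\d+) \[(?P<role>[^\]]+)\](?: "(?P<label>[^"]*)")?(?P<rest>.*)$'
)
ATTR = re.compile(r'([\w-]+)="([^"]*)"')

def parse_element(line: str) -> dict:
    """Parse one '@eN [role] "label" key="value" flag' line into a dict."""
    m = ELEMENT.match(line.strip())
    if m is None:
        raise ValueError(f"not an element line: {line!r}")
    rest = m.group("rest")
    return {
        "ref": m.group("ref"),
        "role": m.group("role"),
        "label": m.group("label"),            # None when the line has no label
        "attrs": dict(ATTR.findall(rest)),    # id=, href=, desc=, placeholder=
        "flags": ATTR.sub("", rest).split(),  # bare words: expanded, selected, required
    }

print(parse_element('@e2 [input:text] "Search" placeholder="..." required'))
print(parse_element('@e28 [link] href="/NousResearch/hermes-agent"'))
```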

Design decisions

Hand-written CDP types over generated bindings — ~30 commands, fast compile
Disk-based state between invocations — element maps, sessions, workflows as JSON
backendNodeId as element identifier — stable within a page load, selector hints for replay
Text output by default — optimized for LLM comprehension, not JSON parsing
Persistent browser profiles — cookies survive restarts, reduces bot detection
Single-threaded tokio — one thing at a time

Data storage

User scope — ~/.local/share/snact/ (Linux) or ~/Library/Application Support/snact/ (macOS):

snact/
├── element_map.json        # Current @eN → element mappings
├── heartbeat               # Last command timestamp (for --idle-timeout)
├── chrome-{port}.pid       # Chrome process ID
├── profiles/default/       # Persistent Chrome profile
├── sessions/{name}.json    # Saved browser sessions
├── workflows/{name}.json   # Recorded workflows (personal)
└── recording.json          # Active recording state

Project scope — .snact/ in the project directory (created by snact init, git-committable):

.snact/
└── workflows/{name}.json   # Shared workflows (team/repo)

Workflows save to project scope when .snact/ exists, otherwise user scope. On load, project scope takes priority.
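The scope rule can be expressed in a few lines. The sketch below restates it with `pathlib`; the function names are hypothetical, and the user-scope directory is passed in by the caller because it differs by platform (see the paths above).

```python
import tempfile
from pathlib import Path
from typing import Optional

def save_path(name: str, project_dir: Path, user_dir: Path) -> Path:
    """Project scope when .snact/ exists in the project, otherwise user scope."""
    project_scope = project_dir / ".snact"
    base = project_scope if project_scope.is_dir() else user_dir
    return base / "workflows" / f"{name}.json"

def load_path(name: str, project_dir: Path, user_dir: Path) -> Optional[Path]:
    """Project scope takes priority, then user scope; None if neither has the file."""
    for base in (project_dir / ".snact", user_dir):
        candidate = base / "workflows" / f"{name}.json"
        if candidate.is_file():
            return candidate
    return None

with tempfile.TemporaryDirectory() as tmp:
    project, user = Path(tmp) / "repo", Path(tmp) / "user"
    project.mkdir()
    user.mkdir()
    before = save_path("login-flow", project, user)   # no .snact/ yet: user scope
    (project / ".snact").mkdir()
    after = save_path("login-flow", project, user)    # .snact/ exists: project scope
    print(before.relative_to(tmp))
    print(after.relative_to(tmp))
```

One consequence of this rule is that a workflow recorded inside a repository with `.snact/` shadows a personal workflow of the same name on load.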

Contributing

See CONTRIBUTING.md for development setup, project structure, and commit conventions.

License

MIT
