snact
AI agent-optimized browser CLI — snap + act
snact lets AI agents control browsers with extreme token efficiency. One snap returns page structure, section content, and every actionable element — enough for an LLM to understand and act in a single turn.
$ snact snap https://www.apple.com/shop/buy-mac/macbook-pro
# Buy MacBook Pro
## Model. Choose your size.
> 14-inch — From $1,699 or $141.58/mo. | 16-inch — From $2,699 or $224.91/mo.
@e35 [input:radio] "14-inch" selected
@e36 [input:radio] "16-inch"
## Chip. Choose from these powerful options.
> M5 Pro — 12-core CPU, 16-core GPU | M5 Max — 16-core CPU, 40-core GPU
@e40 [link]
$ snact click @e36
ok
---
## Model. Choose your size. # ← auto re-snap included
> 16-inch — Available with M5 Pro or M5 Max
@e35 [input:radio] "14-inch"
@e36 [input:radio] "16-inch" selected
Every action automatically returns a fresh page snapshot — no manual re-snap needed.
Task: Visit npmjs.com for 10 React state management libraries (zustand, jotai, recoil, valtio, mobx, redux, xstate, effector, nanostores, legend-state). Collect weekly downloads, last publish date, unpacked size, and dependencies for each.
comparison-compressed.mp4
Both sides played at 16x speed. Left: Playwright MCP (5m 17s real time). Right: snact CLI (2m 39s real time).
| | snact CLI | Playwright CLI | Playwright MCP |
|---|---|---|---|
| Time | 2m 39s | 5m 10s | 5m 17s |
| Total tokens | 34.1K (17%) | 35.4K (18%) | 88K (44%) |
| Message tokens | 18.8K | 20.1K | 73.4K |
| Data accuracy | Correct | Correct | Correct |
snact finished in about half the time of either Playwright setup. Its token total was close to Playwright CLI's and less than half of Playwright MCP's.
- Speed: Both Playwright approaches took about 5 minutes; snact finished in 2m 39s.
- Token efficiency: snact and Playwright CLI used similar total tokens (~34-35K), while Playwright MCP consumed about 2.5x as many (88K) because accessibility tree snapshots accumulate in context.
- Answer quality: All three produced identical data with minor format differences.
Per-page token measurements
Measured with wc -c / 4 on actual snap output (1 token ≈ 4 chars):
| Site | snact (full) | snact (--focus) |
|---|---|---|
| example.com | 46 | — |
| GitHub Login | 172 | 60 |
| GitHub Trending | 2,152 | 614 |
| Hacker News | 2,670 | — |
| Apple MacBook Pro | 2,546 | — |
| StackOverflow | 4,363 | — |
| NYTimes | 2,417 | — |
Simple pages: 50-200 tokens. Typical pages: 2K-4K. With --focus: 60-600.
Playwright token estimates from scrolltest.medium.com (MCP ~114K per test session, CLI ~27K). snact numbers are directly measured.
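The `wc -c / 4` estimate used for the table can be wrapped in a small shell helper. This is a minimal sketch: `estimate_tokens` is a hypothetical name, and 4 characters per token is the same rough heuristic the table uses, not a real tokenizer.

```shell
# Hypothetical helper: estimate tokens for text on stdin as bytes / 4.
estimate_tokens() {
  # wc -c counts bytes; some implementations pad the number with spaces.
  chars=$(wc -c | tr -d '[:space:]')
  echo $(( chars / 4 ))
}
```

With snact installed, it could be used as `snact snap https://example.com | estimate_tokens`.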
Record task: Use snact to record a workflow called "npm-react-state" that visits npmjs.com for these 10 libraries. For each, snap the page and read the sidebar stats.
Replay task: Replay npm-react-state and build me an updated comparison table.
snact-replayx8-small.mp4
Played at 8x speed. First: record (2m 18s real time). Then: replay (47s real time).
The replay skips all LLM reasoning — it re-executes the recorded commands directly against Chrome and returns fresh data.
| | Record (first run) | Replay |
|---|---|---|
| Time | 2m 18s | 47s |
| LLM turns | ~20+ | 1 |
| Data | Fresh | Fresh (re-visits pages) |
| | Playwright MCP | Playwright CLI | snact |
|---|---|---|---|
| Architecture | Persistent MCP server | Daemon + CLI | Stateless CLI |
| After click/fill | Snapshot in response | Manual re-snapshot | Snapshot in response |
| Tokens per page | ~3K-50K | ~1K-13K | ~50-4K (measured) |
| Repeated tasks | Full LLM call | Full LLM call | 0 (workflow replay) |
| Session persistence | Config-based | --persistent flag | session save/load |
| Cron automation | Requires LLM API | Requires LLM API | Shell one-liner |
| Locale/Geo override | Via run-code | Via config | --locale / --geo flags |
| Install | npm + Playwright | npm + Playwright | Single binary (Rust) |
| Multi-browser | Chromium/FF/WebKit | Chromium/FF/WebKit | Chrome only |
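The "Cron automation" row can be made concrete with a small wrapper around `snact replay`. The sketch below is an assumption-laden illustration: `replay_to_file`, the script path in the crontab comment, and the output directory are hypothetical, and it assumes Chrome has to be launched before a replay, as in the other examples in this README.

```shell
# Hypothetical wrapper: replay a recorded workflow and save a dated copy of the output.
replay_to_file() {
  workflow=$1
  out_dir=$2
  mkdir -p "$out_dir" || return 1
  out_file="$out_dir/$workflow-$(date +%Y-%m-%d).txt"

  # Assumption: a running Chrome instance is required before replaying.
  snact browser launch --background >/dev/null || return 1

  # Replay re-executes the recorded commands; no LLM is involved.
  snact replay "$workflow" > "$out_file"
  rc=$?

  snact browser stop >/dev/null
  echo "$out_file"
  return "$rc"
}

# Example crontab entry (hypothetical script that sources this function and calls it):
# 0 9 * * * /usr/local/bin/snact-replay.sh npm-react-state "$HOME/snact-reports"
```

The function prints the path of the file it wrote, so a scheduled job can pass that path to a follow-up step such as a diff or a notification.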
# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/vericontext/snact/main/install.sh | bash
# Windows (PowerShell), experimental, see #3
irm https://raw.githubusercontent.com/vericontext/snact/main/install.ps1 | iex
# From source (all platforms)
cargo install --path crates/snact-cli
# Verify
snact --version

snact browser launch --background # 1. start Chrome
snact snap https://github.com/trending # 2. page structure + elements
snact click @e28 # 3. act (auto re-snap included)
snact browser stop # 4. done

snact snap https://github.com/trending
# Trending
## NousResearch / hermes-agent
> The agent that grows with you | Python | Star
@e28 [link] href="/NousResearch/hermes-agent"
## microsoft / markitdown
> Python tool for converting files and office documents to Markdown. | Python | Star
@e37 [link] href="/microsoft/markitdown"
Section headings group elements. > lines summarize content. Each @eN reference is stable until the next snap.
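Because each element line starts with its `@eN` reference followed by a quoted label, a script can look up a reference by label before acting on it. The helper below is a sketch under that assumption about the text format; `find_ref` is a hypothetical name, and the match is a plain substring search, so a label that also appears inside an attribute value would match too.

```shell
# Hypothetical helper: print the first @eN ref whose line contains the quoted label.
find_ref() {
  label=$1
  # Keep element lines only, match the quoted label literally, print the first ref.
  grep '^@e[0-9]' | grep -F "\"$label\"" | awk '{ print $1; exit }'
}
```

For example, `ref=$(snact snap https://www.apple.com/shop/buy-mac/macbook-pro | find_ref "16-inch")` followed by `snact click "$ref"`. The lookup has to use the most recent snap, since references are only stable until the next one.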
snact click @e28
ok
---
# NousResearch/hermes-agent
> The agent that grows with you. Build AI agents...
@e1 [link] "Code" href="/NousResearch/hermes-agent"
@e2 [link] "Issues" href="/NousResearch/hermes-agent/issues"
...
Every mutation (click, fill, type, select, scroll) returns a fresh snap. Use --no-snap to disable.
snact read https://example.com --focus="main"
# Example Domain
This domain is for use in documentation examples.
Learn more
snap = structure + elements + summaries. read = full text when you need more detail.
When snap/read can't capture dynamic content (e.g. Amazon product cards):
snact eval "JSON.stringify(Array.from(document.querySelectorAll('.product')).map(p => ({
title: p.querySelector('h2')?.textContent,
price: p.querySelector('.price')?.textContent
})))"

snact session save github # cookies + localStorage
snact session load github # restore later

snact record start login-flow
snact snap https://app.example.com/login
snact fill @e1 "user@example.com" --no-snap
snact click @e3 --no-snap
snact wait navigation
snact record stop
# Day 2, 3, 4... — no LLM, no tokens
snact replay login-flow

| Command | Description |
|---|---|
| `snap [url]` | Page structure + section summaries + interactable elements |
| `read [url]` | Full visible text as structured markdown |
| `click <@ref>` | Click element (returns updated snap) |
| `fill <@ref> <value>` | Set input value (returns updated snap) |
| `type <@ref> <text>` | Type character by character (returns updated snap) |
| `select <@ref> <value>` | Select dropdown option (returns updated snap) |
| `scroll [direction]` | Scroll page (returns updated snap) |
| `eval <expression>` | Execute JavaScript on the page |
| `screenshot [--file]` | Capture page as PNG |
| `wait <condition>` | Wait for navigation, CSS selector, or timeout (ms) |
| `session save\|load\|list\|delete` | Manage browser sessions |
| `record start\|stop\|list\|delete` | Record command sequences |
| `replay <name>` | Replay a recorded workflow |
| `browser launch\|stop\|status` | Manage Chrome instance |
| `schema [command]` | JSON Schema introspection |
| `mcp` | Start MCP server (JSON-RPC over stdio) |
| `init` | Create AGENT.md for Claude Code skill discovery |

--port <PORT> Chrome debugging port [default: 9222]
--output <FMT> Output format: text, json, ndjson [default: text]
--dry-run Preview action without executing
--no-snap Skip automatic re-snap after actions
--profile <NAME> Browser profile name [default: "default"] (browser launch)
--idle-timeout <MIN> Auto-stop Chrome after N minutes of inactivity (browser launch)
--lang <LANG> Accept-Language header [default: en-US]
--locale <LOCALE> JS navigator.language override (e.g. en-US, ja-JP)
--geo <LAT,LON> Geolocation override (e.g. "37.7749,-122.4194")
--user-agent <UA> Custom User-Agent string
--focus <SEL> CSS selector to limit scope (snap/read)
--verbose Debug logging
snact works as a native CLI tool — no MCP configuration needed:
snact browser launch --background
claude
# "Use snact to find the MacBook Pro M4 Pro price on apple.com"

Run snact init in your project directory to create an AGENT.md skill file for Claude Code.
For Claude Desktop or any MCP client:
{
"mcpServers": {
"snact": {
"command": "snact",
"args": ["mcp"]
}
}
}

snact snap https://example.com --output=json | jq '.elements | keys[]'
snact snap https://example.com --output=ndjson

graph TD
A["AI Agent (Claude, GPT, ...)"] -->|"CLI stdout/stdin"| B
A -->|"JSON-RPC stdio"| M
subgraph snact
B["snact-cli<br/><small>Thin CLI shell (clap)</small>"]
M["MCP Server<br/><small>JSON-RPC over stdio</small>"]
B --> C
M --> C
subgraph core["snact-core"]
C["Snap"] & D["Read"] & E["Action + snap"] & F["Record/Replay"]
C --> G["Element Map<br/><small>@eN refs</small>"]
E --> G
H["Session Storage"]
end
core --> I
I["snact-cdp<br/><small>WebSocket + ~30 hand-written CDP commands</small>"]
end
I -->|"WebSocket (CDP)"| J["Chrome"]
Three-crate workspace — cdp handles Chrome protocol, core is the library, cli is a thin shell. MCP server exposes the same core over JSON-RPC for Claude Desktop and other MCP clients.
How contextual snap works
- DOMSnapshot.captureSnapshot: Full flattened DOM including Shadow DOM
- Accessibility.getFullAXTree: Semantic roles, names, descriptions, properties
- Merge: Join DOM nodes with AX nodes by backendNodeId
- Extract context: Headings, text blocks (DOM + JS fallback for SPAs)
- Filter: Keep only interactable elements, exclude hidden/aria-hidden
- Compress: Group by section headings, add content summaries, assign @eN refs
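The merge, filter, and ref-assignment steps above can be illustrated on toy data. The sketch below is not snact's implementation (which lives in the Rust core); it is an awk join over two hypothetical tab-separated files, and the column layouts, the `merge_and_filter` name, and the list of interactable roles are all assumptions made for illustration.

```shell
# Illustrative only: join DOM and AX rows by backendNodeId, drop hidden and
# non-interactable nodes, then number the survivors as @eN refs.
merge_and_filter() {
  dom_tsv=$1   # assumed columns: backendNodeId, tag, hidden (0 or 1)
  ax_tsv=$2    # assumed columns: backendNodeId, role, name
  awk -F '\t' '
    # First file (AX nodes): index role and name by backendNodeId.
    FNR == NR { role[$1] = $2; name[$1] = $3; next }
    # Second file (DOM nodes): keep visible nodes with an interactable role.
    ($1 in role) && $3 == "0" && role[$1] ~ /^(button|link|textbox|radio|checkbox|combobox)$/ {
      printf "@e%d [%s] \"%s\"\n", ++n, role[$1], name[$1]
    }
  ' "$ax_tsv" "$dom_tsv"
}
```

Refs are numbered in DOM order after filtering, which is why hidden or purely structural nodes never consume an `@eN` slot in this sketch.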
Auto re-snap after actions
Every mutation action (click, fill, type, select, scroll) automatically:
- Executes the action via CDP
- Waits for settle — detects navigation (waits for page load, 3s timeout) or SPA mutation (300ms settle)
- Takes a fresh snap on the same transport connection
- Returns `ok\n---\n{snap output}` so the LLM sees updated state in one turn
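A script that consumes this output can separate the status line from the snapshot at the first `---` line. This is a minimal sketch assuming the text output has exactly the shape described above; `action_status` and `action_snap` are hypothetical helper names.

```shell
# Hypothetical helpers for output shaped like: ok, then ---, then the snap.

# Print the status line (the first line of the action output).
action_status() {
  sed -n '1p'
}

# Print everything after the first '---' separator (empty when no snap follows,
# for example when the action ran with --no-snap).
action_snap() {
  sed '1,/^---$/d'
}
```

For example, `out=$(snact click @e36)` followed by `printf '%s\n' "$out" | action_snap` would yield only the refreshed snapshot.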
Snap output format reference
## Section Heading
> Content summary: prices, options, descriptions (up to 300 chars)
@e1 [role] "label" id="..." href="..." expanded desc="Opens in new tab"
@e2 [input:text] "Search" placeholder="..." required
| Component | Purpose |
|---|---|
| `## Heading` | Page section structure (h1-h6) |
| `> summary` | Key text content from that section |
| `@eN` | Stable element reference for actions |
| `[role]` | Semantic role (button, link, textbox, etc.) |
| `"label"` | Accessible name |
| `id=`, `href=` | Key attributes |
| `expanded`, `collapsed` | Dropdown/accordion state |
| `selected` | Active tab/option |
| `required`, `readonly` | Form field constraints |
| `desc="..."` | Accessibility description |

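State flags such as `selected` or `required` appear as bare words after the label, so they can be checked from a script. The sketch below assumes that layout and that the role token contains no spaces; `refs_with_flag` is a hypothetical name. Quoted strings are stripped first so that a label or description containing the same word does not count as a flag.

```shell
# Hypothetical helper: print refs of element lines that carry a given bare flag.
refs_with_flag() {
  flag=$1
  # Remove quoted values, then look for the flag among the remaining fields.
  sed 's/"[^"]*"//g' |
    awk -v flag="$flag" '
      /^@e[0-9]+ / {
        for (i = 3; i <= NF; i++) if ($i == flag) { print $1; break }
      }'
}
```

After `snact click @e36`, piping the returned snap through `refs_with_flag selected` is one way to confirm that the intended option is now active.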
- Hand-written CDP types over generated bindings — ~30 commands, fast compile
- Disk-based state between invocations — element maps, sessions, workflows as JSON
- backendNodeId as element identifier: stable within a page load, selector hints for replay
- Text output by default: optimized for LLM comprehension, not JSON parsing
- Persistent browser profiles — cookies survive restarts, reduces bot detection
- Single-threaded tokio — one thing at a time
User scope — ~/.local/share/snact/ (Linux) or ~/Library/Application Support/snact/ (macOS):
snact/
├── element_map.json # Current @eN → element mappings
├── heartbeat # Last command timestamp (for --idle-timeout)
├── chrome-{port}.pid # Chrome process ID
├── profiles/default/ # Persistent Chrome profile
├── sessions/{name}.json # Saved browser sessions
├── workflows/{name}.json # Recorded workflows (personal)
└── recording.json # Active recording state
Project scope — .snact/ in the project directory (created by snact init, git-committable):
.snact/
└── workflows/{name}.json # Shared workflows (team/repo)
Workflows save to project scope when .snact/ exists, otherwise user scope. On load, project scope takes priority.
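The scope rules above amount to a short path lookup. The following sketch mirrors them in shell for illustration; the function names are hypothetical, the project and user directories are passed in explicitly, and snact's actual resolution logic may differ in detail.

```shell
# Illustrative only: where a workflow would be saved under the rules above.
workflow_save_path() {
  wf_name=$1; project_dir=$2; user_dir=$3
  if [ -d "$project_dir/.snact" ]; then
    echo "$project_dir/.snact/workflows/$wf_name.json"   # project scope
  else
    echo "$user_dir/workflows/$wf_name.json"             # user scope
  fi
}

# Illustrative only: which copy would be loaded (project scope takes priority).
workflow_load_path() {
  wf_name=$1; project_dir=$2; user_dir=$3
  for candidate in \
    "$project_dir/.snact/workflows/$wf_name.json" \
    "$user_dir/workflows/$wf_name.json"
  do
    if [ -f "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}
```

On Linux the user directory would be `~/.local/share/snact`, so `workflow_load_path login-flow "$PWD" "$HOME/.local/share/snact"` prints the file a replay of `login-flow` would be expected to use.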
See CONTRIBUTING.md for development setup, project structure, and commit conventions.
MIT