⚠️ Work in progress. Core functionality works; some features (e.g. CAPTCHA solving) are not yet implemented.
byp is a Python library and CLI that gives AI agents a reliable,
bot-detection-resistant browser interface. Instead of raw HTML, agents get a
structured accessibility tree they can reason over and act on — navigate, click,
type, screenshot — all from a simple CLI with JSON output designed for
subprocess-based tool use.
Most browser automation libraries are built for humans writing test scripts.
byp is built for agents:
- Structured page perception — ARIA accessibility tree output (main frame + all iframes), not raw HTML soup
- JSON-mode CLI — every command outputs clean JSON, making it trivial to wire up as an agent tool via subprocess
- Persistent named sessions — agents can resume browser state across calls without re-authenticating or losing cookies
- Anti-bot evasion — built on Camoufox, a Firefox fork with built-in fingerprint randomisation and UA consistency checks
- Proxy support — HTTP, SOCKS5, and SOCKS5h (remote DNS) with per-session persistence and DNS leak protection
| Feature | Status |
|---|---|
| Navigation, click, type, fill | ✅ Working |
| ARIA tree snapshot | ✅ Working |
| Screenshot (bytes + file) | ✅ Working |
| Persistent named sessions | ✅ Working |
| Proxy support (HTTP/SOCKS5/SOCKS5h) | ✅ Working |
| CAPTCHA solving (playwright-captcha) | 🚧 Planned |
| Async API | 🚧 Planned |
| MCP server mode | 🚧 Planned |
Requires Python 3.12+ and uv.
git clone https://github.com/nandrzej/byp
cd byp
uv sync
uv run byp navigate https://example.com --session my-session
uv run byp snapshot --session my-session --json
uv run byp click @e12 --session my-session --json
uv run byp type @e5 "search query" --session my-session --json
uv run byp screenshot --session my-session --path out.png
uv run byp navigate https://example.com --session my-session \
  --proxy socks5h://user:pass@proxy-host:1080
All commands support --json for machine-readable output, designed for
agent tool-call integration.
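As a rough illustration of the subprocess-based tool use described above, an agent harness could wrap the CLI along these lines. This is a minimal sketch, not part of byp: the helper names `build_byp_command` and `run_byp` are hypothetical, and it assumes each command prints one JSON document on stdout and exits non-zero on failure, which the README does not spell out.

```python
import json
import subprocess


def build_byp_command(action, *args, session, extra=()):
    """Assemble argv for one byp call, always requesting JSON output.

    Hypothetical helper: only the flags shown in the README
    (--session, --json) are used here.
    """
    return ["uv", "run", "byp", action, *args, "--session", session, "--json", *extra]


def run_byp(action, *args, session, timeout=60):
    """Run one byp command and parse its stdout as JSON.

    Assumption: stdout holds a single JSON document and failures are
    reported through a non-zero exit code (check=True raises then).
    """
    argv = build_byp_command(action, *args, session=session)
    proc = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    return json.loads(proc.stdout)


if __name__ == "__main__":
    # Requires a local byp checkout; run from the repository root.
    run_byp("navigate", "https://example.com", session="my-session")
    tree = run_byp("snapshot", session="my-session")
    print(tree.get("role"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the agent supplies free text, such as the argument to `type`.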
Agent
│
▼ subprocess / tool call (JSON in, JSON out)
CLI (byp/cli/main.py)
│
▼
BrowserDriver (byp/engine/driver.py)
├── Camoufox (Firefox + anti-fingerprinting)
├── SessionManager — persistent XDG-compliant user data dirs
├── EnginePatcher — UA consistency checks
└── AriaTreeProcessor — ARIA snapshot → structured JSON
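To make "XDG-compliant user data dirs" concrete, the sketch below shows how a per-session directory could be resolved under the XDG Base Directory rules. It is an assumption for illustration only: the function name `session_dir` and the `byp/sessions/<name>` layout are hypothetical, and the actual SessionManager may use a different base directory or structure.

```python
import os
from pathlib import Path


def session_dir(name, env=None):
    """Return a hypothetical per-session profile directory.

    Follows the XDG Base Directory rule for data files: use
    $XDG_DATA_HOME when set and non-empty, else $HOME/.local/share.
    The "byp/sessions" suffix is an assumed layout, not byp's real one.
    """
    env = os.environ if env is None else env
    base = env.get("XDG_DATA_HOME")
    if not base:
        home = env.get("HOME") or str(Path.home())
        base = str(Path(home) / ".local" / "share")
    return Path(base) / "byp" / "sessions" / name


if __name__ == "__main__":
    print(session_dir("my-session"))
```

Keeping one directory per session name is what would let cookies and logins survive across separate CLI invocations.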
Instead of raw HTML, the agent receives a structured accessibility tree:
{
  "role": "WebArea",
  "name": "Example Domain",
  "children": [
    { "role": "heading", "name": "Example Domain", "ref": "@e1" },
    { "role": "paragraph", "ref": "@e2" },
    { "role": "link", "name": "More information...", "ref": "@e3" }
  ]
}
Element refs (e.g. @e3) are used directly in click, type, and fill
commands.
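An agent usually needs to pick a ref out of the tree before acting on it. The following sketch walks a snapshot shaped like the example above and collects refs by role. The helpers `iter_nodes` and `find_refs` are hypothetical and assume only the `role`, `name`, `ref`, and `children` keys shown in that example.

```python
def iter_nodes(node):
    """Yield the node and all of its descendants, depth-first."""
    yield node
    for child in node.get("children", []):
        yield from iter_nodes(child)


def find_refs(tree, role, name_contains=None):
    """Collect refs of nodes with the given role.

    If name_contains is given, only nodes whose name includes it
    (case-insensitive) are kept. Nodes without a ref are skipped.
    """
    refs = []
    for node in iter_nodes(tree):
        if node.get("role") != role or "ref" not in node:
            continue
        name = node.get("name", "")
        if name_contains and name_contains.lower() not in name.lower():
            continue
        refs.append(node["ref"])
    return refs


snapshot = {
    "role": "WebArea",
    "name": "Example Domain",
    "children": [
        {"role": "heading", "name": "Example Domain", "ref": "@e1"},
        {"role": "paragraph", "ref": "@e2"},
        {"role": "link", "name": "More information...", "ref": "@e3"},
    ],
}

print(find_refs(snapshot, "link"))  # ['@e3']
```

The returned ref can then be passed straight to a command such as `uv run byp click @e3 --session my-session --json`.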
MIT