From e68a18d53f0eb36d877e2cb4a129adb3f8a8c6b7 Mon Sep 17 00:00:00 2001 From: Albert Date: Fri, 22 May 2026 10:51:22 +0800 Subject: [PATCH] docs: add browserbase agent operations guide --- .gitignore | 2 + README.md | 373 ++++++---------- README.zh-CN.md | 403 ++++++------------ docs/.vitepress/config.mts | 19 +- docs/adapters/social-comments.md | 83 ++-- docs/advanced/browserbase.md | 218 +++++----- docs/advanced/rate-limiter-plugin.md | 2 +- docs/developer/ai-workflow.md | 2 +- docs/guide/ai-agent-operations.md | 115 +++++ docs/guide/browser-bridge.md | 2 +- docs/guide/getting-started.md | 127 +++--- docs/guide/installation.md | 73 ++-- docs/guide/remote-orchestration.md | 2 +- docs/guide/skills.md | 38 ++ docs/guide/troubleshooting.md | 2 +- docs/index.md | 43 +- docs/zh/adapters/index.md | 2 + docs/zh/adapters/social-comments.md | 86 ++++ docs/zh/advanced/browserbase.md | 202 +++++++++ docs/zh/guide/ai-agent-operations.md | 115 +++++ docs/zh/guide/browser-bridge.md | 2 +- docs/zh/guide/getting-started.md | 115 +++-- docs/zh/guide/installation.md | 69 ++- docs/zh/guide/skills.md | 38 ++ docs/zh/index.md | 43 +- package.json | 6 +- skills/opencli-autofix/SKILL.md | 6 +- skills/opencli-browser/SKILL.md | 4 +- skills/opencli-browserbase/SKILL.md | 56 +++ .../references/commands.md | 60 +++ .../references/pooled-runs.md | 30 ++ .../references/security.md | 9 + skills/opencli-social-comments/SKILL.md | 48 +++ .../references/matrix.md | 11 + .../references/recipes.md | 45 ++ skills/opencli-usage/SKILL.md | 184 +++----- 36 files changed, 1637 insertions(+), 998 deletions(-) create mode 100644 docs/guide/ai-agent-operations.md create mode 100644 docs/guide/skills.md create mode 100644 docs/zh/adapters/social-comments.md create mode 100644 docs/zh/advanced/browserbase.md create mode 100644 docs/zh/guide/ai-agent-operations.md create mode 100644 docs/zh/guide/skills.md create mode 100644 skills/opencli-browserbase/SKILL.md create mode 100644 skills/opencli-browserbase/references/commands.md create mode 100644 skills/opencli-browserbase/references/pooled-runs.md create mode 100644 skills/opencli-browserbase/references/security.md create mode 100644 skills/opencli-social-comments/SKILL.md create mode 100644 skills/opencli-social-comments/references/matrix.md create mode 100644 skills/opencli-social-comments/references/recipes.md diff --git a/.gitignore b/.gitignore index 8a1b49d73..e95abdf8c 100644 --- a/.gitignore +++ b/.gitignore @@ -4,6 +4,8 @@ dist/ *.tsbuildinfo .opencli/ .worktrees/ +.agents/ +skills-lock.json .mcp.json *.log .DS_Store diff --git a/README.md b/README.md index 5dab8eeb0..57dbe5d49 100644 --- a/README.md +++ b/README.md @@ -1,310 +1,183 @@ # OpenCLI -> **Convert any website into a CLI & run Browser Use on your logged-in Chrome.** -> Turn websites, browser sessions, Electron apps, and local tools into deterministic interfaces for humans and AI agents. -> Or run Browser Use against any page — navigate, fill forms, click, extract, automate. +> Turn websites, logged-in browser profiles, Browserbase cloud browsers, Electron apps, and local CLIs into deterministic commands for humans and AI agents. [![中文文档](https://img.shields.io/badge/docs-%E4%B8%AD%E6%96%87-0F766E?style=flat-square)](./README.zh-CN.md) -[![npm](https://img.shields.io/npm/v/@jackwener/opencli?style=flat-square)](https://www.npmjs.com/package/@jackwener/opencli) -[![Node.js Version](https://img.shields.io/node/v/@jackwener/opencli?style=flat-square)](https://nodejs.org) -[![License](https://img.shields.io/npm/l/@jackwener/opencli?style=flat-square)](./LICENSE) +[![Node.js](https://img.shields.io/badge/node-%3E%3D20-339933?style=flat-square)](https://nodejs.org) +[![License](https://img.shields.io/badge/license-Apache--2.0-blue?style=flat-square)](./LICENSE) -OpenCLI gives you one surface for three different kinds of automation: +OpenCLI is an agent-facing CLI surface. It can: -- **Use built-in adapters** for sites like Bilibili, Zhihu, Xiaohongshu, Reddit, HackerNews, Twitter/X, and [many more](#built-in-commands). -- **Let AI Agents operate any website** — install the `opencli-browser` skill in your AI agent (Claude Code, Cursor, etc.), and it can navigate, click, type/fill, extract, and inspect any page through your logged-in browser via `opencli browser` primitives. -- **Write new adapters** end-to-end with `opencli browser` + the `opencli-adapter-author` skill, which guides from first recon through field decoding, code, and `opencli browser verify`. +- run 100+ built-in website and desktop adapters with stable JSON/table/CSV output +- reuse a local Chrome login through Browser Bridge +- run Browserbase sessions with persistent account Contexts, per-account proxy binding, and up to 10 concurrent cloud browsers +- execute social comment workflows for Reddit, Twitter/X, YouTube, Instagram, TikTok, Xiaohongshu, and supported LinkedIn timeline metadata +- expose skills that tell an AI agent which `opencli` command to call -It also works as a **CLI hub** for local tools such as `gh`, `docker`, `longbridge`, `tg`, `discord`, `wx`, `ntn` (Notion), and other binaries you register yourself, plus **desktop app adapters** for Electron apps like Cursor, Codex, Antigravity, and ChatGPT. +This fork is documented and maintained at [`albertcyhe/opencli`](https://github.com/albertcyhe/opencli). It is not published as a separate npm package yet; install it from source. Adapter/plugin import names still use `@jackwener/opencli`. ## Quick Start -### 1. Install OpenCLI - -OpenCLI requires **Node.js >= 21**. +OpenCLI requires Node.js 20 or newer. Install the current fork from source: ```bash node --version -npm install -g @jackwener/opencli +git clone git@github.com:albertcyhe/opencli.git +cd opencli +npm install +npm run build +npm link +opencli doctor +opencli list ``` -### 2. Install the Browser Bridge Extension - -OpenCLI connects to Chrome/Chromium through a lightweight Browser Bridge extension plus a small local daemon. The daemon auto-starts when needed. +For local Chrome login reuse, install the OpenCLI Browser Bridge extension from the [Chrome Web Store](https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk), or download the extension zip from [albertcyhe/opencli releases](https://github.com/albertcyhe/opencli/releases). -**Option A — Chrome Web Store (recommended):** -Install **OpenCLI** from the [Chrome Web Store](https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk). +## What An Agent Can Do -**Option B — Manual install:** -1. Download the latest `opencli-extension-v{version}.zip` from the GitHub [Releases page](https://github.com/jackwener/opencli/releases). -2. Unzip it, open `chrome://extensions`, and enable **Developer mode**. -3. Click **Load unpacked** and select the unzipped folder. - -### 3. Verify the setup +Discover commands at runtime instead of guessing: ```bash -opencli doctor +opencli list -f json +opencli reddit --help +opencli reddit get-comments --help ``` -### 4. Optional: name your Chrome profile - -Each Chrome profile runs its own OpenCLI extension instance. If you use multiple Chrome profiles, list the connected profiles and assign local aliases: +Run atomic adapter commands with machine-readable output: ```bash -opencli profile list -opencli profile rename work -opencli profile use work -opencli --profile work browser state +opencli hackernews top --limit 5 -f json +opencli reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" --limit 100 -f json +opencli twitter get-comments "https://x.com/user/status/123" --limit 50 -f json +opencli youtube comments "https://www.youtube.com/watch?v=VIDEO_ID" --limit 100 -f json +opencli xiaohongshu comments "https://www.xiaohongshu.com/search_result/?xsec_token=..." --with-replies -f json ``` -With only one connected profile, OpenCLI uses it automatically. With multiple connected profiles and no default, OpenCLI asks you to choose instead of guessing. +Use `-f json` by default when another agent or program will consume the result. -### 5. Run your first commands +## Browserbase Multi-Account Setup + +Set Browserbase credentials: ```bash -opencli list -opencli hackernews top --limit 5 -opencli bilibili hot --limit 5 +export BROWSERBASE_API_KEY=... +export BROWSERBASE_PROJECT_ID=... ``` -## For Humans - -Use OpenCLI directly when you want a reliable command instead of a live browser session: - -- `opencli list` shows every registered command. -- `opencli ` runs a built-in or generated adapter. -- `opencli external register mycli` exposes a local CLI through the same discovery surface. -- `opencli doctor` helps diagnose browser connectivity. - -## Extending OpenCLI - -If you want to add your own commands, start with the [Extending OpenCLI guide](./docs/guide/extending-opencli.md). README keeps this short; the guide covers the directory layout, source-control model, and install commands. - -| Need | Recommended path | -|------|------------------| -| Keep personal website commands in your own Git repo | `opencli plugin create` + `opencli plugin install file://...` | -| Quickly draft a private local adapter | `opencli browser init /` in `~/.opencli/clis/` | -| Modify an official adapter locally | `opencli adapter eject ` + `opencli adapter reset ` | -| Publish or install third-party commands | `opencli plugin install github:user/repo` | -| Wrap an existing local binary | `opencli external register ` | - -## For AI Agents - -OpenCLI's browser commands are designed to be used by AI Agents — not run manually. Install skills into your AI agent (Claude Code, Cursor, etc.), and the agent operates websites on your behalf using your logged-in Chrome session. - -### Install skills (also refreshes existing installs) +Create a proxy profile. External proxy passwords should live in environment variables: ```bash -npx skills add jackwener/opencli +export PROXY_REDDIT1_PASS=... + +opencli browserbase proxy add reddit1-proxy \ + --type external \ + --server http://133.169.0.110:60088 \ + --username p9SIbn0S6AoC \ + --password-env PROXY_REDDIT1_PASS ``` -Or install only what you need: +Create ten persistent login profiles and open Live View URLs for manual login: ```bash -npx skills add jackwener/opencli --skill opencli-adapter-author -npx skills add jackwener/opencli --skill opencli-autofix -npx skills add jackwener/opencli --skill opencli-browser -npx skills add jackwener/opencli --skill opencli-usage +opencli browserbase account bootstrap \ + --site reddit \ + --count 10 \ + --name-prefix reddit-main \ + --proxy reddit1-proxy \ + --open ``` -### Which skill to use - -| Skill | When to use | Example prompt to your AI agent | -|-------|------------|-------------------------------| -| **opencli-adapter-author** | Write a reusable adapter for a new site or add a command to an existing site | "Write an adapter for douyin trending" / "Make a command that grabs the top posts from this page" | -| **opencli-autofix** | Repair a broken adapter when a built-in command fails | "`opencli zhihu hot` is returning empty — fix it" | -| **opencli-browser** | Drive a real Chrome page ad-hoc — navigate, fill forms, click, extract | "Help me check my Xiaohongshu notifications" / "Help me fill out this form" / "Use browser commands to scrape this page" | -| **opencli-usage** | Quick reference for all OpenCLI commands and sites | "What commands does OpenCLI have for Twitter?" | - -### How it works - -Once `opencli-browser` is installed, your AI agent can: - -1. **Navigate** to any URL using your logged-in browser -2. **Read** page content via structured DOM snapshots (not screenshots) -3. **Interact** — click buttons, fill forms, select options, press keys -4. **Extract** data from the page or intercept network API responses -5. **Wait** for elements, text, or page transitions +Each account maps to one Browserbase Context. Cookies, localStorage, and IndexedDB stay in Browserbase; OpenCLI stores only metadata in `~/.opencli/browserbase.json`. -The agent handles all the `opencli browser` commands internally — you just describe what you want done in natural language. - -**Skill references:** -- [`skills/opencli-browser/SKILL.md`](./skills/opencli-browser/SKILL.md) — drive Chrome ad-hoc (navigate, fill forms, click, extract) -- [`skills/opencli-adapter-author/SKILL.md`](./skills/opencli-adapter-author/SKILL.md) — write a new adapter end-to-end -- [`skills/opencli-autofix/SKILL.md`](./skills/opencli-autofix/SKILL.md) — repair broken adapters -- [`skills/opencli-usage/SKILL.md`](./skills/opencli-usage/SKILL.md) — command and site reference - -Available browser commands include `open`, `state`, `click`, `type`, `fill`, `select`, `keys`, `wait`, `get`, `find`, `extract`, `frames`, `screenshot`, `scroll`, `back`, `eval`, `network`, `tab list`, `tab new`, `tab select`, `tab close`, `init`, `verify`, and `close`. - -`opencli browser` commands require a `` positional immediately after `browser`. `opencli browser work open ` and `opencli browser work tab new [url]` both return a target ID. Use `opencli browser work tab list` to inspect target IDs, then pass `--tab ` to route a command to a specific tab. `tab new` creates a new tab without changing the default browser target; only `tab select ` promotes that tab to the default target for later untargeted commands in the same session. - -## Writing a new adapter - -When the site you need is not yet covered, use the `opencli-adapter-author` skill end-to-end: - -1. **Recon** the site and pick a pattern (SPA / SSR / JSONP / Token / Streaming). -2. **Discover** the right endpoint — network inspection, initial state, bundle search, token trace, or interceptor fallback. -3. **Pick auth** — `PUBLIC` / `COOKIE` / `INTERCEPT` / `UI` / `LOCAL`. -4. **Decode** response fields and design output columns. -5. `opencli browser recon analyze ` → `opencli browser recon init /` → write adapter → `opencli browser recon verify /`. -6. Site knowledge persists to `~/.opencli/sites//` so the next adapter for the same site starts from context. - -## Prerequisites - -- **Node.js**: >= 21.0.0 (required for the standard npm install path) -- **Bun**: >= 1.0 (optional alternative runtime) -- **Chrome or Chromium** running and logged into the target site for browser-backed commands - -> **Important**: Browser-backed commands reuse your Chrome/Chromium login session. If you get empty data or permission-like failures, first confirm the site is already open and authenticated in Chrome/Chromium. - -## Configuration - -| Variable | Default | Description | -|----------|---------|-------------| -| `OPENCLI_DAEMON_PORT` | `19825` | HTTP port for the daemon-extension bridge | -| `OPENCLI_PROFILE` | — | Browser Bridge profile alias/contextId to use when multiple Chrome profiles are connected | -| `OPENCLI_WINDOW` | command default | Set to `foreground` or `background` to override Browser Bridge window placement. Browser-backed commands also accept `--window `. | -| `OPENCLI_BROWSER_CONNECT_TIMEOUT` | `30` | Seconds to wait for browser connection | -| `OPENCLI_BROWSER_COMMAND_TIMEOUT` | `60` | Seconds to wait for a single browser command | -| `OPENCLI_CDP_ENDPOINT` | — | Chrome DevTools Protocol endpoint for remote browser or Electron apps | -| `OPENCLI_CDP_TARGET` | — | Filter CDP targets by URL substring (e.g. `detail.1688.com`) | -| `BROWSERBASE_API_KEY` | — | API key for Browserbase cloud browser sessions, contexts, and account profiles | -| `BROWSERBASE_PROJECT_ID` | — | Browserbase project id required for context/account operations | -| `BROWSERBASE_SESSION_ID` | — | Existing Browserbase session ID for adapter browser commands; equivalent to root `--browserbase-session ` / legacy `--session ` | -| `OPENCLI_VERBOSE` | `false` | Enable verbose logging (`-v` flag also works) | -| `DEBUG_SNAPSHOT` | — | Set to `1` for DOM snapshot debug output | - -`opencli browser *` requires an explicit `` positional, uses a foreground browser window by default, and keeps that session's tab lease until `opencli browser close` or idle cleanup. Browser-backed adapters use a background adapter window and release one-shot tab leases by default. Interactive adapters can declare `siteSession: 'persistent'` to keep a stable site tab for continuity; pass `--site-session ephemeral` for a one-shot tab. - -For cloud browsers, OpenCLI can manage Browserbase account profiles, persistent -Contexts, proxy bindings, Live View login sessions, and parallel session pools: -`opencli browserbase account bootstrap ...`, -`opencli --browserbase-account ...`, and -`opencli run --browserbase ...`. Existing Browserbase sessions still work with -`--browserbase-session` or legacy `--session`. See -[`docs/advanced/browserbase.md`](./docs/advanced/browserbase.md). - -Social comment workflows are documented in -[`docs/adapters/social-comments.md`](./docs/adapters/social-comments.md), -including Twitter/X, YouTube, Reddit, LinkedIn, Instagram, TikTok, and -Xiaohongshu support. - -## Built-in Commands - -| Site | Commands | -|------|----------| -| **xiaohongshu** | `search` `note` `comments` `reply` `feed` `user` `download` `publish` `notifications` `creator-notes` `creator-notes-summary` `creator-note-detail` `creator-profile` `creator-stats` `delete-note` | -| **rednote** | `search` `note` `comments` `user` `download` `feed` `notifications` | -| **bilibili** | `hot` `search` `history` `feed` `ranking` `download` `comments` `reply` `dynamic` `favorite` `following` `me` `subtitle` `summary` `video` `user-videos` | -| **tieba** | `hot` `posts` `search` `read` | -| **hupu** | `hot` `search` `detail` `mentions` `reply` `like` `unlike` | -| **zhihu** | `hot` `search` `question` `download` `follow` `like` `favorite` `comment` `answer` | -| **hackernews** | `top` `new` `best` `ask` `show` `jobs` `search` `user` | -| **linkedin** | `connect` `inbox` `safe-send` `search` `sent-invitations` `thread-snapshot` `timeline` `salesnav-search` `salesnav-inbox` `salesnav-message` `salesnav-thread` | -| **reddit** | `hot` `frontpage` `popular` `search` `subreddit` `read` `get-comments` `user` `user-posts` `user-comments` `upvote` `upvoted` `save` `saved` `comment` `reply` `subscribe` `subscribed` | -| **twitter** | `trending` `search` `timeline` `tweets` `lists` `list-tweets` `list-add` `list-create` `list-remove` `bookmarks` `post` `download` `profile` `article` `like` `likes` `notifications` `device-follow` `get-comments` `reply` `reply-dm` `thread` `follow` `unfollow` `followers` `following` `block` `unblock` `bookmark` `unbookmark` `delete` `hide-reply` `accept` | -| **youtube** | `search` `video` `transcript` `comments` `reply` `reply-comment` `channel` `playlist` `feed` `history` `watch-later` `subscriptions` `like` `unlike` `subscribe` `unsubscribe` | -| **instagram** | `explore` `profile` `search` `search-posts` `user` `followers` `following` `follow` `unfollow` `like` `unlike` `comment` `get-comments` `reply` `save` `unsave` `saved` | -| **tiktok** | `explore` `search` `profile` `user` `following` `follow` `unfollow` `like` `unlike` `comment` `get-comments` `reply` `save` `unsave` `live` `notifications` `friends` | -| **claude** | `ask` `send` `new` `status` `read` `history` `detail` | -| **gemini** | `new` `ask` `image` `deep-research` `deep-research-result` | -| **notebooklm** | `status` `list` `open` `current` `get` `history` `summary` `note-list` `notes-get` `source-list` `source-get` `source-fulltext` `source-guide` | -| **amazon** | `bestsellers` `search` `product` `offer` `discussion` `movers-shakers` `new-releases` `rankings` | - -Curated highlights — **[→ see all 100+ supported sites & commands](./docs/adapters/index.md)** (douyin / weibo / spotify / 1688 / quark / nowcoder / google-scholar / hupu / xianyu / weread / weread-official / xiaoyuzhou / and more). - -## CLI Hub - -Unified passthrough for your existing command-line tools. Run `opencli ...` for any of: - -`gh` · `docker` · `vercel` · `wrangler` · `obsidian` · `longbridge` · `lark-cli` · `ntn(notion)` · `dws(DingTalk Workspace)` · `wecom-cli(企业微信)` · `tg(tg-cli)` · `discord(discord-cli)` · `wx(wx-cli)` - -Register your own with `opencli external register `; list everything with `opencli external list`. - -**Desktop app adapters** (Electron, via CDP): Cursor / Codex / Antigravity / ChatGPT App / ChatWise / Discord / Doubao — see [`docs/adapters/desktop/`](./docs/adapters/desktop/). - -## Download Support - -OpenCLI supports downloading images, videos, and articles from supported platforms. - -| Platform | Content Types | Notes | -|----------|---------------|-------| -| **xiaohongshu** | Images, Videos | Downloads all media from a note | -| **rednote** | Images, Videos | Downloads all media from a signed rednote note URL | -| **bilibili** | Videos | Requires `yt-dlp` installed | -| **twitter** | Images, Videos | From user media tab or single tweet | -| **douban** | Images | Poster / still image lists | -| **pixiv** | Images | Original-quality illustrations, multi-page | -| **1688** | Images, Videos | Downloads page-visible product media from item pages | -| **xiaoyuzhou** | Audio, Transcript | Downloads episode audio and transcript JSON/text with local credentials | -| **zhihu** | Articles (Markdown) | Exports with optional image download | -| **weixin** | Articles (Markdown) | WeChat Official Account articles | - -For video downloads, install `yt-dlp` first: `brew install yt-dlp` +After login, run one task through a named account: ```bash -opencli xiaohongshu download "https://www.xiaohongshu.com/search_result/?xsec_token=..." --output ./xhs -opencli xiaohongshu download "https://xhslink.com/..." --output ./xhs -opencli rednote download "https://www.rednote.com/search_result/?xsec_token=..." --output ./rednote -opencli bilibili download BV1xxx --output ./bilibili -opencli twitter download elonmusk --limit 20 --output ./twitter -opencli 1688 download 841141931191 --output ./1688-downloads -opencli xiaoyuzhou download 69b3b675772ac2295bfc01d0 --output ./xiaoyuzhou -opencli xiaoyuzhou transcript 69dd0c98e2c8be31551f6a33 --output ./xiaoyuzhou-transcripts +opencli --browserbase-account reddit-main-1 \ + reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" \ + --limit 100 \ + -f json ``` -`opencli xiaoyuzhou download` and `transcript` require local Xiaoyuzhou credentials in `~/.opencli/xiaoyuzhou.json`. - -## Output Formats - -All built-in commands support `--format` / `-f` with `table` (default), `json`, `yaml`, `md`, and `csv`. +Run a JSONL job file across up to ten accounts: ```bash -opencli bilibili hot -f json # Pipe to jq or LLMs -opencli bilibili hot -f csv # Spreadsheet-friendly -opencli bilibili hot -v # Verbose: show pipeline debug steps +opencli run --browserbase \ + --accounts reddit-main-1,reddit-main-2,reddit-main-3,reddit-main-4,reddit-main-5,reddit-main-6,reddit-main-7,reddit-main-8,reddit-main-9,reddit-main-10 \ + --parallel 10 \ + --pool-size 10 \ + jobs.jsonl ``` -## Exit Codes - -opencli follows Unix `sysexits.h` so CI / scripts can branch on failure mode: `0` success, `66` empty result, `69` Browser Bridge down, `75` timeout, `77` auth required, `78` config error, `130` Ctrl-C. Full reference: [docs/guide/exit-codes.md](./docs/guide/exit-codes.md). - -## Plugins - -Extend OpenCLI with community-contributed adapters: +Useful account and proxy operations: ```bash -opencli plugin install github:user/opencli-plugin-my-tool -opencli plugin list -opencli plugin update --all -opencli plugin uninstall my-tool +opencli browserbase account list +opencli browserbase account login reddit-main-1 --open --wait +opencli browserbase account check reddit-main-1 --command "reddit whoami" +opencli browserbase account set-proxy reddit-main-1 reddit1-proxy +opencli browserbase account clear-proxy reddit-main-1 +opencli browserbase account clear-login reddit-main-1 --recreate-context --delete-old-context +opencli browserbase account delete reddit-main-1 --delete-context + +opencli browserbase proxy list +opencli browserbase proxy get reddit1-proxy +opencli browserbase proxy update reddit1-proxy --server http://new-host:port +opencli browserbase proxy test reddit1-proxy +opencli browserbase proxy delete reddit1-proxy ``` -| Plugin | Type | Description | -|--------|------|-------------| -| [opencli-plugin-github-trending](https://github.com/ByteYue/opencli-plugin-github-trending) | JS | GitHub Trending repositories | -| [opencli-plugin-hot-digest](https://github.com/ByteYue/opencli-plugin-hot-digest) | JS | Multi-platform trending aggregator | -| [opencli-plugin-juejin](https://github.com/Astro-Han/opencli-plugin-juejin) | JS | 稀土掘金 (Juejin) hot articles | -| [opencli-plugin-vk](https://github.com/flobo3/opencli-plugin-vk) | JS | VK (VKontakte) wall, feed, and search | +Details: [Browserbase accounts, proxies, and parallel sessions](./docs/advanced/browserbase.md). -See [Plugins Guide](./docs/guide/plugins.md) for creating your own plugin. +## Social Comments -## Testing +| Platform | Read comments | Write support | +| --- | --- | --- | +| Reddit | `reddit read`, `reddit get-comments` | `reddit comment`, `reddit reply` | +| Twitter/X | `twitter get-comments`, `twitter thread` | `twitter reply`, `twitter post` | +| YouTube | `youtube comments` | `youtube reply`, `youtube reply-comment` | +| Instagram | `instagram get-comments` | `instagram comment`, `instagram reply` | +| TikTok | `tiktok get-comments` | `tiktok comment`, `tiktok reply` | +| Xiaohongshu | `xiaohongshu comments` | `xiaohongshu reply` | +| LinkedIn | `linkedin timeline` comment counts | comment threads are not exposed yet | -See **[TESTING.md](./TESTING.md)** for how to run and write tests. +See [Social Comment Support](./docs/adapters/social-comments.md) for platform-specific arguments, IDs, and safety notes. -## Troubleshooting +## Skills For AI Agents -- **"Extension not connected"** — Ensure the Browser Bridge extension is installed from the [Chrome Web Store](https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk) and **enabled** in `chrome://extensions`. -- **"attach failed: Cannot access a chrome-extension:// URL"** — Another extension may be interfering. Try disabling other extensions temporarily. -- **Empty data or 'Unauthorized' error** — Your Chrome/Chromium login session may have expired. Navigate to the target site and log in again. -- **Node API errors / missing `fetch` / startup crash on old Node** — OpenCLI requires **Node.js >= 21**. Run `node --version`, upgrade Node if needed, then retry. -- **Daemon issues** — Check status: `curl localhost:19825/status` · View logs: `curl localhost:19825/logs` +Install or refresh all OpenCLI skills from this fork: -## Star History +```bash +npx skills add albertcyhe/opencli +``` -[![Star History Chart](https://api.star-history.com/svg?repos=jackwener/opencli&type=Date)](https://star-history.com/#jackwener/opencli&Date) +Install only the skill you need: -## License +```bash +npx skills add albertcyhe/opencli --skill opencli-usage +npx skills add albertcyhe/opencli --skill opencli-browserbase +npx skills add albertcyhe/opencli --skill opencli-social-comments +npx skills add albertcyhe/opencli --skill opencli-browser +npx skills add albertcyhe/opencli --skill opencli-adapter-author +npx skills add albertcyhe/opencli --skill opencli-autofix +``` -[Apache-2.0](./LICENSE) +| Skill | Use it when the agent needs to | +| --- | --- | +| `opencli-usage` | discover OpenCLI capabilities and choose the next skill | +| `opencli-browserbase` | manage Browserbase accounts, contexts, sessions, proxies, or pooled runs | +| `opencli-social-comments` | read or write comments across supported social adapters | +| `opencli-browser` | drive a local Chrome page ad hoc through Browser Bridge | +| `opencli-adapter-author` | create or extend a reusable adapter | +| `opencli-autofix` | repair a broken adapter after a traced failure | + +Skill source files live in [`skills/`](./skills/). The docs entry point is [Skills for AI agents](./docs/guide/skills.md). + +## More Docs + +- [Getting Started](./docs/guide/getting-started.md) +- [AI Agent Operations](./docs/guide/ai-agent-operations.md) +- [Installation](./docs/guide/installation.md) +- [Browser Bridge](./docs/guide/browser-bridge.md) +- [All adapters](./docs/adapters/index.md) +- [Browserbase](./docs/advanced/browserbase.md) +- [Social comments](./docs/adapters/social-comments.md) diff --git a/README.zh-CN.md b/README.zh-CN.md index e4a9465bd..b9df9dc50 100644 --- a/README.zh-CN.md +++ b/README.zh-CN.md @@ -1,340 +1,181 @@ # OpenCLI -> **把任意网站变成 CLI & 在你的登录态浏览器上跑 Browser Use。** -> 把网站、浏览器会话、Electron 应用和本地工具,统一变成适合人类与 AI Agent 使用的确定性接口。 -> 或者在任意页面上跑 Browser Use —— 导航、填表单、点击、抓取、自动化。 +> 把网站、已登录浏览器、Browserbase 云端浏览器、Electron 应用和本地 CLI 变成稳定的命令接口,给人和 AI Agent 调用。 [![English](https://img.shields.io/badge/docs-English-1D4ED8?style=flat-square)](./README.md) -[![npm](https://img.shields.io/npm/v/@jackwener/opencli?style=flat-square)](https://www.npmjs.com/package/@jackwener/opencli) -[![Node.js Version](https://img.shields.io/node/v/@jackwener/opencli?style=flat-square)](https://nodejs.org) -[![License](https://img.shields.io/npm/l/@jackwener/opencli?style=flat-square)](./LICENSE) +[![Node.js](https://img.shields.io/badge/node-%3E%3D20-339933?style=flat-square)](https://nodejs.org) +[![License](https://img.shields.io/badge/license-Apache--2.0-blue?style=flat-square)](./LICENSE) -OpenCLI 可以用同一套 CLI 做三类事情: +OpenCLI 是面向 agent 的 CLI 入口。它可以: -- **直接使用现成适配器**:B站、知乎、小红书、Twitter/X、Reddit、HackerNews 等 [100+ 站点](#内置命令) 开箱即用。 -- **让 AI Agent 操作任意网站**:在你的 AI Agent(Claude Code、Cursor 等)中安装 `opencli-browser` skill,Agent 就能用你的已登录浏览器导航、点击、输入/填充、提取任意网页内容。 -- **把新网站写成 CLI**:用 `opencli browser` 原语 + `opencli-adapter-author` skill,从站点侦察、API 发现、字段解码到 `opencli browser verify` 一条龙。 +- 用稳定的 JSON/table/CSV 输出调用 100+ 网站和桌面应用 adapter +- 通过 Browser Bridge 复用本地 Chrome 登录态 +- 通过 Browserbase 管理持久 Context、账号登录态、账号绑定 proxy,以及最多 10 个并发云端 browser session +- 对 Reddit、Twitter/X、YouTube、Instagram、TikTok、小红书等平台执行 comments 相关原子操作 +- 提供 Codex/Claude/Cursor 等 agent 可安装的 skills,让 agent 知道该调用哪些 `opencli` 命令 -除了网站能力,OpenCLI 还是一个 **CLI 枢纽**:你可以把 `gh`、`docker`、`longbridge`、`tg`、`discord`、`wx`、`ntn`(Notion)等本地工具统一注册到 `opencli` 下,也可以通过桌面端适配器控制 Cursor、Codex、Antigravity、ChatGPT 等 Electron 应用。 +当前 fork 的文档和仓库地址是 [`albertcyhe/opencli`](https://github.com/albertcyhe/opencli)。这个 fork 还没有单独发布 npm 包,请从源码安装;adapter/plugin import 名暂时仍是 `@jackwener/opencli`。 ## 快速开始 -### 1. 安装 OpenCLI - -OpenCLI 要求 **Node.js >= 21**。 +OpenCLI 要求 Node.js 20 或更高版本。当前 fork 从源码安装: ```bash node --version -npm install -g @jackwener/opencli -``` - -### 2. 安装 Browser Bridge 扩展 - -OpenCLI 通过轻量 Browser Bridge 扩展和本地微型 daemon 与 Chrome/Chromium 通信。daemon 会按需自动启动。 - -**方式 A — Chrome Web Store(推荐):** -在 [Chrome Web Store](https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk) 安装 **OpenCLI** 扩展。 - -**方式 B — 手动安装:** -1. 到 GitHub [Releases 页面](https://github.com/jackwener/opencli/releases) 下载最新的 `opencli-extension-v{version}.zip`。 -2. 解压后打开 `chrome://extensions`,启用 **开发者模式**。 -3. 点击 **加载已解压的扩展程序**,选择解压后的目录。 - -### 3. 验证环境 - -```bash +git clone git@github.com:albertcyhe/opencli.git +cd opencli +npm install +npm run build +npm link opencli doctor -``` - -### 4. 跑第一个命令 - -```bash opencli list -opencli hackernews top --limit 5 -opencli bilibili hot --limit 5 ``` -## 给人类用户 - -如果你只是想稳定地调用网站或桌面应用能力,主路径很简单: - -- `opencli list` 查看当前所有命令 -- `opencli ` 调用内置或生成好的适配器 -- `opencli external register mycli` 把本地 CLI 接入同一发现入口 -- `opencli doctor` 处理浏览器连通性问题 - -## 扩展 OpenCLI - -如果你想新增自己的命令,先看 [扩展 OpenCLI](./docs/zh/guide/extending-opencli.md)。README 只保留入口;目录结构、源码管理方式和安装命令放在文档里。 - -| 需求 | 推荐路径 | -|------|----------| -| 把个人网站命令放在自己的 Git repo | `opencli plugin create` + `opencli plugin install file://...` | -| 快速写一个本机私人 adapter | `opencli browser init /`,放在 `~/.opencli/clis/` | -| 本地修改官方 adapter | `opencli adapter eject ` + `opencli adapter reset ` | -| 发布或安装第三方命令 | `opencli plugin install github:user/repo` | -| 包装已有本机 binary | `opencli external register ` | - -## 给 AI Agent +如果要复用本地 Chrome 登录态,请从 [Chrome Web Store](https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk) 安装 OpenCLI Browser Bridge 扩展,或从 [albertcyhe/opencli releases](https://github.com/albertcyhe/opencli/releases) 下载扩展 zip。 -OpenCLI 的 browser 命令是给 AI Agent 用的——不是手动执行的。把 skill 安装到你的 AI Agent(Claude Code、Cursor 等)中,Agent 就能用你的已登录 Chrome 会话替你操作网站。 +## Agent 能做什么 -### 安装 skill(同时也用于更新) +不要猜命令,先实时发现: ```bash -npx skills add jackwener/opencli +opencli list -f json +opencli reddit --help +opencli reddit get-comments --help ``` -或只装需要的 skill: +执行原子化 adapter 命令,给下游 agent 或程序读取时默认用 JSON: ```bash -npx skills add jackwener/opencli --skill opencli-adapter-author -npx skills add jackwener/opencli --skill opencli-autofix -npx skills add jackwener/opencli --skill opencli-browser -npx skills add jackwener/opencli --skill opencli-usage +opencli hackernews top --limit 5 -f json +opencli reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" --limit 100 -f json +opencli twitter get-comments "https://x.com/user/status/123" --limit 50 -f json +opencli youtube comments "https://www.youtube.com/watch?v=VIDEO_ID" --limit 100 -f json +opencli xiaohongshu comments "https://www.xiaohongshu.com/search_result/?xsec_token=..." --with-replies -f json ``` -### 选择哪个 skill - -| Skill | 适用场景 | 你对 AI Agent 说的话 | -|-------|---------|-------------------| -| **opencli-adapter-author** | 为新站点写可复用适配器,或给已有站点添加命令 | "帮我做一个抖音热门的适配器" / "帮我做一个抓取这个页面热帖的命令" | -| **opencli-autofix** | 内置命令失败时修复已有适配器 | "`opencli zhihu hot` 返回空了,修一下" | -| **opencli-browser** | 实时驱动 Chrome 页面——导航、填表单、点击、抓取 | "帮我看看小红书的通知" / "帮我填一下这个表单" / "用浏览器命令抓取这个页面" | -| **opencli-usage** | 所有命令和站点的快速参考 | "OpenCLI 有哪些 Twitter 相关的命令?" | - -### 工作原理 - -安装 `opencli-browser` skill 后,你的 AI Agent 可以: - -1. **导航**到任意 URL,使用你的已登录浏览器 -2. **读取**页面内容——通过结构化 DOM 快照(不是截图) -3. **交互**——点击按钮、填写表单、选择选项、按键 -4. **提取**页面数据或拦截网络 API 响应 -5. **等待**元素、文本或页面跳转 - -Agent 在内部自动处理所有 `opencli browser` 命令——你只需用自然语言描述想做的事。 - -**Skill 参考文档:** -- [`skills/opencli-browser/SKILL.md`](./skills/opencli-browser/SKILL.md) — 实时驱动 Chrome(导航、填表单、点击、抓取) -- [`skills/opencli-adapter-author/SKILL.md`](./skills/opencli-adapter-author/SKILL.md) — 给新站点写适配器,全流程 -- [`skills/opencli-autofix/SKILL.md`](./skills/opencli-autofix/SKILL.md) — 修复已有适配器 -- [`skills/opencli-usage/SKILL.md`](./skills/opencli-usage/SKILL.md) — 命令和站点参考 - -`browser` 可用命令包括:`open`、`state`、`click`、`type`、`fill`、`select`、`keys`、`wait`、`get`、`find`、`extract`、`frames`、`screenshot`、`scroll`、`back`、`eval`、`network`、`tab list`、`tab new`、`tab select`、`tab close`、`init`、`verify`、`close`。 - -`opencli browser` 命令必须紧跟一个 `` 位置参数。`opencli browser work open ` 和 `opencli browser work tab new [url]` 都会返回 target ID。`opencli browser work tab list` 用来查看当前已存在 tab 的 target ID,再通过 `--tab ` 把命令明确路由到某个 tab。`tab new` 只会新建 tab,不会改变默认浏览器目标;只有显式执行 `tab select `,才会把该 tab 设为同一 session 后续未指定 target 的默认目标。 - -## 为新站点写适配器 - -当你需要的网站还没覆盖时,用 `opencli-adapter-author` skill,全流程: - -1. **侦察**站点,分类 pattern(SPA / SSR / JSONP / Token / Streaming) -2. **发现** endpoint——network 精读、initial state、bundle 搜索、token 溯源,或 interceptor 兜底 -3. **定认证**——`PUBLIC` / `COOKIE` / `INTERCEPT` / `UI` / `LOCAL` -4. **字段解码** + 设计输出列 -5. `opencli browser recon analyze ` → `opencli browser recon init /` → 写适配器 → `opencli browser recon verify /` -6. 站点知识沉到 `~/.opencli/sites//`,下次同站点直接吃缓存 - -## 前置要求 - -- **Node.js**: >= 21.0.0(标准 npm 安装路径要求) -- **Bun**: >= 1.0(可选替代运行时) -- 浏览器型命令需要 Chrome 或 Chromium 处于运行中,并已登录目标网站 - -> **重要**:浏览器型命令直接复用你的 Chrome/Chromium 登录态。如果拿到空数据或出现权限类失败,先确认目标站点已经在浏览器里打开并完成登录。 - -## 配置 - -| 变量 | 默认值 | 说明 | -|------|--------|------| -| `OPENCLI_DAEMON_PORT` | `19825` | daemon-extension 通信端口 | -| `OPENCLI_WINDOW` | 命令默认值 | 设为 `foreground` 或 `background` 来覆盖 Browser Bridge 窗口位置。浏览器型命令也支持 `--window ` | -| `OPENCLI_BROWSER_CONNECT_TIMEOUT` | `30` | 浏览器连接超时(秒) | -| `OPENCLI_BROWSER_COMMAND_TIMEOUT` | `60` | 单个浏览器命令超时(秒) | -| `OPENCLI_CDP_ENDPOINT` | — | Chrome DevTools Protocol 端点,用于远程浏览器或 Electron 应用 | -| `OPENCLI_CDP_TARGET` | — | 按 URL 子串过滤 CDP target(如 `detail.1688.com`) | -| `BROWSERBASE_API_KEY` | — | Browserbase 云端浏览器、Context 和账号配置所需 API key | -| `BROWSERBASE_PROJECT_ID` | — | Browserbase context/account 操作所需 project id | -| `BROWSERBASE_SESSION_ID` | — | 现有 Browserbase session ID;等价于根参数 `--browserbase-session ` / 兼容参数 `--session ` | -| `OPENCLI_VERBOSE` | `false` | 启用详细日志(`-v` 也可以) | -| `DEBUG_SNAPSHOT` | — | 设为 `1` 输出 DOM 快照调试信息 | - -`opencli browser *` 必须紧跟一个 `` 位置参数,默认使用前台窗口,并保留该 session 的 tab lease,直到你手动执行 `opencli browser close` 或等空闲超时。浏览器型 adapter 默认使用后台 adapter 窗口并在命令结束后释放一次性 tab lease;如果需要调试最终页面,可以传 `--window foreground --keep-tab true`。 - -Browserbase 现在可以由 OpenCLI 直接管理账号配置、持久 Context、账号绑定 proxy、Live View 登录 session 和并发任务池:`opencli browserbase account bootstrap ...`、`opencli --browserbase-account ...`、`opencli run --browserbase ...`。已有 Browserbase session 仍可用 `--browserbase-session` 或兼容的 `--session`。详见 [`docs/advanced/browserbase.md`](./docs/advanced/browserbase.md)。 +## Browserbase 多账号多 Proxy -社交平台评论能力见 [`docs/adapters/social-comments.md`](./docs/adapters/social-comments.md),覆盖 Twitter/X、YouTube、Reddit、LinkedIn、Instagram、TikTok 和小红书。 - -## 内置命令 - -运行 `opencli list` 查看完整注册表。 - -| 站点 | 命令 | -|------|------| -| **xiaohongshu** | `search` `note` `comments` `reply` `notifications` `feed` `user` `download` `publish` `creator-notes` `creator-note-detail` `creator-notes-summary` `creator-profile` `creator-stats` `delete-note` | -| **bilibili** | `hot` `search` `me` `favorite` `history` `feed` `subtitle` `summary` `video` `comments` `dynamic` `ranking` `following` `user-videos` `download` | -| **zhihu** | `hot` `search` `question` `download` `follow` `like` `favorite` `comment` `answer` | -| **hackernews** | `top` `new` `best` `ask` `show` `jobs` `search` `user` | -| **linkedin** | `connect` `inbox` `safe-send` `search` `people-search` `sent-invitations` `thread-snapshot` `timeline` `salesnav-search` `salesnav-inbox` `salesnav-message` `salesnav-thread` | -| **reddit** | `hot` `frontpage` `popular` `search` `subreddit` `read` `get-comments` `user` `user-posts` `user-comments` `upvote` `save` `comment` `reply` `subscribe` `subscribed` `saved` `upvoted` | -| **twitter** | `trending` `search` `timeline` `tweets` `lists` `list-tweets` `list-add` `list-remove` `bookmarks` `profile` `thread` `get-comments` `following` `followers` `notifications` `post` `reply` `delete` `like` `likes` `article` `follow` `unfollow` `bookmark` `unbookmark` `download` `accept` `reply-dm` `block` `unblock` `hide-reply` | -| **youtube** | `search` `video` `transcript` `comments` `reply` `reply-comment` `channel` `playlist` `feed` `history` `watch-later` `subscriptions` `like` `unlike` `subscribe` `unsubscribe` | -| **instagram** | `explore` `profile` `search` `search-posts` `user` `followers` `following` `follow` `unfollow` `like` `unlike` `comment` `get-comments` `reply` `save` `unsave` `saved` | -| **tiktok** | `explore` `search` `profile` `user` `following` `follow` `unfollow` `like` `unlike` `comment` `get-comments` `reply` `save` `unsave` `live` `notifications` `friends` | -| **claude** | `ask` `send` `new` `status` `read` `history` `detail` | -| **gemini** | `new` `ask` `image` `deep-research` `deep-research-result` | -| **notebooklm** | `status` `list` `open` `current` `get` `history` `summary` `note-list` `notes-get` `source-list` `source-get` `source-fulltext` `source-guide` | -| **amazon** | `bestsellers` `search` `product` `offer` `discussion` `movers-shakers` `new-releases` `rankings` | - -精选清单 — **[→ 查看全部 100+ 站点和命令](./docs/adapters/index.md)**(小红书 / B站 / 知乎 / Twitter / Reddit / 抖音 / 微博 / 微信读书 / 小宇宙 / 1688 / 夸克 / Spotify / 牛客 / arxiv / Bilibili / 等)。 - -### 外部 CLI 枢纽 - -把现有命令行工具统一接入 `opencli ...`: - -`gh` · `docker` · `vercel` · `wrangler` · `obsidian` · `longbridge` · `lark-cli` · `ntn(notion)` · `dws(DingTalk Workspace)` · `wecom-cli(企业微信)` · `tg(tg-cli)` · `discord(discord-cli)` · `wx(wx-cli)` - -注册自定义本地 CLI:`opencli external register `;查看所有:`opencli external list`。 - -**桌面应用适配器**(Electron,通过 CDP):Cursor / Codex / Antigravity / ChatGPT App / ChatWise / Discord / Doubao — 详见 [`docs/adapters/desktop/`](./docs/adapters/desktop/)。 - -## 下载支持 - -OpenCLI 支持从各平台下载图片、视频和文章。 - -### 支持的平台 - -| 平台 | 内容类型 | 说明 | -|------|----------|------| -| **小红书** | 图片、视频 | 下载笔记中的所有媒体文件 | -| **B站** | 视频 | 需要安装 `yt-dlp` | -| **Twitter/X** | 图片、视频 | 从用户媒体页或单条推文下载 | -| **Pixiv** | 图片 | 下载原始画质插画,支持多页作品 | -| **1688** | 图片、视频 | 下载商品页中可见的商品素材 | -| **小宇宙** | 音频、转录 | 使用本地凭证下载单集音频和转录 JSON / 文本 | -| **知乎** | 文章(Markdown) | 导出文章,可选下载图片到本地 | -| **微信公众号** | 文章(Markdown) | 导出微信公众号文章为 Markdown | -| **豆瓣** | 图片 | 下载电影条目的海报 / 剧照图片 | - -### 前置依赖 - -下载流媒体平台的视频需要安装 `yt-dlp`: +先设置 Browserbase 凭证: ```bash -# 安装 yt-dlp -pip install yt-dlp -# 或者 -brew install yt-dlp +export BROWSERBASE_API_KEY=... +export BROWSERBASE_PROJECT_ID=... ``` -### 使用示例 +创建 proxy profile。外部 proxy 的密码推荐放在环境变量: ```bash -# 下载小红书笔记中的图片/视频 -opencli xiaohongshu download "https://www.xiaohongshu.com/search_result/?xsec_token=..." --output ./xhs -opencli xiaohongshu download "https://xhslink.com/..." --output ./xhs -opencli rednote download "https://www.rednote.com/search_result/?xsec_token=..." --output ./rednote - -# 下载B站视频(需要 yt-dlp) -opencli bilibili download BV1xxx --output ./bilibili -opencli bilibili download BV1xxx --quality 1080p # 指定画质 - -# 下载 Twitter 用户的媒体 -opencli twitter download elonmusk --limit 20 --output ./twitter - -# 下载单条推文的媒体 -opencli twitter download --tweet-url "https://x.com/user/status/123" --output ./twitter +export PROXY_REDDIT1_PASS=... -# 下载豆瓣电影海报 / 剧照 -opencli douban download 30382501 --output ./douban - -# 下载 1688 商品页中的图片 / 视频素材 -opencli 1688 download 841141931191 --output ./1688-downloads - -# 下载小宇宙单集音频 -opencli xiaoyuzhou download 69b3b675772ac2295bfc01d0 --output ./xiaoyuzhou - -# 下载小宇宙单集转录 -opencli xiaoyuzhou transcript 69dd0c98e2c8be31551f6a33 --output ./xiaoyuzhou-transcripts - -# 导出知乎文章为 Markdown -opencli zhihu download "https://zhuanlan.zhihu.com/p/xxx" --output ./zhihu - -# 导出并下载图片 -opencli zhihu download "https://zhuanlan.zhihu.com/p/xxx" --download-images - -# 导出微信公众号文章为 Markdown -opencli weixin download --url "https://mp.weixin.qq.com/s/xxx" --output ./weixin +opencli browserbase proxy add reddit1-proxy \ + --type external \ + --server http://133.169.0.110:60088 \ + --username p9SIbn0S6AoC \ + --password-env PROXY_REDDIT1_PASS ``` -`opencli xiaoyuzhou download` 和 `transcript` 需要本地小宇宙凭证:`~/.opencli/xiaoyuzhou.json`。 - - - -## 输出格式 - -所有内置命令都支持 `--format` / `-f`,可选值为 `table`、`json`、`yaml`、`md`、`csv`。 -`list` 命令也支持同样的格式参数,同时继续兼容 `--json`。 +创建 10 个持久登录账号,并打开 Live View URL 给人工登录: ```bash -opencli list -f yaml # 用 YAML 列出命令注册表 -opencli bilibili hot -f table # 默认:富文本表格 -opencli bilibili hot -f json # JSON(适合传给 jq 或者各类 AI Agent) -opencli bilibili hot -f yaml # YAML(更适合人类直接阅读) -opencli bilibili hot -f md # Markdown -opencli bilibili hot -f csv # CSV -opencli bilibili hot -v # 详细模式:展示管线执行步骤调试信息 +opencli browserbase account bootstrap \ + --site reddit \ + --count 10 \ + --name-prefix reddit-main \ + --proxy reddit1-proxy \ + --open ``` -## 退出码 +每个账号对应一个 Browserbase Context。cookies、localStorage、IndexedDB 保存在 Browserbase;OpenCLI 本地只保存 `~/.opencli/browserbase.json` 元数据。 -opencli 遵循 Unix `sysexits.h`,CI / 脚本可按失败模式分支:`0` 成功、`66` 无数据、`69` Browser Bridge 未连接、`75` 超时、`77` 需要认证、`78` 配置错误、`130` Ctrl-C。完整参考:[docs/zh/guide/exit-codes.md](./docs/zh/guide/exit-codes.md)。 +用指定账号跑一个任务: -## 插件 +```bash +opencli --browserbase-account reddit-main-1 \ + reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" \ + --limit 100 \ + -f json +``` -通过社区贡献的插件扩展 OpenCLI。插件使用与内置命令相同的 JS 格式,启动时自动发现。 +用最多 10 个账号并发跑 JSONL 任务: ```bash -opencli plugin install github:user/opencli-plugin-my-tool # 安装 -opencli plugin list # 查看已安装 -opencli plugin update my-tool # 更新到最新 -opencli plugin update --all # 更新全部已安装插件 -opencli plugin uninstall my-tool # 卸载 +opencli run --browserbase \ + --accounts reddit-main-1,reddit-main-2,reddit-main-3,reddit-main-4,reddit-main-5,reddit-main-6,reddit-main-7,reddit-main-8,reddit-main-9,reddit-main-10 \ + --parallel 10 \ + --pool-size 10 \ + jobs.jsonl ``` -当 plugin 的版本被记录到 `~/.opencli/plugins.lock.json` 后,`opencli plugin list` 也会显示对应的短 commit hash。 +常用账号和 proxy 管理命令: -| 插件 | 类型 | 描述 | -|------|------|------| -| [opencli-plugin-github-trending](https://github.com/ByteYue/opencli-plugin-github-trending) | JS | GitHub Trending 仓库 | -| [opencli-plugin-hot-digest](https://github.com/ByteYue/opencli-plugin-hot-digest) | JS | 多平台热榜聚合 | -| [opencli-plugin-juejin](https://github.com/Astro-Han/opencli-plugin-juejin) | JS | 稀土掘金热门文章 | -| [opencli-plugin-vk](https://github.com/flobo3/opencli-plugin-vk) | JS | VK (VKontakte) 动态、信息流和搜索 | +```bash +opencli browserbase account list +opencli browserbase account login reddit-main-1 --open --wait +opencli browserbase account check reddit-main-1 --command "reddit whoami" +opencli browserbase account set-proxy reddit-main-1 reddit1-proxy +opencli browserbase account clear-proxy reddit-main-1 +opencli browserbase account clear-login reddit-main-1 --recreate-context --delete-old-context +opencli browserbase account delete reddit-main-1 --delete-context + +opencli browserbase proxy list +opencli browserbase proxy get reddit1-proxy +opencli browserbase proxy update reddit1-proxy --server http://new-host:port +opencli browserbase proxy test reddit1-proxy +opencli browserbase proxy delete reddit1-proxy +``` -详见 [插件指南](./docs/zh/guide/plugins.md) 了解如何创建自己的插件。 +完整说明见 [Browserbase 账号、Proxy 与并发 Session](./docs/zh/advanced/browserbase.md)。 -## 常见问题排查 +## 社交平台 Comments -- **"Extension not connected" 报错** - - 确保你已从 [Chrome Web Store](https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk) 安装 OpenCLI 扩展,且在 `chrome://extensions` 中**已启用**。 -- **"attach failed: Cannot access a chrome-extension:// URL" 报错** - - 其他 Chrome/Chromium 扩展(如 youmind、New Tab Override 或 AI 助手类扩展)可能产生冲突。请尝试**暂时禁用其他扩展**后重试。 -- **返回空数据,或者报错 "Unauthorized"** - - Chrome/Chromium 里的登录态可能已经过期。请打开当前页面,在新标签页重新手工登录或刷新该页面。 -- **Node API 错误 / 缺少 `fetch` / 旧 Node 启动即崩** - - OpenCLI 要求 **Node.js >= 21**。先执行 `node --version`,如果版本过低先升级,再重试命令。 -- **Daemon 问题** - - 检查 daemon 状态:`curl localhost:19825/status` - - 查看扩展日志:`curl localhost:19825/logs` +| 平台 | 读取评论 | 写入支持 | +| --- | --- | --- | +| Reddit | `reddit read`, `reddit get-comments` | `reddit comment`, `reddit reply` | +| Twitter/X | `twitter get-comments`, `twitter thread` | `twitter reply`, `twitter post` | +| YouTube | `youtube comments` | `youtube reply`, `youtube reply-comment` | +| Instagram | `instagram get-comments` | `instagram comment`, `instagram reply` | +| TikTok | `tiktok get-comments` | `tiktok comment`, `tiktok reply` | +| 小红书 | `xiaohongshu comments` | `xiaohongshu reply` | +| LinkedIn | `linkedin timeline` 评论数量 | 暂未暴露评论 thread 读写 | +平台参数、ID 和安全说明见 [Social Comment Support](./docs/zh/adapters/social-comments.md)。 -## Star History +## 给 AI Agent 安装 Skills -[![Star History Chart](https://api.star-history.com/svg?repos=jackwener/opencli&type=Date)](https://star-history.com/#jackwener/opencli&Date) +安装或刷新当前 fork 的所有 OpenCLI skills: +```bash +npx skills add albertcyhe/opencli +``` +只安装需要的 skill: -## License +```bash +npx skills add albertcyhe/opencli --skill opencli-usage +npx skills add albertcyhe/opencli --skill opencli-browserbase +npx skills add albertcyhe/opencli --skill opencli-social-comments +npx skills add albertcyhe/opencli --skill opencli-browser +npx skills add albertcyhe/opencli --skill opencli-adapter-author +npx skills add albertcyhe/opencli --skill opencli-autofix +``` -[Apache-2.0](./LICENSE) +| Skill | 适合场景 | +| --- | --- | +| `opencli-usage` | 总入口:发现能力并选择下一步 skill | +| `opencli-browserbase` | 管理 Browserbase account/context/session/proxy/并发任务 | +| `opencli-social-comments` | 读取或写入社交平台评论 | +| `opencli-browser` | 通过本地 Browser Bridge 临时操作 Chrome 页面 | +| `opencli-adapter-author` | 新增或扩展可复用 adapter | +| `opencli-autofix` | 基于 trace 修复失效 adapter | + +Skill 源码在 [`skills/`](./skills/),文档入口见 [给 AI Agent 的 Skills](./docs/zh/guide/skills.md)。 + +## 更多文档 + +- [快速开始](./docs/zh/guide/getting-started.md) +- [AI Agent 操作指南](./docs/zh/guide/ai-agent-operations.md) +- [安装](./docs/zh/guide/installation.md) +- [Browser Bridge](./docs/zh/guide/browser-bridge.md) +- [所有适配器](./docs/zh/adapters/index.md) +- [Browserbase](./docs/zh/advanced/browserbase.md) +- [社交平台 comments](./docs/zh/adapters/social-comments.md) diff --git a/docs/.vitepress/config.mts b/docs/.vitepress/config.mts index 917c5796d..f3ebd0070 100644 --- a/docs/.vitepress/config.mts +++ b/docs/.vitepress/config.mts @@ -18,9 +18,10 @@ export default defineConfig({ themeConfig: { nav: [ { text: 'Guide', link: '/guide/getting-started' }, + { text: 'Agent Ops', link: '/guide/ai-agent-operations' }, { text: 'Adapters', link: '/adapters/' }, { text: 'Developer', link: '/developer/contributing' }, - { text: 'Advanced', link: '/advanced/cdp' }, + { text: 'Advanced', link: '/advanced/browserbase' }, ], sidebar: { '/guide/': [ @@ -29,6 +30,8 @@ export default defineConfig({ items: [ { text: 'Getting Started', link: '/guide/getting-started' }, { text: 'Installation', link: '/guide/installation' }, + { text: 'AI Agent Operations', link: '/guide/ai-agent-operations' }, + { text: 'Skills for AI Agents', link: '/guide/skills' }, { text: 'Comparison', link: '/comparison' }, { text: 'Browser Bridge', link: '/guide/browser-bridge' }, { text: 'Remote Orchestration', link: '/guide/remote-orchestration' }, @@ -206,9 +209,10 @@ export default defineConfig({ themeConfig: { nav: [ { text: '指南', link: '/zh/guide/getting-started' }, + { text: 'Agent 操作', link: '/zh/guide/ai-agent-operations' }, { text: '适配器', link: '/zh/adapters/' }, { text: '开发者', link: '/zh/developer/contributing' }, - { text: '进阶', link: '/zh/advanced/cdp' }, + { text: '进阶', link: '/zh/advanced/browserbase' }, ], sidebar: { '/zh/guide/': [ @@ -217,6 +221,8 @@ export default defineConfig({ items: [ { text: '快速开始', link: '/zh/guide/getting-started' }, { text: '安装', link: '/zh/guide/installation' }, + { text: 'AI Agent 操作指南', link: '/zh/guide/ai-agent-operations' }, + { text: '给 AI Agent 的 Skills', link: '/zh/guide/skills' }, { text: 'Browser Bridge', link: '/zh/guide/browser-bridge' }, { text: '给新 Electron 应用生成 CLI', link: '/zh/guide/electron-app-cli' }, { text: '扩展 OpenCLI', link: '/zh/guide/extending-opencli' }, @@ -229,6 +235,7 @@ export default defineConfig({ text: '适配器概览', items: [ { text: '所有适配器', link: '/zh/adapters/' }, + { text: '社交平台 Comments', link: '/zh/adapters/social-comments' }, ], }, ], @@ -245,6 +252,7 @@ export default defineConfig({ text: '进阶', items: [ { text: 'Chrome DevTools Protocol', link: '/zh/advanced/cdp' }, + { text: 'Browserbase', link: '/zh/advanced/browserbase' }, ], }, ], @@ -259,18 +267,17 @@ export default defineConfig({ }, socialLinks: [ - { icon: 'github', link: 'https://github.com/jackwener/opencli' }, - { icon: 'npm', link: 'https://www.npmjs.com/package/@jackwener/opencli' }, + { icon: 'github', link: 'https://github.com/albertcyhe/opencli' }, ], editLink: { - pattern: 'https://github.com/jackwener/opencli/edit/main/docs/:path', + pattern: 'https://github.com/albertcyhe/opencli/edit/main/docs/:path', text: 'Edit this page on GitHub', }, footer: { message: 'Released under the Apache-2.0 License.', - copyright: 'Copyright © 2024-present jackwener', + copyright: 'Copyright © 2024-present OpenCLI contributors', }, }, }) diff --git a/docs/adapters/social-comments.md b/docs/adapters/social-comments.md index 986d4801e..1628ff3b9 100644 --- a/docs/adapters/social-comments.md +++ b/docs/adapters/social-comments.md @@ -1,41 +1,47 @@ # Social Comment Support -OpenCLI exposes comment workflows for the major social adapters through the -same browser-backed runtime used by normal site commands. These commands reuse -your logged-in browser session, or a Browserbase account profile when you pass -`--browserbase-account`. +OpenCLI exposes social comment workflows as atomic adapter commands. Use them directly, or route them through Browserbase accounts when identity, login state, or proxy matters. ## Support Matrix -| Platform | Read comments | Post top-level comment | Reply to comment | Notes | -|----------|---------------|------------------------|------------------|-------| -| Twitter / X | `twitter get-comments` | `twitter reply` | `twitter reply` | Tweet replies are represented as comment rows with reply-able `comment_id` values. | -| YouTube | `youtube comments` | `youtube reply` | `youtube reply-comment` | `reply-comment` needs the original video URL for page context. | -| Reddit | `reddit read`, `reddit get-comments` | `reddit comment` | `reddit reply` | `read --expand-more` expands threaded comments; `get-comments` returns a flat top-level list with IDs. | -| LinkedIn | `linkedin timeline` comment counts | Not exposed | Not exposed | Current LinkedIn support reports timeline post `comments` counts, but does not extract or write comment threads. | -| Instagram | `instagram get-comments` | `instagram comment` | `instagram reply` | `get-comments` accepts a username plus post index, or a post/reel URL. | -| TikTok | `tiktok get-comments` | `tiktok comment` | `tiktok reply` | Reply can fall back to `--comment-text` and `--comment-author` when an ID cannot be matched. | -| Xiaohongshu | `xiaohongshu comments` | Not exposed | `xiaohongshu reply` | `comments --with-replies` includes nested replies; note URLs need `xsec_token`. | +| Platform | Read comments | Top-level write | Reply write | Browserbase account example | +| --- | --- | --- | --- | --- | +| Reddit | `reddit read`, `reddit get-comments` | `reddit comment` | `reddit reply` | `opencli --browserbase-account reddit1 reddit get-comments -f json` | +| Twitter / X | `twitter get-comments`, `twitter thread` | `twitter post` / tweet reply with `twitter reply` | `twitter reply` | `opencli --browserbase-account x-main-1 twitter get-comments -f json` | +| YouTube | `youtube comments` | `youtube reply` | `youtube reply-comment` | `opencli --browserbase-account yt1 youtube comments -f json` | +| Instagram | `instagram get-comments` | `instagram comment` | `instagram reply` | `opencli --browserbase-account ig1 instagram get-comments -f json` | +| TikTok | `tiktok get-comments` | `tiktok comment` | `tiktok reply` | `opencli --browserbase-account tiktok1 tiktok get-comments -f json` | +| Xiaohongshu | `xiaohongshu comments` | Not exposed | `xiaohongshu reply` | `opencli --browserbase-account xhs1 xiaohongshu comments -f json` | +| LinkedIn | `linkedin timeline` comment counts | Not exposed | Not exposed | `opencli --browserbase-account linkedin1 linkedin timeline -f json` | -## Common Patterns +## Read Recipes -Read comments in JSON so the returned IDs can be reused: +Read comments in JSON so returned IDs can be reused by reply commands: ```bash -opencli reddit get-comments https://www.reddit.com/r/example/comments/1abc123/title/ --limit 100 -f json -opencli twitter get-comments https://x.com/user/status/123 --limit 50 -f json +opencli reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" --limit 100 -f json +opencli reddit read "https://www.reddit.com/r/example/comments/1abc123/title/" --expand-more -f json + +opencli twitter get-comments "https://x.com/user/status/123" --limit 50 -f json +opencli twitter thread "https://x.com/user/status/123" -f json + opencli youtube comments "https://www.youtube.com/watch?v=VIDEO_ID" --limit 100 -f json -opencli instagram get-comments https://www.instagram.com/p/SHORTCODE/ --limit 50 -f json +opencli instagram get-comments "https://www.instagram.com/p/SHORTCODE/" --limit 50 -f json opencli tiktok get-comments "https://www.tiktok.com/@user/video/123" --limit 50 -f json opencli xiaohongshu comments "https://www.xiaohongshu.com/search_result/?xsec_token=..." --with-replies -f json +opencli linkedin timeline --limit 20 -f json ``` -Post or reply only when you intentionally want a write action: +## Write Recipes + +Only run write commands when the user explicitly wants to post or reply. ```bash opencli reddit comment 1abc123 "Comment text" opencli reddit reply t1_okf3s7u "Reply text" +opencli twitter reply "https://x.com/user/status/123" "Reply text" + opencli youtube reply "https://www.youtube.com/watch?v=VIDEO_ID" "Comment text" opencli youtube reply-comment Ugxxx "Reply text" --url "https://www.youtube.com/watch?v=VIDEO_ID" @@ -48,26 +54,33 @@ opencli tiktok reply "https://www.tiktok.com/@user/video/123" "COMMENT_ID" "Repl opencli xiaohongshu reply "https://www.xiaohongshu.com/search_result/?xsec_token=..." "COMMENT_ID" "Reply text" ``` -## Browserbase Accounts +## Browserbase Identity Pattern + +Use a Browserbase account when login state and exit IP should travel together: + +```bash +opencli browserbase account set-proxy reddit1 reddit1-proxy + +opencli --browserbase-account reddit1 \ + reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" \ + --limit 100 \ + -f json +``` -Use Browserbase when the comment workflow needs a cloud browser, a fixed proxy, -or multiple logged-in identities: +For many jobs: ```bash -opencli --browserbase-account reddit1 reddit get-comments https://www.reddit.com/r/example/comments/1abc123/title/ -opencli --browserbase-account x-main-1 twitter get-comments https://x.com/user/status/123 +opencli run --browserbase \ + --accounts reddit1,reddit2,reddit3 \ + --parallel 3 \ + jobs.jsonl ``` -For durable login state, proxy CRUD, account-to-proxy binding, and pooled -parallel sessions, see [Browserbase Accounts, Login State, Proxies, and Parallel Sessions](../advanced/browserbase.md). +See [Browserbase Accounts, Proxies, and Parallel Sessions](../advanced/browserbase.md). -## Limits And Safety +## Safety Notes -- Read commands return the rows exposed by the platform's current web APIs or - UI state. Some platforms hide replies behind paginated "more" affordances or - rate limits. -- Write commands require a valid logged-in session. They raise typed errors on - auth walls, malformed IDs, captcha/rate-limit states, or failed post-click - verification instead of returning silent success rows. -- Treat returned comment IDs as platform-scoped IDs. Reuse them only with the - same platform adapter that produced them. +- Read commands return what the current platform API or UI exposes; pagination, hidden replies, auth walls, and rate limits can reduce coverage. +- Write commands require a valid logged-in identity and may have irreversible social effects. Do not run them without explicit user intent. +- Treat comment IDs as platform-scoped. Reuse a returned ID only with the same platform adapter. +- Do not print cookies, proxy passwords, Browserbase connect URLs, or API keys in final output. diff --git a/docs/advanced/browserbase.md b/docs/advanced/browserbase.md index a46c84a36..bbc071be4 100644 --- a/docs/advanced/browserbase.md +++ b/docs/advanced/browserbase.md @@ -1,159 +1,164 @@ -# Browserbase Accounts, Login State, Proxies, and Parallel Sessions +# Browserbase Accounts, Proxies, and Parallel Sessions -OpenCLI can run browser-backed adapters through Browserbase in three ways: +OpenCLI treats Browserbase as three first-class concepts: -- an explicit Browserbase session (`--browserbase-session` or legacy `--session`) -- an OpenCLI Browserbase account profile (`--browserbase-account`) -- a pooled JSONL run (`opencli run --browserbase`) +- **Account profile**: local OpenCLI metadata with a name, site tag, Browserbase `contextId`, default proxy, and login-state timestamps. +- **Context**: Browserbase's durable browser profile. It stores cookies, localStorage, IndexedDB, and other login state. +- **Session**: a short-lived Browserbase browser instance used for manual Live View login or automation. -The durable login state is stored in Browserbase Contexts. Browserbase Sessions -are short-lived browser instances used for manual login or automation, and proxy -settings are applied when a session is created. Binding an account profile to a -proxy means every future session for that account uses the same exit path unless -you override it. +Proxy is a session creation setting. Binding a proxy to an account means future sessions for that account use that exit path. Existing sessions are not changed. -## Configure +## First 15 Minutes + +Configure credentials: ```bash export BROWSERBASE_API_KEY=... export BROWSERBASE_PROJECT_ID=... ``` -OpenCLI stores only local metadata at `~/.opencli/browserbase.json` with `0600` -permissions: account names, site tags, Browserbase `contextId` values, proxy -profile references, and login-state timestamps. Cookies, localStorage, and -IndexedDB stay inside Browserbase Contexts. Proxy passwords should be referenced -through environment variables. - -Add a proxy profile: - -```bash -opencli browserbase proxy add us-ny \ - --type browserbase \ - --country US \ - --state NY \ - --city "New York" -``` - -External proxies should use an environment variable for the password: +Create a proxy. Prefer `--password-env` for external proxy passwords: ```bash -export PROXY_DC1_PASS=... +export PROXY_REDDIT1_PASS=... -opencli browserbase proxy add dc1 \ +opencli browserbase proxy add reddit1-proxy \ --type external \ --server http://host:port \ --username user \ - --password-env PROXY_DC1_PASS + --password-env PROXY_REDDIT1_PASS ``` -Proxy CRUD: +Create accounts and open Live View login sessions: ```bash -opencli browserbase proxy list -opencli browserbase proxy get dc1 -opencli browserbase proxy update dc1 --server http://new-host:port -opencli browserbase proxy test dc1 -opencli browserbase proxy delete dc1 +opencli browserbase account bootstrap \ + --site reddit \ + --count 10 \ + --name-prefix reddit-main \ + --proxy reddit1-proxy \ + --open ``` -`proxy delete` refuses to remove a proxy that accounts still reference. Use -`--force` when you intentionally want OpenCLI to clear those account references. +Log in manually in each Live View URL, then mark accounts ready or use `login --wait` for one account: -## Create Login Profiles +```bash +opencli browserbase account login reddit-main-1 --open --wait +opencli browserbase account mark reddit-main-1 --state ready +``` -Create ten account profiles, each with its own Browserbase Context, then open -Live View URLs for manual login: +Run one atomic task: ```bash -opencli browserbase account bootstrap \ - --site x \ - --count 10 \ - --name-prefix x-main \ - --proxy us-ny \ - --open +opencli --browserbase-account reddit-main-1 \ + reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" \ + --limit 100 \ + -f json ``` -After logging in, either keep the sessions open and mark accounts manually: +## Daily Account Operations ```bash -opencli browserbase account mark x-main-1 --state ready +opencli browserbase account list +opencli browserbase account get reddit-main-1 +opencli browserbase account check reddit-main-1 --command "reddit whoami" +opencli browserbase account login reddit-main-1 --open --wait +opencli browserbase account mark reddit-main-1 --state invalidated +``` + +Import an existing Browserbase Context: + +```bash +opencli browserbase account import reddit1 \ + --site reddit \ + --context-id \ + --proxy reddit1-proxy ``` -or use `--wait` so OpenCLI waits for Enter, releases the login session, and -marks the account ready: +Change or remove the default proxy for future sessions: ```bash -opencli browserbase account login x-main-1 --open --wait +opencli browserbase account set-proxy reddit-main-1 reddit1-proxy +opencli browserbase account clear-proxy reddit-main-1 ``` -Use the same account name later and OpenCLI will create a fresh Browserbase -Session attached to the saved Context, so the site should see the prior login -state. Site-side cookie expiry, password changes, or server-side logout can still -invalidate that state; mark or refresh the account when that happens. +Clear login state but keep the account name: ```bash -opencli browserbase account check x-main-1 --command "reddit whoami" -opencli browserbase account mark x-main-1 --state invalidated -opencli browserbase account login x-main-1 --open --wait +opencli browserbase account clear-login reddit-main-1 \ + --recreate-context \ + --delete-old-context ``` -To import an existing Browserbase Context as an OpenCLI account: +Delete local account metadata and optionally the remote Context: ```bash -opencli browserbase account import reddit1 --site reddit --context-id --proxy dc1 +opencli browserbase account delete reddit-main-1 --delete-context ``` -## Run Commands +## Proxy CRUD + +Browserbase geo proxy: + +```bash +opencli browserbase proxy add us-ny \ + --type browserbase \ + --country US \ + --state NY \ + --city "New York" +``` -Run one adapter command with a specific Browserbase account: +External proxy: ```bash -opencli --browserbase-account x-main-1 reddit get-comments https://reddit.com/r/... +export PROXY_DC1_PASS=... + +opencli browserbase proxy add dc1 \ + --type external \ + --server http://host:port \ + --username user \ + --password-env PROXY_DC1_PASS ``` -Run a JSONL job file across up to ten Browserbase account sessions: +Manage proxies: ```bash -opencli run --browserbase \ - --accounts x-main-1,x-main-2,x-main-3,x-main-4,x-main-5,x-main-6,x-main-7,x-main-8,x-main-9,x-main-10 \ - --parallel 10 \ - --pool-size 10 \ - jobs.jsonl +opencli browserbase proxy list +opencli browserbase proxy get dc1 +opencli browserbase proxy update dc1 --server http://new-host:port +opencli browserbase proxy test dc1 +opencli browserbase proxy delete dc1 ``` -Each line in `jobs.jsonl` is one job: +`proxy delete` refuses to remove a proxy while accounts still reference it. Use `--force` only when you intentionally want OpenCLI to clear those account references. + +## Parallel Runs + +Create `jobs.jsonl` with one independent job per line: ```json -{"id":"reddit-1","command":"reddit get-comments","args":{"post-id":"https://reddit.com/r/...","limit":100}} +{"id":"reddit-1","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1abc123/title/","limit":100}} +{"id":"reddit-2","account":"reddit-main-2","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1def456/title/","limit":100}} ``` -Jobs can pin an account with `"account"` or `"accountName"`; otherwise OpenCLI -round-robins across `--accounts`. The pool guarantees one active automation -session per account/context while allowing different accounts to run in -parallel. `--pool-size` is capped at 10; the effective limit can still be lower -if your Browserbase plan or API response rejects more sessions. - -## Manage State +Run across account profiles: ```bash -opencli browserbase account list -opencli browserbase account get x-main-1 -opencli browserbase account check x-main-1 --command "reddit whoami" -opencli browserbase account set-proxy x-main-1 us-ny -opencli browserbase account clear-proxy x-main-1 -opencli browserbase account clear-login x-main-1 --recreate-context --delete-old-context -opencli browserbase account delete x-main-1 --delete-context +opencli run --browserbase \ + --accounts reddit-main-1,reddit-main-2,reddit-main-3,reddit-main-4,reddit-main-5,reddit-main-6,reddit-main-7,reddit-main-8,reddit-main-9,reddit-main-10 \ + --parallel 10 \ + --pool-size 10 \ + jobs.jsonl ``` -`clear-login --recreate-context` replaces the account's Context so the next -login starts from an empty browser profile. `delete --delete-context` removes -both the local account profile and the remote Context. +The pool guarantees one active automation session per account/context while allowing different accounts to run in parallel. `--pool-size` is capped at 10; the Browserbase plan or API can still enforce a lower effective limit. + +## Session And Context Commands -Session commands are available when you need a manual session: +Manual session lifecycle: ```bash -opencli browserbase session create --account x-main-1 --keep-alive --print-live-url +opencli browserbase session create --account reddit-main-1 --keep-alive --print-live-url opencli browserbase session live-url opencli browserbase session get opencli browserbase session list @@ -161,7 +166,7 @@ opencli browserbase session release opencli browserbase session delete ``` -Context commands are lower-level tools for direct Browserbase Context work: +Lower-level Context commands: ```bash opencli browserbase context create @@ -170,14 +175,24 @@ opencli browserbase context list-local opencli browserbase context delete ``` -Sensitive fields such as proxy plaintext passwords and Browserbase connect URLs -are redacted by default. Use `--show-sensitive` only when you need to inspect -them directly. +Existing Browserbase sessions still work: -## Session Selection Priority +```bash +opencli --browserbase-session reddit get-comments -f json +opencli --session reddit get-comments -f json +``` -For browser-backed adapters, OpenCLI resolves remote browser settings in this -order: +## Security And Storage + +- Local metadata is stored at `~/.opencli/browserbase.json` with `0600` permissions. +- Cookies and login state stay in Browserbase Contexts, not in the local config. +- Proxy plaintext passwords and Browserbase connect URLs are redacted by default. +- Use `--show-sensitive` only when you intentionally need sensitive fields in JSON output. +- Read-only checks default to non-persistent context writes; login and normal task sessions persist by default. + +## Selection Priority + +For browser-backed adapters, OpenCLI resolves browser settings in this order: 1. `--browserbase-account ` 2. `--browserbase-session ` @@ -185,8 +200,3 @@ order: 4. `BROWSERBASE_SESSION_ID` 5. `OPENCLI_CDP_ENDPOINT` 6. local Browser Bridge - -Read-only checks run with `persistContext: false`; login and normal task -sessions persist by default. Pass `--no-persist-context` to `opencli run` or -`browserbase session create` when you intentionally do not want a session to -write back context changes. diff --git a/docs/advanced/rate-limiter-plugin.md b/docs/advanced/rate-limiter-plugin.md index 9d273298d..3dc9a71b2 100644 --- a/docs/advanced/rate-limiter-plugin.md +++ b/docs/advanced/rate-limiter-plugin.md @@ -5,7 +5,7 @@ An optional plugin that adds a random sleep between browser-based commands to re ## Install ```bash -opencli plugin install github:jackwener/opencli-plugin-rate-limiter +opencli plugin install github:user/opencli-plugin-rate-limiter ``` Or copy the example below into `~/.opencli/plugins/rate-limiter/` to use it locally without installing from GitHub. diff --git a/docs/developer/ai-workflow.md b/docs/developer/ai-workflow.md index f0fc17bc9..4f9ed28d2 100644 --- a/docs/developer/ai-workflow.md +++ b/docs/developer/ai-workflow.md @@ -24,7 +24,7 @@ opencli browser verify / The skill `opencli-adapter-author` walks through: coverage self-test → site recon → API discovery → field decoding → output design → adapter coding → verify → write-back to site memory. -See [skills/opencli-adapter-author/SKILL.md](https://github.com/jackwener/opencli/blob/main/skills/opencli-adapter-author/SKILL.md). +See [skills/opencli-adapter-author/SKILL.md](https://github.com/albertcyhe/opencli/blob/main/skills/opencli-adapter-author/SKILL.md). ## Primitives diff --git a/docs/guide/ai-agent-operations.md b/docs/guide/ai-agent-operations.md new file mode 100644 index 000000000..c877184bf --- /dev/null +++ b/docs/guide/ai-agent-operations.md @@ -0,0 +1,115 @@ +# AI Agent Operations + +This page is the command map for a background-free AI agent. Start here when the user asks for an OpenCLI task and you need to choose the right command. + +## 1. Discover The Surface + +```bash +opencli list -f json +opencli --help +opencli --help +``` + +Use `opencli list -f json` as the source of truth. It returns command names, aliases, strategy, browser requirements, arguments, and output columns. + +## 2. Pick The Runtime + +| Runtime | Command shape | When to use | +| --- | --- | --- | +| Public adapter | `opencli ... -f json` | No login or browser state needed. | +| Local Chrome | `opencli ... -f json` | The user's local Chrome is logged in and Browser Bridge is healthy. | +| Browserbase account | `opencli --browserbase-account ... -f json` | Need durable cloud login state, account identity, or account-bound proxy. | +| Existing Browserbase session | `opencli --browserbase-session ... -f json` | A session already exists and should be reused directly. | +| Pooled jobs | `opencli run --browserbase --accounts ... --parallel ... jobs.jsonl` | Many independent atomic jobs should run across accounts. | + +Session priority for browser-backed adapters is: `--browserbase-account`, `--browserbase-session`, legacy `--session`, `BROWSERBASE_SESSION_ID`, `OPENCLI_CDP_ENDPOINT`, then local Browser Bridge. + +## 3. Configure Browserbase Identity + +```bash +export BROWSERBASE_API_KEY=... +export BROWSERBASE_PROJECT_ID=... + +opencli browserbase proxy add us-ny \ + --type browserbase \ + --country US \ + --state NY \ + --city "New York" + +opencli browserbase account bootstrap \ + --site reddit \ + --count 10 \ + --name-prefix reddit-main \ + --proxy us-ny \ + --open +``` + +The Live View URLs are for manual login. OpenCLI stores account metadata locally; Browserbase Context stores cookies, localStorage, IndexedDB, and other login state. + +Daily operations: + +```bash +opencli browserbase account list +opencli browserbase account get reddit-main-1 +opencli browserbase account login reddit-main-1 --open --wait +opencli browserbase account check reddit-main-1 --command "reddit whoami" +opencli browserbase account set-proxy reddit-main-1 us-ny +opencli browserbase account clear-proxy reddit-main-1 +opencli browserbase account mark reddit-main-1 --state invalidated +``` + +## 4. Run Atomic Commands + +Always prefer adapter commands over raw browser driving when a command exists. + +```bash +opencli --browserbase-account reddit-main-1 \ + reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" \ + --limit 100 \ + -f json + +opencli --browserbase-account x-main-1 \ + twitter get-comments "https://x.com/user/status/123" \ + --limit 50 \ + -f json +``` + +Only use `opencli browser ...` for ad-hoc UI work, debugging, or tasks that have no adapter. + +## 5. Run Parallel JSONL Jobs + +Each JSONL line is one independent job: + +```json +{"id":"reddit-1","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1abc123/title/","limit":100}} +{"id":"reddit-2","account":"reddit-main-2","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1def456/title/","limit":100}} +``` + +Run through a Browserbase account pool: + +```bash +opencli run --browserbase \ + --accounts reddit-main-1,reddit-main-2,reddit-main-3,reddit-main-4,reddit-main-5,reddit-main-6,reddit-main-7,reddit-main-8,reddit-main-9,reddit-main-10 \ + --parallel 10 \ + --pool-size 10 \ + jobs.jsonl +``` + +The pool does not reuse the same account/context concurrently. Proxy changes affect only sessions created after the change. + +## 6. Choose The Right Skill + +```bash +npx skills add albertcyhe/opencli +``` + +| Task | Skill | +| --- | --- | +| General command discovery | `opencli-usage` | +| Browserbase accounts, sessions, proxies, pooled jobs | `opencli-browserbase` | +| Reddit/X/YouTube/Instagram/TikTok/Xiaohongshu comments | `opencli-social-comments` | +| Ad-hoc local Chrome operation | `opencli-browser` | +| Build a reusable adapter | `opencli-adapter-author` | +| Fix a broken adapter | `opencli-autofix` | + +Detailed skill install and routing: [Skills for AI agents](./skills.md). diff --git a/docs/guide/browser-bridge.md b/docs/guide/browser-bridge.md index 927001e12..14368157c 100644 --- a/docs/guide/browser-bridge.md +++ b/docs/guide/browser-bridge.md @@ -8,7 +8,7 @@ OpenCLI connects to your browser through a lightweight **Browser Bridge** Chrome ### Method 1: Download Pre-built Release (Recommended) -1. Go to the GitHub [Releases page](https://github.com/jackwener/opencli/releases) and download the latest `opencli-extension-v{version}.zip`. +1. Go to the GitHub [Releases page](https://github.com/albertcyhe/opencli/releases) and download the latest `opencli-extension-v{version}.zip`. 2. Unzip the file and open `chrome://extensions`, enable **Developer mode** (top-right toggle). 3. Click **Load unpacked** and select the unzipped folder. diff --git a/docs/guide/getting-started.md b/docs/guide/getting-started.md index 63131cfc1..af19430df 100644 --- a/docs/guide/getting-started.md +++ b/docs/guide/getting-started.md @@ -1,81 +1,102 @@ # Getting Started -> **Make any website or Electron App your CLI.** -> Zero risk · Reuse Chrome login · AI-powered discovery · Browser + Desktop automation +OpenCLI gives humans and AI agents one deterministic command surface for websites, Browserbase cloud browsers, Electron apps, and external CLIs. -[![npm](https://img.shields.io/npm/v/@jackwener/opencli?style=flat-square)](https://www.npmjs.com/package/@jackwener/opencli) -[![Node.js Version](https://img.shields.io/node/v/@jackwener/opencli?style=flat-square)](https://nodejs.org) -[![License](https://img.shields.io/npm/l/@jackwener/opencli?style=flat-square)](https://github.com/jackwener/opencli/blob/main/LICENSE) +## Install -OpenCLI turns **any website** or **Electron app** into a command-line interface — Bilibili, Zhihu, 小红书, Twitter/X, Reddit, YouTube, Antigravity, and [many more](/adapters/) — powered by browser session reuse and AI-native discovery. +The current fork is not published as a separate npm package yet. Install it from source: -## Highlights - -- **Desktop App Control** — Drive Electron apps (Cursor, Codex, ChatGPT, etc.) directly from the terminal via CDP. -- **Browser Automation** — `browser` gives AI agents direct browser control: click, type/fill, extract, screenshot — fully scriptable. -- **Website → CLI** — Turn any website into a deterministic CLI: 100+ site surfaces are already registered, or author your own with the `opencli-adapter-author` skill. -- **Account-safe** — Reuses Chrome's logged-in state; your credentials never leave the browser. -- **AI Agent ready** — `opencli browser *` primitives (`open` / `network` / `state` / `eval` / `init` / `verify`) drive the adapter-authoring loop. -- **Zero LLM cost** — No tokens consumed at runtime. Run 10,000 times and pay nothing. -- **Deterministic** — Same command, same output schema, every time. Pipeable, scriptable, CI-friendly. +```bash +node --version +git clone git@github.com:albertcyhe/opencli.git +cd opencli +npm install +npm run build +npm link +opencli doctor +opencli list +``` -## Quick Start +## Discover Commands -### Install via npm +Use runtime discovery first. Adapter support changes faster than docs. ```bash -npm install -g @jackwener/opencli +opencli list -f json +opencli --help +opencli --help ``` -### Basic Usage +Agents should usually request JSON: ```bash -opencli list # See all commands -opencli hackernews top --limit 5 # Public API, no browser -opencli bilibili hot --limit 5 # Browser command -opencli zhihu hot -f json # JSON output +opencli hackernews top --limit 5 -f json +opencli reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" --limit 100 -f json ``` -### Output Formats +## Choose A Browser Runtime + +| Runtime | Use when | +| --- | --- | +| Public/API adapter | The command does not need login or browser state. | +| Local Browser Bridge | The task should reuse a Chrome profile on this machine. | +| Browserbase account | The task needs a cloud browser, durable login state, account-to-proxy binding, or parallel identities. | +| Explicit Browserbase session | You already created a Browserbase session and only need OpenCLI to attach to it. | + +Local Chrome setup: [Browser Bridge](./browser-bridge.md). -All built-in commands support `--format` / `-f`: +Browserbase setup: [Browserbase accounts, proxies, and parallel sessions](../advanced/browserbase.md). + +## First Browserbase Task ```bash -opencli bilibili hot -f table # Default: rich terminal table -opencli bilibili hot -f json # JSON (pipe to jq or LLMs) -opencli bilibili hot -f yaml # YAML (human-readable) -opencli bilibili hot -f md # Markdown -opencli bilibili hot -f csv # CSV -opencli bilibili hot -v # Verbose: show pipeline debug +export BROWSERBASE_API_KEY=... +export BROWSERBASE_PROJECT_ID=... +export PROXY_REDDIT1_PASS=... + +opencli browserbase proxy add reddit1-proxy \ + --type external \ + --server http://host:port \ + --username user \ + --password-env PROXY_REDDIT1_PASS + +opencli browserbase account bootstrap \ + --site reddit \ + --count 1 \ + --name-prefix reddit-main \ + --proxy reddit1-proxy \ + --open + +opencli --browserbase-account reddit-main-1 \ + reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" \ + --limit 100 \ + -f json ``` -### Tab Completion +## First Parallel Run + +Create `jobs.jsonl` with one command per line: + +```json +{"id":"reddit-1","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1abc123/title/","limit":100}} +{"id":"reddit-2","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1def456/title/","limit":100}} +``` -OpenCLI supports intelligent tab completion to speed up command input: +Run the jobs through named Browserbase accounts: ```bash -# Add shell completion to your startup config -echo 'eval "$(opencli completion zsh)"' >> ~/.zshrc # Zsh -echo 'eval "$(opencli completion bash)"' >> ~/.bashrc # Bash -echo 'opencli completion fish | source' >> ~/.config/fish/config.fish # Fish - -# Restart your shell, then press Tab to complete: -opencli [Tab] # Complete site names (bilibili, zhihu, twitter...) -opencli bilibili [Tab] # Complete commands (hot, search, me, download...) +opencli run --browserbase \ + --accounts reddit-main-1,reddit-main-2,reddit-main-3 \ + --parallel 3 \ + jobs.jsonl ``` -The completion includes: -- All available sites and adapters -- Built-in commands (list, validate, verify, browser, doctor, plugin...) -- Command aliases -- Real-time updates as you add new adapters +The pool keeps one active automation session per account/context, while different accounts can run concurrently. ## Next Steps -- [Installation details](/guide/installation) -- [Browser Bridge setup](/guide/browser-bridge) -- [Extending OpenCLI — custom commands, plugins, and external CLIs](/guide/extending-opencli) -- [Plugins — extend with community adapters](/guide/plugins) -- [All available adapters](/adapters/) -- [For developers / AI agents](/developer/contributing) -- [Add a new Electron app CLI](/guide/electron-app-cli) +- [AI Agent Operations](./ai-agent-operations.md) +- [Skills for AI agents](./skills.md) +- [Social Comment Support](../adapters/social-comments.md) +- [All adapters](../adapters/index.md) +- [Extending OpenCLI](./extending-opencli.md) diff --git a/docs/guide/installation.md b/docs/guide/installation.md index 1a1eebf87..8c8f92ca5 100644 --- a/docs/guide/installation.md +++ b/docs/guide/installation.md @@ -2,48 +2,73 @@ ## Requirements -- **Node.js**: >= 21.0.0, or **Bun** >= 1.0 -- **Chrome** running and logged into the target site (for browser commands) +- Node.js 20 or newer, or Bun 1.0 or newer +- Chrome/Chromium plus the OpenCLI Browser Bridge extension for local browser-backed commands +- Browserbase credentials for cloud browser account/session workflows -## Install via npm (Recommended) +## Install This Fork From Source -```bash -npm install -g @jackwener/opencli -``` - -## Install from Source +The current `albertcyhe/opencli` fork is not published as a separate npm package yet. Use the source install path: ```bash -git clone git@github.com:jackwener/opencli.git +node --version +git clone git@github.com:albertcyhe/opencli.git cd opencli npm install npm run build -npm link # Link binary globally -opencli list # Now you can use it anywhere! +npm link +opencli --version +opencli list ``` -## Update +## Browser Bridge + +Install the extension from the [Chrome Web Store](https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk), or download the release zip from [albertcyhe/opencli releases](https://github.com/albertcyhe/opencli/releases). + +Verify local browser connectivity: ```bash -npm install -g @jackwener/opencli@latest +opencli doctor +``` + +## Browserbase -# If you use the packaged OpenCLI skills, refresh them too -npx skills add jackwener/opencli +```bash +export BROWSERBASE_API_KEY=... +export BROWSERBASE_PROJECT_ID=... +opencli browserbase account list +opencli browserbase proxy list ``` -Or refresh only the skills you actually use: +See [Browserbase accounts, proxies, and parallel sessions](../advanced/browserbase.md) for account bootstrap, Live View login, proxy CRUD, and pooled runs. + +## Skills + +Install or refresh all OpenCLI skills from this fork: ```bash -npx skills add jackwener/opencli --skill opencli-adapter-author -npx skills add jackwener/opencli --skill opencli-autofix -npx skills add jackwener/opencli --skill opencli-browser -npx skills add jackwener/opencli --skill opencli-usage +npx skills add albertcyhe/opencli ``` -## Verify Installation +Or install only the skills you need: ```bash -opencli --version # Check version -opencli list # List all commands -opencli doctor # Diagnose connectivity +npx skills add albertcyhe/opencli --skill opencli-usage +npx skills add albertcyhe/opencli --skill opencli-browserbase +npx skills add albertcyhe/opencli --skill opencli-social-comments +npx skills add albertcyhe/opencli --skill opencli-browser +npx skills add albertcyhe/opencli --skill opencli-adapter-author +npx skills add albertcyhe/opencli --skill opencli-autofix +``` + +See [Skills for AI agents](./skills.md). + +## Update + +```bash +git pull +npm install +npm run build +npm link +npx skills add albertcyhe/opencli ``` diff --git a/docs/guide/remote-orchestration.md b/docs/guide/remote-orchestration.md index 4381b6de9..d04293db0 100644 --- a/docs/guide/remote-orchestration.md +++ b/docs/guide/remote-orchestration.md @@ -16,7 +16,7 @@ The first instinct is "let the extension connect to a remote daemon" — type a - Execute arbitrary JavaScript in any of your tabs - Take screenshots, send arbitrary HTTP requests, dump page content -Treat the daemon port the way you'd treat your unlocked desktop: never put it on a network you don't fully trust. Native "extension talks to remote daemon" support was [proposed in #636](https://github.com/jackwener/OpenCLI/pull/636) and deferred until daemon authentication exists. +Treat the daemon port the way you'd treat your unlocked desktop: never put it on a network you don't fully trust. Native "extension talks to remote daemon" support was proposed upstream in PR #636 and deferred until daemon authentication exists. ## The pattern: reverse-tunnel a localhost daemon diff --git a/docs/guide/skills.md b/docs/guide/skills.md new file mode 100644 index 000000000..ad5ea31e4 --- /dev/null +++ b/docs/guide/skills.md @@ -0,0 +1,38 @@ +# Skills For AI Agents + +OpenCLI ships Codex/Claude/Cursor-style skills so an agent can load the right workflow instead of reading all docs. + +## Install + +Install or refresh all skills from this fork: + +```bash +npx skills add albertcyhe/opencli +``` + +Install one skill: + +```bash +npx skills add albertcyhe/opencli --skill opencli-browserbase +``` + +## Skill Router + +| Skill | Load when | +| --- | --- | +| `opencli-usage` | You need the top-level map of OpenCLI commands, output formats, discovery commands, and which skill to load next. | +| `opencli-browserbase` | You need Browserbase Contexts, account profiles, session Live View URLs, proxy CRUD, account-bound proxies, or `opencli run --browserbase`. | +| `opencli-social-comments` | You need comment extraction, posting, or replies on Reddit, Twitter/X, YouTube, Instagram, TikTok, Xiaohongshu, or LinkedIn timeline counts. | +| `opencli-browser` | You need to drive a local Chrome page through Browser Bridge with `opencli browser ...`. | +| `opencli-adapter-author` | You are writing a reusable adapter or adding a command to an existing site. | +| `opencli-autofix` | An adapter command failed and you need trace-driven repair. | + +## Agent Defaults + +- Use `opencli list -f json` before choosing an adapter. +- Prefer adapter commands over raw browser driving. +- Use `-f json` for machine-consumed output. +- Use `--browserbase-account ` when identity, login state, or proxy matters. +- Do not expose Browserbase API keys, proxy passwords, cookies, or session connect URLs in final output. + +The skill files live under `skills/` in this repository. Detailed commands remain in each skill's references so agents only load what they need. diff --git a/docs/guide/troubleshooting.md b/docs/guide/troubleshooting.md index 7c26c8613..733f4b3df 100644 --- a/docs/guide/troubleshooting.md +++ b/docs/guide/troubleshooting.md @@ -63,5 +63,5 @@ npx tsc --noEmit ## Getting Help -- [GitHub Issues](https://github.com/jackwener/opencli/issues) — Bug reports and feature requests +- [GitHub Issues](https://github.com/albertcyhe/opencli/issues) — Bug reports and feature requests - Run `opencli doctor` for comprehensive diagnostics diff --git a/docs/index.md b/docs/index.md index 422bced25..6abac9191 100644 --- a/docs/index.md +++ b/docs/index.md @@ -3,33 +3,36 @@ layout: home hero: name: OpenCLI - text: Make any website or Electron App your CLI - tagline: Zero risk · Reuse Chrome login · AI-powered discovery · Browser + Desktop automation + text: Browserbase-ready CLI automation for agents + tagline: Websites · Social comments · Multi-account Browserbase · Multi-proxy sessions · Desktop adapters actions: - theme: brand text: Get Started link: /guide/getting-started + - theme: alt + text: Agent Operations + link: /guide/ai-agent-operations - theme: alt text: View on GitHub - link: https://github.com/jackwener/opencli + link: https://github.com/albertcyhe/opencli features: - - icon: 🖥️ - title: Desktop App Control - details: Drive Electron apps (Cursor, Codex, ChatGPT, etc.) directly from the terminal via CDP. - icon: 🌐 - title: Browser Automation - details: "AI agents get direct browser control: click, type/fill, extract, screenshot — any interaction, fully scriptable." - - icon: 🔐 - title: Account Safe - details: Reuses Chrome's logged-in state. Your credentials never leave the browser — no tokens, no exposed passwords. - - icon: 🤖 - title: AI Agent Ready - details: "Browser primitives plus adapter-authoring skills give AI agents a repeatable loop for recon, extraction, verification, and adapter writing." - - icon: 💰 - title: Zero LLM Cost - details: No tokens consumed at runtime. Run 10,000 times and pay nothing. - - icon: 🔁 - title: Deterministic - details: Same command, same output schema, every time. Pipeable, scriptable, CI-friendly. + title: Browserbase Accounts + details: Persist login state in Browserbase Contexts, create Live View login sessions, and run future tasks by account name. + - icon: 🧭 + title: Multi-Proxy Routing + details: Bind proxies to accounts, update proxy profiles, and keep login state and exit IP together. + - icon: ⚙️ + title: Parallel Atomic Jobs + details: Run JSONL jobs through `opencli run --browserbase` across up to ten Browserbase sessions. + - icon: 💬 + title: Social Comments + details: Read and write comments on Reddit, Twitter/X, YouTube, Instagram, TikTok, Xiaohongshu, and supported LinkedIn timeline metadata. + - icon: 🧩 + title: Agent Skills + details: Install OpenCLI skills so an AI agent knows which command to call and when to use Browserbase. + - icon: 🖥️ + title: Local Browser Bridge + details: Reuse a logged-in local Chrome profile for browser-backed adapters and ad-hoc browser control. --- diff --git a/docs/zh/adapters/index.md b/docs/zh/adapters/index.md index 00f3f19d2..9e48ef05c 100644 --- a/docs/zh/adapters/index.md +++ b/docs/zh/adapters/index.md @@ -2,4 +2,6 @@ 运行 `opencli list` 查看完整命令列表。 +社交平台 comments 的读取、评论和回复能力见 [社交平台 Comments](./social-comments.md)。 + 详细文档请参考 [英文版本](/adapters/)。 diff --git a/docs/zh/adapters/social-comments.md b/docs/zh/adapters/social-comments.md new file mode 100644 index 000000000..70ee2cb7f --- /dev/null +++ b/docs/zh/adapters/social-comments.md @@ -0,0 +1,86 @@ +# 社交平台 Comments 支持 + +OpenCLI 把社交平台评论流程暴露成原子 adapter 命令。可以直接调用,也可以通过 Browserbase account 指定登录态、账号身份和 proxy 出口。 + +## 支持矩阵 + +| 平台 | 读取评论 | 一级评论/发帖 | 回复评论 | Browserbase 账号示例 | +| --- | --- | --- | --- | --- | +| Reddit | `reddit read`, `reddit get-comments` | `reddit comment` | `reddit reply` | `opencli --browserbase-account reddit1 reddit get-comments -f json` | +| Twitter / X | `twitter get-comments`, `twitter thread` | `twitter post` / `twitter reply` 发 tweet 回复 | `twitter reply` | `opencli --browserbase-account x-main-1 twitter get-comments -f json` | +| YouTube | `youtube comments` | `youtube reply` | `youtube reply-comment` | `opencli --browserbase-account yt1 youtube comments -f json` | +| Instagram | `instagram get-comments` | `instagram comment` | `instagram reply` | `opencli --browserbase-account ig1 instagram get-comments -f json` | +| TikTok | `tiktok get-comments` | `tiktok comment` | `tiktok reply` | `opencli --browserbase-account tiktok1 tiktok get-comments -f json` | +| 小红书 | `xiaohongshu comments` | 暂未暴露 | `xiaohongshu reply` | `opencli --browserbase-account xhs1 xiaohongshu comments -f json` | +| LinkedIn | `linkedin timeline` 评论数量 | 暂未暴露 | 暂未暴露 | `opencli --browserbase-account linkedin1 linkedin timeline -f json` | + +## 读取示例 + +读取评论时用 JSON,方便复用返回的 ID: + +```bash +opencli reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" --limit 100 -f json +opencli reddit read "https://www.reddit.com/r/example/comments/1abc123/title/" --expand-more -f json + +opencli twitter get-comments "https://x.com/user/status/123" --limit 50 -f json +opencli twitter thread "https://x.com/user/status/123" -f json + +opencli youtube comments "https://www.youtube.com/watch?v=VIDEO_ID" --limit 100 -f json +opencli instagram get-comments "https://www.instagram.com/p/SHORTCODE/" --limit 50 -f json +opencli tiktok get-comments "https://www.tiktok.com/@user/video/123" --limit 50 -f json +opencli xiaohongshu comments "https://www.xiaohongshu.com/search_result/?xsec_token=..." --with-replies -f json +opencli linkedin timeline --limit 20 -f json +``` + +## 写入示例 + +只有在用户明确要发评论或回复时才运行写入命令。 + +```bash +opencli reddit comment 1abc123 "Comment text" +opencli reddit reply t1_okf3s7u "Reply text" + +opencli twitter reply "https://x.com/user/status/123" "Reply text" + +opencli youtube reply "https://www.youtube.com/watch?v=VIDEO_ID" "Comment text" +opencli youtube reply-comment Ugxxx "Reply text" --url "https://www.youtube.com/watch?v=VIDEO_ID" + +opencli instagram comment nasa "Comment text" --index 1 +opencli instagram reply nasa 18000000000000000 "Reply text" --index 1 + +opencli tiktok comment "https://www.tiktok.com/@user/video/123" "Comment text" +opencli tiktok reply "https://www.tiktok.com/@user/video/123" "COMMENT_ID" "Reply text" + +opencli xiaohongshu reply "https://www.xiaohongshu.com/search_result/?xsec_token=..." "COMMENT_ID" "Reply text" +``` + +## Browserbase 身份模式 + +当登录态和出口 IP 需要绑定时,使用 Browserbase account: + +```bash +opencli browserbase account set-proxy reddit1 reddit1-proxy + +opencli --browserbase-account reddit1 \ + reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" \ + --limit 100 \ + -f json +``` + +多个任务并发: + +```bash +opencli run --browserbase \ + --accounts reddit1,reddit2,reddit3 \ + --parallel 3 \ + jobs.jsonl +``` + +详见 [Browserbase 账号、Proxy 与并发 Session](../advanced/browserbase.md)。 + +## 安全说明 + +- 读取命令返回当前平台 API 或 UI 暴露的数据;分页、隐藏回复、登录墙和限流都会影响覆盖度。 +- 写入命令需要有效登录态,并且可能产生不可撤销的社交影响。没有用户明确意图时不要执行。 +- 评论 ID 是平台内 ID,只能回传给同一个平台 adapter。 +- 最终输出不要泄露 cookies、proxy password、Browserbase connect URL 或 API key。 diff --git a/docs/zh/advanced/browserbase.md b/docs/zh/advanced/browserbase.md new file mode 100644 index 000000000..61a0f07ae --- /dev/null +++ b/docs/zh/advanced/browserbase.md @@ -0,0 +1,202 @@ +# Browserbase 账号、Proxy 与并发 Session + +OpenCLI 把 Browserbase 拆成三个一等概念: + +- **Account profile**:OpenCLI 本地账号元数据,包含账号名、站点标签、Browserbase `contextId`、默认 proxy 和登录状态时间戳。 +- **Context**:Browserbase 持久浏览器 profile,保存 cookies、localStorage、IndexedDB 和其它登录态。 +- **Session**:短生命周期 Browserbase 浏览器实例,用于 Live View 人工登录或自动化任务。 + +Proxy 是创建 session 时的参数。账号绑定 proxy 的含义是:以后这个账号新建 session 时默认使用同一个出口。已经运行的 session 不会被修改。 + +## 15 分钟上手 + +配置凭证: + +```bash +export BROWSERBASE_API_KEY=... +export BROWSERBASE_PROJECT_ID=... +``` + +创建 proxy。外部 proxy 密码优先用 `--password-env`: + +```bash +export PROXY_REDDIT1_PASS=... + +opencli browserbase proxy add reddit1-proxy \ + --type external \ + --server http://host:port \ + --username user \ + --password-env PROXY_REDDIT1_PASS +``` + +创建账号并打开 Live View 登录 session: + +```bash +opencli browserbase account bootstrap \ + --site reddit \ + --count 10 \ + --name-prefix reddit-main \ + --proxy reddit1-proxy \ + --open +``` + +在每个 Live View URL 中人工登录,然后标记账号 ready,或对单个账号使用 `login --wait`: + +```bash +opencli browserbase account login reddit-main-1 --open --wait +opencli browserbase account mark reddit-main-1 --state ready +``` + +跑一个原子任务: + +```bash +opencli --browserbase-account reddit-main-1 \ + reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" \ + --limit 100 \ + -f json +``` + +## 日常账号操作 + +```bash +opencli browserbase account list +opencli browserbase account get reddit-main-1 +opencli browserbase account check reddit-main-1 --command "reddit whoami" +opencli browserbase account login reddit-main-1 --open --wait +opencli browserbase account mark reddit-main-1 --state invalidated +``` + +导入已有 Browserbase Context: + +```bash +opencli browserbase account import reddit1 \ + --site reddit \ + --context-id \ + --proxy reddit1-proxy +``` + +修改或移除未来 session 的默认 proxy: + +```bash +opencli browserbase account set-proxy reddit-main-1 reddit1-proxy +opencli browserbase account clear-proxy reddit-main-1 +``` + +清除登录态但保留账号名: + +```bash +opencli browserbase account clear-login reddit-main-1 \ + --recreate-context \ + --delete-old-context +``` + +删除本地账号元数据,并可选删除远端 Context: + +```bash +opencli browserbase account delete reddit-main-1 --delete-context +``` + +## Proxy CRUD + +Browserbase 地理 proxy: + +```bash +opencli browserbase proxy add us-ny \ + --type browserbase \ + --country US \ + --state NY \ + --city "New York" +``` + +外部 proxy: + +```bash +export PROXY_DC1_PASS=... + +opencli browserbase proxy add dc1 \ + --type external \ + --server http://host:port \ + --username user \ + --password-env PROXY_DC1_PASS +``` + +管理 proxy: + +```bash +opencli browserbase proxy list +opencli browserbase proxy get dc1 +opencli browserbase proxy update dc1 --server http://new-host:port +opencli browserbase proxy test dc1 +opencli browserbase proxy delete dc1 +``` + +`proxy delete` 默认拒绝删除仍被账号引用的 proxy。只有在你明确想让 OpenCLI 清空这些账号引用时才使用 `--force`。 + +## 并发任务 + +创建 `jobs.jsonl`,每行一个独立任务: + +```json +{"id":"reddit-1","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1abc123/title/","limit":100}} +{"id":"reddit-2","account":"reddit-main-2","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1def456/title/","limit":100}} +``` + +跨账号执行: + +```bash +opencli run --browserbase \ + --accounts reddit-main-1,reddit-main-2,reddit-main-3,reddit-main-4,reddit-main-5,reddit-main-6,reddit-main-7,reddit-main-8,reddit-main-9,reddit-main-10 \ + --parallel 10 \ + --pool-size 10 \ + jobs.jsonl +``` + +任务池保证同一个 account/context 同时只有一个 active automation session,不同账号可以并发。`--pool-size` 最高 10;Browserbase plan 或 API 仍可能给出更低的实际上限。 + +## Session 和 Context 命令 + +手动 session 生命周期: + +```bash +opencli browserbase session create --account reddit-main-1 --keep-alive --print-live-url +opencli browserbase session live-url +opencli browserbase session get +opencli browserbase session list +opencli browserbase session release +opencli browserbase session delete +``` + +底层 Context 命令: + +```bash +opencli browserbase context create +opencli browserbase context get +opencli browserbase context list-local +opencli browserbase context delete +``` + +已有 Browserbase session 仍可直接使用: + +```bash +opencli --browserbase-session reddit get-comments -f json +opencli --session reddit get-comments -f json +``` + +## 安全与存储 + +- 本地元数据保存在 `~/.opencli/browserbase.json`,权限为 `0600`。 +- cookies 和登录态保存在 Browserbase Context,不写入本地配置。 +- proxy 明文密码和 Browserbase connect URL 默认 redacted。 +- 只有明确需要查看敏感字段时才使用 `--show-sensitive`。 +- 只读检查默认不持久化 context 写入;登录和普通任务 session 默认持久化。 + +## 选择优先级 + +浏览器型 adapter 按以下顺序解析浏览器配置: + +1. `--browserbase-account ` +2. `--browserbase-session ` +3. 兼容参数 `--session ` +4. `BROWSERBASE_SESSION_ID` +5. `OPENCLI_CDP_ENDPOINT` +6. 本地 Browser Bridge diff --git a/docs/zh/guide/ai-agent-operations.md b/docs/zh/guide/ai-agent-operations.md new file mode 100644 index 000000000..d1a87f444 --- /dev/null +++ b/docs/zh/guide/ai-agent-operations.md @@ -0,0 +1,115 @@ +# AI Agent 操作指南 + +这页是给没有背景的 AI Agent 用的命令地图。用户要求 OpenCLI 任务时,先按这里选择命令和运行环境。 + +## 1. 发现能力 + +```bash +opencli list -f json +opencli --help +opencli --help +``` + +`opencli list -f json` 是事实来源。它会返回命令名、别名、策略、浏览器需求、参数和输出列。 + +## 2. 选择运行环境 + +| 运行环境 | 命令形状 | 适用场景 | +| --- | --- | --- | +| Public adapter | `opencli ... -f json` | 不需要登录态或浏览器状态。 | +| 本地 Chrome | `opencli ... -f json` | 用户本机 Chrome 已登录,Browser Bridge 正常。 | +| Browserbase account | `opencli --browserbase-account ... -f json` | 需要持久云端登录态、账号身份或账号绑定 proxy。 | +| 已有 Browserbase session | `opencli --browserbase-session ... -f json` | 已经有 session,只要直接接入。 | +| 并发任务池 | `opencli run --browserbase --accounts ... --parallel ... jobs.jsonl` | 多个独立原子任务要跨账号并发。 | + +浏览器型 adapter 的 session 优先级是:`--browserbase-account`、`--browserbase-session`、兼容的 `--session`、`BROWSERBASE_SESSION_ID`、`OPENCLI_CDP_ENDPOINT`、本地 Browser Bridge。 + +## 3. 配置 Browserbase 身份 + +```bash +export BROWSERBASE_API_KEY=... +export BROWSERBASE_PROJECT_ID=... + +opencli browserbase proxy add us-ny \ + --type browserbase \ + --country US \ + --state NY \ + --city "New York" + +opencli browserbase account bootstrap \ + --site reddit \ + --count 10 \ + --name-prefix reddit-main \ + --proxy us-ny \ + --open +``` + +Live View URL 用于人工登录。OpenCLI 本地只保存账号元数据;Browserbase Context 保存 cookies、localStorage、IndexedDB 和其它登录态。 + +日常操作: + +```bash +opencli browserbase account list +opencli browserbase account get reddit-main-1 +opencli browserbase account login reddit-main-1 --open --wait +opencli browserbase account check reddit-main-1 --command "reddit whoami" +opencli browserbase account set-proxy reddit-main-1 us-ny +opencli browserbase account clear-proxy reddit-main-1 +opencli browserbase account mark reddit-main-1 --state invalidated +``` + +## 4. 执行原子命令 + +有 adapter 时优先用 adapter,不要直接手写浏览器流程。 + +```bash +opencli --browserbase-account reddit-main-1 \ + reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" \ + --limit 100 \ + -f json + +opencli --browserbase-account x-main-1 \ + twitter get-comments "https://x.com/user/status/123" \ + --limit 50 \ + -f json +``` + +只有在没有 adapter、需要调试或一次性 UI 操作时,才用 `opencli browser ...`。 + +## 5. 并发 JSONL 任务 + +每行是一个独立任务: + +```json +{"id":"reddit-1","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1abc123/title/","limit":100}} +{"id":"reddit-2","account":"reddit-main-2","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1def456/title/","limit":100}} +``` + +用 Browserbase 账号池执行: + +```bash +opencli run --browserbase \ + --accounts reddit-main-1,reddit-main-2,reddit-main-3,reddit-main-4,reddit-main-5,reddit-main-6,reddit-main-7,reddit-main-8,reddit-main-9,reddit-main-10 \ + --parallel 10 \ + --pool-size 10 \ + jobs.jsonl +``` + +任务池不会并发复用同一个 account/context。修改 proxy 只影响之后新建的 session。 + +## 6. 选择对应 Skill + +```bash +npx skills add albertcyhe/opencli +``` + +| 任务 | Skill | +| --- | --- | +| 总体命令发现 | `opencli-usage` | +| Browserbase account/session/proxy/并发任务 | `opencli-browserbase` | +| Reddit/X/YouTube/Instagram/TikTok/小红书 comments | `opencli-social-comments` | +| 临时操作本地 Chrome | `opencli-browser` | +| 写可复用 adapter | `opencli-adapter-author` | +| 修复失效 adapter | `opencli-autofix` | + +详细安装和分流见 [给 AI Agent 的 Skills](./skills.md)。 diff --git a/docs/zh/guide/browser-bridge.md b/docs/zh/guide/browser-bridge.md index 9a2078dd7..d488c11ac 100644 --- a/docs/zh/guide/browser-bridge.md +++ b/docs/zh/guide/browser-bridge.md @@ -8,7 +8,7 @@ OpenCLI 通过轻量级 **Browser Bridge** Chrome 扩展 + 微守护进程连接 ### 方法 1:下载预构建版本(推荐) -1. 前往 GitHub [Releases 页面](https://github.com/jackwener/opencli/releases) 下载最新的 `opencli-extension-v{version}.zip`。 +1. 前往 GitHub [Releases 页面](https://github.com/albertcyhe/opencli/releases) 下载最新的 `opencli-extension-v{version}.zip`。 2. 解压后打开 `chrome://extensions`,启用**开发者模式**。 3. 点击**加载已解压的扩展程序**,选择解压后的文件夹。 diff --git a/docs/zh/guide/getting-started.md b/docs/zh/guide/getting-started.md index 04c0a7dfd..a420af101 100644 --- a/docs/zh/guide/getting-started.md +++ b/docs/zh/guide/getting-started.md @@ -1,63 +1,102 @@ # 快速开始 -> **让任何网站或 Electron 应用成为你的 CLI。** -> 零风险 · 复用 Chrome 登录态 · AI 驱动发现 · 浏览器 + 桌面自动化 - -OpenCLI 将**任何网站**或 **Electron 应用**变成命令行界面 — Bilibili、知乎、小红书、Twitter/X、Reddit、YouTube、Antigravity 等 — 基于浏览器会话复用和 AI 原生发现。 +OpenCLI 给人和 AI Agent 提供一套确定性的命令接口,用来调用网站、Browserbase 云端浏览器、Electron 应用和外部 CLI。 ## 安装 +当前 fork 还没有单独发布 npm 包,请从源码安装: + +```bash +node --version +git clone git@github.com:albertcyhe/opencli.git +cd opencli +npm install +npm run build +npm link +opencli doctor +opencli list +``` + +## 发现命令 + +先用运行时发现能力,不要凭文档猜命令: + ```bash -npm install -g @jackwener/opencli +opencli list -f json +opencli --help +opencli --help ``` -## 基本使用 +Agent 默认应使用 JSON: ```bash -opencli list # 查看所有命令 -opencli hackernews top --limit 5 # 公开 API,无需浏览器 -opencli bilibili hot --limit 5 # 浏览器命令 -opencli zhihu hot -f json # JSON 输出 +opencli hackernews top --limit 5 -f json +opencli reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" --limit 100 -f json ``` -## 输出格式 +## 选择浏览器运行环境 -所有命令支持 `--format` / `-f`: +| 运行环境 | 适用场景 | +| --- | --- | +| Public/API adapter | 命令不需要登录态或浏览器状态。 | +| 本地 Browser Bridge | 任务需要复用这台机器上的 Chrome 登录态。 | +| Browserbase account | 任务需要云端 browser、持久登录态、账号绑定 proxy,或多身份并发。 | +| 显式 Browserbase session | 你已经创建了 Browserbase session,只需要 OpenCLI 接入。 | + +本地 Chrome 设置见 [Browser Bridge](./browser-bridge.md)。 + +Browserbase 设置见 [Browserbase 账号、Proxy 与并发 Session](../advanced/browserbase.md)。 + +## 第一个 Browserbase 任务 ```bash -opencli bilibili hot -f table # 默认:终端表格 -opencli bilibili hot -f json # JSON -opencli bilibili hot -f yaml # YAML -opencli bilibili hot -f md # Markdown -opencli bilibili hot -f csv # CSV +export BROWSERBASE_API_KEY=... +export BROWSERBASE_PROJECT_ID=... +export PROXY_REDDIT1_PASS=... + +opencli browserbase proxy add reddit1-proxy \ + --type external \ + --server http://host:port \ + --username user \ + --password-env PROXY_REDDIT1_PASS + +opencli browserbase account bootstrap \ + --site reddit \ + --count 1 \ + --name-prefix reddit-main \ + --proxy reddit1-proxy \ + --open + +opencli --browserbase-account reddit-main-1 \ + reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" \ + --limit 100 \ + -f json ``` -## 终端自动补全 +## 第一个并发任务 + +创建 `jobs.jsonl`,每行一个命令: + +```json +{"id":"reddit-1","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1abc123/title/","limit":100}} +{"id":"reddit-2","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1def456/title/","limit":100}} +``` -OpenCLI 支持智能的 Tab 自动补全,加快命令输入: +用指定 Browserbase 账号池执行: ```bash -# 把自动补全加入 shell 启动配置 -echo 'eval "$(opencli completion zsh)"' >> ~/.zshrc # Zsh -echo 'eval "$(opencli completion bash)"' >> ~/.bashrc # Bash -echo 'opencli completion fish | source' >> ~/.config/fish/config.fish # Fish - -# 重启 shell 后,按 Tab 键补全: -opencli [Tab] # 补全站点名称(bilibili、zhihu、twitter...) -opencli bilibili [Tab] # 补全命令(hot、search、me、download...) +opencli run --browserbase \ + --accounts reddit-main-1,reddit-main-2,reddit-main-3 \ + --parallel 3 \ + jobs.jsonl ``` -补全功能包含: -- 所有可用的站点和适配器 -- 内置命令(list、validate、verify、browser、doctor、plugin、adapter...) -- 命令别名 -- 新增适配器时的实时更新 +任务池保证同一个 account/context 同时只有一个自动化 session,不同账号可以并发。 ## 下一步 -- [安装详情](/zh/guide/installation) -- [Browser Bridge 设置](/zh/guide/browser-bridge) -- [扩展 OpenCLI:自定义命令、plugin 和 external CLI](/zh/guide/extending-opencli) -- [所有适配器](/zh/adapters/) -- [开发者指南](/zh/developer/contributing) -- [给新 Electron 应用生成 CLI](/zh/guide/electron-app-cli) +- [AI Agent 操作指南](./ai-agent-operations.md) +- [给 AI Agent 的 Skills](./skills.md) +- [社交平台 Comments](../adapters/social-comments.md) +- [所有适配器](../adapters/index.md) +- [扩展 OpenCLI](./extending-opencli.md) diff --git a/docs/zh/guide/installation.md b/docs/zh/guide/installation.md index 080f72393..f6680a028 100644 --- a/docs/zh/guide/installation.md +++ b/docs/zh/guide/installation.md @@ -2,48 +2,73 @@ ## 系统要求 -- **Node.js**: >= 21.0.0,或 **Bun** >= 1.0 -- **Chrome** 已运行并登录目标网站(浏览器命令需要) +- Node.js 20 或更高版本,或 Bun 1.0 或更高版本 +- 本地浏览器型命令需要 Chrome/Chromium 和 OpenCLI Browser Bridge 扩展 +- Browserbase account/session 流程需要 Browserbase 凭证 -## 通过 npm 安装(推荐) +## 从当前 fork 源码安装 -```bash -npm install -g @jackwener/opencli -``` - -## 从源码安装 +当前 `albertcyhe/opencli` fork 还没有单独发布 npm 包,请使用源码安装路径: ```bash -git clone git@github.com:jackwener/opencli.git +node --version +git clone git@github.com:albertcyhe/opencli.git cd opencli npm install npm run build npm link +opencli --version opencli list ``` -## 更新 +## Browser Bridge + +从 [Chrome Web Store](https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk) 安装扩展,或从 [albertcyhe/opencli releases](https://github.com/albertcyhe/opencli/releases) 下载 release zip。 + +验证本地浏览器连接: ```bash -npm install -g @jackwener/opencli@latest +opencli doctor +``` + +## Browserbase -# 如果你在用打包发布的 OpenCLI skills,也一起刷新 -npx skills add jackwener/opencli +```bash +export BROWSERBASE_API_KEY=... +export BROWSERBASE_PROJECT_ID=... +opencli browserbase account list +opencli browserbase proxy list ``` -如果你只装了部分 skill,也可以只刷新自己在用的: +账号初始化、Live View 登录、proxy CRUD 和并发任务池见 [Browserbase 账号、Proxy 与并发 Session](../advanced/browserbase.md)。 + +## Skills + +安装或刷新当前 fork 的所有 OpenCLI skills: ```bash -npx skills add jackwener/opencli --skill opencli-adapter-author -npx skills add jackwener/opencli --skill opencli-autofix -npx skills add jackwener/opencli --skill opencli-browser -npx skills add jackwener/opencli --skill opencli-usage +npx skills add albertcyhe/opencli ``` -## 验证安装 +只安装需要的 skill: ```bash -opencli --version -opencli list -opencli doctor +npx skills add albertcyhe/opencli --skill opencli-usage +npx skills add albertcyhe/opencli --skill opencli-browserbase +npx skills add albertcyhe/opencli --skill opencli-social-comments +npx skills add albertcyhe/opencli --skill opencli-browser +npx skills add albertcyhe/opencli --skill opencli-adapter-author +npx skills add albertcyhe/opencli --skill opencli-autofix +``` + +见 [给 AI Agent 的 Skills](./skills.md)。 + +## 更新 + +```bash +git pull +npm install +npm run build +npm link +npx skills add albertcyhe/opencli ``` diff --git a/docs/zh/guide/skills.md b/docs/zh/guide/skills.md new file mode 100644 index 000000000..c8ab5b2b0 --- /dev/null +++ b/docs/zh/guide/skills.md @@ -0,0 +1,38 @@ +# 给 AI Agent 的 Skills + +OpenCLI 提供 Codex/Claude/Cursor 风格的 skills,让 agent 不需要读完整文档,也能加载正确工作流。 + +## 安装 + +安装或刷新当前 fork 的全部 skills: + +```bash +npx skills add albertcyhe/opencli +``` + +只安装一个 skill: + +```bash +npx skills add albertcyhe/opencli --skill opencli-browserbase +``` + +## Skill 分流 + +| Skill | 什么时候加载 | +| --- | --- | +| `opencli-usage` | 需要 OpenCLI 总地图:命令发现、输出格式、通用规则、下一步该用哪个 skill。 | +| `opencli-browserbase` | 需要 Browserbase Context、账号配置、Live View URL、proxy CRUD、账号绑定 proxy,或 `opencli run --browserbase`。 | +| `opencli-social-comments` | 需要 Reddit、Twitter/X、YouTube、Instagram、TikTok、小红书 comments 读取、评论或回复,或 LinkedIn timeline 评论数量。 | +| `opencli-browser` | 需要通过 Browser Bridge 临时操作本地 Chrome 页面。 | +| `opencli-adapter-author` | 需要写可复用 adapter 或给已有站点加命令。 | +| `opencli-autofix` | adapter 命令失败,需要基于 trace 修复。 | + +## Agent 默认规则 + +- 选择 adapter 前先跑 `opencli list -f json`。 +- 有 adapter 时优先用 adapter,不要直接驱动浏览器。 +- 下游要读取时使用 `-f json`。 +- 身份、登录态或 proxy 重要时使用 `--browserbase-account `。 +- 最终输出不要泄露 Browserbase API key、proxy password、cookies 或 session connect URL。 + +Skill 文件位于仓库的 `skills/` 目录。详细命令放在各 skill 的 references 中,agent 需要时再加载。 diff --git a/docs/zh/index.md b/docs/zh/index.md index 482d9ecda..f7ac78ae9 100644 --- a/docs/zh/index.md +++ b/docs/zh/index.md @@ -3,33 +3,36 @@ layout: home hero: name: OpenCLI - text: 让任何网站或 Electron 应用成为你的 CLI - tagline: 零风险 · 复用 Chrome 登录态 · AI 驱动发现 · 浏览器 + 桌面自动化 + text: 面向 Agent 的 Browserbase-ready CLI 自动化 + tagline: 网站 · 社交评论 · 多账号 Browserbase · 多 Proxy Session · 桌面适配器 actions: - theme: brand text: 快速开始 link: /zh/guide/getting-started + - theme: alt + text: Agent 操作指南 + link: /zh/guide/ai-agent-operations - theme: alt text: 在 GitHub 查看 - link: https://github.com/jackwener/opencli + link: https://github.com/albertcyhe/opencli features: - - icon: 🖥️ - title: 桌面应用控制 - details: 通过 CDP 直接在终端驱动 Electron 应用(Cursor、Codex、ChatGPT 等)。 - icon: 🌐 - title: 浏览器自动化 - details: AI Agent 直接控制浏览器:点击、输入、提取、截图 — 任何交互,完全可编程。 - - icon: 🔐 - title: 账号安全 - details: 复用 Chrome 登录态,凭证永远不会离开浏览器 — 无 token,无密码泄露。 - - icon: 🤖 - title: AI Agent 就绪 - details: Browser 原语加上适配器编写 skill,让 AI Agent 可以稳定完成侦察、提取、验证和适配器落地。 - - icon: 💰 - title: 零 LLM 成本 - details: 运行时不消耗模型 token。跑 10,000 次也不花一分钱。 - - icon: 🔁 - title: 确定性输出 - details: 相同命令,相同输出结构,每次一致。可管道、可脚本、CI 友好。 + title: Browserbase 账号 + details: 用 Browserbase Context 持久保存登录态,创建 Live View 登录 session,之后按账号名执行任务。 + - icon: 🧭 + title: 多 Proxy 路由 + details: 账号绑定 proxy,支持查询、更新和删除 proxy profile,让登录态和出口 IP 一起管理。 + - icon: ⚙️ + title: 并发原子任务 + details: 通过 `opencli run --browserbase` 把 JSONL 任务分发到最多 10 个 Browserbase session。 + - icon: 💬 + title: 社交平台 Comments + details: 支持 Reddit、Twitter/X、YouTube、Instagram、TikTok、小红书,以及 LinkedIn timeline 评论数量。 + - icon: 🧩 + title: Agent Skills + details: 安装 OpenCLI skills 后,AI Agent 能判断该调用哪个命令、什么时候使用 Browserbase。 + - icon: 🖥️ + title: 本地 Browser Bridge + details: 复用本地 Chrome 登录态,执行浏览器型 adapter 和临时浏览器操作。 --- diff --git a/package.json b/package.json index 8d3ce956f..540874db2 100644 --- a/package.json +++ b/package.json @@ -4,7 +4,7 @@ "publishConfig": { "access": "public" }, - "description": "Make any website or Electron App your CLI. AI-powered.", + "description": "Browserbase-ready CLI automation for websites, social comments, browser sessions, and desktop apps.", "engines": { "node": ">=20.0.0" }, @@ -71,11 +71,11 @@ "web", "ai" ], - "author": "jackwener", + "author": "OpenCLI contributors", "license": "Apache-2.0", "repository": { "type": "git", - "url": "git+https://github.com/jackwener/opencli.git" + "url": "git+https://github.com/albertcyhe/opencli.git" }, "dependencies": { "@mozilla/readability": "^0.6.0", diff --git a/skills/opencli-autofix/SKILL.md b/skills/opencli-autofix/SKILL.md index 6b47495ea..751760bf2 100644 --- a/skills/opencli-autofix/SKILL.md +++ b/skills/opencli-autofix/SKILL.md @@ -206,7 +206,7 @@ If it still fails, go back to Step 1 and collect a fresh trace. You have a budge ## Step 6: File an Upstream Issue -If the retry **passes**, the local adapter has drifted from upstream. File a GitHub issue so the fix flows back to `jackwener/OpenCLI`. +If the retry **passes**, the local adapter has drifted from the maintained fork. File a GitHub issue so the fix flows back to `albertcyhe/opencli`. **Do NOT file for:** - `AUTH_REQUIRED`, `BROWSER_CONNECT`, `ARGUMENT`, `CONFIG` — environment/usage issues, not adapter bugs @@ -251,7 +251,7 @@ _Issue filed by OpenCLI autofix after a verified local repair._ 3. If the user approves and `gh auth status` succeeds: ```bash -gh issue create --repo jackwener/OpenCLI \ +gh issue create --repo albertcyhe/opencli \ --title "[autofix] /: " \ --body "" ``` @@ -293,5 +293,5 @@ In all stop cases, clearly communicate the situation to the user rather than mak 7. AI prepares upstream issue draft, shows it to the user -8. User approves → AI runs: gh issue create --repo jackwener/OpenCLI --title "[autofix] zhihu/hot: SELECTOR" --body "..." +8. User approves → AI runs: gh issue create --repo albertcyhe/opencli --title "[autofix] zhihu/hot: SELECTOR" --body "..." ``` diff --git a/skills/opencli-browser/SKILL.md b/skills/opencli-browser/SKILL.md index 657e40154..020710001 100644 --- a/skills/opencli-browser/SKILL.md +++ b/skills/opencli-browser/SKILL.md @@ -1,6 +1,6 @@ --- name: opencli-browser -description: Use when an agent needs to drive a real Chrome window via opencli — inspect a page, fill forms, click through logged-in flows, or extract data ad-hoc. Covers the selector-first target contract, compound form fields, stale-ref handling, network capture, and the agent-native envelopes the CLI returns. Not for writing adapters — see opencli-adapter-author for that. +description: Use when an agent needs to drive a local Chrome window via OpenCLI Browser Bridge — inspect a page, fill forms, click through logged-in flows, or extract data ad-hoc. Not for Browserbase account/session/proxy management; use opencli-browserbase for that. Not for writing adapters; use opencli-adapter-author. allowed-tools: Bash(opencli:*), Read, Edit, Write --- @@ -8,7 +8,7 @@ allowed-tools: Bash(opencli:*), Read, Edit, Write The first reader of this CLI is an agent, not a human. Every subcommand returns a structured envelope that tells you exactly what matched, how confident the match is, and what to do if it didn't. Lean on those envelopes — do not guess. -This skill is for **driving a live browser** to accomplish an agent task. If you are building a reusable adapter under `~/.opencli/clis//` use `opencli-adapter-author` instead. +This skill is for **driving a local live browser** to accomplish an agent task. If you need Browserbase cloud sessions, persistent Contexts, account-bound proxies, Live View login, or `opencli run --browserbase`, load `opencli-browserbase` instead. If you are building a reusable adapter under `~/.opencli/clis//`, use `opencli-adapter-author`. --- diff --git a/skills/opencli-browserbase/SKILL.md b/skills/opencli-browserbase/SKILL.md new file mode 100644 index 000000000..92512f13a --- /dev/null +++ b/skills/opencli-browserbase/SKILL.md @@ -0,0 +1,56 @@ +--- +name: opencli-browserbase +description: Use when an OpenCLI task needs Browserbase cloud browsers, persistent login Contexts, account profiles, Live View login URLs, account-bound proxies, external or Browserbase proxy CRUD, existing session attachment, or parallel `opencli run --browserbase` jobs. +allowed-tools: Bash(opencli:*), Read +--- + +# opencli-browserbase + +Use this skill whenever identity, login state, cloud browser sessions, or proxy exit path matters. OpenCLI stores only account/proxy metadata locally; Browserbase Contexts store the actual login state. + +## Mental Model + +- **Account profile**: local name, site tag, Browserbase `contextId`, default proxy, and state metadata. +- **Context**: Browserbase persistent browser profile. Cookies/localStorage/IndexedDB live here. +- **Session**: short-lived browser instance for Live View login or automation. +- **Proxy**: applied when a session is created. Changing a proxy affects future sessions only. + +## First Workflow + +```bash +export BROWSERBASE_API_KEY=... +export BROWSERBASE_PROJECT_ID=... +export PROXY_REDDIT1_PASS=... + +opencli browserbase proxy add reddit1-proxy \ + --type external \ + --server http://host:port \ + --username user \ + --password-env PROXY_REDDIT1_PASS + +opencli browserbase account bootstrap \ + --site reddit \ + --count 10 \ + --name-prefix reddit-main \ + --proxy reddit1-proxy \ + --open + +opencli --browserbase-account reddit-main-1 \ + reddit get-comments \ + --limit 100 \ + -f json +``` + +## Load References When Needed + +- [commands.md](references/commands.md): account/proxy/session/context command catalog. +- [pooled-runs.md](references/pooled-runs.md): JSONL format and `opencli run --browserbase` usage. +- [security.md](references/security.md): storage, redaction, and sensitive-output rules. + +## Defaults + +- Use `--password-env` for proxy passwords; avoid plaintext storage. +- Use `--browserbase-account ` rather than an explicit session when login state should persist. +- Use `account check --command " whoami"` or an equivalent read command after suspected login expiry. +- For many independent tasks, use `opencli run --browserbase --accounts ... --parallel ...`. +- Do not print API keys, proxy passwords, cookies, or Browserbase connect URLs. diff --git a/skills/opencli-browserbase/references/commands.md b/skills/opencli-browserbase/references/commands.md new file mode 100644 index 000000000..4d9a18bd5 --- /dev/null +++ b/skills/opencli-browserbase/references/commands.md @@ -0,0 +1,60 @@ +# Browserbase Command Reference + +## Proxy + +```bash +opencli browserbase proxy list +opencli browserbase proxy get +opencli browserbase proxy add us-ny --type browserbase --country US --state NY --city "New York" +opencli browserbase proxy add dc1 --type external --server http://host:port --username user --password-env PROXY_DC1_PASS +opencli browserbase proxy update dc1 --server http://new-host:port +opencli browserbase proxy test dc1 +opencli browserbase proxy delete dc1 +``` + +`proxy delete` refuses active account references unless `--force` is intentional. + +## Account + +```bash +opencli browserbase account bootstrap --site reddit --count 10 --name-prefix reddit-main --proxy dc1 --open +opencli browserbase account list +opencli browserbase account get reddit-main-1 +opencli browserbase account import reddit1 --site reddit --context-id --proxy dc1 +opencli browserbase account login reddit-main-1 --open --wait +opencli browserbase account check reddit-main-1 --command "reddit whoami" +opencli browserbase account mark reddit-main-1 --state ready +opencli browserbase account mark reddit-main-1 --state invalidated +opencli browserbase account set-proxy reddit-main-1 dc1 +opencli browserbase account clear-proxy reddit-main-1 +opencli browserbase account clear-login reddit-main-1 --recreate-context --delete-old-context +opencli browserbase account delete reddit-main-1 --delete-context +``` + +## Session + +```bash +opencli browserbase session create --account reddit-main-1 --keep-alive --print-live-url +opencli browserbase session live-url +opencli browserbase session get +opencli browserbase session list +opencli browserbase session release +opencli browserbase session delete +``` + +## Context + +```bash +opencli browserbase context create +opencli browserbase context get +opencli browserbase context list-local +opencli browserbase context delete +``` + +## Adapter Runtime + +```bash +opencli --browserbase-account ... -f json +opencli --browserbase-session ... -f json +opencli --session ... -f json +``` diff --git a/skills/opencli-browserbase/references/pooled-runs.md b/skills/opencli-browserbase/references/pooled-runs.md new file mode 100644 index 000000000..c1407438d --- /dev/null +++ b/skills/opencli-browserbase/references/pooled-runs.md @@ -0,0 +1,30 @@ +# Browserbase Pooled Runs + +Use pooled runs for independent atomic jobs that can be distributed across account profiles. + +## Job File + +Each line in `jobs.jsonl` is one job: + +```json +{"id":"reddit-1","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1abc123/title/","limit":100}} +{"id":"reddit-2","account":"reddit-main-2","command":"reddit get-comments","args":{"post-id":"https://www.reddit.com/r/example/comments/1def456/title/","limit":100}} +``` + +Jobs can pin an account with `account` or `accountName`; otherwise OpenCLI rotates through the `--accounts` list. + +## Run + +```bash +opencli run --browserbase \ + --accounts reddit-main-1,reddit-main-2,reddit-main-3,reddit-main-4,reddit-main-5,reddit-main-6,reddit-main-7,reddit-main-8,reddit-main-9,reddit-main-10 \ + --parallel 10 \ + --pool-size 10 \ + jobs.jsonl +``` + +The pool keeps one active automation session per account/context. Different accounts can run concurrently. Browserbase plan limits can be lower than 10. + +## Persistence + +Task sessions persist context by default. Use `--no-persist-context` only when the job must not write back browser state. diff --git a/skills/opencli-browserbase/references/security.md b/skills/opencli-browserbase/references/security.md new file mode 100644 index 000000000..b3f5525c0 --- /dev/null +++ b/skills/opencli-browserbase/references/security.md @@ -0,0 +1,9 @@ +# Browserbase Security + +- `~/.opencli/browserbase.json` stores only metadata: account names, site tags, context ids, proxy names, and timestamps. +- Cookies, localStorage, IndexedDB, and login state remain inside Browserbase Contexts. +- Prefer `--password-env` for external proxy passwords. +- Plaintext proxy password storage requires an explicit plaintext flag when supported; do not suggest it by default. +- Output redacts proxy passwords, Browserbase API keys, connect URLs, and other sensitive fields by default. +- Only use `--show-sensitive` when the user explicitly needs sensitive JSON and understands the exposure. +- Never echo secrets in final answers or commit them to docs, fixtures, traces, or examples. diff --git a/skills/opencli-social-comments/SKILL.md b/skills/opencli-social-comments/SKILL.md new file mode 100644 index 000000000..3bb8f2cc2 --- /dev/null +++ b/skills/opencli-social-comments/SKILL.md @@ -0,0 +1,48 @@ +--- +name: opencli-social-comments +description: Use when an OpenCLI task needs to read, list, post, or reply to comments on Reddit, Twitter/X, YouTube, Instagram, TikTok, Xiaohongshu, or inspect LinkedIn timeline comment counts. Includes Browserbase account routing when login state or proxy identity matters. +allowed-tools: Bash(opencli:*), Read +--- + +# opencli-social-comments + +Use adapter commands for social comment work. Prefer `-f json` so returned comment IDs can be reused for replies. + +## Start + +```bash +opencli list -f json +opencli --help +``` + +If identity, login state, or proxy matters, prefix commands with: + +```bash +opencli --browserbase-account ... -f json +``` + +Load `opencli-browserbase` for account/proxy setup. + +## Common Reads + +```bash +opencli reddit get-comments --limit 100 -f json +opencli twitter get-comments --limit 50 -f json +opencli youtube comments --limit 100 -f json +opencli instagram get-comments --limit 50 -f json +opencli tiktok get-comments --limit 50 -f json +opencli xiaohongshu comments --with-replies -f json +opencli linkedin timeline --limit 20 -f json +``` + +## Load References When Needed + +- [matrix.md](references/matrix.md): current platform support. +- [recipes.md](references/recipes.md): read/write examples and ID handling. + +## Safety + +- Do not run comment/reply/post commands unless the user explicitly wants a write action. +- Treat comment IDs as platform-scoped. +- Use Browserbase account profiles when a post must come from a specific identity or proxy. +- Do not print cookies, API keys, proxy passwords, Browserbase connect URLs, or private account details. diff --git a/skills/opencli-social-comments/references/matrix.md b/skills/opencli-social-comments/references/matrix.md new file mode 100644 index 000000000..d37395654 --- /dev/null +++ b/skills/opencli-social-comments/references/matrix.md @@ -0,0 +1,11 @@ +# Social Comments Matrix + +| Platform | Read comments | Top-level write | Reply write | Notes | +| --- | --- | --- | --- | --- | +| Reddit | `reddit read`, `reddit get-comments` | `reddit comment` | `reddit reply` | `read --expand-more` expands threaded comments; `get-comments` returns reusable IDs. | +| Twitter / X | `twitter get-comments`, `twitter thread` | `twitter post`, tweet reply via `twitter reply` | `twitter reply` | Replies are represented as comment rows with reply-able IDs or tweet URLs. | +| YouTube | `youtube comments` | `youtube reply` | `youtube reply-comment` | `reply-comment` needs the original video URL for page context. | +| Instagram | `instagram get-comments` | `instagram comment` | `instagram reply` | `get-comments` accepts post/reel URL or username; write commands use username plus `--index`. | +| TikTok | `tiktok get-comments` | `tiktok comment` | `tiktok reply` | Reply may fall back to text/author matching when IDs are unavailable. | +| Xiaohongshu | `xiaohongshu comments` | Not exposed | `xiaohongshu reply` | Note URLs usually require `xsec_token`; `--with-replies` includes nested replies. | +| LinkedIn | `linkedin timeline` comment counts | Not exposed | Not exposed | Current support exposes timeline post metadata, not full comment threads. | diff --git a/skills/opencli-social-comments/references/recipes.md b/skills/opencli-social-comments/references/recipes.md new file mode 100644 index 000000000..350f7fc3b --- /dev/null +++ b/skills/opencli-social-comments/references/recipes.md @@ -0,0 +1,45 @@ +# Social Comment Recipes + +## Read + +```bash +opencli reddit get-comments "https://www.reddit.com/r/example/comments/1abc123/title/" --limit 100 -f json +opencli reddit read "https://www.reddit.com/r/example/comments/1abc123/title/" --expand-more -f json + +opencli twitter get-comments "https://x.com/user/status/123" --limit 50 -f json +opencli twitter thread "https://x.com/user/status/123" -f json + +opencli youtube comments "https://www.youtube.com/watch?v=VIDEO_ID" --limit 100 -f json +opencli instagram get-comments "https://www.instagram.com/p/SHORTCODE/" --limit 50 -f json +opencli tiktok get-comments "https://www.tiktok.com/@user/video/123" --limit 50 -f json +opencli xiaohongshu comments "https://www.xiaohongshu.com/search_result/?xsec_token=..." --with-replies -f json +``` + +## Write + +```bash +opencli reddit comment 1abc123 "Comment text" +opencli reddit reply t1_okf3s7u "Reply text" + +opencli twitter reply "https://x.com/user/status/123" "Reply text" + +opencli youtube reply "https://www.youtube.com/watch?v=VIDEO_ID" "Comment text" +opencli youtube reply-comment Ugxxx "Reply text" --url "https://www.youtube.com/watch?v=VIDEO_ID" + +opencli instagram comment nasa "Comment text" --index 1 +opencli instagram reply nasa 18000000000000000 "Reply text" --index 1 + +opencli tiktok comment "https://www.tiktok.com/@user/video/123" "Comment text" +opencli tiktok reply "https://www.tiktok.com/@user/video/123" "COMMENT_ID" "Reply text" + +opencli xiaohongshu reply "https://www.xiaohongshu.com/search_result/?xsec_token=..." "COMMENT_ID" "Reply text" +``` + +## Browserbase Account + +```bash +opencli --browserbase-account reddit1 reddit get-comments --limit 100 -f json +opencli --browserbase-account x-main-1 twitter get-comments --limit 50 -f json +``` + +Use `opencli browserbase account set-proxy ` when login state and exit IP should be bound. diff --git a/skills/opencli-usage/SKILL.md b/skills/opencli-usage/SKILL.md index 34def31a7..75cea2308 100644 --- a/skills/opencli-usage/SKILL.md +++ b/skills/opencli-usage/SKILL.md @@ -1,168 +1,90 @@ --- name: opencli-usage -description: Use at the start of any OpenCLI session — this is the top-level map of what `opencli` can do, how to discover adapters, what flags and output formats are universal, and which specialized skill to load next. Point here when an agent asks "what can opencli do?" or "how do I find the right command?". +description: Use at the start of any OpenCLI session. This is the top-level map for command discovery, install/update, output formats, Browser Bridge vs Browserbase runtime choice, and which specialized OpenCLI skill to load next. allowed-tools: Bash(opencli:*), Read --- # opencli-usage -OpenCLI turns any website, Electron desktop app, or external CLI into a uniform `opencli ` surface that agents can drive without screen-scraping. This skill is the orientation layer — once you know what you want to do, load one of the specialized skills below. - -## The three pillars - -- **Adapter commands** — `opencli [...]`. Built-in adapters live in `clis/`, user adapters in `~/.opencli/clis/`. Each is backed by a strategy (`PUBLIC | COOKIE | INTERCEPT | UI | LOCAL`) that tells you whether a Chrome session is needed. -- **Browser driving** — `opencli browser *` subcommands (`open`, `state`, `click`, `type`, `select`, `find`, `extract`, `network`, …) for ad-hoc interaction and scraping when no adapter covers the task. See `opencli-browser`. -- **Current-tab binding** — `opencli browser bind` attaches the Chrome tab the user already opened/logged into to that browser session. Follow-up commands use `opencli browser ...`. See `opencli-browser` before using it; bound sessions still block tab mutation. -- **External CLI passthrough** — `opencli gh`, `opencli docker`, `opencli vercel`, etc. Managed via `opencli external install ` (auto-install from `external-clis.yaml`) or `opencli external register ` (bring your own). +OpenCLI turns websites, Browserbase cloud browser accounts, Electron apps, and external CLIs into a uniform `opencli ` surface. Use this skill to orient, then load a specialized skill when the task is clear. ## Install ```bash -# npm global -npm install -g @jackwener/opencli # binary: opencli, requires Node >= 21 -opencli doctor # run before browser-dependent work (see below) - -# From source -git clone git@github.com:jackwener/OpenCLI.git -cd OpenCLI && npm install -npx tsx src/main.ts # same surface, no global install +git clone git@github.com:albertcyhe/opencli.git +cd opencli +npm install +npm run build +npm link +opencli doctor +opencli list ``` -`opencli doctor` prints a structured `DoctorReport` — daemon status, extension connection, version checks, and a live browser connectivity probe. Scope is narrow: it diagnoses the **browser bridge** (daemon + extension + Chrome wiring). `PUBLIC` / `LOCAL` adapters, `opencli list`, `validate`, `verify`, plugin commands, and external-CLI passthrough don't need it to be green — only `COOKIE` / `INTERCEPT` / `UI` adapters and the `opencli browser *` subcommands do. Flag: `-v` (verbose). - -## Prerequisites by command type - -| Strategy tag on `opencli list` | What it needs | -|--------------------------------|---------------| -| `PUBLIC` | Nothing — pure HTTP, no browser. | -| `COOKIE` | Chrome logged into the target site + **OpenCLI** extension installed from the [Chrome Web Store](https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk). Command captures the credential from your live session — no re-login. | -| `INTERCEPT` | Same as COOKIE, plus opencli opens an automation window to capture a signed request. | -| `UI` | Same as COOKIE, full DOM interaction. | -| `LOCAL` | No browser; talks to a local/dev endpoint. | - -Electron desktop apps (cursor, codex, chatwise, discord-app, doubao-app, antigravity, chatgpt-app) route through CDP against the running app — same cookie-less flow as a logged-in browser. Make sure the app is running before invoking. - -## Discover what's installed — don't read this file, run a command +The active fork is not published as a separate npm package yet. Install from source, and install skills from the fork: ```bash -opencli list # table, grouped by site -opencli list -f json # machine-readable; pipe to jq or your agent -opencli list | grep -i twitter # find commands for a specific site -opencli --help # see that site's commands + flags -opencli --help # see positional args and command-specific flags +npx skills add albertcyhe/opencli ``` -Do not hard-code adapter lists — there are 100+ sites and the count moves every week. `opencli list -f json` is the source of truth; it emits one entry per command with `{site, name, aliases, description, strategy, browser, args, columns, ...}`. For an agent, that is always better than grepping a doc. - -## Universal flags (work on every adapter command) - -| flag | effect | -|------|--------| -| `-f, --format ` | `table` (default in TTY) · `yaml` (default in non-TTY) · `json` · `plain` · `md` · `csv`. Pass explicitly when you want a specific shape; agents almost always want `-f json`. | -| `-v, --verbose` | Debug logs + stack traces on failure; also sets `OPENCLI_VERBOSE=1` for the process. | - -Command-specific flags (`--limit`, `--tab`, `--filter`, …) are not universal — consult ` --help`. - -## Output formats - -- `json` — pretty-printed, 2-space indent. Default choice for agents. -- `plain` — prints a single primary field for chat-style commands (`response`/`content`/`text`/`value`). Useful for piping to another tool. -- `yaml` — fallback when output is not a TTY and `-f` is not explicit. -- `table` — color-coded, site-grouped; meant for humans. -- `md`, `csv` — straightforward tabular dumps. - -A few commands override the default via `cmd.defaultFormat` (e.g. chat commands default to `plain`), so don't assume without reading `--help`. - -## Environment variables - -| variable | default | purpose | -|----------|---------|---------| -| `OPENCLI_DAEMON_PORT` | `19825` | Daemon ↔ extension bridge port. | -| `OPENCLI_BROWSER_CONNECT_TIMEOUT` | `30` | Seconds to wait for the browser bridge. | -| `OPENCLI_BROWSER_COMMAND_TIMEOUT` | `60` | Per-command timeout. | -| `OPENCLI_CDP_ENDPOINT` | — | Manual CDP endpoint override (dev / remote Chrome / Electron). | -| `OPENCLI_CACHE_DIR` | `~/.opencli/cache` | Network capture + browser-state cache. | -| `OPENCLI_WINDOW` | command-specific | `foreground` or `background` browser window mode. | -| `OPENCLI_VERBOSE` | `false` | Verbose logging (also triggered by `-v`). | +## Discover Commands -## Self-repair - -When an adapter command fails because the site changed (selectors drifted, API rotated, response schema shifted), re-run with `--trace retain-on-failure`. The error envelope includes a `trace` block pointing at `summary.md`; patch only the `adapterSourcePath` from that summary and retry. Max 3 repair rounds. The full flow is in `opencli-autofix`. - -## Writing your own adapter - -Two-path storage: - -- **Private**: `~/.opencli/clis//.js` — no build step, hot-available, not visible in the public package. -- **Public / PR**: `clis//.js` — for upstream contribution; requires build. - -Scaffolding & verification: +Always discover the live registry instead of copying stale command lists: ```bash -opencli browser init / # generates a skeleton -opencli validate [target] # semantic checks on the loaded registry (description, domain, pipeline step names, func|pipeline|_lazy presence, arg duplicates) — no network, no browser -opencli verify [target] [--smoke] # run the command with synthetic args -opencli browser verify / # end-to-end smoke inside the bridge +opencli list -f json +opencli --help +opencli --help ``` -Adapters import only `@jackwener/opencli/registry` and `@jackwener/opencli/errors`. `columns` must align 1:1 (in name and order) with keys of the object returned by `func`. For the full workflow see `opencli-adapter-author`. - -## Plugins +`opencli list -f json` is the source of truth for `{site, name, aliases, description, strategy, browser, args, columns}`. -Plugins are third-party extensions pulled from git, separate from the main adapter registry: +## Runtime Choice -```bash -opencli plugin install github:user/repo # install -opencli plugin list [-f json] # see installed -opencli plugin update [name] | --all # keep current -opencli plugin uninstall -opencli plugin create # scaffold a new plugin -``` - -## External CLI passthrough - -Wraps external command-line tools so you can discover + invoke them through the same `opencli …` entrypoint: - -```bash -opencli external install gh # auto-install via brew/apt/npm per external-clis.yaml -opencli external register my-tool \ - --binary my-tool \ - --install "npm i -g my-tool" \ - --desc "My internal CLI" -opencli external list -opencli gh pr list --limit 5 # passthrough; stdio is inherited, exit code propagated -opencli docker ps -``` +| Use | Command shape | +| --- | --- | +| Public or local adapter | `opencli ... -f json` | +| Local Chrome login via Browser Bridge | `opencli ... -f json` after `opencli doctor` | +| Browserbase account/context/proxy | `opencli --browserbase-account ... -f json` | +| Existing Browserbase session | `opencli --browserbase-session ... -f json` | +| Parallel Browserbase jobs | `opencli run --browserbase --accounts ... --parallel ... jobs.jsonl` | -Built-in entries live in `src/external-clis.yaml`; user overrides and additions in `~/.opencli/external-clis.yaml`. Commonly shipped: `gh`, `docker`, `vercel`, `lark-cli`, `longbridge`, `dws`, `wecom-cli`, `obsidian`, `ntn`, `tg(tg-cli)`, `discord(discord-cli)`, `wx(wx-cli)`. +Browser-backed adapters resolve in this priority: `--browserbase-account`, `--browserbase-session`, legacy `--session`, `BROWSERBASE_SESSION_ID`, `OPENCLI_CDP_ENDPOINT`, local Browser Bridge. -Some official CLIs use shell-script installers instead of a shell-free package-manager command. Entries without an `install` config, such as `ntn`, must be installed manually from their homepage before passthrough use. +## Universal Output -## Shell completion +Agents should pass `-f json` unless the user asks for a human table or markdown: ```bash -opencli completion bash # also: zsh, fish -# -> script on stdout; source or save per your shell's convention +opencli reddit get-comments --limit 100 -f json ``` -## Where to go next +Formats: `table`, `json`, `yaml`, `plain`, `md`, `csv`. Use `-v` for verbose error detail. -| If you're about to… | Load this skill | -|---------------------|-----------------| -| Drive a live browser ad-hoc (no adapter available, or prototyping) | `opencli-browser` | -| Write a new adapter, or add a command to an existing site | `opencli-adapter-author` | -| Fix a broken adapter after a command failure | `opencli-autofix` | +## Skill Router -## Commands that used to exist +| If the task is about... | Load | +| --- | --- | +| Browserbase account profiles, Contexts, Live View login, proxy CRUD, account-bound proxy, pooled JSONL runs | `opencli-browserbase` | +| Reddit/X/YouTube/Instagram/TikTok/Xiaohongshu comments or LinkedIn timeline comment counts | `opencli-social-comments` | +| Ad-hoc local Chrome navigation, clicks, form filling, extraction, network capture | `opencli-browser` | +| Writing or extending a reusable adapter | `opencli-adapter-author` | +| Repairing a failing adapter with trace artifacts | `opencli-autofix` | -The following were removed in the PR #1094 consolidation — don't try to invoke them: +## Environment Variables -- `opencli explore ` — superseded by `opencli browser network` + `opencli browser find` for live API discovery, and by the `opencli-adapter-author` workflow for capture. -- `opencli record ` — removed; manual capture now lives in `opencli browser network --detail`. -- `opencli web read` / `opencli desktop *` as top-level groups — folded into their respective adapters (`opencli web read` still exists as the `web` adapter's `read` command, but there is no standalone `web` / `desktop` top-level group command). +| variable | purpose | +| --- | --- | +| `OPENCLI_DAEMON_PORT` | Browser Bridge daemon port, default `19825`. | +| `OPENCLI_PROFILE` | Local Browser Bridge profile alias/contextId. | +| `OPENCLI_WINDOW` | `foreground` or `background` browser window mode. | +| `OPENCLI_CDP_ENDPOINT` | Manual CDP endpoint for remote Chrome or Electron apps. | +| `BROWSERBASE_API_KEY` | Browserbase API key. | +| `BROWSERBASE_PROJECT_ID` | Browserbase project id for Context/account operations. | +| `BROWSERBASE_SESSION_ID` | Existing Browserbase session id fallback. | -## Don't +## Rules -- Don't paste this skill's command list into your plan; it will rot. Call `opencli list -f json` at the start of a task instead. -- Don't assume every adapter needs a browser — strategy `PUBLIC` and `LOCAL` don't. Check the `strategy` field. -- Don't silently fall back from a failing adapter to a hand-rolled `fetch` — `--trace retain-on-failure` gives you the browser evidence and adapter source path. Do that first. +- Prefer adapter commands over raw browser driving. +- Use Browserbase account profiles when login state, account identity, or proxy exit path matters. +- Do not expose Browserbase API keys, proxy passwords, cookies, or session connect URLs in final output. +- Adapter imports still use `@jackwener/opencli/registry` and `@jackwener/opencli/errors` unless the package name is migrated later.