12 changes: 11 additions & 1 deletion .agents/skills/expect/SKILL.md
@@ -4,7 +4,7 @@ description: "Use when editing .tsx/.jsx/.css/.html, React components, pages, ro
license: MIT
metadata:
author: millionco
-version: "2.3.0"
+version: "2.4.0"
---

# Expect
@@ -13,6 +13,14 @@ You verify code changes in a real browser before claiming they work. No browser

Use the expect MCP tools (`open`, `playwright`, `screenshot`, etc.) for all browser interactions. Do not use raw browser tools (Playwright MCP, chrome tools, etc.) unless the user explicitly asks.

+## Subagent Usage
+
+Browser verification is best run in a subagent (Task tool) or background shell so the main thread stays free for code edits. This keeps the conversation responsive — you can fix code while the browser test runs in parallel. Strongly prefer launching a subagent for browser work, especially when the test involves multiple steps or long interactions. If the test is truly trivial (single screenshot check), inline is acceptable.
+
+## Resuming Browser State
+
+Before opening a new browser, check if one is already running. Use `browser_tabs` (action `list`) or the expect `screenshot` tool to see if a session is still active. If a tab is already open at the target URL, reuse it — don't close and reopen. When re-verifying after a code fix, prefer navigating or refreshing the existing session over starting from scratch.
+
## Compounding

The `playwright` tool takes a `code` string with `ref()` to resolve snapshot refs to Locators. One call can do an entire interaction — fills, clicks, AND data collection. Use that.
@@ -56,6 +64,8 @@ Use `return` to collect data. Response: `{ result: <value>, resultFile: "<tmp pa

## Rationalizations

+- "I'll run the browser test inline, it's quick" — Probably not. Launch a subagent so you can keep editing code in parallel. Only skip the subagent for a single screenshot sanity check.
+- "I'll open a fresh browser to re-test" — Check for an existing session first. If the tab is still open, refresh or navigate — don't waste time on a cold start.
- "I'll make one `playwright` call per action" — No. Whole sequence in one call.
- "I need a snapshot between fills" — No. Fills don't change DOM. Batch them.
- "Let me snapshot to see what changed" — Did the page navigate or submit? No? Use `snapshotAfter=true` on the action that does.
11 changes: 10 additions & 1 deletion .cursor/rules/browser-testing.mdc
@@ -11,8 +11,9 @@ When using the `cursor-ide-browser` MCP for testing or verification, follow this
## Pre-flight

1. Use `browser_tabs` with action `list` to check for existing tabs
-2. If a tab exists for the target URL, `browser_lock` it before interacting
+2. If a tab already exists for the target URL (or a closely related URL), reuse it — `browser_lock` it and continue from the current page state instead of navigating from scratch
3. If no tab exists, `browser_navigate` first, then `browser_lock`
+4. When re-verifying after a code change, refresh the existing tab rather than closing and reopening — this preserves scroll position, form state, and avoids cold-start overhead

## Accessibility Quick Audit

@@ -53,6 +54,14 @@ Reference the performance skill at `.agents/skills/performance/SKILL.md` and the
1. `browser_console_messages` — look for errors, warnings, deprecation notices
2. `browser_network_requests` — check for failed requests, mixed content, slow responses

+## Session Persistence
+
+When you expect to re-verify after code fixes, keep the browser session alive between turns:
+
+1. Do NOT call `close` or `browser_lock unlock` if you plan to revisit the page
+2. On the next turn, `browser_tabs list` will show the existing tab — reuse it
+3. Only close/unlock when ALL verification is complete and no more iterations are expected
Comment on lines +61 to +63
Contributor

@cubic-dev-ai cubic-dev-ai bot Apr 8, 2026

P2: The new session-persistence guidance conflicts with the existing "unlock when done" rule by telling users not to unlock across turns. Keep tabs open across turns, but release browser_lock at turn end and reacquire next turn.

Suggested change
-1. Do NOT call `close` or `browser_lock unlock` if you plan to revisit the page
-2. On the next turn, `browser_tabs list` will show the existing tab — reuse it
-3. Only close/unlock when ALL verification is complete and no more iterations are expected
+1. Do NOT call `close` if you plan to revisit the page
+2. On the next turn, `browser_tabs list` will show the existing tab — reuse it
+3. Still call `browser_lock` with action `unlock` at the end of each turn, then re-lock the tab on the next turn
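
The suggestion amounts to a per-turn lifecycle: reuse an open tab if there is one, lock it, verify, then release the lock while leaving the tab open. A minimal TypeScript sketch of that ordering is below. The tool names `browser_navigate` and `browser_lock` come from the rule file; the `Tab` and `Step` shapes, the `verify` placeholder, and the `planTurn` helper are hypothetical and only model the order of calls, not the real MCP payloads. Matching on an exact URL is also a simplification of "the target URL (or a closely related URL)".

```typescript
// Hypothetical shapes for illustration; the real cursor-ide-browser payloads may differ.
interface Tab {
  id: string;
  url: string;
}

type Step =
  | { tool: "browser_navigate"; url: string }
  | { tool: "browser_lock"; action: "lock" | "unlock"; tabId?: string }
  | { tool: "verify" }; // placeholder for whatever checks the turn performs

// Plans one turn, given the tabs reported by `browser_tabs` (action `list`).
function planTurn(openTabs: Tab[], targetUrl: string): Step[] {
  const existing = openTabs.find((tab) => tab.url === targetUrl);
  const steps: Step[] = [];

  if (existing) {
    // Reuse the open tab: lock it and continue from its current state.
    steps.push({ tool: "browser_lock", action: "lock", tabId: existing.id });
  } else {
    // No tab yet: navigate first, then lock.
    steps.push({ tool: "browser_navigate", url: targetUrl });
    steps.push({ tool: "browser_lock", action: "lock" });
  }

  steps.push({ tool: "verify" });

  // Release the lock at the end of every turn. The plan never emits `close`,
  // so the tab is still there for the next turn to find and re-lock.
  steps.push({ tool: "browser_lock", action: "unlock" });
  return steps;
}
```

With an open tab at the target URL the plan is lock, verify, unlock. With no tabs it is navigate, lock, verify, unlock.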


## Unlock When Done

Always call `browser_lock` with action `unlock` when finished with ALL browser operations for the turn.
5 changes: 2 additions & 3 deletions .mcp.json
@@ -1,10 +1,9 @@
 {
   "mcpServers": {
     "expect": {
-      "command": "npx",
+      "command": "node",
Contributor

@cubic-dev-ai cubic-dev-ai bot Apr 8, 2026

P1: The MCP server command uses a hardcoded absolute local path, which makes this config non-portable and likely broken for every other environment.
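
One way to keep a machine-specific path like this out of the committed config would be a small check run before merging. The sketch below is an assumption, not part of this PR: the `McpConfig` shape covers only the fields visible in the diff, and `findAbsolutePaths` is a hypothetical helper name.

```typescript
// Hypothetical minimal shape of .mcp.json; only the fields visible in this diff.
interface McpConfig {
  mcpServers: Record<string, { command: string; args?: string[] }>;
}

// Matches POSIX absolute paths (/Users/...) and Windows drive paths (C:\...).
const ABSOLUTE_PATH = /^(?:\/|[A-Za-z]:[\\\/])/;

// Returns one "<server>: <value>" entry per command or argument that is an absolute path.
function findAbsolutePaths(config: McpConfig): string[] {
  const offenders: string[] = [];
  for (const [name, server] of Object.entries(config.mcpServers)) {
    for (const value of [server.command, ...(server.args ?? [])]) {
      if (ABSOLUTE_PATH.test(value)) {
        offenders.push(`${name}: ${value}`);
      }
    }
  }
  return offenders;
}

// The config as it was before this PR: portable, nothing flagged.
const portableConfig: McpConfig = {
  mcpServers: { expect: { command: "npx", args: ["-y", "expect-cli@latest", "mcp"] } },
};

// The config as changed in this PR: the local build path is flagged.
const localConfig: McpConfig = {
  mcpServers: {
    expect: {
      command: "node",
      args: ["/Users/aidenybai/Developer/expect/apps/cli/dist/index.js", "mcp"],
    },
  },
};

console.log(findAbsolutePaths(portableConfig)); // []
console.log(findAbsolutePaths(localConfig)); // one entry, for the path under /Users
```

A check like this would also flag an absolute `command` such as `/usr/bin/node`, which may or may not be wanted.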

       "args": [
-        "-y",
-        "expect-cli@latest",
+        "/Users/aidenybai/Developer/expect/apps/cli/dist/index.js",

Hardcoded local filesystem path in MCP config

High Severity

The .mcp.json config was changed from npx -y expect-cli@latest mcp to a hardcoded absolute path /Users/aidenybai/Developer/expect/apps/cli/dist/index.js. This is a developer-specific local filesystem path that will not work for any other contributor or in CI. This appears to be a local development override that was accidentally committed.

Reviewed by Cursor Bugbot for commit f8b5480.

         "mcp"
       ]
     }
61 changes: 33 additions & 28 deletions apps/cli/README.md
@@ -42,39 +42,38 @@ Coming soon. Email [aiden@million.dev](mailto:aiden@million.dev) if you have que

## Options

-| Flag | Description | Default |
-| ----------------------------- | -------------------------------------------------------------------------------------------- | ----------- |
-| `-m, --message <instruction>` | Natural language instruction for what to test | - |
-| `-f, --flow <slug>` | Reuse a saved flow by its slug | - |
-| `-y, --yes` | Run immediately without confirmation | - |
-| `-a, --agent <provider>` | Agent provider (`claude`, `codex`, `copilot`, `gemini`, `cursor`, `opencode`, `droid`, `pi`) | auto-detect |
-| `-t, --target <target>` | What to test: `unstaged`, `branch`, or `changes` | `changes` |
-| `-u, --url <urls...>` | Base URL(s) for the dev server (skips port picker) | - |
-| `--browser-mode <mode>` | Browser mode: `headed` or `headless` | `headed` |
-| `--cdp <url>` | Connect to an existing Chrome via CDP WebSocket URL | - |
-| `--profile <name>` | Reuse a Chrome profile by name (e.g. Default) | - |
-| `--no-cookies` | Skip system browser cookie extraction | - |
-| `--ci` | Force CI mode: headless, no cookies, auto-yes, 30-min timeout | - |
-| `--timeout <ms>` | Execution timeout in milliseconds | - |
-| `--output <format>` | Output format: `text` or `json` | `text` |
-| `--verbose` | Enable verbose logging | - |
-| `-v, --version` | Print version | - |
-| `-h, --help` | Display help | - |
+| Flag | Description | Default |
+| ----------------------------- | -------------------------------------------------------------------------------------- | ----------- |
+| `-m, --message <instruction>` | Natural language instruction for what to test | - |
+| `-f, --flow <slug>` | Reuse a saved flow by its slug | - |
+| `-y, --yes` | Run immediately without confirmation | - |
+| `-a, --agent <provider>` | Agent provider (`claude`, `codex`, `copilot`, `gemini`, `cursor`, `opencode`, `droid`) | auto-detect |
+| `-t, --target <target>` | What to test: `unstaged`, `branch`, or `changes` | `changes` |
+| `-u, --url <urls...>` | Base URL(s) for the dev server (skips port picker) | - |
+| `--browser-mode <mode>` | Browser mode: `headed` or `headless` | `headed` |
+| `--cdp <url>` | Connect to an existing Chrome via CDP WebSocket URL | - |
+| `--profile <name>` | Reuse a Chrome profile by name (e.g. Default) | - |
+| `--no-cookies` | Skip system browser cookie extraction | - |
+| `--ci` | Force CI mode: headless, no cookies, auto-yes, 30-min timeout | - |
+| `--timeout <ms>` | Execution timeout in milliseconds | - |
+| `--output <format>` | Output format: `text` or `json` | `text` |
+| `--verbose` | Enable verbose logging | - |
+| `-v, --version` | Print version | - |
+| `-h, --help` | Display help | - |

## Supported Agents

Expect works with the following coding agents. It auto-detects which agents are installed on your `PATH`. If multiple are available, it defaults to the first one found. Use `-a <provider>` to pick a specific agent.

-| Agent | Flag | Install |
-| ------------------------------------------------------------- | ------------- | ---------------------------------------------- |
-| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | `-a claude` | `npm install -g @anthropic-ai/claude-code` |
-| [Codex](https://github.com/openai/codex#readme) | `-a codex` | `npm install -g @openai/codex` |
-| [GitHub Copilot](https://github.com/features/copilot/cli) | `-a copilot` | `npm install -g @github/copilot` |
-| [Gemini CLI](https://github.com/google-gemini/gemini-cli) | `-a gemini` | `npm install -g @google/gemini-cli` |
-| [Cursor](https://cursor.com) | `-a cursor` | [cursor.com](https://cursor.com) |
-| [OpenCode](https://opencode.ai) | `-a opencode` | `npm install -g opencode-ai` |
-| [Factory Droid](https://factory.ai) | `-a droid` | `npm install -g droid` |
-| [Pi](https://github.com/mariozechner/pi-coding-agent) | `-a pi` | `npm install -g @mariozechner/pi-coding-agent` |
+| Agent | Flag | Install |
+| ------------------------------------------------------------- | ------------- | ------------------------------------------ |
+| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | `-a claude` | `npm install -g @anthropic-ai/claude-code` |
+| [Codex](https://github.com/openai/codex#readme) | `-a codex` | `npm install -g @openai/codex` |
+| [GitHub Copilot](https://github.com/features/copilot/cli) | `-a copilot` | `npm install -g @github/copilot` |
+| [Gemini CLI](https://github.com/google-gemini/gemini-cli) | `-a gemini` | `npm install -g @google/gemini-cli` |
+| [Cursor](https://cursor.com) | `-a cursor` | [cursor.com](https://cursor.com) |
+| [OpenCode](https://opencode.ai) | `-a opencode` | `npm install -g opencode-ai` |
+| [Factory Droid](https://factory.ai) | `-a droid` | `npm install -g droid` |
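
The selection rule this README section describes (use `-a <provider>` if given, otherwise the first installed agent) could be sketched roughly as below. This is an illustration under assumptions, not Expect's implementation: the probe order, the idea that each provider's binary name equals its flag value, and the `pickAgent` helper with its injected `isInstalled` probe are all hypothetical.

```typescript
// Provider names from the updated README table; the order is an assumption.
const PROVIDERS = ["claude", "codex", "copilot", "gemini", "cursor", "opencode", "droid"] as const;
type Provider = (typeof PROVIDERS)[number];

// `isInstalled` is injected so the rule can be exercised without touching the real PATH.
function pickAgent(
  isInstalled: (binary: string) => boolean,
  requested?: string,
): Provider | undefined {
  if (requested !== undefined) {
    // An explicit `-a <provider>` wins; unknown names yield undefined.
    return PROVIDERS.find((provider) => provider === requested);
  }
  // Otherwise fall back to the first provider whose binary is found.
  return PROVIDERS.find((provider) => isInstalled(provider));
}

// With only gemini and droid installed, auto-detection picks gemini (earlier in the list).
console.log(pickAgent((binary) => binary === "gemini" || binary === "droid")); // "gemini"
```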

## Resources & Contributing Back

@@ -86,6 +85,12 @@ We expect all contributors to abide by the terms of our [Code of Conduct](https:

**[→ Start contributing on GitHub](https://github.com/millionco/expect/blob/main/CONTRIBUTING.md)**

+### Acknowledgements
+
+Expect wouldn't exist without the ideas and work of others:
+
+- [**dev-browser**](https://github.com/SawyerHood/dev-browser) by Sawyer Hood — the Playwright-first ("bitter lesson") approach that inspired Expect's core design: give the agent real browser APIs instead of screenshots and coordinates.
+
### License

FSL-1.1-MIT © [Million Software, Inc.](https://million.dev)
2 changes: 1 addition & 1 deletion apps/demo/index.html
@@ -3,7 +3,7 @@
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
-<title>Chirp — Demo</title>
+<title>Sheets</title>
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&display=swap" rel="stylesheet" />
</head>
<body>
16 changes: 14 additions & 2 deletions apps/demo/src/app.tsx
@@ -1,10 +1,22 @@
+import { useState } from "react";
+import { BrowserRouter, Routes, Route } from "react-router-dom";
 import { ErrorBoundary } from "./error-boundary";
-import { LoginPage } from "./pages/login";
+import { Layout } from "./components/layout";
+import { DashboardPage } from "./pages/dashboard";

 export const App = () => {
+  const [, setTick] = useState(0);
+  const forceUpdate = () => setTick((previous) => previous + 1);
+
   return (
     <ErrorBoundary>
-      <LoginPage />
+      <BrowserRouter>
+        <Routes>
+          <Route element={<Layout />}>
+            <Route path="/" element={<DashboardPage onUpdate={forceUpdate} />} />
+          </Route>
+        </Routes>
+      </BrowserRouter>
     </ErrorBoundary>
   );
 };
41 changes: 0 additions & 41 deletions apps/demo/src/components/avatar.tsx

This file was deleted.

49 changes: 0 additions & 49 deletions apps/demo/src/components/compose-box.tsx

This file was deleted.
