millionco · aidenybai · Apr 8, 2026 · cubic-dev-ai · Apr 8, 2026 · cubic-dev-ai
diff --git a/.agents/skills/expect/SKILL.md b/.agents/skills/expect/SKILL.md
@@ -4,7 +4,7 @@ description: "Use when editing .tsx/.jsx/.css/.html, React components, pages, ro
 license: MIT
 metadata:
   author: millionco
-  version: "2.3.0"
+  version: "2.4.0"
 ---
 
 # Expect
@@ -13,6 +13,14 @@ You verify code changes in a real browser before claiming they work. No browser
 
 Use the expect MCP tools (`open`, `playwright`, `screenshot`, etc.) for all browser interactions. Do not use raw browser tools (Playwright MCP, chrome tools, etc.) unless the user explicitly asks.
 
+## Subagent Usage
+
+Browser verification is best run in a subagent (Task tool) or background shell so the main thread stays free for code edits. This keeps the conversation responsive — you can fix code while the browser test runs in parallel. Strongly prefer launching a subagent for browser work, especially when the test involves multiple steps or long interactions. If the test is truly trivial (single screenshot check), inline is acceptable.
+
+## Resuming Browser State
+
+Before opening a new browser, check if one is already running. Use `browser_tabs` (action `list`) or the expect `screenshot` tool to see if a session is still active. If a tab is already open at the target URL, reuse it — don't close and reopen. When re-verifying after a code fix, prefer navigating or refreshing the existing session over starting from scratch.
+
 ## Compounding
 
 The `playwright` tool takes a `code` string with `ref()` to resolve snapshot refs to Locators. One call can do an entire interaction — fills, clicks, AND data collection. Use that.
@@ -56,6 +64,8 @@ Use `return` to collect data. Response: `{ result: <value>, resultFile: "<tmp pa
 
 ## Rationalizations
 
+- "I'll run the browser test inline, it's quick" — Probably not. Launch a subagent so you can keep editing code in parallel. Only skip the subagent for a single screenshot sanity check.
+- "I'll open a fresh browser to re-test" — Check for an existing session first. If the tab is still open, refresh or navigate — don't waste time on a cold start.
 - "I'll make one `playwright` call per action" — No. Whole sequence in one call.
 - "I need a snapshot between fills" — No. Fills don't change DOM. Batch them.
 - "Let me snapshot to see what changed" — Did the page navigate or submit? No? Use `snapshotAfter=true` on the action that does.
diff --git a/.cursor/rules/browser-testing.mdc b/.cursor/rules/browser-testing.mdc
@@ -11,8 +11,9 @@ When using the `cursor-ide-browser` MCP for testing or verification, follow this
 ## Pre-flight
 
 1. Use `browser_tabs` with action `list` to check for existing tabs
-2. If a tab exists for the target URL, `browser_lock` it before interacting
+2. If a tab already exists for the target URL (or a closely related URL), reuse it — `browser_lock` it and continue from the current page state instead of navigating from scratch
 3. If no tab exists, `browser_navigate` first, then `browser_lock`
+4. When re-verifying after a code change, refresh the existing tab rather than closing and reopening — this preserves scroll position, form state, and avoids cold-start overhead
 
 ## Accessibility Quick Audit
 
@@ -53,6 +54,14 @@ Reference the performance skill at `.agents/skills/performance/SKILL.md` and the
 1. `browser_console_messages` — look for errors, warnings, deprecation notices
 2. `browser_network_requests` — check for failed requests, mixed content, slow responses
 
+## Session Persistence
+
+When you expect to re-verify after code fixes, keep the browser session alive between turns:
+
+1. Do NOT call `close` or `browser_lock unlock` if you plan to revisit the page
+2. On the next turn, `browser_tabs list` will show the existing tab — reuse it
+3. Only close/unlock when ALL verification is complete and no more iterations are expected
-1. Do NOT call `close` or `browser_lock unlock` if you plan to revisit the page
-2. On the next turn, `browser_tabs list` will show the existing tab — reuse it
-3. Only close/unlock when ALL verification is complete and no more iterations are expected
+1. Do NOT call `close` if you plan to revisit the page
+2. On the next turn, `browser_tabs list` will show the existing tab — reuse it
+3. Still call `browser_lock` with action `unlock` at the end of each turn, then re-lock the tab on the next turn
+
 ## Unlock When Done
 
 Always call `browser_lock` with action `unlock` when finished with ALL browser operations for the turn.
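
The pre-flight and session-persistence rules in this file can be read as a small decision procedure. Below is a minimal sketch of that reading, not the rule file's actual implementation. The `TabInfo` shape, the `planPreflight` and `planTurnEnd` helpers, and the `"refresh"` step name are all hypothetical, and "closely related URL" is approximated here as "same origin", which is an assumption.

```typescript
// Hypothetical shape of one entry from `browser_tabs` (action `list`);
// the real payload may differ.
interface TabInfo {
  id: string;
  url: string;
}

interface Plan {
  reuseTabId: string | null; // tab to reuse, or null when a new tab is needed
  steps: string[]; // tool calls to make, in order
}

// Assumption: "closely related URL" is approximated as "same origin".
const sameOrigin = (first: string, second: string): boolean => {
  try {
    return new URL(first).origin === new URL(second).origin;
  } catch {
    return false; // unparsable URLs never match
  }
};

// Pre-flight: prefer an exact URL match, then a same-origin tab, else navigate.
const planPreflight = (tabs: TabInfo[], targetUrl: string, reverifying: boolean): Plan => {
  const existing =
    tabs.find((tab) => tab.url === targetUrl) ??
    tabs.find((tab) => sameOrigin(tab.url, targetUrl));

  if (!existing) {
    // No reusable tab: navigate first, then lock.
    return { reuseTabId: null, steps: ["browser_navigate", "browser_lock"] };
  }

  const steps = ["browser_lock"];
  // "refresh" is a placeholder step name, not a confirmed tool.
  if (reverifying) steps.push("refresh");
  return { reuseTabId: existing.id, steps };
};

// Turn end: always unlock; only close once no further iterations are expected.
const planTurnEnd = (moreIterationsExpected: boolean): string[] =>
  moreIterationsExpected ? ["browser_lock unlock"] : ["browser_lock unlock", "close"];
```

For example, `planPreflight([{ id: "t1", url: "http://localhost:3000/settings" }], "http://localhost:3000/", true)` reuses tab `t1` and plans a lock followed by a refresh, while an empty tab list falls back to navigate-then-lock.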
diff --git a/.mcp.json b/.mcp.json
@@ -1,10 +1,9 @@
 {
   "mcpServers": {
     "expect": {
-      "command": "npx",
+      "command": "node",
       "args": [
-        "-y",
-        "expect-cli@latest",
+        "/Users/aidenybai/Developer/expect/apps/cli/dist/index.js",
         "mcp"
       ]
     }

diff --git a/apps/cli/README.md b/apps/cli/README.md
@@ -42,39 +42,38 @@ Coming soon. Email [aiden@million.dev](mailto:aiden@million.dev) if you have que
 
 ## Options
 
-| Flag                          | Description                                                                                  | Default     |
-| ----------------------------- | -------------------------------------------------------------------------------------------- | ----------- |
-| `-m, --message <instruction>` | Natural language instruction for what to test                                                | -           |
-| `-f, --flow <slug>`           | Reuse a saved flow by its slug                                                               | -           |
-| `-y, --yes`                   | Run immediately without confirmation                                                         | -           |
-| `-a, --agent <provider>`      | Agent provider (`claude`, `codex`, `copilot`, `gemini`, `cursor`, `opencode`, `droid`, `pi`) | auto-detect |
-| `-t, --target <target>`       | What to test: `unstaged`, `branch`, or `changes`                                             | `changes`   |
-| `-u, --url <urls...>`         | Base URL(s) for the dev server (skips port picker)                                           | -           |
-| `--browser-mode <mode>`       | Browser mode: `headed` or `headless`                                                         | `headed`    |
-| `--cdp <url>`                 | Connect to an existing Chrome via CDP WebSocket URL                                          | -           |
-| `--profile <name>`            | Reuse a Chrome profile by name (e.g. Default)                                                | -           |
-| `--no-cookies`                | Skip system browser cookie extraction                                                        | -           |
-| `--ci`                        | Force CI mode: headless, no cookies, auto-yes, 30-min timeout                                | -           |
-| `--timeout <ms>`              | Execution timeout in milliseconds                                                            | -           |
-| `--output <format>`           | Output format: `text` or `json`                                                              | `text`      |
-| `--verbose`                   | Enable verbose logging                                                                       | -           |
-| `-v, --version`               | Print version                                                                                | -           |
-| `-h, --help`                  | Display help                                                                                 | -           |
+| Flag                          | Description                                                                            | Default     |
+| ----------------------------- | -------------------------------------------------------------------------------------- | ----------- |
+| `-m, --message <instruction>` | Natural language instruction for what to test                                          | -           |
+| `-f, --flow <slug>`           | Reuse a saved flow by its slug                                                         | -           |
+| `-y, --yes`                   | Run immediately without confirmation                                                   | -           |
+| `-a, --agent <provider>`      | Agent provider (`claude`, `codex`, `copilot`, `gemini`, `cursor`, `opencode`, `droid`) | auto-detect |
+| `-t, --target <target>`       | What to test: `unstaged`, `branch`, or `changes`                                       | `changes`   |
+| `-u, --url <urls...>`         | Base URL(s) for the dev server (skips port picker)                                     | -           |
+| `--browser-mode <mode>`       | Browser mode: `headed` or `headless`                                                   | `headed`    |
+| `--cdp <url>`                 | Connect to an existing Chrome via CDP WebSocket URL                                    | -           |
+| `--profile <name>`            | Reuse a Chrome profile by name (e.g. Default)                                          | -           |
+| `--no-cookies`                | Skip system browser cookie extraction                                                  | -           |
+| `--ci`                        | Force CI mode: headless, no cookies, auto-yes, 30-min timeout                          | -           |
+| `--timeout <ms>`              | Execution timeout in milliseconds                                                      | -           |
+| `--output <format>`           | Output format: `text` or `json`                                                        | `text`      |
+| `--verbose`                   | Enable verbose logging                                                                 | -           |
+| `-v, --version`               | Print version                                                                          | -           |
+| `-h, --help`                  | Display help                                                                           | -           |
 
 ## Supported Agents
 
 Expect works with the following coding agents. It auto-detects which agents are installed on your `PATH`. If multiple are available, it defaults to the first one found. Use `-a <provider>` to pick a specific agent.
 
-| Agent                                                         | Flag          | Install                                        |
-| ------------------------------------------------------------- | ------------- | ---------------------------------------------- |
-| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | `-a claude`   | `npm install -g @anthropic-ai/claude-code`     |
-| [Codex](https://github.com/openai/codex#readme)               | `-a codex`    | `npm install -g @openai/codex`                 |
-| [GitHub Copilot](https://github.com/features/copilot/cli)     | `-a copilot`  | `npm install -g @github/copilot`               |
-| [Gemini CLI](https://github.com/google-gemini/gemini-cli)     | `-a gemini`   | `npm install -g @google/gemini-cli`            |
-| [Cursor](https://cursor.com)                                  | `-a cursor`   | [cursor.com](https://cursor.com)               |
-| [OpenCode](https://opencode.ai)                               | `-a opencode` | `npm install -g opencode-ai`                   |
-| [Factory Droid](https://factory.ai)                           | `-a droid`    | `npm install -g droid`                         |
-| [Pi](https://github.com/mariozechner/pi-coding-agent)         | `-a pi`       | `npm install -g @mariozechner/pi-coding-agent` |
+| Agent                                                         | Flag          | Install                                    |
+| ------------------------------------------------------------- | ------------- | ------------------------------------------ |
+| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | `-a claude`   | `npm install -g @anthropic-ai/claude-code` |
+| [Codex](https://github.com/openai/codex#readme)               | `-a codex`    | `npm install -g @openai/codex`             |
+| [GitHub Copilot](https://github.com/features/copilot/cli)     | `-a copilot`  | `npm install -g @github/copilot`           |
+| [Gemini CLI](https://github.com/google-gemini/gemini-cli)     | `-a gemini`   | `npm install -g @google/gemini-cli`        |
+| [Cursor](https://cursor.com)                                  | `-a cursor`   | [cursor.com](https://cursor.com)           |
+| [OpenCode](https://opencode.ai)                               | `-a opencode` | `npm install -g opencode-ai`               |
+| [Factory Droid](https://factory.ai)                           | `-a droid`    | `npm install -g droid`                     |
 
 ## Resources & Contributing Back
 
@@ -86,6 +85,12 @@ We expect all contributors to abide by the terms of our [Code of Conduct](https:
 
 **[→ Start contributing on GitHub](https://github.com/millionco/expect/blob/main/CONTRIBUTING.md)**
 
+### Acknowledgements
+
+Expect wouldn't exist without the ideas and work of others:
+
+- [**dev-browser**](https://github.com/SawyerHood/dev-browser) by Sawyer Hood — the Playwright-first ("bitter lesson") approach that inspired Expect's core design: give the agent real browser APIs instead of screenshots and coordinates.
+
 ### License
 
 FSL-1.1-MIT © [Million Software, Inc.](https://million.dev)
diff --git a/apps/demo/index.html b/apps/demo/index.html
@@ -3,7 +3,7 @@
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <title>Chirp — Demo</title>
+    <title>Sheets</title>
     <link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&display=swap" rel="stylesheet" />
   </head>
   <body>

diff --git a/apps/demo/src/app.tsx b/apps/demo/src/app.tsx
@@ -1,10 +1,22 @@
+import { useState } from "react";
+import { BrowserRouter, Routes, Route } from "react-router-dom";
 import { ErrorBoundary } from "./error-boundary";
-import { LoginPage } from "./pages/login";
+import { Layout } from "./components/layout";
+import { DashboardPage } from "./pages/dashboard";
 
 export const App = () => {
+  const [, setTick] = useState(0);
+  const forceUpdate = () => setTick((previous) => previous + 1);
+
   return (
     <ErrorBoundary>
-      <LoginPage />
+      <BrowserRouter>
+        <Routes>
+          <Route element={<Layout />}>
+            <Route path="/" element={<DashboardPage onUpdate={forceUpdate} />} />
+          </Route>
+        </Routes>
+      </BrowserRouter>
     </ErrorBoundary>
   );
 };
diff --git a/apps/demo/src/components/avatar.tsx b/apps/demo/src/components/avatar.tsx
diff --git a/apps/demo/src/components/compose-box.tsx b/apps/demo/src/components/compose-box.tsx