research: agent runtime architecture — from embedded component to browser-native runtime #72

## Summary

Research into evolving Review Loop from an embedded dev-only component (Shadow DOM overlay injected via Vite/Astro/Express) to an **agent runtime** — enabling visual regression, autonomous annotation verification, multi-page crawling, and deep AI agent integration.

Full research report: [`docs/reports/agent-runtime-research.md`](https://github.com/viv/review-loop/blob/main/docs/reports/agent-runtime-research.md)

## Prior Art: Tidewave and Runtime-Embedded Agents

José Valim's [Tidewave](https://tidewave.ai) demonstrates a compelling alternative to the "agent controls a browser externally" approach. Tidewave runs an MCP server **inside** the web application runtime (Phoenix/Rails), giving agents direct access to the live system — DOM-to-source mapping, database queries, background jobs, REPL execution, and correlated error context.

Valim's [vertical integration essay](https://tidewave.ai/blog/the-future-of-coding-agents-is-vertical-integration) identifies three frustrations with current coding agents that directly mirror Review Loop's challenges:

1. **Agent can't verify its own work** — it says a feature is complete but can't see the browser to confirm
2. **Error context requires human mediation** — developers copy-paste stacktraces from browser to agent
3. **No UI-to-source mapping** — developers must manually translate "this dropdown" to the source file that renders it

Point 3 is the exact problem our [source file mapping feature (#66)](https://github.com/viv/review-loop/issues/66) targets. Point 1 is what the Playwright-based verification in this research aims to solve. Point 2 corresponds to the error context that the enhanced plugin would surface to agents in Phase 1.

### Key Insight: Review Loop Already Has Runtime Access

The critical realisation from studying Tidewave is that **Review Loop's Vite plugin is already running inside the dev server**. The existing middleware has access to Vite's module graph, which tracks each source module and the modules it imports, so the source files behind a served page can be looked up. This means Review Loop is already in a Tidewave-like position — it just doesn't exploit it yet.

| | Tidewave | Review Loop (current) | Review Loop (proposed) |
|---|---|---|---|
| **Runtime position** | MCP server inside the app | Vite middleware inside the dev server | Vite middleware + Playwright |
| **Source mapping** | Runtime knows which component renders each element | Has Vite module graph access but doesn't use it | Could use module graph for annotation → source hints |
| **Verification** | Agent has REPL + live system access | None — agent is blind to rendered output | Playwright screenshots + DOM inspection |
| **Framework scope** | Deep per-framework integration | Framework-agnostic via adapters | Same, with optional deeper integration |
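
As a rough illustration of the "source mapping" row, the sketch below walks an import graph from a page's entry module and collects the source files behind it. `ModuleNodeLike` is a hypothetical structural stand-in that mirrors only the `file` and `importedModules` fields of Vite's `ModuleNode`; in a real plugin the entry node would presumably come from `server.moduleGraph`. The function name and file paths are illustrative assumptions, not part of Review Loop.

```typescript
// Hypothetical stand-in for Vite's ModuleNode: only the two fields used here.
interface ModuleNodeLike {
  file: string | null; // null for virtual modules
  importedModules: Set<ModuleNodeLike>;
}

// Walk the import graph from an entry module and collect on-disk source files.
function collectSourceFiles(entry: ModuleNodeLike): string[] {
  const seen = new Set<ModuleNodeLike>();
  const files = new Set<string>();
  const stack: ModuleNodeLike[] = [entry];

  while (stack.length > 0) {
    const node = stack.pop()!;
    if (seen.has(node)) continue; // guard against import cycles
    seen.add(node);

    // Skip virtual modules and third-party dependencies.
    if (node.file !== null && !node.file.includes("/node_modules/")) {
      files.add(node.file);
    }
    node.importedModules.forEach((dep) => stack.push(dep));
  }
  return Array.from(files).sort();
}

// Usage with a hand-built graph (file names are made up for the example).
const button: ModuleNodeLike = { file: "/src/components/Button.tsx", importedModules: new Set() };
const virtual: ModuleNodeLike = { file: null, importedModules: new Set() };
const page: ModuleNodeLike = {
  file: "/src/pages/index.tsx",
  importedModules: new Set([button, virtual]),
};
button.importedModules.add(page); // deliberate cycle

console.log(collectSourceFiles(page));
// [ "/src/components/Button.tsx", "/src/pages/index.tsx" ]
```

A list like this only narrows the search to candidate files; mapping an annotation to a specific component would still need a further heuristic.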

### Implications for Architecture Choice

Tidewave's approach validates **Architecture D (Hybrid)**, described under Architectures Researched below, as the right starting point, but reframes *why*:

- **Don't abandon the embedded plugin** — it's the runtime integration point. Enhance it with source mapping, error context, and richer MCP tools
- **Add Playwright for what the plugin can't do** — visual verification, multi-page crawling, screenshot diffing
- **The Vite plugin should become smarter, not be replaced** — more like Tidewave's runtime MCP server, less like a passive UI injector

This is a both/and approach rather than either/or: runtime intelligence from the plugin, visual intelligence from the browser.

## Motivation

The current embedded model has limitations that runtime-aware tooling can address:

- **No visual verification** — agents can't screenshot, diff, or verify their own changes
- **No autonomous navigation** — agents can't browse the site to check their work
- **Same-page constraint** — can only annotate pages the user navigates to manually
- **No source mapping** — agents must grep for source files (see #66)
- **Unexploited runtime access** — the Vite plugin sits inside the dev server but doesn't expose module graph, error context, or build diagnostics to agents
- **Can't review deployed sites** — limited to local dev environments

## Architectures Researched

Four architectures were evaluated:

| Architecture | Approach | Install Friction | Framework Adapters | Visual Verification | Works on Any Site |
|---|---|---|---|---|---|
| **A: Playwright + Extension** | Launch Chromium via Playwright, sideload review extension | Medium | None | Yes | Yes |
| **B: Electron App** | Self-contained review browser with webview + panel | Medium | None | Yes | Yes |
| **C: Chrome Extension + Native Host** | Extension in user's browser, native messaging to MCP | Medium | None | Partial | Yes |
| **D: Hybrid** | Keep Vite plugin + add agent browser mode | Low-Medium | Yes (embedded mode) | Yes (agent mode) | Partial |

## Recommended Phased Approach

### Phase 1: Runtime Intelligence (Tidewave-Inspired) — Enhance the Vite Plugin

Before adding browser automation, exploit the runtime access we already have:
- **Source file mapping** via Vite's module graph — annotation → source file hints (#66)
- **Build error context** — surface Vite compilation errors/warnings to agents via MCP
- **Module dependency info** — which components contribute to a page
- **HMR-aware feedback** — notify agents when their changes trigger successful or failed hot reloads

This is low-risk, high-value, and requires no new infrastructure.
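
The build error context and HMR feedback items could share one small piece of state. The sketch below is a minimal in-memory tracker that records the outcome of each update and answers a `get_build_status()`-style query. The event shape, class name, and result fields are all assumptions made for illustration; how the events would be sourced from Vite is deliberately left out.

```typescript
// Hypothetical event shape: one event per attempted hot update.
type BuildEvent =
  | { kind: "update-ok"; file: string }
  | { kind: "update-failed"; file: string; message: string };

interface BuildStatus {
  ok: boolean;
  errors: { file: string; message: string }[];
  lastUpdatedFile: string | null;
}

class BuildStatusTracker {
  // Latest unresolved error per file; a later successful update clears it.
  private errors = new Map<string, string>();
  private lastUpdatedFile: string | null = null;

  record(event: BuildEvent): void {
    this.lastUpdatedFile = event.file;
    if (event.kind === "update-failed") {
      this.errors.set(event.file, event.message);
    } else {
      this.errors.delete(event.file);
    }
  }

  getBuildStatus(): BuildStatus {
    const errors: { file: string; message: string }[] = [];
    this.errors.forEach((message, file) => errors.push({ file, message }));
    return { ok: errors.length === 0, errors, lastUpdatedFile: this.lastUpdatedFile };
  }
}

// Usage: an agent's edit breaks a file, then a follow-up edit fixes it.
const tracker = new BuildStatusTracker();
tracker.record({ kind: "update-failed", file: "/src/App.tsx", message: "Unexpected token" });
const failing = tracker.getBuildStatus();
tracker.record({ kind: "update-ok", file: "/src/App.tsx" });
const fixed = tracker.getBuildStatus();

console.log(failing.ok, fixed.ok); // false true
```

Keying errors by file means an agent sees only problems that are still outstanding, rather than a history of every failed reload.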

### Phase 2: Visual Verification — Add Playwright

Add `npx review-loop verify` that launches Chromium via Playwright to:
- Navigate to each annotated page
- Take screenshots and verify DOM state
- Report results via new MCP tools (`verify_annotation`, `screenshot_page`, `crawl_site`)

This is non-breaking: existing users are unaffected.
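
One possible shape for the verification loop is sketched below. It groups annotations by page so each page is visited once, then checks whether the expected text is present in the rendered DOM. `PageDriver` is a hypothetical interface (a Playwright-backed implementation is one option), and the `Annotation` fields shown here, including `expectedText`, are assumptions for the example rather than the types in `src/shared/types.ts`.

```typescript
// Hypothetical annotation shape for this sketch only.
interface Annotation {
  id: string;
  pageUrl: string;
  selector: string;
  expectedText: string;
}

// Hypothetical browser abstraction; a real driver would wrap a browser page.
interface PageDriver {
  goto(url: string): Promise<void>;
  textContent(selector: string): Promise<string | null>;
}

interface VerificationResult {
  id: string;
  pageUrl: string;
  passed: boolean;
  actualText: string | null;
}

// Group annotations by page so each page is loaded only once.
function groupByPage(annotations: Annotation[]): Map<string, Annotation[]> {
  const pages = new Map<string, Annotation[]>();
  for (const a of annotations) {
    const list = pages.get(a.pageUrl) ?? [];
    list.push(a);
    pages.set(a.pageUrl, list);
  }
  return pages;
}

async function verifyAnnotations(
  driver: PageDriver,
  annotations: Annotation[],
): Promise<VerificationResult[]> {
  const results: VerificationResult[] = [];
  const pages = groupByPage(annotations);
  for (const url of Array.from(pages.keys())) {
    await driver.goto(url);
    for (const a of pages.get(url) ?? []) {
      const actualText = await driver.textContent(a.selector);
      const passed = actualText !== null && actualText.includes(a.expectedText);
      results.push({ id: a.id, pageUrl: url, passed, actualText });
    }
  }
  return results;
}

// Usage with an in-memory fake driver (URLs and DOM content are made up).
const fakeDom: Record<string, Record<string, string>> = {
  "http://localhost:5173/": { h1: "Welcome back" },
  "http://localhost:5173/about": { "p.lead": "Old copy" },
};
let currentUrl = "";
const fakeDriver: PageDriver = {
  async goto(url) { currentUrl = url; },
  async textContent(selector) { return fakeDom[currentUrl]?.[selector] ?? null; },
};
const annotations: Annotation[] = [
  { id: "a1", pageUrl: "http://localhost:5173/", selector: "h1", expectedText: "Welcome back" },
  { id: "a2", pageUrl: "http://localhost:5173/about", selector: "p.lead", expectedText: "New copy" },
  { id: "a3", pageUrl: "http://localhost:5173/", selector: "nav", expectedText: "Home" },
];

verifyAnnotations(fakeDriver, annotations).then((results) => {
  for (const r of results) console.log(r.id, r.passed ? "verified" : "not verified");
});
```

Keeping the driver behind an interface lets the loop be exercised without launching a browser, which may also help with the open question about how Playwright is packaged.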

### Phase 3: Chrome Extension — Replace Embedded UI

Build a Chrome extension (content script + side panel + native messaging host) that eliminates framework adapters entirely. Works on any site, dev or production.

### Phase 4: Full Agent Runtime

Unify all phases: runtime intelligence from the plugin, visual verification from Playwright, and human review from the extension, with MCP orchestrating everything.

## Technologies Evaluated

- **Tidewave** — José Valim's runtime-embedded MCP server for Phoenix/Rails ([blog](https://tidewave.ai/blog/the-future-of-coding-agents-is-vertical-integration))
- **Browser-Use** — Python AI browser agent framework (Playwright-based)
- **Playwright MCP** — Microsoft's MCP server for browser automation via accessibility tree
- **Chrome DevTools MCP** — Google's AI DevTools integration (26 tools via CDP)
- **Stagehand** — Browserbase's `act`/`extract`/`observe` primitives
- **Vercel Agent-Browser** — daemon architecture with Rust CDP client
- **Electron / Tauri v2 / Puppeteer / WebDriver BiDi** — embedding technologies

## New MCP Tools (proposed)

### Runtime intelligence (Phase 1)
- `get_source_hint(annotationId)` — source file + line range for an annotation
- `get_build_status()` — current Vite build errors/warnings
- `get_module_graph(pageUrl)` — which source modules contribute to a page
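
To make the "source file + line range" result of `get_source_hint` concrete, here is one naive way a hint could be derived: search the candidate module sources for the annotated text and report where it first appears. This is only a sketch under stated assumptions. The `SourceHint` shape, the function name, and the plain text-search heuristic are all hypothetical, and text produced at runtime would not be found this way.

```typescript
// Hypothetical result shape for a source hint.
interface SourceHint {
  file: string;
  startLine: number; // 1-based, inclusive
  endLine: number;   // 1-based, inclusive
}

// Return the first candidate file containing the annotated text,
// with the line range that the match spans.
function findSourceHint(
  annotatedText: string,
  sources: Map<string, string>, // file path -> file contents
): SourceHint | null {
  const needle = annotatedText.trim();
  if (needle === "") return null;

  for (const [file, code] of Array.from(sources.entries())) {
    const index = code.indexOf(needle);
    if (index === -1) continue;
    // Count the newlines before the match to get its starting line.
    const startLine = code.slice(0, index).split("\n").length;
    const endLine = startLine + needle.split("\n").length - 1;
    return { file, startLine, endLine };
  }
  return null;
}

// Usage with two made-up component files.
const sources = new Map<string, string>([
  ["/src/pages/index.tsx", "export default function Home() {\n  return <Hero />;\n}\n"],
  ["/src/components/Hero.tsx", "export function Hero() {\n  return <h1>Welcome back</h1>;\n}\n"],
]);

console.log(findSourceHint("Welcome back", sources));
// { file: "/src/components/Hero.tsx", startLine: 2, endLine: 2 }
```

Restricting `sources` to the modules that contribute to the annotated page would keep the search small and reduce false matches from unrelated files.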

### Visual verification (Phase 2)
- `screenshot_page(url)` — capture page screenshot
- `verify_annotation(id)` — navigate to annotation's page, check if change rendered
- `get_accessibility_tree(url)` — structured page representation
- `crawl_site(baseUrl, depth?)` — discover and catalogue all pages
- `visual_diff(url, before?, after?)` — compare page states
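
For `visual_diff`, the simplest possible comparison is a per-pixel check over two decoded RGBA buffers, sketched below. This is a deliberately naive stand-in for the "custom" option and does not reproduce what a dedicated library such as `pixelmatch` does. The `RgbaImage` shape and the `tolerance` parameter are assumptions for the example, and decoding screenshots into raw pixels is not shown.

```typescript
// Decoded image: 4 bytes (R, G, B, A) per pixel, row-major.
interface RgbaImage {
  width: number;
  height: number;
  data: Uint8Array;
}

interface DiffResult {
  changedPixels: number;
  totalPixels: number;
  ratio: number; // changedPixels / totalPixels, or 0 for empty images
}

// Count pixels where any channel differs by more than `tolerance`.
function diffImages(before: RgbaImage, after: RgbaImage, tolerance = 0): DiffResult {
  if (before.width !== after.width || before.height !== after.height) {
    throw new Error("Images must have identical dimensions");
  }
  const totalPixels = before.width * before.height;
  if (before.data.length !== totalPixels * 4 || after.data.length !== totalPixels * 4) {
    throw new Error("Pixel data length does not match image dimensions");
  }

  let changedPixels = 0;
  for (let p = 0; p < totalPixels; p++) {
    const offset = p * 4;
    let maxDelta = 0;
    for (let c = 0; c < 4; c++) {
      const delta = Math.abs(before.data[offset + c] - after.data[offset + c]);
      if (delta > maxDelta) maxDelta = delta;
    }
    if (maxDelta > tolerance) changedPixels++;
  }
  const ratio = totalPixels === 0 ? 0 : changedPixels / totalPixels;
  return { changedPixels, totalPixels, ratio };
}

// Usage: two 2x1 images that differ slightly in the second pixel's green channel.
const beforeImg: RgbaImage = {
  width: 2, height: 1,
  data: new Uint8Array([255, 0, 0, 255, 0, 0, 0, 255]),
};
const afterImg: RgbaImage = {
  width: 2, height: 1,
  data: new Uint8Array([255, 0, 0, 255, 0, 10, 0, 255]),
};

console.log(diffImages(beforeImg, afterImg).ratio);     // 0.5
console.log(diffImages(beforeImg, afterImg, 16).ratio); // 0
```

An exact comparison like this tends to flag anti-aliasing and font-rendering noise, which is one reason the choice of diff engine is listed as an open question.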

## Open Questions

1. Should the agent runtime be a separate package (`review-loop-browser`)?
2. Chrome extension distribution: Web Store, enterprise sideload, or both?
3. Should Playwright be a peer dependency or bundled (~50MB)?
4. How to handle authentication on target sites in agent browser mode?
5. Should the extension support Firefox via WebExtensions API?
6. Visual diff engine: custom, `pixelmatch`, or existing tools?
7. Is the Electron app (Architecture B) worth pursuing given the extension approach?
8. How deep should per-framework runtime integration go? Tidewave goes very deep (REPL, DB access); Review Loop's zero-config principle may favour a lighter touch
9. Should Review Loop adopt ACP (Agent Client Protocol) alongside MCP for multi-agent portability?

## What Stays Unchanged

- `ReviewStorage` class and `inline-review.json` format
- MCP protocol and tool interface (extended, not replaced)
- Shared types (`src/shared/types.ts`)
- Export functionality

## References

- [The future of coding agents is vertical integration](https://tidewave.ai/blog/the-future-of-coding-agents-is-vertical-integration) — José Valim, Feb 2026
- [The path to Tidewave: beyond code intelligence](https://dashbit.co/blog/the-path-to-tidewave) — José Valim, Jun 2025
- [Why Elixir is the best language for AI](https://dashbit.co/blog/why-elixir-best-language-for-ai) — José Valim, Feb 2026

