Summary
This research explores evolving Review Loop from an embedded dev-only component (a Shadow DOM overlay injected via Vite/Astro/Express) into an agent runtime that enables visual regression testing, autonomous annotation verification, multi-page crawling, and deep AI agent integration.
Prior Art: Tidewave and Runtime-Embedded Agents
José Valim's Tidewave demonstrates a compelling alternative to the "agent controls a browser externally" approach. Tidewave runs an MCP server inside the web application runtime (Phoenix/Rails), giving agents direct access to the live system: DOM-to-source mapping, database queries, background jobs, REPL execution, and correlated error context.
Valim's vertical integration essay identifies three frustrations with current coding agents that directly mirror Review Loop's challenges:
1. The agent can't verify its own work: it says a feature is complete but can't see the browser to confirm.
2. Error context requires human mediation: developers copy-paste stack traces from the browser to the agent.
3. There is no UI-to-source mapping: developers must manually translate "this dropdown" to the source file that renders it.
Point 3 is the exact problem our source file mapping feature (#66) targets. Point 1 is what the Playwright-based verification in this research aims to solve.
Key Insight: Review Loop Already Has Runtime Access
The critical realisation from studying Tidewave is that Review Loop's Vite plugin is already running inside the dev server. The existing middleware has access to Vite's module graph, which maps source files to rendered output. This means Review Loop is already in a Tidewave-like position — it just doesn't exploit it yet.
| | Tidewave | Review Loop (current) | Review Loop (proposed) |
|---|---|---|---|
| Runtime position | MCP server inside the app | Vite middleware inside the dev server | Vite middleware + Playwright |
| Source mapping | Runtime knows which component renders each element | Has Vite module graph access but doesn't use it | Could use module graph for annotation → source hints |
| Verification | Agent has REPL + live system access | None: agent is blind to rendered output | Playwright screenshots + DOM inspection |
| Framework scope | Deep per-framework integration | Framework-agnostic via adapters | Same, with optional deeper integration |
Implications for Architecture Choice
Tidewave's approach validates Architecture D (Hybrid) as the right starting point, but reframes why:
Don't abandon the embedded plugin — it's the runtime integration point. Enhance it with source mapping, error context, and richer MCP tools
Add Playwright for what the plugin can't do — visual verification, multi-page crawling, screenshot diffing
The Vite plugin should become smarter, not be replaced — more like Tidewave's runtime MCP server, less like a passive UI injector
This is a both/and approach rather than either/or: runtime intelligence from the plugin, visual intelligence from the browser.
Motivation
The current embedded model has limitations that runtime-aware tooling can address:
No visual verification — agents can't screenshot, diff, or verify their own changes
No autonomous navigation — agents can't browse the site to check their work
Same-page constraint — can only annotate pages the user navigates to manually
Phase 1 (runtime intelligence) would add the following to the Vite plugin:
Build error context: surface Vite compilation errors and warnings to agents via MCP
Module dependency info: which components contribute to a page
HMR-aware feedback: notify agents when their changes trigger successful or failed hot reloads
This is low-risk, high-value, and requires no new infrastructure.
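As a rough illustration of the module dependency idea, the sketch below walks an import graph from a page's entry module and lists the source files it reaches. The `GraphNode` interface is a hand-written stand-in that mirrors only the `file` and `importedModules` fields of Vite's `ModuleNode`; the function name and return shape are assumptions, not part of Review Loop or Vite.

```typescript
// Structural stand-in for the two ModuleNode fields this sketch relies on.
// Everything else here (names, return shape) is hypothetical.
interface GraphNode {
  file: string | null; // null for virtual modules with no file on disk
  importedModules: Set<GraphNode>;
}

// Collect every source file reachable from a page's entry module.
function collectSourceFiles(entry: GraphNode): string[] {
  const seen = new Set<GraphNode>();
  const files = new Set<string>();
  const stack: GraphNode[] = [entry];

  while (stack.length > 0) {
    const node = stack.pop() as GraphNode;
    if (seen.has(node)) continue; // import graphs can contain cycles
    seen.add(node);
    if (node.file !== null) files.add(node.file);
    node.importedModules.forEach((dep) => stack.push(dep));
  }

  // Sort so the result is stable regardless of traversal order.
  return Array.from(files).sort();
}
```

In a real plugin the entry node would presumably come from the dev server's module graph rather than being built by hand, and a tool such as `get_module_graph(pageUrl)` could return this list to the agent.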
Phase 2: Visual Verification — Add Playwright
Add npx review-loop verify that launches Chromium via Playwright to:
Navigate to each annotated page
Take screenshots and verify DOM state
Report results via new MCP tools (verify_annotation, screenshot_page, crawl_site)
Non-breaking — existing users unaffected.
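One small piece of the verify step can be sketched without a browser: grouping annotations by page so that each page is opened only once. The `Annotation` shape below is hypothetical (the real `inline-review.json` schema may differ), as is the `planVisits` name.

```typescript
// Hypothetical annotation shape; the real inline-review.json schema may differ.
interface Annotation {
  id: string;
  pageUrl: string;
  selector: string;
}

interface PageVisit {
  pageUrl: string;
  annotations: Annotation[];
}

// Group annotations by page, preserving the order in which pages first appear.
function planVisits(annotations: Annotation[]): PageVisit[] {
  const byPage = new Map<string, PageVisit>();
  const visits: PageVisit[] = [];

  for (const annotation of annotations) {
    let visit = byPage.get(annotation.pageUrl);
    if (visit === undefined) {
      visit = { pageUrl: annotation.pageUrl, annotations: [] };
      byPage.set(annotation.pageUrl, visit);
      visits.push(visit);
    }
    visit.annotations.push(annotation);
  }

  return visits;
}
```

Each `PageVisit` could then be handed to a Playwright page: navigate once, take the screenshot, and check every annotation's selector before moving on.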
Phase 3: Chrome Extension — Replace Embedded UI
Build a Chrome extension (content script + side panel + native messaging host) that eliminates framework adapters entirely. Works on any site, dev or production.
Phase 4: Full Agent Runtime
Unify all phases: runtime intelligence from the plugin, visual verification from Playwright, human review from the extension, MCP orchestrates everything.
Technologies Evaluated
Tidewave — José Valim's runtime-embedded MCP server for Phoenix/Rails (blog)
Browser-Use — Python AI browser agent framework (Playwright-based)
Playwright MCP — Microsoft's MCP server for browser automation via accessibility tree
Chrome DevTools MCP — Google's AI DevTools integration (26 tools via CDP)
Open Questions
Should the agent runtime be a separate package (review-loop-browser)?
Chrome extension distribution: Web Store, enterprise sideload, or both?
Should Playwright be a peer dependency or bundled (~50MB)?
How to handle authentication on target sites in agent browser mode?
Should the extension support Firefox via WebExtensions API?
Visual diff engine: custom, pixelmatch, or existing tools?
Is the Electron app (Architecture B) worth pursuing given the extension approach?
How deep should per-framework runtime integration go? Tidewave goes very deep (REPL, DB access); Review Loop's zero-config principle may favour a lighter touch
Should Review Loop adopt ACP (Agent Client Protocol) alongside MCP for multi-agent portability?
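To make the "custom" option in the visual diff question concrete, here is a minimal sketch of a per-channel pixel comparison. It is deliberately naive and is not pixelmatch's algorithm; the `RgbaImage` shape, the function name, and the default tolerance of 16 are all assumptions chosen for illustration.

```typescript
// Raw RGBA image: 4 bytes per pixel, row-major. Hypothetical shape for this sketch.
interface RgbaImage {
  width: number;
  height: number;
  data: Uint8Array;
}

// Fraction of pixels whose largest channel difference exceeds `tolerance`.
// Images of different sizes are treated as fully different (ratio 1).
function changedPixelRatio(before: RgbaImage, after: RgbaImage, tolerance: number = 16): number {
  if (before.width !== after.width || before.height !== after.height) return 1;

  const pixelCount = before.width * before.height;
  if (pixelCount === 0) return 0;

  let changed = 0;
  for (let offset = 0; offset < pixelCount * 4; offset += 4) {
    let maxDelta = 0;
    for (let channel = 0; channel < 4; channel++) {
      const delta = Math.abs(before.data[offset + channel] - after.data[offset + channel]);
      if (delta > maxDelta) maxDelta = delta;
    }
    if (maxDelta > tolerance) changed++;
  }
  return changed / pixelCount;
}
```

A plain channel threshold like this tends to be sensitive to anti-aliasing and font rendering differences, which is one argument for weighing an existing engine against a custom one.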
What Stays Unchanged
ReviewStorage class and inline-review.json format
MCP protocol and tool interface (extended, not replaced)
Full research report: docs/reports/agent-runtime-research.md
Architectures Researched
Four architectures were evaluated, among them Architecture B (an Electron app) and Architecture D (Hybrid).
Recommended Phased Approach
Phase 1: Runtime Intelligence (Tidewave-Inspired) — Enhance the Vite Plugin
Before adding browser automation, exploit the runtime access we already have: build error context, module dependency info, and HMR-aware feedback.
The technologies evaluated also included a tool built around act/extract/observe primitives.
New MCP Tools (proposed)
Runtime intelligence (Phase 1)
get_source_hint(annotationId): source file and line range for an annotation
get_build_status(): current Vite build errors and warnings
get_module_graph(pageUrl): the source modules that contribute to a page
Visual verification (Phase 2)
screenshot_page(url): capture a page screenshot
verify_annotation(id): navigate to the annotation's page and check whether the change rendered
get_accessibility_tree(url): structured page representation
crawl_site(baseUrl, depth?): discover and catalogue all pages
visual_diff(url, before?, after?): compare page states
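The `crawl_site(baseUrl, depth?)` tool could be built on a depth-limited breadth-first crawl. The sketch below takes the link extraction as an injected function so it can run without a browser; `crawlSite`, `LinkFetcher`, and the default depth of 2 are assumptions. A real implementation would most likely be asynchronous and read links from a Playwright page.

```typescript
// Returns the href values found on a page. Injected so the crawl logic
// can be exercised without a browser; hypothetical for this sketch.
type LinkFetcher = (url: string) => string[];

// Normalise a URL for de-duplication by resolving it and dropping the fragment.
function normalise(href: string, base?: string): URL {
  const url = new URL(href, base);
  url.hash = "";
  return url;
}

// Breadth-first crawl of same-origin pages, up to `depth` links from baseUrl.
function crawlSite(baseUrl: string, getLinks: LinkFetcher, depth: number = 2): string[] {
  const start = normalise(baseUrl);
  const seen = new Set<string>([start.toString()]);
  const visited: string[] = [];
  let frontier: string[] = [start.toString()];

  for (let level = 0; level <= depth && frontier.length > 0; level++) {
    const next: string[] = [];
    for (const pageUrl of frontier) {
      visited.push(pageUrl);
      if (level === depth) continue; // at the depth limit: record but do not expand
      for (const href of getLinks(pageUrl)) {
        const target = normalise(href, pageUrl);
        const key = target.toString();
        if (target.origin !== start.origin || seen.has(key)) continue;
        seen.add(key);
        next.push(key);
      }
    }
    frontier = next;
  }
  return visited;
}
```

Restricting the crawl to the starting origin keeps an agent from wandering onto external sites, and the depth cap bounds the work on large sites.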
What Stays Unchanged also lists src/shared/types.ts.