feat(judge): judge SEES screenshots for real visual review #275
…esponses allow)

The browser pass captured screenshots but the judge never saw the pixels: @playwright/mcp omitted image responses, so browser_take_screenshot returned only a text file-link. The judge reviewed structure (accessibility snapshot) but was blind to the rendered visual.

- _augment_playwright_args injects --image-responses allow so screenshots come back inline; opus actually sees layout/spacing/contrast/overflow.
- Addendum now tells the judge to LOOK at the returned image and judge the visual (alignment, hierarchy, truncation, contrast, responsive breakage), holding the persona's 'would a real user be worse off' bar: genuine visual defects, not style nits.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Problem
The browser pass captured screenshots and posted them to the PR, but the judge never saw the pixels.
@playwright/mcp was omitting image responses (verified: zero image blocks reached the model), so browser_take_screenshot returned only a text file-link. The judge reviewed structure (accessibility snapshot) and computed styles, but was blind to the rendered visual, so it couldn't evaluate design.
Fix
_augment_playwright_args injects --image-responses allow → screenshots return inline → opus actually sees the rendered page.
Stacks on #273 (storage_state, authed browsing) and #274 (screenshot relocation).
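For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of what an argument-augmenting helper like this could look like. It is not the code in this PR: the function name augment_playwright_args, its signature, and the choice to override an existing --image-responses value (rather than leave it alone) are assumptions made for illustration. Only the --image-responses allow flag itself comes from the description above.

```python
from typing import List, Sequence

IMAGE_FLAG = "--image-responses"


def augment_playwright_args(args: Sequence[str], mode: str = "allow") -> List[str]:
    """Return a copy of `args` with `--image-responses <mode>` set exactly once.

    Hypothetical helper: any existing `--image-responses` setting is dropped
    and replaced, so the call is idempotent. The real helper may behave
    differently (for example, it may respect a caller-supplied value).
    """
    out: List[str] = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This token is the value of a previous `--image-responses` flag.
            skip_next = False
            continue
        if arg == IMAGE_FLAG:
            # Two-token form: `--image-responses omit`. Drop flag and value.
            skip_next = True
            continue
        if arg.startswith(IMAGE_FLAG + "="):
            # Single-token form: `--image-responses=omit`. Drop it.
            continue
        out.append(arg)
    out.extend([IMAGE_FLAG, mode])
    return out


if __name__ == "__main__":
    # Example MCP server launch args (illustrative values only).
    print(augment_playwright_args(["@playwright/mcp", "--headless"]))
```

Dropping and re-appending the flag keeps the result stable when the helper runs more than once on the same argument list, which matters if several layers each try to augment the launch command.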
🤖 Generated with Claude Code