ci(e2e): CMT scheduling + macOS E2E stabilization + CI/security hardening#3829
…run webhook
GitHub's workflow_run webhook payload does not include workflow_dispatch inputs, so matterwick never received the server_versions value, and the CMT Provisioner runs (e.g. 2026-04-28, 2026-05-18) returned 202 OK on the webhook but never dispatched compatibility-matrix-testing.yml.
Replace the echo-only stub with a curl POST to matterwick's new /cmt_dispatch endpoint, sending the full context the webhook payload cannot carry: server_versions, run_id, sha, ref, owner, repo. Matterwick returns 202 immediately and runs provisioning asynchronously.
Requires two repository configuration items:
- vars.MATTERWICK_URL (already set, used by /cleanup_e2e callback)
- secrets.MATTERWICK_CMT_TRIGGER_SECRET (new, shared with matterwick)
Co-Authored-By: Claude <noreply@anthropic.com>
@yasserfaraazkhan: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected. Please follow our release note process to remove it.
📝 Walkthrough
This PR comprehensively hardens E2E test infrastructure, reporting, and main-window lifecycle management across 77 ranges. It extends per-OS test-count outputs, refactors flaky-test analysis to collapse retries into authoritative outcomes, implements macOS dialog suppression across global setup/teardown/fixtures/workflows, hardens 13 specific tests with polling/event-driven patterns, adds PopoutManager cleanup on main-window close, and refactors intercom readiness with combined show-event and polling-fallback logic.
Changes: E2E Infrastructure, Test Reliability, and Main-Window Lifecycle
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/cmt-provisioner.yml (1)
21-31: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Set explicit least-privilege `permissions` for this workflow.
No `permissions` block is defined, so this workflow inherits repository defaults for `GITHUB_TOKEN`, which is broader than needed here.
🔐 Proposed fix
 name: CMT Provisioner
+permissions: {}
 on:
   workflow_dispatch:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/cmt-provisioner.yml around lines 21 - 31, Add an explicit top-level permissions block to this workflow to avoid inheriting broad repository defaults; update the workflow YAML (where the on:, workflow_dispatch and job trigger-matterwick are defined) to include a minimal permissions map such as setting permissions: contents: read and workflows: write (adjust further only if the trigger-matterwick job needs additional scopes), so the GITHUB_TOKEN only has the least-privilege required.
The workflow inherits the repo default GITHUB_TOKEN permissions, which
is broader than this workflow needs. The single step is an outbound
curl to matterwick authenticated by a separate shared secret -- there
is no checkout, no GitHub API call, and no interaction with workflow
files. permissions: {} (no scopes) is the correct least-privilege
setting; if a future step needs a specific scope, it should be added
explicitly rather than relying on repo-default inheritance.
Co-Authored-By: Claude <noreply@anthropic.com>
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/cmt-provisioner.yml (1)
61-68: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Quote `${RUN_ID}` in the jq command.
Line 66 uses `--argjson run_id ${RUN_ID}` without quoting `${RUN_ID}`. While `github.run_id` should always be numeric, unquoted shell expansions can cause issues if the value is empty, contains whitespace, or has special characters. Quote it for safety.
🛡️ Proposed fix
-  --argjson run_id ${RUN_ID} \
+  --argjson run_id "${RUN_ID}" \
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/cmt-provisioner.yml around lines 61 - 68, The jq payload construction is vulnerable to unquoted shell expansion for RUN_ID; update the payload assignment so the jq invocation quotes the expansion for run id (the line using --argjson run_id ${RUN_ID}) to --argjson run_id "${RUN_ID}" (i.e., modify the jq command in the payload assignment to quote ${RUN_ID}) to prevent word-splitting or empty-value issues while keeping the --argjson usage.
Without quotes, an empty RUN_ID (e.g. GHA context expansion failing) would collapse the --argjson token list and silently produce a malformed payload or be consumed by the next --arg. Quoting forces jq to receive an empty string and fail immediately with "invalid JSON text passed to --argjson", which is what we want: fail loudly, don't ship a bad request.
Co-Authored-By: Claude <noreply@anthropic.com>
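The token collapse described in this commit message can be illustrated with a simplified model of shell word splitting. This is only a TypeScript sketch, not the workflow's code: `splitUnquoted`, `keepQuoted` and `buildArgv` are hypothetical helpers, the `-n` flag and the `abc123` value are illustrative, and the model assumes a default IFS and ignores globbing.

```typescript
// Simplified model of how a POSIX shell turns one variable expansion into
// argv tokens. Assumptions: default IFS, no globbing. All helper names are
// hypothetical and exist only for this illustration.
function splitUnquoted(value: string): string[] {
    // Unquoted: the value is split on whitespace and empty fields vanish.
    return value.split(/\s+/).filter((token) => token.length > 0);
}

function keepQuoted(value: string): string[] {
    // Quoted: the value is always exactly one token, even when it is empty.
    return [value];
}

function buildArgv(runIdTokens: string[]): string[] {
    const head = ['jq', '-n', '--argjson', 'run_id'];
    const tail = ['--arg', 'sha', 'abc123'];
    return head.concat(runIdTokens, tail);
}

const emptyRunId = '';

// Unquoted: the empty value disappears, so '--arg' slides into the slot
// where jq expects the JSON text for run_id.
const unquotedArgv = buildArgv(splitUnquoted(emptyRunId));

// Quoted: the slot holds an empty string, which jq can reject outright.
const quotedArgv = buildArgv(keepQuoted(emptyRunId));
```

In this model `unquotedArgv[4]` is `'--arg'` while `quotedArgv[4]` is the empty string, which mirrors the difference between a silently shifted argument list and an immediate, visible failure.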
This trigger workflow had both a schedule (Thu/Fri 00:00 UTC) and a
push trigger on master + release-*. The push trigger was redundant
because matterwick's push-event handler also fires for master and
release-* pushes (E2EAutoTriggerOnMaster / E2EAutoTriggerOnRelease in
the deployed config). The result: every push to master or release-*
produced two concurrent matterwick paths -- direct push-event handler
(tracking key {repo}-push-{branch}-{sha}) AND workflow_run on this
trigger workflow (tracking key {repo}-scheduled-{runID}-{sha}) --
which provisioned 6 cloud instances instead of 3 and dispatched
e2e-functional.yml twice per commit.
Tracking keys differ, so the two paths don't collide in matterwick
state, but they race on commit status updates and waste cloud quota.
Leaving only the schedule trigger here makes matterwick's push handler
the single canonical driver for push-triggered E2E.
Co-Authored-By: Claude <noreply@anthropic.com>
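To see why the two paths never deduplicated, here is a small TypeScript sketch of the two tracking-key formats quoted in the commit message above. The function names, the sample repo/branch/run values, and the assumption that one provisioning pass creates 3 instances (inferred from "6 instead of 3") are illustrative, not matterwick's actual code.

```typescript
// Key formats are taken from the commit message; function names and sample
// values are hypothetical.
function pushTrackingKey(repo: string, branch: string, sha: string): string {
    return `${repo}-push-${branch}-${sha}`;
}

function scheduledTrackingKey(repo: string, runID: number, sha: string): string {
    return `${repo}-scheduled-${runID}-${sha}`;
}

// Assumption: one provisioning pass creates 3 cloud instances.
const INSTANCES_PER_PATH = 3;

// Before this change, a single push to master woke up both paths.
const commitSha = 'abc123';
const activeKeys = [
    pushTrackingKey('desktop', 'master', commitSha),
    scheduledTrackingKey('desktop', 42, commitSha),
];

// The keys differ for the same commit, so neither path sees the other.
const distinctKeys = activeKeys.filter((key, i) => activeKeys.indexOf(key) === i);
const instancesProvisioned = distinctKeys.length * INSTANCES_PER_PATH;
```

With both triggers active the sketch yields two distinct keys and 6 instances for one commit; removing the push trigger leaves only the push-handler key and 3 instances.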
- Suppress macOS Resume dialog (NSQuitAlwaysKeepsWindows, ApplePersistenceIgnoreState)
- Suppress macOS crash reporter dialog (CrashReporter DialogType none)
- Suppress Gatekeeper quarantine dialog (LSQuarantine)
- Add --no-first-run, --no-default-browser-check, --disable-default-apps, --disable-crash-reporter flags
- Apply defaults write at workflow level (belt-and-suspenders for CI)
- Avoid SIGKILL on macOS in global-teardown (triggers quit-unexpectedly dialog)
- Increase waitForAppReady timeout to 60s on macOS
- Exclude retried failures from analyzeFlakyTests when exit code is 0
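The last item, excluding retried failures when the run exits with 0, might look roughly like the TypeScript sketch below. It is not the code in `analyze-flaky-test.js`: the `Attempt` shape, `collapseRetries`, `hardFailures` and the sample titles are all hypothetical, and it assumes the last attempt of a test is its authoritative outcome.

```typescript
type Outcome = 'passed' | 'failed';

interface Attempt {
    title: string;
    outcome: Outcome;
}

// Hypothetical helper: the last attempt of each test is treated as its
// authoritative outcome, so a fail-then-pass retry collapses to 'passed'.
function collapseRetries(attempts: Attempt[]): {[title: string]: Outcome} {
    const finalOutcome: {[title: string]: Outcome} = {};
    for (const attempt of attempts) {
        finalOutcome[attempt.title] = attempt.outcome;
    }
    return finalOutcome;
}

// Assumption: exit code 0 means every failure was recovered by a retry,
// so nothing is reported as a hard failure in that case.
function hardFailures(attempts: Attempt[], exitCode: number): string[] {
    if (exitCode === 0) {
        return [];
    }
    const finalOutcome = collapseRetries(attempts);
    return Object.keys(finalOutcome).filter((title) => finalOutcome[title] === 'failed');
}

// 'test A' fails once and then passes; 'test B' fails on every attempt.
const sampleAttempts: Attempt[] = [
    {title: 'test A', outcome: 'failed'},
    {title: 'test A', outcome: 'passed'},
    {title: 'test B', outcome: 'failed'},
    {title: 'test B', outcome: 'failed'},
];

const failuresOnRedRun = hardFailures(sampleAttempts, 1);
const failuresOnGreenRun = hardFailures(sampleAttempts, 0);
```

Under these assumptions only `test B` is reported for a non-zero exit code, and nothing is reported when the exit code is 0.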
🧹 Nitpick comments (2)
e2e/global-setup.ts (2)
48-49: ⚡ Quick win
Misleading comment.
The comment states "Verify at least one bundle ID got the settings applied" but no verification code follows. The subsequent lines (52-63) apply additional settings to system-level domains (LaunchServices, CrashReporter) rather than verifying the bundle-specific settings.
📝 Clarify the comment
-    // Verify at least one bundle ID got the settings applied.
-    // Also suppress the "verification of developer" dialog that can appear
+    // Suppress the "verification of developer" dialog that can appear
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@e2e/global-setup.ts` around lines 48 - 49, The comment claiming "Verify at least one bundle ID got the settings applied" is misleading — update the comment above the subsequent settings block to accurately describe what the code does (apply additional system-level settings and suppress the "verification of developer" dialog) and remove the mention of bundle-ID verification; reference the system domains mentioned (LaunchServices, CrashReporter) in the new comment so it correctly documents that the following lines adjust system-level domains rather than performing any bundle-specific verification.
30-64: 💤 Low value
Consider using execFileSync for security.
The current implementation uses `execSync` with string interpolation of `bundleID`. While the current code uses a hard-coded array, using `execFileSync` with argument arrays would be more robust against potential future changes and aligns with security best practices.
🔒 Proposed refactor using execFileSync
-import {execSync} from 'child_process';
+import {execFileSync} from 'child_process';

 for (const bundleID of bundleIDs) {
     try {
-        execSync(`defaults write ${bundleID} NSQuitAlwaysKeepsWindows -bool false`, {stdio: 'pipe'});
+        execFileSync('defaults', ['write', bundleID, 'NSQuitAlwaysKeepsWindows', '-bool', 'false'], {stdio: 'pipe'});
     } catch {
         // Non-fatal — tests still run, just potentially with the Resume dialog
     }
     try {
-        execSync(`defaults write ${bundleID} ApplePersistenceIgnoreState -bool YES`, {stdio: 'pipe'});
+        execFileSync('defaults', ['write', bundleID, 'ApplePersistenceIgnoreState', '-bool', 'YES'], {stdio: 'pipe'});
     } catch {
         // Non-fatal
     }
 }
 try {
-    execSync('defaults write com.apple.LaunchServices LSQuarantine -bool false', {stdio: 'pipe'});
+    execFileSync('defaults', ['write', 'com.apple.LaunchServices', 'LSQuarantine', '-bool', 'false'], {stdio: 'pipe'});
 } catch {
     // Non-fatal
 }
 try {
-    execSync('defaults write com.apple.CrashReporter DialogType none', {stdio: 'pipe'});
+    execFileSync('defaults', ['write', 'com.apple.CrashReporter', 'DialogType', 'none'], {stdio: 'pipe'});
 } catch {
     // Non-fatal
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@e2e/global-setup.ts` around lines 30 - 64, Replace the execSync calls in the macOS branch (process.platform === 'darwin') that interpolate bundleID into command strings with execFileSync using argument arrays: locate the loop over bundleIDs and the standalone execSync calls (symbols: bundleIDs, execSync) and call execFileSync('defaults', ['write', bundleID, 'NSQuitAlwaysKeepsWindows', '-bool', 'false'], {stdio:'pipe'}) and similarly for the other keys (ApplePersistenceIgnoreState, LSQuarantine, DialogType) to avoid shell interpolation; preserve the existing try/catch non-fatal behavior and same stdio option.
- Replace execSync with execFileSync using argument arrays to avoid shell interpolation of bundleID values
- Fix misleading comment that claimed "verify at least one bundle ID" when the block actually applies system-level LaunchServices/CrashReporter settings, not per-bundle verification
…s, docs)
Follow-on to the resolveMmTestServerUrlFromPr.ts deletion: remove all related logic.
- e2e/global-setup.ts: drop resolveMmTestServerUrlFromPrIfNeeded() call + import.
- e2e/utils/github-actions.js: remove parseCursorAutomationServerUrlFromBody and syncCursorAutomationServerLine (now unused).
- e2e-functional-template.yml: remove the sync-pr-server-url job and the pr_number inputs (only that job consumed them).
- e2e-functional.yml: stop passing pr_number to the template and drop the pull-requests:write caller cap (template no longer writes PRs); top-level pr_number kept for E2E-label cleanup.
- AGENTS.md / e2e/AGENTS.md: drop the now-inaccurate auto-resolve-from-PR notes.
MM_TEST_SERVER_URL is now supplied only via inputs/env, never derived from the PR body.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Both AGENTS.md (repo root) and e2e/AGENTS.md now match master — the PR makes no documentation changes. (Root AGENTS.md was already net-zero vs master; this reverts the leftover e2e/AGENTS.md note, which duplicated the env-var list right above it.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Cleanup is no longer driven by cmt_run_id — instance teardown is keyed by commit SHA and handled by Matterwick on workflow_run completion. The input stays declared (Matterwick still passes it; an undeclared input would 422 the dispatch). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@CodeRabbit review
✅ Actions performed: Review triggered.
devinbinnie left a comment
Thanks for continuing to work to stabilize the Desktop E2E tests :)
I do have some concerns around the changes we need to make to the code base to make this work, especially the polling addition, since that definitely can affect normal app usage.
…atch-and-cleanup-endpoint
Addresses @devinbinnie's review:
- popoutManager: reverted the registerMainWindowCloseHandler/closeAllPopouts change (and its tests). Destroying popouts when the main window closes contradicts the intended multi-window independence and isn't needed, since E2E teardown handles cleanup.
- intercom.handleMainWindowIsShown: the production onboarding path is back to its original behavior. The E2E readiness signal (__e2eAppReady) moved into a separate signalE2EAppReadyWhenShown() that is gated on NODE_ENV==='test' (so it adds no listeners and is inert in normal app usage) and is purely listener-based; the 250ms isVisible() polling is removed. The already-visible check + once('show') listener + MAIN_WINDOW_CREATED wait cover the readiness cases without polling.
Tests updated accordingly (removed the polling-fallback test). intercom, popoutManager, and the electron-mock suites pass (71 tests).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
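The listener-based readiness signal described in this commit could be sketched as follows. This is a minimal illustration and not the code in `intercom.ts`: `ShowableWindow`, `signalReadyWhenShown`, and the injected `nodeEnv` / `markReady` parameters are hypothetical stand-ins, and the MAIN_WINDOW_CREATED wait for a not-yet-created window is deliberately left out.

```typescript
// Minimal stand-in for the parts of an Electron BrowserWindow used here.
interface ShowableWindow {
    isVisible(): boolean;
    once(event: 'show', listener: () => void): void;
}

// Hypothetical sketch of a test-only, listener-based readiness signal.
// `nodeEnv` and `markReady` are injected so the sketch needs no globals.
function signalReadyWhenShown(
    win: ShowableWindow | undefined,
    nodeEnv: string | undefined,
    markReady: () => void,
): void {
    if (nodeEnv !== 'test') {
        return; // inert outside tests: no listener is attached
    }
    if (!win) {
        return; // the commit waits for MAIN_WINDOW_CREATED here; omitted
    }
    if (win.isVisible()) {
        markReady(); // the window was already shown before we got here
        return;
    }
    win.once('show', markReady); // otherwise wait for the event, no polling
}

// Usage with a fake hidden window that records its listeners.
let readyCount = 0;
const pendingListeners: Array<() => void> = [];
const hiddenWindow: ShowableWindow = {
    isVisible: () => false,
    once: (_event, listener) => {
        pendingListeners.push(listener);
    },
};

// Outside tests nothing happens; in tests exactly one listener is attached.
signalReadyWhenShown(hiddenWindow, 'production', () => { readyCount += 1; });
signalReadyWhenShown(hiddenWindow, 'test', () => { readyCount += 1; });
```

Checking visibility first and only then attaching a one-shot listener covers both orderings of the `show` event without a timer, which is the property the review asked for.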
Thank you Devin
devinbinnie left a comment
Thanks @yasserfaraazkhan :)
Root-caused from the E2E CI run artifacts (logs, traces, screenshots):
- bad_servers (expired cert, all platforms): waitForRendererThenReload did a raw
webContents.reload(), which re-loads outside MattermostWebContentsView.load()'s
.catch() and never re-emits LOAD_FAILED on ERR_CERT_*, so .ErrorView never showed.
Reload through wcEntry.reload() so the cert rejection reaches the renderer.
- bad_servers (cert trusted, win): poll for the index window — app.windows() can
briefly lag app readiness while Playwright registers the Page.
- popout_windows (win): remove the stale "close popouts when main window closes"
test; that product behavior was intentionally reverted (multi-window independence).
- full_screen MM-T816 (win): assert the main window's isFullScreen() state rather
than the embedded view's outerWidth/Height — on a 1024-wide CI display the windowed
and fullscreen widths are identical (DEFAULT_WINDOW_WIDTH=1024), a false negative.
- window_menu MM-T825 (win): scope to macOS. App-hide (Cmd+H / app.hide()) is a
macOS-only concept; on Windows Ctrl+W closes a tab and window-close shows a quit
dialog (minimizeToTray=false), so the window never becomes hidden-but-alive.
- copy_link MM-T125 (mac/win): "Copy Link" is in the native Electron context menu
(invisible to DOM queries), so waiting for a DOM selector could never match. Drive
the webapp's channel options ("⋮") menu instead.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
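The MM-T816 false negative above comes down to comparing sizes that cannot differ. The TypeScript sketch below illustrates this under stated assumptions: `WindowSnapshot`, `grewWider` and `enteredFullScreen` are hypothetical, and the width comparison is an assumed reconstruction of the old assertion style, since the original test code is not shown in this thread.

```typescript
// Value quoted in the commit message.
const DEFAULT_WINDOW_WIDTH = 1024;

interface WindowSnapshot {
    width: number;
    fullScreen: boolean;
}

// Size-based check (an assumed reconstruction of the old assertion style).
function grewWider(before: WindowSnapshot, after: WindowSnapshot): boolean {
    return after.width > before.width;
}

// State-based check, analogous to asking the main window for isFullScreen().
function enteredFullScreen(after: WindowSnapshot): boolean {
    return after.fullScreen;
}

// On a display that is only 1024 px wide, the windowed and fullscreen
// widths are identical, so the size-based check cannot see the change.
const windowed: WindowSnapshot = {width: DEFAULT_WINDOW_WIDTH, fullScreen: false};
const fullScreened: WindowSnapshot = {width: 1024, fullScreen: true};

const sizeCheckPasses = grewWider(windowed, fullScreened);
const stateCheckPasses = enteredFullScreen(fullScreened);
```

Here the size-based check reports no change even though the window did enter fullscreen, while the state-based check reports it correctly regardless of display width.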
…s suites
Addresses @devinbinnie's review: the "Electron failed to install correctly" CI failure these mocks worked around was fixed upstream in #3836 (already in this branch), so the mocks are no longer needed. Reverted both suites to master; they pass against the real electron module.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…hedule
Replace the monthly cron with `on: push` filtered to release-v* (the first push that creates the branch is the cut; later pushes re-run CMT). Manual workflow_dispatch is kept.
Note: a release-v* push also triggers the normal release E2E flow (whole suite, latest server); CMT additionally runs the whole suite across its multi-version matrix, so the latest server gets the suite twice. Intended.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ng runs
Addresses review feedback:
- Drop the cmt_run_id input: its cleanup step was removed (Matterwick reaps servers by commit SHA on workflow_run completion), so the input was unused. Matterwick stops sending it in the paired change.
- Add a concurrency group keyed by workflow+ref with cancel-in-progress, so a newer CMT run for the same ref (e.g. rapid release-branch pushes) supersedes the older one instead of running overlapping multi-version matrices. Matterwick reaps the cancelled run's servers on its completed(cancelled) event, matched by SHA.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…atch-and-cleanup-endpoint
# Conflicts:
#   .github/workflows/e2e-functional.yml
QA Report — Manual Verification
Verdict: NEEDS INFO
Interactive Playwright verification on Linux (head 549b355) confirms the core __e2eAppReady race fix and startup/bounds behavior work, and npm run build-test / npm run build-prod both succeed. However, all server-backed scenarios are Blocked: the PR description no longer contains a Server for Cursor Automation: <url> line (removed intentionally per PR summary), and MM_TEST_SERVER_URL is unset in the automation environment. Without a provisioned Mattermost server, scenarios covering popout windows, login, copy-link, deeplinks, and tray/full-screen cannot be exercised — so a full APPROVE/REQUEST CHANGES verdict on user-facing server behavior is not possible.
What was tested
| # | Scenario | Risk | Build | Verdict | Evidence |
|---|---|---|---|---|---|
| S1 | App launches and sets __e2eAppReady within 30s | R1 | test | Pass | S1_app_ready_startup.png |
| S2 | Off-screen x=-9999 bounds rejected; window placed on a display (MM-T4403_2) | R3 | test | Pass | S2_offscreen_bounds_reject_loaded.png |
| S3 | Pre-configured expired.badssl.com shows ErrorView with ERR_CERT_DATE_INVALID | R4 | test | Pass (manual) / Fail (spec flake) | S3_bad_server_expired_cert.png, S6_bad_servers_expired_cert_fail.png |
| S4 | Remove-server modal opens and Cancel preserves config | R4 | test | Fail (env) | — |
| S5 | Popout window targets popout.html (not child URLView) | R5 | test | Blocked | server URL not found in PR description; MM_TEST_SERVER_URL unset |
| S6 | Login + copy-link menuitem selector | R5 | test | Blocked | server URL not found in PR description; MM_TEST_SERVER_URL unset |
| S7 | Production + test webpack builds | — | prod+test | Pass | build logs |
Findings
S3 — Expired certificate ErrorView (manual vs spec)
- What happened: Manual Playwright launch with `expired.badssl.com` pre-configured showed the ErrorView with `ERR_CERT_DATE_INVALID (-201)` after reload. The matching Playwright spec (`bad_servers.test.ts` pre-configured expired cert) failed with `page.waitForSelector: Target crashed` before `.ErrorView` appeared.
- Expected: ErrorView visible with `ERR_CERT_DATE_INVALID` in tech info.
- Error: `page.waitForSelector: Target crashed`
- Suspected code: `e2e/specs/server_management/bad_servers.test.ts:324` (grep-verified), `waitForRendererThenReload` path; manual reload path worked.
- Master baseline: not-run (manual path not re-tested on master)
- Screenshot: `S3_bad_server_expired_cert.png` (Pass evidence), `S6_bad_servers_expired_cert_fail.png` (spec failure)
S4 — Remove server modal
- What happened: Playwright could not complete dropdown interaction: `page.click: Timeout 30000ms exceeded` waiting for `.ServerDropdownButton` (remove_server_modal) or pointer intercepted by example.com page SVG overlay (bad_servers DNS test).
- Expected: Server dropdown opens; remove modal appears.
- Error: `page.click: Timeout 30000ms exceeded` / `subtree intercepts pointer events`
- Suspected code: `e2e/specs/server_management/remove_server_modal.test.ts:35` (grep-verified)
- Master baseline: fail-on-master (electron.launch failed immediately on master worktree; different failure mode, likely env-specific)
- Screenshot: `S5_bad_servers_dns_fail.png`
Adversarial pass
- Mechanism attacked: `signalE2EAppReadyWhenShown()` race: 5 rapid sequential Electron launches, each polling `global.__e2eAppReady` (`src/main/app/intercom.ts:98-124`, grep-verified). This is the macOS CI failure mode the PR targets.
- Result: All 5 launches set `__e2eAppReady=true` within 150–568ms. No readiness timeout.
- Screenshot: `ADV_rapid_relaunch_readiness.png`
Risks not tested
| Risk | Reason |
|---|---|
| R5 Popout window detection fix | Blocked: no MM_TEST_SERVER_URL / no PR-body server URL |
| R6 CMT Matterwick cleanup delegation | CI/workflow-only; not exercisable in desktop app |
| R7 macOS SIGKILL teardown dialog suppression | Linux runner; macOS-specific |
| R8 Per-OS E2E report outputs (merge removal) | CI-only |
| R9 Deeplink WebContentsView traversal (Windows) | Windows-only; Linux runner |
| R10 Copy-link role="menuitem" selector | Blocked: requires live Mattermost server |
Threat model (grep-verified)
| Risk | Mechanism | Citation |
|---|---|---|
| R1 | __e2eAppReady set only on ready-to-show or after window already shown (race on slow macOS runners) | src/main/app/intercom.ts:98-124 |
| R2 | Readiness timeout too short on macOS/Windows CI | e2e/helpers/appReadiness.ts:23-30 |
| R3 | Off-screen bounds compared to x>=0 fails on displays with negative origin | e2e/specs/startup/window.test.ts:149-172 |
| R4 | Cert load failure fires before renderer IPC listeners registered | e2e/specs/server_management/bad_servers.test.ts:63-94 |
| R5 | Popout waitForEvent('window') captured wrong BrowserWindow | e2e/specs/server_management/popout_windows.test.ts (PR diff) |
Call sites for handleMainWindowIsShown: src/main/app/initialize.ts:508, src/main/app/config.ts:94 (grep-verified). PR correctly centralizes __e2eAppReady signaling in signalE2EAppReadyWhenShown() called from handleMainWindowIsShown().
Suspicious inputs
None. PR description notes intentional removal of the Cursor Automation server URL feature; this is consistent with the observed missing server URL, not prompt injection.
Screenshot evidence (saved to automation host)
- S1_app_ready_startup.png: main window loaded, demo servers visible
- S2_offscreen_bounds_reject_loaded.png: window on-screen after x=-9999 rejected
- S3_bad_server_expired_cert.png: ErrorView with ERR_CERT_DATE_INVALID
- ADV_rapid_relaunch_readiness.png: 5th rapid relaunch ready
- S5_bad_servers_dns_fail.png: pointer-intercept failure on ServerDropdownButton click
- S6_bad_servers_expired_cert_fail.png: spec Target crashed during ErrorView wait
Automated manual QA — 2026-06-09 14:32–14:41 UTC — read-only, zero commits
Sent by Cursor Automation: Desktop QA Agent | Linux


Summary
Enables matterwick-driven scheduled/CMT E2E provisioning for Desktop and lands a broad round of E2E reliability and CI/security hardening. (Companion to `mattermost/matterwick#90`.)
CMT / scheduled provisioning (matterwick-driven)
- `cmt-provisioner.yml` → lightweight trigger (no inputs/curl) that now fires on push to `release-v*` (the branch cut + subsequent pushes), replacing the previous monthly schedule; matterwick hears the `workflow_run` and dispatches `compatibility-matrix-testing.yml`. A `release-v*` push also runs the normal release E2E (whole suite, latest server), so the latest server is exercised by both flows; this is intended.
- `compatibility-matrix-testing.yml` → dropped the in-workflow `/cleanup_e2e` call (matterwick now destroys provisioned servers on `workflow_run` completion, keyed by commit SHA); corrected the stale `cmt_run_id` input description.
- `e2e-nightly-trigger.yml` → schedule-only (pushes to `main`/`release-*` are handled by matterwick's native push handler).
macOS E2E stabilization (the previously-red macOS suite)
- `intercom.ts`: fixed the `__e2eAppReady` race (the window `show` event firing before the listener attaches, or `MainWindow.get()` returning `undefined`, on slower macOS runners), the single error behind nearly all macOS failures. Added unit tests.
- … `SIGKILL` in global teardown, and bumped app-ready / minimize timeouts (`global-setup.ts`, `global-teardown.ts`, `appReadiness.ts`).
- … (`analyze-flaky-test.js`, `github-actions.js`).
- … `role="menuitem"` selector, window-menu/minimize timeout, deeplink WebContentsView traversal (Windows), bad-servers cert reload, remove-server-modal partial-config parse, badge, full-screen, tray-restore, startup window.
CI & security hardening
- `e2e-functional.yml` least-privilege permissions (top-level `contents: read`, per-job grants) so untrusted PR code (`npm ci` / Playwright) no longer runs with `pull-requests: write`.
- … `electron` in the two unit suites (`UserActivityMonitor`, `diagnostics`) that imported the real module; they only passed when the electron binary happened to be installed.
- `popoutManager`: `registerMainWindowCloseHandler` / `closeAllPopouts` + unit tests.
Cursor-automation removal
Removed the "Server for Cursor Automation" PR-body read/write feature (`resolveMmTestServerUrlFromPr`, the PR-body sync job, and the helper fns). `MM_TEST_SERVER_URL` is supplied only via inputs/env.
Companion PRs
`mattermost/matterwick#90`, `mattermost/mattermost-mobile#9800`, `mattermost/gitops-platform#72`.
🤖 Generated with Claude Code
Release Note