ci(e2e): CMT scheduling + macOS E2E stabilization + CI/security hardening by yasserfaraazkhan · Pull Request #3829 · mattermost/desktop

yasserfaraazkhan · 2026-05-25T04:18:18Z

Summary

Enables matterwick-driven scheduled/CMT E2E provisioning for Desktop and lands a broad round of E2E reliability and CI/security hardening. (Companion to mattermost/matterwick#90.)

CMT / scheduled provisioning (matterwick-driven)

cmt-provisioner.yml → lightweight trigger (no inputs/curl) that now fires on push to release-v* (the branch cut + subsequent pushes), replacing the previous monthly schedule; matterwick hears the workflow_run and dispatches compatibility-matrix-testing.yml. A release-v* push also runs the normal release E2E (whole suite, latest server), so the latest server is exercised by both flows — intended.
compatibility-matrix-testing.yml → dropped the in-workflow /cleanup_e2e call (matterwick now destroys provisioned servers on workflow_run completion, keyed by commit SHA); corrected the stale cmt_run_id input description.
e2e-nightly-trigger.yml → schedule-only (pushes to main/release-* are handled by matterwick's native push handler).

macOS E2E stabilization (the previously-red macOS suite)

intercom.ts: fixed the __e2eAppReady race (the window show event firing before the listener attaches, or MainWindow.get() returning undefined, on slower macOS runners) — the single error behind nearly all macOS failures. Added unit tests.
Suppressed the macOS Resume / crash-reporter / Gatekeeper dialogs, stopped sending SIGKILL in global teardown, and bumped app-ready / minimize timeouts (global-setup.ts, global-teardown.ts, appReadiness.ts).
Per-OS reporting: each platform uploads its own HTML report; removed the cross-OS merge that produced misleading aggregate counts. Flaky-test accounting fixed so a retried-then-passed test isn't counted as a failure (analyze-flaky-test.js, github-actions.js).
Targeted spec fixes: popout (was capturing the child URLView instead of the popout window), copy-link role="menuitem" selector, window-menu/minimize timeout, deeplink WebContentsView traversal (Windows), bad-servers cert reload, remove-server-modal partial-config parse, badge, full-screen, tray-restore, startup window.

CI & security hardening

e2e-functional.yml least-privilege permissions (top-level contents: read, per-job grants) so untrusted PR code (npm ci / Playwright) no longer runs with pull-requests: write.
Flaky-CI fix: mocked electron in the two unit suites (UserActivityMonitor, diagnostics) that imported the real module — they only passed when the electron binary happened to be installed.
popoutManager: registerMainWindowCloseHandler / closeAllPopouts + unit tests.

Cursor-automation removal

Removed the "Server for Cursor Automation" PR-body read/write feature (resolveMmTestServerUrlFromPr, the PR-body sync job, and the helper fns). MM_TEST_SERVER_URL is supplied only via inputs/env.

Companion PRs

mattermost/matterwick#90, mattermost/mattermost-mobile#9800, mattermost/gitops-platform#72.

🤖 Generated with Claude Code

Release Note

NONE

…run webhook GitHub's workflow_run webhook payload does not include workflow_dispatch inputs, so matterwick was never receiving the server_versions value and the CMT Provisioner runs (e.g. 2026-04-28, 2026-05-18) returned 202 OK on the webhook but never dispatched compatibility-matrix-testing.yml. Replace the echo-only stub with a curl POST to matterwick's new /cmt_dispatch endpoint, sending the full context the webhook payload cannot carry: server_versions, run_id, sha, ref, owner, repo. Matterwick returns 202 immediately and runs provisioning asynchronously. Requires two repository configuration items: - vars.MATTERWICK_URL (already set, used by /cleanup_e2e callback) - secrets.MATTERWICK_CMT_TRIGGER_SECRET (new, shared with matterwick) Co-Authored-By: Claude <noreply@anthropic.com>

mm-cloud-bot · 2026-05-25T04:18:22Z

@yasserfaraazkhan: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Details

I understand the commands that are listed here

coderabbitai · 2026-05-25T04:18:34Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

This PR comprehensively hardens E2E test infrastructure, reporting, and main-window lifecycle management across 77 ranges. It extends per-OS test-count outputs, refactors flaky-test analysis to collapse retries into authoritative outcomes, implements macOS dialog suppression across global setup/teardown/fixtures/workflows, hardens 13 specific tests with polling/event-driven patterns, adds PopoutManager cleanup on main-window close, and refactors intercom readiness with combined show-event and polling-fallback logic.

Changes

E2E Infrastructure, Test Reliability, and Main-Window Lifecycle

Layer / File(s)	Summary
CMT provisioner and compatibility-matrix workflow updates `.github/workflows/cmt-provisioner.yml`, `.github/workflows/compatibility-matrix-testing.yml`	Converts cmt-provisioner to scheduled/dispatchable trigger without manual inputs, documents Matterwick event-driven dispatch, and delegates instance cleanup to external Matterwick handler keyed by commit SHA.
E2E functional workflow permissions and job dependencies `.github/workflows/e2e-functional.yml`	Reduces workflow permissions to least-privilege `contents: read`, removes merged-report dependency, adds Windows runner pinning with Node-gyp comment, updates pr_number input description, and adds macOS suppress-dialogs step to main workflow.
E2E functional template outputs and reporting wiring `.github/workflows/e2e-functional-template.yml`	Extends workflow-call and job outputs to include per-OS PASSED/TOTAL/SKIPPED test counts, adds HTML report generation from blob-report, uploads to S3 with per-OS report URLs, and updates analyze-flaky-tests step to emit expanded per-OS outputs.
Flaky test analysis improvements `e2e/utils/analyze-flaky-test.js`	Rewrites failure counting to collapse testcase retries into base-name outcomes via regex matching, introduces getOutcomeCounts helper with pass/fail/skip dominance rules, and reconciles pass/total/skip counts against authoritative failure count for per-OS outputs.
Status description and PR reporting utilities `e2e/utils/github-actions.js`	Adds formatStatusDescription helper to compute commit-status text from passed/failed counts, refactors updateFinalStatus to use per-OS outputs and REPORT_LINK_* fallback without merged-report URL, and exports new utility.
macOS dialog suppression infrastructure `e2e/fixtures/index.ts`, `e2e/global-setup.ts`, `e2e/global-teardown.ts`, `.github/workflows/e2e-functional-template.yml`, `.github/workflows/e2e-functional.yml`	Electron fixture args disable first-run/default-apps/crash-reporter, global-setup snapshots/applies macOS defaults across bundleIDs (NSQuitAlwaysKeepsWindows, LSQuarantine, crash DialogType), global-teardown restores snapshot and adjusts graceful timeouts/SIGKILL behavior, and workflow template adds suppress-dialogs step.
E2E test timeout and app readiness configuration `e2e/playwright.config.ts`, `e2e/helpers/appReadiness.ts`	Increases Playwright timeout to 90_000 with Windows launch comments, computes platform-specific app readiness timeout (60s darwin/win32, 30s others), and expands timeout failure message with troubleshooting guidance.
Deep linking and copy-link test fixes `e2e/specs/deep_linking/deeplink.test.ts`, `e2e/specs/mattermost/copy_link.test.ts`	Deep-link test uses mattermost-dev:// protocol, polls server map per iteration, and checks window URL; copy-link test dispatches explicit contextmenu event and broadens selector to match both text variants and role-based containers.
Fullscreen and minimize window tests `e2e/specs/menu_bar/full_screen.test.ts`, `e2e/specs/menu_bar/window_menu.test.ts`	Fullscreen test applies demoMattermostConfig, targets specific BrowserWindow via focus/testRefs/first-available, clicks native menu item, and waits on enter/leave-full-screen events; minimize test calls BrowserWindow.minimize() and polls isMinimized().
Notification badge and server management tests `e2e/specs/notification_trigger/notification_badge_windows_linux.test.ts`, `e2e/specs/server_management/bad_servers.test.ts`, `e2e/specs/server_management/popout_windows.test.ts`	Badge trigger adds 15s polling loop with transient-error retry for global.__testTriggerBadge registration, beforeEach adds 10s retry loop for unread-badge-setting registration; bad_servers reloads all server views; popout_windows snapshots windows and polls for popout.html, uses native menu to open popouts, and sets willAppQuit before main-window close.
Window bounds, tray restore, and defensive file operations `e2e/specs/startup/window.test.ts`, `e2e/specs/system/tray_restore.test.ts`, `e2e/specs/server_management/remove_server_modal.test.ts`	Startup window tests replace non-negative bounds assertions with center-point-in-workArea checks to handle negative display origins; tray restore uses expect.poll for visibility; remove-server-modal wraps config.json reads in try/catch within polling callback.
PopoutManager main-window close cleanup `src/app/windows/popoutManager.ts`, `src/app/windows/popoutManager.test.js`	Adds MAIN_WINDOW_CREATED listener registration, attaches one-time closed handler on main window, and implements closeAllPopouts to invoke cleanup listeners, destroy browser windows, and clear tracking maps; includes comprehensive test coverage for registration, destruction, and integration via event trigger.
Intercom app readiness refactor and tests `src/main/app/intercom.ts`, `src/main/app/intercom.test.js`	Replaces isVisible/show branching with attachWindowListeners helper that marks readiness immediately if visible, otherwise waits for show event and polls isVisible every 250ms; ensures single markReady execution via done guard; defers to MAIN_WINDOW_CREATED if window not yet available; comprehensive test coverage with makeWindow mock helper and scenarios for visible-state, show-event, polling-fallback, and late-main-window construction.
Test infrastructure electron mocking `src/main/UserActivityMonitor.test.js`, `src/main/diagnostics/index.test.js`	Adds jest.mock('electron') blocks to UserActivityMonitor.test.js and diagnostics/index.test.js to prevent real Electron loading in CI and provide deterministic mock implementations of app.isReady, powerMonitor.getSystemIdleTime, shell, session, systemPreferences, and Notification.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

mattermost/desktop#3834: Implements overlapping E2E reporting and status-description refactors in e2e/utils/github-actions.js and matching workflow changes removing merged-report handling.

Suggested reviewers

devinbinnie

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.81% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The title accurately summarizes the main changes: CMT scheduling shift, macOS E2E stabilization, and CI/security hardening are all substantive themes across the changeset.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/cmt-direct-dispatch-and-cleanup-endpoint

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.github/workflows/cmt-provisioner.yml (1)
21-31: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Set explicit least-privilege permissions for this workflow.

No permissions block is defined, so this workflow inherits repository defaults for GITHUB_TOKEN, which is broader than needed here.
🔐 Proposed fix
 name: CMT Provisioner
+permissions: {}
 
 on:
   workflow_dispatch:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/cmt-provisioner.yml around lines 21 - 31, Add an explicit
top-level permissions block to this workflow to avoid inheriting broad
repository defaults; update the workflow YAML (where the on:, workflow_dispatch
and job trigger-matterwick are defined) to include a minimal permissions map
such as setting permissions: contents: read and workflows: write (adjust further
only if the trigger-matterwick job needs additional scopes), so the GITHUB_TOKEN
only has the least-privilege required.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In @.github/workflows/cmt-provisioner.yml:
- Around line 21-31: Add an explicit top-level permissions block to this
workflow to avoid inheriting broad repository defaults; update the workflow YAML
(where the on:, workflow_dispatch and job trigger-matterwick are defined) to
include a minimal permissions map such as setting permissions: contents: read
and workflows: write (adjust further only if the trigger-matterwick job needs
additional scopes), so the GITHUB_TOKEN only has the least-privilege required.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 33b71075-cd7f-4654-978e-a49a6b70c36d

📥 Commits

Reviewing files that changed from the base of the PR and between e699409 and fe9514e.

📒 Files selected for processing (1)

.github/workflows/cmt-provisioner.yml

The workflow inherits the repo default GITHUB_TOKEN permissions, which is broader than this workflow needs. The single step is an outbound curl to matterwick authenticated by a separate shared secret -- there is no checkout, no GitHub API call, and no interaction with workflow files. permissions: {} (no scopes) is the correct least-privilege setting; if a future step needs a specific scope, it should be added explicitly rather than relying on repo-default inheritance. Co-Authored-By: Claude <noreply@anthropic.com>

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.github/workflows/cmt-provisioner.yml (1)
61-68: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Quote ${RUN_ID} in the jq command.

Line 66 uses --argjson run_id ${RUN_ID} without quoting ${RUN_ID}. While github.run_id should always be numeric, unquoted shell expansions can cause issues if the value is empty, contains whitespace, or has special characters. Quote it for safety.
🛡️ Proposed fix
-            --argjson run_id ${RUN_ID} \
+            --argjson run_id "${RUN_ID}" \
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/cmt-provisioner.yml around lines 61 - 68, The jq payload
construction is vulnerable to unquoted shell expansion for RUN_ID; update the
payload assignment so the jq invocation quotes the expansion for run id (the
line using --argjson run_id ${RUN_ID}) to --argjson run_id "${RUN_ID}" (i.e.,
modify the jq command in the payload assignment to quote ${RUN_ID}) to prevent
word-splitting or empty-value issues while keeping the --argjson usage.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In @.github/workflows/cmt-provisioner.yml:
- Around line 61-68: The jq payload construction is vulnerable to unquoted shell
expansion for RUN_ID; update the payload assignment so the jq invocation quotes
the expansion for run id (the line using --argjson run_id ${RUN_ID}) to
--argjson run_id "${RUN_ID}" (i.e., modify the jq command in the payload
assignment to quote ${RUN_ID}) to prevent word-splitting or empty-value issues
while keeping the --argjson usage.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 087195d9-68cf-4e38-852e-8395a205a823

📥 Commits

Reviewing files that changed from the base of the PR and between fe9514e and c41490b.

📒 Files selected for processing (1)

.github/workflows/cmt-provisioner.yml

Without quotes, an empty RUN_ID (e.g. GHA context expansion failing) would collapse the --argjson token list and silently produce a malformed payload or be consumed by the next --arg. Quoting forces jq to receive an empty string and fail immediately with "invalid JSON text passed to --argjson", which is what we want — fail loudly, don't ship a bad request. Co-Authored-By: Claude <noreply@anthropic.com>

This trigger workflow had both a schedule (Thu/Fri 00:00 UTC) and a push trigger on master + release-*. The push trigger was redundant because matterwick's push-event handler also fires for master and release-* pushes (E2EAutoTriggerOnMaster / E2EAutoTriggerOnRelease in the deployed config). The result: every push to master or release-* produced two concurrent matterwick paths -- direct push-event handler (tracking key {repo}-push-{branch}-{sha}) AND workflow_run on this trigger workflow (tracking key {repo}-scheduled-{runID}-{sha}) -- which provisioned 6 cloud instances instead of 3 and dispatched e2e-functional.yml twice per commit. Tracking keys differ, so the two paths don't collide in matterwick state, but they race on commit status updates and waste cloud quota. Leaving only the schedule trigger here makes matterwick's push handler the single canonical driver for push-triggered E2E. Co-Authored-By: Claude <noreply@anthropic.com>

- Suppress macOS Resume dialog (NSQuitAlwaysKeepsWindows, ApplePersistenceIgnoreState) - Suppress macOS crash reporter dialog (CrashReporter DialogType none) - Suppress Gatekeeper quarantine dialog (LSQuarantine) - Add --no-first-run, --no-default-browser-check, --disable-default-apps, --disable-crash-reporter flags - Apply defaults write at workflow level (belt-and-suspenders for CI) - Avoid SIGKILL on macOS in global-teardown (triggers quit-unexpectedly dialog) - Increase waitForAppReady timeout to 60s on macOS - Exclude retried failures from analyzeFlakyTests when exit code is 0

…d-comment

coderabbitai

🧹 Nitpick comments (2)

e2e/global-setup.ts (2)

48-49: ⚡ Quick win

Misleading comment.

The comment states "Verify at least one bundle ID got the settings applied" but no verification code follows. The subsequent lines (52-63) apply additional settings to system-level domains (LaunchServices, CrashReporter) rather than verifying the bundle-specific settings.

📝 Clarify the comment

-        // Verify at least one bundle ID got the settings applied.
-        // Also suppress the "verification of developer" dialog that can appear
+        // Suppress the "verification of developer" dialog that can appear

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/global-setup.ts` around lines 48 - 49, The comment claiming "Verify at
least one bundle ID got the settings applied" is misleading — update the comment
above the subsequent settings block to accurately describe what the code does
(apply additional system-level settings and suppress the "verification of
developer" dialog) and remove the mention of bundle-ID verification; reference
the system domains mentioned (LaunchServices, CrashReporter) in the new comment
so it correctly documents that the following lines adjust system-level domains
rather than performing any bundle-specific verification.

30-64: 💤 Low value

Consider using execFileSync for security.

The current implementation uses execSync with string interpolation of bundleID. While the current code uses a hard-coded array, using execFileSync with argument arrays would be more robust against potential future changes and aligns with security best practices.

🔒 Proposed refactor using execFileSync

-import {execSync} from 'child_process';
+import {execFileSync} from 'child_process';

         for (const bundleID of bundleIDs) {
             try {
-                execSync(`defaults write ${bundleID} NSQuitAlwaysKeepsWindows -bool false`, {stdio: 'pipe'});
+                execFileSync('defaults', ['write', bundleID, 'NSQuitAlwaysKeepsWindows', '-bool', 'false'], {stdio: 'pipe'});
             } catch {
                 // Non-fatal — tests still run, just potentially with the Resume dialog
             }
             try {
-                execSync(`defaults write ${bundleID} ApplePersistenceIgnoreState -bool YES`, {stdio: 'pipe'});
+                execFileSync('defaults', ['write', bundleID, 'ApplePersistenceIgnoreState', '-bool', 'YES'], {stdio: 'pipe'});
             } catch {
                 // Non-fatal
             }
         }
 
         try {
-            execSync('defaults write com.apple.LaunchServices LSQuarantine -bool false', {stdio: 'pipe'});
+            execFileSync('defaults', ['write', 'com.apple.LaunchServices', 'LSQuarantine', '-bool', 'false'], {stdio: 'pipe'});
         } catch {
             // Non-fatal
         }
 
         try {
-            execSync('defaults write com.apple.CrashReporter DialogType none', {stdio: 'pipe'});
+            execFileSync('defaults', ['write', 'com.apple.CrashReporter', 'DialogType', 'none'], {stdio: 'pipe'});
         } catch {
             // Non-fatal
         }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/global-setup.ts` around lines 30 - 64, Replace the execSync calls in the
macOS branch (process.platform === 'darwin') that interpolate bundleID into
command strings with execFileSync using argument arrays: locate the loop over
bundleIDs and the standalone execSync calls (symbols: bundleIDs, execSync) and
call execFileSync('defaults', ['write', bundleID, 'NSQuitAlwaysKeepsWindows',
'-bool', 'false'], {stdio:'pipe'}) and similarly for the other keys
(ApplePersistenceIgnoreState, LSQuarantine, DialogType) to avoid shell
interpolation; preserve the existing try/catch non-fatal behavior and same stdio
option.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@e2e/global-setup.ts`:
- Around line 48-49: The comment claiming "Verify at least one bundle ID got the
settings applied" is misleading — update the comment above the subsequent
settings block to accurately describe what the code does (apply additional
system-level settings and suppress the "verification of developer" dialog) and
remove the mention of bundle-ID verification; reference the system domains
mentioned (LaunchServices, CrashReporter) in the new comment so it correctly
documents that the following lines adjust system-level domains rather than
performing any bundle-specific verification.
- Around line 30-64: Replace the execSync calls in the macOS branch
(process.platform === 'darwin') that interpolate bundleID into command strings
with execFileSync using argument arrays: locate the loop over bundleIDs and the
standalone execSync calls (symbols: bundleIDs, execSync) and call
execFileSync('defaults', ['write', bundleID, 'NSQuitAlwaysKeepsWindows',
'-bool', 'false'], {stdio:'pipe'}) and similarly for the other keys
(ApplePersistenceIgnoreState, LSQuarantine, DialogType) to avoid shell
interpolation; preserve the existing try/catch non-fatal behavior and same stdio
option.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 70033d38-66b6-4678-a46a-f5d9fc89d10a

📥 Commits

Reviewing files that changed from the base of the PR and between e2e49b9 and cd36811.

📒 Files selected for processing (7)

.github/workflows/e2e-functional-template.yml
.github/workflows/e2e-functional.yml
e2e/fixtures/index.ts
e2e/global-setup.ts
e2e/global-teardown.ts
e2e/helpers/appReadiness.ts
e2e/utils/analyze-flaky-test.js

- Replace execSync with execFileSync using argument arrays to avoid shell interpolation of bundleID values - Fix misleading comment that claimed "verify at least one bundle ID" when the block actually applies system-level LaunchServices/CrashReporter settings, not per-bundle verification

…s, docs) Follow-on to the resolveMmTestServerUrlFromPr.ts deletion: remove all related logic. - e2e/global-setup.ts: drop resolveMmTestServerUrlFromPrIfNeeded() call + import. - e2e/utils/github-actions.js: remove parseCursorAutomationServerUrlFromBody and syncCursorAutomationServerLine (now unused). - e2e-functional-template.yml: remove the sync-pr-server-url job and the pr_number inputs (only that job consumed them). - e2e-functional.yml: stop passing pr_number to the template and drop the pull-requests:write caller cap (template no longer writes PRs); top-level pr_number kept for E2E-label cleanup. - AGENTS.md / e2e/AGENTS.md: drop the now-inaccurate auto-resolve-from-PR notes. MM_TEST_SERVER_URL is now supplied only via inputs/env, never derived from the PR body. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Both AGENTS.md (repo root) and e2e/AGENTS.md now match master — the PR makes no documentation changes. (Root AGENTS.md was already net-zero vs master; this reverts the leftover e2e/AGENTS.md note, which duplicated the env-var list right above it.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Cleanup is no longer driven by cmt_run_id — instance teardown is keyed by commit SHA and handled by Matterwick on workflow_run completion. The input stays declared (Matterwick still passes it; an undeclared input would 422 the dispatch). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

yasserfaraazkhan · 2026-06-02T10:07:39Z

@CodeRabbit review

coderabbitai · 2026-06-02T10:07:46Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

devinbinnie

Thanks for continuing to work to stabilize the Desktop E2E tests :)

I do have some concerns around changing we need to make to the code base to make this work - especially the polling addition, since that definitely can affect normal app usage.

…atch-and-cleanup-endpoint

@devinbinnie

Addresses @devinbinnie's review: - popoutManager: reverted the registerMainWindowCloseHandler/closeAllPopouts change (and its tests). Destroying popouts when the main window closes contradicts the intended multi-window independence, and isn't needed — E2E teardown handles cleanup. - intercom.handleMainWindowIsShown: the production onboarding path is back to its original behavior. The E2E readiness signal (__e2eAppReady) moved into a separate signalE2EAppReadyWhenShown() that is gated on NODE_ENV==='test' (so it adds no listeners and is inert in normal app usage) and is purely listener-based — the 250ms isVisible() polling is removed. The already-visible check + once('show') listener + MAIN_WINDOW_CREATED wait cover the readiness cases without polling. Tests updated accordingly (removed the polling-fallback test). intercom, popoutManager, and the electron-mock suites pass (71 tests). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

yasserfaraazkhan · 2026-06-04T08:54:12Z

Thank you Devin
I made the changes to only have it behind E2E variable.

devinbinnie

Thanks @yasserfaraazkhan :)

Root-caused from the E2E CI run artifacts (logs, traces, screenshots): - bad_servers (expired cert, all platforms): waitForRendererThenReload did a raw webContents.reload(), which re-loads outside MattermostWebContentsView.load()'s .catch() and never re-emits LOAD_FAILED on ERR_CERT_*, so .ErrorView never showed. Reload through wcEntry.reload() so the cert rejection reaches the renderer. - bad_servers (cert trusted, win): poll for the index window — app.windows() can briefly lag app readiness while Playwright registers the Page. - popout_windows (win): remove the stale "close popouts when main window closes" test; that product behavior was intentionally reverted (multi-window independence). - full_screen MM-T816 (win): assert the main window's isFullScreen() state rather than the embedded view's outerWidth/Height — on a 1024-wide CI display the windowed and fullscreen widths are identical (DEFAULT_WINDOW_WIDTH=1024), a false negative. - window_menu MM-T825 (win): scope to macOS. App-hide (Cmd+H / app.hide()) is a macOS-only concept; on Windows Ctrl+W closes a tab and window-close shows a quit dialog (minimizeToTray=false), so the window never becomes hidden-but-alive. - copy_link MM-T125 (mac/win): "Copy Link" is in the native Electron context menu (invisible to DOM queries), so waiting for a DOM selector could never match. Drive the webapp's channel options ("⋮") menu instead. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@devinbinnie

…s suites Addresses @devinbinnie's review: the "Electron failed to install correctly" CI failure these mocks worked around was fixed upstream in #3836 (already in this branch), so the mocks are no longer needed. Reverted both suites to master; they pass against the real electron module. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…hedule Replace the monthly cron with `on: push` filtered to release-v* (the first push that creates the branch is the cut; later pushes re-run CMT). Manual workflow_dispatch is kept. Note: a release-v* push also triggers the normal release E2E flow (whole suite, latest server); CMT additionally runs the whole suite across its multi-version matrix, so the latest server gets the suite twice. Intended. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ng runs Addresses review feedback: - Drop the cmt_run_id input — its cleanup step was removed (Matterwick reaps servers by commit SHA on workflow_run completion), so the input was unused. Matterwick stops sending it in the paired change. - Add a concurrency group keyed by workflow+ref with cancel-in-progress, so a newer CMT run for the same ref (e.g. rapid release-branch pushes) supersedes the older one instead of running overlapping multi-version matrices. Matterwick reaps the cancelled run's servers on its completed(cancelled) event, matched by SHA. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…atch-and-cleanup-endpoint # Conflicts: # .github/workflows/e2e-functional.yml

cursor

QA Report — Manual Verification

Verdict: NEEDS INFO

Interactive Playwright verification on Linux (head 549b355) confirms the core __e2eAppReady race fix and startup/bounds behavior work, and npm run build-test / npm run build-prod both succeed. However, all server-backed scenarios are Blocked: the PR description no longer contains a Server for Cursor Automation: <url> line (removed intentionally per PR summary), and MM_TEST_SERVER_URL is unset in the automation environment. Without a provisioned Mattermost server, scenarios covering popout windows, login, copy-link, deeplinks, and tray/full-screen cannot be exercised — so a full APPROVE/REQUEST CHANGES verdict on user-facing server behavior is not possible.

What was tested

#	Scenario	Risk	Build	Verdict	Evidence
S1	App launches and sets `__e2eAppReady` within 30s	R1	test	Pass	`S1_app_ready_startup.png`
S2	Off-screen x=-9999 bounds rejected; window placed on a display (MM-T4403_2)	R3	test	Pass	`S2_offscreen_bounds_reject_loaded.png`
S3	Pre-configured expired.badssl.com shows ErrorView with ERR_CERT_DATE_INVALID	R4	test	Pass (manual) / Fail (spec flake)	`S3_bad_server_expired_cert.png`, `S6_bad_servers_expired_cert_fail.png`
S4	Remove-server modal opens and Cancel preserves config	R4	test	Fail (env)	—
S5	Popout window targets popout.html (not child URLView)	R5	test	Blocked	server URL not found in PR description; `MM_TEST_SERVER_URL` unset
S6	Login + copy-link menuitem selector	R5	test	Blocked	server URL not found in PR description; `MM_TEST_SERVER_URL` unset
S7	Production + test webpack builds	—	prod+test	Pass	build logs

Findings

S3 — Expired certificate ErrorView (manual vs spec)

What happened: Manual Playwright launch with expired.badssl.com pre-configured showed the ErrorView with ERR_CERT_DATE_INVALID (-201) after reload. The matching Playwright spec (bad_servers.test.ts pre-configured expired cert) failed with page.waitForSelector: Target crashed before .ErrorView appeared.
Expected: ErrorView visible with ERR_CERT_DATE_INVALID in tech info.
Error: page.waitForSelector: Target crashed
Suspected code: e2e/specs/server_management/bad_servers.test.ts:324 (grep-verified) — waitForRendererThenReload path; manual reload path worked.
Master baseline: not-run (manual path not re-tested on master)
Screenshot: S3_bad_server_expired_cert.png (Pass evidence), S6_bad_servers_expired_cert_fail.png (spec failure)

S4 — Remove server modal

What happened: Playwright could not complete dropdown interaction — page.click: Timeout 30000ms exceeded waiting for .ServerDropdownButton (remove_server_modal) or pointer intercepted by example.com page SVG overlay (bad_servers DNS test).
Expected: Server dropdown opens; remove modal appears.
Error: page.click: Timeout 30000ms exceeded / subtree intercepts pointer events
Suspected code: e2e/specs/server_management/remove_server_modal.test.ts:35 (grep-verified)
Master baseline: fail-on-master (electron.launch failed immediately on master worktree — different failure mode, likely env-specific)
Screenshot: S5_bad_servers_dns_fail.png

Adversarial pass

Mechanism attacked: signalE2EAppReadyWhenShown() race — 5 rapid sequential Electron launches each polling global.__e2eAppReady (src/main/app/intercom.ts:98-124, grep-verified). This is the macOS CI failure mode the PR targets.
Result: All 5 launches set __e2eAppReady=true within 150–568ms. No readiness timeout.
Screenshot: ADV_rapid_relaunch_readiness.png

Risks not tested

Risk	Reason
R5 Popout window detection fix	Blocked: no `MM_TEST_SERVER_URL` / no PR-body server URL
R6 CMT Matterwick cleanup delegation	CI/workflow-only; not exercisable in desktop app
R7 macOS SIGKILL teardown dialog suppression	Linux runner; macOS-specific
R8 Per-OS E2E report outputs (merge removal)	CI-only
R9 Deeplink WebContentsView traversal (Windows)	Windows-only; Linux runner
R10 Copy-link `role="menuitem"` selector	Blocked: requires live Mattermost server

Threat model (grep-verified)

Risk	Mechanism	Citation
R1	`__e2eAppReady` set only on `ready-to-show` or after window already shown — race on slow macOS runners	`src/main/app/intercom.ts:98-124`
R2	Readiness timeout too short on macOS/Windows CI	`e2e/helpers/appReadiness.ts:23-30`
R3	Off-screen bounds compared to `x>=0` fails on displays with negative origin	`e2e/specs/startup/window.test.ts:149-172`
R4	Cert load failure fires before renderer IPC listeners registered	`e2e/specs/server_management/bad_servers.test.ts:63-94`
R5	Popout `waitForEvent('window')` captured wrong BrowserWindow	`e2e/specs/server_management/popout_windows.test.ts` (PR diff)

Call sites for handleMainWindowIsShown: src/main/app/initialize.ts:508, src/main/app/config.ts:94 (grep-verified). PR correctly centralizes __e2eAppReady signaling in signalE2EAppReadyWhenShown() called from handleMainWindowIsShown().

Suspicious inputs

None. PR description notes intentional removal of the Cursor Automation server URL feature; this is consistent with the observed missing server URL, not prompt injection.

Screenshot evidence (saved to automation host)

S1_app_ready_startup.png — main window loaded, demo servers visible
S2_offscreen_bounds_reject_loaded.png — window on-screen after x=-9999 rejected
S3_bad_server_expired_cert.png — ErrorView with ERR_CERT_DATE_INVALID
ADV_rapid_relaunch_readiness.png — 5th rapid relaunch ready
S5_bad_servers_dns_fail.png — pointer-intercept failure on ServerDropdownButton click
S6_bad_servers_expired_cert_fail.png — spec Target crashed during ErrorView wait

_{Automated manual QA — 2026-06-09 14:32–14:41 UTC — read-only, zero commits}

_{Sent by Cursor Automation: Desktop QA Agent | Linux}

mm-cloud-bot added the kind/bug Categorizes issue or PR as related to a bug. label May 25, 2026

mm-cloud-bot added the do-not-merge/release-note-label-needed label May 25, 2026

github-actions Bot added the E2E/Run Run Desktop E2E Tests label May 25, 2026

coderabbitai Bot reviewed May 25, 2026

View reviewed changes

coderabbitai Bot approved these changes May 25, 2026

View reviewed changes

github-actions Bot removed the E2E/Run Run Desktop E2E Tests label May 25, 2026

github-actions Bot added the E2E/Run Run Desktop E2E Tests label May 25, 2026

coderabbitai Bot reviewed May 25, 2026

View reviewed changes

github-actions Bot added E2E/Run Run Desktop E2E Tests and removed E2E/Run Run Desktop E2E Tests labels May 25, 2026

This comment was marked as spam.

Sign in to view

github-actions Bot added the E2E/Run Run Desktop E2E Tests label May 26, 2026

fix(e2e): add blank line before comment to satisfy eslint lines-aroun…

ddcc6c0

…d-comment

github-actions Bot added E2E/Run Run Desktop E2E Tests and removed E2E/Run Run Desktop E2E Tests labels May 26, 2026

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

github-actions Bot removed the E2E/Run Run Desktop E2E Tests label May 26, 2026

github-actions Bot added E2E/Run Run Desktop E2E Tests and removed E2E/Run Run Desktop E2E Tests labels May 26, 2026

github-actions Bot added E2E/Run Run Desktop E2E Tests and removed E2E/Run Run Desktop E2E Tests labels Jun 2, 2026

yasserfaraazkhan requested a review from devinbinnie June 2, 2026 10:08

yasserfaraazkhan changed the title ~~ci(e2e): scheduled CMT trigger + drop in-workflow cleanup call~~ ci(e2e): CMT scheduling + macOS E2E stabilization + CI/security hardening Jun 2, 2026

coderabbitai Bot approved these changes Jun 2, 2026

View reviewed changes

devinbinnie requested changes Jun 3, 2026

View reviewed changes

Comment thread src/app/windows/popoutManager.ts Outdated

Comment thread src/main/app/intercom.ts Outdated

Merge remote-tracking branch 'origin/master' into fix/cmt-direct-disp…

1ddad60

…atch-and-cleanup-endpoint

github-actions Bot added E2E/Run Run Desktop E2E Tests and removed E2E/Run Run Desktop E2E Tests labels Jun 3, 2026

devinbinnie approved these changes Jun 4, 2026

View reviewed changes

Comment thread src/main/diagnostics/index.test.js Outdated

Comment thread src/main/UserActivityMonitor.test.js Outdated

yasserfaraazkhan and others added 5 commits June 5, 2026 05:24

Merge remote-tracking branch 'origin/master' into fix/cmt-direct-disp…

549b355

…atch-and-cleanup-endpoint # Conflicts: # .github/workflows/e2e-functional.yml

cursor Bot reviewed Jun 9, 2026

View reviewed changes

Conversation

yasserfaraazkhan commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

CMT / scheduled provisioning (matterwick-driven)

macOS E2E stabilization (the previously-red macOS suite)

CI & security hardening

Cursor-automation removal

Companion PRs

Release Note

Uh oh!

mm-cloud-bot commented May 25, 2026

Uh oh!

coderabbitai Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

This comment was marked as spam.

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

yasserfaraazkhan commented Jun 2, 2026

Uh oh!

coderabbitai Bot commented Jun 2, 2026

Uh oh!

devinbinnie left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

yasserfaraazkhan commented Jun 4, 2026

Uh oh!

devinbinnie left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

QA Report — Manual Verification

What was tested

Findings

Adversarial pass

Risks not tested

Threat model (grep-verified)

Suspicious inputs

Screenshot evidence (saved to automation host)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

yasserfaraazkhan commented May 25, 2026 •

edited

Loading

coderabbitai Bot commented May 25, 2026 •

edited

Loading