Stabilize flaky Playwright e2e tests #109

Merged
danielchalmers merged 1 commit into main from claude/cool-kare-27b3d7
Jun 17, 2026
@danielchalmers

Owner

Problem

The Extension E2E job fails intermittently in CI and goes green on retry — classic flakiness, not real regressions. Pulling the actual failure logs from past runs surfaced three recurring signatures, all rooted in timing/network nondeterminism around the MV3 service worker:

  1. net::ERR_NAME_NOT_RESOLVED — a navigation to a non-resolving test domain (*.example.test / *.example.invalid) escaped its per-test context.route mock and hit real DNS.
  2. Test timeout of 30000ms exceeded on expectBlocked/expectAllowed — Playwright's hidden 5s default expect timeout is too tight for the path "wake the service worker → react to a storage change → reconcile open tabs" under CI load.
  3. page.reload: Not attached to an active page — in the whitelist-from-blocked test, a manual page.reload() raced the background's own tab restore.

Changes

test/e2e/fixtures.ts

  • Deterministic network: a context-level catch-all route stubs every otherwise-unmocked http(s) request with a local 200, so no test depends on real DNS. Per-test mockAllowedPage routes still win (Playwright matches the most-recently-registered route first), and block assertions are unaffected because blocking is driven by chrome.tabs.update redirecting to blocked.html regardless of the original request. Extension/internal URLs pass through untouched.
  • Set a 15s navigation timeout on the context and capture a trace on retry only. Both are done directly on the context because the suite launches its own persistent context (to load the unpacked extension), so Playwright's built-in use.trace / use.navigationTimeout wiring does not apply. Tracing on retry avoids per-action screenshot overhead on the common path — that overhead is itself a flake source under load.
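
A minimal sketch of how the catch-all stub and the context navigation timeout might be wired up. The helper name `installDeterministicNetwork`, the stub HTML body, and the locally declared `RouteLike` / `ContextLike` interfaces are assumptions for illustration, not the code in this PR. The interfaces mirror only the `route`, `fulfill`, and `setDefaultNavigationTimeout` methods of Playwright's `BrowserContext` and `Route`, so the sketch compiles without Playwright installed.

```typescript
// Structural subset of Playwright's Route and BrowserContext (assumption:
// declared locally so this sketch has no dependency on @playwright/test).
interface RouteLike {
  fulfill(response: { status: number; contentType: string; body: string }): Promise<void>;
}

interface ContextLike {
  route(url: RegExp, handler: (route: RouteLike) => Promise<void>): Promise<void>;
  setDefaultNavigationTimeout(timeoutMs: number): void;
}

// Only http(s) URLs match, so chrome-extension:// and other internal URLs
// are never intercepted by this route.
const HTTP_URL = /^https?:\/\//;

// Hypothetical helper: call once on the persistent context, before any
// per-test route is registered, so later registrations take precedence.
async function installDeterministicNetwork(context: ContextLike): Promise<void> {
  // 15s navigation timeout, set on the context directly.
  context.setDefaultNavigationTimeout(15_000);

  // Catch-all: answer every otherwise-unmocked http(s) request with a local 200.
  await context.route(HTTP_URL, (route) =>
    route.fulfill({
      status: 200,
      contentType: 'text/html',
      body: '<!doctype html><title>stub</title>', // placeholder body (assumption)
    }),
  );
}
```

Because Playwright runs matching routes in the reverse of registration order, a fixture would install this stub first and let each test add its own mock afterwards.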

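playwright.config.ts

  • Raised the expect timeout from the 5s default to 15s, keeping the 30s per-test ceiling, and added retries (2 on CI, 0 locally) as a backstop.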

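A sketch of the timeout and retry values this PR describes for playwright.config.ts: a 15s expect timeout, a 30s test ceiling, and 2 retries on CI versus 0 locally. The `SuiteTimeouts` type and `suiteTimeouts` function are hypothetical names, and `isCI` is passed in as a parameter so the block runs without Node typings; the actual config file may be laid out differently.

```typescript
// Subset of Playwright Test options that this sketch sets.
interface SuiteTimeouts {
  timeout: number;             // per-test ceiling, in ms
  expect: { timeout: number }; // default timeout for expect assertions and expect.poll, in ms
  retries: number;             // extra attempts after a failure
}

// Hypothetical helper; a real config would typically derive isCI from process.env.CI.
function suiteTimeouts(isCI: boolean): SuiteTimeouts {
  return {
    timeout: 30_000,
    expect: { timeout: 15_000 },
    retries: isCI ? 2 : 0,
  };
}

// In playwright.config.ts the result could be spread into the exported config:
//   export default defineConfig({ ...suiteTimeouts(!!process.env.CI) /* , other options */ });
```

Keeping retries at 0 locally means a flaky test still fails on a developer machine, where it is cheapest to investigate.
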
test/e2e/blocked.spec.ts / test/e2e/extension.spec.ts

  • Replaced the suite's only real, resolvable domain (example.com) in the warning-bypass test with a non-resolving stub, removing a dependency on CI egress/DNS.
  • Converted the three Go-Back Promise.all([page.waitForURL(target), click()]) sites to click-then-expect.poll(() => page.url()). waitForURL is bound to the navigation lifecycle and rejects on a transient failed navigation; polling the committed URL tolerates the service worker settling a beat later. All content assertions are retained (a URL alone is weaker than verifying the restored page rendered).
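
To illustrate why polling the committed URL tolerates a late-settling service worker, here is a dependency-free sketch of the poll-until-equal idea. It is not Playwright's implementation of `expect.poll`; `pollUntilEqual`, its default interval, and the error message are hypothetical. In the specs themselves the equivalent would be a click followed by `await expect.poll(() => page.url()).toBe(target)`.

```typescript
// Re-read a value until it equals `expected` or the deadline passes.
// Intermediate values (for example a transient blocked.html URL) are ignored
// rather than treated as failures; only the timeout rejects.
async function pollUntilEqual<T>(
  read: () => T | Promise<T>,
  expected: T,
  timeoutMs = 15_000,
  intervalMs = 100,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  let last: T = await read();
  while (last !== expected) {
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${timeoutMs}ms; last value: ${String(last)}`);
    }
    // Sleep before the next read so the loop does not spin.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    last = await read();
  }
}
```

The contrast with `waitForURL` is that nothing here is tied to a single navigation event: a failed or superseded navigation simply produces another non-matching read.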

test/e2e/lifecycle.spec.ts

  • Dropped the redundant manual browsingPage.reload() in the whitelist-from-blocked test and rely on the background reconcile to restore the open tab — matching the primed-storage and popup-disable sibling tests. The manual reload raced the background's own tab navigation and produced the Not attached to an active page failure.

Validation

Reproduced and fixed locally under sustained heavy CPU load (cores saturated with busy workers — a quiet machine passes even when CI is flaky):

  • Before: the whitelist-from-blocked race reproduced under load.
  • After: 3 consecutive full-suite runs (40/40 each) and a --repeat-each=3 interleaved run (120/120) — all green.

No product code changed; this is test-infrastructure hardening only. The fixes target root causes (determinism and adequate timeouts); retries remain a thin backstop, so a genuine regression that fails every attempt still fails the build.

The Extension E2E job failed intermittently in CI (passing on retry) with
three recurring signatures, all rooted in timing/network nondeterminism
rather than product bugs:

- net::ERR_NAME_NOT_RESOLVED when a navigation to a non-resolving test
  domain escaped its per-test route mock and hit real DNS.
- "Test timeout of 30000ms exceeded" on expectBlocked/expectAllowed, where
  the default 5s expect timeout was too tight for the MV3 service worker to
  wake, react to a storage change, and reconcile open tabs under load.
- "Not attached to an active page" in the whitelist-from-blocked test, where
  a manual page.reload() raced the background's own tab restore.

Changes:
- fixtures.ts: install a context-level catch-all route that stubs every
  otherwise-unmocked http(s) request with a local 200, so tests never depend
  on real DNS. Per-test mocks still take precedence and block assertions are
  unaffected (blocking is chrome.tabs.update-driven). Set a 15s context
  navigation timeout and capture a trace on retry only (the built-in
  use.trace / use.navigationTimeout wiring does not apply to a directly
  launched persistent context).
- playwright.config.ts: raise the expect timeout to 15s (keeping the 30s test
  ceiling) and add retries (2 on CI, 0 locally) as a backstop.
- blocked.spec.ts / extension.spec.ts: replace the only real-domain target
  with a non-resolving stub, and convert the Go-Back Promise.all([waitForURL,
  click]) sites to click-then-poll-url so a transient navigation cannot reject
  waitForURL. Content assertions are retained.
- lifecycle.spec.ts: drop the redundant manual reload in the
  whitelist-from-blocked test and rely on the background reconcile, matching
  the primed-storage and popup-disable siblings.

Verified green under sustained heavy CPU load locally: 3 consecutive
full-suite runs plus a repeat-each=3 run (120 executions).
@danielchalmers danielchalmers merged commit 0bcb8ac into main Jun 17, 2026
12 checks passed
@danielchalmers danielchalmers deleted the claude/cool-kare-27b3d7 branch June 17, 2026 21:00