fix: unblock /auth/providers on cold start + show error UI on login #48

Merged
yourbuddyconner merged 3 commits into yourbuddyconner:main from tkhq:fix/auth-providers-loading on May 8, 2026

Collaborator

@figitaki figitaki commented May 8, 2026

Summary

Fixes a prod issue where the Valet login page renders with no sign-in buttons — just two dark gray rectangles where the provider buttons should be. Those rectangles were the loading skeleton; users hit the page before the worker finished booting.

Three layered fixes (one commit each):

  1. fix(worker): run plugin sync in background, not blocking requests. syncPluginsOnce was awaited in an app.use('*', …) middleware, doing ~47 serial D1 writes before the route handler ran. On a cold isolate, every public request (including /auth/providers) waited for that. Moved it into c.executionCtx.waitUntil(...) so it runs after the response is sent. The registry sync is already idempotent and best-effort, so slightly stale content for the first request is fine.

  2. fix(worker): skip plugin sync on public latency-critical paths — even with waitUntil, kicking the sync off on every unauthenticated request still wastes D1 capacity on the cold-start path users see most. Skip entirely for /auth/* and /health*. Authenticated traffic on /api/* still drives it.

  3. fix(client): show error + retry on LoginForm when providers fetch fails — when /auth/providers errored out, the loading skeleton would disappear and providers would be undefined, so the card rendered with just a title and zero buttons — visually identical to "no providers configured." Added an explicit error message + Retry button, and bumped that one query's retry count from the default 1 to 3 so transient cold-starts don't trip it.
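
A rough sketch of how fixes 1 and 2 might fit together, written without any framework so it stands alone. The `ExecutionCtxLike` interface, the `shouldSkipPluginSync` and `schedulePluginSync` helpers, and the exact prefix matching are illustrative assumptions, not the code in this PR; only `waitUntil`, `syncPluginsOnce`, the `[plugin-sync]` log tag, and the `/auth/*` and `/health*` prefixes come from the description above.

```typescript
// Minimal stand-in for the part of the Workers execution context used here.
interface ExecutionCtxLike {
  waitUntil(promise: Promise<unknown>): void;
}

// Prefixes named in the PR description; how they are matched is an assumption.
const SKIP_SYNC_PREFIXES = ["/auth/", "/health"];

// Hypothetical helper: true for public, latency-critical paths.
function shouldSkipPluginSync(pathname: string): boolean {
  return SKIP_SYNC_PREFIXES.some((prefix) => pathname.startsWith(prefix));
}

// Hypothetical helper: starts the sync in the background instead of awaiting it.
// Returns true when a sync was scheduled, false when the path was skipped.
function schedulePluginSync(
  pathname: string,
  executionCtx: ExecutionCtxLike,
  syncPluginsOnce: () => Promise<void>,
): boolean {
  if (shouldSkipPluginSync(pathname)) {
    return false;
  }
  // The sync is best-effort, so failures are logged rather than surfaced
  // to the request that happened to trigger it.
  executionCtx.waitUntil(
    syncPluginsOnce().catch((err) => console.error("[plugin-sync] failed", err)),
  );
  return true;
}
```

In a middleware, the call would sit before handing off to the route handler, with no `await` on the sync itself, so the handler for `/auth/providers` never waits on D1 writes.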

Root cause writeup

Full RCA (with code references) is in the linked secret gist: https://gist.github.com/figitaki/887c9a34e95a8820ba14280399f2abda

Fixes #1, #2, and #3 from that doc. (#4 — rate-limiting the sync — is left as defense-in-depth follow-up.)

Test plan

  • cd packages/worker && pnpm typecheck — clean
  • cd packages/client && pnpm typecheck — clean
  • Deploy to a preview env and confirm /auth/providers returns within ~100ms on a cold isolate
  • Force fetchAuthProviders to fail (e.g. block the request in devtools) and confirm the LoginForm shows the error + Retry button instead of an empty card
  • Confirm authenticated /api/* traffic still triggers a plugin sync (check worker logs for [plugin-sync])

figitaki added 3 commits May 8, 2026 13:53
The plugin-sync middleware awaited syncPluginsOnce on every request,
which on cold start performed ~47 serial D1 writes before the route
handler even ran. Public latency-critical endpoints like
/auth/providers got stuck behind it, causing the login page to show
its loading skeleton (the dark gray pulse-loaders) for tens of
seconds — long enough that users reported "no sign-in buttons."

Move the sync into ctx.waitUntil so it runs after the response is
returned. The registry sync is already idempotent and best-effort, so
the first cold-start request seeing slightly-stale content is fine.

Even with the sync running in the background, kicking it off on
unauthenticated traffic (/auth/providers, /auth/<provider> redirects,
/health) wastes D1 capacity on the cold-start path that's most
visible to end users. Skip the sync entirely for those prefixes —
authenticated traffic on /api/* still drives it.

Previously, if /auth/providers errored out (default retry=1), the
loading skeleton would disappear, providers would be undefined, and
the card would render with just the title and zero buttons — visually
identical to a "no login providers configured" state. Users had no
indication anything was wrong and no way to recover short of a full
page reload.

Render an explicit error message with a Retry button when the query
fails, and bump retry to 3 for this specific query so transient
worker cold-starts don't trip it.
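
To illustrate the client-side change, here is a framework-free sketch of the state selection the commit message describes: a failed fetch gets its own branch instead of falling through to the same rendering as "no providers configured." The `AuthProvider` shape, the `ProvidersQueryState` fields, the `selectLoginView` function, and the error text are all hypothetical; the real LoginForm presumably does this inline in its render logic.

```typescript
// Hypothetical provider shape; the real type is not shown in this PR.
interface AuthProvider {
  id: string;
  name: string;
}

// Hypothetical subset of the query state the form needs to look at.
interface ProvidersQueryState {
  isLoading: boolean;
  isError: boolean;
  data: AuthProvider[] | undefined;
}

type LoginView =
  | { kind: "skeleton" }
  | { kind: "error"; message: string; canRetry: true }
  | { kind: "empty" }
  | { kind: "buttons"; providers: AuthProvider[] };

// Decide what the login card should show. The error branch is checked
// before the empty branch, so a failed fetch can no longer look like
// a deployment with zero providers.
function selectLoginView(query: ProvidersQueryState): LoginView {
  if (query.isLoading) {
    return { kind: "skeleton" };
  }
  if (query.isError) {
    return {
      kind: "error",
      message: "Could not load sign-in options.", // placeholder copy
      canRetry: true,
    };
  }
  if (query.data === undefined || query.data.length === 0) {
    return { kind: "empty" };
  }
  return { kind: "buttons", providers: query.data };
}
```

The Retry button would then call whatever refetch function the query exposes; the retry count of 3 mentioned above is a per-query option and is not modeled here.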
@figitaki figitaki requested a review from yourbuddyconner May 8, 2026 21:03
Owner

@yourbuddyconner yourbuddyconner left a comment

Clean, well-scoped fix. The waitUntil + singleton-promise interaction is safe, skip prefixes align with unauthenticated route mounts, and the client error UI degrades gracefully.
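
For readers unfamiliar with the singleton-promise pattern mentioned here, a minimal sketch follows. It assumes `syncPluginsOnce` caches its in-flight promise per isolate; the `once` helper and its behavior on failure are illustrative guesses, not the actual implementation.

```typescript
// Hypothetical helper: wraps an async task so that every caller within
// the same isolate shares one promise instead of starting the work again.
function once(task: () => Promise<void>): () => Promise<void> {
  let shared: Promise<void> | null = null;
  return () => {
    if (shared === null) {
      // First caller starts the task; later callers reuse the same promise.
      shared = task();
    }
    return shared;
  };
}
```

Under that assumption, handing the result to `waitUntil` from several concurrent requests is harmless: each request extends its own lifetime on the same promise, and the underlying D1 writes run only once.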

@yourbuddyconner yourbuddyconner merged commit f2204e4 into yourbuddyconner:main May 8, 2026
1 check passed