Skip to content

[wrangler] fix: prevent remote binding sessions expiring during long dev sessions#13016

Open
penalosa wants to merge 20 commits intomainfrom
penalosa/remote-bindings-timeout
Open

[wrangler] fix: prevent remote binding sessions expiring during long dev sessions#13016
penalosa wants to merge 20 commits intomainfrom
penalosa/remote-bindings-timeout

Conversation

@penalosa
Copy link
Copy Markdown
Contributor

@penalosa penalosa commented Mar 23, 2026

Fixes #12901

Fixes an issue where remote bindings (e.g. Workers AI) would stop working after ~1 hour in wrangler dev, requiring a manual restart.

Root causes fixed:

  1. Preview tokens expired silently — tokens have a 1-hour TTL set by the Workers control plane. The ProxyWorker would detect expiry reactively (only after a request failed), meaning one request would always fail before a refresh was triggered. This PR adds a proactive refresh at 50 minutes so no request ever sees an expired token.

  2. error code: 1031 not recognised as expiry — bindings like Workers AI return text/plain error code: 1031 when their session times out, but the existing check only matched the HTML Invalid Workers Preview configuration response. The check is now content-type-agnostic.

  3. Failed refresh left ProxyWorker permanently paused — if #refreshPreviewToken threw after emitReloadStartEvent had already paused the ProxyWorker, no reloadComplete was ever emitted to unpause it. The catch block now re-emits the last known proxyData to recover.

  4. Auth resolved too earlyunwrapHook(dev.auth) was called at bundle-complete time and the resolved value stored. This means OAuth tokens captured at startup could go stale. Auth is now resolved lazily each time a session starts.

  5. Handler reserved keys proxied to remote stub — workerd probes service binding entrypoints for lifecycle handler methods (tail, trace, scheduled, etc.) by accessing those properties. Without a guard these fell through to the capnweb stub, returning truthy RpcPromise objects and making workerd think the binding handled those events. Added a HANDLER_RESERVED_KEYS guard matching the existing pattern in external-service.ts.


  • Tests
    • Tests included/updated
    • Automated tests not possible - manual testing has been completed as follows:
    • Additional testing not necessary because:
  • Public documentation
    • Cloudflare docs PR(s):
    • Documentation not necessary because: internal dev tooling behaviour fix

Open with Devin

@changeset-bot
Copy link
Copy Markdown

changeset-bot bot commented Mar 23, 2026

🦋 Changeset detected

Latest commit: 4852073

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 23, 2026

✅ All changesets look good

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new bot commented Mar 23, 2026

create-cloudflare

npm i https://pkg.pr.new/create-cloudflare@13016

@cloudflare/kv-asset-handler

npm i https://pkg.pr.new/@cloudflare/kv-asset-handler@13016

miniflare

npm i https://pkg.pr.new/miniflare@13016

@cloudflare/pages-shared

npm i https://pkg.pr.new/@cloudflare/pages-shared@13016

@cloudflare/unenv-preset

npm i https://pkg.pr.new/@cloudflare/unenv-preset@13016

@cloudflare/vite-plugin

npm i https://pkg.pr.new/@cloudflare/vite-plugin@13016

@cloudflare/vitest-pool-workers

npm i https://pkg.pr.new/@cloudflare/vitest-pool-workers@13016

@cloudflare/workers-editor-shared

npm i https://pkg.pr.new/@cloudflare/workers-editor-shared@13016

wrangler

npm i https://pkg.pr.new/wrangler@13016

commit: 4852073

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk bot commented Mar 23, 2026

I've completed my thorough review of the PR. Let me summarize my findings:

The PR is well-structured and addresses the stated issues clearly. The code changes are logical, the test coverage for the new proactive refresh behavior is solid, and the error recovery pattern (re-emitting reloadComplete on refresh failure) is sound.

After careful review, I don't find any actionable logic bugs, security issues, backward compatibility violations, or incorrect API behavior. The key areas I verified:

  1. Proactive refresh timer - properly cleared on bundleStart, teardown, and before rescheduling. No timer leak.
  2. Error recovery path - re-emitting #latestProxyData on failed refresh to unpause ProxyWorker is correct.
  3. HANDLER_RESERVED_KEYS - identical to the existing set in external-service.ts.
  4. checkForPreviewTokenError moved outside isHtmlResponse guard - correct, since error code: 1031 is text/plain. The function clones the response before reading, so the original body is preserved.
  5. Auth lazy resolution - passing the hook through instead of eagerly unwrapping is the right pattern for long-lived sessions.
  6. deepStrictEqual on auth - the preExistingRemoteProxySessionData doesn't include auth in the return type, so this comparison always sees undefined for the old value. But this is a pre-existing behavior, not introduced by this PR.
  7. Retry semantics - rescheduling proactive refresh in the catch block at the same 50-minute interval is reasonable.
  8. Test quality - tests properly use fake timers, cover the happy path, cancellation on bundleStart, cancellation on teardown, and the reactive refresh path.

LGTM

github run

@penalosa penalosa force-pushed the penalosa/remote-bindings-timeout branch 2 times, most recently from 6ab5cab to 35d0aae Compare March 23, 2026 17:07
@penalosa penalosa force-pushed the penalosa/remote-bindings-timeout branch from 35d0aae to cb76ef9 Compare March 23, 2026 17:32
@penalosa penalosa marked this pull request as ready for review March 23, 2026 17:32
@penalosa penalosa requested a review from a team as a code owner March 23, 2026 17:33
@workers-devprod
Copy link
Copy Markdown
Contributor

Codeowners approval required for this PR:

  • @cloudflare/wrangler
Show detailed file reviewers
  • packages/miniflare/src/workers/shared/remote-proxy-client.worker.ts: [@cloudflare/wrangler]
  • packages/wrangler/src/tests/api/startDevWorker/RemoteRuntimeController.test.ts: [@cloudflare/wrangler]
  • packages/wrangler/src/tests/dev/remote-bindings.test.ts: [@cloudflare/wrangler]
  • packages/wrangler/src/api/remoteBindings/index.ts: [@cloudflare/wrangler]
  • packages/wrangler/src/api/startDevWorker/LocalRuntimeController.ts: [@cloudflare/wrangler]
  • packages/wrangler/src/api/startDevWorker/RemoteRuntimeController.ts: [@cloudflare/wrangler]
  • packages/wrangler/src/api/startDevWorker/utils.ts: [@cloudflare/wrangler]
  • packages/wrangler/templates/startDevWorker/ProxyWorker.ts: [@cloudflare/wrangler]

devin-ai-integration[bot]

This comment was marked as resolved.

@penalosa penalosa marked this pull request as draft March 23, 2026 18:13
Copy link
Copy Markdown
Contributor

@petebacondarwin petebacondarwin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far. I wonder if we could achieve a similar result by emitting the previewTokenExpired event after 50 mins and let the rest of the machinery do the refresh?

@penalosa penalosa marked this pull request as ready for review March 27, 2026 13:05
devin-ai-integration[bot]

This comment was marked as resolved.

@penalosa penalosa marked this pull request as draft March 30, 2026 12:00
@penalosa penalosa marked this pull request as ready for review April 2, 2026 10:49
@workers-devprod
Copy link
Copy Markdown
Contributor

workers-devprod commented Apr 2, 2026

Codeowners approval required for this PR:

  • ✅ @cloudflare/wrangler
Show detailed file reviewers

@penalosa penalosa requested a review from petebacondarwin April 2, 2026 10:56
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@penalosa
Copy link
Copy Markdown
Contributor Author

penalosa commented Apr 7, 2026

/bonk fix the lint issues on this PR

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk bot commented Apr 7, 2026

Fixed the lint issue. The problem was a single unused import:

  • RemoteRuntimeController.ts:2UserError was imported from @cloudflare/workers-utils but never used. Likely a leftover from a previous iteration where the refresh failure threw a UserError (removed in commit 1ba3254). Removed it from the import statement.

Lint and format checks now pass clean.

github run

Copy link
Copy Markdown
Contributor

@workers-devprod workers-devprod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codeowners reviews satisfied

@github-project-automation github-project-automation bot moved this from Untriaged to Approved in workers-sdk Apr 7, 2026
@workers-devprod
Copy link
Copy Markdown
Contributor

Codeowners approval required for this PR:

  • ✅ @cloudflare/wrangler
Show detailed file reviewers

@penalosa penalosa added this pull request to the merge queue Apr 7, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Approved

Development

Successfully merging this pull request may close these issues.

Wrangler should detect when remote bindings have timed out

4 participants