fix(tokens): issueTokenPair signs via signToken dispatcher (restores /v1/tokens in prod)#33
Closed
…prod)

Regression root cause: cloud#299 retired the SIGNING_KEY worker binding on the relayauth worker as part of the HS256 sunset (phase 122 step 3). The JWKS bugfix (#32) and the SIGNING_KEY-gate fix in the JWKS route were shipped. But issueTokenPair in routes/tokens.ts was still calling signHs256Jwt(claims, env.SIGNING_KEY, env.SIGNING_KEY_ID) directly, bypassing the signToken(claims, env) dispatcher added in phase 120.

With SIGNING_KEY undefined at runtime, signHs256Jwt's crypto.subtle.importKey call threw "DataError: Imported HMAC key length (0) must be non-zero", producing a 500 on every POST /v1/tokens.

Sage (and every other api-key caller) hit this 500 as soon as they tried to mint a delegated token, cascading into production failures: "I ran into an issue processing your request" in the Slack path was the user-facing manifestation.

Fix
- Replace the two direct signHs256Jwt calls with signToken(claims, env). signToken dispatches on RELAYAUTH_SIGNING_ALG (RS256 in production per infra/relayauth.ts:55) and signs via RELAYAUTH_SIGNING_KEY_PEM, which is still bound.
- Delete the now-unused signHs256Jwt helper. (encodeBase64UrlJson + encodeBase64UrlBytes stay; other code paths use them.)

No behavior change for any caller that was working before #299. The issued access + refresh tokens are now RS256-signed (they already were, via the JWKS published key, for OTHER code paths, but NOT for /v1/tokens, until now).

Tests
- tokens-route.test.ts: 33/33 pass (no changes to test fixtures needed; RS256 signing is invisible to the assertions, which only check response shape / status codes).
- tsc --noEmit clean on packages/server.

Deploy
- Publish @relayauth/server@0.2.5
- Bump cloud's @relayauth/* dep to ^0.2.5
- Cloud deploy restores /v1/tokens production flow.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
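A minimal sketch of the dispatch decision described in the commit message, not the actual signToken implementation. It only resolves which key material a signer would need, using the binding names mentioned above. The SigningEnv and SigningPlan types, the resolveSigningPlan name, and the fallback to HS256 when RELAYAUTH_SIGNING_ALG is unset are assumptions made for illustration.

```typescript
// Hypothetical env shape: binding names come from the PR text, the types are assumed.
interface SigningEnv {
  RELAYAUTH_SIGNING_ALG?: string;
  RELAYAUTH_SIGNING_KEY_PEM?: string;
  SIGNING_KEY?: string;
  SIGNING_KEY_ID?: string;
}

// Hypothetical result type: what a signer would need for each algorithm.
type SigningPlan =
  | { alg: "RS256"; keyPem: string }
  | { alg: "HS256"; secret: string; kid: string | undefined };

// Pick the signing algorithm from the environment and fail loudly when the
// matching key material is not bound, instead of passing an empty key to
// crypto.subtle.importKey and surfacing an opaque DataError.
function resolveSigningPlan(env: SigningEnv): SigningPlan {
  // Assumption: an unset RELAYAUTH_SIGNING_ALG means legacy HS256.
  const alg = env.RELAYAUTH_SIGNING_ALG ?? "HS256";

  if (alg === "RS256") {
    if (!env.RELAYAUTH_SIGNING_KEY_PEM) {
      throw new Error("RS256 selected but RELAYAUTH_SIGNING_KEY_PEM is not bound");
    }
    return { alg: "RS256", keyPem: env.RELAYAUTH_SIGNING_KEY_PEM };
  }

  if (alg === "HS256") {
    if (!env.SIGNING_KEY) {
      throw new Error("HS256 selected but SIGNING_KEY is not bound");
    }
    return { alg: "HS256", secret: env.SIGNING_KEY, kid: env.SIGNING_KEY_ID };
  }

  throw new Error("Unsupported signing algorithm: " + alg);
}
```

Routing every call site through one function like this means a retired binding produces a descriptive error at the dispatch step, and a route cannot silently keep using the legacy key.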
Superseded by the full HS256 purge PR (in progress by RelayauthHs256Purge agent on branch |
Urgent — fixes production 500 on POST /v1/tokens
Sage's Slack handler (and every other RelayAuth api-key caller that mints delegated tokens) is currently failing with a generic "I ran into an issue" fallback. Root cause traced via wrangler tail on sage + observability logs on relayauth:
Relayauth worker log:
Root cause
cloud#299 retired the SIGNING_KEY worker binding (phase 122 step 3). Everything else in the HS256 sunset worked: the JWKS route was fixed (#32), and relayauth's JWKS now only shows RSA in production. But issueTokenPair in routes/tokens.ts was still calling signHs256Jwt(claims, env.SIGNING_KEY, env.SIGNING_KEY_ID) directly, bypassing the signToken(claims, env) dispatcher we added in phase 120 that respects RELAYAUTH_SIGNING_ALG.
With SIGNING_KEY undefined, crypto.subtle.importKey threw a DataError, producing a 500 on every POST /v1/tokens.
Fix
- Replace the two direct signHs256Jwt(...) calls with signToken(claims, env). The dispatcher routes to RS256 signing via RELAYAUTH_SIGNING_KEY_PEM (still bound) since RELAYAUTH_SIGNING_ALG=RS256 (set in cloud/infra/relayauth.ts:55).
- Delete the now-unused signHs256Jwt helper.
No behavior change for working paths
Issued access + refresh tokens are now RS256-signed via the same key published to JWKS. Verifiers have been accepting RS256 since phase 121 dual-verify. The only things that change:
- /v1/tokens stops 500'ing, and
- issueTokenPair-produced tokens now carry RS256 + the kid published in JWKS instead of the legacy HS256 kid.
Tests
- tokens-route.test.ts: 33/33 pass (assertions check response shape + status codes; RS256 signing is invisible to them).
- tsc --noEmit clean on packages/server.
Deploy ordering
- Publish @relayauth/server@0.2.5 (you; only you can publish)
- Bump cloud's @relayauth/* dep to ^0.2.5
- Cloud deploy: /v1/tokens returns 201 with an RS256 token again
Rollback
Trivial: reverting this PR brings back signHs256Jwt. But rollback doesn't unbreak production (the SIGNING_KEY binding is still gone from the worker), so fixing forward is the only path.
🤖 Generated with Claude Code
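Because the route tests only assert response shape and status codes, a post-deploy check could inspect the header of an issued token directly. The following is a sketch under assumptions rather than part of this PR: readJwtHeader and assertRs256 are hypothetical helper names, and the base64url decoder is hand-rolled only to keep the example dependency-free. It handles the ASCII JSON found in a JWT header and does not verify any signature.

```typescript
const B64URL_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Decode a base64url segment into a string, one byte per character.
// Sufficient for JWT headers, which are plain ASCII JSON.
function decodeBase64UrlAscii(segment: string): string {
  let bits = 0;
  let bitCount = 0;
  let out = "";
  for (let i = 0; i < segment.length; i++) {
    const ch = segment.charAt(i);
    if (ch === "=") break; // tolerate optional padding
    const value = B64URL_ALPHABET.indexOf(ch);
    if (value < 0) throw new Error("Invalid base64url character: " + ch);
    bits = (bits << 6) | value;
    bitCount += 6;
    if (bitCount >= 8) {
      bitCount -= 8;
      out += String.fromCharCode((bits >> bitCount) & 0xff);
      bits &= (1 << bitCount) - 1; // keep only the bits not yet emitted
    }
  }
  return out;
}

interface JwtHeader {
  alg: string;
  kid: string | undefined;
}

// Parse the first segment of a compact JWT. No signature verification happens here.
function readJwtHeader(token: string): JwtHeader {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("Expected a three-part compact JWT");
  const header = JSON.parse(decodeBase64UrlAscii(parts[0] ?? ""));
  if (typeof header.alg !== "string") throw new Error("JWT header has no alg");
  return {
    alg: header.alg,
    kid: typeof header.kid === "string" ? header.kid : undefined,
  };
}

// Throw unless the token header declares RS256; return the header otherwise.
function assertRs256(token: string): JwtHeader {
  const header = readJwtHeader(token);
  if (header.alg !== "RS256") {
    throw new Error("Expected RS256 token, got " + header.alg);
  }
  return header;
}
```

A smoke test could call assertRs256 on the access token returned by POST /v1/tokens and compare the returned kid with the key identifiers listed in JWKS.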