feat(OAuth2): implement support for OAuth2 (1) #209

Open
fioan89 wants to merge 66 commits into main from impl-support-for-oauth
fioan89 (Collaborator) commented Oct 13, 2025

Recent versions of Coder act as an OAuth 2.1 authorization server for first- and third‑party applications.
This PR adds support for authenticating Coder Toolbox via OAuth while retaining backward compatibility with authentication via API tokens or certificates.

This PR is a WIP:

  • Infrastructure code to discover if a Coder deployment supports OAuth2
  • Infrastructure code to discover the Coder OAuth2 endpoints
  • Infrastructure code to dynamically register Coder Toolbox as a client app
  • Toolbox API is configured so that authorization page can be launched
  • Token refresh is configured as well
  • The infrastructure for exchanging authorization codes for tokens and handling the auth redirect URI
  • Add support for PKCE verification
  • Limit the permission scope to read workspaces, read templates, template versions, buildinfo, read/write workspace, agents + apps, read your own user details, authentication, read access to SSH keys
  • Retrieve the logged-in account
  • Support for logout to revoke access/all tokens for the client app
  • Support for storing the OAuth access token and refresh token and reusing them at the next start
  • Update/Redesign the Login wizard screen to skip over the token step if OAuth is available
  • Redesigned the Settings page and regrouped all settings under a couple of logical topics
  • Integrate with Coder CLI, authenticate the CLI with OAuth2 as well.
  • Update documentation

fioan89 added 11 commits October 9, 2025 22:52
Toolbox API comes with a basic OAuth2 client. This commit
sets up the details for two important OAuth flows:

- authorization flow, in which the user is sent to a web page
  where an authorization code is generated, which is then exchanged
  for an access token.
- token refresh, with details about the endpoint where users can obtain
  a new access token and a new refresh token.

A couple of important aspects:
- the client app id is resolved upstream
- as are the actual endpoints for authorization and token refresh
- S256 is the only code challenge method supported
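
As a rough illustration of the authorization flow described above, the sketch below assembles an authorization URL with the standard RFC 6749 and RFC 7636 query parameters. The `AuthorizationRequest` type, the `buildAuthorizationUrl` function and the example endpoint are assumptions made for illustration; they are not the Toolbox API or the plugin's actual code.

```kotlin
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

// Hypothetical holder for the values the flow needs; the names are illustrative only.
data class AuthorizationRequest(
    val authorizationEndpoint: String,
    val clientId: String,
    val redirectUri: String,
    val state: String,
    val codeChallenge: String,
    val scope: String? = null,
)

private fun encodeQueryPart(value: String): String =
    URLEncoder.encode(value, StandardCharsets.UTF_8)

// Builds the URL the browser is sent to. S256 is hardcoded because it is
// the only code challenge method the commit message says is supported.
fun buildAuthorizationUrl(request: AuthorizationRequest): String {
    val params = linkedMapOf(
        "response_type" to "code",
        "client_id" to request.clientId,
        "redirect_uri" to request.redirectUri,
        "state" to request.state,
        "code_challenge" to request.codeChallenge,
        "code_challenge_method" to "S256",
    )
    request.scope?.let { params["scope"] = it }
    val query = params.entries.joinToString("&") { (key, value) ->
        "${encodeQueryPart(key)}=${encodeQueryPart(value)}"
    }
    return "${request.authorizationEndpoint}?$query"
}
```

The authorization code that comes back on the redirect URI is then exchanged at the token endpoint together with the original code verifier.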
…ation url

OAuth endpoint `.well-known/oauth-authorization-server` provides metadata about
the endpoint for dynamic client registration and supported response types.
This commit adds support for deserializing these values.
OAuth allows programmatic client registration for apps like Coder Toolbox
via the DCR endpoint, which requires a name for the client app, the requested
scopes, the redirect URI, etc. DCR replies with a similar structure, but
in addition it returns two very important properties: client_id, a unique
client identifier string, and client_secret, a secret string value
used by clients to authenticate to the token endpoint.
The Coder Toolbox plugin should protect against authorization code interception
attacks by making use of the PKCE security extension, which involves
a cryptographically random string (128 characters) known as the code verifier
and a code challenge derived from the code verifier using the S256 challenge method.
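
A minimal sketch of the PKCE values described above, following RFC 7636: the verifier is random data encoded as unpadded base64url, and the S256 challenge is the unpadded base64url encoding of the verifier's SHA-256 digest. The function names are assumptions for illustration and are not taken from the plugin.

```kotlin
import java.security.MessageDigest
import java.security.SecureRandom
import java.util.Base64

private val base64Url: Base64.Encoder = Base64.getUrlEncoder().withoutPadding()

// 96 random bytes encode to exactly 128 base64url characters,
// which is the maximum verifier length RFC 7636 allows.
fun generateCodeVerifier(random: SecureRandom = SecureRandom()): String {
    val bytes = ByteArray(96)
    random.nextBytes(bytes)
    return base64Url.encodeToString(bytes)
}

// S256: BASE64URL(SHA-256(ASCII(code_verifier))), without padding.
fun deriveS256Challenge(codeVerifier: String): String {
    val digest = MessageDigest.getInstance("SHA-256")
        .digest(codeVerifier.toByteArray(Charsets.US_ASCII))
    return base64Url.encodeToString(digest)
}
```

The challenge goes into the authorization request, while the verifier stays on the client and is only sent later with the token exchange, so an intercepted authorization code is useless on its own.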
The OAuth2-compatible authentication manager provided by Toolbox
- authentication and token endpoints are now passed via the login configuration object
- similar for client_id and client_secret
- PKCE is now enabled
…injection

- remove ServiceLocator dependency from CoderToolboxContext
- move OAuth manager creation to CoderToolboxExtension for cleaner separation
- Refactor CoderOAuthManager to use configuration-based approach instead of constructor injection

The idea behind these changes is that createRefreshConfig API does not receive a configuration
object that can provide the client id and secret and even the refresh url. So initially
we worked around the issue by passing the necessary data via the constructor. However this approach
means a couple of things:

- the actual auth manager can be created only at a very late stage, when a URL is provided by users
- can't easily pass around the auth manager without coupling the components
- have to recreate a new auth manager instance if the user logs out and logs in to a different URL
- service locator needs to be passed around because this is the actual factory of oauth managers in Toolbox

Instead, we went with a different approach: CoderOAuthManager will derive and store the refresh configs once
the authorization config is received. If the user logs out and logs in to a different URL, the refresh data is
also guaranteed to be updated. On top of that, this approach allows us to get rid of all of the issues
mentioned above.
Toolbox can automatically handle the exchange of an authorization code for a token
by handling the custom URI for OAuth. This commit calls the necessary API
in the Coder Toolbox URI handling.
@fioan89 fioan89 requested review from f0ssel and jcjiang October 14, 2025 21:23
@fioan89 fioan89 changed the title from "impl: support for OAuth2" to "impl: support for OAuth2 [WIP]" Oct 14, 2025
POST /api/v2/oauth2-provider/apps is actually meant for manual
registration of admin-created apps. Programmatic Dynamic Client
Registration is done via `POST /oauth2/register`.

At the same time I included `registration_access_token` and `registration_client_uri`
so they can be used later to refresh the client secret without re-registering the client app.
A bunch of code thrown around to launch the OAuth flow.
Still needs a couple of things:
- persist the client id and registration uri and token
- re-use client id instead of re-register every time
- properly handle scenarios where OAuth is not available
- the OAuth right now can be enabled if we log out and then
hit next in the deployment screen
A new config `preferAuthViaApiToken` allows users to continue to use
API tokens for authentication when OAuth2 is available on the Coder deployment.
@f0ssel f0ssel removed their request for review January 20, 2026 15:01
Account implementation with logic to resolve the account once the token
is retrieved. Marshalling logic for the account is also added.

There is a limitation in the Toolbox API where createRefreshConfig
is not receiving the auth params. We worked around by capturing and
storing these params in the createAuthConfig but this is unreliable.
Instead we use the account to pass the missing info around.
OAuth2 should be launched only if the user prefers it over any other method of auth
and the server supports it.
Fall back to client_secret_basic or None depending on what the Coder
server supports.
…n endpoint

Depending on the auth method type, we need to send the client id and client secret either as a Basic
auth header or as part of the body in a URL-encoded form.
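
The difference between the client authentication methods can be sketched as follows. This is an illustration under assumptions: `ClientAuthMethod`, `TokenRequestParts` and `applyClientAuth` are hypothetical names, not the plugin's types. Per RFC 6749 §2.3.1 the client id and secret are form-encoded before being joined and base64-encoded for the Basic header.

```kotlin
import java.net.URLEncoder
import java.nio.charset.StandardCharsets
import java.util.Base64

// Hypothetical names; they do not come from the plugin's code.
enum class ClientAuthMethod { CLIENT_SECRET_BASIC, CLIENT_SECRET_POST, NONE }

data class TokenRequestParts(val headers: Map<String, String>, val form: Map<String, String>)

private fun formEncode(value: String): String =
    URLEncoder.encode(value, StandardCharsets.UTF_8)

// Returns the extra headers and the form body for a token endpoint call.
fun applyClientAuth(
    method: ClientAuthMethod,
    clientId: String,
    clientSecret: String?,
    form: Map<String, String>,
): TokenRequestParts = when (method) {
    // Credentials travel in the Authorization header; the body is untouched.
    ClientAuthMethod.CLIENT_SECRET_BASIC -> {
        val credentials = "${formEncode(clientId)}:${formEncode(clientSecret.orEmpty())}"
        val encoded = Base64.getEncoder()
            .encodeToString(credentials.toByteArray(StandardCharsets.UTF_8))
        TokenRequestParts(mapOf("Authorization" to "Basic $encoded"), form)
    }
    // Credentials travel as form fields in the body.
    ClientAuthMethod.CLIENT_SECRET_POST ->
        TokenRequestParts(
            emptyMap(),
            form + mapOf("client_id" to clientId, "client_secret" to clientSecret.orEmpty()),
        )
    // Public client: only the client id is sent, and no secret.
    ClientAuthMethod.NONE ->
        TokenRequestParts(emptyMap(), form + ("client_id" to clientId))
}
```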
We encountered a couple of issues with the Toolbox API which is inflexible:
- we don't have complete control over which parameters are sent as query&body
- we don't have fully basic + headers + body logging for debugging purposes
- doesn't integrate that well with our existing http client used for polling
- spent more than a couple of hours trying to understand why Coder rejects the
  authorization call with:
 ```
  {"error":"invalid_request","error_description":"The request is missing required parameters or is otherwise malformed"} from Coder server.
 ```
 Instead we will slowly discard the existing logic and rely on enhancements to our existing http client.
 Basically, the login screen will first try to determine if mTLS auth is configured and use that. Otherwise
 it will check whether the user wants to use OAuth over an API token, if available. When the flag is
 true, the login screen will query the Coder server to see if OAuth2 is supported.
 If it is, the browser is launched pointing to the authentication URL. If not, we will default to
 the API token authentication.
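
The decision order described above can be sketched as a small pure function. The names `AuthMethod` and `chooseAuthMethod` are hypothetical, and the server probe is modelled as a lambda so that the deployment is only queried when the user actually prefers OAuth; the real login screen may be structured differently.

```kotlin
// Hypothetical names used only to illustrate the decision order.
enum class AuthMethod { MTLS, OAUTH2, API_TOKEN }

fun chooseAuthMethod(
    mtlsConfigured: Boolean,
    prefersOAuth: Boolean,
    serverSupportsOAuth: () -> Boolean,
): AuthMethod = when {
    // mTLS wins whenever it is configured.
    mtlsConfigured -> AuthMethod.MTLS
    // The server is probed lazily, only when the user opted in to OAuth.
    prefersOAuth && serverSupportsOAuth() -> AuthMethod.OAUTH2
    // Everything else falls back to API token authentication.
    else -> AuthMethod.API_TOKEN
}
```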
Comment thread src/main/kotlin/com/coder/toolbox/store/CoderSecretsStore.kt
Comment thread src/main/kotlin/com/coder/toolbox/CoderRemoteProvider.kt Outdated
Comment thread src/main/kotlin/com/coder/toolbox/sdk/CoderRestClient.kt Outdated
Comment thread src/main/kotlin/com/coder/toolbox/CoderRemoteProvider.kt Outdated
Comment thread src/main/kotlin/com/coder/toolbox/views/ConnectStep.kt Outdated
Comment thread src/main/kotlin/com/coder/toolbox/sdk/CoderRestClient.kt
fioan89 added 5 commits March 9, 2026 23:15
The OAuth2 server implementation needs to provide an authorization code
that can be exchanged for an access token. But in order to make sure the
authorization code is for our login request, the client provides a
state value when launching the authorization URL, which the OAuth2 server has
to send back along with the auth code. This fix makes sure the authorization
code is actually sent, and that the state value is the same as in our initial
request.
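
A minimal sketch of the two checks described above, assuming the callback query has already been parsed into a map. `CallbackResult` and `validateCallback` are illustrative names, not the plugin's actual types.

```kotlin
// Hypothetical result type for the callback validation.
sealed interface CallbackResult {
    data class Success(val code: String) : CallbackResult
    data class Failure(val reason: String) : CallbackResult
}

fun validateCallback(params: Map<String, String>, expectedState: String): CallbackResult {
    // The state must match the value we generated when launching the authorization URL.
    if (params["state"] != expectedState) {
        return CallbackResult.Failure("State mismatch: the callback does not belong to our login request")
    }
    // The authorization code must actually be present.
    val code = params["code"]
    if (code.isNullOrBlank()) {
        return CallbackResult.Failure("Authorization code is missing from the callback")
    }
    return CallbackResult.Success(code)
}
```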
This fix reports an error to the user when the token exchange
request fails, returns an empty body, or returns a body that does not contain the token.
The logic for exchanging auth codes for tokens and refreshing tokens was used in
multiple places without any code reuse strategy. Extracted an OAuth service that
handles the basic operations.
The metadata endpoint provides an absolute URL for the client registration endpoint
which we should use instead of hardcoding the path relative to the base url.
@fioan89 fioan89 requested a review from matifali March 16, 2026 20:49
Comment thread src/main/kotlin/com/coder/toolbox/views/DeploymentUrlStep.kt Outdated
fioan89 added 8 commits March 18, 2026 23:21
https://datatracker.ietf.org/doc/html/rfc7591 normalizes the client
registration error responses and forces providers to always include
json with an error code and an error message. This patch captures the error
response and builds a pretty message and displays it to the user.
RFC 6749 §4.1.2.1 + RFC 7636 §4.4.1 specify that the error
code and the optional error_description can be returned as query params
in the callback URI.

Similarly, RFC 6749 §5.2 — the exchange of authorization codes to tokens
can return a json body containing an error code and an error message that
was never handled in our code.
This upgrade will need TBX 3.4 or higher to be installed.
The upgrade is needed to benefit from the fixes related to displaying UI
pages in the URI handler.

In addition I reworked the main build.gradle and extracted everything into a
small custom plugin.
Due to the dependency on the new API.
OAuth callbacks are URL-encoded; error details in particular
need to be decoded before surfacing them to the user.
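
As an illustration of the decoding step, the sketch below parses a raw callback query string, URL-decodes keys and values, and builds a readable message from `error` and `error_description`. The function names and the message format are assumptions, not the plugin's implementation.

```kotlin
import java.net.URLDecoder
import java.nio.charset.StandardCharsets

// Splits "a=1&b=2" into a map and URL-decodes both keys and values.
fun parseCallbackQuery(rawQuery: String): Map<String, String> =
    rawQuery.split("&")
        .filter { it.isNotEmpty() }
        .associate { pair ->
            val key = pair.substringBefore("=")
            val value = pair.substringAfter("=", "")
            URLDecoder.decode(key, StandardCharsets.UTF_8) to
                URLDecoder.decode(value, StandardCharsets.UTF_8)
        }

// Returns null when the callback carries no error. The " - " separator is an arbitrary choice.
fun describeCallbackError(params: Map<String, String>): String? {
    val error = params["error"] ?: return null
    val description = params["error_description"]
    return if (description.isNullOrBlank()) error else "$error - $description"
}
```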
We ended up with error messages like `An error was encountered: <error-code>: <some error description>`. The repeated ":" is a bit repetitive.
@matifali (Member) left a comment

Non-engineering approval. I am fine shipping this, given it's opt-in.
Just ensure we provide a good experience in case the setting is enabled, but the deployment does not have CODER_EXPERIMENTS=oauth2 enabled.

fioan89 (Collaborator, Author) commented Apr 6, 2026

> Non-engineering approval. I am fine shipping this, given it's opt-in. Just ensure we provide a good experience in case the setting is enabled, but the deployment does not have CODER_EXPERIMENTS=oauth2 enabled.

Yes, I can confirm that OAuth is used only when the user explicitly enables OAuth authentication AND the backend exposes the necessary endpoints.

fioan89 added a commit to coder/coder that referenced this pull request Apr 6, 2026
Go's html/template has a built-in security filter (urlFilter) that only allows http,
https, and mailto URL schemes. Any other scheme gets replaced with #ZgotmplZ.

The OAuth2 app's callback URL uses a custom URI scheme which the filter considers unsafe.
For example, the Coder JetBrains plugin exposes a callback URI with the scheme jetbrains://, which
was effectively changed by the template engine into #ZgotmplZ. Of course this is not an
actual callback, so when users clicked the cancel button nothing happened.

The fix was simple: we now wrap the app's registered callback URI in htmltemplate.URL.

In addition, while testing this PR with coder/coder-jetbrains-toolbox#209
I discovered that we are also not compliant with https://www.rfc-editor.org/rfc/rfc6749#section-4.1.2.1
which requires the server to attach the local state if it was provided by the client in the original
request. Also it is optional but generally a good practice to include `error_description` in the error
responses. In fact we follow this pattern for the other types of error responses. So this is not a one off.
@code-asher (Member) left a comment

I ran out of time but will finish tomorrow!

Comment thread src/main/kotlin/com/coder/toolbox/oauth/OAuth2Service.kt Outdated
throw Exception(errorMessage)
}

private fun createAuthorizationService(): CoderAuthorizationApi {
Member

This is a nit, but when I initially saw the oauth2 auth service create yet another auth service it seemed weird to me, but this is really just an http/api client right? Not really doing any service-like things, I think?

In my mind it would involve state management to be a service, which has implications for how it should behave in the code (which is why I thought maybe it was a problem to recreate it above without updating the one on the class).

I guess in that sense OAuth2Service is not necessarily a service either, just a wrapper around the API calls. Actually, could these all be methods directly on the coder rest client? Feels to me like it could be part of the sdk, they are just more API endpoints after all.

Collaborator Author

Hmm... this is an interesting point and I'd like to discuss/philosophize a bit over it.
First - CoderAuthorizationApi is just a Retrofit interface — it's an HTTP client definition, OAuth2Service is stateless — it just wraps those API calls with error handling.

Could these live on CoderRestClient? In principle yes — they're just more HTTP calls to the Coder deployment. But there's a practical issue : OAuth2Service is used before CoderRestClient exists. The discovery/registration/exchange calls are pre-authentication — they use a bare HTTP client (no auth interceptors, no token). CoderRestClient is constructed with a token or OAuth context already in hand, and its HTTP client is configured with auth interceptors. Mixing unauthenticated OAuth endpoint calls into that client would mean either:

  1. Building a second internal HTTP client without auth interceptors
  2. Making the auth interceptors conditional per-request

Both add complexity for little gain. So I think the current separation makes sense architecturally

Now regarding the naming - services usually orchestrate business logic. I guess I confused request construction with logic. I'll try to come up with a better name.

Member

Oh yeah true, good points.

Comment on lines +391 to +392
val newAuthResponse = OAuth2Service(context).refreshToken(oauthContext!!)
this.oauthContext.tokenResponse = newAuthResponse
Member

nbd at all but we use oauthContext without a this and then this.oauthContext, is there a reason for that?

I know I always say this haha but !! feels like a trap waiting to spring in the future, maybe we could pass in the context to refreshToken() or something?

Collaborator Author

Arghhhh....I always tell myself - let's quickly throw two ! together and I'll rewrite the code later, for now let's make it work. As you have noticed...that later never comes :)

Member

relatable lol

block()
try {
val response = block()
if (response.code() == HttpURLConnection.HTTP_UNAUTHORIZED && oauthContext.hasRefreshToken()) {
Member

If oauthContext is nullable would this not require a ?? Should we do an oauthContext.let or something? Would also let us get rid of that !! if we pass that around.

But I assume it must not require it since it is building, just not sure how haha

Collaborator Author

It works because hasRefreshToken is an extension function on a nullable context:

fun CoderOAuthSessionContext?.hasRefreshToken(): Boolean = this?.tokenResponse?.refreshToken != null

Member

Ooooooo did not know that was possible. Neat.

Comment thread src/main/kotlin/com/coder/toolbox/sdk/CoderRestClient.kt
Comment thread src/main/kotlin/com/coder/toolbox/sdk/CoderRestClient.kt Outdated
refreshToken()
true
} catch (e: Exception) {
context.logger.error(e, "Failed to refresh access token")
Member

If the refresh fails, do we need to look at the response and possibly discard the token? If it is some kind of permanent auth failure. Otherwise I imagine we would keep trying to refresh with the same token.

Collaborator Author

I think I'll log an issue. There are transient reasons (network failure??) for which it is worth retrying at the next API call. But for other errors like invalid_grant or unauthorized_client it is probably pointless to retry. And it is not easy to tackle this problem: do we disrupt the user and remove their workspaces, stop the ongoing ssh connections and pop up the login screen, or do we stop the world but still keep the workspaces visible? I think it is worth pondering over and maybe involving @matifali in the discussion, but in a separate ticket.
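
One possible shape for the transient-versus-permanent distinction discussed above, as a sketch only. `RefreshFailure` and `classifyRefreshFailure` are hypothetical names; treating every code other than the two named above as transient is an assumption, and what to do with a permanent failure is left open, as in the discussion.

```kotlin
// Hypothetical classification of a failed token refresh.
enum class RefreshFailure { TRANSIENT, PERMANENT }

// oauthErrorCode is the "error" field of the token endpoint's JSON error body,
// or null when no OAuth error body was received (for example a network failure).
fun classifyRefreshFailure(oauthErrorCode: String?): RefreshFailure = when (oauthErrorCode) {
    // Retrying with the same refresh token cannot succeed for these codes.
    "invalid_grant", "unauthorized_client" -> RefreshFailure.PERMANENT
    // Assumption: anything else is treated as worth retrying at the next API call.
    else -> RefreshFailure.TRANSIENT
}
```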

@code-asher (Member) left a comment

Tried it out and oauth worked fabulously! Just FYI I only reviewed the oauth stuff and skipped buildSrc.

Had some observations (some of these might be pre-existing issues):

  1. If I deselect the oauth option in the settings it seems to immediately go back to using the API key. But the reverse seems not to be true, if I select oauth while already logged in then it keeps using my API key (until I restart Toolbox).

Is that what we want? I think I expected my session to keep using whatever it used to log in until I explicitly logged out. I also kind of expected my API key to be deleted once I switched to oauth (or rather I expected it would be overridden with the oauth access token), but it seems to have been preserved, so even if I explicitly log out and then disable oauth, it still authenticates with the API key which was surprising to me.

  2. This one is kinda wacky and probably not going to happen in practice so might not be worth addressing, but if I try to log in, then disable oauth, then click "allow", the error message does not really make sense, says "oauth or api token is required" but I feel like the error should really be something like "unable to log in with oauth because it is disabled". Or, we should remember what we used to start the login and carry that through the whole process.

  3. When I launched Toolbox I got "Error encountered while setting up Coder" and "authorization failed" which looks a bit scary but according to the logs my API key was invalid (makes sense, I have not launched Toolbox in a long while). We should add the actual error message to that dialog.

  4. If I click out of Toolbox while the cli is downloading (for example if I close the browser window opened by the oauth process) then Toolbox closes itself (idk why they made it like this) and if I re-open it I am back on the URL screen. It looks like in the background the setup is still ongoing though. If I try to log in again, they seem to clash and I do get the workspaces list but I also get the security dialog (maybe because it is trying to use a file that got overwritten by the other). Same thing happens if I click "back" and log in again. Not sure what happens if I do that and try to log into a different deployment while the other is ongoing.

Comment on lines +100 to +101
refreshOAuthToken()
oauthSession = CoderSetupWizardContext.oauthSession!!.copy()
Member

Could we call refreshToken on the rest client instead of duplicating it here? Would have to create the rest client first of course. Mostly it feels kinda scattered to me with how we have two copies of oauth session context and we have to kinda glue them together.

Also maybe we could return the session to avoid !!.

Collaborator Author

I tried to do it quickly, but there are a number of issues:

  • the cli also needs to be initialized
  • I need to find a good way to refresh the token in the http client before any other API call is made.

Is it alright if I log an issue and treat this separately?

Collaborator Author

> if I deselect the oauth option in the settings it seems to immediately go back to using the API key. But the reverse seems not to be true

This sounds really scary. If you are logged in via OAuth, and go to the Settings page and deselect "Prefer OAuth..", the API token should be used only at the next restart or if you log out and log in again. The client should not terminate automatically and go through the login screen again. Is this what you are experiencing? (I can't reproduce it.)

And yeah... it was intentional to keep the API token stored.

@code-asher do you think I can treat all of these in a separate issue/PR? It is a lot of stuff that I need to test and think about. I will need Atif's help to decide the behavior for some of these scenarios (like removing the API token).

@code-asher (Member) commented Apr 10, 2026

> do you think I can treat all of these into a separate issue/PR

Yes of course!

> If you are logged via oauth, and go to the Settings page and deselect the "Prefer OAuth.." the API token should be used only at the next restart or if you log out and log in again. The client should not terminate automatically, and go through the login screen again. Is this what you are experiencing

What happened is that I deselected "prefer oauth", then I revoked permissions for the oauth app in my Coder dashboard. But it kept connecting to workspaces just fine.

To me, this felt unexpected, I thought the current session would keep using oauth (and so connecting would result in an auth failure), and for a second it felt like somehow it had bypassed auth lol until I realized it was probably just using my old API key I had configured before switching to oauth.

I think it is because we check the "prefer oauth" setting each time rather than having that info embedded as part of the session itself. IMO it should only affect the decision we make when first logging in.

fioan89 added 4 commits April 8, 2026 22:42
The class is currently named OAuth2Service, which oversells it: it
only orchestrates http request construction, and there is no business
logic orchestration or state management.
We used to share the oauth context as a global val that could
be mutated once a token had to be refreshed. Instead, we changed
the code to pass a modified copy of this oauth context with the
refreshed token.
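
The copy-instead-of-mutate approach can be illustrated with Kotlin data classes. The types below are simplified stand-ins with hypothetical names, not the plugin's actual session context.

```kotlin
// Simplified, hypothetical stand-ins for the real session types.
data class TokenResponse(val accessToken: String, val refreshToken: String?)

data class OAuthSessionContext(val clientId: String, val tokenResponse: TokenResponse)

// Returns a new context carrying the refreshed tokens; the original is left untouched,
// so no shared global state is mutated.
fun withRefreshedToken(context: OAuthSessionContext, refreshed: TokenResponse): OAuthSessionContext =
    context.copy(tokenResponse = refreshed)
```

Because every field is a `val`, callers holding the old context keep seeing the old token until they are handed the new copy, which makes the point of the hand-over explicit.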
fioan89 added a commit to coder/coder that referenced this pull request Apr 10, 2026
Go's html/template has a built-in security filter (urlFilter) that only
allows http, https, and mailto URL schemes. Any other scheme gets
replaced with #ZgotmplZ.

The OAuth2 app's callback URL uses a custom URI scheme which the filter
considers unsafe. For example, the Coder JetBrains plugin exposes a
callback URI with the scheme jetbrains://, which was effectively
changed by the template engine into #ZgotmplZ. Of course this is not an
actual callback, so when users clicked the cancel button nothing happened.

The fix was simple: we now wrap the app's registered callback URI in
htmltemplate.URL. Usually this needs some validation, otherwise the
linter will complain about it. The callback URI used by the Cancel logic
is actually validated by our backend when the client app is
programmatically registered via the dynamic OAuth2 registration
endpoints, so we refactored the validation around that code and re-used
some of it in the Cancel handling to make sure we don't allow URIs like
`javascript` and `data`, even though in theory these URIs were already
validated.

In addition, while testing this PR with
coder/coder-jetbrains-toolbox#209 I discovered
that we are also not compliant with
https://www.rfc-editor.org/rfc/rfc6749#section-4.1.2.1 which requires
the server to attach the local state if it was provided by the client in
the original request. Also it is optional but generally a good practice
to include `error_description` in the error responses. In fact we follow
this pattern for the other types of error responses. So this is not a
one off.

- resolves #20323
[Screenshot: Cancel page with invalid URI]
[Screenshot: Coder Toolbox handling the Cancel button]

@code-asher (Member) left a comment

Putting down an approval, we can follow up on anything remaining separately!
