infra: split shared Canton OAuth user into per-service users (api-server, relayer, indexer) #243

@salindne

Background

The api-server, relayer, and indexer all authenticate to Canton with the same OAuth credentials (CANTON_AUTH_CLIENT_ID / CANTON_AUTH_CLIENT_SECRET). The resulting JWT sub claim becomes their Canton user_id, so Canton sees a single user with the union of all rights any service needs. That same user appears to be shared with ChainSafe's autopilot app, since both operate the same daml-autopilot::1220... party.

A quick framing note up front: daml-autopilot::1220... is a Canton party (an on-chain identity with signing keys). The Canton user is a separate concept: a ledger user-management record, mapped from the JWT sub claim. Today we share both. This issue is about splitting the user (auth identity), not the party (on-chain identity). Services would still act as the same parties they do today.

The same CANTON_AUTH_CLIENT_ID is wired into:

  • pkg/config/defaults/config.api-server.{docker,local-devnet,mainnet}.yaml
  • pkg/config/defaults/config.relayer.{docker,local-devnet,mainnet}.yaml
  • pkg/config/defaults/config.indexer.{docker,local-devnet}.yaml

Same client_id → same JWT sub → same Canton user_id.
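
To make that chain concrete, here is a minimal sketch of how the sub claim can be read out of a token. It is illustrative only: the helper names (subjectFromJWT, fakeToken) and the sub value are hypothetical, the signature is not verified, and the actual mapping from sub to user_id happens inside Canton, not in middleware code.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// subjectFromJWT returns the "sub" claim of a JWT.
// It does NOT verify the signature; it is for inspection only.
func subjectFromJWT(token string) (string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return "", errors.New("token is not a three-part JWT")
	}
	// JWT segments are base64url-encoded without padding.
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return "", fmt.Errorf("decode payload: %w", err)
	}
	var claims struct {
		Sub string `json:"sub"`
	}
	if err := json.Unmarshal(payload, &claims); err != nil {
		return "", fmt.Errorf("parse claims: %w", err)
	}
	if claims.Sub == "" {
		return "", errors.New("token has no sub claim")
	}
	return claims.Sub, nil
}

// fakeToken builds an unsigned token for illustration (hypothetical helper).
func fakeToken(sub string) string {
	enc := base64.RawURLEncoding.EncodeToString
	header := enc([]byte(`{"alg":"none"}`))
	body, _ := json.Marshal(map[string]string{"sub": sub})
	return header + "." + enc(body) + ".sig"
}

func main() {
	// All three services present a token minted for the same client,
	// so all three resolve to the same subject.
	for _, svc := range []string{"api-server", "relayer", "indexer"} {
		sub, err := subjectFromJWT(fakeToken("shared-client@clients"))
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s -> Canton user_id %q\n", svc, sub)
	}
}
```

Running this against a real token from each service is a quick way to confirm whether they resolve to the same subject.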

Rights the shared user currently has

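Based on what's granted in code, the shared user holds two rights: CanExecuteAsAnyParty and CanReadAsAnyParty, the pair that grant-any-party-rights.go always grants. (Infra: please run ListUserRights against the prod and devnet ledgers to confirm the live state.)

Both rights are wildcards: the user can act as any party and read as any party on the participant.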

Note on per-user CanActAs grants: an earlier model granted CanActAs <userParty> per registered user via Client.GrantActAsParty (pkg/cantonsdk/identity/client.go:368). Those callers were removed in commit 766f169 when external parties + interactive submission replaced the older internal-party custodial model. The method still exists on the interface but has no production callers — CanExecuteAsAnyParty covers what those grants used to do.

Why this is a problem

  1. Privilege: the indexer never submits commands but currently has CanExecuteAsAnyParty. The relayer only ever acts as a single party (the bridge operator) but currently has wildcard any-party rights. Neither needs what it has.
  2. Blast radius: if CANTON_AUTH_CLIENT_SECRET leaks, all four services (api-server, relayer, indexer, autopilot) are compromised at once. One revocation, four outages.
  3. Auditability: Canton ledger logs user_id per command. Today every command from the middleware and autopilot reads as the same user_id — there is no way to tell from the ledger whether a transfer was submitted by the api-server's custodial flow, the relayer, or autopilot.
  4. Independent rotation / SLOs: rotating the shared secret today requires coordinating with the autopilot team. Per-user rate limits and metrics in Canton apply to the union of all four services.

Proposed split

Four dedicated OAuth clients (three new, plus the existing autopilot client) → four dedicated Canton users, each with the minimum-necessary rights:

| OAuth client | Granted Canton rights | Rationale |
|---|---|---|
| canton-mw-api-server | CanExecuteAsAnyParty, CanReadAsAnyParty | Submits commands acting as any external user party (interactive submission for transfers). Needs the wildcard execute claim. |
| canton-mw-relayer | CanActAs <relayer_party> only | Relayer only ever submits as the bridge operator party (cfg.OperatorParty). Single-party scope. No any-party rights needed. |
| canton-mw-indexer | CanReadAsAnyParty only | Indexer is read-only: it never submits commands, only subscribes via GetUpdates. PR #234 needs FiltersForAnyParty, which requires CanReadAsAnyParty. |
| canton-autopilot | unchanged | Existing user, stays as-is. Just no longer shared with middleware. |
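
The same split can be written down as data, which may be handy for provisioning scripts or review. The sketch below is an assumption-laden illustration: the Right and RightKind types and the rightsFor function are hypothetical and do not exist in the repo, and the operator party ID passed in main is a placeholder.

```go
package main

import "fmt"

// RightKind names a Canton user right (hypothetical local type).
type RightKind string

const (
	CanActAs             RightKind = "CanActAs"
	CanReadAsAnyParty    RightKind = "CanReadAsAnyParty"
	CanExecuteAsAnyParty RightKind = "CanExecuteAsAnyParty"
)

// Right is one grant; Party is set only for party-scoped rights.
type Right struct {
	Kind  RightKind
	Party string
}

// rightsFor returns the proposed rights for a middleware OAuth client.
// canton-autopilot is deliberately absent: its user stays unchanged.
func rightsFor(client, operatorParty string) ([]Right, error) {
	switch client {
	case "canton-mw-api-server":
		return []Right{{Kind: CanExecuteAsAnyParty}, {Kind: CanReadAsAnyParty}}, nil
	case "canton-mw-relayer":
		if operatorParty == "" {
			return nil, fmt.Errorf("relayer needs an operator party")
		}
		return []Right{{Kind: CanActAs, Party: operatorParty}}, nil
	case "canton-mw-indexer":
		return []Right{{Kind: CanReadAsAnyParty}}, nil
	default:
		return nil, fmt.Errorf("no rights defined for client %q", client)
	}
}

func main() {
	// "operator::1220abcd" is a placeholder, not a real party ID.
	for _, c := range []string{"canton-mw-api-server", "canton-mw-relayer", "canton-mw-indexer"} {
		rights, err := rightsFor(c, "operator::1220abcd")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(c, rights)
	}
}
```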

Per-service evidence

api-server — submits commands acting as user parties (external parties via interactive submission):

User parties are allocated dynamically per registration; they're not knowable at provisioning time, so the wildcard execute right is the operational mechanism today.

relayer — only ever submits as cfg.OperatorParty:

indexer — read-only stream, no submissions:

Asks to infra

  1. Baseline: run ListUserRights against the current shared user on devnet + mainnet and share output, so we can confirm the assumption above matches reality.
  2. Provision three new Auth0 OAuth clients (devnet + mainnet pairs each):
    • canton-mw-api-server
    • canton-mw-relayer
    • canton-mw-indexer
  3. Register a Canton user for each (user_id = whatever sub claim the OAuth client emits).
  4. Grant rights per the table above — using the same UserManagementService.GrantUserRights pattern that grant-any-party-rights.go uses today. We're happy to extend that script with a --rights flag to support tighter per-service grants if useful.
  5. Provision secrets into the deployment env (per-service env vars, e.g. CANTON_AUTH_CLIENT_ID_API, CANTON_AUTH_CLIENT_ID_RELAYER, CANTON_AUTH_CLIENT_ID_INDEXER).
  6. Confirm the autopilot user is unchanged and untouched.
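
For ask 4, one possible shape for the proposed --rights flag is sketched below. The flag syntax (act-as:<party>, read-as-any, execute-as-any), the parseRights function, and the Right type are all assumptions made for illustration; nothing here reflects the current contents of grant-any-party-rights.go, and the actual GrantUserRights call is left out.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// Right is one parsed grant (hypothetical type).
type Right struct {
	Kind  string
	Party string // set only for CanActAs
}

// parseRights turns a comma-separated spec into rights.
// Assumed syntax: "act-as:<party>", "read-as-any", "execute-as-any".
func parseRights(spec string) ([]Right, error) {
	var out []Right
	for _, item := range strings.Split(spec, ",") {
		item = strings.TrimSpace(item)
		// Cut on the first ":" only, so party IDs containing "::" stay intact.
		name, party, hasParty := strings.Cut(item, ":")
		switch {
		case name == "act-as" && hasParty && party != "":
			out = append(out, Right{Kind: "CanActAs", Party: party})
		case item == "read-as-any":
			out = append(out, Right{Kind: "CanReadAsAnyParty"})
		case item == "execute-as-any":
			out = append(out, Right{Kind: "CanExecuteAsAnyParty"})
		default:
			return nil, fmt.Errorf("unrecognised right %q", item)
		}
	}
	return out, nil
}

func main() {
	fs := flag.NewFlagSet("grant-rights", flag.ContinueOnError)
	// The default keeps today's behaviour: both wildcard rights.
	spec := fs.String("rights", "execute-as-any,read-as-any", "comma-separated rights to grant")
	// Hard-coded arguments for illustration; the party ID is a placeholder.
	if err := fs.Parse([]string{"--rights", "act-as:operator::1220abcd"}); err != nil {
		os.Exit(2)
	}
	rights, err := parseRights(*spec)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	fmt.Println(rights)
}
```

Keeping the default equal to the current wildcard pair would let existing invocations keep working while the per-service grants are rolled out.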

Follow-up middleware work (out of scope for infra)

These are tracked separately and depend on infra confirming feasibility:

  • Rename CANTON_AUTH_CLIENT_ID env vars to per-service variants in middleware deploy configs.
  • Add fail-fast startup checks that each service's user actually has the rights it needs. This is especially relevant for the indexer: once PR #234 (feat(indexer): support external token indexing via FiltersForAnyParty) lands, a misconfigured user silently misses events instead of failing loudly. Such a check would catch the failure at boot.
  • Extend grant-any-party-rights.go with a --rights flag for per-service grants, instead of always granting both CanExecuteAsAnyParty + CanReadAsAnyParty.
  • Remove the now-unused Client.GrantActAsParty method (and its IdentityAdmin interface declaration) since callers were dropped in 766f169.
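
The fail-fast startup check could look roughly like the sketch below. It is a hedged illustration, not a design: checkRights is a hypothetical function, rights are modelled as plain strings, and the granted list is hard-coded here where a real service would obtain it from a ListUserRights call. Matching is exact, so any implication between rights (for example, a wildcard covering a party-scoped right) is not modelled.

```go
package main

import (
	"fmt"
	"strings"
)

// checkRights returns an error naming every required right that is absent
// from the granted set. Matching is by exact string only.
func checkRights(service string, required, granted []string) error {
	have := make(map[string]bool, len(granted))
	for _, g := range granted {
		have[g] = true
	}
	var missing []string
	for _, r := range required {
		if !have[r] {
			missing = append(missing, r)
		}
	}
	if len(missing) > 0 {
		return fmt.Errorf("%s: Canton user is missing rights: %s",
			service, strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	required := []string{"CanReadAsAnyParty"}

	// Correctly provisioned indexer user.
	if err := checkRights("indexer", required, []string{"CanReadAsAnyParty"}); err != nil {
		fmt.Println("startup aborted:", err)
	} else {
		fmt.Println("indexer rights check passed")
	}

	// Misconfigured user: a real service would exit here instead of
	// starting up and silently missing events.
	if err := checkRights("indexer", required, nil); err != nil {
		fmt.Println("startup aborted:", err)
	}
}
```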

Trigger

This came up while reviewing PR #234 (feat(indexer): support external token indexing via FiltersForAnyParty), which depends on CanReadAsAnyParty being available to the indexer's user. Today that works "for free" because the api-server already has the right via the shared user — but that coincidence is exactly the problem this issue is trying to fix.
