Skip to content

Restore OAuth against the reworked Cloud scope model#53

Merged
ChiragAgg5k merged 1 commit into
mainfrom
fix/oauth-cloud-scope-model
Jul 3, 2026
Merged

Restore OAuth against the reworked Cloud scope model#53
ChiragAgg5k merged 1 commit into
mainfrom
fix/oauth-cloud-scope-model

Conversation

@ChiragAgg5k

Copy link
Copy Markdown
Member

Summary

The Cloud console OAuth server reworked its scope model (RAR v2 — appwrite-labs/cloud#4546 and follow-ups): the old scope names (account.admin, teams.read, projects.read, …) no longer exist. The server now publishes ~120 fine-grained scopes (project:*, organization:*) plus umbrella scopes (all, project:all, organization:all), with concrete project/organization selection happening on the consent screen via RFC 9396 authorization_details.

That broke the hosted MCP's OAuth flow in two independent ways, because the MCP mirrors the authorization server's entire scopes_supported into its RFC 9728 protected-resource metadata and MCP clients request every scope advertised there:

  1. Stale discovery cache — the cache lived for the process lifetime, so production kept advertising the removed scope names. Clients requesting them get invalid_scope ("Scope 'account.admin' is not supported") from the authorize endpoint.
  2. Scope parameter overflow — even after a restart, mirroring the new catalog produces a 2,680-character scope string, over the authorize endpoint's 2,048-character cap on the scope parameter.

The rest of the contract is intact: discovery, JWKS (RS256), and aud binding via resource indicators all still work, and the issuer moving to the regional host (fra.cloud.appwrite.io) is already handled by the issuer-from-discovery logic.

Auth fixes

  • Curated scope advertising: the MCP now advertises openid profile email all project:all organization:all, intersected with the AS's live scopes_supported so a renamed/removed scope is never advertised. Falls back to mirroring the full discovered list when none of the preferred scopes exist (self-hosted projects with custom scope catalogs). Overridable via MCP_OAUTH_SCOPES (space-separated).
  • Discovery cache TTL: 5-minute TTL with stale-fallback when a refresh fails — authorization-server changes now propagate without a redeploy, and an AS blip doesn't take the MCP down.

Housekeeping (same branch)

  • constants.py: module-level constants consolidated into a single file, grouped by consumer. Dedupes DEFAULT_ENDPOINT (was defined in both auth.py and server.py); generic docs_search names gained a DOCS_ prefix; private-by-convention names dropped the leading underscore now that they're shared. Mutable/derived module state (_UPLOAD_TRANSPORT, SERVICE_CLASSES) stays where it lives.
  • Pyright in CI: pyright (basic mode over src/, configured in pyproject.toml) added to the dev group and the lint job, plus the AGENTS.md pre-PR checklist. Fixed the 13 pre-existing errors it found: Optional narrowing in docs_search._rank, ASGIApp typing for RequireBearer, AnyUrl construction for resource URIs, dict | None return on ToolManager.get_tool, a missing inline-content None guard in uploads, doc_param.description or "", the enum schema annotation, and the anyio.to_thread submodule import (previously relied on another library importing it first).

Verification

  • All 94 unit tests pass; ruff, black, and pyright clean.
  • Booted the HTTP transport locally against production Cloud: serves the curated 6-scope metadata with the fra.cloud.appwrite.io authorization server; 401 challenge intact.
  • End-to-end against production: dynamically registered a test client and hit /authorize with the curated scope set + resource=https://mcp.appwrite.io/mcp — it 302s to the consent screen instead of erroring.

Note

Production needs a deploy of this change — a bare restart would still hit failure mode 2.

The Cloud console OAuth server reworked its scope model (RAR v2): the old
scope names (account.admin, teams.read, ...) were replaced by OIDC scopes
plus umbrella all/project:all/organization:all scopes, with concrete
project/org selection handled on the consent screen via RFC 9396
authorization_details. That broke the hosted MCP twice over: the
process-lifetime discovery cache kept advertising the removed scope names
(clients got invalid_scope), and mirroring the new 120-scope catalog would
exceed the authorize endpoint's scope parameter length cap.

Auth fixes:
- Advertise a curated scope set (openid profile email all project:all
  organization:all), intersected with the authorization server's live
  scopes_supported; fall back to mirroring discovery for projects with
  custom catalogs. Overridable via MCP_OAUTH_SCOPES.
- Give the discovery cache a 5-minute TTL with stale-fallback on refresh
  failure, so authorization-server changes propagate without a redeploy.

Housekeeping in the same change:
- Consolidate module-level constants into constants.py (dedupes
  DEFAULT_ENDPOINT, prefixes the generic docs_search names with DOCS_).
- Add a pyright (basic) check to the lint job and fix the type errors it
  found: Optional annotations, ASGIApp typing, AnyUrl resource URIs, a
  missing inline-content guard, and the anyio.to_thread submodule import.
@greptile-apps

greptile-apps Bot commented Jul 3, 2026

Copy link
Copy Markdown

Greptile Summary

This PR restores the hosted MCP's OAuth flow against Appwrite Cloud's reworked RAR v2 scope model by curating the advertised scope set and adding a TTL-based discovery cache with stale fallback. It also consolidates module-level constants into constants.py and fixes 13 pre-existing Pyright errors.

  • Auth fixes: _advertised_scopes intersects a short preferred list (openid profile email all project:all organization:all) with the AS's live scopes_supported, eliminating both the stale-cache and scope-parameter-overflow failures; override via MCP_OAUTH_SCOPES; 5-minute discovery TTL with stale-serve on failed refresh.
  • Housekeeping: constants.py deduplifies constants previously scattered across server.py, operator.py, context.py, docs_search.py, and telemetry.py; Pyright added to the lint CI job and pyproject.toml; type fixes in service.py, tool_manager.py, server.py, and docs_search.py.

Confidence Score: 4/5

Safe to merge; all changes are additive and the core auth contract (issuer binding, audience check, JWKS RS256 verification) is unchanged. The two items worth watching are both edge-case behaviors under specific failure conditions.

The curated-scope and TTL-cache logic is the heart of this PR and is well-covered by the new test suite. Two non-blocking concerns exist: during an AS outage, stale metadata is served correctly but every request independently retries the failed fetch (10 s timeout each), because the stale entry's timestamp is never refreshed; and the MCP_OAUTH_SCOPES env override falls back to the full discovered catalog when none of its scopes intersect with the AS — counter to the operator's intent.

auth.py — the stale-fallback path in both authorization_server_metadata() and authorization_server_metadata_sync() and the _advertised_scopes fallback logic when the env override produces an empty intersection.

Important Files Changed

Filename Overview
src/mcp_server_appwrite/auth.py Core OAuth/discovery rework: adds TTL-based cache, stale fallback, and curated scope list. Two P2 concerns: repeated 10-second HTTP retries on every request during an AS outage (stale timestamp never refreshed), and the MCP_OAUTH_SCOPES env override silently falls back to the full catalog when none of its scopes intersect with the AS catalog.
src/mcp_server_appwrite/constants.py New module consolidating all package-level constants. Moves constants from server.py, operator.py, context.py, docs_search.py, and telemetry.py into one place, grouped by consumer. No functional changes; clean refactor.
src/mcp_server_appwrite/http_app.py Switches CORS_HEADERS and SERVER_VERSION imports to constants.py; adds ASGIApp type annotation to RequireBearer; no logic changes.
src/mcp_server_appwrite/operator.py Constants moved to constants.py; resource URIs now wrapped in AnyUrl() for Pydantic v2 type correctness. Clean, low-risk changes.
src/mcp_server_appwrite/server.py Constants moved to constants.py; optional narrowing fixed for definition access; AnyUrl wrapping for BlobResourceContents URI; inline-content None guard added before decoding. All pre-existing Pyright errors corrected.
tests/unit/test_auth.py Tests updated to use _store_discovery helper; six new tests covering TTL expiry, stale fallback, curated scope filtering, env override, and multi-AS discovery. Good coverage of the new cache and scope logic.
pyproject.toml Adds pyright>=1.1.390 to dev dependencies and [tool.pyright] config block (basic mode, Python 3.12, venv-resolved). Clean addition.

Reviews (1): Last reviewed commit: "Restore OAuth against the reworked Cloud..." | Re-trigger Greptile

Comment on lines 115 to 157
async def authorization_server_metadata() -> dict:
project_id = configured_project_id()
cached = _discovery_cache.get(project_id)
cached = _cached_discovery(project_id)
if cached is not None:
return cached

url = discovery_url()
async with httpx.AsyncClient(timeout=10.0, follow_redirects=True) as client:
resp = await client.get(url)
resp.raise_for_status()
metadata = _validate_discovery(resp.json(), url)

_discovery_cache[project_id] = metadata
try:
async with httpx.AsyncClient(timeout=10.0, follow_redirects=True) as client:
resp = await client.get(url)
resp.raise_for_status()
metadata = _validate_discovery(resp.json(), url)
except Exception as exc:
stale = _cached_discovery(project_id, allow_stale=True)
if stale is not None:
_log(f"Discovery refresh failed ({exc}); serving stale metadata.")
return stale
raise

_store_discovery(project_id, metadata)
return metadata


def authorization_server_metadata_sync() -> dict:
project_id = configured_project_id()
cached = _discovery_cache.get(project_id)
cached = _cached_discovery(project_id)
if cached is not None:
return cached

url = discovery_url()
resp = httpx.get(url, timeout=10.0, follow_redirects=True)
resp.raise_for_status()
metadata = _validate_discovery(resp.json(), url)
_discovery_cache[project_id] = metadata
try:
resp = httpx.get(url, timeout=10.0, follow_redirects=True)
resp.raise_for_status()
metadata = _validate_discovery(resp.json(), url)
except Exception as exc:
stale = _cached_discovery(project_id, allow_stale=True)
if stale is not None:
_log(f"Discovery refresh failed ({exc}); serving stale metadata.")
return stale
raise

_store_discovery(project_id, metadata)
return metadata

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Continuous retry overhead when AS is down

Both authorization_server_metadata() and authorization_server_metadata_sync() return stale data on a failed refresh, but they don't update the cached timestamp. This means every subsequent request — including every _verify_sync call, which runs in a thread-pool thread — will attempt a fresh 10-second HTTP fetch before falling back to the stale copy. Under any meaningful load during an AS outage, concurrent threads each block for the full timeout=10.0, which can exhaust the threadpool and dramatically degrade response times despite the MCP continuing to serve correct results. A simple fix is to bump the cached entry's timestamp when a stale hit is served so the entry is treated as fresh for the next TTL window.

Comment on lines +160 to +178
def _advertised_scopes(metadata: dict) -> list[str]:
"""The scope set to advertise: the preferred scopes intersected with the
authorization server's live ``scopes_supported`` (so a renamed/removed scope
is never advertised). Falls back to mirroring the full discovery list when
none of the preferred scopes exist — e.g. a self-hosted project with a
custom, compact scope catalog."""
discovered = metadata.get("scopes_supported")
if not isinstance(discovered, list):
raise ValueError(
f"authorization server discovery missing scopes_supported: {discovery_url()}"
)
return scopes
scopes = [scope for scope in preferred_scopes() if scope in discovered]
if scopes:
return scopes
_log(
"None of the preferred scopes are in the authorization server's "
"scopes_supported; advertising the full discovered list."
)
return discovered

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 MCP_OAUTH_SCOPES override silently falls back to the full catalog

When MCP_OAUTH_SCOPES is set but none of the specified scopes exist in the AS's scopes_supported, _advertised_scopes falls back to mirroring the full discovered list. For Cloud, that's 120+ scopes — the overflow condition the env var is specifically meant to prevent. An operator who sets this variable to work around a non-standard AS catalog would find it has no effect, and the same scope-parameter overflow will occur. The fallback log message fires, but the behavior is counter-intuitive and the current tests only cover the case where the override scopes exist in the catalog.

@ChiragAgg5k ChiragAgg5k merged commit ba0b18e into main Jul 3, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant