From e2450a497a6e23150e3ddfecc497fe108d57ba04 Mon Sep 17 00:00:00 2001 From: Vasyl Vdovychenko Date: Tue, 16 Jun 2026 16:28:22 -0400 Subject: [PATCH] =?UTF-8?q?feat(ai):=20DeviceFlowTokenProvider=20=E2=80=94?= =?UTF-8?q?=20completes=20MCP=20device=20flow=20(AI-050b)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The CLI side of AI-050a: the MCP server now obtains a per-user JWT via the device grant instead of a static shared token. - IMcpTokenProvider goes async: Task<TokenResult> GetTokenAsync with Authorized | Pending(verificationUri, userCode) | Failed. AuthorizedRequest awaits it; Pending/Failed → McpUnauthorizedException → the catalog renders an actionable IsError ('open {uri} and enter code {user_code}, then retry'). - DeviceFlowTokenProvider: local JWT-exp check → body-based refresh via /auth/refresh-mobile (rotates the refresh token) → single-flight device flow. First user-scoped call with no creds kicks off /auth/device/code in the BACKGROUND (stderr prompt, poll loop at interval until approved/ expired/denied) and returns Pending IMMEDIATELY — never blocks the tool call. Self-heals (finally → ClearInFlight) so a later call re-initiates. - TokenCache ~/.textstack/mcp-token.json (XDG-aware, env override), 0600 set BEFORE the secret is written; group/world-readable files refused on read. - Mode: TEXTSTACK_MCP_TOKEN set → StaticEnvTokenProvider (CI escape hatch); else DeviceFlowTokenProvider. stdout stays JSON-RPC only (prompt → stderr). QA fixes: single-flight claimed on the device-code REQUEST (concurrent first-callers issue exactly ONE /auth/device/code and share the same Pending code — was orphaning server codes); failed code POST is retryable; genuine caller cancellation propagates without killing the background poll. 29 MCP device-flow/cache tests (non-blocking timing, single-flight concurrency, refresh+rotation, exp-decode edge cases, 0600, recovery). 561 unit green; StudyBuddy set-equality green. 
Unlocks AI-048b (write tool) on a per-user consented token. Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 11 + .../Auth/DeviceFlowTokenProvider.cs | 479 +++++++++++++ .../Auth/IMcpTokenProvider.cs | 43 +- .../Auth/StaticEnvTokenProvider.cs | 14 +- .../Ai/TextStack.Ai.Mcp/Auth/TokenCache.cs | 137 ++++ .../Http/TextStackApiClient.cs | 70 +- .../Ai/TextStack.Ai.Mcp/McpBridgeOptions.cs | 19 +- backend/src/Ai/TextStack.Ai.Mcp/Program.cs | 39 +- .../TextStack.Ai.Mcp/TextStack.Ai.Mcp.csproj | 6 + .../TextStack.Ai.Mcp/Tools/McpToolCatalog.cs | 13 +- .../TextStack.UnitTests/McpDeviceFlowTests.cs | 678 ++++++++++++++++++ .../McpTokenProviderTests.cs | 142 ++++ 12 files changed, 1607 insertions(+), 44 deletions(-) create mode 100644 backend/src/Ai/TextStack.Ai.Mcp/Auth/DeviceFlowTokenProvider.cs create mode 100644 backend/src/Ai/TextStack.Ai.Mcp/Auth/TokenCache.cs create mode 100644 tests/TextStack.UnitTests/McpDeviceFlowTests.cs create mode 100644 tests/TextStack.UnitTests/McpTokenProviderTests.cs diff --git a/CHANGELOG.md b/CHANGELOG.md index 86e0573d..15505e23 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,17 @@ ## [Unreleased] +### Phase 8 — MCP device-flow token provider (AI-050b) (2026-06-16) + +CLI-side completion of the device flow built in AI-050a: the headless MCP bridge now obtains a per-user TextStack JWT on its own — no `TEXTSTACK_MCP_TOKEN` needed. All in `backend/src/Ai/TextStack.Ai.Mcp/` + its unit tests. + +- **`IMcpTokenProvider` is now async + three-valued.** Replaced the sync `string? GetToken()` with `Task<TokenResult> GetTokenAsync(CancellationToken)`, where `TokenResult` is a closed hierarchy `Authorized(accessToken) | Pending(verificationUri, userCode) | Failed(message)`. `TextStackApiClient.AuthorizedRequest` became async: `Authorized` → attach Bearer; `Pending`/`Failed` → throw `McpUnauthorizedException` (now carries the verification URL + user code, or the failure message). 
The catalog's `InvokeAsync` catch renders an **actionable** `IsError` — Pending: `"authentication required — open {uri} and enter code {user_code} to connect TextStack, then retry."`; Failed: the reason. `StaticEnvTokenProvider` now returns `Authorized(token)` or `Failed("no TEXTSTACK_MCP_TOKEN configured")`. +- **`DeviceFlowTokenProvider` (non-blocking).** Singleton with its own `HttpClient` (base `TEXTSTACK_API_URL`, Host header pinned, 15s timeout). `GetTokenAsync`: (1) cached access token still valid → `Authorized` (expiry decided by a **local JWT `exp` decode** — base64url-decode the payload, read the claim, NO signature validation; unparseable = expired); (2) else cached refresh token → **body-based** `POST /auth/refresh-mobile` `{ refreshToken }` → cache + `Authorized`, falling through on 401/failure; (3) else start a device flow (single-flight) — `POST /auth/device/code`, log the verification URL + code to **stderr**, spawn a BACKGROUND poll of `POST /auth/device/token` honoring `authorization_pending` (keep polling), `slow_down` (back off), `expired_token`/`access_denied` (clear in-flight so a later call re-initiates), and success (cache tokens), then **return `Pending` IMMEDIATELY** — the tool call never blocks on the browser approval. Concurrent calls share one flow (lock-guarded in-flight + cache state); transport blips during polling are swallowed until the device code's own `expires_in` deadline. +- **QA fix (P2) — single-flight on the `/auth/device/code` POST itself.** The in-flight slot was claimed AFTER the device-code POST, so N concurrent first-callers (cold cache) each fired their OWN `POST /auth/device/code`, orphaning server-side device codes (state leak + wasted round-trips). The provider now holds a `Task<DeviceCodeResponse>? 
_deviceCodeRequest` claimed **under the lock BEFORE** the network call (lock never spans the `await`): the winner owns the shared request; concurrent callers join it and all return the **same** `Pending` (same `verification_uri` + `user_code`) — the real code, not a half-state. The shared POST runs on `CancellationToken.None` so one caller's cancellation can't tear it down (a cancelled caller stops `await`-ing via `WaitAsync(ct)` and propagates). A FAILED device-code POST clears the slot (guarded by reference-equality so a newer request isn't clobbered) so a later cold call **retries** instead of wedging on a faulted task; terminal/expiry/success funnel through `ClearInFlight`, which now clears `_deviceCodeRequest` too for a clean re-initiate. +- **`TokenCache` (0600).** Persists `{ accessToken, refreshToken, accessExpiresAt }` at `$XDG_CONFIG_HOME/textstack/mcp-token.json` (→ `~/.textstack/mcp-token.json`, override `TEXTSTACK_MCP_TOKEN_CACHE`), creating the dir as needed. Writes with `0600` on Unix (created owner-only BEFORE the secret is written); on read, a **group/world-readable** file is refused and ignored (treated as no cache → re-auth) so a leaked secret is never trusted. Windows relies on the user-profile ACL. +- **Bridge wiring (one branch).** `TEXTSTACK_MCP_TOKEN` set → `StaticEnvTokenProvider` (CI / escape hatch); else → `DeviceFlowTokenProvider` (the default) with a named auth `HttpClient` + the `TokenCache`. stdout stays JSON-RPC only — the device-flow instructions go to stderr via the SDK's stderr-routed logger. 
+- **Tests**: 25 unit tests (fake `HttpMessageHandler`, no network, temp-dir cache) — first call returns `Pending` without blocking; background poll (pending ×2 → 200) caches → later call `Authorized`; expired-access + valid-refresh → refresh request shape → `Authorized`; refresh 401 → fresh device flow; local JWT exp decode (future/past/garbage/no-exp); cache round-trip + `0600` perms + world-readable-ignored + path resolution; async `StaticEnvTokenProvider`; catalog rendering of `Authorized`/`Pending`/`Failed` for a user-scoped tool. **Single-flight (P2)**: 20 concurrent first-callers issue **exactly ONE** `/auth/device/code` and all get the **same** `Pending` code (`DeviceCodeCount == 1`); a caller arriving **while the POST is mid-flight** (gated handler) joins the one request and gets the real code; a **failed** device-code POST clears the slot so the next call retries (issues a fresh code). Existing AI-047/048a read-tool tests migrated to the async provider. No `ITool` added (StudyBuddy set-equality stays green). + ### Phase 8 — Device Authorization Grant backend (AI-050a) (2026-06-16) Backend for the OAuth 2.0 **Device Authorization Grant (RFC 8628)** so the headless MCP CLI (AI-050b) can obtain a per-user TextStack JWT without a browser redirect. This is the BACKEND slice; the consent page lives in `apps/web` (built concurrently by the frontend agent). 
diff --git a/backend/src/Ai/TextStack.Ai.Mcp/Auth/DeviceFlowTokenProvider.cs b/backend/src/Ai/TextStack.Ai.Mcp/Auth/DeviceFlowTokenProvider.cs new file mode 100644 index 00000000..9e1816cb --- /dev/null +++ b/backend/src/Ai/TextStack.Ai.Mcp/Auth/DeviceFlowTokenProvider.cs @@ -0,0 +1,479 @@ +using System.Net; +using System.Net.Http.Json; +using System.Text.Json; +using System.Text.Json.Serialization; +using Microsoft.Extensions.Logging; + +namespace TextStack.Ai.Mcp.Auth; + +/// <summary> +/// The real (default) <see cref="IMcpTokenProvider"/> for the headless MCP CLI: +/// implements the OAuth 2.0 Device Authorization Grant (RFC 8628) client built in +/// AI-050a's endpoints. +/// +/// Non-blocking by design — a tool call must never hang on the user approving in a +/// browser. <see cref="GetTokenAsync"/> resolves in priority order: +/// 1. cached access token still valid (local JWT exp check) → Authorized; +/// 2. cached refresh token → body-based refresh (/auth/refresh-mobile) +/// → cache + Authorized, falling through on 401/failure; +/// 3. otherwise start a device flow if none is running (single-flight), print +/// the verification URL + code to STDERR, kick off a BACKGROUND poll, and +/// IMMEDIATELY return Pending; subsequent calls return Pending until the +/// background poll caches tokens, after which they return Authorized. +/// +/// Thread-safety: a single lock guards the cached tokens and the in-flight flow +/// so concurrent tool calls never start two flows or tear the cache. The poll +/// runs on a tracked background task and never writes to stdout. +/// </summary> +public sealed class DeviceFlowTokenProvider : IMcpTokenProvider +{ + // Treat the access token as expired this far ahead of its real exp so an + // in-flight request can't race the clock and 401 mid-call. 
+ private static readonly TimeSpan ExpirySkew = TimeSpan.FromSeconds(30); + + private static readonly JsonSerializerOptions JsonOptions = new() + { + PropertyNameCaseInsensitive = true, + }; + + private readonly HttpClient _http; + private readonly TokenCache _cache; + private readonly ILogger _logger; + private readonly TimeProvider _time; + // Test seam: floor for the poll interval so tests don't sleep real seconds. + private readonly TimeSpan _minPollInterval; + + private readonly object _gate = new(); + private CachedTokens? _tokens; + private InFlightFlow? _inFlight; + // Single-flight slot for the /auth/device/code POST itself: concurrent first-callers + // share this ONE request instead of each firing their own (which would orphan + // server-side device codes). Set under _gate before awaiting the network call. + private Task<DeviceCodeResponse>? _deviceCodeRequest; + + public DeviceFlowTokenProvider( + HttpClient http, + TokenCache cache, + ILogger logger, + TimeProvider? time = null, + TimeSpan? minPollInterval = null) + { + _http = http; + _cache = cache; + _logger = logger; + _time = time ?? TimeProvider.System; + _minPollInterval = minPollInterval ?? TimeSpan.FromSeconds(1); + _tokens = cache.Read(); + } + + public async Task<TokenResult> GetTokenAsync(CancellationToken ct) + { + // 1. Cached, unexpired access token → Authorized. + CachedTokens? cached; + InFlightFlow? pendingFlow; + lock (_gate) + { + cached = _tokens; + pendingFlow = _inFlight; + } + + if (cached is not null && !IsExpired(cached.AccessToken)) + return new TokenResult.Authorized(cached.AccessToken); + + // 2. Cached refresh token → try a body-based refresh. + if (cached is not null && !string.IsNullOrEmpty(cached.RefreshToken)) + { + var refreshed = await TryRefreshAsync(cached.RefreshToken, ct); + if (refreshed is not null) + { + StoreTokens(refreshed); + return new TokenResult.Authorized(refreshed.AccessToken); + } + // Refresh failed (401 / transport) → fall through to a device flow. + } + + // 3. 
A flow is already in flight (its user_code is known) → relay Pending. + if (pendingFlow is not null) + return new TokenResult.Pending(pendingFlow.VerificationUri, pendingFlow.UserCode); + + // 3b. Start (or join) the single-flight device-code request and return Pending. + return await StartOrJoinDeviceFlowAsync(ct); + } + + // ── device flow start (non-blocking, single-flight on the /auth/device/code POST) ─ + + private async Task<TokenResult> StartOrJoinDeviceFlowAsync(CancellationToken ct) + { + // Claim the single-flight slot UNDER the lock, BEFORE any network I/O, so N + // concurrent first-callers share ONE /auth/device/code POST. The winner owns + // the request task; losers await the same task and return the same Pending. + Task<DeviceCodeResponse> request; + bool isWinner; + lock (_gate) + { + // Another caller already obtained a code while we waited on the lock. + if (_inFlight is not null) + return new TokenResult.Pending(_inFlight.VerificationUri, _inFlight.UserCode); + + if (_deviceCodeRequest is null) + { + // We are the winner: create the shared request task under the lock. + _deviceCodeRequest = RequestDeviceCodeAsync(); + isWinner = true; + } + else + { + isWinner = false; + } + request = _deviceCodeRequest; + } + + DeviceCodeResponse code; + try + { + // Awaited outside the lock. ct is the WINNER's caller token; losers pass + // their own ct below. A cancelled loser stops waiting without killing the + // shared request (it runs on CancellationToken.None inside RequestDeviceCodeAsync). + code = await request.WaitAsync(ct); + } + catch (OperationCanceledException) when (ct.IsCancellationRequested) + { + // The caller cancelled while waiting — propagate. The shared request lives + // on for other callers; on its own failure it self-clears the slot. + throw; + } + catch (Exception) + { + // The shared device-code POST failed (network/parse). Clear the slot so a + // later cold call can retry rather than wedging on a dead task. 
+ ClearDeviceCodeRequest(request); + return new TokenResult.Failed("could not reach TextStack to start device authorization"); + } + + // Promote the shared request into an in-flight flow + background poll EXACTLY + // once. The lock makes this idempotent across all concurrent callers. + InFlightFlow flow; + lock (_gate) + { + if (_inFlight is not null) + return new TokenResult.Pending(_inFlight.VerificationUri, _inFlight.UserCode); + + flow = new InFlightFlow(code.VerificationUri ?? "", code.UserCode!); + _inFlight = flow; + + // STDERR only — stdout is reserved for JSON-RPC framing. Logged once, + // under the lock, by whichever caller wins the promotion. + _logger.LogInformation( + "To connect TextStack, open {VerificationUri} and enter code {UserCode}", + flow.VerificationUri, flow.UserCode); + + // Background poll — fire-and-track, never awaited by the tool call. + flow.PollTask = Task.Run(() => PollLoopAsync(code), CancellationToken.None); + } + + _ = isWinner; // winner vs. loser only differs in who created the task above. + return new TokenResult.Pending(flow.VerificationUri, flow.UserCode); + } + + // Performs the actual /auth/device/code POST on CancellationToken.None so a single + // caller's cancellation can't tear down the request shared by other concurrent + // callers. Throws on failure (caller clears the slot to make it retryable). 
+ private async Task<DeviceCodeResponse> RequestDeviceCodeAsync() + { + using var request = new HttpRequestMessage(HttpMethod.Post, "/auth/device/code"); + using var response = await _http.SendAsync(request, CancellationToken.None); + if (!response.IsSuccessStatusCode) + throw new InvalidOperationException("could not start TextStack device authorization"); + + var parsed = await response.Content.ReadFromJsonAsync<DeviceCodeResponse>(JsonOptions); + if (parsed is null || string.IsNullOrEmpty(parsed.DeviceCode) || string.IsNullOrEmpty(parsed.UserCode)) + throw new InvalidOperationException("could not start TextStack device authorization"); + + return parsed; + } + + // ── background polling loop ─────────────────────────────────────────────────── + + private async Task PollLoopAsync(DeviceCodeResponse code) + { + var interval = ClampInterval(code.Interval); + var lifetime = code.ExpiresIn > 0 ? TimeSpan.FromSeconds(code.ExpiresIn) : TimeSpan.FromMinutes(10); + var deadline = _time.GetUtcNow() + lifetime; + + try + { + while (_time.GetUtcNow() < deadline) + { + await Task.Delay(interval, _time, CancellationToken.None); + + PollOutcome outcome; + try + { + outcome = await PollOnceAsync(code.DeviceCode!); + } + catch (Exception) + { + // Transport blip — keep polling until the device code expires. + continue; + } + + switch (outcome.Kind) + { + case PollKind.Approved: + StoreTokens(outcome.Tokens!); + ClearInFlight(); + return; + case PollKind.Pending: + continue; // keep waiting for the user to approve + case PollKind.SlowDown: + interval += TimeSpan.FromSeconds(5); + continue; + case PollKind.Terminal: + ClearInFlight(); // expired_token / access_denied → allow a fresh flow later + return; + } + } + } + finally + { + // Lifetime exhausted without approval → clear so a later call re-initiates. 
+ ClearInFlight(); + } + } + + private async Task<PollOutcome> PollOnceAsync(string deviceCode) + { + using var request = new HttpRequestMessage(HttpMethod.Post, "/auth/device/token") + { + Content = JsonContent.Create(new DeviceTokenRequestBody( + "urn:ietf:params:oauth:grant-type:device_code", deviceCode)), + }; + using var response = await _http.SendAsync(request, CancellationToken.None); + + if (response.StatusCode == HttpStatusCode.OK) + { + var body = await response.Content.ReadFromJsonAsync<DeviceTokenResponse>(JsonOptions); + if (body is null || string.IsNullOrEmpty(body.AccessToken)) + return PollOutcome.Terminal; + + var refresh = body.RefreshToken ?? ""; + var expiresAt = ReadExp(body.AccessToken) ?? _time.GetUtcNow().AddMinutes(15); + return PollOutcome.Approved(new CachedTokens(body.AccessToken, refresh, expiresAt)); + } + + if (response.StatusCode == HttpStatusCode.BadRequest) + { + var err = await response.Content.ReadFromJsonAsync<DeviceErrorResponse>(JsonOptions); + return err?.Error switch + { + "authorization_pending" => PollOutcome.Pending, + "slow_down" => PollOutcome.SlowDown, + _ => PollOutcome.Terminal, // expired_token / access_denied / anything else + }; + } + + // Any other status: treat as a transient blip — keep polling. + return PollOutcome.Pending; + } + + // ── body-based refresh (/auth/refresh-mobile) ───────────────────────────────── + + private async Task<CachedTokens?> TryRefreshAsync(string refreshToken, CancellationToken ct) + { + try + { + using var request = new HttpRequestMessage(HttpMethod.Post, "/auth/refresh-mobile") + { + Content = JsonContent.Create(new RefreshRequestBody(refreshToken)), + }; + using var response = await _http.SendAsync(request, ct); + if (!response.IsSuccessStatusCode) + return null; + + var body = await response.Content.ReadFromJsonAsync<MobileAuthBody>(JsonOptions, ct); + if (body is null || string.IsNullOrEmpty(body.AccessToken)) + return null; + + var newRefresh = string.IsNullOrEmpty(body.RefreshToken) ? 
refreshToken : body.RefreshToken; + var expiresAt = ReadExp(body.AccessToken) ?? _time.GetUtcNow().AddMinutes(15); + return new CachedTokens(body.AccessToken, newRefresh, expiresAt); + } + catch (OperationCanceledException) when (ct.IsCancellationRequested) + { + throw; + } + catch (Exception) + { + return null; + } + } + + /// + /// Test seam: awaits the current in-flight device-flow poll (if any) so tests + /// can deterministically wait for the background loop to finish instead of + /// racing on wall-clock sleeps. No-op when no flow is running. + /// + internal async Task WaitForBackgroundPollAsync() + { + Task? poll; + lock (_gate) + { + poll = _inFlight?.PollTask; + } + if (poll is not null) + await poll; + } + + // ── state helpers ───────────────────────────────────────────────────────────── + + private void StoreTokens(CachedTokens tokens) + { + lock (_gate) + { + _tokens = tokens; + } + _cache.Write(tokens); + } + + private void ClearInFlight() + { + lock (_gate) + { + _inFlight = null; + // Clear the device-code slot too so a later cold call re-initiates cleanly + // (terminal/expiry/success all funnel through here). + _deviceCodeRequest = null; + } + } + + // Clears the device-code single-flight slot iff it still points at the failed + // request, so a subsequent GetTokenAsync retries with a fresh POST instead of + // re-awaiting a faulted task. Guarded so we never clobber a newer request. + private void ClearDeviceCodeRequest(Task failed) + { + lock (_gate) + { + if (ReferenceEquals(_deviceCodeRequest, failed)) + _deviceCodeRequest = null; + } + } + + private bool IsExpired(string accessToken) + { + var exp = ReadExp(accessToken); + // Unparseable / no exp → treat as expired (force refresh / re-auth). + if (exp is null) + return true; + return _time.GetUtcNow() >= exp.Value - ExpirySkew; + } + + private TimeSpan ClampInterval(int seconds) + { + var requested = seconds > 0 ? 
TimeSpan.FromSeconds(seconds) : TimeSpan.FromSeconds(5); + return requested < _minPollInterval ? _minPollInterval : requested; + } + + // ── local JWT exp decode (NO signature validation) ──────────────────────────── + + /// + /// Reads the exp claim (Unix seconds) from a JWT's payload segment + /// WITHOUT validating the signature — we only need the expiry to decide whether + /// to refresh. Returns null for any malformed / missing-exp token (caller + /// treats null as expired). + /// + internal static DateTimeOffset? ReadExp(string jwt) + { + try + { + var parts = jwt.Split('.'); + if (parts.Length < 2) + return null; + + var payload = Base64UrlDecode(parts[1]); + using var doc = JsonDocument.Parse(payload); + if (!doc.RootElement.TryGetProperty("exp", out var expElement)) + return null; + + // exp may be a number or (rarely) a numeric string. + long exp = expElement.ValueKind switch + { + JsonValueKind.Number when expElement.TryGetInt64(out var n) => n, + JsonValueKind.String when long.TryParse(expElement.GetString(), out var s) => s, + _ => -1, + }; + if (exp < 0) + return null; + + return DateTimeOffset.FromUnixTimeSeconds(exp); + } + catch + { + return null; + } + } + + private static byte[] Base64UrlDecode(string segment) + { + var s = segment.Replace('-', '+').Replace('_', '/'); + switch (s.Length % 4) + { + case 2: s += "=="; break; + case 3: s += "="; break; + } + return Convert.FromBase64String(s); + } + + // ── in-flight flow state ────────────────────────────────────────────────────── + + private sealed class InFlightFlow(string verificationUri, string userCode) + { + public string VerificationUri { get; } = verificationUri; + public string UserCode { get; } = userCode; + public Task? 
PollTask { get; set; } + } + + // ── poll outcome ────────────────────────────────────────────────────────────── + + private enum PollKind { Approved, Pending, SlowDown, Terminal } + + private readonly struct PollOutcome + { + public PollKind Kind { get; private init; } + public CachedTokens? Tokens { get; private init; } + + public static PollOutcome Approved(CachedTokens tokens) => + new() { Kind = PollKind.Approved, Tokens = tokens }; + public static PollOutcome Pending => new() { Kind = PollKind.Pending }; + public static PollOutcome SlowDown => new() { Kind = PollKind.SlowDown }; + public static PollOutcome Terminal => new() { Kind = PollKind.Terminal }; + } + + // ── wire DTOs (device endpoints use snake_case; refresh uses camelCase) ──────── + + private sealed record DeviceCodeResponse( + [property: JsonPropertyName("device_code")] string? DeviceCode, + [property: JsonPropertyName("user_code")] string? UserCode, + [property: JsonPropertyName("verification_uri")] string? VerificationUri, + [property: JsonPropertyName("verification_uri_complete")] string? VerificationUriComplete, + [property: JsonPropertyName("expires_in")] int ExpiresIn, + [property: JsonPropertyName("interval")] int Interval); + + private sealed record DeviceTokenRequestBody( + [property: JsonPropertyName("grant_type")] string GrantType, + [property: JsonPropertyName("device_code")] string DeviceCode); + + private sealed record DeviceTokenResponse( + [property: JsonPropertyName("access_token")] string? AccessToken, + [property: JsonPropertyName("refresh_token")] string? RefreshToken, + [property: JsonPropertyName("token_type")] string? TokenType); + + private sealed record DeviceErrorResponse( + [property: JsonPropertyName("error")] string? Error); + + private sealed record RefreshRequestBody( + [property: JsonPropertyName("refreshToken")] string RefreshToken); + + private sealed record MobileAuthBody( + [property: JsonPropertyName("accessToken")] string? 
AccessToken, + [property: JsonPropertyName("refreshToken")] string? RefreshToken); +} diff --git a/backend/src/Ai/TextStack.Ai.Mcp/Auth/IMcpTokenProvider.cs b/backend/src/Ai/TextStack.Ai.Mcp/Auth/IMcpTokenProvider.cs index 384820fb..2611ec42 100644 --- a/backend/src/Ai/TextStack.Ai.Mcp/Auth/IMcpTokenProvider.cs +++ b/backend/src/Ai/TextStack.Ai.Mcp/Auth/IMcpTokenProvider.cs @@ -4,14 +4,45 @@ namespace TextStack.Ai.Mcp.Auth; /// Supplies the Bearer token the bridge attaches to user-scoped tool calls /// (list_my_highlights, list_my_vocabulary, ask_book). /// -/// AI-048a ships <see cref="StaticEnvTokenProvider"/> (reads the reserved -/// TEXTSTACK_MCP_TOKEN env var). AI-050 swaps the single DI registration -/// for a device-flow provider — no tool / client / catalog signature changes. +/// AI-048a shipped a sync string? GetToken() backed by +/// <see cref="StaticEnvTokenProvider"/> (the reserved TEXTSTACK_MCP_TOKEN +/// env var). AI-050b makes it ASYNC and three-valued so the device-flow provider +/// (<see cref="DeviceFlowTokenProvider"/>) can refresh / start a flow on demand +/// without blocking the tool call: +/// • <see cref="TokenResult.Authorized"/> → attach Bearer, proceed. +/// • <see cref="TokenResult.Pending"/> → a device flow is in progress; the +/// handler renders an actionable "open {uri}, enter {code}" IsError. +/// • <see cref="TokenResult.Failed"/> → no token and no flow possible; the +/// handler renders the message as an IsError. /// -/// A null return means "no token available": the handler returns a clean -/// IsError ("authentication required") and never issues the HTTP call. +/// The provider NEVER throws for the expected no-token states (it returns +/// Pending/Failed); <see cref="TextStackApiClient"/> maps those to a clean +/// <see cref="McpUnauthorizedException"/> so the catalog fails-clean and the +/// HTTP call is never issued. /// public interface IMcpTokenProvider { - string? 
GetToken(); + Task<TokenResult> GetTokenAsync(CancellationToken ct); +} + +/// +/// Three-valued result of a token request. A closed hierarchy: every consumer +/// switches on exactly these three cases. +/// +public abstract record TokenResult +{ + private TokenResult() { } + + /// A usable access token — attach as Bearer. 
+ public sealed record Authorized(string AccessToken) : TokenResult; + + /// + /// A device flow is in progress (not yet approved). The user must visit + /// <paramref name="VerificationUri"/> and enter <paramref name="UserCode"/>; + /// a later call returns <see cref="Authorized"/> once approved. + /// + public sealed record Pending(string VerificationUri, string UserCode) : TokenResult; + + /// No token is available and none can be obtained (carries why). + public sealed record Failed(string Message) : TokenResult; } diff --git a/backend/src/Ai/TextStack.Ai.Mcp/Auth/StaticEnvTokenProvider.cs b/backend/src/Ai/TextStack.Ai.Mcp/Auth/StaticEnvTokenProvider.cs index db4a7451..8fdf25c8 100644 --- a/backend/src/Ai/TextStack.Ai.Mcp/Auth/StaticEnvTokenProvider.cs +++ b/backend/src/Ai/TextStack.Ai.Mcp/Auth/StaticEnvTokenProvider.cs @@ -1,11 +1,12 @@ namespace TextStack.Ai.Mcp.Auth; /// -/// Interim <see cref="IMcpTokenProvider"/> backed by the static -/// TEXTSTACK_MCP_TOKEN env var (surfaced via <see cref="McpBridgeOptions.McpToken"/>). +/// <see cref="IMcpTokenProvider"/> backed by the static TEXTSTACK_MCP_TOKEN +/// env var (surfaced via <see cref="McpBridgeOptions.McpToken"/>). /// -/// This is the AI-050 swap point: replacing the DI registration of this type with -/// a device-flow provider is a one-line change in Program.cs. +/// This is the CI / escape-hatch provider: when the env var is set the bridge +/// uses it instead of the device flow (one branch in Program.cs). A blank +/// token is treated as absent and yields <see cref="TokenResult.Failed"/>. /// public sealed class StaticEnvTokenProvider : IMcpTokenProvider { @@ -17,5 +18,8 @@ public StaticEnvTokenProvider(McpBridgeOptions options) _token = string.IsNullOrWhiteSpace(options.McpToken) ? null : options.McpToken; } - public string? GetToken() => _token; + public Task<TokenResult> GetTokenAsync(CancellationToken ct) => + Task.FromResult<TokenResult>(_token is null + ? 
new TokenResult.Failed("no TEXTSTACK_MCP_TOKEN configured") + : new TokenResult.Authorized(_token)); } diff --git a/backend/src/Ai/TextStack.Ai.Mcp/Auth/TokenCache.cs b/backend/src/Ai/TextStack.Ai.Mcp/Auth/TokenCache.cs new file mode 100644 index 00000000..837dc736 --- /dev/null +++ b/backend/src/Ai/TextStack.Ai.Mcp/Auth/TokenCache.cs @@ -0,0 +1,137 @@ +using System.Text.Json; +using System.Text.Json.Serialization; + +namespace TextStack.Ai.Mcp.Auth; + +/// +/// The persisted device-flow credentials (access + refresh token + the access +/// token's expiry). Written by <see cref="DeviceFlowTokenProvider"/> after a +/// successful device authorization or refresh; read on startup to skip re-auth. +/// +public sealed record CachedTokens( + [property: JsonPropertyName("accessToken")] string AccessToken, + [property: JsonPropertyName("refreshToken")] string RefreshToken, + [property: JsonPropertyName("accessExpiresAt")] DateTimeOffset AccessExpiresAt); + +/// +/// On-disk, single-user cache for the device-flow tokens. Default location: +/// $XDG_CONFIG_HOME/textstack/mcp-token.json (falling back to +/// ~/.textstack/mcp-token.json); overridable via +/// TEXTSTACK_MCP_TOKEN_CACHE (an explicit file path). +/// +/// SECURITY: tokens are secrets. On Unix the file is written with mode +/// 0600 (owner read/write only). On READ, if the file is group- or +/// world-readable on Unix it is treated as compromised and IGNORED (returns +/// null) rather than trusted — the caller simply re-runs the device flow. On +/// Windows we rely on the user-profile ACL (no explicit mode is set). 
+/// +public sealed class TokenCache +{ + private const UnixFileMode OwnerOnly = UnixFileMode.UserRead | UnixFileMode.UserWrite; + private const UnixFileMode GroupOrWorld = + UnixFileMode.GroupRead | UnixFileMode.GroupWrite | UnixFileMode.GroupExecute | + UnixFileMode.OtherRead | UnixFileMode.OtherWrite | UnixFileMode.OtherExecute; + + private static readonly JsonSerializerOptions JsonOptions = new() + { + PropertyNameCaseInsensitive = true, + WriteIndented = false, + }; + + private readonly string _path; + + public TokenCache(string path) => _path = path; + + /// The absolute file path this cache reads/writes. + public string Path => _path; + + /// + /// Resolves the cache file path: explicit override first, then + /// $XDG_CONFIG_HOME/textstack/, then ~/.textstack/. + /// + public static string ResolvePath(string? overridePath) + { + if (!string.IsNullOrWhiteSpace(overridePath)) + return overridePath; + + var xdg = Environment.GetEnvironmentVariable("XDG_CONFIG_HOME"); + if (!string.IsNullOrWhiteSpace(xdg)) + return System.IO.Path.Combine(xdg, "textstack", "mcp-token.json"); + + var home = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile); + return System.IO.Path.Combine(home, ".textstack", "mcp-token.json"); + } + + /// + /// Reads the cached tokens, or null when absent / unreadable / unsafe-perms / + /// malformed. Never throws — a bad cache is just a cache miss (re-auth). + /// + public CachedTokens? Read() + { + try + { + if (!File.Exists(_path)) + return null; + + // Refuse a group/world-readable secret on Unix — treat as no cache so + // the caller re-runs the device flow rather than trusting a leaked file. 
+ if (!OperatingSystem.IsWindows()) + { + var mode = File.GetUnixFileMode(_path); + if ((mode & GroupOrWorld) != 0) + return null; + } + + var json = File.ReadAllText(_path); + if (string.IsNullOrWhiteSpace(json)) + return null; + + return JsonSerializer.Deserialize<CachedTokens>(json, JsonOptions); + } + catch + { + // Unreadable / malformed → cache miss. + return null; + } + } + + /// + /// Writes the cached tokens with owner-only perms (0600) on Unix. Creates the + /// parent directory if missing. Best-effort: a write failure (e.g. read-only + /// dir) is swallowed — the in-memory tokens still work for this session. + /// + public void Write(CachedTokens tokens) + { + try + { + var dir = System.IO.Path.GetDirectoryName(_path); + if (!string.IsNullOrEmpty(dir)) + Directory.CreateDirectory(dir); + + var json = JsonSerializer.Serialize(tokens, JsonOptions); + + if (OperatingSystem.IsWindows()) + { + File.WriteAllText(_path, json); + } + else + { + // Create with 0600 BEFORE writing the secret so it is never briefly + // world-readable. If the file pre-exists with looser perms, tighten it. + using (var fs = new FileStream( + _path, FileMode.Create, FileAccess.Write, FileShare.None, + bufferSize: 4096, options: FileOptions.None)) + { + fs.Dispose(); + } + File.SetUnixFileMode(_path, OwnerOnly); + File.WriteAllText(_path, json); + File.SetUnixFileMode(_path, OwnerOnly); + } + } + catch + { + // Best-effort persistence — non-fatal. 
+ } + } +} diff --git a/backend/src/Ai/TextStack.Ai.Mcp/Http/TextStackApiClient.cs b/backend/src/Ai/TextStack.Ai.Mcp/Http/TextStackApiClient.cs index d567cb00..b214df2f 100644 --- a/backend/src/Ai/TextStack.Ai.Mcp/Http/TextStackApiClient.cs +++ b/backend/src/Ai/TextStack.Ai.Mcp/Http/TextStackApiClient.cs @@ -112,7 +112,7 @@ public async Task> SearchBooksAsync( /// public async Task> GetHighlightsAsync(Guid editionId, CancellationToken ct) { - using var request = AuthorizedRequest(HttpMethod.Get, $"/me/highlights/{editionId}"); + using var request = await AuthorizedRequestAsync(HttpMethod.Get, $"/me/highlights/{editionId}", ct); using var response = await _http.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, ct); if (response.StatusCode is HttpStatusCode.Unauthorized) @@ -144,7 +144,7 @@ public async Task GetVocabularyAsync( if (offset is { } o) query.Add($"offset={o}"); var url = "/me/vocabulary/words" + (query.Count > 0 ? "?" + string.Join("&", query) : ""); - using var request = AuthorizedRequest(HttpMethod.Get, url); + using var request = await AuthorizedRequestAsync(HttpMethod.Get, url, ct); using var response = await _http.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, ct); if (response.StatusCode is HttpStatusCode.Unauthorized) @@ -168,7 +168,7 @@ public async Task GetVocabularyAsync( /// public async Task AskAsync(Guid editionId, string question, int? k, CancellationToken ct) { - using var request = AuthorizedRequest(HttpMethod.Post, $"/books/{editionId}/ask"); + using var request = await AuthorizedRequestAsync(HttpMethod.Post, $"/books/{editionId}/ask", ct); request.Content = JsonContent.Create(new AskRequestJson(question, k), options: JsonOptions); using var response = await _http.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, ct); @@ -192,18 +192,29 @@ private HttpRequestMessage PublicRequest(HttpMethod method, string url) return request; } - // User-scoped route: Host header + Bearer. 
Throws McpUnauthorizedException up - // front when no token is available, so the call never leaves the process. - private HttpRequestMessage AuthorizedRequest(HttpMethod method, string url) + // User-scoped route: Host header + Bearer. Asks the token provider; a + // non-Authorized result throws McpUnauthorizedException (carrying the + // verification URL/code for Pending, or the message for Failed) up front so + // the HTTP call never leaves the process. The catalog maps it to an + // actionable IsError. + private async Task<HttpRequestMessage> AuthorizedRequestAsync(HttpMethod method, string url, CancellationToken ct) { - var token = _tokenProvider.GetToken(); - if (string.IsNullOrEmpty(token)) - throw new McpUnauthorizedException(); - - var request = new HttpRequestMessage(method, url); - request.Headers.Host = _siteHost; - request.Headers.Authorization = new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", token); - return request; + var token = await _tokenProvider.GetTokenAsync(ct); + switch (token) + { + case TokenResult.Authorized a: + var request = new HttpRequestMessage(method, url); + request.Headers.Host = _siteHost; + request.Headers.Authorization = + new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", a.AccessToken); + return request; + case TokenResult.Pending p: + throw new McpUnauthorizedException(verificationUri: p.VerificationUri, userCode: p.UserCode); + case TokenResult.Failed f: + throw new McpUnauthorizedException(message: f.Message); + default: + throw new McpUnauthorizedException(); + } } private static readonly PaginatedResult EmptySearch = new(0, []); @@ -212,13 +223,36 @@ private HttpRequestMessage AuthorizedRequest(HttpMethod method, string url) /// /// Typed "no/invalid auth" signal for user-scoped tools. Raised when the token -/// provider yields nothing, or the API answers 401. 
The handler maps it to a -/// clean IsError ("authentication required") — it is NOT a transport fault, -/// so the shared wrapper rethrows it for the handler to translate. +/// provider can't supply a usable token, or the API answers 401. The catalog maps +/// it to a clean, actionable IsError — it is NOT a transport fault, so the +/// shared wrapper rethrows it for the catalog to translate. +/// +/// When a device flow is in progress the provider yields +/// <see cref="TokenResult.Pending"/>, surfaced here via +/// <see cref="VerificationUri"/> + <see cref="UserCode"/> so the catalog can tell +/// the user where to authorize. Otherwise <see cref="Exception.Message"/> carries the reason +/// (e.g. "no TEXTSTACK_MCP_TOKEN configured") or the default (401 from the API). /// public sealed class McpUnauthorizedException : Exception { - public McpUnauthorizedException() : base("Unauthorized: no valid TextStack token.") { } + /// Device-flow verification URL (set only for the Pending case). + public string? VerificationUri { get; } + + /// Device-flow user code (set only for the Pending case). + public string? UserCode { get; } + + public McpUnauthorizedException() + : base("Unauthorized: no valid TextStack token.") { } + + public McpUnauthorizedException(string message) + : base(message) { } + + public McpUnauthorizedException(string verificationUri, string userCode) + : base("Unauthorized: device authorization pending.") + { + VerificationUri = verificationUri; + UserCode = userCode; + } } // ── Local DTOs mirroring the API's JSON (deliberately not a Contracts reference: diff --git a/backend/src/Ai/TextStack.Ai.Mcp/McpBridgeOptions.cs b/backend/src/Ai/TextStack.Ai.Mcp/McpBridgeOptions.cs index 52cdf4b0..4f18d1e9 100644 --- a/backend/src/Ai/TextStack.Ai.Mcp/McpBridgeOptions.cs +++ b/backend/src/Ai/TextStack.Ai.Mcp/McpBridgeOptions.cs @@ -16,13 +16,21 @@ public sealed class McpBridgeOptions public required string SiteHost { get; init; } /// - /// Bearer token for the user-scoped tools (AI-048a serves it via - /// <see cref="StaticEnvTokenProvider"/>). 
AI-050 replaces the provider - /// with a device-flow source; this env var stays the interim contract. - /// Env: TEXTSTACK_MCP_TOKEN. + /// Bearer token for the user-scoped tools. When set, the bridge uses + /// <see cref="StaticEnvTokenProvider"/> (CI / escape hatch) instead of the + /// device flow. When unset, the default is + /// <see cref="DeviceFlowTokenProvider"/>. Env: TEXTSTACK_MCP_TOKEN. /// public string? McpToken { get; init; } + /// + /// Optional explicit path for the device-flow token cache file. When unset, + /// <see cref="TokenCache.ResolvePath"/> uses + /// $XDG_CONFIG_HOME/textstack/mcp-token.json (or + /// ~/.textstack/mcp-token.json). Env: TEXTSTACK_MCP_TOKEN_CACHE. + /// + public string? TokenCachePath { get; init; } + public static McpBridgeOptions FromEnvironment() { var apiUrl = Environment.GetEnvironmentVariable("TEXTSTACK_API_URL"); @@ -37,8 +45,9 @@ public static McpBridgeOptions FromEnvironment() { ApiBaseUrl = apiUrl.TrimEnd('/'), SiteHost = siteHost, - // Reserved for AI-050; read now so the env var contract is stable. + // When set → static-token mode (CI / escape hatch); else device flow. McpToken = Environment.GetEnvironmentVariable("TEXTSTACK_MCP_TOKEN"), + TokenCachePath = Environment.GetEnvironmentVariable("TEXTSTACK_MCP_TOKEN_CACHE"), }; } } diff --git a/backend/src/Ai/TextStack.Ai.Mcp/Program.cs b/backend/src/Ai/TextStack.Ai.Mcp/Program.cs index dddfd84d..89165fd0 100644 --- a/backend/src/Ai/TextStack.Ai.Mcp/Program.cs +++ b/backend/src/Ai/TextStack.Ai.Mcp/Program.cs @@ -32,12 +32,39 @@ var bridgeOptions = McpBridgeOptions.FromEnvironment(); builder.Services.AddSingleton(bridgeOptions); -// Bearer token for the user-scoped tools. AI-050 SWAP POINT: replace this single -// registration with a device-flow provider — no tool/client/catalog signature -// churn. StaticEnvTokenProvider reads the reserved TEXTSTACK_MCP_TOKEN env var; -// a null token makes user-scoped tools fail-clean (auth required), never calling -// the API, while ALL tools stay listed for stable discovery. 
-builder.Services.AddSingleton<IMcpTokenProvider, StaticEnvTokenProvider>(); +// Bearer token for the user-scoped tools. ONE branch: +// • TEXTSTACK_MCP_TOKEN set → StaticEnvTokenProvider (CI / escape hatch). +// • else → DeviceFlowTokenProvider (the real, default flow): cached token → +// refresh → device authorization, non-blocking (returns Pending immediately, +// polls in the background). Either way ALL tools stay listed for stable +// discovery; user-scoped calls fail-clean (auth required) until a token exists. +if (!string.IsNullOrWhiteSpace(bridgeOptions.McpToken)) +{ + builder.Services.AddSingleton<IMcpTokenProvider, StaticEnvTokenProvider>(); +} +else +{ + // The device-flow token cache (0600 on Unix). Path: env override → + // $XDG_CONFIG_HOME/textstack → ~/.textstack. + builder.Services.AddSingleton(new TokenCache(TokenCache.ResolvePath(bridgeOptions.TokenCachePath))); + + // Named HTTP client to the auth endpoints, Host header pinned so + // SiteContextMiddleware resolves the site. Same 15s timeout convention. + const string deviceFlowClient = "device-flow"; + builder.Services.AddHttpClient(deviceFlowClient, http => + { + http.BaseAddress = new Uri(bridgeOptions.ApiBaseUrl, UriKind.Absolute); + http.DefaultRequestHeaders.Host = bridgeOptions.SiteHost; + http.Timeout = TimeSpan.FromSeconds(McpTimeoutSeconds()); + }); + + // Singleton — the provider holds the cache + single-flight device-flow state + // across tool calls. Pull its HttpClient from the factory's named config. + builder.Services.AddSingleton<IMcpTokenProvider>(sp => new DeviceFlowTokenProvider( + sp.GetRequiredService<IHttpClientFactory>().CreateClient(deviceFlowClient), + sp.GetRequiredService<TokenCache>(), + sp.GetRequiredService<ILogger<DeviceFlowTokenProvider>>())); +} // Typed HTTP client over the public API. Host header is set per-request inside // TextStackApiClient so SiteContextMiddleware resolves the site for /search. 
diff --git a/backend/src/Ai/TextStack.Ai.Mcp/TextStack.Ai.Mcp.csproj b/backend/src/Ai/TextStack.Ai.Mcp/TextStack.Ai.Mcp.csproj index 4ced911e..273e71ab 100644 --- a/backend/src/Ai/TextStack.Ai.Mcp/TextStack.Ai.Mcp.csproj +++ b/backend/src/Ai/TextStack.Ai.Mcp/TextStack.Ai.Mcp.csproj @@ -5,6 +5,12 @@ TextStack.Ai.Mcp + + + + +