diff --git a/modules/ai-gateway/pages/configure-provider.adoc b/modules/ai-gateway/pages/configure-provider.adoc index da13be6..e59bd23 100644 --- a/modules/ai-gateway/pages/configure-provider.adoc +++ b/modules/ai-gateway/pages/configure-provider.adoc @@ -174,6 +174,8 @@ Models you select on this form become the catalog the provider exposes. Leave th For *OpenAI*, *Anthropic*, *Google AI*, and *AWS Bedrock*, the form shows a picker backed by the provider's catalog. Pick from the list, or type a model identifier the catalog doesn't show. For *OpenAI-compatible*, the form takes a freeform list: type the exact identifiers your upstream serves. +The catalog of available models in the picker is maintained by Redpanda. When an upstream provider publishes a new model, it usually appears in the picker within a day or two; admins don't have to wait for a Redpanda release. New models aren't enabled automatically: an admin still selects the model in the catalog to make it callable through this provider. + For Bedrock, the picker exposes inference profiles, not raw foundation-model IDs. See <>. [NOTE] @@ -185,6 +187,21 @@ After you create the provider, the detail page renders each model as a row with The detail page also carries a *Last 7 days* KPI strip (*TOTAL SPEND*, *REQUESTS*, *TOKENS*) with sparklines and _vs previous period_ deltas. *View all* on each card opens the *Cost & Usage* tab with this provider pre-filtered so you can drill into spend, request, or token trends. +[[transcript-logging]] +== Configure transcript logging + +By default, AI Gateway records the full request and response payload (including prompt content, completion content, and tool-call arguments and results) for every call this provider proxies, writing each call into xref:observability:transcripts.adoc[the Transcripts view] alongside token counts and latency. This powers turn-by-turn investigation and per-conversation drill-down in Governance. 
+ +Some workloads need to suppress that payload capture: regulated PII, customer secrets, or any traffic where the message body itself must not be retained. For those, configure a dedicated "sensitive" provider with transcript logging disabled. + +The toggle is on the provider's create and edit form. It is per-provider, not per-request: applications cannot opt in or out at call time. To split sensitive from non-sensitive traffic, create one provider with transcript logging on and another with it off, and route each application to whichever proxy URL matches its data class. + +Disabling transcript logging does not suppress cost and usage telemetry. Token counts, latency, and provider/model attribution are still recorded, so the *Cost & Usage* tab and the xref:governance:dashboard/overview.adoc[Governance dashboard] continue to report spend for traffic on the provider; only the message bodies are withheld from the Transcripts view. + +NOTE: Changing the toggle takes effect for new requests. Transcripts already captured under the previous setting are not retroactively redacted; delete or rotate the provider if you need to purge historical content. + +// TODO: Verify the exact UI field label ("Transcript logging" / "Capture transcripts" / similar) and default value against adp-production. Confirm with eng whether disabling capture also suppresses the `gen_ai.prompt.*` and `gen_ai.completion.*` attributes on OTel spans, or only the long-form content fields. + == Save and verify . Click *Create provider*. The button activates after *Name* and *Type* are both set. The *Summary* panel checks them off as you fill them in. 
diff --git a/modules/ai-gateway/pages/connect-agent.adoc b/modules/ai-gateway/pages/connect-agent.adoc index 7b2ced6..cb80aac 100644 --- a/modules/ai-gateway/pages/connect-agent.adoc +++ b/modules/ai-gateway/pages/connect-agent.adoc @@ -123,6 +123,7 @@ The `rpk ai` command honors the following environment variables: |Map to `--rpai-profile`, `--rpai-config`, `--rpai-verbose`, `--format`. Long flag names are renamed under `rpk ai` to avoid collision with `rpk`'s globals; short flags (`-p`, `-c`, `-v`, `-o`) are unchanged. |=== +[[authenticate-with-oidc-client-credentials]] == Authenticate with OIDC client credentials (CI and programmatic) For application code, CI runners, server-side processes, and headless agents, use the OIDC `client_credentials` grant directly. This is the canonical authentication path for SDK-style usage; `rpk ai` is for command-line workflows, not for embedding in application code. Values are surfaced on the provider's *Connection* card; defaults at the time of writing are below. diff --git a/modules/governance/pages/budgets.adoc b/modules/governance/pages/budgets.adoc index 6047849..297b50a 100644 --- a/modules/governance/pages/budgets.adoc +++ b/modules/governance/pages/budgets.adoc @@ -74,6 +74,121 @@ For more expressive queries, `SpendingFilter` also accepts an AIP-160 `filter` e // TODO: confirm `user_id` and `organization_id` are populated automatically from request context (OIDC claims) or require setup. Open Q A2 in the companion plan. +[[query-spend-programmatically]] +== Query spend programmatically + +`SpendingService.GetSpendingBreakdown` is the canonical RPC for pulling spend out of ADP. Use it for chargeback reporting, scheduled emails, internal cost dashboards, or any workflow the built-in UI doesn't cover. Every spend number the dashboard shows comes from this RPC, so query results match the UI to the cent. + +=== Authenticate + +`SpendingService` uses the same OIDC client-credentials grant as the rest of AI Gateway. 
Mint a service-account access token using the flow in xref:ai-gateway:connect-agent.adoc#authenticate-with-oidc-client-credentials[Authenticate with OIDC client credentials], then pass the token in the `Authorization: Bearer ` header on every call. The service account needs `dataplane_adp_spending_get` on the resource you're querying. See xref:governance:permissions-reference.adoc#spending-permissions[Spending permissions]. + +// TODO: Confirm the canonical endpoint shape against `apps/aigw` on cloudv2. Likely a Connect-Go / gRPC reflection surface at `/redpanda.aigateway.spending.v1.SpendingService/GetSpendingBreakdown`. Replace the placeholder URL below with the verified shape, and add a note on whether HTTP/JSON transcoding is exposed or whether clients must speak gRPC. + +=== Request shape + +`GetSpendingBreakdown` takes a `SpendingFilter` plus a `group_by` dimension. The filter accepts: + +[cols="1,3"] +|=== +|Field |Meaning + +|`time_range.start_time`, `time_range.end_time` +|RFC 3339 timestamps bracketing the window. Required. + +|`provider_name` +|Restrict to one LLM provider (matches the *Name* field on the provider's detail page). + +|`model_id` +|Restrict to one model identifier (`claude-sonnet-4-6`, `gpt-5.2`, and so on). + +|`user_id` +|Restrict to one identified user. Anonymous traffic is excluded. + +|`organization_id` +|Restrict to one organization. Multi-tenant deployments only. + +|`filter` +|AIP-160 expression that combines and negates dimensions in a single string (for example, `provider_name="anthropic" AND model_id!="claude-sonnet-4-6"`). Composes with the structured fields above; populate one or both. +|=== + +The `group_by` value chooses the breakdown dimension: `PROVIDER`, `MODEL`, `USER`, `ORGANIZATION`, or `PROVIDER_TYPE`. 
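As a rough illustration of the request shape above, the following Python sketch assembles a `GetSpendingBreakdown` body from the structured fields. The helper names `rfc3339` and `build_breakdown_request` and their validation rules are hypothetical, not part of any generated client. The JSON field names follow the table above, and the wire format is still subject to the endpoint TODO in this section.

```python
from datetime import datetime, timedelta, timezone

# Breakdown dimensions and filter fields as listed in this section.
GROUP_BY_VALUES = {"PROVIDER", "MODEL", "USER", "ORGANIZATION", "PROVIDER_TYPE"}
ALLOWED_DIMENSIONS = {"provider_name", "model_id", "user_id", "organization_id", "filter"}


def rfc3339(ts: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_breakdown_request(start: datetime, end: datetime, group_by: str, **dimensions) -> dict:
    """Hypothetical helper: assemble a GetSpendingBreakdown JSON body."""
    if group_by not in GROUP_BY_VALUES:
        raise ValueError(f"unsupported group_by: {group_by}")
    unknown = set(dimensions) - ALLOWED_DIMENSIONS
    if unknown:
        raise ValueError(f"unknown filter fields: {sorted(unknown)}")
    if start >= end:
        raise ValueError("start_time must precede end_time")
    spending_filter = {
        "time_range": {"start_time": rfc3339(start), "end_time": rfc3339(end)},
    }
    # Only include the dimensions the caller actually set.
    spending_filter.update({k: v for k, v in dimensions.items() if v is not None})
    return {"filter": spending_filter, "group_by": group_by}


end = datetime(2026, 5, 24, tzinfo=timezone.utc)
body = build_breakdown_request(
    end - timedelta(days=7), end, "MODEL", provider_name="prod-anthropic"
)
```

Validating `group_by` and the filter keys on the client is a design choice in this sketch, so typos fail before a network round trip. The server remains the authority on which values it accepts.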
+ +=== cURL example + +Pull per-user spend for the last 7 days against an Anthropic provider: + +[source,bash] +---- +ACCESS_TOKEN="" # from the client_credentials flow +DATAPLANE_BASE="https://aigw..clusters.rdpa.co" + +curl -s --request POST \ + --url "${DATAPLANE_BASE}/redpanda.aigateway.spending.v1.SpendingService/GetSpendingBreakdown" \ + --header "Authorization: Bearer ${ACCESS_TOKEN}" \ + --header 'Content-Type: application/json' \ + --data '{ + "filter": { + "time_range": { + "start_time": "2026-05-17T00:00:00Z", + "end_time": "2026-05-24T00:00:00Z" + }, + "provider_name": "prod-anthropic" + }, + "group_by": "USER" + }' | jq +---- + +The response carries one row per user in the window, each with `input_tokens`, `output_tokens`, `cached_tokens`, `total_tokens` (server-derived), `total_cost_microcents`, and `request_count`. Assuming the field is denominated in true microcents (millionths of a cent), divide `total_cost_microcents` by 100,000,000 to convert to dollars: one dollar is 100 cents, and one cent is 1,000,000 microcents. + +=== Python example + +Generated client code lives in the proto bundle; if your project doesn't already import it from cloudv2, drive `SpendingService` over plain HTTPS: + +[source,python] +---- +import os, requests +from datetime import datetime, timedelta, timezone + +token = os.environ["ACCESS_TOKEN"] # from the client_credentials flow +base = os.environ["DATAPLANE_BASE"] # https://aigw..clusters.rdpa.co +end = datetime.now(timezone.utc) +start = end - timedelta(days=7) + +body = { + "filter": { + "time_range": { + "start_time": start.isoformat().replace("+00:00", "Z"), + "end_time": end.isoformat().replace("+00:00", "Z"), + }, + "filter": 'provider_name="prod-anthropic"', + }, + "group_by": "USER", +} + +r = requests.post( + f"{base}/redpanda.aigateway.spending.v1.SpendingService/GetSpendingBreakdown", + headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"}, + json=body, + timeout=30, +) +r.raise_for_status() +for row in r.json().get("rows", []): + # int() also handles int64 values that proto3 JSON encodes as strings; 1 dollar = 100,000,000 microcents + dollars = int(row["total_cost_microcents"]) / 100_000_000 + print(f"{row['user_id']}: ${dollars:,.2f} 
({row['request_count']} requests)") +---- + +// TODO: Replace the URL path with the verified Connect-Go / gRPC route once eng confirms. The proto-generated client (Connect-Go or grpc-python) is the long-term recommendation; the cURL/`requests` examples above are for quick scripting. + +=== Related methods + +`SpendingService` exposes three more methods that follow the same `SpendingFilter` shape: + +* `GetSpendingTimeSeries`: Bucketed spend over the time range, for chart-style consumers. +* `GetSpendingSummary`: Total spend, tokens, and requests for the range, with no breakdown. +* `ListSpendingEvents`: Paged per-call detail. Use this only for narrow time ranges; the event volume is high. + +// TODO: Confirm the method names above against the current `SpendingService` proto and add the full request/response shape for each one (or split into a dedicated spending-api.adoc if this section outgrows budgets.adoc). + == Guardrail evaluator cost Some guardrail evaluators call an LLM to do their work. A toxicity classifier, for example, runs the request or response through a separate model and accrues per-call cost in the process. PII detection over regex doesn't, but anything LLM-based does. diff --git a/modules/governance/pages/dashboard/overview.adoc b/modules/governance/pages/dashboard/overview.adoc index 3f94ac8..656e9c0 100644 --- a/modules/governance/pages/dashboard/overview.adoc +++ b/modules/governance/pages/dashboard/overview.adoc @@ -8,6 +8,7 @@ // Source: `cloudv2` `apps/adp-ui/src/routes/governance/index.tsx`, governance components, and `apps/adp-ui/docs/design/0003-governance-v0.md` on `origin/main`, verified 2026-05-10. // TODO: Capture screenshots and exact empty-state copy after an authenticated walkthrough of the protected ADP UI. +// TODO: Package 2 briefing (2026-05-22) lists "grouping by user / agent" and "drill-down into agent and user" as still-WIP for Jun 15 ship. 
The surfaces below already describe a *user* filter on the chart, a per-user *Top users* panel with heatmap, an *Agents* table, and `SpendingFilter.user_id` on the breakdown API. Before next release: reconcile with eng (Johannes / governance team) which of these are live in Package 2 today vs. coming with the Jun 15 cut. If any are still WIP, mark them as "coming in Package 2" or remove until they ship. Do not document unshipped surfaces. Specific items to confirm: (1) per-user filter on the dashboard chart, (2) Top users panel + heatmap, (3) Agents-table drill-down to per-agent spend, (4) per-agent grouping on `GetSpendingBreakdown`, (5) per-user filter on the AI Gateway *Cost & Usage* tab. The Governance dashboard brings AI Gateway usage and agent inventory into one view. Use it to compare spend, request volume, and token volume over a selected time range, then narrow the chart by provider, model, cost type, token type, or user. diff --git a/modules/integrations/pages/claude-code.adoc b/modules/integrations/pages/claude-code.adoc index 7db94c8..f2d8855 100644 --- a/modules/integrations/pages/claude-code.adoc +++ b/modules/integrations/pages/claude-code.adoc @@ -1,4 +1,166 @@ -= Claude Code Integration -:description: Integrate Redpanda ADP with Claude Code. += Use Claude Code with ADP +:description: Point Claude Code at an ADP-managed Anthropic provider so your team's LLM calls flow through AI Gateway with centralized credentials, usage tracking, and per-provider transcript logging. Optionally attach MCP servers hosted in ADP. 
+:page-topic-type: how-to +:personas: app_developer, platform_admin +:learning-objective-1: Configure Claude Code to call an Anthropic provider hosted in AI Gateway instead of the public Anthropic API +:learning-objective-2: Attach ADP-hosted MCP servers to Claude Code so its tools resolve against your managed tool catalog +:learning-objective-3: Verify the connection and read usage in the *Cost & Usage* tab -// TODO: Add content +Claude Code is Anthropic's command-line coding agent. When you point it at an AI Gateway proxy URL instead of the public Anthropic API, your team's LLM calls flow through ADP: API keys stay in the dataplane secret store, usage rolls up in the *Cost & Usage* tab, and the calls land in xref:observability:transcripts.adoc[the Transcripts view] for investigation. + +After completing this guide, you will be able to: + +* [ ] {learning-objective-1} +* [ ] {learning-objective-2} +* [ ] {learning-objective-3} + +== When to use ADP with Claude Code + +Use this integration when you want to: + +* Pull Anthropic API keys out of every developer's shell and manage them centrally. +* Track Claude Code spend per provider, model, and user without parsing each developer's individual Anthropic invoice. +* Apply per-provider transcript logging. For example, route a regulated team to a "no-logging" provider while keeping the default provider's full conversation history available for review. +* Forward each developer's own Anthropic subscription token through ADP (Anthropic *Auth passthrough*), so the existing Max- or Team-plan entitlement still applies but the call is observed by ADP. + +This guide does not cover building agents that *call* Claude Code. For that, see xref:agents:integration-overview.adoc[Integration patterns overview]. + +== Prerequisites + +* An Anthropic LLM provider configured in AI Gateway. If you haven't created one, follow xref:ai-gateway:configure-provider.adoc[Configure an LLM provider] and pick *Anthropic* as the type. 
Enable at least one Claude model (for example, `claude-sonnet-4-6` or `claude-opus-4-7`) in the model picker. +* Claude Code installed on the developer's workstation. See https://docs.anthropic.com/claude-code[Anthropic's Claude Code documentation]. +* A Redpanda Cloud service account with permission to invoke the provider (`dataplane_adp_llmprovider_invoke`). See xref:governance:permissions-reference.adoc#llm-provider-permissions[LLM provider permissions]. Both shared-developer-tooling and per-developer setups use the same OIDC client-credentials grant; the differences are operational. + +== Get the proxy URL + +. Sign in to ADP and open *LLM Providers*. +. Click into your Anthropic provider. +. On the *Connection* card, copy the Proxy URL. It looks like: ++ +[source,text] +---- +https://aigw..clusters.rdpa.co/llm/v1/providers/ +---- ++ +The *Connect your app* section of the detail page also exposes a ready-to-paste Claude Code snippet pre-filled with this URL. Use it instead of hand-editing if you want to skip the next section. + +== Configure Claude Code + +Claude Code reads the Anthropic base URL and an authentication token from environment variables. Set both to point at the ADP proxy URL. + +[tabs] +====== +OIDC service account (default):: ++ +-- +Use the OIDC `client_credentials` grant to mint an access token, then hand the token to Claude Code through `ANTHROPIC_AUTH_TOKEN`. This is the same flow xref:ai-gateway:connect-agent.adoc[Connect your agent] documents for SDK clients; the only thing different here is how Claude Code reads the token. + +. Mint an access token. The full cURL, Python, and Node.js examples live in xref:ai-gateway:connect-agent.adoc#authenticate-with-oidc-client-credentials[Authenticate with OIDC client credentials]. 
The short version: ++ +[source,bash] +---- +ANTHROPIC_AUTH_TOKEN=$(curl -s --request POST \ + --url 'https://auth.prd.cloud.redpanda.com/oauth/token' \ + --header 'content-type: application/x-www-form-urlencoded' \ + --data grant_type=client_credentials \ + --data client_id= \ + --data client_secret= \ + --data audience=cloudv2-production.redpanda.cloud | jq -r .access_token) +export ANTHROPIC_AUTH_TOKEN +---- + +. Export the proxy URL: ++ +[source,bash] +---- +export ANTHROPIC_BASE_URL="https://aigw..clusters.rdpa.co/llm/v1/providers/" +---- + +. Launch Claude Code as you normally would. Calls flow through ADP. + +The token has a short TTL: re-mint when it expires. AI Gateway does not refresh OIDC tokens for you. For day-to-day work, wrap the mint-and-export in a shell function or sub-shell so you don't have to remember the steps. +-- + +Anthropic Auth passthrough:: ++ +-- +Use when developers should authenticate to Anthropic with their own subscription (Max plan, Team plan, enterprise) and ADP should only observe the call. + +. Confirm the provider has *Auth passthrough* enabled. The *Connection* card on the provider detail page shows the current setting. If it is off, an admin needs to flip it. See xref:ai-gateway:configure-provider.adoc#anthropic-authorization-passthrough[Anthropic: Authorization passthrough]. + +. Set the base URL but pass the developer's own Anthropic key as the token: ++ +[source,bash] +---- +export ANTHROPIC_BASE_URL="https://aigw..clusters.rdpa.co/llm/v1/providers/" +export ANTHROPIC_AUTH_TOKEN="" +---- + +AI Gateway forwards the `Authorization` header to Anthropic unchanged. Usage is still recorded against the provider in the *Cost & Usage* tab; the upstream subscription bears the cost. +-- +====== + +[NOTE] +==== +The model identifier Claude Code sends (for example, `claude-sonnet-4-6`) must be one your Anthropic provider exposes. 
If you see a "model not found" error, open the provider detail page, confirm the model is ticked in the catalog, and pass the exact identifier shown there. +==== + +== Attach ADP-hosted MCP servers (optional) + +Claude Code can call MCP servers for tool access. To use the MCP servers you already host in ADP (managed catalog types, self-managed proxied servers, or both), register each one with Claude Code's MCP configuration: + +[source,bash] +---- +claude mcp add https://aigw..clusters.rdpa.co/mcp/v1/servers/ +---- + +For OAuth-protected MCP servers (most managed types), Claude Code prompts the developer to complete the consent flow on first use. ADP stores the resulting token in the per-user xref:mcp:user-delegated-oauth.adoc[token vault], so subsequent invocations reuse it. + +To front many MCP servers behind a single Claude Code endpoint, use xref:ai-gateway:aggregation.adoc[MCP aggregation] and point Claude Code at the aggregated URL. + +// TODO: Confirm the exact MCP URL shape (`/mcp/v1/servers/` vs. another path) against adp-production once the MCP servers detail page surfaces the canonical URL. + +== Verify the connection + +. Run a short prompt: ++ +[source,bash] +---- +claude "say hello" +---- + +. Open *LLM Providers > > Cost & Usage* in ADP. Within a few seconds the request appears in the *Requests over time* chart. +. Open xref:observability:transcripts.adoc[Transcripts] to read the full turn (if transcript logging is enabled on this provider). + +== Troubleshooting + +[cols="1,2"] +|=== +|Symptom |What to check + +|`401 Unauthorized` +|Token is missing, malformed, or expired. Re-mint the OIDC access token (it has a short TTL) and re-export `ANTHROPIC_AUTH_TOKEN`. Confirm the audience is `cloudv2-production.redpanda.cloud` and that Claude Code is sending the token as `Authorization: Bearer `. For Auth passthrough, confirm the upstream Anthropic key is valid. 
+ +|`403 Forbidden` +|The service account or user lacks `dataplane_adp_llmprovider_invoke` on the provider. See xref:governance:permissions-reference.adoc#llm-provider-permissions[LLM provider permissions] or have an admin assign the `LLMProviderInvoker` built-in role. + +|`404 Not Found` +|`ANTHROPIC_BASE_URL` doesn't match the provider's Proxy URL. Copy it again from the *Connection* card on the detail page; the path segment after `/providers/` must be exactly the provider's `Name`. + +|"Model not found" +|The model identifier Claude Code is sending is not enabled on the provider. Open the provider detail page, confirm the model row appears, and pass that exact identifier (Claude Code's `--model` flag or `ANTHROPIC_MODEL` env var). + +|Spend isn't appearing in *Cost & Usage* +|Allow a few seconds for the cost-reporting pipeline to catch up. If the chart still shows zero after a minute, verify the request actually reached the provider (the *Requests over time* chart populates first) and that you're looking at the right date range. + +|MCP tool calls return `OAuthConnectionRequired` +|The developer hasn't yet completed the consent flow for that MCP server. See xref:mcp:user-delegated-oauth.adoc[User-delegated OAuth]; Claude Code surfaces the `authorize_url` in the error so the developer can finish the handshake. 
+|=== + +== Related topics + +* xref:ai-gateway:configure-provider.adoc[Configure an LLM provider] +* xref:ai-gateway:connect-agent.adoc[Connect your agent] for the SDK and OIDC reference this page builds on +* xref:ai-gateway:configure-provider.adoc#anthropic-authorization-passthrough[Anthropic: Authorization passthrough] for the enterprise-subscription pattern +* xref:integrations:remote-mcp-clients.adoc[Connect remote MCP clients] for chat-app integrations (Claude Desktop, ChatGPT, Gemini, Cursor) +* xref:ai-gateway:aggregation.adoc[MCP aggregation] for fronting many MCP servers behind one URL diff --git a/modules/integrations/pages/remote-mcp-clients.adoc b/modules/integrations/pages/remote-mcp-clients.adoc index c770fca..1ef1a2e 100644 --- a/modules/integrations/pages/remote-mcp-clients.adoc +++ b/modules/integrations/pages/remote-mcp-clients.adoc @@ -131,10 +131,11 @@ The flow mirrors Claude Desktop. The exact menu paths and field labels differ by * *ChatGPT desktop*: Recent builds support remote MCP custom connectors. Confirm the latest menu path; OpenAI iterates on this surface. * *Gemini apps*: Recent builds support remote MCP custom connectors. * *Cursor*: Supports remote MCP servers in recent builds. +* *Microsoft Copilot Studio*: Recent builds support remote MCP custom connectors registered against an external OAuth 2.0 authorization server. Register Copilot Studio's published redirect URIs on the AI Gateway OAuth Client before connecting. The required inputs are the same as Claude Desktop: connector name, MCP URL, Client ID, Client Secret. The chat client's redirect URIs must be registered on the AI Gateway OAuth Client. -// TODO: confirm and document the ChatGPT, Gemini, and Cursor menu paths once each integration ships. +// TODO: confirm and document the ChatGPT, Gemini, Cursor, and Microsoft Copilot Studio menu paths once each integration is exercised end-to-end against adp-production. 
For Copilot Studio specifically: capture the Power Platform / agent designer route to "Add custom connector", the redirect URIs Microsoft publishes for the MCP connector flow, and any tenant-admin consent requirements. Briefing 2026-05-22 names Copilot Studio as a supported client but does not confirm a passing end-to-end test; verify before promising the integration in customer-facing copy. == The two-step OAuth flow @@ -244,6 +245,8 @@ Common symptoms and fixes: == Limitations +* *No Dynamic Client Registration (RFC 7591)*: AI Gateway does not support OAuth 2.0 Dynamic Client Registration today. Every external MCP client must be registered manually through the *OAuth Clients* page or the `rpk ai oauth-client` CLI before its first connection attempt. MCP clients that *only* support DCR (some experimental connector builds) cannot connect to AI Gateway until a corresponding OAuth Client is registered by an admin. See <> for the manual flow. + This page does not cover: * *Custom desktop or mobile UIs*: Build against the AI Gateway MCP endpoints directly using your platform's HTTP client; you don't need an OAuth Client unless you want the same external-app flow. diff --git a/modules/mcp/pages/create-server.adoc b/modules/mcp/pages/create-server.adoc index 1f49162..b2bdedc 100644 --- a/modules/mcp/pages/create-server.adoc +++ b/modules/mcp/pages/create-server.adoc @@ -35,8 +35,8 @@ After completing this guide, you will be able to: The marketplace picker lists every managed type as a card and includes a *Remote (Proxied)* option for self-managed servers. -* *Managed*: pick a card. Redpanda hosts the server in-process. The configuration form is rendered from the type's protobuf schema; field labels and help text come straight from the proto. -* *Self-managed*: pick *Remote (Proxied)*. You provide a URL and a transport, and Redpanda proxies requests to your server. +* *Managed*: Pick a card. Redpanda hosts the server in-process. 
The configuration form is rendered from the type's protobuf schema; field labels and help text come straight from the proto. +* *Self-managed*: Pick *Remote (Proxied)*. You provide a URL and a transport, and Redpanda proxies requests to your server. // TODO: screenshot of the marketplace picker, with both a managed card and the Remote (Proxied) option visible. @@ -139,10 +139,13 @@ NOTE: Defer advanced code-mode patterns (sandboxing limits, runtime selection, d . Click *Create*. The server appears in the list with a *Type* badge: *Managed* or *Self-managed*. . Open the detail page. The *Overview* tab shows the *API URL*: this is the MCP URL agents connect to. Copy it for use later. +. Open the *Connection* tab to see ready-to-paste connection snippets for common MCP clients (Claude Code, Claude Desktop, ChatGPT, Cursor) pre-filled with this server's API URL. . Open the *Inspector* tab. Redpanda performs a live `tools/list` against the server and lists every tool it discovered. See xref:mcp:test-tools.adoc[Test a server's tools] for how to call them. A populated tools list confirms that the connection works and credentials resolve correctly. If the list is empty or the tab shows an error, see <>. +// TODO: Verify the exact tab names on the MCP server detail page against adp-production (briefing 2026-05-22 implies three tabs: Overview / Connection / Inspector for managed servers; remote/proxied servers may surface an additional connection-config tab). Confirm with eng whether the *Connection* tab is the canonical home for client snippets, or whether they live under *Connect your app* like on the LLM provider detail page. + == Create from the CLI Use xref:reference:rpk/rpk-ai/rpk-ai.adoc[`rpk ai`] for a non-UI path through the same create flow, useful for scripting and CI. 
diff --git a/package-2-doc-verification-plan.md b/package-2-doc-verification-plan.md new file mode 100644 index 0000000..70b86d7 --- /dev/null +++ b/package-2-doc-verification-plan.md @@ -0,0 +1,327 @@ +# ADP Package 2: Doc Verification & Update Plan + +Source briefing: https://docs.google.com/document/d/1r5anOCrF3sqptzRX9T_4Wc5ryCMwU8sHi4tnGF0Rr1Y/edit +Scope of this pass: six features the briefing marks 🟢 Shipped or in-production: LLM Proxy, Managed MCP, Microsoft Teams integration, Cost Tracking, MCP OAuth Clients, and OAuth Provider / Token Vault. (LLM Proxy, Managed MCP, Cost Tracking, MCP OAuth Clients, and Claude Code updates shipped 2026-05-24 on branch `adp-package-2-briefing-doc-catchup`.) +Doc tree reviewed: `modules/ai-gateway`, `modules/mcp`, `modules/agents`, `modules/integrations`, `modules/governance`, `modules/reference`, `modules/ROOT/nav.adoc`. + +--- + +## TL;DR + +| Briefing feature | % per briefing | Docs state | Action needed | +|---|---|---|---| +| LLM Proxy | 95%+ | Mostly covered; three specific gaps | ~~Add transcript-logging section; finish Claude Code integration page; clarify Redpanda-managed model catalog~~ ✅ Done 2026-05-24 | +| Managed MCP | 95%+ | Well covered | ~~Light cleanup: add "Connection tab" naming, fill missing catalog deep-dives, optional codegen note~~ ✅ Connection-tab + deep-dive backlog flagged 2026-05-24 | +| Microsoft Teams integration | 95%+ | **Not documented at all** | New page + nav entry + update of `integration-overview.adoc`. 
Paused pending eng verification: see memory `project-adp-teams-integration-doc-gap` | +| Cost Tracking | ~90% | Well covered; two small gaps | ~~Distinguish v1-shipped vs WIP-in-Package-2; surface the API surface more clearly~~ ✅ Spending API how-to added + dashboard shipped-vs-WIP flagged 2026-05-24 | +| MCP OAuth Clients | ~70% | Well covered; two specific gaps | ~~Add Microsoft Copilot Studio to client list; surface "no DCR in Package 2" limitation~~ ✅ Done 2026-05-24 | +| OAuth Provider / Token Vault | 95%+ | **Most thoroughly covered feature.** One minor gap | Optional: new `my-connections.adoc` page so users have a discoverable home for connection management | + +LLM Proxy, Managed MCP, Cost Tracking, MCP OAuth Clients, and Claude Code integration have all been updated. OAuth Provider / Token Vault is the best-documented Package 2 area and does not need significant changes in this pass. Microsoft Teams is the standing gap: a 95%+-shipped feature with zero user-facing documentation. + +--- + +## 1. LLM Proxy + +### What the briefing claims + +1. Configure LLM providers and their API keys; centralized observability of inputs/outputs, token usage, and tool invocations. +2. Five provider types: OpenAI, Google AI Studio, Anthropic (API key **and** Enterprise/Personal passthrough), AWS Bedrock, OpenAI-compatible (vLLM, DeepSeek, self-hosted). +3. Connect tab on the provider detail page showing how to connect via, for example, Claude Code. +4. Built-in connectivity check. +5. Models list: Redpanda maintains the catalog of **available** models, IT Admin ticks checkboxes to enable. +6. Transcript logging is configured per LLM provider, so an admin can create a "sensitive LLM provider" where no data is logged. + +### What the docs cover today + +- `modules/ai-gateway/pages/overview.adoc`: provider types, capabilities, when-to-use. 
+- `modules/ai-gateway/pages/configure-provider.adoc`: full form reference for each of the five types, including Anthropic Auth passthrough, Bedrock inference profiles, model picker, Test Connection. +- `modules/ai-gateway/pages/connect-agent.adoc`: proxy URL anatomy, OIDC client-credentials grant, native SDK examples, and a passing mention that the provider detail page has Claude Code / Codex / Gemini client guides. +- `modules/ai-gateway/pages/bedrock-setup.adoc`: IAM walkthrough. + +### Gaps to fix + +**Gap 1: Transcript logging per provider is undocumented.** The briefing highlights this as a featured capability (screenshot of "Transcript logging is configured on LLM provider level… IT Admin is in charge"), including the "sensitive provider with no logging" use case. `configure-provider.adoc` has no section for it. A grep for "transcript" across `modules/ai-gateway` returns zero hits. + +> *Plan:* Add a "Transcript logging" subsection to `configure-provider.adoc`, ideally between "Select models" and "Save and verify". Cover: where the toggle lives on the create/edit form, the default state, the data captured when enabled (inputs, outputs, tool calls, token counts) and the data still captured when disabled (request metadata, token totals; confirm with eng), and the canonical "sensitive provider" pattern (one provider with logging off for regulated workloads, alongside the default provider). Cross-link to `observability:transcripts.adoc` and `observability:concepts.adoc`. + +**Gap 2: The "Connect tab" / Claude Code page is a stub.** `modules/integrations/pages/claude-code.adoc` contains only `// TODO: Add content`. The detail-page snippets the briefing showcases (Claude Code, Codex, Gemini desktop) are only referenced in a NOTE in `connect-agent.adoc`. + +> *Plan:* Fill `modules/integrations/pages/claude-code.adoc`. 
The two existing partials (`modules/integrations/partials/integrations/claude-code-admin.adoc` and `…-user.adoc`) carry the substance β€” but they describe the **old** AI Gateway v1 surface (separate "AI Gateway > Models" page, `…/ai-gateway/v1` endpoint shape, model IDs like `anthropic/claude-opus-4.6-5`) and need a rewrite against the current per-provider model selection and the `aigw..clusters.rdpa.co/llm/v1/providers//…` URL shape. Same retrofit applies to `cursor.adoc`, `continue.adoc`, `cline.adoc`, `copilot.adoc` β€” flag for a follow-up sweep, out of scope for this pass. + +**Gap 3 β€” "Available models are Redpanda-maintained" is implicit, not stated.** `configure-provider.adoc#select-models` says "the form shows a picker backed by the provider's catalog" but doesn't say that Redpanda curates that catalog and adds new models within a day or two of their upstream release. That's a featured product claim in the briefing. + +> *Plan:* Add a one-paragraph clarification near the model picker description: "The available-models list is maintained by Redpanda; new models published by the upstream provider typically appear in the picker within a day or two. Models you tick become the catalog this provider exposes; everything else stays gated." Light edit. + +### Items the briefing implies but does not require a doc change for + +- "OpenAI compatibility is a fallback that allows anything (DeepSeek, self-hosted vLLM)" β€” already covered in `configure-provider.adoc` and `overview.adoc`. +- Anthropic Enterprise / Max passthrough β€” already covered as "Auth passthrough" in `configure-provider.adoc` and `connect-agent.adoc`. +- Built-in connectivity check β€” already documented as "Test Connection" in the "Save and verify" section. + +--- + +## 2. Managed MCP + +### What the briefing claims + +1. Redpanda-hosted, high-quality, response-optimized MCP servers with form-based config. +2. 
Per-user OBO (on-behalf-of) authentication so each end-user only sees what their own upstream account allows. +3. AI-driven codegen harness (vibe-coded servers via `protoc-gen-go-mcp`, Claude Skills for authoring/optimization/review). +4. 14 MCP servers shipped by Alex Gallego; full catalog with both managed and proxied servers shown together. +5. Marketplace-style Create page; autoform-rendered configuration (e.g., BambooHR subdomain field). +6. Connection tab on detail page (external-client connection URL). +7. Inspector tab for testing. + +### What the docs cover today + +- `modules/mcp/pages/overview.adoc` β€” managed vs. self-managed, 36 default types referenced. +- `modules/mcp/pages/managed/managed-catalog.adoc` β€” every managed type listed (categorized: AI, AWS, Communication, Database, Google, Streaming, Utility), with deep-dive links where they exist. +- `modules/mcp/pages/create-server.adoc` β€” marketplace picker, autoform-from-protobuf, five auth modes (None / Static key / Token passthrough / Service-account OAuth / User-delegated OAuth), code mode, CLI flow. +- `modules/mcp/pages/test-tools.adoc` β€” Inspector tab with Tools/Resources/Prompts/Session panels. +- `modules/mcp/pages/user-delegated-oauth.adoc` β€” full OBO flow with token vault, scope upgrade, refresh, contrast with service-account auth. +- `modules/mcp/pages/oauth-providers.adoc`, `…/github-oauth-tutorial.adoc` β€” OAuth provider configuration. +- Deep-dives: BambooHR, Ironclad, Jira, Kafka, Metabase, NetSuite, OpenAPI, Ramp, Slack, SQL, Workday, Zendesk. + +### Gaps to fix + +**Gap 1 β€” Detail-page "Connection" tab is not described by name for MCP.** `configure-provider.adoc` explicitly walks through the Connection card on the LLM-provider detail page. The MCP equivalent β€” Connection tab showing the MCP URL plus client-connection snippets β€” isn't called out as a discrete UI surface in `create-server.adoc` (it just says "the *Overview* tab shows the *API URL*"). 
+
+> *Plan:* In `create-server.adoc` "Save and verify", split the post-create UI into the actual tabs visible on the detail page: *Overview*, *Connection* (URL + connection-snippet generator), *Inspector*, *Connect* (or whatever the live label is). Or, if there is no separate Connection tab and the URL really does sit on Overview, leave as-is but verify against `adp-production` before signing off; the briefing's screenshot caption "Connection tab. Shows how to connect" implies a real tab. **Open question to confirm with eng.**
+
+**Gap 2: Catalog deep-dive coverage is incomplete.** 12 of the 36 types have deep-dive pages. Untouched: Discord, GitHub (Read), Elasticsearch, MongoDB, Qdrant, Redis, GCP Pub/Sub, Google Calendar, Google Drive, NATS, Azure AD, BILL, DocuSign, Greenhouse, Morningstar Portfolio Analytics, Morningstar Securities, Okta, Salesforce, Text Chunker, AWS Bedrock-as-MCP, Cohere, OpenAI-as-MCP, AWS SNS, AWS SQS. The briefing called out 14 ships by Alex Gallego plus Marc Millstone's response-optimization work; the team can self-prioritize which ones get deep-dives next.
+
+> *Plan:* Out of scope for "Package 2 verify" specifically. Surface as a separate tracking issue: which managed types need deep-dive pages before GA, and which can ride on the catalog row alone.
+
+**Gap 3: Optional: the AI-driven codegen story.** The briefing makes a point that managed MCP servers are vibe-coded via Claude Skills, and the harness uses `protoc-gen-go-mcp`. This is product-marketing material but might warrant a one-paragraph callout under "How managed servers stay current", useful for prospects evaluating quality/cadence. Likely not user-doc territory, more a blog/case-study angle. **Defer unless product asks.**
+
+### Items the briefing implies but does not require a doc change for
+
+- OBO semantics / per-user upstream identity: already heavily documented in `user-delegated-oauth.adoc` and `create-server.adoc`.
+- Inspector tab: already documented in `test-tools.adoc`.
+- Marketplace picker + autoform: already documented in `create-server.adoc`.
+- Managed + remote shown together: already in `overview.adoc`.
+
+---
+
+## 3. Microsoft Teams Integration: **MAJOR GAP**
+
+### What the briefing claims
+
+1. Implemented as an ask from GF; hacked at a Dresden workwith, then product-ized in 10 days.
+2. A Managed Agent can be configured to be exposed via Microsoft Teams (screenshot of the toggle/config in the agent detail page).
+3. End-users interact with the agent inside MS Teams (screenshot of the Teams chat).
+4. Currently wired to **Agents v1**; will be ported to v2 (Agent Registry) when that lands.
+
+### What the docs cover today
+
+Nothing. A `\bTeams\b` grep across `modules/` returns only generic usages ("teams adopting LLMs"). There is no Teams integration page, no nav entry, and no mention in `integration-overview.adoc`. The three integration scenarios listed there are: agent invokes MCP, pipeline invokes agent, external system invokes agent; nothing for "chat platform invokes agent" / "agent surfaced in a chat platform".
+
+### Gaps to fix
+
+**Gap 1: No page, no nav entry, no concept.**
+
+> *Plan:* Add a new page. Two viable homes:
+>
+> - **Option A (recommended):** `modules/agents/pages/integrations/teams.adoc`. Pairs with the existing `integration-overview.adoc`, treats Teams as an agent frontend pattern. Add an "Agent surfaced in a chat platform" row to the scenarios table in `integration-overview.adoc` and link out.
+> - **Option B:** `modules/integrations/pages/teams.adoc`. Lives alongside Claude Code, Cursor, etc. Risk: those integrations connect *clients* to ADP's LLM proxy, not chat platforms to *agents*. Conflating the two would confuse readers.
+>
+> Recommend Option A.
+>
+> Page should cover: prerequisites (Teams app permissions / bot registration in Azure AD or Entra; ADP-side managed-agent already created); configure-the-agent flow (the toggle the briefing screenshots show; exact label TBD with eng); end-user flow (where the bot shows up in Teams, how a user starts a conversation, auth boundary: whose identity Teams forwards, how that maps to ADP's OBO model); known limits (currently Agents v1 only; will be ported to Agents v2 when the registry ships; flag this **explicitly** so customers don't build long-term integrations on the v1 path); troubleshooting.
+>
+> Add nav entry to `modules/ROOT/nav.adoc` under the Agents section, after `integration-overview.adoc`.
+
+**Gap 2: `integration-overview.adoc` scenario table is incomplete.**
+
+> *Plan:* Add a fourth scenario row: "User chats with agent from a chat platform" → "Chat-platform-initiated, interactive, end-user-facing" → link to the new Teams page. Phrase it generically so Slack-as-frontend (if/when it ships) can slot in later.
+
+**Gap 3: Agents v1 vs. v2 caveat.** The briefing is explicit that Teams is wired to v1 today and will be ported to v2. Docs need to surface this honestly so customers building on Teams know what's portable.
+
+> *Plan:* In the new Teams page, include a "Roadmap and current limitations" admonition noting v1-only today and the eventual port. Cross-link to the Agent Registry RFC or relevant agents v2 doc once that exists.
+
+### Items to confirm with eng before writing
+
+- Exact label of the "expose via Teams" toggle/section in the agent detail page.
+- Whether Teams users authenticate via the same Token Vault / OAuth Provider mechanism as MCP servers do, or via a Teams-specific path.
+- Whether multiple agents can be exposed in the same Teams tenant simultaneously, and how the user picks among them.
+- Tenant/admin install steps: who installs the Teams app, what permissions are required, is it App-Source-listed or sideloaded.
+
+---
+
+## Suggested PR breakdown
+
+To keep reviews focused, split the work into four PRs:
+
+1. **LLM Proxy: transcript logging + Redpanda-maintained models clarification.** Touches `configure-provider.adoc` only. Smallest blast radius. Ships first.
+2. **Claude Code integration page rewrite.** Touches `modules/integrations/pages/claude-code.adoc` and its two partials. Flag the parallel pages (Cursor / Continue / Cline / Copilot) for a follow-up; same stale-architecture pattern in all of them.
+3. **Microsoft Teams integration page (new).** Touches `modules/agents/pages/integrations/teams.adoc` (new), `modules/agents/pages/integration-overview.adoc`, `modules/ROOT/nav.adoc`. Blocked on eng confirmation of toggle label, auth model, and v1/v2 messaging.
+4. **MCP Connection-tab naming check + catalog deep-dive backlog.** Touches `create-server.adoc` if eng confirms the tab exists; otherwise just opens a tracking issue for missing deep-dives.
+
+## Open questions for eng before writing
+
+- LLM Proxy: where in the create-LLM-provider form does the transcript-logging toggle live, and what exactly does it gate (capture of `gen_ai.prompt.*` / `gen_ai.completion.*` content fields only, or also `gen_ai.usage.*` token counts)?
+- MCP: is there a real "Connection" tab on the MCP-server detail page, or does the API URL just sit on the Overview tab?
+- Teams: needs the answers in the "Items to confirm with eng" subsection above.
+- Engineering source of truth: `projects/cloudv2` (per project instructions); verify the toggle/tab labels against the actual UI before committing copy.
+
+---
+
+## 4. Cost Tracking
+
+### What the briefing claims
+
+1. AI Gateway tracks every token used and writes it to Redpanda (Kafka topic-backed cost pipeline).
+2. ADP provides an **API** and **UI** to query for usage.
+3. Some "minor parts" still to do, explicitly: grouping by user / agent and drill-down into agent and user. Eng expects this to land before Jun 15.
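
To make claim 3 concrete, here is a minimal sketch of a group-by-user rollup over spend events. The event dictionaries are hypothetical: only the field names `user_id` and `total_cost_microcents` come from this audit, the real event schema is not shown here, and the sketch assumes one microcent is one millionth of a US cent.

```python
from collections import defaultdict
from decimal import Decimal

# Assumption: 1 USD = 100 cents = 100,000,000 microcents.
MICROCENTS_PER_USD = Decimal(100_000_000)


def microcents_to_usd(microcents: int) -> Decimal:
    """Convert an integer microcent amount to dollars without float rounding."""
    return Decimal(microcents) / MICROCENTS_PER_USD


def spend_by_user(events):
    """Sum microcents per user, then convert each total to dollars."""
    totals = defaultdict(int)
    for event in events:
        totals[event["user_id"]] += event["total_cost_microcents"]
    return {user: microcents_to_usd(total) for user, total in totals.items()}


# Hypothetical spend events, for illustration only.
events = [
    {"user_id": "alice", "total_cost_microcents": 1_250_000},
    {"user_id": "alice", "total_cost_microcents": 750_000},
    {"user_id": "bob", "total_cost_microcents": 50_000_000},
]

for user, usd in sorted(spend_by_user(events).items()):
    print(f"{user}: ${usd:.4f}")
```

Summing in integer microcents and converting once at the end keeps the rollup exact, which is presumably why a sub-cent integer unit is used in the first place.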
+
+### What the docs cover today
+
+- `modules/governance/pages/budgets.adoc`: what ADP records automatically (input/output/cached tokens, cost in microcents, request count, provider/model/user/org context), how spend events flow through Kafka, per-request pricing variations (Anthropic fast mode, Gemini context-tier pricing), per-model rate-card overrides for chargeback.
+- `modules/governance/pages/dashboard/overview.adoc`: KPI tabs (Total Spend / Requests / Tokens), date-range presets, chart filters (provider, model, cost type, token type, user), Agents table, Top users panel with heatmap.
+- `modules/ai-gateway/pages/configure-provider.adoc`: *Cost & Usage* tab on the provider detail page: time-series charts, group-by (provider / model / token type), filter by provider/model/cost type/token type/user, sharable custom-range URLs.
+- `modules/governance/pages/guardrails/cost-tracking.adoc`: per-evaluator cost shape (PII / Toxicity / Custom webhook), where evaluator cost surfaces, capping knobs.
+
+This is the most thoroughly documented Package 2 area. Coverage is in good shape.
+
+### Gaps to fix
+
+**Gap 1: The "what's shipped vs. what's WIP" line is blurry.** The briefing flags grouping by user/agent and per-agent / per-user drill-down as still-to-land work. The docs already describe a *Top users* panel and a *user* filter on the chart, and `budgets.adoc` references `GetSpendingBreakdown` with `user_id`. Either (a) those features are now live and the briefing is stale, (b) the docs got ahead of eng and document something that's not yet shipped, or (c) the docs describe a partial implementation. This needs reconciliation before customers find a feature gap.
+
+> *Plan:* Send eng a short list of features the docs claim and ask which are live in Package 2 today vs. coming Jun 15: per-user filter on the dashboard chart, the *Top users* panel with heatmap, `SpendingFilter.user_id` on `GetSpendingBreakdown`, the *Agents* table, an Agent drill-down view from the dashboard, and the *user* filter on the AI Gateway *Cost & Usage* tab. Add a "Currently available" admonition in `dashboard/overview.adoc` covering what's live in Package 2 vs. what's coming. Avoid documenting unshipped surfaces: if the user-drill-down or agent-drill-down view isn't live, the *Top users* panel description should be marked as "coming in Package 2" or deferred entirely.
+
+**Gap 2: The API surface is under-promoted.** The briefing says ADP exposes an API to query usage. `budgets.adoc` mentions `SpendingService.GetSpendingBreakdown` and AIP-160 `filter` expressions in passing, but there's no how-to with a worked example (auth, endpoint URL, request body, response shape). The `modules/reference/pages/api.adoc` page is a `// TODO: Add content` stub. Customers comparing ADP to LangSmith / Helicone / etc. will look for "can I pull spend programmatically"; the answer is yes, but the docs don't show it.
+
+> *Plan:* Add a new "Query spend programmatically" section in `budgets.adoc` (or a dedicated `modules/governance/pages/spending-api.adoc` if it grows past a section) with: the gRPC + Connect/REST shape of `GetSpendingBreakdown`, the canonical `SpendingFilter` body, a worked cURL example pulling per-user-per-day spend for the last 7 days, a worked Python example using the generated client, and the AIP-160 `filter` syntax. Cross-link from `dashboard/overview.adoc` "Next steps" and from the *Cost & Usage* tab section of `configure-provider.adoc`. **Out of scope:** the full `api.adoc` reference page rewrite; that's a separate project.
+
+### Items the briefing implies but does not require a doc change for
+
+- Every-token capture via Kafka topic: covered in `budgets.adoc`.
+- Microcents unit and `total_cost_microcents` conversion: covered.
+- Per-request pricing variations (Anthropic fast mode, Gemini context-tier): covered.
+- Per-model rate-card overrides for chargeback: covered.
+- Anonymous-vs-identified user behavior in the Top users panel: covered.
+
+### Open questions for eng
+
+- Which of the following are live in Package 2 today, and which are WIP? Per-user filter on the dashboard chart; *Top users* panel with heatmap; agent drill-down from the *Agents* table; `SpendingFilter.user_id` in `GetSpendingBreakdown`; per-agent spend grouping.
+- Is `user_id` populated automatically from OIDC claims today, or does the calling app need to inject it? (Also flagged in `budgets.adoc` TODO at line 75.)
+- For the API how-to: confirm the canonical endpoint (gRPC reflection? Connect-Go? HTTP/JSON gateway?) and the authentication model (same OIDC service-account flow as `connect-agent.adoc`?).
+- Is `organization_id` multi-tenant-aware in the public ADP API, or internal-only? (Same as `budgets.adoc` TODO at line 97.)
+
+---
+
+## 5. MCP OAuth Clients
+
+### What the briefing claims
+
+1. Lets external MCP clients connect *to* ADP-hosted MCP servers via incoming OAuth.
+2. ADP runs an embedded Identity Provider so MCP clients can connect with no extra hacks like headers or local rpk proxy.
+3. Works with **any compliant MCP client**: claude.ai, Claude Desktop, Claude Code, **Microsoft Copilot Studio**, etc.
+4. Generic, based on the OAuth2 spec.
+5. **Dynamic Client Registration (DCR) is explicitly out of scope for Package 2.**
+
+### What the docs cover today
+
+- `modules/integrations/pages/remote-mcp-clients.adoc`, a comprehensive page covering:
+  - When to use vs. when not to use (line 17–28).
+  - Three-resource architecture (MCP server / OAuth Provider / OAuth Client) with a GitHub worked example (line 30–54).
+  - Register Client form (Name, Grant Types, Redirect URIs) and Client ID + Client Secret retrieval (line 65–93).
+  - Claude.ai / Claude Desktop wire-up walkthrough (line 95–125).
+  - Brief mentions of ChatGPT, Gemini, Cursor (line 127–137).
+  - Two-step OAuth flow with consent screen content (line 139–177).
+  - Manage and rotate, including `rpk ai oauth-client revoke-tokens` (line 179–216).
+  - Troubleshooting and Limitations (line 218–251).
+- Embedded IdP reference: "Auth0 today, Zitadel later" at line 149.
+- `modules/integrations/pages/claude-code.adoc` (newly written 2026-05-24): covers Claude Code specifically, including MCP server attachment.
+
+This is the second-most-thoroughly-documented Package 2 area.
+
+### Gaps to fix
+
+**Gap 1: Microsoft Copilot Studio is missing from the named-clients list.** The briefing's OAuth Clients screenshot caption explicitly names "Claude.ai, Microsoft Copilot Studio, …" as examples. Grep across `modules/` returns zero hits for "Copilot Studio" or "copilot studio". This matters because Copilot Studio is the Microsoft-side equivalent of Claude's custom connector and is what unlocks Microsoft 365 / Teams users, a meaningful adjacency given the Teams integration GF asked for.
+
+> *Plan:* Add a "Microsoft Copilot Studio" bullet to the "Wire up other chat clients" section in `remote-mcp-clients.adoc` (currently a three-bullet list of ChatGPT desktop / Gemini apps / Cursor at line 127–137). Cover: where Copilot Studio's custom-connector setup lives (Power Platform admin center? Copilot Studio agent designer?), the redirect URIs Microsoft requires (these need eng / a manual test to capture), and the same fields-required pattern (connector name, MCP URL, Client ID, Client Secret). Flag with a "// TODO" until a real walk-through is captured. **Open question:** does Copilot Studio actually pass MCP custom-connector tests against ADP today, or is it on the roadmap? The briefing's caption is illustrative; Michele should verify with eng that the integration is actually exercised before promising it in docs.
+
+**Gap 2: DCR-out-of-scope is not surfaced.** The briefing makes a point of calling out: "Dynamic Client Registration (DCR) is out of scope for Package 2." This is a meaningful limitation: some MCP clients (notably newer Anthropic builds and some experimental Copilot connectors) expect DCR (RFC 7591) for auto-registration. Without DCR, an admin must manually register the OAuth Client first, which is exactly what `remote-mcp-clients.adoc` walks through. Today the docs implicitly require manual registration but never explain *why*, so a customer reading the page might wonder why their MCP client's "register with this MCP server" auto-flow isn't supported.
+
+> *Plan:* Add a one-bullet entry to the "Limitations" section at the end of `remote-mcp-clients.adoc` (currently line 245–251) explicitly stating that AI Gateway does not support OAuth 2.0 Dynamic Client Registration (RFC 7591) today, and that all OAuth Clients must be registered manually through the UI or `rpk ai oauth-client` CLI. One sentence. Cross-link to the manual registration section. Optionally add a sentence in the architecture overview at the top stating "Each external MCP client must be registered up-front" so readers form the right model immediately.
+
+### Items the briefing implies but does not require a doc change for
+
+- Embedded IdP (Auth0 → Zitadel migration): covered at line 149.
+- OAuth2-spec generic / works with any compliant client: covered throughout, including the explicit Limitations subsection telling readers to confirm the latest menu paths for ChatGPT, Gemini, Cursor.
+- No extra hacks like headers or local rpk proxy: implicit; the doc only shows the clean flow.
+- `claude.ai` and Claude Desktop wire-up: covered thoroughly.
+- Claude Code: covered in the new `claude-code.adoc` page (2026-05-24).
+- Revocation flows (UI button + `rpk ai oauth-client revoke-tokens`): covered, including the access-vs-refresh-token distinction.
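
For context on Gap 2, the auto-registration flow that DCR-capable clients expect amounts to one POST of client metadata, as defined by RFC 7591. The sketch below only builds that request body so a writer can see what the Limitations bullet is ruling out. The client name and redirect URI are hypothetical, and no ADP registration endpoint is implied, since the briefing puts DCR out of scope for Package 2.

```python
import json


def build_dcr_request(client_name, redirect_uris):
    """Build the JSON body an RFC 7591 client would POST to a registration endpoint."""
    if not redirect_uris:
        raise ValueError("authorization_code clients need at least one redirect URI")
    # All keys below are client metadata fields defined in RFC 7591, section 2.
    metadata = {
        "client_name": client_name,
        "redirect_uris": list(redirect_uris),
        "grant_types": ["authorization_code", "refresh_token"],
        "response_types": ["code"],
        "token_endpoint_auth_method": "client_secret_basic",
    }
    return json.dumps(metadata, indent=2)


# Hypothetical values, for illustration only.
body = build_dcr_request(
    "example-mcp-client",
    ["https://client.example.com/oauth/callback"],
)
print(body)
```

Without DCR, these same values (name, grant types, redirect URIs) are what an admin enters by hand in the Register Client form the page already documents.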
+
+### Open questions for eng
+
+- Microsoft Copilot Studio: has anyone successfully wired ADP as a Copilot Studio connector end-to-end? If yes, capture the menu path and required redirect URIs. If no, leave it off the named-clients list.
+- DCR: any plan for when DCR support lands (Package 3? Beyond?), so the Limitations bullet can reference a successor doc?
+- ChatGPT / Gemini / Cursor: the doc has `// TODO: confirm and document the ChatGPT, Gemini, and Cursor menu paths once each integration ships.` at line 137. Are any of these now in a state where a real walk-through can replace the placeholder?
+
+---
+
+## Updated suggested PR breakdown
+
+Six PRs total, in dependency order:
+
+1. ✅ **LLM Proxy: transcript logging + Redpanda-maintained models clarification.** Shipped 2026-05-24. Touches `configure-provider.adoc`.
+2. ✅ **Claude Code integration page.** Shipped 2026-05-24. Touches `modules/integrations/pages/claude-code.adoc` and adds an anchor to `connect-agent.adoc`.
+3. ✅ **MCP Connection-tab naming flag.** Shipped 2026-05-24. Touches `modules/mcp/pages/create-server.adoc` with a TODO.
+4. **Cost Tracking: shipped-vs-WIP reconciliation + spending API how-to.** Blocked on eng confirming which dashboard surfaces are live in Package 2. Touches `dashboard/overview.adoc` and adds a section to `budgets.adoc` (or a new `governance/pages/spending-api.adoc`).
+5. **MCP OAuth Clients: Microsoft Copilot Studio + DCR limitation.** Touches `modules/integrations/pages/remote-mcp-clients.adoc`. The Copilot Studio addition is blocked on eng confirmation that the integration actually works; the DCR limitation can ship immediately.
+6. **Microsoft Teams integration page (new).** Paused: see memory `project-adp-teams-integration-doc-gap`.
+
+---
+
+## 6. OAuth Provider / Token Vault
+
+### What the briefing claims
+
+1. OAuth Providers register external systems (Slack, Jira, GitHub, Salesforce, Workday, and similar) that AI Gateway can authenticate against.
+2. OBO (on-behalf-of) semantics: each user of an MCP server can only do what they can do in the upstream system anyway, with their own permission level. No shared credentials.
+3. Token Vault captures user tokens during OAuth login in the browser and refreshes them automatically up to the upstream's max expiry.
+4. Two creation paths: fully customizable custom providers and pre-built "easy" template providers per popular upstream.
+5. GitHub template as the canonical example, pre-filled with most fields.
+6. MCP server configuration attaches one configured OAuth Provider; the auth lives on the provider once and is shared across many MCP servers (IT-configures-once, builders-attach-many).
+7. Claude.ai-driven UX: expired connection presents a login URL, user authorizes, MCP becomes invokable seamlessly.
+
+Briefing acknowledges one product-side polish item not yet shipped: "we will bake a custom experience per template oauth provider type in the future." Custom-flow UX is functional but rough today.
+
+### What the docs cover today
+
+This is the most thoroughly documented Package 2 feature. Coverage spans five pages plus the glossary:
+
+- `modules/mcp/pages/oauth-providers.adoc`: ~285 lines. Prerequisites, the OAuth-Providers-vs-OAuth-Clients disambiguation, full permission set (`_create`, `_get`, `_update`, `_delete`, `_attach`), Browse / Register-in-UI / Register-from-CLI flows, the Category template picker with 10 categories, Browser Consent vs JWT Bearer grant types, client-secret-basic vs client-secret-post vs none auth methods, edit and rotate, delete with consequences, troubleshooting.
+- `modules/mcp/pages/user-delegated-oauth.adoc`: ~80 lines. The OBO concept, the user consent flow with `authorize_url`, token vault storage under user identity, transparent refresh, `OAuthTokenExpired` failure mode, scope-upgrade flow, service-account-OAuth contrast.
+- `modules/mcp/pages/github-oauth-tutorial.adoc`: ~360+ lines. End-to-end tutorial: create the upstream GitHub OAuth app, register the provider in ADP, test through *My Connections*, create the user-delegated GitHub MCP server, troubleshoot scope mismatches and token expiry.
+- `modules/mcp/pages/create-server.adoc`: section *Configure authentication* documents the five auth modes, with User-delegated OAuth pulling from a registered OAuth Provider.
+- `modules/ROOT/pages/index.adoc`: surfaces the "on-behalf-of authorization model" at the top of the product overview.
+- `modules/reference/pages/glossary.adoc`: registers `OAuth provider`, `OAuth client`, `OAuth connection`, and `token vault` as terms with `glossterm:...` macros so they hyperlink consistently across the site.
+
+### Gaps to fix
+
+**Gap 1: `My Connections` is referenced but never has its own page.** Six pages mention *My Connections* as a sidebar entry where users see and revoke their OAuth connections (`user-delegated-oauth.adoc`, `github-oauth-tutorial.adoc`, `overview.adoc`, `create-server.adoc`, `test-tools.adoc`, `ai-gateway/overview.adoc`). A user trying to "find their connections to revoke one" reaches the page itself through trial, then has to piece together the behavior from references on three or four other pages.
+
+> *Plan (optional):* Add `modules/mcp/pages/my-connections.adoc` describing: where the page lives in the sidebar, what each row represents (provider + scopes + last-used + status), how to revoke a single connection, how the consent flow re-runs after revocation, and the contrast with admin-side OAuth provider management. Cross-link from the six pages that already reference it. Low priority because the references work; nice to have for discoverability.
+
+### Items the briefing implies but do not need a doc change
+
+- OAuth Providers for popular upstreams: covered by the Category picker in `oauth-providers.adoc` (Identity, Source control, Productivity, Storage, Communication, CRM, Data warehouse, Monitoring, Security, Business categories, with named example providers in each).
+- OBO semantics: surfaced in `oauth-providers.adoc` opening prose, dedicated treatment in `user-delegated-oauth.adoc`, and a top-level product-overview callout in `ROOT/pages/index.adoc`.
+- Token Vault as a named concept: glossary term plus consistent inline use across pages.
+- Custom provider creation: full form reference in `oauth-providers.adoc`.
+- GitHub template: full tutorial.
+- IT-configures-once, builders-attach-many: covered by the separate `_create` / `_update` / `_delete` vs `_attach` permission split and the NOTE at `oauth-providers.adoc` line 59. The opening prose also says "any MCP server that talks to that upstream can attach to it instead of carrying its own credentials."
+- Token refresh up to max expiry: covered in `user-delegated-oauth.adoc` (transparent refresh + re-consent on refresh failure).
+- Claude.ai expired-connection re-login flow: covered in `remote-mcp-clients.adoc` two-step OAuth flow section.
+
+### Open questions for engineering
+
+None blocking. Two soft questions:
+
+- Is there a planned `My Connections` redesign that would change what the page shows (for example, surfacing token TTL, last upstream call timestamp, or per-MCP-server attribution)? If yes, defer the dedicated page until the redesign lands.
+- The briefing's "we will bake a custom experience per template oauth provider type" implies the Category picker today pre-fills endpoints but does not deeply customize the form fields. Confirm whether the current per-template differences are limited to endpoint pre-fill, or whether scope vocabularies and additional fields already vary per template; the docs should reflect what actually differs.