11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,17 @@

## [Unreleased]

### Phase 8 — MCP remote HTTP (streamable) transport (AI-049) (2026-06-16)

A REMOTE HTTP transport for the MCP server so clients connect over HTTP behind the existing nginx + Cloudflare tunnel (no new cloud), as a new localhost-only Docker service. **stdio behavior is byte-identical** — the http transport is additive. Critical design point: remote HTTP is **multi-user** — each connection authenticates via its OWN `Authorization: Bearer` header (the AI-050 device-flow JWT pasted into the client config), NOT a server-side token cache.

- **Dual-mode host** (`Program.cs` + new `McpHosts`). Transport selected by env `MCP_TRANSPORT` (`stdio` default | `http`) or the `--http` CLI flag (`McpBridgeOptions.Transport`). **stdio** (default, UNCHANGED): `Host.CreateApplicationBuilder`, logs→stderr (the JSON-RPC-on-stdout invariant preserved), `.WithStdioServerTransport()`, singleton DI, device-flow/static token — one process identity. **http** (AI-049): `WebApplication`, normal logging (no JSON-RPC on stdout here), `AddHttpContextAccessor()`, `.WithHttpTransport(o => o.Stateless = true)`, `app.MapMcp("/mcp")` + `GET /health` (Docker probe), `ASPNETCORE_URLS=http://+:8090`.
- **Per-connection identity** (`Auth/HttpContextTokenProvider.cs`, http only). `IMcpTokenProvider` that reads the bearer off `IHttpContextAccessor.HttpContext.Request.Headers.Authorization` (case-insensitive `Bearer ` strip): non-empty → `Authorized(jwt)`; missing/empty/`"Bearer "`-only/malformed → `Failed("authentication required — set Authorization: Bearer <token> in your MCP client")`. NEVER does device flow, NEVER touches `TokenCache`.
- **DI lifetime — the real impact.** http mode registers `IMcpTokenProvider` → `HttpContextTokenProvider` **scoped** and `McpToolCatalog` **scoped** (the `tools/call` handler resolves from request-scoped `request.Services`; the typed `AddHttpClient<TextStackApiClient>` is request-scoped already) so the per-request bearer can't leak across connections. stdio mode keeps the SINGLETON registrations exactly as before. The http host additionally turns DI **scope validation ON** (`UseDefaultServiceProvider` → `ValidateScopes = true`, `ValidateOnBuild = true`) as defense-in-depth for the public multi-user endpoint: a future regression that registers an identity service as a singleton capturing a scoped dep fails at build instead of silently leaking one user's bearer (stdio keeps the defaults — singleton-by-design). The lifetime-agnostic `ListTools`/`CallTool` handler delegates + shared client config are extracted to `McpBridgeCore` so both hosts register identical handlers. `TextStackApiClient`/`McpToolCatalog` are UNCHANGED (already call `GetTokenAsync()`, map `Failed` → clean auth-required `IsError`, public tools use `PublicRequest`).
- **Package.** `ModelContextProtocol.AspNetCore` 1.4.0 (matches the pinned `ModelContextProtocol` 1.4.0) added to `Directory.Packages.props`; the MCP csproj keeps `Microsoft.NET.Sdk` (NOT `.Web`) + an explicit `<FrameworkReference Include="Microsoft.AspNetCore.App" />`. Real API used: `builder.Services.AddMcpServer(...).WithHttpTransport(o => o.Stateless = true)` + `app.MapMcp("/mcp")`.
- **Deploy.** `backend/Docker/Mcp.Dockerfile` (alpine sdk build → aspnet runtime, restores/publishes ONLY the thin MCP csproj, non-root `app`, `ASPNETCORE_URLS=http://+:8090`, EXPOSE 8090). `docker-compose.yml` `mcp-server` behind a **`mcp` profile** (so the CI docker/e2e jobs' bare `docker compose up` does NOT start it): `MCP_TRANSPORT=http`, `TEXTSTACK_API_URL=http://api:8080` (INTERNAL docker network, no Cloudflare round-trip), `TEXTSTACK_SITE_HOST=textstack.app`, `ports: 127.0.0.1:8090:8090`, `/health` healthcheck, `depends_on: api healthy`, `restart: always`, 256M cap. `infra/nginx/textstack.conf`: `upstream textstack_mcp` (keepalive 32), `mcp_limit` zone (10r/s burst 20), `location /mcp` with SSE/streaming settings (`proxy_http_version 1.1`, `Connection ""`, relays `Authorization`, `proxy_buffering off`, `proxy_cache off`, 3600s read/send timeouts) — applied manually on the server at deploy.
- **Tests** (`tests/TextStack.UnitTests/McpHttpTransportTests.cs`, CI-safe, no network/live server): `HttpContextTokenProvider` header→`TokenResult` (bearer→`Authorized`, case-insensitive scheme, absent/empty/`"Bearer "`-only/wrong-scheme/malformed→`Failed`, no-context→`Failed`); composition (catalog over the provider, no bearer → clean auth-required `IsError`, ZERO sends); multi-user guarantee (different bearers → different tokens; same provider re-reads the live request); dual-mode startup (`ResolveTransport` parses env + `--http`; `BuildStdio` constructs; `BuildHttp` starts a `WebApplication` with `/health`→200). No `ITool` added (StudyBuddy set-equality stays green).
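
The scope-validation guard described in the DI lifetime bullet can be seen in isolation. What follows is a minimal sketch, not the PR's `McpHosts` code: `IRequestToken`, `RequestToken`, `LeakyCatalog`, and `ScopeValidationDemo` are hypothetical stand-ins, and it assumes the `Microsoft.Extensions.DependencyInjection` package is referenced. It registers a singleton that captures a scoped dependency and checks whether the container rejects it at build time.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical stand-in for a per-request identity service (scoped).
public interface IRequestToken { string Value { get; } }
public sealed class RequestToken : IRequestToken { public string Value => "per-request"; }

// Hypothetical regression: a singleton that captures the scoped token,
// which would pin one request's identity for the whole process.
public sealed class LeakyCatalog
{
    public LeakyCatalog(IRequestToken token) => Token = token;
    public IRequestToken Token { get; }
}

public static class ScopeValidationDemo
{
    // Returns true when building the container is rejected.
    public static bool BuildFails(bool validate)
    {
        var services = new ServiceCollection();
        services.AddScoped<IRequestToken, RequestToken>();
        services.AddSingleton<LeakyCatalog>();

        try
        {
            using var provider = services.BuildServiceProvider(new ServiceProviderOptions
            {
                ValidateScopes = validate,   // reject scoped-from-singleton consumption
                ValidateOnBuild = validate,  // check every registration at build time
            });
            return false;
        }
        catch (Exception ex) when (ex is AggregateException or InvalidOperationException)
        {
            return true;
        }
    }
}
```

With both flags on, `ScopeValidationDemo.BuildFails(true)` returns `true` because the container refuses a singleton that consumes a scoped service. With the defaults, `BuildFails(false)` returns `false`: the same registration builds without complaint, which is the silent-leak case the changelog entry describes.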

### Phase 8 — `save_highlight` MCP write tool (AI-048b) (2026-06-16)

The first WRITE tool on the MCP↔HTTP bridge, completing the 7-tool surface (the 6 reads + `save_highlight`). Now safe to ship because AI-050b gives a per-user consented token, so the write runs on the user's OWN identity, not a shared static secret. All in `backend/src/Ai/TextStack.Ai.Mcp/` + unit tests — **no backend change**.
16 changes: 13 additions & 3 deletions CLAUDE.md
@@ -456,13 +456,13 @@ That single command builds the AAB and pushes it to Internal Testing. Service ac

```
Internet → Cloudflare (DNS+SSL) → Cloudflare Tunnel → nginx (port 80)
-├─ textstack.app → SSG static files + /api/ proxy to :8080
+├─ textstack.app → SSG static files + /api/ proxy to :8080 + /mcp proxy to :8090
└─ textstack.dev → admin panel (:81)
```

-Docker services: `db` (postgres:16), `migrator`, `api`, `worker`, `admin`, `ssg-worker`, `aspire-dashboard` (profile-gated), `ollama`. All localhost-only, no public ports except 80 via tunnel.
+Docker services: `db` (postgres:16), `migrator`, `api`, `worker`, `admin`, `ssg-worker`, `aspire-dashboard` (profile-gated), `ollama`, `mcp-server` (profile-gated, `--profile mcp`). All localhost-only, no public ports except 80 via tunnel.

-**Nginx bot detection**: Regex map identifies crawlers (Google, Bing, Yandex, social bots) → routes to prerendered SSG HTML. Rate limiting zones: API (10r/s), uploads (1r/s), translation (5r/m).
+**Nginx bot detection**: Regex map identifies crawlers (Google, Bing, Yandex, social bots) → routes to prerendered SSG HTML. Rate limiting zones: API (10r/s), uploads (1r/s), translation (5r/m), MCP (10r/s).

**Systemd services**: `seo-publish-poller` (auto-publish with SEO generation).

@@ -474,6 +474,16 @@ Supported formats: EPUB, PDF, FB2. Processing order: Spelling → Hyphenation

FB2 (`Fb2TextExtractor`): XML-based FictionBook 2.0. Cover from binary elements, metadata extraction, chapter flattening, namespace detection for non-compliant files.

## MCP Server (`backend/src/Ai/TextStack.Ai.Mcp/`)

Thin, stateless MCP↔HTTP bridge (Phase 8) — every tool call becomes an HTTP request to the public TextStack API (no DB/EF/OpenAI). 7 tools: `search_books`, `get_book`, `get_chapter` (public) + `list_my_highlights`, `list_my_vocabulary`, `ask_book`, `save_highlight` (Bearer).

**Dual transport** (env `MCP_TRANSPORT`: `stdio` default | `http`; `--http` flag also selects http). Shared wiring (tool catalog handlers, typed `TextStackApiClient`) in `McpBridgeCore`; the two host builders in `McpHosts`.
- **stdio** (local, single identity): `Host.CreateApplicationBuilder`, **logs→stderr** (stdout is JSON-RPC only — never `Console.Write*`), singleton DI, token from `TEXTSTACK_MCP_TOKEN` (static) or the device flow (`DeviceFlowTokenProvider`, AI-050). Byte-identical to the pre-049 server.
- **http** (AI-049, remote, **multi-user**): `WebApplication`, `.WithHttpTransport(o => o.Stateless = true)`, `app.MapMcp("/mcp")` + `GET /health`. Each connection authenticates with its OWN `Authorization: Bearer <token>` — the AI-050 device-flow JWT pasted into the client config — read per-request by `HttpContextTokenProvider` (SCOPED; `McpToolCatalog` + provider scoped so no identity leaks across connections). NEVER touches the device-flow cache. Package: `ModelContextProtocol.AspNetCore` 1.4.0 (matches the pinned `ModelContextProtocol`).
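
The transport selection described above reduces to a small pure function. This is a hedged sketch only: `McpTransport` and `TransportResolver` are hypothetical names (the PR's own resolver is not shown here), and two behaviours are assumptions rather than facts from the source: that `--http` wins over any `MCP_TRANSPORT` value, and that any value other than `http` falls back to `stdio`.

```csharp
using System;

// Hypothetical enum; the real options type is McpBridgeOptions.Transport.
public enum McpTransport { Stdio, Http }

public static class TransportResolver
{
    // envValue: the value of MCP_TRANSPORT (null when unset).
    // args: the process command-line arguments.
    public static McpTransport Resolve(string? envValue, string[] args)
    {
        // Assumption: the --http flag selects http regardless of the env value.
        if (Array.Exists(args, a => string.Equals(a, "--http", StringComparison.OrdinalIgnoreCase)))
            return McpTransport.Http;

        // Assumption: matching is case-insensitive, and anything other than
        // "http" (unset, "stdio", a typo) keeps the stdio default.
        return string.Equals(envValue?.Trim(), "http", StringComparison.OrdinalIgnoreCase)
            ? McpTransport.Http
            : McpTransport.Stdio;
    }
}
```

A host entry point could call it as `TransportResolver.Resolve(Environment.GetEnvironmentVariable("MCP_TRANSPORT"), args)` and branch to the stdio or http builder on the result. Keeping the function free of environment access makes it testable without touching process state.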

**Deploy** (http mode): Docker `mcp-server` (`backend/Docker/Mcp.Dockerfile`, profile `mcp`) binds `http://+:8090`, mapped `127.0.0.1:8090`; talks to the API over the **internal** docker network (`TEXTSTACK_API_URL=http://api:8080`). nginx `location /mcp` (upstream `textstack_mcp`, zone `mcp_limit`) proxies with SSE settings (`proxy_buffering off`, `Connection ""`, relays `Authorization`, 3600s timeouts). Behind Cloudflare tunnel — no new cloud. Bring up: `docker compose --profile mcp up -d mcp-server`. nginx `/mcp` block is applied manually on the server at deploy.

## Telemetry

OpenTelemetry → Aspire Dashboard (`localhost:18888`). OTLP: `:18889`. Services: `textstack-api`, `textstack-worker`.
6 changes: 6 additions & 0 deletions Directory.Packages.props
@@ -32,6 +32,12 @@
TextStack.Ai.Mcp — the SDK ships the stdio transport + low-level
ListTools/CallTool handler API used by the runtime tool catalog. -->
<PackageVersion Include="ModelContextProtocol" Version="1.4.0" />
<!-- HTTP (streamable) transport for the MCP server (AI-049, Phase 8). Separate
package layered on ModelContextProtocol; pinned to the SAME 1.4.0 as the
core SDK. Brings WithHttpTransport(o => o.Stateless = true) + MapMcp("/mcp")
on a WebApplication. Pulls in the ASP.NET Core framework reference for the
TextStack.Ai.Mcp http transport branch. -->
<PackageVersion Include="ModelContextProtocol.AspNetCore" Version="1.4.0" />
<!-- Tool-args validation (AI-030): real JSON Schema draft 2020-12 evaluation before dispatch -->
<PackageVersion Include="JsonSchema.Net" Version="7.3.4" />
<!-- AI eval framework (MEAI.Evaluation). Stable 10.6.0 — IChatClient seam +
27 changes: 27 additions & 0 deletions backend/Docker/Mcp.Dockerfile
@@ -0,0 +1,27 @@
# TextStack MCP server — remote HTTP (streamable) transport (AI-049, Phase 8).
# Mirrors Api.Dockerfile (alpine sdk build → alpine aspnet runtime). The project
# is a thin, stateless MCP↔HTTP bridge: it references ONLY the MCP SDK packages
# (no Application / Infrastructure / Domain), so the restore layer copies just its
# csproj + the central package/build props.
FROM mcr.microsoft.com/dotnet/sdk:10.0-alpine AS build
WORKDIR /src

COPY Directory.Build.props Directory.Packages.props ./
COPY backend/src/Ai/TextStack.Ai.Mcp/TextStack.Ai.Mcp.csproj backend/src/Ai/TextStack.Ai.Mcp/
RUN dotnet restore backend/src/Ai/TextStack.Ai.Mcp/TextStack.Ai.Mcp.csproj

COPY backend/src/Ai/TextStack.Ai.Mcp/ backend/src/Ai/TextStack.Ai.Mcp/
RUN dotnet publish backend/src/Ai/TextStack.Ai.Mcp/TextStack.Ai.Mcp.csproj -c Release -o /app/publish

FROM mcr.microsoft.com/dotnet/aspnet:10.0-alpine AS runtime
RUN deluser app 2>/dev/null; delgroup app 2>/dev/null; \
addgroup -g 1000 app && adduser -D -u 1000 -G app app
WORKDIR /app
COPY --from=build /app/publish .
USER app
# Remote, multi-user transport: each connection carries its own Bearer; the
# container binds all interfaces on 8090 (compose maps it to 127.0.0.1 only,
# nginx fronts /mcp). MCP_TRANSPORT=http is supplied by compose.
ENV ASPNETCORE_URLS=http://+:8090
EXPOSE 8090
ENTRYPOINT ["dotnet", "TextStack.Ai.Mcp.dll"]
49 changes: 49 additions & 0 deletions backend/src/Ai/TextStack.Ai.Mcp/Auth/HttpContextTokenProvider.cs
@@ -0,0 +1,49 @@
using Microsoft.AspNetCore.Http;

namespace TextStack.Ai.Mcp.Auth;

/// <summary>
/// HTTP-mode <see cref="IMcpTokenProvider"/> (AI-049). The remote transport is
/// MULTI-USER: each connection authenticates with its OWN
/// <c>Authorization: Bearer &lt;token&gt;</c> header, NOT a server-side device-flow
/// cache. This provider reads that header off the CURRENT request and never does
/// device flow / never touches <see cref="TokenCache"/>.
///
/// Registered SCOPED (so the per-request bearer can't leak across connections);
/// resolves the live request via <see cref="IHttpContextAccessor"/>.
/// • non-empty bearer → <see cref="TokenResult.Authorized"/> (the raw JWT).
/// • missing / empty / "Bearer "-only / malformed → <see cref="TokenResult.Failed"/>
/// with an actionable "set Authorization: Bearer &lt;token&gt;" message that the
/// catalog renders as a clean auth-required IsError (the HTTP call is never made).
///
/// NEVER yields <see cref="TokenResult.Pending"/> — there is no device flow in http
/// mode; the user pastes the AI-050 device-flow JWT into their MCP client config.
/// </summary>
public sealed class HttpContextTokenProvider : IMcpTokenProvider
{
    private const string BearerScheme = "Bearer ";

    private static readonly TokenResult.Failed NoBearer = new(
        "authentication required — set Authorization: Bearer <token> in your MCP client");

    private readonly IHttpContextAccessor _accessor;

    public HttpContextTokenProvider(IHttpContextAccessor accessor) => _accessor = accessor;

    public Task<TokenResult> GetTokenAsync(CancellationToken ct)
    {
        // No live request (defensive) → fail-clean, no throw.
        var header = _accessor.HttpContext?.Request.Headers.Authorization.ToString();
        if (string.IsNullOrWhiteSpace(header))
            return Task.FromResult<TokenResult>(NoBearer);

        // Strip the scheme case-insensitively. A header that is NOT a Bearer scheme,
        // or "Bearer " with nothing after it, is treated as missing.
        if (!header.StartsWith(BearerScheme, StringComparison.OrdinalIgnoreCase))
            return Task.FromResult<TokenResult>(NoBearer);

        var token = header[BearerScheme.Length..].Trim();
        return Task.FromResult<TokenResult>(
            token.Length == 0 ? NoBearer : new TokenResult.Authorized(token));
    }
}
96 changes: 96 additions & 0 deletions backend/src/Ai/TextStack.Ai.Mcp/McpBridgeCore.cs
@@ -0,0 +1,96 @@
using System.Reflection;
using System.Text.Json;
using Microsoft.Extensions.DependencyInjection;
using ModelContextProtocol.Protocol;
using ModelContextProtocol.Server;
using TextStack.Ai.Mcp.Http;
using TextStack.Ai.Mcp.Tools;

namespace TextStack.Ai.Mcp;

/// <summary>
/// Transport-agnostic wiring shared by BOTH hosts (stdio + http, AI-049): the
/// typed <see cref="TextStackApiClient"/> HTTP config and the
/// <c>tools/list</c> / <c>tools/call</c> handler delegates.
///
/// The handlers resolve <see cref="McpToolCatalog"/> from <c>request.Services</c>
/// (which is request-scoped under HTTP and the root provider under stdio), so they
/// are lifetime-agnostic and IDENTICAL across hosts. Only the DI lifetime of the
/// catalog / token provider differs per transport, registered by the callers.
/// </summary>
internal static class McpBridgeCore
{
    /// <summary>
    /// Registers the typed <see cref="TextStackApiClient"/> over the public API.
    /// The Host header is set per-request inside the client so
    /// <c>SiteContextMiddleware</c> resolves the site. Bound timeout so a stuck
    /// upstream can't hang a tool call for the default 100s.
    /// </summary>
    public static void AddApiClient(IServiceCollection services, McpBridgeOptions options) =>
        services.AddHttpClient<TextStackApiClient>(http =>
        {
            http.BaseAddress = new Uri(options.ApiBaseUrl, UriKind.Absolute);
            http.Timeout = TimeSpan.FromSeconds(McpTimeoutSeconds());
        });

    /// <summary>
    /// The shared MCP server handlers. Both hosts register the SAME pair; identity
    /// (and lifetime) is supplied by whatever <see cref="McpToolCatalog"/> the
    /// request scope resolves.
    /// </summary>
    public static McpServerHandlers BuildHandlers() => new()
    {
        // tools/list — projected from the runtime catalog.
        ListToolsHandler = (request, _) =>
        {
            var catalog = request.Services!.GetRequiredService<McpToolCatalog>();
            return ValueTask.FromResult(new ListToolsResult { Tools = catalog.ListTools() });
        },
        // tools/call — dispatch by name; args dictionary → a single JSON object for
        // the catalog handler (which validates against the input schema).
        CallToolHandler = async (request, ct) =>
        {
            var catalog = request.Services!.GetRequiredService<McpToolCatalog>();
            var name = request.Params!.Name;
            var arguments = ToArgumentsObject(request.Params.Arguments);
            return await catalog.CallAsync(name, arguments, ct);
        },
    };

    /// <summary>Server identity advertised in both transports.</summary>
    public static Implementation ServerInfo() => new()
    {
        Name = "textstack",
        Version = Assembly.GetExecutingAssembly().GetName().Version?.ToString() ?? "1.0.0",
    };

    /// <summary>Tools-only capability set (no prompts/resources).</summary>
    public static ServerCapabilities Capabilities() => new() { Tools = new ToolsCapability() };

    // Rebuilds the MCP-supplied args dictionary into a single JSON object element so
    // the catalog handler can validate/read it as one schema-shaped value.
    private static JsonElement? ToArgumentsObject(IDictionary<string, JsonElement>? arguments)
    {
        if (arguments is null)
            return null;

        var node = new System.Text.Json.Nodes.JsonObject();
        foreach (var (key, value) in arguments)
            node[key] = System.Text.Json.Nodes.JsonNode.Parse(value.GetRawText());

        return JsonSerializer.SerializeToElement(node);
    }

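    // HttpClient timeout in seconds. Env: TEXTSTACK_MCP_TIMEOUT_SECONDS (default 15);
    // a non-positive / non-finite / unparseable value falls back to the 15s default.
    // internal so the stdio device-flow client (McpHosts) shares the SAME
    // env-overridable value as the typed TextStackApiClient — no drift.
    internal static double McpTimeoutSeconds()
    {
        var raw = Environment.GetEnvironmentVariable("TEXTSTACK_MCP_TIMEOUT_SECONDS");
        // Invariant culture so "1.5" parses the same on every host locale; the finite
        // check rejects "Infinity", which would make TimeSpan.FromSeconds throw.
        if (double.TryParse(raw, System.Globalization.NumberStyles.Float,
                System.Globalization.CultureInfo.InvariantCulture, out var seconds)
            && double.IsFinite(seconds) && seconds > 0)
            return seconds;

        return 15;
    }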
}