The conformance test runner is a framework for testing both MCP client and server implementations against the MCP specification. It provides two testing modes:
- Client Testing - Executes client implementations in controlled scenarios and validates their behavior
- Server Testing - Tests server implementations by acting as an MCP client and validating responses
src/runner/
├── index.ts # Exports all public functions
├── client.ts # Client testing implementation
├── server.ts # Server testing implementation
└── utils.ts # Shared utilities (formatting, file I/O, ANSI colors)
Client Testing:
- Runner - Orchestrates test execution
- Test Server - MCP server (and optionally auth server for auth scenarios)
- Client Process - The MCP client implementation under test
- Scenario - A specific test case with expected behaviors
- Checks - Validation functions that produce ConformanceCheck results
Server Testing:
- Runner - Orchestrates test execution
- Test Client - MCP SDK client that connects to the server under test
- Server Process - The MCP server implementation under test (external)
- Scenario - A specific test case with expected server behaviors
- Checks - Validation functions that produce ConformanceCheck results
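As an illustration of what a check might produce, here is a minimal sketch of a `ConformanceCheck` shape and one check function. The field names follow the sample `checks.json` entry in this document; the `"FAILURE"` status value and the `checkInitializeSeen` helper are assumptions made for the example, not the runner's actual definitions.

```typescript
// Field names mirror the sample checks.json entry in this document.
// ASSUMPTION: "FAILURE" is a guess; only "SUCCESS" appears in the sample.
type CheckStatus = "SUCCESS" | "FAILURE";

interface SpecReference {
  id: string;
  url: string;
}

interface ConformanceCheck {
  id: string;
  name: string;
  description: string;
  status: CheckStatus;
  timestamp: string; // ISO 8601 string
  specReferences?: SpecReference[];
  details?: Record<string, unknown>;
}

// Hypothetical check: did the client send an "initialize" request at all?
function checkInitializeSeen(
  methodsSeen: string[],
  now: Date = new Date(),
): ConformanceCheck {
  const seen = methodsSeen.includes("initialize");
  return {
    id: "mcp-client-initialization",
    name: "MCPClientInitialization",
    description: "Validates that MCP client properly initializes with server",
    status: seen ? "SUCCESS" : "FAILURE",
    timestamp: now.toISOString(),
    specReferences: [
      {
        id: "MCP-Lifecycle",
        url: "https://modelcontextprotocol.io/specification/2025-06-18/basic/lifecycle",
      },
    ],
    details: { methodsSeen },
  };
}
```

Passing the clock in as a parameter keeps the check deterministic, which makes it straightforward to unit test.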
Client Testing:
1. Runner starts test server(s) on available port(s)
2. Runner spawns client process with server URL as final argument
3. Server captures MCP interactions
4. Runner captures client stdout/stderr
5. Scenario-specific checks are executed
6. Results are written to results/<scenario>-<timestamp>/
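Steps 2 and 4 of the client flow can be sketched roughly as follows. This is not the runner's actual code: `buildClientArgv` and `runClient` are hypothetical names, and splitting the command on whitespace is a simplifying assumption that ignores quoted arguments.

```typescript
import { spawn } from "node:child_process";

// Split the user-supplied command and append the server URL as the final argument.
// ASSUMPTION: naive whitespace splitting; quoted arguments are not handled.
function buildClientArgv(command: string, serverUrl: string): string[] {
  const parts = command.trim().split(/\s+/).filter((p) => p.length > 0);
  if (parts.length === 0) throw new Error("command must not be empty");
  return [...parts, serverUrl];
}

interface ClientRun {
  exitCode: number | null;
  stdout: string;
  stderr: string;
}

// Spawn the client, collect stdout/stderr, and kill it if it exceeds the timeout.
function runClient(command: string, serverUrl: string, timeoutMs = 30000): Promise<ClientRun> {
  const [bin, ...args] = buildClientArgv(command, serverUrl);
  return new Promise((resolve, reject) => {
    const child = spawn(bin, args);
    let stdout = "";
    let stderr = "";
    const timer = setTimeout(() => child.kill(), timeoutMs);
    child.stdout.on("data", (chunk: Buffer) => { stdout += chunk.toString(); });
    child.stderr.on("data", (chunk: Buffer) => { stderr += chunk.toString(); });
    child.on("error", (err) => { clearTimeout(timer); reject(err); });
    child.on("close", (exitCode) => { clearTimeout(timer); resolve({ exitCode, stdout, stderr }); });
  });
}
```

For example, `buildClientArgv("tsx my-client.ts", "http://localhost:3000/mcp")` yields `["tsx", "my-client.ts", "http://localhost:3000/mcp"]`.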
Server Testing:
1. User starts their server implementation
2. Runner connects to server URL as an MCP client
3. Runner sends requests and captures responses
4. Scenario-specific checks validate server behavior
5. Results are written to results/server-<scenario>-<timestamp>/
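The runner talks to the server through an MCP SDK client, so it does not build messages by hand. To make step 3 concrete, the sketch below shows the JSON-RPC `initialize` request defined by the MCP specification and a loose check on the response. The helper names and the `clientInfo` values are assumptions made for illustration.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Build an MCP "initialize" request; clientInfo values are placeholders.
function buildInitializeRequest(id: number): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion: "2025-06-18",
      capabilities: {},
      clientInfo: { name: "conformance-sketch", version: "0.0.0" },
    },
  };
}

// Loose structural check: an initialize result carries protocolVersion,
// capabilities, and serverInfo.
function looksLikeInitializeResult(response: unknown): boolean {
  if (typeof response !== "object" || response === null) return false;
  const result = (response as { result?: unknown }).result;
  if (typeof result !== "object" || result === null) return false;
  const r = result as Record<string, unknown>;
  return (
    typeof r.protocolVersion === "string" &&
    typeof r.capabilities === "object" && r.capabilities !== null &&
    typeof r.serverInfo === "object" && r.serverInfo !== null
  );
}
```

A real scenario would validate far more than this (capability contents, version negotiation, and so on); the sketch shows only the message shape being exercised.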
The conformance suite provides a unified CLI with two main commands:
# Run a single client scenario
npm run start -- client --command "tsx examples/clients/typescript/test1.ts" --scenario initialize
# With verbose output
npm run start -- client --command "tsx examples/clients/typescript/test1.ts" --scenario initialize --verbose
# With custom timeout
npm run start -- client --command "tsx examples/clients/typescript/test1.ts" --scenario initialize --timeout 60000
Arguments:
- --command - The command to run the client (can include existing flags)
- --scenario - The scenario to test (e.g., "initialize", "tools-call")
- --timeout - Timeout in milliseconds (default: 30000)
- --verbose - Show verbose output (JSON format)
The runner will append the server URL as the final argument to the command.
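On the client side, this means the implementation under test can read the server URL from the end of its argument list. A minimal sketch, assuming a Node.js client; `serverUrlFromArgv` is a hypothetical helper name:

```typescript
// Read the server URL that the runner appends as the final argument.
function serverUrlFromArgv(argv: string[]): URL {
  const last = argv[argv.length - 1];
  if (last === undefined) {
    throw new Error("expected the server URL as the final argument");
  }
  // new URL() throws a TypeError if the value is not a valid URL.
  return new URL(last);
}

// Typical use inside a client entry point:
//   const serverUrl = serverUrlFromArgv(process.argv.slice(2));
```

Because the URL is always last, any flags already present in --command stay in their original positions.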
# Run a single server scenario
npm run start -- server --url http://localhost:3000/mcp --scenario server-initialize
# Run all server scenarios (default when no --scenario specified)
npm run start -- server --url http://localhost:3000/mcp
Arguments:
- --url - URL of the server to test
- --scenario <scenario> - Scenario to test (optional, defaults to all scenarios if not specified)
# List all scenarios
npm run start -- list
# List only client scenarios
npm run start -- list --client
# List only server scenarios
npm run start -- list --server
All CLI arguments are validated using Zod schemas (src/schemas.ts) before being passed to runner functions:
Client Validation:
- Command is non-empty
- Scenario exists in available scenarios
- Timeout is a positive integer
Server Validation:
- URL is a valid HTTP/HTTPS URL
- All specified scenarios exist (if provided)
- Defaults to all scenarios when no --scenario is specified
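The rules above can be sketched without any dependencies. The real validation lives in Zod schemas in src/schemas.ts; the hand-written functions below are a stand-in that mirrors the listed rules, and every name in them is hypothetical.

```typescript
type Result<T> = { ok: true; value: T } | { ok: false; errors: string[] };

interface ClientOptions { command: string; scenario: string; timeout: number; }
interface ServerOptions { url: string; scenarios: string[]; }

function validateClientOptions(
  input: { command: string; scenario: string; timeout?: number },
  available: string[],
): Result<ClientOptions> {
  const errors: string[] = [];
  const timeout = input.timeout ?? 30000; // default from the CLI documentation
  if (input.command.trim().length === 0) errors.push("command must be non-empty");
  if (!available.includes(input.scenario)) errors.push(`unknown scenario: ${input.scenario}`);
  if (!Number.isInteger(timeout) || timeout <= 0) errors.push("timeout must be a positive integer");
  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: { command: input.command, scenario: input.scenario, timeout } };
}

function validateServerOptions(
  input: { url: string; scenarios?: string[] },
  available: string[],
): Result<ServerOptions> {
  const errors: string[] = [];
  let protocolOk = false;
  try {
    const protocol = new URL(input.url).protocol;
    protocolOk = protocol === "http:" || protocol === "https:";
  } catch {
    protocolOk = false; // not parseable as a URL at all
  }
  if (!protocolOk) errors.push("url must be a valid HTTP/HTTPS URL");
  // No scenarios given: default to every available scenario.
  const scenarios = input.scenarios ?? available;
  for (const s of scenarios) {
    if (!available.includes(s)) errors.push(`unknown scenario: ${s}`);
  }
  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: { url: input.url, scenarios } };
}
```

Collecting all errors before returning, rather than stopping at the first, lets a CLI report every problem with the arguments in one pass.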
A scenario represents a specific test case that validates one or more aspects of MCP behavior. Each scenario:
- Configures the test environment with expected behavior
- May run multiple conformance checks
- Validates both request/response patterns and protocol compliance
Client scenarios:
- initialize - Tests client initialization handshake
- tools-call - Tests tool invocation
- auth/basic-dcr - Tests OAuth Dynamic Client Registration flow
- auth/basic-metadata-var1 - Tests OAuth with authorization metadata variation 1
Server scenarios:
- server-initialize - Tests server initialization and capabilities
- tools-list - Tests tool listing endpoint
- tools-call-simple-text - Tests tool invocation with text response
- resources-list - Tests resource listing
- prompts-list - Tests prompt listing
- logging-set-level - Tests logging level configuration
Results are written to results/<scenario>-<timestamp>/ for client tests and to results/server-<scenario>-<timestamp>/ for server tests.
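The naming scheme can be sketched as a small pure function. The timestamp format used here (an ISO 8601 string with `:` and `.` replaced by `-`) is an assumption, as is the `resultDirName` helper itself; the runner's actual format may differ.

```typescript
// Build a results directory path following the documented naming scheme.
// ASSUMPTION: timestamp is an ISO string made filesystem-safe; the real format may differ.
function resultDirName(
  scenario: string,
  kind: "client" | "server",
  now: Date = new Date(),
): string {
  const timestamp = now.toISOString().replace(/[:.]/g, "-");
  const prefix = kind === "server" ? "server-" : "";
  // Note: scenario names containing "/" (e.g. "auth/basic-dcr") produce nested directories.
  return `results/${prefix}${scenario}-${timestamp}`;
}
```

For instance, a server run of `tools-list` would land under a path beginning with `results/server-tools-list-`.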
Client Testing Files:
- checks.json - Array of ConformanceCheck objects with validation results
- stdout.txt - Complete stdout from the client process
- stderr.txt - Complete stderr from the client process
Server Testing Files:
- checks.json - Array of ConformanceCheck objects with validation results
[
{
"id": "mcp-client-initialization",
"name": "MCPClientInitialization",
"description": "Validates that MCP client properly initializes with server",
"status": "SUCCESS",
"timestamp": "2024-10-29T14:30:00.000Z",
"specReferences": [
{
"id": "MCP-Lifecycle",
"url": "https://modelcontextprotocol.io/specification/2025-06-18/basic/lifecycle"
}
],
"details": { ... }
}
]
The runner can also be used programmatically:
import { runConformanceTest, printClientResults, runServerConformanceTest, printServerResults } from './runner/index.js';
// Test a client
const clientResult = await runConformanceTest('tsx my-client.ts', 'initialize', 30000);
// Rename on destructuring so the client and server counts can share one scope
const { failed: clientFailed } = printClientResults(clientResult.checks);
// Test a server
const serverResult = await runServerConformanceTest('http://localhost:3000/mcp', 'server-initialize');
const { failed: serverFailed } = printServerResults(serverResult.checks);
The utils.ts module provides shared functionality:
- File Operations: ensureResultsDir(), createResultDir()
- Formatting: formatPrettyChecks() for colored console output
- Styling: ANSI color constants and helpers
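To give a feel for the formatting helpers, here is a rough sketch of colored check output built on ANSI escape codes. It is not the actual formatPrettyChecks() implementation, whose signature and output are not shown in this document; `formatChecksSketch` and the PASS/FAIL labels are invented for the example.

```typescript
// Standard ANSI SGR escape sequences.
const GREEN = "\x1b[32m";
const RED = "\x1b[31m";
const RESET = "\x1b[0m";

interface CheckSummary {
  name: string;
  status: string;
  description: string;
}

// Hypothetical formatter: one line per check, colored by status.
function formatChecksSketch(checks: CheckSummary[], useColor = true): string {
  return checks
    .map((check) => {
      const passed = check.status === "SUCCESS";
      const mark = passed ? "PASS" : "FAIL";
      const label = useColor ? `${passed ? GREEN : RED}${mark}${RESET}` : mark;
      return `${label} ${check.name}: ${check.description}`;
    })
    .join("\n");
}
```

The `useColor` switch is a convenience for piping output to files or CI logs that do not render escape codes.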
- Test Suites - Group multiple scenarios for convenience
- Configurable Timeouts - Per-scenario timeout configuration
- Parallel Execution - Run multiple scenarios concurrently
- Custom Servers - Allow custom MCP server implementations for advanced scenarios
- Report Formats - HTML, Markdown, or other human-readable formats
- Watch Mode - Automatically re-run tests on file changes
- Coverage Tracking - Track which parts of the spec are tested