This document provides a comprehensive overview of the code-executor-mcp project for Gemini, including its purpose, architecture, and development conventions.
code-executor-mcp is a security-focused proxy server built with TypeScript and Node.js. It operates within the Model Context Protocol (MCP) ecosystem.
Its primary purpose is to solve the "context exhaustion" problem that occurs when AI models are given access to a large number of tools. Instead of exposing dozens of tools (consuming vast amounts of tokens), this server exposes only two primary tools: executeTypescript and executePython.
The AI model can then request the execution of code, and within that secure, sandboxed environment, the code can dynamically discover and call any number of other MCP tools (like filesystem, git, web browsers, etc.). This "progressive disclosure" mechanism reduces initial token load by up to 98%, enabling complex, multi-tool workflows that would otherwise be impossible.
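To make the discover-then-call flow concrete, here is a minimal sketch of what model-written code might do inside the sandbox. The helper names discoverMCPTools and callMCPTool come from this project, but their signatures, the ToolSummary shape, and the tool names in the stub registry are assumptions made for illustration. The stubs stand in for the helpers the sandbox injects so that the example runs on its own.

```typescript
// Assumed shape of a discovered tool; the real shape is not shown in this document.
interface ToolSummary {
  name: string;
  description: string;
}

// Hypothetical in-memory registry standing in for the configured MCP servers.
const fakeRegistry: Record<string, (args: Record<string, unknown>) => unknown> = {
  "filesystem.read_file": (args) => `contents of ${String(args.path)}`,
  "git.status": () => "clean",
};

// Stub for the injected discovery helper (signature is an assumption).
async function discoverMCPTools(query?: string): Promise<ToolSummary[]> {
  return Object.keys(fakeRegistry)
    .filter((name) => (query ? name.includes(query) : true))
    .map((name) => ({ name, description: `stub for ${name}` }));
}

// Stub for the injected call helper (signature is an assumption).
async function callMCPTool(name: string, args: Record<string, unknown>): Promise<unknown> {
  const tool = fakeRegistry[name];
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool(args);
}

// Sandbox-side code: discover only the tools that are needed, then call one.
async function readFirstMatchingFile(path: string): Promise<unknown> {
  const tools = await discoverMCPTools("filesystem");
  const reader = tools.find((t) => t.name.endsWith("read_file"));
  if (!reader) throw new Error("No file-reading tool available");
  return callMCPTool(reader.name, { path });
}
```

The point of the pattern is that tool descriptions are fetched on demand from inside the sandbox rather than being loaded into the model's context up front.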
- Language: TypeScript (strict mode)
- Platform: Node.js (v22.0.0+)
- Module System: ES Modules ("type": "module")
- Sandboxing: Deno (TypeScript) and Pyodide (Python)
- Testing: Vitest for unit and integration testing.
- Linting: ESLint with TypeScript-specific rules.
- Schema Validation: AJV and Zod for robust validation of tool inputs.
The core of the project is the CodeExecutorServer class (src/index.ts), which sets up an MCP server that communicates over stdin/stdout.
- Server Initialization: The server starts, loads configuration from .mcp.json files, and checks for dependencies like the Deno runtime.
- Tool Registration: It registers the executeTypescript and executePython tools. The Python tool includes a crucial security gate (PYTHON_SANDBOX_READY) to prevent use of the older, insecure implementation.
- Request Handling: When the server receives a request to execute code:
  a. Rate Limiting: The request is checked against a rate limiter.
  b. Validation: The input is validated against a Zod schema.
  c. Security Checks: The code and its requested permissions are passed through a SecurityValidator, which checks for dangerous patterns, validates tool allowlists, and ensures path traversal protection.
  d. Connection Pooling: The request is handed to a ConnectionPool to manage concurrency.
  e. Sandboxed Execution: The code is executed in the appropriate sandbox (Deno or Pyodide). The sandbox environment has helper functions like callMCPTool and discoverMCPTools injected into its scope.
  f. Tool Orchestration: From within the sandbox, callMCPTool calls are routed through the MCPClientPool, which manages connections to all other configured MCP servers.
  g. Auditing: An audit log is written upon completion.
- Graceful Shutdown: The server listens for SIGINT/SIGTERM signals to shut down gracefully, allowing in-flight requests to complete.
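The first three request-handling steps (rate limiting, validation, security checks) can be sketched as a chain of small gates, where the first failing gate rejects the request. This is an illustrative sketch, not the project's implementation: the class and function names, the request fields, the dangerous-pattern list, and the allowlist entries are all hypothetical, and a hand-written type check stands in for the Zod schema so that the example has no dependencies.

```typescript
interface ExecuteRequest {
  code: string;
  allowedTools: string[];
}

type GateResult =
  | { ok: true }
  | { ok: false; stage: "rate-limit" | "validation" | "security"; reason: string };

// Hypothetical fixed-window limiter: at most maxRequests per windowMs.
class FixedWindowRateLimiter {
  private windowStart = Number.NEGATIVE_INFINITY;
  private count = 0;
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  tryAcquire(now: number): boolean {
    if (now - this.windowStart >= this.windowMs) {
      this.windowStart = now; // start a fresh window
      this.count = 0;
    }
    if (this.count >= this.maxRequests) return false;
    this.count += 1;
    return true;
  }
}

// Stand-in for schema validation: returns a typed request or null.
function parseRequest(input: unknown): ExecuteRequest | null {
  if (typeof input !== "object" || input === null) return null;
  const { code, allowedTools } = input as Record<string, unknown>;
  if (typeof code !== "string" || code.length === 0) return null;
  if (!Array.isArray(allowedTools)) return null;
  if (!allowedTools.every((t) => typeof t === "string")) return null;
  return { code, allowedTools: allowedTools as string[] };
}

// Illustrative values only; the real patterns and allowlist are not shown here.
const DANGEROUS_PATTERNS: RegExp[] = [/\beval\s*\(/, /\bnew\s+Function\s*\(/];
const SERVER_ALLOWLIST = new Set(["filesystem.read_file", "git.status"]);

// Returns a description of the first violation, or null if the request is clean.
function findSecurityViolation(req: ExecuteRequest): string | null {
  for (const pattern of DANGEROUS_PATTERNS) {
    if (pattern.test(req.code)) return `dangerous pattern: ${pattern.source}`;
  }
  const denied = req.allowedTools.filter((t) => !SERVER_ALLOWLIST.has(t));
  return denied.length > 0 ? `tools not allowlisted: ${denied.join(", ")}` : null;
}

// Runs the gates in the documented order: rate limit, validation, security.
function admitRequest(
  input: unknown,
  limiter: FixedWindowRateLimiter,
  now: number = Date.now(),
): GateResult {
  if (!limiter.tryAcquire(now)) {
    return { ok: false, stage: "rate-limit", reason: "too many requests" };
  }
  const request = parseRequest(input);
  if (request === null) {
    return { ok: false, stage: "validation", reason: "malformed request" };
  }
  const violation = findSecurityViolation(request);
  if (violation !== null) {
    return { ok: false, stage: "security", reason: violation };
  }
  return { ok: true };
}
```

Running the cheap rate-limit gate first means rejected requests never reach the more expensive validation and security checks. A request that passes all three gates would then be handed on to pooling and sandboxed execution.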
The project uses npm for dependency management and scripts.
- Install Dependencies:
  npm install
- Build (Compile TypeScript):
  npm run build
  (Source in src/ is compiled to dist/)
- Run Tests:
  npm test
- Run Tests in Watch Mode:
  npm run test:watch
- Run Linting:
  npm run lint
- Run Type Checking:
  npm run typecheck
- Run the Server (for development): This command builds the project first, then starts the server.
  npm run server
- Code Style: The project follows standard TypeScript best practices, enforced by ESLint and Prettier. The configuration can be found in eslint.config.mjs.
- Testing:
  - Tests live in the tests/ directory and use the .test.ts extension.
  - The project uses vitest.
  - Tests are comprehensive, covering unit, integration, and edge cases. Mocking is used extensively (vi.fn()) to isolate components.
  - Test names are descriptive (e.g., should_completeWithin500ms_when_discoverMCPToolsCalled).
  - Many tests are linked directly to User Stories (e.g., "US6") or bug reports in comments, providing excellent context.
- Commits & PRs: While not explicitly defined in the browsed files, the high quality of the code and tests suggests a convention of well-tested, focused PRs.
- Error Handling: The code makes extensive use of try...catch blocks and formats errors consistently using formatErrorResponse. It distinguishes between different error types (VALIDATION, EXECUTION).
- Security: Security is a primary concern. This is evident from:
  - The secure-by-default design (e.g., the PYTHON_SANDBOX_READY gate).
  - Multiple layers of validation (Zod, AJV, custom security validator).
  - Explicit sandboxing with Deno and Pyodide.
  - Detailed audit logging.
  - Graceful handling of failures.
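As a rough illustration of the error-handling convention described above, the sketch below wraps a unit of work in try...catch and funnels every failure through a single formatter. The names formatErrorResponseSketch and runGuarded, and both response shapes, are assumptions made for this example (the real formatErrorResponse signature is not shown here). Only the VALIDATION and EXECUTION type names come from the project.

```typescript
type ErrorType = "VALIDATION" | "EXECUTION";

// Assumed response shapes, for illustration only.
interface ErrorResponse {
  isError: true;
  errorType: ErrorType;
  message: string;
}

interface SuccessResponse {
  isError: false;
  output: string;
}

// Hypothetical formatter: normalizes any thrown value into one response shape.
function formatErrorResponseSketch(errorType: ErrorType, error: unknown): ErrorResponse {
  const message = error instanceof Error ? error.message : String(error);
  return { isError: true, errorType, message };
}

// Rejects bad input up front (VALIDATION) and catches runtime failures (EXECUTION).
function runGuarded(
  code: unknown,
  execute: (code: string) => string,
): SuccessResponse | ErrorResponse {
  if (typeof code !== "string" || code.trim() === "") {
    return formatErrorResponseSketch("VALIDATION", "code must be a non-empty string");
  }
  try {
    return { isError: false, output: execute(code) };
  } catch (error) {
    return formatErrorResponseSketch("EXECUTION", error);
  }
}
```

Keeping the error type explicit lets a caller tell a request it should fix (VALIDATION) apart from code that failed while running (EXECUTION).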