fix(mcp): sanitize tool input schemas into Moonshot Flavored JSON Schema #808

Open
grandmaster451 wants to merge 2 commits into MoonshotAI:main from grandmaster451:fix/mcp-schema-sanitize-792

@grandmaster451

Problem

MCP servers advertise tool input schemas as standard JSON Schema, which permits properties that omit the type keyword and freely uses combinators (anyOf, oneOf, allOf) and $ref indirection. Moonshot's API validator is stricter — every property must carry an explicit type, and unresolved $ref pointers are rejected.

Without sanitization the API returns HTTP 400:

tools.function.parameters is not a valid moonshot flavored json schema
details: <At path 'properties.pageSetup.properties.size.anyOf.items': items must be an object>

This is a regression from the Python-based kimi-cli, whose kosong abstraction layer contained explicit schema interceptors (fix(kosong/kimi): fill in missing JSON Schema type for MCP tool parameters).

Closes #792.

Root Cause

assertMcpInputSchema() in packages/agent-core/src/mcp/types.ts validates only that the schema is a JSON object — it passes the raw schema through to the wire without any normalization. The connectAndDiscoverTools() method in connection-manager.ts calls it on every MCP tool discovered:

parameters: assertMcpInputSchema(mcpTool.name, mcpTool.inputSchema),
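A minimal sketch of the pass-through behavior described above. `assertObjectOnly` is a hypothetical stand-in, not the actual `assertMcpInputSchema()` source; it models only the "is it a JSON object?" check, so a property without `type` reaches the wire untouched.

```typescript
// Hypothetical stand-in for the pass-through check described above.
// The real assertMcpInputSchema() may differ; only its observable behavior is modeled.
type JsonRecord = Record<string, unknown>;

function assertObjectOnly(toolName: string, schema: unknown): JsonRecord {
  if (typeof schema !== 'object' || schema === null || Array.isArray(schema)) {
    throw new Error(`MCP tool "${toolName}" advertised a non-object input schema`);
  }
  // Returned unchanged: no type filling, no $ref resolution.
  return schema as JsonRecord;
}

// Valid JSON Schema, but the `verbose` property omits `type`.
const raw = { type: 'object', properties: { verbose: { enum: [true, false] } } };
const passed = assertObjectOnly('demo_tool', raw);
console.log(passed === raw, 'type' in raw.properties.verbose);
```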

Solution

Add a sanitization layer (packages/agent-core/src/mcp/schema-sanitize.ts) that ports the original kosong interceptor from Python (kosong/utils/jsonschema.py). The sanitizeMcpSchema() function:

  1. Resolves local $ref pointers (#/$defs/...) inline, then strips the $defs/definitions buckets.
  2. Fills in missing type on every property schema:
    • Inferred from enum/const values (e.g. [true, false] -> "boolean")
    • Inferred from structural keywords (properties -> "object", items -> "array", pattern -> "string", minimum -> "number")
    • Defaults to "string" when no hints are present
  3. Leaves combinator branches alone (anyOf/oneOf/allOf/not/if/then/else/$ref) since they legitimately describe shape without type.

The function is wired into connectAndDiscoverTools() so every MCP tool schema is sanitized before reaching the API.
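The type-filling rules in step 2 can be sketched roughly as follows. This is an illustration under assumptions, not the code in schema-sanitize.ts: the names `fillTypes`, `fillProperty`, and `inferType` are hypothetical, and the rule that enum/const values decide the type only when they all agree is a guess at how mixed enums are handled.

```typescript
// Sketch of the type-filling rules; all names are illustrative, not the PR's code.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonRecord = { [key: string]: Json };

// Keywords that describe shape without `type`; nodes carrying them are left alone.
const COMBINATORS = ['anyOf', 'oneOf', 'allOf', 'not', 'if', 'then', 'else', '$ref'];

function isRecord(value: Json | undefined): value is JsonRecord {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function jsonTypeOf(value: Json): string {
  if (value === null) return 'null';
  if (Array.isArray(value)) return 'array';
  return typeof value; // 'boolean' | 'number' | 'string' | 'object'
}

function inferType(schema: JsonRecord): string {
  const samples: Json[] = [];
  const enumValues = schema['enum'];
  const constValue = schema['const'];
  if (Array.isArray(enumValues)) samples.push(...enumValues);
  else if (constValue !== undefined) samples.push(constValue);
  const kinds = Array.from(new Set(samples.map((v) => jsonTypeOf(v))));
  const only = kinds[0];
  // Assumption: enum/const values decide the type only when they all agree.
  if (kinds.length === 1 && only !== undefined) return only;
  if ('properties' in schema) return 'object';
  if ('items' in schema) return 'array';
  if ('pattern' in schema) return 'string';
  if ('minimum' in schema) return 'number';
  return 'string'; // no hints at all
}

function fillProperty(prop: JsonRecord): JsonRecord {
  const out = fillTypes(prop); // recurse into nested properties/items first
  const hasCombinator = COMBINATORS.some((key) => key in out);
  if (!('type' in out) && !hasCombinator) out['type'] = inferType(out);
  return out;
}

function fillTypes(schema: JsonRecord): JsonRecord {
  const out: JsonRecord = { ...schema };
  const props = out['properties'];
  if (isRecord(props)) {
    const filled: JsonRecord = {};
    for (const [name, prop] of Object.entries(props)) {
      filled[name] = isRecord(prop) ? fillProperty(prop) : prop;
    }
    out['properties'] = filled;
  }
  const items = out['items'];
  if (isRecord(items)) out['items'] = fillProperty(items);
  return out;
}

const demo = fillTypes({
  type: 'object',
  properties: {
    flag: { enum: [true, false] },
    tags: { items: {} },
    note: { description: 'free text' },
    size: { anyOf: [{ type: 'string' }, { type: 'number' }] },
  },
});
console.log(JSON.stringify(demo, null, 2));
```

In this sketch `size` keeps its `anyOf` without a `type`, matching step 3, while `flag`, `tags`, and `note` each receive one.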

Testing

  • 23 new tests in test/mcp/schema-sanitize.test.ts.
  • All 170 existing MCP tests still pass (28 connection-manager + 142 others).
  • oxlint --type-aware: 0 warnings, 0 errors.
  • tsc --noEmit: clean.

Files Changed

| File | Change |
| --- | --- |
| packages/agent-core/src/mcp/schema-sanitize.ts | New: sanitization module (port of kosong jsonschema.py) |
| packages/agent-core/src/mcp/connection-manager.ts | Wire sanitizeMcpSchema() into connectAndDiscoverTools() |
| packages/agent-core/src/mcp/index.ts | Export new module |
| packages/agent-core/test/mcp/schema-sanitize.test.ts | New: 23 tests |
| .changeset/mfjs-schema-sanitize.md | Changeset (patch) |

MCP servers advertise tool input schemas as standard JSON Schema, which
permits properties that omit the `type` keyword and freely uses
combinators (`anyOf`, `oneOf`, `allOf`) and `$ref` indirection.
Moonshot's API validator is stricter — every property must carry an
explicit `type`, and unresolved `$ref` pointers are rejected.

Without sanitization the API returns HTTP 400:
  tools.function.parameters is not a valid moonshot flavored json schema

This adds a sanitization layer (ported from the Python kimi-cli's
kosong/utils/jsonschema.py) that:
1. Resolves local `$ref` pointers and strips definition buckets
2. Fills in missing `type` on every property schema — inferred from
   enum/const values, structural keywords, or defaulting to "string"

Combinator branches (anyOf/oneOf/allOf/not/if/then/else/$ref) are left
alone since they legitimately describe shape without type.

Closes MoonshotAI#792

changeset-bot Bot commented Jun 16, 2026

🦋 Changeset detected

Latest commit: fc57b77

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

| Name | Type |
| --- | --- |
| @moonshot-ai/kimi-code | Patch |

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dff6b44971

if (typeof record['$ref'] === 'string') {
  const ref = record['$ref'];
  if (ref.startsWith('#')) {
    const target = traverse(resolvePointer(ref));

P2: Guard recursive refs before dereferencing

When an MCP tool schema contains a recursive local reference, such as a Node definition whose child items point back to #/$defs/Node, this calls traverse() on the referenced target without any visited set, so discovery recurses until a RangeError and connectAndDiscoverTools() marks the whole server failed. This regresses the preexisting packages/kosong/src/providers/kimi-schema.ts behavior, which preserves cyclic refs and keeps the needed definition bucket instead of crashing startup.
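One way to guard against this, sketched under assumptions: track the set of pointers currently being expanded, and when a pointer is seen again, keep the `$ref` in place and retain the definition buckets instead of recursing. `derefWithCycleGuard` is a hypothetical name, and pointer escaping is deliberately ignored here to keep the example short.

```typescript
// Sketch only: cycle-safe inlining of local $refs. Names are illustrative.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonRecord = { [key: string]: Json };

function isRecord(value: Json | undefined): value is JsonRecord {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function derefWithCycleGuard(schema: JsonRecord): JsonRecord {
  const active = new Set<string>(); // pointers currently being expanded
  let sawCycle = false;

  // Simplified lookup: handles "#/a/b" only, without ~0/~1 decoding.
  function lookup(ref: string): Json | undefined {
    let node: Json | undefined = schema;
    for (const token of ref.slice(2).split('/')) {
      if (!isRecord(node)) return undefined;
      node = node[token];
    }
    return node;
  }

  function traverse(node: Json): Json {
    if (Array.isArray(node)) return node.map((item) => traverse(item));
    if (!isRecord(node)) return node;
    const ref = node['$ref'];
    if (typeof ref === 'string' && ref.startsWith('#/')) {
      if (active.has(ref)) {
        sawCycle = true; // already expanding this pointer: keep the $ref as-is
        return { ...node };
      }
      const target = lookup(ref);
      if (!isRecord(target)) throw new Error(`Unresolvable local $ref: ${ref}`);
      active.add(ref);
      const expanded = traverse(target) as JsonRecord;
      active.delete(ref);
      const { $ref: _ref, ...rest } = node;
      return { ...expanded, ...rest };
    }
    const out: JsonRecord = {};
    for (const [key, value] of Object.entries(node)) out[key] = traverse(value);
    return out;
  }

  const result = traverse(schema) as JsonRecord;
  if (!sawCycle) {
    // Only safe to drop the buckets when no $ref was left behind.
    delete result['$defs'];
    delete result['definitions'];
  }
  return result;
}

const tree = {
  type: 'object',
  properties: { root: { $ref: '#/$defs/Node' } },
  $defs: {
    Node: {
      type: 'object',
      properties: { children: { type: 'array', items: { $ref: '#/$defs/Node' } } },
    },
  },
};
const treeOut = derefWithCycleGuard(tree);
console.log(JSON.stringify(treeOut, null, 2));
```

The recursive `Node` is expanded once, the inner `items` keeps its `$ref`, and `$defs` is retained so that reference still resolves.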

  throw new Error('Local $ref must resolve to a JSON object');
}
const { $ref: _, ...rest } = record;
return { ...(target as JsonRecord), ...rest };

P2: Traverse $ref siblings before dropping definitions

If a $ref node has sibling schema fields that contain their own local refs, for example a local properties override with { address: { $ref: '#/$defs/Address' } }, rest is merged back without being traversed and the top-level $defs are deleted afterward. That leaves a dangling $ref in the sanitized schema, so the later provider normalizer cannot resolve it and the API can still reject the MCP tool schema; the existing packages/kosong dereferencer recursively resolves sibling values before merging them.
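The sibling case can be illustrated with a small sketch that resolves every sibling field before merging it over the dereferenced target. `inlineLocalRefs` is a hypothetical name; the sketch assumes an acyclic schema and skips pointer escaping, so it shows only the ordering concern raised here.

```typescript
// Sketch only: resolve $ref siblings before merging. Assumes no recursive refs.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonRecord = { [key: string]: Json };

function isRecord(value: Json | undefined): value is JsonRecord {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function inlineLocalRefs(schema: JsonRecord): JsonRecord {
  // Simplified lookup: handles "#/a/b" only, without ~0/~1 decoding.
  function lookup(ref: string): Json | undefined {
    let node: Json | undefined = schema;
    for (const token of ref.slice(2).split('/')) {
      if (!isRecord(node)) return undefined;
      node = node[token];
    }
    return node;
  }

  function traverse(node: Json): Json {
    if (Array.isArray(node)) return node.map((item) => traverse(item));
    if (!isRecord(node)) return node;
    // Resolve every sibling field first, so refs nested in them are inlined too.
    const siblings: JsonRecord = {};
    for (const [key, value] of Object.entries(node)) {
      if (key !== '$ref') siblings[key] = traverse(value);
    }
    const ref = node['$ref'];
    if (ref === undefined) return siblings;
    if (typeof ref !== 'string' || !ref.startsWith('#/')) return { ...siblings, $ref: ref };
    const target = lookup(ref);
    if (!isRecord(target)) throw new Error(`Unresolvable local $ref: ${ref}`);
    // Sibling fields override the referenced target, as in the reviewed snippet.
    return { ...(traverse(target) as JsonRecord), ...siblings };
  }

  const out = traverse(schema) as JsonRecord;
  delete out['$defs']; // safe here because nothing is left pointing at the buckets
  delete out['definitions'];
  return out;
}

const inlined = inlineLocalRefs({
  type: 'object',
  properties: {
    person: {
      $ref: '#/$defs/Person',
      properties: { address: { $ref: '#/$defs/Address' } },
    },
  },
  $defs: {
    Person: { type: 'object', properties: { name: { type: 'string' } } },
    Address: { type: 'object', properties: { city: { type: 'string' } } },
  },
});
console.log(JSON.stringify(inlined, null, 2));
```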

function derefJsonSchema(schema: JsonRecord): JsonRecord {
  const root = structuredClone(schema);

  function resolvePointer(pointer: string): Json {

P2: Decode JSON Pointer tokens before lookup

Valid local $ref pointers use JSON Pointer escaping, so a $defs key like a/b is referenced as #/$defs/a~1b; splitting the path without unescaping looks for the literal a~1b, throws during MCP discovery, and disables the server even though the schema is valid. The provider-side dereferencer already handles ~1 and ~0, so this earlier sanitizer should do the same or reuse that implementation.
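A short sketch of the decoding step, following the JSON Pointer rule of replacing `~1` before `~0`. The helper names `decodePointerToken` and `resolveLocalPointer` are illustrative rather than taken from the PR, and the lookup walks object keys only (array indices are not handled).

```typescript
// Sketch only: JSON Pointer token decoding for local $ref lookup.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonRecord = { [key: string]: Json };

function isRecord(value: Json | undefined): value is JsonRecord {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function decodePointerToken(token: string): string {
  // Order matters: "~1" first, then "~0", so "~01" decodes to "~1" and not "/".
  return token.replace(/~1/g, '/').replace(/~0/g, '~');
}

function resolveLocalPointer(root: JsonRecord, ref: string): Json | undefined {
  if (ref === '#') return root;
  if (!ref.startsWith('#/')) return undefined; // not a local pointer
  let node: Json | undefined = root;
  for (const rawToken of ref.slice(2).split('/')) {
    if (!isRecord(node)) return undefined;
    node = node[decodePointerToken(rawToken)];
  }
  return node;
}

const defsRoot: JsonRecord = {
  $defs: {
    'a/b': { type: 'string' },
    'm~n': { type: 'number' },
  },
};
console.log(JSON.stringify(resolveLocalPointer(defsRoot, '#/$defs/a~1b')));
console.log(JSON.stringify(resolveLocalPointer(defsRoot, '#/$defs/m~0n')));
```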

…SON pointer

- Guard recursive local $refs with a visited set to avoid RangeError;
  preserve the $ref and keep $defs when a cycle is detected.
- Traverse sibling schema fields of a $ref so their own local refs are
  resolved instead of being left dangling after $defs deletion.
- Decode JSON Pointer escape sequences (~1 -> /, ~0 -> ~) when resolving
  local pointer paths.

Addresses Codex review feedback.

Development

Successfully merging this pull request may close these issues.

MCP tool schema validation fails with HTTP 400 ("moonshot flavored json schema") due to missing schema sanitization in new CLI