feat(core): implement skills-over-MCP SEP extension by olaservo · Pull Request #1 · olaservo/gemini-cli

olaservo · 2026-04-21T12:40:58Z

Draft PR against the fork's main — not proposed to upstream yet.

What this adds

Host support for the Skills-over-MCP SEP (io.modelcontextprotocol/skills). MCP servers can publish skills at skill://index.json (or any <scheme>:// the server chooses); the host discovers them, surfaces them through the existing SkillManager, and lazy-loads the SKILL.md body on activation via resources/read.

Filesystem and MCP-served skills share one namespace, one /skills list view, and one activate-skill tool. A <location> starting with skill:// is the only difference in the model's context.

Where the logic lives

packages/core/src/skills/mcpSkillDiscovery.ts — new. Reads skill://index.json, validates entries, applies name/description sanitization, returns SkillDefinition[] with source: 'mcp' and empty body. declaresSkillsExtension() checks both extensions (SEP-2133) and experimental capability slots.
packages/core/src/skills/skillLoader.ts — SkillDefinition gains optional source?: 'mcp' and mcp?: { serverName, skillUri }. Filesystem skills leave them unset.
packages/core/src/skills/skillManager.ts — new setSkillsForSource(tag, skills) channel for dynamic providers. Two backing stores: localSkills (filesystem) and sourcedSkills: Map<tag, skills[]> (keyed by mcp:<serverName>). Filesystem wins on name collision; first MCP source wins among MCPs; each shadowed skill emits one warning.
packages/core/src/tools/mcp-client.ts — discoverSkills() slots into discoverInto() after tool/prompt/resource discovery. Gates: admin-enabled, folder-trust, per-server skillsEnabled, resources-or-extension capability. Every short-circuit calls setSkillsForSource(tag, []) so config toggles take effect mid-session. disconnect() clears the slot too.
packages/core/src/tools/activate-skill.ts — MCP branch fetches the body via resources/read, enforces a 256 KiB cap, rejects frontmatter name drift, XML-escapes every server-derived string, and wraps the body in <instructions trust="untrusted">. Does not call WorkspaceContext.addDirectory() (the filesystem-branch affordance) — granting filesystem read scope based on server-controlled content would widen the agent's blast radius on every activation.
packages/core/src/tools/mcp-client-manager.ts — findResourceByUri falls back to raw-URI match across all servers. Lets the model emit read_mcp_resource("skill://foo/references/GUIDE.md") with no server-name argument; the host resolves it from the registry.
packages/cli/src/ui/… — /skills list tags MCP skills [mcp:<serverName>]; /mcp list <server> grows a Skills section.
packages/cli/src/config/settingsSchema.ts — per-server skillsEnabled, includeSkills, excludeSkills. Merged in mcp-client-manager.ts:mergeMcpConfigs with case-insensitive intersection (include) and union (exclude).

Key design choices

Discovery is ungated by capability. The TS MCP SDK's ServerCapabilitiesSchema strips unknown extensions keys during parse, so gating on capabilities.extensions["io.modelcontextprotocol/skills"] would silently hide every conforming server using the SEP-2133 slot. declaresSkillsExtension() is logged for observability; skill://index.json is probed on every MCP server regardless.
Stock read_mcp_resource tool, unchanged. Its description was left at its pre-SEP wording. Testing against the unchanged description stresses the SEP's design ("the tool description shouldn't need to know about skills") rather than host prompt engineering.
Scheme-agnostic everywhere. The SEP SHOULDs skill:// but permits any URI scheme. Accepted in discovery, registry, activation, and sibling enumeration.
Server name resolved host-side. ReadMcpResourceTool takes { uri } only; McpClientManager.findResourceByUri resolves the server from the ResourceRegistry fallback.

Trust boundaries

Server-supplied content (names, descriptions, SKILL.md body, sibling URIs) is treated as untrusted model input — XML-escaped on the way to the model, size-capped, control-bytes stripped, and wrapped in <instructions trust="untrusted">. Two non-obvious calls worth flagging:

WorkspaceContext.addDirectory(...) is skipped on the MCP branch, per the SEP's MUST against local-execution mechanisms from MCP skills without per-skill user opt-in.
Empty includeSkills is deny-all (matches the existing includeTools contract); cross-config intersection in mergeMcpConfigs is case-insensitive so two configs with intersecting allowlists don't produce [] that accidentally reads as "no allowlist configured."

Testing

Unit + subprocess integration: 128 tests pass across the touched and new files, including a subprocess-based end-to-end that spins up the in-repo template server and drives the full SEP pipeline.
Live HTTP smoke (no LLM): scripts/skills-e2e-smoke.js against olaservo/github-mcp-server @ 890ea55 — PASS.
Live agent activation (gemini-2.5-flash, non-interactive -p): 2/2 first-try successes. Server launched with --read-only, which filters pull_request_review_write and add_comment_to_pending_review out of tools/list entirely — yet the model named both tools (with the correct method: "submit_pending" argument value) in both runs. Those names only appear in the SKILL.md body, making the fingerprint unambiguous proof of skill activation.

Out of scope (follow-ups)

type: "mcp-resource-template" index entries + completion-API discovery UI
resources/subscribe for live index invalidation
Dedicated UI surface for per-skill enable/disable beyond settings.skills.disabled[] + per-server includeSkills/excludeSkills
Eager-to-filesystem materialization

Ports clients/fast-agent/scripts/skills_e2e_agent/agent.py to TypeScript so the Gemini CLI fork can reproduce Scenario #1 (PR-review via the bundled pull-requests skill) end-to-end. Spawns bundle/gemini.js in non-interactive mode with --output-format stream-json, writes an isolated .gemini/settings.json pointing at http://localhost:8082/mcp, then evaluates the same five pass/fail criteria as fast-agent: skill-read-before-write, create-pending, add-comment(s), submit-pending-with-verdict, no-single-shot-bypass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Host support for io.modelcontextprotocol/skills: MCP servers can publish skills at skill://index.json (or any <scheme>:// the server chooses), which gemini-cli discovers, surfaces through SkillManager, and lazy-loads on activation via resources/read. Design highlights - SkillDefinition gains optional source + mcp fields; MCP skills have an empty body until first activation. - SkillManager tracks filesystem (localSkills) and sourced (sourcedSkills) skills separately, so MCP skills stay visible in the merged view throughout async filesystem rescans. setSkillsForSource(tag, entries) gives dynamic providers an atomic register/replace/clear path keyed by a source tag; filesystem skills win on name collision (warning emitted), MCP-vs-MCP collisions are deduped first-source-wins. - McpClient.discoverSkills runs after tool/prompt/resource discovery, gated on isTrustedFolder() AND (capabilities.resources || the skills extension being declared). Tools-only servers no longer receive a stray readResource('skill://index.json') probe. - ActivateSkillTool branches on source === 'mcp': fetches SKILL.md via resources/read, validates frontmatter.name against the index-advertised name (rejects on drift), caches the body, renders <source>/<location> provenance, and enumerates sibling skill-root URIs from the ResourceRegistry. Workspace context is NOT expanded for MCP skills. - Cascade resources/list_changed to skill re-discovery so servers that publish skills post-connect get re-indexed. - read_mcp_resource resolves the serving MCP server from the URI via the registry fallback, so the model only needs the bare URI. Trust boundaries Server-supplied content is treated as untrusted model input, never first-party host text. Specifically: - Skill names must match /^[A-Za-z0-9][A-Za-z0-9._-]{0,63}$/ so server-supplied names cannot smuggle XML / z.enum / UI metacharacters into activated_skill XML, tool enum values, or /mcp output. - Descriptions capped at 500 chars, ASCII control bytes stripped; descriptions flow into model prompts verbatim. - Activated body is XML-escaped and wrapped as <instructions source="mcp:..." trust="untrusted"> so the model cannot be tricked by a </instructions> in the body to break out of the untrusted fence and impersonate host content. Same escape applied to serverName, skillUri, and sibling entries in the llmContent; markdown-escape applied to the activation-confirmation UI prompt so spoofed trust cues can't reach the user's decision point. - 256 KiB cap on fetched SKILL.md body before caching or inlining — the body lands in the model's context, so an unbounded server response is a memory and context-budget DoS vector. - Empty includeSkills == deny-all (matches the includeTools contract); cross-config intersection in mergeMcpConfigs is case-insensitive so two configs with intersecting allowlists don't produce [] that accidentally reads as "no allowlist configured." - discoverSkills short-circuits clear the source slot via setSkillsForSource(tag, []) so mid-session admin toggles, folder trust changes, and skillsEnabled=false flips don't leave already- registered skills in place until disconnect. Config / UI - MCPServerConfig gains skillsEnabled / includeSkills / excludeSkills (appended positionally to preserve source compat). includeSkills / excludeSkills matching is case-insensitive on both sides so 'Pull-Requests' in config matches a skill named 'pull-requests'. - /skills list tags MCP skills as [mcp:<serverName>]. - /mcp list gains a Skills section. Tests - mcpSkillDiscovery, activate-skill, skillManager, mcp-client, read-mcp-resource, and mcp-client-manager coverage (~1,077 test lines), including MCP-vs-MCP name collision dedup, setDisabledSkills applying to MCP-sourced skills, case-insensitive include/exclude filtering, admin gate, per-server skillsEnabled=false, capability gate, and disconnect cleanup. - Verified end-to-end against olaservo/github-mcp-server @ add-agent-skills (pull-requests skill's SKILL.md containing submit_pending + add_comment_to_pending_review). Reproduction harness lives at skills-over-mcp-ig/clients/gemini-cli-harness/. SDK compatibility note discoverSkills today accepts servers that declare the extension via capabilities.experimental, because the TS SDK strips unknown capabilities keys. When SEP-2133 lands in the SDK, declaresSkillsExtension() already checks both the extensions and experimental slots, so no host-side change will be needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…structions The <instructions> tag was carrying source="mcp:<serverName>" alongside trust="untrusted", but the same provenance is already in the sibling <source>MCP server: ...</source> element. Keeping one in-band marker (trust="untrusted") and letting <source> answer where-from.

olaservo force-pushed the experimental/skills-over-mcp branch from f7ba18c to 31b3ac6 Compare April 22, 2026 16:00

olaservo force-pushed the experimental/skills-over-mcp branch from 31b3ac6 to 38bdc0a Compare April 22, 2026 16:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(core): implement skills-over-MCP SEP extension#1

feat(core): implement skills-over-MCP SEP extension#1
olaservo wants to merge 2 commits intomainfrom
experimental/skills-over-mcp

olaservo commented Apr 21, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

olaservo commented Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this adds

Where the logic lives

Key design choices

Trust boundaries

Testing

Out of scope (follow-ups)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

olaservo commented Apr 21, 2026 •

edited

Loading