SEP for MCP-Native Skills #86

Open
helloeve wants to merge 2 commits into modelcontextprotocol:main from helloeve:skill-as-primitive

@helloeve

This PR introduces the MCP-Native Skills SEP, which makes Skill a first-class protocol primitive alongside Tool, Resource, and Prompt.

Agent Skills are a widely adopted pattern for addressing tool/context bloat through progressive disclosure. However, current implementations rely heavily on the filesystem and shell execution, raising portability and structural security concerns. Additionally, using resources/ to serve markdown-based skill files conflates the data plane with the control plane by forcing clients to parse text to discover required protocol primitives.

This SEP addresses these issues by bringing skills directly into the MCP protocol, preserving the gating and bundling benefits of progressive disclosure while keeping structural semantics properly typed and secure.

Key Additions

  • skills/list: Returns a cursor-paginated array of lightweight Skill metadata.
  • skills/activate: The explicit activation request that returns the full workflow bundle, including prompt-style instructions and expanded definitions for scoped primitives (tools, prompts, resources, and nested skills). Scoped primitives do not appear in top-level lists.
  • notifications/skills/list_changed: Standard push notification for capabilities updates.
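
To make the two requests concrete, here is a minimal TypeScript sketch of how `skills/list` and `skills/activate` might behave against an in-memory server. Every type and field name below (`SkillMetadata`, `ActivateSkillResult`, `instructions`, the offset-based cursor, `InMemorySkillServer`) is an assumption for illustration and is not taken from the SEP's schema.

```typescript
// Hypothetical shapes for illustration; the SEP's actual schema may differ.
interface Tool {
  name: string;
  inputSchema: { type: "object"; properties?: Record<string, unknown> };
}

// Lightweight entry returned by skills/list (assumed fields).
interface SkillMetadata {
  name: string;
  description: string;
}

interface ListSkillsResult {
  skills: SkillMetadata[];
  nextCursor?: string; // absent on the last page
}

// Full bundle returned by skills/activate (assumed fields).
interface ActivateSkillResult {
  instructions: string;
  tools: Tool[]; // complete definitions, including inputSchema
}

type SkillBundle = SkillMetadata & ActivateSkillResult;

class InMemorySkillServer {
  private readonly bundles: SkillBundle[];
  private readonly pageSize: number;

  constructor(bundles: SkillBundle[], pageSize = 2) {
    this.bundles = bundles;
    this.pageSize = pageSize;
  }

  // skills/list: metadata only; the cursor here is just a stringified offset.
  listSkills(cursor?: string): ListSkillsResult {
    const start = cursor === undefined ? 0 : Number.parseInt(cursor, 10);
    const end = Math.min(start + this.pageSize, this.bundles.length);
    const skills = this.bundles
      .slice(start, end)
      .map(({ name, description }) => ({ name, description }));
    return end < this.bundles.length
      ? { skills, nextCursor: String(end) }
      : { skills };
  }

  // skills/activate: returns the instructions plus the scoped tool definitions.
  activateSkill(name: string): ActivateSkillResult {
    const bundle = this.bundles.find((b) => b.name === name);
    if (bundle === undefined) throw new Error(`Unknown skill: ${name}`);
    return { instructions: bundle.instructions, tools: bundle.tools };
  }
}

const demoServer = new InMemorySkillServer([
  {
    name: "code-review",
    description: "Review a diff",
    instructions: "Read the diff, then call annotate_diff.",
    tools: [{ name: "annotate_diff", inputSchema: { type: "object" } }],
  },
  { name: "release", description: "Cut a release", instructions: "Tag, then publish.", tools: [] },
  { name: "triage", description: "Triage issues", instructions: "Label, then assign.", tools: [] },
]);
```

A client would page through `listSkills()` until `nextCursor` is absent and would only receive tool schemas after calling `activateSkill(...)`.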

Motivation & Rationale:

  • Context Bloat: Groups situational tools, prompts, and resources into a gated bundle. Clients avoid injecting dozens of tool schemas upfront and instead query for them contextually.
  • Control Plane vs Data Plane: Avoids overloading resources/ with conventions that require clients to extract capabilities and lifecycle hooks from parsed markdown.
  • Security & Portability: Eliminates the resources/read → decode → disk → shell-exec chain. Scoped tools are executed through standard tools/call invocations inside the server's trust boundary, relying on the standard MCP authorization UI loop without requiring host-side filesystem execution.
  • Better than Tool Search: While Tool Search is pull-based and scales linearly in overhead, Skills are push-based—surfacing unexpected but highly relevant tools directly into the session. Skills and search compose well together.

Backward Compatibility:

The skills capability is net-new and fully backward compatible. Clients that don't understand it can safely ignore it and continue using top-level capabilities.
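
As a rough illustration of the backward-compatibility claim, the sketch below shows a client choosing which list methods to call from the server's advertised capabilities. The `ServerCapabilities` shape is simplified, and the `skills` key and the `clientKnowsSkills` flag are assumptions, not names confirmed by the SEP.

```typescript
// Simplified capability shape; `skills` is the hypothetical key this SEP would add.
interface ServerCapabilities {
  tools?: { listChanged?: boolean };
  prompts?: { listChanged?: boolean };
  resources?: { listChanged?: boolean };
  skills?: { listChanged?: boolean };
}

// Returns the list methods a client would call, given what it understands.
function discoveryMethods(caps: ServerCapabilities, clientKnowsSkills: boolean): string[] {
  const methods: string[] = [];
  if (caps.tools !== undefined) methods.push("tools/list");
  if (caps.prompts !== undefined) methods.push("prompts/list");
  if (caps.resources !== undefined) methods.push("resources/list");
  // A client that predates the SEP never inspects `skills`, so nothing breaks.
  if (clientKnowsSkills && caps.skills !== undefined) methods.push("skills/list");
  return methods;
}

const advertised: ServerCapabilities = {
  tools: {},
  resources: {},
  skills: { listChanged: true },
};
```

A client without skills support keeps working against the top-level lists, though it would never see primitives that are scoped to a skill.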

* Every entry MUST be a complete primitive object with the fields a
* client needs to invoke it (e.g., `inputSchema` for Tools). Scoped
* primitives MUST NOT appear in top-level `tools/list`,
* `prompts/list`, or `resources/list`; they exist only within the
Contributor

> they exist only within the activated skill's scope.

What does this mean? What is the scope of a skill?

Comment on lines +321 to +323
primitive: they deliver content. Layering skills on resources forces the client
to parse that content to recover protocol-level semantics: which tools the skill
gates, which prompts it composes, when to activate, how to tie instructions back
Contributor

The resources proposal doesn't gate any tool/prompt/resource access at all. It's just a file, like any other agent skill.

to parse that content to recover protocol-level semantics: which tools the skill
gates, which prompts it composes, when to activate, how to tie instructions back
to callable primitives. Structure that should live in typed protocol fields ends
up hidden inside markdown every client has to run a parser against. This is the
Contributor

All skills-supporting clients already have this parser. Building on top of the existing agent skills integrations that clients have was the primary motivation of the resources approach: you don't need to do anything new. Modulo read_resource vs read_file, it all works the same and uses the same machinery.
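
A minimal sketch of that point, assuming a SKILL.md with flat `name` and `description` front matter: the parser and loader take any text reader, so a file read and a resource read are interchangeable. `ReadText`, `readerOver`, the two in-memory stores, and the `skill://` URI are hypothetical stand-ins, not real SDK calls.

```typescript
// Any async "give me the text at this location" function. Hypothetical stand-in
// for both a filesystem read and an MCP resources/read call.
type ReadText = (location: string) => Promise<string>;

interface ParsedSkill {
  name: string;
  description: string;
  body: string;
}

// Minimal front-matter parser: `key: value` lines between two `---` fences.
// Real clients likely use a full YAML parser; this handles flat string fields only.
function parseSkillMarkdown(text: string): ParsedSkill {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(text);
  if (match === null) throw new Error("SKILL.md is missing front matter");
  const fields = new Map<string, string>();
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const sep = line.indexOf(":");
    if (sep > 0) fields.set(line.slice(0, sep).trim(), line.slice(sep + 1).trim());
  }
  const name = fields.get("name");
  const description = fields.get("description");
  if (name === undefined || description === undefined) {
    throw new Error("SKILL.md front matter needs name and description");
  }
  return { name, description, body: (match[2] ?? "").trim() };
}

// The loader does not care where the text came from.
async function loadSkill(location: string, read: ReadText): Promise<ParsedSkill> {
  return parseSkillMarkdown(await read(location));
}

const SKILL_TEXT =
  "---\nname: code-review\ndescription: Review a diff\n---\nRead the diff first.\n";

// Two in-memory stand-ins: one keyed by path, one keyed by a made-up resource URI.
const fakeDisk = new Map([["skills/code-review/SKILL.md", SKILL_TEXT]]);
const fakeResources = new Map([["skill://code-review/SKILL.md", SKILL_TEXT]]);

function readerOver(store: Map<string, string>): ReadText {
  return async (location) => {
    const text = store.get(location);
    if (text === undefined) throw new Error(`Not found: ${location}`);
    return text;
  };
}
```

Calling `loadSkill` with `readerOver(fakeDisk)` or `readerOver(fakeResources)` yields the same `ParsedSkill`; only the reader differs.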

Comment on lines +464 to +468
against dictating *how* clients solve problems, but rejecting a skills
primitive on the grounds that "tool search exists" *is* a dictation: it forces
every client onto tool search as the only available path to progressive
discovery. Offering skills alongside the existing surface is what actually
leaves the choice to clients.
Contributor

Worth spending a bit of time here since I think it's effectively the crux of the difference between proposals.

First thing to call out: MCP does not force tool search as the only available path. There are many other paths and there are better ways than tool search. It simply says that it is the client's problem to manage its own context: MCP provides the list of tools, but the clients must decide how and when to surface them to the model.

To emphasize: this is not about progressive discovery vs tool search, it is about progressive discovery vs all other possible options the client may choose. It is highly unlikely that tool search or simple group-based progressive discovery is the optimal approach for information management.

My main objection to this proposal is that it is hard-coding progressive disclosure as the way to solve tool bloat by putting it directly into the protocol -- and not only that but it is also mandating precisely how progressive disclosure should work.

Suppose I'm using a server that has decided to use this proposal. It has 100 tools, but only 10 are returned in tools/list. The rest are hidden in skills. This is problematic for a number of reasons:

  1. If I (client end user) want to build my own local skill that makes use of those hidden tools then I can't: the protocol has decided for me that skills/activate is the only way you are allowed to discover tools.
  2. If I come up with some much more clever/efficient way of managing tools in context, I'm still forced to use skills because the protocol is hiding them from me.
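
The scenario above can be sketched roughly as follows. The `scopedTo` marker, the `tool_N` / `skill_N` names, and the helper functions are all hypothetical; the point is only that a client-side skill cannot resolve a scoped tool from `tools/list` alone.

```typescript
// `scopedTo` is a hypothetical marker for "only reachable via this skill".
interface ToolDef {
  name: string;
  scopedTo?: string;
}

// 100 tools: tool_0..tool_9 are top-level, the rest are split across skill_0..skill_8.
const catalog: ToolDef[] = Array.from({ length: 100 }, (_, i): ToolDef =>
  i < 10
    ? { name: `tool_${i}` }
    : { name: `tool_${i}`, scopedTo: `skill_${Math.floor((i - 10) / 10)}` },
);

// What tools/list would return under the proposal: unscoped tools only.
function toolsList(all: ToolDef[]): string[] {
  return all.filter((t) => t.scopedTo === undefined).map((t) => t.name);
}

// What activating one skill would add to the session.
function activate(all: ToolDef[], skill: string): string[] {
  return all.filter((t) => t.scopedTo === skill).map((t) => t.name);
}

// Tools a locally authored skill needs but the client cannot currently see.
function unresolved(needed: string[], visible: string[]): string[] {
  const seen = new Set(visible);
  return needed.filter((n) => !seen.has(n));
}

const localSkillNeeds = ["tool_3", "tool_42"];
```

With only `toolsList(catalog)` visible, `tool_42` stays unresolved; it becomes reachable only after `activate(catalog, "skill_3")`, which is the coupling the objection is about.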

--

Looking at this proposal, I see two things:

  1. Using a first-class primitive instead of resources for enumerating and fetching skills.
  2. Introducing primitive groups attached to skills.

(1) tbh is a reasonable thing to consider with trade-offs. I still lean towards resources because skills are just files and resources are meant to represent files. If we don't use resources then we lose a lot of nice resource functionality: lastModifiedTime, TTL (soon), icons, subscriptions, templates, etc. -- we can re-introduce those for skills, but that's the duplication that we want to avoid. If clients implement a resource caching mechanism then skills benefit for free.

I do agree that a new skills primitive would not be hard to implement, but it does require new spec. The resources approach requires no new spec since it is just convention + SDK sugar.

(2) is something that CMs already discussed and unanimously rejected. Maybe there is an argument to be made that primitive groups tied to skills are more appealing than primitive groups as a standalone concept, but I see that as less appealing if anything (it conflates two separate things).


> First thing to call out: MCP does not force tool search as the only available path. There are many other paths and there are better ways than tool search. It simply says that it is the client's problem to manage its own context: MCP provides the list of tools, but the clients must decide how and when to surface them to the model.

> To emphasize: this is not about progressive discovery vs tool search, it is about progressive discovery vs all other possible options the client may choose. It is highly unlikely that tool search or simple group-based progressive discovery is the optimal approach for information management.

Ultimately servers and clients are limited by the protocol in how they can communicate. If the protocol refuses to introduce any other mechanisms to communicate context on how to do progressive discovery (e.g. session state, grouping, etc.), we're fundamentally limiting it to the options that work without that shared context (which is currently tool search).

I think it's worth observing that both Skills and CLIs have a progressive discovery mechanism that is based around provider-provided context, but MCP servers are unable to provide something similar with the protocol in its current state.

> My main objection to this proposal is that it is hard-coding progressive disclosure as the way to solve tool bloat by putting it directly into the protocol -- and not only that but it is also mandating precisely how progressive disclosure should work.

I don't think it's mandating the way; I think it's enabling a way for servers to communicate to clients how to discover the tools. Servers could just not add skills if they didn't want to. Clients could still use tool search if they want to. If some other new progressive discovery mechanism becomes ubiquitous, we should probably support that too.


@chughtapan Apr 21, 2026

This reminds me of the exact discussion that arose for grouping #2084 (and CM feedback was documented here). The fundamental tradeoff seems to be:

  1. There is no way to implement progressive disclosure, except ToolSearch, without some standardization: clients cannot manage context without servers providing them with "some" information. And the protocol dictates what information is shared.

  2. The protocol does not want to mandate one specific scheme, especially in a fast-moving field like ours. The specific concern was that selecting one at this point makes it hard to reverse that decision and change it later.

I think the only compromise is resources. @cliffhall and I had sketched out how clients could implement hierarchical organization using resources end-to-end, which can mostly be ported here.

Comment on lines +436 to +441
- **Namespace overloading.** A server's `resources/list` becomes a mix of
content files and workflow bundles. Existing host features built around
resources (@-mentions, attachments, pinned context, resource subscriptions)
now have to reason about whether each resource is "a file the user might
attach" or "a skill envelope the agent might activate." A dedicated primitive
keeps these namespaces separate.
Contributor

No need to treat them differently. They are just files! If I type @ in Claude Code, it also shows me SKILL.md files. That's a good thing, because reading a file and activating a skill are two separate things.

Treating it as a file:

Does `@code-review/SKILL.md` look suspicious

Treating it as a skill:

/code-review
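
For what it's worth, a host could keep these two gestures apart with a few lines of input handling. This is only a sketch: the `UserIntent` type, `classifyInput`, and the exact `@path` and `/name` patterns are assumptions about one possible host, not behavior specified anywhere in this thread.

```typescript
// Hypothetical host-side intents; not part of MCP or any specific client.
type UserIntent =
  | { kind: "read"; path: string } // "@..." mention: attach the file as content
  | { kind: "activate"; skill: string } // "/name": activate the skill
  | { kind: "text" };

function classifyInput(input: string): UserIntent {
  const trimmed = input.trim();
  // A bare slash command activates a skill by name.
  const slash = /^\/([A-Za-z0-9_-]+)$/.exec(trimmed);
  if (slash !== null) return { kind: "activate", skill: slash[1] ?? "" };
  // An @-mention anywhere in the message reads the referenced file.
  const mention = /@(\S+)/.exec(trimmed);
  if (mention !== null) return { kind: "read", path: mention[1] ?? "" };
  return { kind: "text" };
}
```

The same SKILL.md can therefore show up under both paths without the host needing a separate namespace for it.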

Projects

Status: In review

5 participants