Skip to content

feat(skills): MCP-served Agent Skills per the skills-over-MCP SEP (WG PoC)#1

Draft
olaservo wants to merge 2 commits intomainfrom
mcp-skills-sep
Draft

feat(skills): MCP-served Agent Skills per the skills-over-MCP SEP (WG PoC)#1
olaservo wants to merge 2 commits intomainfrom
mcp-skills-sep

Conversation

@olaservo
Copy link
Copy Markdown
Owner

@olaservo olaservo commented Apr 21, 2026

Draft — Not for merge into this fork's main. Exists as a reviewable diff for the Skills-over-MCP SEP WG.

Host-side implementation of the pre-submission Skills-over-MCP SEP (io.modelcontextprotocol/skills). Skills served by connected MCP servers under skill:// — or any URI scheme the server chooses, per the SEP's SHOULD-not-MUST stance — are discovered at extension-connect time, surfaced into the per-turn system prompt alongside filesystem skills, and loaded on demand through Goose's existing load_skill tool.

Reference server: olaservo/github-mcp-server#add-agent-skills.

What the model sees

Goose already emits a "You have these skills..." bullet list from SkillsClient's static InitializeResult.instructions. This PR appends a second, parallel section per-turn via a new McpClientTrait::get_dynamic_instructions hook:

You also have these skills from connected MCP servers. Load them via load_skill by name; if a collision is shown in <server>__<name> form, use that exact form:
• pull-requests (github) - Submit a multi-comment GitHub pull request review using the pending-review workflow.

On FS-vs-MCP collision, FS wins and the MCP entry re-renders as <server>__<name>. MCP-vs-MCP collisions also render prefixed.

No new model-facing tool. The existing load_skill(name) handler handles both FS and MCP skills:

  • load_skill("pull-requests") — cache lookup, dispatches to ExtensionManager::read_resource.
  • load_skill("github__pull-requests") — disambiguation on collision; the model never supplies a server name.
  • load_skill("<skill>/references/GUIDE.md") — supporting-file load, scheme-agnostic composition against the cached base_uri.
  • URI input (e.g. skill://foo/SKILL.md) is rejected with a redirect to the pre-existing read_resource tool.

Discovery is not gated on the server advertising the capability — resources/read skill://index.json is attempted unconditionally; a not-found / transport error yields an empty cache entry and a debug log. The client also advertises io.modelcontextprotocol/skills: {} in its own ClientCapabilities.extensions.

The developer extension's file tools now reject <scheme>:// input with a redirect to read_resource — prevents a generic file reader from Path()-resolving skill://.../SKILL.md into a bogus filesystem path. The guard is scheme-agnostic and shares the RFC-3986 detector with load_skill's URI rejection.

Skill content from MCP servers is treated as untrusted model input per SEP §Security. Frontmatter parsing extracts only name and description — no hook / pre-post / shell-execution fields are honored.

Code tour

The discovery module is crates/goose/src/agents/platform_extensions/mcp_skills.rs — parses each server's skill://index.json into cached McpSkillEntry { server, name, description, base_uri, uri }, 5s fetch timeout, never propagates errors. The cache lives on Extension.mcp_skills in extension_manager.rs so it's scoped to the owning server's lifecycle; aggregated_mcp_skills() is the per-turn accessor. SkillsClient grows a Weak<ExtensionManager> and a new McpClientTrait::get_dynamic_instructions(session_id) hook (default-None) that renders the MCP section per reply with a 10s-TTL FS-names cache for collision detection. developer's file tools and load_skill's URI-rejection share an RFC-3986 detector in platform_extensions::mod.rs so the two guards can't drift apart. The client advertises io.modelcontextprotocol/skills: {} in ClientCapabilities.extensions via GooseClient::get_info. Server-side: SlashCommand.origin + SlashCommandsQuery.session_id on /config/slash_commands surface MCP skills to the desktop; SkillsView.tsx adds the MCP · <server> pill. The alternative ui/goose2/ surface is not updated.

The extensionmanager::read_resource tool is deliberately unchanged — the SEP's illustrative read_resource(server, uri) signature already matched.

Bugs surfaced by e2e validation

Empty-session-id permanent client lock. The first draft of ExtensionManager::add_extension / add_client called fetch_server_skills(..., session_id: "", ...) unconditionally. McpClient asserts a single session id across its lifetime — so the empty-string "first session" permanently locked the client, and every subsequent real tool call panicked. Fix: thread the existing session_id: Option<&str> into the fetch call; platform extensions that don't hit a real transport pass None (empty cache, repopulated on the next real-session call).

Platform extensions aren't auto-enabled in ad-hoc Agent builds. The desktop wires up the skills platform extension via config migrations. Ad-hoc Agent::with_config(...) builds — including e2e / scenario runners — do not auto-enable it. Without an explicit add_extension(ExtensionConfig::Platform { name: "skills", ... }), the cache populates but load_skill is never exposed; activation silently fails with the model calling whatever tool is available instead. Fix (in the harness, not production): add the platform extension explicitly after agent construction.

Running it locally

GITHUB_TOKEN=$(gh auth token) \
  cargo test -p goose --test mcp_skills_e2e \
    --no-default-features \
    --features "code-mode,aws-providers,telemetry,otel,rustls-tls" \
    mcp_skills_discovery -- --ignored --nocapture

LLM-in-the-loop e2e lives out-of-tree at skills-over-mcp-ig/clients/goose-harness/ — a sibling cargo crate that path-deps ../goose/crates/goose. Keeps this PR's diff narrow. Harness stands up a real Agent with an Anthropic provider and drives pr-review-input-validation.yaml.

Conformance vs SEP host requirements

Requirement Where
Read InitializeResult.capabilities.extensions and recognize io.modelcontextprotocol/skills server_declares_skills_capability in mcp_skills.rs
Enumerate skill:// resources from skill://index.json fetch_server_skills
Accept non-skill:// URI schemes on indexed entries base_uri derived by stripping /SKILL.md; scheme never inspected
Tolerate index-absent / malformed-index servers errors swallowed with debug!/warn!; registration still succeeds
resources/read both index and SKILL.md yes
Surface skill name + description in model context SkillsClient::get_dynamic_instructions
Provide a host tool to load skill content existing load_skill, unified FS + MCP
Resolve relative refs against the skill root load_skill("<skill>/<rel>") composes against base_uri
MUST NOT honor in-skill local-execution mechanisms without explicit opt-in frontmatter parser extracts only name + description
Indicate originating server when presenting a skill server name in prompt catalog; MCP · <server> pill in SkillsView
Guard generic file-reading tools against MCP URIs developer file tools reject <scheme>://
Client advertises skills-extension capability (optional) GooseClient::get_info inserts io.modelcontextprotocol/skills: {}

Out of scope (follow-ups)

  • type: "mcp-resource-template" entries + MCP completion-API integration for parameterized skills.
  • Per-server enable_skills toggle on ExtensionConfig::StreamableHttp / Stdio.
  • notifications/resources/list_changed → cache invalidation.
  • Mirroring the MCP-origin badge into ui/goose2/.

🤖 Generated with Claude Code

olaservo added a commit that referenced this pull request Apr 22, 2026
… reuses running server

New `examples/skills_e2e_scenario.rs` drives Scenario #1 end-to-end against the
shared YAML contract at `experiments/scenarios/pr-review-input-validation.yaml`.
Takes `--scenario <path>` (positional also accepted); connects to the MCP server
already running at the YAML's `mcp_server.endpoint` rather than spawning one.
Evaluates the same five criteria fast-agent's runner does (skill-read-before-write,
create-pending, add-comment(s), submit-pending-with-verdict, no-single-shot-bypass)
with Goose's criterion 1 keyed on `load_skill(name="pull-requests")`. Writes
`<scenario-dir>/../results/<iso>-<id>-goose-<model>.json`.

`tests/mcp_skills_e2e.rs` trimmed back to just `mcp_skills_discovery_and_load_against_real_server` —
the mechanical, sub-second discovery + load round-trip. LLM-in-the-loop activation
now lives in the example binary so the scenario path can be a real CLI arg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
olaservo and others added 2 commits April 22, 2026 21:06
… SEP

Implement the reference behavior defined in the skills-over-MCP SEP: a
Goose session can now discover and load Agent Skills hosted as MCP
resources under a server's `skill://index.json`, in addition to the
existing filesystem-backed skills.

Core pieces:
- `platform_extensions::mcp_skills` parses a server's skill index,
  caches `McpSkillEntry` records per-extension at connect time, and
  resolves supporting files by composing relative refs against each
  entry's `base_uri` with `..` / leading `/` rejection.
- `ExtensionManager` threads `session_id: Option<&str>` through
  `add_client` and `get_extensions_info` so MCP-skill discovery is
  scoped to the current session and skipped when no session exists.
  Cache population is factored into `populate_mcp_skills_cache` with
  the single-session-lock rationale living in one place.
- `platform_extensions::skills::load_skill` resolves literal names
  first, then composes supporting files; multi-text `read_resource`
  responses log a warn! with server/uri/text_count rather than
  silently dropping entries.
- `platform_extensions::looks_like_uri` centralizes RFC-3986 scheme
  detection; `developer::edit::reject_uri_path` reuses it so the
  filesystem edit/write/read tools refuse `skill://...` (and other URI
  schemes) instead of misinterpreting them as paths.
- Client advertises `io.modelcontextprotocol/skills` capability so
  servers can gate skill-related resources on the handshake.
- Desktop UI surfaces MCP-sourced skills with an "MCP - <origin>"
  badge via a new `origin` field on `SlashCommand`.

Known limitation: the per-extension MCP-skills cache is populated at
connect time and not invalidated on
`notifications/resources/list_changed` — see TODO at
`extension_manager.rs`. Left as follow-up; not expanding scope here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `crates/goose/tests/mcp_skills_e2e.rs` exercising MCP-served skill
discovery and loading against a real `github-mcp-server` instance.
Gates on `GITHUB_TOKEN` and the server binary being on PATH, picks an
ephemeral port, and cleans up the subprocess via Drop on a
`ServerHandle` guard.

The mechanical round-trip test
(`mcp_skills_discovery_and_load_against_real_server`) is sub-second
and runs without a model in the loop — suitable for upstream CI once
a `GITHUB_TOKEN` secret is wired up. LLM-in-the-loop scenario runs
live out-of-tree at `skills-over-mcp-ig/clients/goose-harness/`,
mirroring the `clients/gemini-cli-harness/` pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant