The MCP aggregator returns every tool from every connected MCP server, dot-namespaced as server.tool_name. That full list flows into each provider's _prepare_tools() and is sent on every request. For a multi-server setup this can consume tens of thousands of tokens in tool definitions before the model ever sees the user's message, and tool-selection accuracy degrades once the list grows past roughly 30–50 tools.
Anthropic shipped the tool search tool, which pairs with defer_loading: true on individual tool definitions (https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search). OpenAI's Responses API has similar patterns. fast-agent should grow a single abstraction for deferred tool loading and bridge to each provider's mechanism where one exists.
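A minimal sketch of what that abstraction could look like. Everything here is hypothetical: Tool, prepare_deferred_tools, always_load, and the threshold are illustrative names, not fast-agent or Anthropic APIs; the only detail taken from the docs above is the defer_loading flag on individual tool definitions.

```python
from dataclasses import dataclass

@dataclass
class Tool:
    name: str          # dot-namespaced as server.tool_name
    description: str
    input_schema: dict

def prepare_deferred_tools(tools, always_load=frozenset(), threshold=30):
    """Hypothetical bridge: mark tools for deferred loading when the list is large.

    Once the tool count exceeds `threshold`, tools whose server prefix is not
    in `always_load` get defer_loading=True, so (on providers that support it)
    their full definitions stay out of the context until the model searches
    for them. Other providers would simply ignore or strip the flag.
    """
    defer_all = len(tools) > threshold
    out = []
    for t in tools:
        server = t.name.split(".", 1)[0]  # recover the server from the namespace
        entry = {
            "name": t.name,
            "description": t.description,
            "input_schema": t.input_schema,
        }
        if defer_all and server not in always_load:
            entry["defer_loading"] = True  # provider-specific; assumed wire format
        out.append(entry)
    return out
```

Keeping the gate at the aggregator boundary, rather than inside each provider's _prepare_tools(), means the deferral policy is decided once and each provider only has to translate (or drop) the flag.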