Add RFC 0004: MCP Registry #12
There was a problem hiding this comment.
Pull request overview
This PR adds RFC 0004 proposing an MLflow-native “MCP Registry” design aligned with the upstream Model Context Protocol registry spec, including a proposed data model, API surface, and a reference database migration.
Changes:
- Adds an RFC document describing entities, lifecycle, schema, store interfaces, REST API, SDK, CLI, UI, and trace linking.
- Adds a reference Alembic migration that creates the proposed MCP registry tables and index.
Reviewed changes
Copilot reviewed 2 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| rfcs/0004-mcp-registry/0004-mcp-registry.md | Full design RFC for the MCP Registry feature (schema, APIs, SDK/CLI, UI, trace linking, adoption plan). |
| rfcs/0004-mcp-registry/alembic-migration.py | Reference Alembic migration implementing the proposed schema (tables for servers, versions, tags, aliases). |
| # Register an MCP server from a server.json payload. | ||
| # name and version are extracted from server_json. | ||
| # The parent MCPServer is auto-created if it doesn't exist. | ||
| version = mlflow.genai.register_mcp_server( |
There was a problem hiding this comment.
It is a bit unclear to me when users should create a new version. Usually the actual server implementation is managed on GitHub, and if it is distributed as a public package, it has a package version. Also, if the server is hosted somewhere, the runtime has a deployment version. Does the version defined in MLflow strictly tie to any of these, or does it add a new semantic version on top of them?
There was a problem hiding this comment.
The stored version is the canonical server_json["version"] provided by the publisher, not an additional MLflow-specific version. Deployment or runtime revisions belong in mutable runtime_metadata. Users register a new version when the canonical server definition changes, and update runtime_metadata when only the deployment of that server definition changes.
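To make the rule above concrete, here is a minimal sketch using a toy in-memory registry. `ToyRegistry`, `ToyVersion`, `register`, and `update_runtime_metadata` are hypothetical names chosen for illustration, not the proposed MLflow API; only `server_json`, `runtime_metadata`, and `endpoint_url` come from the discussion.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyVersion:
    """One registered version: a fixed definition plus mutable runtime metadata."""
    server_json: dict
    runtime_metadata: dict = field(default_factory=dict)


class ToyRegistry:
    """Hypothetical in-memory stand-in for the registry (not the MLflow API)."""

    def __init__(self):
        self._versions = {}  # (name, version) -> ToyVersion

    def register(self, server_json):
        # name and version come from the publisher's payload, not from MLflow.
        key = (server_json["name"], server_json["version"])
        if key in self._versions:
            raise ValueError(f"{key} is already registered")
        self._versions[key] = ToyVersion(copy.deepcopy(server_json))
        return key

    def update_runtime_metadata(self, name, version, **changes):
        # Deployment-only changes mutate runtime_metadata; no new version is created.
        self._versions[(name, version)].runtime_metadata.update(changes)

    def get(self, name, version):
        return self._versions[(name, version)]


registry = ToyRegistry()
registry.register({"name": "io.example/search", "version": "1.2.0"})
# The server was redeployed: same definition, so only runtime metadata changes.
registry.update_runtime_metadata(
    "io.example/search", "1.2.0", endpoint_url="https://mcp.example.com/search"
)
# The publisher changed the definition: that is a new version.
registry.register({"name": "io.example/search", "version": "1.3.0"})
```

In this sketch, re-registering an existing `(name, version)` pair is rejected, which is one possible way to keep the publisher's definition stable; the RFC may handle that case differently.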
| ### Out of scope | ||
| - **Runtime hosting or deployment** — The registry stores metadata, not runtimes. Deployment is handled by external operators |
There was a problem hiding this comment.
While the separation of registry and runtime makes sense from a system perspective, the actual use case always requires some sort of runtime, which can influence the design of the registry data model.
- Local (e.g. installing a local MCP server into a coding agent)
- MLflow MCP Gateway
- Other hosted services
Did you assume one or more of these runtime options as a downstream use case?
There was a problem hiding this comment.
The design intentionally assumes multiple downstream runtime models and aims to be runtime-agnostic. In other words, the same registry entry should be usable by local agent installs, an MLflow MCP Gateway, or other hosted runtimes. Optional deployment-association fields such as runtime_metadata and is_deployed support deployment discovery within an organization, while remaining intentionally unopinionated about runtime-specific details.
| MCP server usage is linked to traces following the same pattern as prompt registry linking. When a registered MCP server is resolved within an active trace, the registry records the association so that traces carry a record of which MCP servers were involved. | ||
| **Tag and attribute**: Traces carry an `mlflow.linkedMcpServers` tag containing a JSON array of `{name, version}` entries — the same format as `mlflow.linkedPrompts`: |
There was a problem hiding this comment.
Is there any potential benefit in supporting bi-directional linking (i.e. searching for traces that use a particular MCP server version)? If yes, we can use EntityAssociationTable rather than trace tags.
There was a problem hiding this comment.
Yes, I think that's a good idea!
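For context, a small sketch of the tag format under discussion and of what a reverse lookup costs when only tags are available. The tag key and the `{name, version}` entry shape come from the RFC excerpt; the helper functions and the in-memory `traces` dict are hypothetical, and neither MLflow's tracing API nor EntityAssociationTable is modelled here.

```python
import json
from collections import defaultdict

LINKED_MCP_SERVERS_TAG = "mlflow.linkedMcpServers"  # tag key quoted in the RFC excerpt


def add_linked_server(tags, name, version):
    """Append a {name, version} entry to a trace's tag, skipping duplicates."""
    entries = json.loads(tags.get(LINKED_MCP_SERVERS_TAG, "[]"))
    entry = {"name": name, "version": version}
    if entry not in entries:
        entries.append(entry)
    tags[LINKED_MCP_SERVERS_TAG] = json.dumps(entries)


def build_reverse_index(traces):
    """Map (name, version) -> trace ids by scanning and parsing every trace's tag."""
    index = defaultdict(list)
    for trace_id, tags in traces.items():
        for e in json.loads(tags.get(LINKED_MCP_SERVERS_TAG, "[]")):
            index[(e["name"], e["version"])].append(trace_id)
    return dict(index)


traces = {"tr-1": {}, "tr-2": {}}  # trace id -> tags (toy data)
add_linked_server(traces["tr-1"], "io.example/search", "1.2.0")
add_linked_server(traces["tr-2"], "io.example/search", "1.2.0")
add_linked_server(traces["tr-2"], "io.example/files", "0.4.1")
reverse = build_reverse_index(traces)
```

The forward direction (trace to servers) is a single tag read, while the reverse direction needs a full scan in this sketch, which is the kind of query an association table would presumably serve directly.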
| - Trace which MCP servers are used by which agents or workflows | ||
| - Provide downstream systems (catalogs, gateways, agent frameworks) with a governed source of truth | ||
| ### Use cases |
There was a problem hiding this comment.
How should end users connect to the server? One option is to not go through the registry at all and simply let them specify the URL of the downstream runtime. However, this option does not solve the usage control and tracing problems. To achieve that, end users should access MCP servers using MLflow's name and version. However, that is more of a discovery problem (like DNS) or a gateway layer, so the current pure registry scope might not be sufficient to address some of these problems.
There was a problem hiding this comment.
Good point. This RFC is intended to cover the registry/discovery layer, not a gateway/proxy layer, so it does not by itself solve end-user connectivity or enforce usage control for arbitrary direct access.
The RFC's expectation is that MLflow-aware clients/runtimes resolve an MCP server via MLflow by name + version (or alias), then use deployment-specific information from runtime_metadata and server_json to connect. Trace linking follows the same assumption: it works when resolution happens through MLflow or when the runtime/server explicitly links {name, version} back to MLflow. A consumer can obviously bypass MLflow though.
The RFC should make that boundary clearer and better connect it to the future gateway integration story. I also think we should document a small set of recommended runtime_metadata conventions, such as endpoint_url, without making the field prescriptive. If/when the MLflow AI Gateway expands to be an MCP Gateway, it could reuse these runtime_metadata fields to steer users to the gateway to link trace data.
| ### The problem | ||
| As MCP adoption grows, organizations accumulate MCP server definitions across teams and environments. Today, MLflow has no way to govern them. There is no single place to: |
There was a problem hiding this comment.
Can we list the unique value of using the MLflow registry over GitHub's public registry? It appears to be basically a private registry (analogous to a private PyPI/npm index) with a few extra features such as lifecycle labeling and usage tracking. Is my understanding correct?
There was a problem hiding this comment.
The MLflow MCP Registry is not trying to replace the upstream/public registry, but to provide an MLflow-native governed registry for enterprise use cases.
Relative to a public GitHub-hosted registry, the main added value here is:
- workspace-scoped governance and permissions within MLflow
- lifecycle state management for versions (active/deprecated/deleted)
- stable aliases for resolution (for example, production)
- association of governed MCP definitions with deployment-specific metadata via runtime_metadata
- MLflow-native trace linking / observability when clients or runtimes resolve MCP servers through MLflow-aware flows
- future integration with other MLflow components such as the AI Gateway
So I would describe it more as a governed enterprise registry that reuses the upstream MCP metadata model, while adding MLflow-specific management and integration capabilities.
I'll update the RFC with this distinction.
There was a problem hiding this comment.
I agree that workspace and permission management are missing in the official MCP registry (btw, the official registry does support lifecycle management and versioning).
For aliases, can we describe the use cases where an alias is useful? Usually only the latest version is available, and it's not clear how users would use different MCP server versions based on the stage (dev, stg, prd).
For tracing, users should be able to track the MCP server info on traces as long as tracing is enabled for the client. What challenges do users face when tracing their agent's MCP usage without the MLflow MCP registry?
There was a problem hiding this comment.
I addressed this in the last commit I pushed. Please take a look!
There was a problem hiding this comment.
This is intended to complement, not replace, the public GitHub-hosted MCP registry.
I think this can be a significant point of confusion, since on the surface there are some overlapping elements and terminology. A brief note about how the two would complement each other would be beneficial.
As a user, I will want to know how I should use these two in my workflow.
@B-Step62 I addressed your comments in the latest commit, please take a look!
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 4 changed files in this pull request and generated 9 comments.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 3 changed files in this pull request and generated 4 comments.
| mlflow mcp-servers register --server-json-file server.json | ||
| # List active servers | ||
| mlflow mcp-servers search --filter "status = 'active'" |
There was a problem hiding this comment.
When/how is the server status modified? Also, does it return servers or versions?
There was a problem hiding this comment.
Status lives on versions, not servers (there's no status column on mcp_servers). Server status is derived from the latest server version at query time. Filtering search_mcp_servers by status should be feasible using a subquery + JOIN pattern similar to how MLflow model registry does it for tag filters on search_model_versions (tags live in SqlModelVersionTag, resolved via subquery join).
Maybe I can update the doc to make it clearer, and to highlight difference between these two:
mlflow mcp-servers search --filter "status = 'active'" # get server (uses subquery)
mlflow mcp-server-versions search --filter "status = 'active'" # get server version (simpler query internally)
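A rough sketch of the two query shapes, using an in-memory SQLite database. Everything here is an assumption for illustration: the `mcp_server_versions` table name, the column names, and the choice of "latest" as the highest `creation_timestamp` are not taken from the RFC's migration; only `mcp_servers` having no status column is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mcp_servers (name TEXT PRIMARY KEY);  -- no status column on servers
CREATE TABLE mcp_server_versions (
    name TEXT REFERENCES mcp_servers(name),
    version TEXT,
    status TEXT,
    creation_timestamp INTEGER,
    PRIMARY KEY (name, version)
);
INSERT INTO mcp_servers VALUES ('io.example/search'), ('io.example/files');
INSERT INTO mcp_server_versions VALUES
    ('io.example/search', '1.0.0', 'deprecated', 100),
    ('io.example/search', '1.1.0', 'active', 200),
    ('io.example/files', '0.1.0', 'active', 100),
    ('io.example/files', '0.2.0', 'deprecated', 200);
""")

# Server search: status is derived from each server's latest version (subquery + JOIN).
SERVERS_BY_STATUS = """
SELECT s.name
FROM mcp_servers AS s
JOIN mcp_server_versions AS v ON v.name = s.name
WHERE v.creation_timestamp = (
    SELECT MAX(creation_timestamp) FROM mcp_server_versions WHERE name = s.name
)
AND v.status = ?
ORDER BY s.name
"""

# Version search: status is a plain column filter.
VERSIONS_BY_STATUS = """
SELECT name, version FROM mcp_server_versions WHERE status = ? ORDER BY name, version
"""

active_servers = [row[0] for row in conn.execute(SERVERS_BY_STATUS, ("active",))]
active_versions = conn.execute(VERSIONS_BY_STATUS, ("active",)).fetchall()
```

With this toy data, `io.example/files` has an active older version but a deprecated latest version, so it appears in the version search and not in the server search.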
| - Record which MCP servers exist and what state they are in | ||
| - Version MCP server definitions as they evolve | ||
| - Control which MCP servers are eligible for consumption by AI engineers |
There was a problem hiding this comment.
nit: users should be able to configure which MCP servers are available to which end user/agent
There was a problem hiding this comment.
Is this clearer:
- Control which MCP servers are available to specific users, teams, or agents
Let me know if I've missed what you're getting at :)
| The store interface is implemented as a mixin class (`MCPServerRegistryMixin`) that the model registry's `AbstractStore` inherits from. This follows the same pattern used by `GatewayStoreMixin` on the tracking store — MCP server registry code lives in its own files while composing into the existing store hierarchy via multiple inheritance. | ||
| ``` | ||
| mlflow/store/model_registry/mcp_server_registry/ |
There was a problem hiding this comment.
Why do we want to define this mixin as part of model_registry?
There was a problem hiding this comment.
The idea was to follow the same pattern as GatewayStoreMixin on the tracking store, but this is closer in function to model registry, so it could share the model registry AbstractStore for simplicity. That said, a standalone store (e.g. mlflow/store/mcp_registry/) would also work, but it would require more work. (Or we could use tracking store instead of model registry store)
There was a problem hiding this comment.
Since the MCP gateway is closer to the LLM gateway and we have AI gateway resources in the tracking store, I think we should use the tracking store.
There was a problem hiding this comment.
I updated the RFC to place the MCP registry mixin under the tracking store, following the same composition pattern as the gateway code rather than the model registry store.
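The composition pattern being discussed can be sketched in a few lines. Only the class name `MCPServerRegistryMixin` comes from the RFC; `ToyTrackingStore` and every method shown are hypothetical stand-ins, not MLflow's actual store classes or signatures.

```python
class MCPServerRegistryMixin:
    """Registry methods kept in their own module and composed into a store.

    Only the class name comes from the RFC discussion; the methods are illustrative.
    """

    def create_mcp_server(self, name):
        self._mcp_servers().setdefault(name, {"name": name, "versions": []})
        return self._mcp_servers()[name]

    def get_mcp_server(self, name):
        return self._mcp_servers()[name]

    def _mcp_servers(self):
        # Lazily create storage so the mixin does not need its own __init__.
        if not hasattr(self, "_mcp_server_rows"):
            self._mcp_server_rows = {}
        return self._mcp_server_rows


class ToyTrackingStore(MCPServerRegistryMixin):
    """Hypothetical stand-in for the tracking store that inherits the mixin."""

    def __init__(self):
        self.experiments = {}

    def create_experiment(self, name):
        self.experiments[name] = {"name": name}
        return self.experiments[name]


store = ToyTrackingStore()
store.create_experiment("agents")
store.create_mcp_server("io.example/search")
```

The point of the pattern is that callers see one store object while the MCP registry code stays in its own files; which concrete store hosts the mixin is the decision made in this thread.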
| # Summary | ||
| Add an MCP Registry to MLflow — a governed, versioned registry for [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) server definitions. The registry stores metadata-first records aligned with the [upstream MCP registry specification](https://registry.modelcontextprotocol.io/docs), providing stable identity, versioning, status lifecycle, workspace-scoped governance, and MLflow-native integrations for MCP server assets. This is intended to complement, not replace, the public GitHub-hosted MCP registry, adding workspace governance, stable aliases, deployment association, and traceability for MLflow-aware runtimes. |
There was a problem hiding this comment.
First of all, can we clarify the difference between the MCP Registry and an MCP gateway (e.g., Portkey), and why we want to introduce the MCP registry? We will certainly add an MCP gateway, as that's clearly table stakes for an AI gateway product, and I wonder whether the gateway and registry are two distinct features or can be unified. If these are separate concepts, I need to think carefully about the potential confusion users may have when seeing an MCP gateway and a registry in the same platform.
There was a problem hiding this comment.
Good point. I updated the RFC to make the registry vs. gateway split more explicit.
Short version: this RFC is for the registry/control-plane piece, not the MCP gateway/runtime piece. The registry is the governed source of truth for canonical MCP server definitions, version history, aliases, and approved deployment locations. That gives us enterprise governance and historical records that a gateway alone would not provide.
A future MCP gateway in MLflow AI Gateway would then consume those governed records for live MCP traffic, while the registry continues to provide the canonical identity/history behind what the gateway is serving. That also improves traceability, since gateway-produced traces can be linked back to the governed MCP server/version records rather than only to a live endpoint.
I added more detail in Summary, Registry vs. Gateway, and MCPHostedBinding.
| ### Out of scope | ||
| - **Runtime execution and orchestration** — The registry may store deployment metadata, but it does not provision, host, scale, or manage MCP runtimes | ||
| - **End-user connectivity, proxying, or usage control enforcement** — Consumers can still connect directly to an MCP endpoint unless a future MLflow gateway or proxy mediates access |
There was a problem hiding this comment.
Note: if we provide an MCP gateway, this is an essential feature.
There was a problem hiding this comment.
Agreed. The RFC now makes that boundary clearer: it does not design the MCP gateway itself, but it does shape the registry so it can integrate cleanly with a future MCP gateway in MLflow AI Gateway.
The main value the registry adds beyond the gateway is governed historical state: canonical definitions, version/alias history, and approved deployment locations. That gives us enterprise governance and a better source of truth for linking gateway output traces back to the MCP server/version that was actually in use. See Summary, Registry vs. Gateway, and MCPHostedBinding sections for more details.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 3 changed files in this pull request and generated no new comments.
| The MCP Registry and an MCP Gateway are related but distinct capabilities: | ||
| - **MCP Registry**: the control-plane system of record for governed MCP assets. It stores canonical `server_json`, version history, aliases, tags, lifecycle state, and hosted binding records that describe approved live connection paths. | ||
| - **MCP Gateway**: a runtime data-plane service that receives live client traffic and mediates connectivity, authentication, routing, policy enforcement, and request-time observability. |
There was a problem hiding this comment.
The MCP gateway still needs to store a list of MCP server configurations, so there is some overlap. I think the relationship is probably: MCP gateway = a subset of MCP registry capabilities + runtime proxy, authentication, and observability. Can we clarify this overlap and how we plan to unify them?
There was a problem hiding this comment.
Could you please check if the last commit addresses this the way you expected?
| - **MCP Registry**: the control-plane system of record for governed MCP assets. It stores canonical `server_json`, version history, aliases, tags, lifecycle state, and hosted binding records that describe approved live connection paths. | ||
| - **MCP Gateway**: a runtime data-plane service that receives live client traffic and mediates connectivity, authentication, routing, policy enforcement, and request-time observability. | ||
| This RFC proposes the former, not the latter. It does **not** design an MLflow MCP Gateway, extend the existing AI Gateway into an MCP Gateway, or define MCP proxy semantics. Instead, it ensures the registry can support that future integration cleanly by storing governed MCP server identities, aliases, and hosted bindings that a future gateway could resolve to concrete `{name, version}` records for connectivity and trace association. |
There was a problem hiding this comment.
nit: Can we rephrase a bit so that it's clear that MCP gateway will be built on top of the registry layer described in this RFC?
There was a problem hiding this comment.
Addressed in the last commit.
…oyed This decouples runtime information from the registry and then allows for expanding the AI Gateway to be an MCP gateway. Signed-off-by: mprahl <mprahl@users.noreply.github.com>
…nale, widen access control language
- Remove run_id from entity, schema, store, and API models (provenance covered by source field; tags can handle run linkage if needed)
- Add version history rationale bullets under immutability contract (trace provenance, rollback, deprecation signaling, audit)
- Clarify that search_mcp_servers status filter is derived from latest version
- Broaden motivation bullet to "available to specific users, teams, or agents"
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Jon Burdo <jon@jonburdo.com>
This adds more details about how the future MCP gateway would integrate, the benefit of aliases, and the benefit of trace linking. Signed-off-by: mprahl <mprahl@users.noreply.github.com>
Signed-off-by: Jon Burdo <jon@jonburdo.com>
- refactor the Phase 1 connectivity model from hosted bindings to `MCPAccessBinding` for approved direct-access endpoints
- clarify the distinction between governed server/version metadata and direct-access bindings that make an endpoint available for use
- make the future gateway path more explicit with an illustrative `MCPGatewayBinding` example and a small forward-looking diagram
- tighten discovery wording around registry listings, access-binding listings, and direct-access filtering
- clarify upstream version-update behavior and why MLflow treats canonical `server_json` changes as new versions
- expand the Python SDK section to follow model-registry-style create, get, search, update, and delete APIs, while keeping `register_*` helpers as convenience entry points
- improve UI and trace-linking explanations for direct access, gateway evolution, and auditability
Signed-off-by: mprahl <mprahl@users.noreply.github.com>
Co-authored-by: Jon Burdo <jon@jonburdo.com>
Co-authored-by: Dan Kuc <dkuc@redhat.com>
New versions now default to draft instead of active, requiring an explicit publish action (draft → active) before downstream consumers can discover them. The Phase 2 upstream compatibility router filters out draft versions since the upstream MCP registry spec only defines active/deprecated/deleted. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Jon Burdo <jon@jonburdo.com>
This now reserves the `latest` alias to be the latest version. latest_version can be optionally set and is useful because there is not an MLflow controlled increasing version number like in Model Registry and server.json does not enforce semver. Signed-off-by: mprahl <mprahl@users.noreply.github.com>
Add MCPTransportType enum (streamable-http, sse, stdio) and transport_type field to access bindings so clients know which MCP transport protocol to use when connecting to an approved direct endpoint. This addresses review feedback about missing upstream fields needed for client connectivity. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Jon Burdo <jon@jonburdo.com>
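The draft default, the explicit publish step, and the transport enum described in the commit messages above could look roughly like the sketch below. `MCPTransportType` and its three values, and the four status values, come from those messages; `VersionStatus`, `publish`, `upstream_view`, and the dict-based version records are hypothetical names used only for illustration.

```python
from enum import Enum


class MCPTransportType(str, Enum):
    # Values listed in the commit message above.
    STREAMABLE_HTTP = "streamable-http"
    SSE = "sse"
    STDIO = "stdio"


class VersionStatus(str, Enum):
    # Hypothetical enum name; "draft" is the MLflow-side addition.
    DRAFT = "draft"
    ACTIVE = "active"
    DEPRECATED = "deprecated"
    DELETED = "deleted"


# Statuses the upstream MCP registry spec defines, per the commit message.
UPSTREAM_STATUSES = {VersionStatus.ACTIVE, VersionStatus.DEPRECATED, VersionStatus.DELETED}


def publish(version):
    """Explicit draft -> active transition; other starting states are rejected."""
    if version["status"] is not VersionStatus.DRAFT:
        raise ValueError("only draft versions can be published")
    version["status"] = VersionStatus.ACTIVE
    return version


def upstream_view(versions):
    """Drop versions whose status the upstream spec does not define (drafts)."""
    return [v for v in versions if v["status"] in UPSTREAM_STATUSES]


versions = [
    {"version": "1.0.0", "status": VersionStatus.DRAFT},  # new versions start as draft
    {"version": "1.1.0", "status": VersionStatus.DRAFT},
]
publish(versions[0])
visible = upstream_view(versions)
```

Here only the published `1.0.0` would be exposed through an upstream-compatible listing; `1.1.0` stays hidden until someone publishes it.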
| 1. An admin registers an MCP server definition in MLflow from a canonical `server.json` payload (stored by MLflow as `server_json`), such as a public NPM-backed server or an internally hosted remote server. | ||
| 2. A new MCP server entry appears in the registry listing with its version history and metadata, even before it has any approved access path. | ||
| 3. The admin optionally sets aliases such as `dev`, `staging`, and `production`, and records approved direct-access bindings for any remote endpoints that users are allowed to call directly. |
There was a problem hiding this comment.
So when an admin wants to add a new version for production, do they need to both update the alias and create a new binding?
There was a problem hiding this comment.
If they move the alias to the new MCPServer entity, then the binding stays the same. If the binding is tied to a version, then the binding must change to point to the new version.
| #### MCPAccessBinding | ||
| A direct access binding is the separate record that says a governed MCP server version or alias can be reached through an approved non-gateway endpoint. An `MCPServerVersion` is the governed metadata record for a server definition; by itself, it does not mean there is an approved direct endpoint available in the workspace. `MCPAccessBinding` is what makes that governed server show up in direct-access discovery. The registry is intentionally runtime-agnostic: MLflow-aware clients or runtimes may resolve a registered MCP server through MLflow and then either use the canonical `server_json` payload directly (for example, for local `packages[]` consumption) or follow an approved direct access binding. When the canonical server definition changes, publishers register a new version. When only direct connectivity changes, operators create or update access bindings without creating a new version. |
There was a problem hiding this comment.
To help me understand: what's the use case for setting an MCP server endpoint that differs from the one in server_json? When the server host is updated, shouldn't users just create a new version with the new host URL?
There was a problem hiding this comment.
If this is just meant for the "approval status" we can have another column in the mcp server version entity for it.
There was a problem hiding this comment.
@TomeHirata the remotes in server_json may not be present (e.g. requires on-premise deployment) or the access binding may be pointing to a gateway endpoint.
The goal is that server_json can be taken directly from the publisher and is treated as immutable. The access bindings are how we communicate the catalog of what is deployed/available for use.
An alternative could be to require the admin to manually modify the server_json and allow modifying only the remotes field in server_json. However, that is limiting and would not provide a good unified experience for direct access + MCP gateway.
There was a problem hiding this comment.
Understood the intention, thanks for the clarification. Regarding the data model, is a separate entity really the better fit? I wonder if a separate mutable column in ServerVersion would fit better. Also, let's not mix this with the gateway endpoint, since the binding/deployment URL is the location where the actual MCP server is hosted, and the MCP gateway will invoke that URL internally anyway.
There was a problem hiding this comment.
I think we want a separate entity because:
- independent lifecycle - endpoints get added/moved/removed without touching the governed version record
- alias-targeted bindings - a binding can target an alias like production rather than a concrete version, which doesn't fit as a column on a version row
- gateway extensibility - keeping bindings decoupled lets a future MCP Gateway reuse the same concept with additional gateway-specific fields
The separation also enables two complementary UI experiences: access bindings as a catalog of what's currently available/deployed in the organization, and the registry (MCPServer/MCPServerVersion) as the governed historical record.
Agreed on the gateway point - the binding URL is where the actual server is hosted, not a gateway endpoint.
- Add MCPTool frozen dataclass and MCPToolPayload Pydantic model with all 8 fields from the MCP Tool spec (name, title, description, inputSchema, outputSchema, annotations, icons, execution)
- Add tools field to MCPServerVersion (source of truth); binding responses project tools read-only from the resolved version
- Remove tools from MCPAccessBinding create/update APIs (read-only projection only)
- Add icon field to MCPServer (URL or data URI) for UI display
- Add inputSchema examples to _meta projection
- Add tools display and filtering paragraph to UI section
- Delete speculative MCPObservedTool section
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Jon Burdo <jon@jonburdo.com>
| 1. An admin registers an MCP server definition in MLflow from a canonical `server.json` payload (stored by MLflow as `server_json`), such as a public NPM-backed server or an internally hosted remote server. | ||
| 2. A new MCP server entry appears in the registry listing with its version history and metadata, even before it has any approved access path. | ||
| 3. The admin optionally sets aliases such as `dev`, `staging`, and `production`, and records approved direct-access bindings for any remote endpoints that users are allowed to call directly. | ||
| 4. The workspace can also show a separate access-binding listing that surfaces the approved direct endpoints currently available in that workspace. |
There was a problem hiding this comment.
Can "access-binding" be renamed to "deployment"? I feel like it would match more nicely with the MCP Gateway path.
There was a problem hiding this comment.
These are really pointers to already-running endpoints. In a previous comment, @HumairAK actually flagged the same concern from the other direction (suggesting has_active_binding over is_deployed). Even in consideration of the eventual unified registry/gateway experience, I think "binding" keeps the distinction clear - a record of where a governed server can be reached, not a record of something MLflow necessarily deployed.
There was a problem hiding this comment.
I am not strongly opinionated here, but I do think "deployment" in the context of MCP servers can give the impression that we are offering server deployment capabilities, when the server is actually an already-deployed remote service that we are creating a route to (via gateway proxy or otherwise). In my opinion, you create or connect endpoints to live services, and you deploy the services themselves.
I suppose you could say you are "deploying" an endpoint that points to a server "deployment" that is also "deployed", but it feels a bit confusing. However, if this terminology is already being used elsewhere in MLflow gateway, then maybe it's better to just be consistent and stick to deployment.
| 2. A new MCP server entry appears in the registry listing with its version history and metadata, even before it has any approved access path. | ||
| 3. The admin optionally sets aliases such as `dev`, `staging`, and `production`, and records approved direct-access bindings for any remote endpoints that users are allowed to call directly. | ||
| 4. The workspace can also show a separate access-binding listing that surfaces the approved direct endpoints currently available in that workspace. | ||
| 5. The admin grants appropriate permissions through MLflow's permission model. |
There was a problem hiding this comment.
Can we elaborate on this step? Is the permission here only for discovery, not access control of the server itself, because the registry is not part of the access flow for end users?
There was a problem hiding this comment.
Tbh I'm not sure there is much benefit in setting permissions for discovery.
- For remote servers, admins will put the access control on the endpoints. This is governed by MLflow if they use the MLflow MCP Gateway; otherwise it is configured at their hosting service.
- Local package governance seems easy for end users to bypass. While an admin can control which version/server is visible in the MLflow registry, users can still use any server locally or tweak the canonical server_json. To strictly control which version and server end users use, which I guess is what many organizations want, they need to gate at installation, not discovery, for instance by hosting an internal npm proxy with a subset of MCP packages.
In either case, setting up a separate permission for discovering the server in MLflow seems to be duplicate effort.
I might be wrong, but it feels like the more realistic operation would be that all servers are discoverable, but to use a server, developers need permissions granted by admins. Are the requirements you heard from enterprise customers different?
There was a problem hiding this comment.
The registry permissions are really for management operations (register, update, deprecate, delete), not runtime access control. Agreed that all servers should probably be discoverable within a workspace: workspace scoping handles visibility, and per-resource permissions handle who can modify state. I think this follows the existing MLflow patterns in the Model Registry and Prompt Registry.
There was a problem hiding this comment.
We can also just remove that line if it's confusing.
There was a problem hiding this comment.
I've removed the line from both stories. Our permission model here just follows existing MLflow patterns.
| last_updated_timestamp: int | None = None | ||
| ``` | ||
| ```python |
There was a problem hiding this comment.
Let's remove this due to https://github.com/mlflow/rfcs/pull/12/changes#r3192622469 and update "Future gateway relationship" section
There was a problem hiding this comment.
Agreed, I've removed the subclass example. A future gateway entity would follow the same parent/target resolution pattern, but whether it shares implementation with access bindings (subclass, mixin, etc.) is an implementation detail we can decide later.
I also updated the "Future gateway relationship" section accordingly
| @dataclass | ||
| class MCPAccessBinding: |
There was a problem hiding this comment.
Given the e2e user journey, shouldn't we allow users to attach an alias to MCPAccessBinding rather than attaching it to MCPServerVersion and linking it to MCPAccessBinding? I think there's a scenario where the same MCP server definition is used across STG and PRD, but endpoints differ between environments. The current data model doesn't tell us which endpoint is for which environment since the alias is attached to MCPServerVersion rather than MCPAccessBinding, while the endpoint url is a field of MCPAccessBinding.
There was a problem hiding this comment.
I think the current model already handles this scenario. A binding targets an alias rather than being labeled by one - so a binding with alias=production and endpoint_url=prd.example.com tells you both the environment and the endpoint. When you promote by moving the alias, the binding automatically resolves to the new version without needing to update the binding itself.
For example, say you have versions v1, v2, v3:
- alias production -> v2, alias staging -> v3
- binding A: alias=production, endpoint_url=prd.example.com
- binding B: alias=staging, endpoint_url=stg.example.com
Promoting v3 to production just means moving the production alias from v2 to v3. Binding A automatically resolves to v3 - no binding update needed. Pinning bindings to concrete versions would lose that indirection and require updating binding records on every promotion.
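A minimal in-memory sketch of the indirection described above. The dictionaries and the `resolve` helper are hypothetical illustrations, not the RFC's store interface or SDK; only the alias names, versions, and endpoints come from the example.

```python
# Hypothetical in-memory model; not the RFC's store or SDK API.
aliases = {"production": "v2", "staging": "v3"}

# Each binding targets an alias, not a concrete version.
bindings = {
    "A": {"server_alias": "production", "endpoint_url": "prd.example.com"},
    "B": {"server_alias": "staging", "endpoint_url": "stg.example.com"},
}


def resolve(binding_key: str) -> tuple[str, str]:
    """Return (endpoint_url, version) by following the binding's alias."""
    binding = bindings[binding_key]
    return binding["endpoint_url"], aliases[binding["server_alias"]]


before = resolve("A")
# Promote v3 by moving the alias; the binding records are not modified.
aliases["production"] = "v3"
after = resolve("A")
```

Here `before` and `after` differ only in the resolved version, while binding A itself is never written to.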
There was a problem hiding this comment.
That promotion workflow is possible, but which API should I use to know the endpoint_url for "production" in the scenario? Since an alias is linked to MCPServerVersion, it's not straightforward to determine the binding from the alias. But in the actual user journey, I'd like to retrieve the set of MCPServerVersion + endpoint id from an alias easily.
There was a problem hiding this comment.
The binding targets an alias directly via server_alias, so to find the version + endpoints for "production":
version = mlflow.genai.get_mcp_server_version_by_alias(
name="io.github.anthropic/brave-search",
alias="production",
)
bindings = mlflow.genai.search_mcp_access_bindings(
server_name="io.github.anthropic/brave-search",
filter_string="server_alias = 'production'"
)
Note this is a search rather than a get because alias -> bindings is 1:many. For example, an org might deploy the same server behind two endpoints targeting production - https://mcp-primary.example.com/brave-search and https://mcp-failover.example.com/brave-search - for redundancy. A single get_mcp_server() call also returns all bindings with their alias targets as summaries.
We looked into a name_or_uri approach following the prompt registry pattern (load_prompt("prompts:/my_prompt@production")), but MCP server names use reverse-DNS format with slashes (e.g., io.github.anthropic/brave-search) which conflicts with URI path separators, so the pattern doesn't translate cleanly.
Maybe as a follow-up enhancement we could consider adding server_alias and server_version as direct query parameters / SDK kwargs on the search endpoint to avoid constructing filter_string expressions for these common lookups:
bindings = mlflow.genai.search_mcp_access_bindings(
server_name="io.github.anthropic/brave-search",
server_alias="production",
)
GET /ajax-api/3.0/mlflow/mcp-servers/io.github.anthropic%2Fbrave-search/bindings?server_alias=production
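Because the server name contains a slash, it has to be percent-encoded to stay a single path segment. A small sketch of how a client could build that request path with the standard library, assuming the `server_alias` query parameter from the proposed follow-up (it is not a shipped API):

```python
from urllib.parse import quote, urlencode

server_name = "io.github.anthropic/brave-search"

# safe="" forces "/" to be encoded so the name stays a single path segment.
encoded_name = quote(server_name, safe="")

# Path and query parameter mirror the proposed follow-up, not a shipped API.
path = f"/ajax-api/3.0/mlflow/mcp-servers/{encoded_name}/bindings"
url = f"{path}?{urlencode({'server_alias': 'production'})}"
```

The default `quote()` call leaves `/` unescaped, which is why `safe=""` is passed explicitly.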
There was a problem hiding this comment.
Got it, so are we intentionally allowing multiple endpoints per alias? I'm honestly not sure if having multiple endpoints is realistic since we can use the same domain with load balancing. But yeah, if it's intentional, the current data model makes sense then. Let's update search_mcp_access_bindings so that it's easier to find endpoints for a specific alias/environment.
There was a problem hiding this comment.
Yeah, it's intentional - the model doesn't require multiple bindings per alias, but it doesn't prevent it either.
Scenarios where multiple bindings per alias or version are useful
Internal vs external access: An org might expose the same governed server at https://mcp.internal.example.com/brave-search (internal network, no auth) and https://mcp.example.com/brave-search (external, with mTLS). Both target production but serve different network contexts.
Different deployment modes: Consider something like mcp-atlassian, which supports a READ_ONLY_MODE env var. An org could deploy two instances of the same governed server version - one read-only (READ_ONLY_MODE=true) at https://mcp-readonly.example.com/atlassian and one read-write (READ_ONLY_MODE=false) at https://mcp-readwrite.example.com/atlassian - both targeting the same alias but offering different operational guarantees.
Version migration with stable + versioned endpoints: Even with a single binding per alias, you can end up with multiple bindings resolving to the same version. For example:
- binding1 -> alias: production -> resolves to v3 (follows the alias pointer)
- binding2 -> version: v3 (pinned directly)
- binding3 -> version: v2 (legacy clients still on v2)
Each might have its own endpoint: prod.example.com, mcp-v3.example.com, mcp-v2.example.com. When the admin moves production from v2 to v3, binding1 automatically follows while binding3 stays on v2 for legacy clients. This is a natural outcome of the alias + direct-version model.
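The mixed alias/pinned case can be sketched as follows. The `Binding` class is a hypothetical stand-in for MCPAccessBinding with only the fields needed here, and the "exactly one target" check is an assumption about how the two target fields would be validated:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Binding:
    """Hypothetical stand-in for MCPAccessBinding, reduced to its target fields."""

    endpoint_url: str
    server_alias: Optional[str] = None
    server_version: Optional[str] = None

    def resolve(self, aliases: dict) -> str:
        # Assumed rule: a binding targets exactly one of alias or version.
        if (self.server_alias is None) == (self.server_version is None):
            raise ValueError("set exactly one of server_alias or server_version")
        if self.server_version is not None:
            return self.server_version  # pinned directly
        return aliases[self.server_alias]  # follows the alias pointer


bindings = [
    Binding("prod.example.com", server_alias="production"),
    Binding("mcp-v3.example.com", server_version="v3"),
    Binding("mcp-v2.example.com", server_version="v2"),
]

aliases = {"production": "v2"}
before = [b.resolve(aliases) for b in bindings]

aliases["production"] = "v3"  # the admin promotes v3
after = [b.resolve(aliases) for b in bindings]
```

Only the first entry changes between `before` and `after`; the pinned bindings are unaffected by the alias move.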
Agreed on improving search_mcp_access_bindings by adding server_alias and server_version as direct query parameters. However, I will do this as a follow-up enhancement if that's okay, since this RFC has been approved now!
| binding_id: int # stable MLflow-managed binding identifier | ||
| name: str # parent MCPServer name | ||
| endpoint_url: str # required approved direct endpoint | ||
| transport_type: MCPTransportType = MCPTransportType.STREAMABLE_HTTP |
There was a problem hiding this comment.
How do we plan to support the stdio connection?
There was a problem hiding this comment.
Access bindings are specifically for remote endpoints, so stdio doesn't apply here. I've renamed the enum to MCPRemoteTransportType and dropped STDIO from it to make that scoping clearer.
I originally included it since I figured an MCPTransportType would be a general purpose thing and should include all possible types.
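A rough sketch of what the renamed enum could look like. Only the `STREAMABLE_HTTP` member is confirmed by the diff under review; the string values, the `SSE` member, and the validation helper are assumptions for illustration:

```python
from enum import Enum


class MCPRemoteTransportType(str, Enum):
    """Sketch of the renamed enum; only remote transports are listed."""

    STREAMABLE_HTTP = "streamable-http"  # string value is an assumption
    SSE = "sse"  # assumed member; the thread only confirms STREAMABLE_HTTP


def is_valid_binding_transport(value: str) -> bool:
    """Return True if value names a remote transport usable by a binding."""
    return value in {member.value for member in MCPRemoteTransportType}
```

With STDIO dropped, a stdio value is rejected at the binding level while remaining valid inside a server's own server_json.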
There was a problem hiding this comment.
Not limited to the bindings, but do we expect users to register stdio MCP servers in this MCP registry? Or is this limited to HTTP connections only? If it's the latter, we should clarify that in the RFC.
There was a problem hiding this comment.
Yeah, stdio servers are supported in the registry. They just wouldn't have an MCPAccessBinding. The registry stores the full server_json payload including packages[].transport.type = "stdio".
There is still value in having stdio servers in the registry for governance, versioning, and discovery - and for linking traces to them.
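As an illustration, a stdio server could be recognized from its stored payload without any binding. The payload below is hypothetical: only `packages[].transport.type` is taken from this thread, and the other fields, values, and helper names are assumptions.

```python
# Hypothetical server_json payload; only packages[].transport.type comes from
# the thread, the remaining fields and values are illustrative.
server_json = {
    "name": "io.github.example/local-tools",
    "version": "1.0.0",
    "packages": [
        {"identifier": "local-tools", "transport": {"type": "stdio"}},
    ],
}


def package_transport_types(payload: dict) -> set:
    """Collect the transport types declared under packages[]."""
    return {
        package["transport"]["type"]
        for package in payload.get("packages", [])
        if "transport" in package
    }


def is_stdio_only(payload: dict) -> bool:
    """True when every declared package transport is stdio."""
    types = package_transport_types(payload)
    return bool(types) and types == {"stdio"}
```

A UI or SDK could use a check like `is_stdio_only` to explain why a registered server has no access bindings.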
TomeHirata
left a comment
There was a problem hiding this comment.
LGTM once https://github.com/mlflow/rfcs/pull/12/changes#r3199173218 is addressed/answered!
…decoupling
- Remove unnecessary "admin grants permissions" step from both user journeys (workspace scoping handles discovery, per-resource permissions handle management elevation)
- Rename MCPTransportType to MCPRemoteTransportType and drop STDIO (access bindings are for remote endpoints; stdio servers are consumed directly from server_json packages)
- Remove MCPGatewayBinding subclass example (implementation relationship with access bindings is an implementation detail for later)
- Update future gateway references to use generic "gateway-managed entity" language and remove MCPGatewayBinding from mermaid diagram
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Jon Burdo <jon@jonburdo.com>
mprahl
left a comment
There was a problem hiding this comment.
The latest changes look good! Perhaps I shouldn't have approved because I'm a coauthor lol. Could @B-Step62 and @TomeHirata take one more look?
…_alias
MCPAccessBinding is a cross-reference entity with its own binding_id,
not a true child like MCPServerVersion. Prefixing the reference fields
with server_ makes the FK relationship explicit and distinguishes them
from the binding's own identity.
Renamed across entity, DB schema, store interface, API models, and SDK.
REST paths keep {name} for URL nesting consistency.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Jon Burdo <jon@jonburdo.com>
Adds a design for an MCP registry in MLflow, aligned with the upstream MCP registry specification.
mlflow issue: mlflow/mlflow#22625
authored by: @jonburdo @dkuc @mprahl