Summary
The local-MCP tools in airbyte/mcp/local.py (validate_connector_config, list_source_streams, read_source_stream_records, get_stream_previews) currently accept only a connector_name: str argument and resolve everything else (image, version, manifest path) via the registry. PyAirbyte's get_source(...) already supports a docker_image: bool | str | None argument that lets callers point at a specific docker image (and version to pin the tag), but the MCP layer doesn't expose any of that. As a result, MCP callers can't pin to a specific connector version, can't point at a custom-built image (e.g. airbyte/source-mssql:dev after docker build), and silently always pull whatever the registry resolves at the moment of the call.
Proposal
Let MCP callers identify a connector by docker image identifier in addition to the registry name. Two compatible shapes (either or both, following the patterns already in get_source(...)):
1. Overload connector_name to accept a docker image identifier when it looks like one, i.e. when it contains a / or matches <repo>/<name>(:<tag>)?. All of these would be valid inputs:
   - source-mssql (current behavior: registry lookup, latest version)
   - airbyte/source-mssql (docker image, latest tag)
   - airbyte/source-mssql:4.4.2 (docker image, pinned tag)
   - airbyte/source-mssql:dev (locally built dev image)
2. Add an explicit docker_image: str | None = None arg to each MCP tool, mirroring get_source(docker_image=...). Pass it through to _get_mcp_source and from there into get_source. When docker_image is set, force override_execution_mode to "docker" (since the caller has explicitly chosen the docker path).
Either shape works. (1) is the lowest-friction change for callers since it's just a string. (2) is a cleaner separation if we'd rather keep connector_name strictly as a registry key. Doing both is also fine and consistent with get_source(name=..., docker_image=...).
A version: str | None = None arg following the same pattern would also be useful — get_source(...) already supports it, and MCP callers currently have no way to pin to a specific registry version when staying on the connector_name-only path.
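A minimal sketch of the detection rule in shape (1), assuming the simple "contains a /" test is enough. ConnectorRef and parse_connector_ref are hypothetical names for illustration, not existing PyAirbyte APIs, and digest references (image@sha256:...) are deliberately not handled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorRef:
    """Hypothetical result type: registry-style name plus optional image."""

    name: str
    docker_image: str | None


def parse_connector_ref(connector_name: str) -> ConnectorRef:
    """Split an MCP connector_name into a name and an optional docker image.

    Assumption: any value containing "/" is a docker image identifier;
    anything else is treated as a plain registry name.
    """
    if "/" not in connector_name:
        # Current behavior: registry lookup by name.
        return ConnectorRef(name=connector_name, docker_image=None)

    # Take the last path segment and drop any ":tag" suffix so that
    # "airbyte/source-mssql:4.4.2" yields the name "source-mssql".
    last_segment = connector_name.rsplit("/", 1)[-1]
    name = last_segment.split(":", 1)[0]
    return ConnectorRef(name=name, docker_image=connector_name)
```

Splitting on the last / first means a registry host with a port, such as localhost:5000/source-mssql:dev, still resolves to the name source-mssql.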
Use case
CONTRIBUTING.md-blessed connector repro harnesses (in airbytehq/airbyte). Specifically, airbytehq/airbyte#77775 is adding a CDC repro harness for source-mssql (Java/Kotlin) — the worked examples reproduce two real oncall issues against a pinned airbyte/source-mssql:4.4.2 image. Today the doc does this by hand-rolling docker run airbyte/source-mssql:4.4.2 read --catalog ..., which means contributors have to deal with picocli arg-splitting, catalog-shaping, and message parsing. The natural fix is to delegate that whole pipeline to coral-mcp / PyAirbyte's read_source_stream_records, but the wrapper doesn't currently let me say "use this image" — only "use whatever the registry resolves for source-mssql right now," which makes the harness non-deterministic and breaks the worked examples whenever a new GA ships.
The companion playbook update — airbytehq/ai-skills#302 — is rewriting the !slack_connector_issue_repro playbook to recommend coral-mcp (i.e. these MCP tools) as the primary connector-side runner for source/destination repros, regardless of language. That recommendation only really works once MCP callers can pin a docker image / version.
Repro of the friction today
# All three of these fail in the same shape — the wrapper hands
# `connector_name` straight through to get_source(name, docker_image=True),
# which then tries to resolve the source manifest path for the declarative-
# connector code path before the docker-image override takes effect.
mcp_tool.call("validate_connector_config", connector_name="source-mssql", config={...})
# → [false,"Failed to get connector 'source-mssql': [Errno 13] Permission denied: '/source-mssql'"]
mcp_tool.call("validate_connector_config", connector_name="airbyte/source-mssql", config={...})
# → [false,"Failed to get connector 'airbyte/source-mssql': [Errno 2] No such file or directory: '/airbyte/source-mssql'"]
mcp_tool.call("validate_connector_config", connector_name="airbyte/source-mssql:4.4.2", config={...})
# → [false,"Failed to get connector 'airbyte/source-mssql:4.4.2': [Errno 2] No such file or directory: '/airbyte/source-mssql:4.4.2'"]
Even with override_execution_mode="docker" set explicitly, the first error still happens, since _get_mcp_source runs get_source(connector_name, docker_image=True, install_if_missing=False, source_manifest=manifest_path or None). Here docker_image=True says "use the default image for this connector name", which only works if name resolution succeeds first. There's no way today to say "use this image, skip name resolution entirely".
Where the change lands
- airbyte/mcp/local.py: _get_mcp_source(...) gains a docker_image: str | None = None param (and/or detects-and-forwards connector_name when it looks like an image), and passes it through to get_source(docker_image=...) instead of docker_image=True.
- airbyte/mcp/local.py: each public MCP tool (validate_connector_config, list_source_streams, read_source_stream_records, get_stream_previews) gains the same docker_image (and version) param.
- The argument descriptions in the MCP schema document the supported forms (registry name vs. docker image identifier).
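The pass-through in _get_mcp_source could look roughly like the sketch below. It only builds the keyword arguments and does not import PyAirbyte. build_get_source_kwargs is a hypothetical helper, the keyword names are the ones quoted in this issue, and whether get_source accepts a tagged docker_image together with version is not verified here.

```python
from __future__ import annotations

from typing import Any


def build_get_source_kwargs(
    connector_name: str,
    *,
    docker_image: str | None = None,
    version: str | None = None,
    manifest_path: str | None = None,
) -> dict[str, Any]:
    """Hypothetical helper: assemble kwargs for get_source(...) in docker mode."""
    kwargs: dict[str, Any] = {
        "name": connector_name,
        # Forward the caller's image when given; otherwise keep today's
        # docker_image=True ("use the default image for this name").
        "docker_image": docker_image if docker_image is not None else True,
        "install_if_missing": False,
        "source_manifest": manifest_path or None,
    }
    if version is not None:
        # Assumption: version is forwarded unchanged when provided.
        kwargs["version"] = version
    return kwargs
```

A caller pinning an image would then do something like get_source(**build_get_source_kwargs("source-mssql", docker_image="airbyte/source-mssql:4.4.2")), while callers that pass neither docker_image nor version get the same arguments as today.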
Happy to take this on if it's small enough to land before the dependent PRs.
Devin session