feat(mcp): add tools to run, monitor, and alert on ingestion pipelines #27743

Open
puri-adityakumar wants to merge 2 commits into open-metadata:main from puri-adityakumar:feat/mcp-ops-tools

@puri-adityakumar

Describe your changes

Part of #26609 (New MCP Tools epic).

Adds three new MCP tools to openmetadata-mcp so AI assistants (e.g. Claude Desktop) can drive day-2 ops on OpenMetadata via the existing MCP server:

  • run_ingestion — triggers an ingestion pipeline by FQN. Wraps IngestionPipelineResource.triggerIngestion.
  • get_ingestion_status — read-only; returns recent PipelineStatus rows via IngestionPipelineRepository.listPipelineStatus.
  • create_alert — creates an EventSubscription for pipelineFailed on ingestionPipeline with a webhook destination. Mirrors the existing GlossaryTool pattern (mapper + repo.createOrUpdate).

Each tool implements the existing McpTool interface and is dispatched from DefaultToolContext.callTool(). JSON Schemas for inputs are added to openmetadata-mcp/src/main/resources/json/data/mcp/tools.json.

No new REST endpoints, no auth/authz changes. Authorization flows through the existing OperationContext + Authorizer.authorize() (EDIT_ALL / VIEW_ALL / CREATE). Tools are deterministic REST wrappers — no LLM calls server-side; the LLM lives in the MCP client.
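The name-based dispatch described above can be pictured with a minimal sketch. It assumes a hypothetical `Tool` interface and placeholder tool bodies; the PR's actual `execute` overload also receives an `Authorizer`, `Limits`, and a security context, and none of the class names below come from the PR.

```java
import java.util.Map;

/** Illustrative only: hypothetical stand-ins, not the PR's DefaultToolContext. */
final class ToolDispatchSketch {

  /** Hypothetical tool contract: parameters in, result map out. */
  interface Tool {
    Map<String, Object> execute(Map<String, Object> params);
  }

  // Placeholder bodies; a real tool would authorize the caller and call a repository.
  static final Tool RUN_INGESTION =
      params -> Map.of("triggered", String.valueOf(params.get("fqn")));
  static final Tool GET_INGESTION_STATUS =
      params -> Map.of("fqn", String.valueOf(params.get("fqn")), "runs", 0);

  /** Routes a call by tool name, one switch arm per tool. */
  static Map<String, Object> callTool(String name, Map<String, Object> params) {
    switch (name) {
      case "run_ingestion":
        return RUN_INGESTION.execute(params);
      case "get_ingestion_status":
        return GET_INGESTION_STATUS.execute(params);
      default:
        return Map.of("error", "Unknown tool: " + name);
    }
  }
}
```

Because the tools are plain deterministic wrappers, an unknown name simply yields an error map rather than an exception in this sketch; the real server may report it differently.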

Files added/changed

  • openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/RunIngestionTool.java (new)
  • openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/GetIngestionStatusTool.java (new)
  • openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/CreateAlertTool.java (new)
  • openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/DefaultToolContext.java — 3 new switch arms in callTool()
  • openmetadata-mcp/src/main/resources/json/data/mcp/tools.json — 3 new tool entries
  • openmetadata-mcp/src/test/java/org/openmetadata/mcp/tools/ — 3 new test classes

Known issues / follow-ups

  • RunIngestionTool uses reflection to read the private pipelineServiceClient field on IngestionPipelineRepository (no public getter today). Flagged in javadoc. A clean follow-up is to add a public runIngestion(...) on the repository and drop the reflection.
  • CreateAlertTool initially passed secretKey="" for the webhook destination; switched to null after observing Fernet-encryption behavior on empty strings. Unit-tested both paths.

v1 scope for create_alert: resourceType=ingestionPipeline, eventType=pipelineFailed, webhook destination only. Slack / email / additional event types are intentionally out of scope (separate PRs under #26609).
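As a rough illustration of that v1 scope, the sketch below rejects anything outside `ingestionPipeline` / `pipelineFailed` / webhook. The class, the `validate` method, and the `webhookUrl` parameter name are assumptions made for the example, not identifiers confirmed by the PR.

```java
import java.util.Map;

/** Illustrative only: a hypothetical validator for the v1 create_alert scope. */
final class AlertScopeSketch {
  static final String SUPPORTED_RESOURCE_TYPE = "ingestionPipeline";
  static final String SUPPORTED_EVENT_TYPE = "pipelineFailed";

  /** Returns null when the parameters are in scope, otherwise an error message. */
  static String validate(Map<String, Object> params) {
    if (!SUPPORTED_RESOURCE_TYPE.equals(params.get("resourceType"))) {
      return "v1 supports resourceType=" + SUPPORTED_RESOURCE_TYPE + " only";
    }
    if (!SUPPORTED_EVENT_TYPE.equals(params.get("eventType"))) {
      return "v1 supports eventType=" + SUPPORTED_EVENT_TYPE + " only";
    }
    Object endpoint = params.get("webhookUrl"); // hypothetical parameter name
    if (!(endpoint instanceof String) || ((String) endpoint).isBlank()) {
      return "a webhook destination is required";
    }
    return null;
  }
}
```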

Type of change

  • New feature (non-breaking change which adds functionality)

How was this tested?

  • Unit tests: mvn -pl openmetadata-mcp test -Dtest="RunIngestionToolTest,GetIngestionStatusToolTest,CreateAlertToolTest" (10 / 10 pass: 2 + 3 + 5)
  • mvn -pl openmetadata-mcp spotless:apply — clean
  • Existing MCP test suite still green after the DefaultToolContext.java edit
  • Manually verified the tools register on a custom OM 1.12-SNAPSHOT build that includes this branch — all three appear in the MCP tools/list response and are dispatchable from a Claude Desktop client

Checklist

  • I have read the CONTRIBUTING document
  • My PR title follows the conventional-commit pattern
  • I have commented on my code, particularly in hard-to-understand areas (reflection in RunIngestionTool is documented in javadoc)
  • For JSON Schema changes: tools.json is the MCP tool registry, not a stored entity schema — no migration needed
  • The issue (New MCP Tools #26609) properly describes why the new feature is needed, what's the goal, and how we are building it
  • I have added tests around the new logic
  • I have updated the documentation. (Follow-up: docs PR once tool surface stabilizes)

Closes part of #26609.

Aditya Puri added 2 commits April 26, 2026 13:12
…tools

Adds three new MCP tools to the OM MCP server, closing part of open-metadata#26609:

  run_ingestion         triggers an ingestion pipeline by FQN
  get_ingestion_status  returns the most recent N pipeline runs
  create_alert          registers an EventSubscription that posts to a
                        webhook on ingestionPipeline failure

Each is implemented as a McpTool, registered in DefaultToolContext.callTool
and declared in tools.json so Claude Desktop can discover them. No new
REST endpoints; no auth/authz changes; no LLM calls inside tools.

run_ingestion currently reads the @Setter-only pipelineServiceClient
field on IngestionPipelineRepository via reflection -- a small isolated
workaround pending a follow-up PR that exposes runIngestion() publicly.

create_alert is opinionated for v1: resourceType=ingestionPipeline only,
eventType=pipelineFailed only, destination=webhook only. Extending to
multi-event/multi-destination is a follow-up.

Tests: 10 JUnit tests across the three tools.
EventSubscriptionMapper.createToEntity passes every destination through
Fernet.encryptWebhookSecretKey; an empty string was being encrypted in
place of being left absent, which silently broke any future webhook
signature verification at delivery time. Pass null instead so the
encryption step skips the field.

Tests: 10/10 still pass.
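The difference between an empty and an absent secret can be shown with a small sketch. `fakeEncrypt` is a hypothetical stand-in (Base64 with a prefix, which is not encryption) and does not reproduce the real Fernet code path; it only illustrates why `""` ends up stored as a non-null value while `null` is skipped.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Illustrative only: models "encrypt when present, skip when null". */
final class WebhookSecretSketch {

  /** Hypothetical stand-in for a real cipher; Base64 is NOT encryption. */
  static String fakeEncrypt(String plain) {
    return "enc:" + Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
  }

  /** Only a null secret is left absent; an empty string is still "encrypted". */
  static String encryptIfPresent(String secretKey) {
    return secretKey == null ? null : fakeEncrypt(secretKey);
  }
}
```

Under this model, a caller that wants "no secret" has to pass `null`; passing `""` produces a stored value that later code could mistake for a configured secret.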
@github-actions
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

Comment on lines +77 to +79
  String resourceFqn = requireString(params, "resourceFqn");
  if (resourceFqn == null) {
    return errorMap("resourceFqn is required");

🚨 Bug: resourceFqn is validated but never used to scope the alert

The resourceFqn parameter is validated (lines 77-79) and echoed back in the response (line 117), but it is never passed to buildRequest() or otherwise used to configure the EventSubscription filter. The resulting alert fires for all ingestionPipeline failures rather than just the specific pipeline the user requested. This silently violates the API contract described in the JSON schema ("FQN of the specific resource instance to watch") and will flood the webhook with unrelated failure notifications.

The buildRequest method needs to accept resourceFqn and configure an appropriate filter rule (e.g., via FilteringRules or equivalent) on the CreateEventSubscription to restrict the alert to the named pipeline.

Suggested fix:

Pass `resourceFqn` into `buildRequest` and configure a
filtering rule that restricts the subscription to the
specific pipeline entity, e.g.:

  FilteringRules rules = new FilteringRules();
  rules.setResources(List.of(resourceFqn));
  r.setInput(rules);

(Exact API depends on EventSubscription schema.)

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
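To make the scoping problem concrete, here is a minimal sketch in which the validated FQN is actually carried into the request and consulted at match time. `AlertRequest`, `buildRequest`, and `matches` are hypothetical simplifications for illustration; they do not mirror the real `CreateEventSubscription` or its filter schema.

```java
import java.util.Objects;

/** Illustrative only: a simplified alert request that keeps its resource scope. */
final class ScopedAlertSketch {

  /** Hypothetical request type; not the real CreateEventSubscription. */
  static final class AlertRequest {
    final String name;
    final String resourceType;
    final String resourceFqn; // null would mean "every resource of this type"

    AlertRequest(String name, String resourceType, String resourceFqn) {
      this.name = name;
      this.resourceType = resourceType;
      this.resourceFqn = resourceFqn;
    }
  }

  /** Threads the validated FQN into the request instead of dropping it. */
  static AlertRequest buildRequest(String name, String resourceFqn) {
    return new AlertRequest(
        name, "ingestionPipeline", Objects.requireNonNull(resourceFqn, "resourceFqn"));
  }

  /** An event matches only if both the type and the scoped FQN agree. */
  static boolean matches(AlertRequest request, String eventResourceType, String eventFqn) {
    return request.resourceType.equals(eventResourceType)
        && (request.resourceFqn == null || request.resourceFqn.equals(eventFqn));
  }
}
```

A unit test along these lines (one matching FQN, one non-matching) would have caught the dropped parameter, since the request built without the FQN matches every pipeline.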

Comment on lines +57 to +71
  public Map<String, Object> execute(
      Authorizer authorizer,
      Limits limits,
      CatalogSecurityContext securityContext,
      Map<String, Object> params)
      throws IOException {
    final String fqn = requireString(params, PARAM_FQN);
    if (fqn == null) {
      return errorMap(PARAM_FQN + " is required");
    }

    authorizer.authorize(
        securityContext, new OperationContext(RESOURCE, EDIT_ALL), new ResourceContext<>(RESOURCE));

    IngestionPipeline pipeline =

⚠️ Bug: RunIngestionTool accepts Limits but never enforces them

The execute(authorizer, limits, securityContext, params) overload in RunIngestionTool receives a Limits parameter (to be consistent with the DefaultToolContext.callTool dispatch) but never calls limits.enforceLimits(...). Compare with CreateAlertTool (line 102) which explicitly enforces limits before proceeding. If the deployment has rate/quota limits configured for ingestion triggers, they will be silently bypassed via this MCP tool.

Suggested fix:

Add limits enforcement before the authorize call:

  OperationContext opCtx = new OperationContext(RESOURCE, EDIT_ALL);
  ResourceContext<?> resourceContext = new ResourceContext<>(RESOURCE);
  limits.enforceLimits(securityContext, resourceContext, opCtx);
  authorizer.authorize(securityContext, opCtx, resourceContext);

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
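The guard ordering being requested can be sketched independently of OpenMetadata's classes. The `Limits` and `Authorizer` interfaces below are hypothetical single-method stand-ins with simplified signatures; the point is only that both guards run before the action and that either one can abort it by throwing.

```java
/** Illustrative only: limits check, then authorization, then the action. */
final class GuardOrderSketch {

  /** Hypothetical stand-in for a quota/rate-limit check. */
  interface Limits {
    void enforce(String user, String operation);
  }

  /** Hypothetical stand-in for an authorization check. */
  interface Authorizer {
    void authorize(String user, String operation);
  }

  /** Runs both guards before the action; an exception from either skips the action. */
  static String trigger(Limits limits, Authorizer authorizer, String user, Runnable action) {
    limits.enforce(user, "EDIT_ALL");
    authorizer.authorize(user, "EDIT_ALL");
    action.run();
    return "triggered";
  }
}
```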

Comment on lines +68 to +69
    authorizer.authorize(
        securityContext, new OperationContext(RESOURCE, EDIT_ALL), new ResourceContext<>(RESOURCE));

⚠️ Security: Authorization checks resource type, not specific entity instance

RunIngestionTool (lines 68-69) authorizes with new ResourceContext<>(RESOURCE), which checks EDIT_ALL against the resource type (ingestionPipeline) rather than the specific pipeline entity being triggered. The actual IngestionPipelineResource uses getResourceContextById(pipeline.getId()) for entity-level authorization. This means a user who is allowed to edit some pipelines can trigger any pipeline through this MCP tool, bypassing entity-level RBAC policies.

The same pattern applies to GetIngestionStatusTool (lines 54-55).

Suggested fix:

Use entity-level ResourceContext after resolving the pipeline:

  OperationContext operationContext = new OperationContext(RESOURCE, EDIT_ALL);
  ResourceContext<?> resourceContext =
      new ResourceContext<>(RESOURCE, pipeline.getId(), null);
  authorizer.authorize(
      securityContext, operationContext, resourceContext);

Note: move the authorize call after the entity lookup.

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
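The gap between a type-level and an entity-level check can be seen in a toy model. The `EDITABLE` policy table and both methods are hypothetical; they deliberately simplify how a real policy engine evaluates rules and only illustrate the claim made in the comment above.

```java
import java.util.Map;
import java.util.Set;

/** Illustrative only: a toy policy contrasting type-level and entity-level checks. */
final class EntityAuthSketch {

  /** Hypothetical policy: user name mapped to the pipeline FQNs that user may edit. */
  static final Map<String, Set<String>> EDITABLE =
      Map.of(
          "alice", Set.of("svc.pipeA"),
          "admin", Set.of("svc.pipeA", "svc.pipeB"));

  /** Type-level check: passes if the user may edit any pipeline at all. */
  static boolean canEditType(String user) {
    return !EDITABLE.getOrDefault(user, Set.of()).isEmpty();
  }

  /** Entity-level check: passes only for the specific pipeline being triggered. */
  static boolean canEditEntity(String user, String pipelineFqn) {
    return EDITABLE.getOrDefault(user, Set.of()).contains(pipelineFqn);
  }
}
```

In this model `alice` passes the type-level check yet fails the entity-level check for `svc.pipeB`, which is the situation the review flags.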

Comment on lines +83 to +85
    if (!SUPPORTED_EVENT_TYPE.equals(eventType)) {
      return errorMap("v1 supports eventType=" + SUPPORTED_EVENT_TYPE + " only");
    }

⚠️ Bug: CreateAlertTool does not set eventType filter on subscription

Similar to the resourceFqn issue, the eventType parameter ("pipelineFailed") is validated (lines 83-85) and echoed in the response (line 118), but buildRequest() never configures a filter for the specific event type. The alert will trigger on all events for the ingestionPipeline resource (creation, update, deletion, etc.), not just pipelineFailed as the user requested. This makes the tool's described behavior ("alert when pipeline fails") incorrect.

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
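A short sketch of why an unset event-type filter is a problem. It assumes, as the comment above describes, that a subscription with no event-type filter receives every event; `buildFilter` and `shouldDeliver` are hypothetical helpers, not part of the EventSubscription API.

```java
import java.util.Set;

/** Illustrative only: an empty event-type filter is treated as "deliver everything". */
final class EventTypeFilterSketch {

  /** Builds the filter from the already-validated eventType parameter. */
  static Set<String> buildFilter(String eventType) {
    return Set.of(eventType);
  }

  /** Assumption: an empty filter set places no restriction on the event type. */
  static boolean shouldDeliver(Set<String> eventTypeFilter, String eventType) {
    return eventTypeFilter.isEmpty() || eventTypeFilter.contains(eventType);
  }
}
```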

Comment on lines +116 to +128
  private static PipelineServiceClientInterface readPipelineServiceClient(
      IngestionPipelineRepository repo) {
    try {
      Field field = IngestionPipelineRepository.class.getDeclaredField("pipelineServiceClient");
      field.setAccessible(true);
      return (PipelineServiceClientInterface) field.get(repo);
    } catch (ReflectiveOperationException exc) {
      LOG.warn(
          "Could not access IngestionPipelineRepository.pipelineServiceClient: {}",
          exc.getMessage());
      return null;
    }
  }

💡 Quality: Reflection to access private field is fragile

RunIngestionTool.readPipelineServiceClient uses reflection to access IngestionPipelineRepository.pipelineServiceClient (lines 116-128). If the field is renamed or its access is changed, this breaks silently at runtime (returning null and reporting a misleading "not configured" error). The PR description acknowledges this as a known issue. Consider adding a public accessor to the repository in this PR rather than deferring it, since the workaround introduces a runtime fragility with a misleading error message.

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
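If the reflection has to stay for now, one way to avoid the misleading error is to separate "the field could not be read" from "the field is null". The sketch below uses a hypothetical `Repo` class and a generic `readField` helper; it is not the PR's code and is only one possible shape for the workaround.

```java
import java.lang.reflect.Field;

/** Illustrative only: a reflective read that fails loudly instead of returning null. */
final class ReflectionReadSketch {

  /** Hypothetical stand-in for a repository with a private, getter-less field. */
  static final class Repo {
    private final Object client;

    Repo(Object client) {
      this.client = client;
    }
  }

  /** Returns the field value (possibly null); throws if the field cannot be read. */
  static Object readField(Object target, String fieldName) {
    try {
      Field field = target.getClass().getDeclaredField(fieldName);
      field.setAccessible(true);
      return field.get(target);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(
          "Field '" + fieldName + "' is not readable on " + target.getClass().getName(), e);
    }
  }
}
```

With this shape a renamed field surfaces as an exception naming the field, while a genuinely unconfigured client still comes back as `null` and can be reported as such.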

gitar-bot Bot commented Apr 26, 2026

Code Review 🚫 Blocked 0 resolved / 5 findings

Adds ingestion and alert tools, but fails to apply resource scoping, ingestion limits, entity-level authorization, and event filtering, resulting in security and functional gaps.

🚨 Bug: resourceFqn is validated but never used to scope the alert

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/CreateAlertTool.java:77-79 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/CreateAlertTool.java:97 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/CreateAlertTool.java:117 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/CreateAlertTool.java:125-139 📄 openmetadata-mcp/src/main/resources/json/data/mcp/tools.json:647-649

⚠️ Bug: RunIngestionTool accepts Limits but never enforces them

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/RunIngestionTool.java:57-71 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/DefaultToolContext.java:87

⚠️ Security: Authorization checks resource type, not specific entity instance

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/RunIngestionTool.java:68-69 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/GetIngestionStatusTool.java:54-55

⚠️ Bug: CreateAlertTool does not set eventType filter on subscription

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/CreateAlertTool.java:83-85 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/CreateAlertTool.java:118 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/CreateAlertTool.java:125-139

💡 Quality: Reflection to access private field is fragile

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/RunIngestionTool.java:116-128

@puri-adityakumar changed the title from "feat(mcp): add run_ingestion, get_ingestion_status, create_alert tools" to "feat(mcp): add tools to run, monitor, and alert on ingestion pipelines" on Apr 26, 2026