feat(mcp): add workflow automation MCP tools for ingestion pipeline management #27741

Open
Keerthivasan-Venkitajalam wants to merge 2 commits into open-metadata:main from Keerthivasan-Venkitajalam:feat/mcp-workflow-automation-tools

Conversation

Keerthivasan-Venkitajalam commented Apr 26, 2026

Summary

Closes part of #26609 — implements the Workflow Automation category of new MCP tools.

Adds three new tools to the OpenMetadata MCP server so agents can manage ingestion pipelines conversationally, without switching to the UI or calling REST APIs directly.

New Tools

| Tool | Description |
| --- | --- |
| list_ingestion_pipelines | List pipelines, optionally filtered by service or pipeline type |
| get_pipeline_status | Get latest run status + recent run history for a pipeline |
| trigger_ingestion_pipeline | Trigger an immediate run of a deployed pipeline |

Architecture

sequenceDiagram
    participant Agent as AI Agent / MCP Client
    participant MCP as McpServer
    participant DTC as DefaultToolContext
    participant Repo as IngestionPipelineRepository
    participant PSC as PipelineServiceClient

    Agent->>MCP: call tool (e.g. trigger_ingestion_pipeline)
    MCP->>DTC: callTool(toolName, params)

    alt list_ingestion_pipelines
        DTC->>Repo: listAfter(filter, limit)
        Repo-->>DTC: ResultList<IngestionPipeline>
    else get_pipeline_status
        DTC->>Repo: getLatestPipelineStatus(pipeline)
        DTC->>Repo: listPipelineStatus(fqn, limit)
        Repo-->>DTC: PipelineStatus + history
    else trigger_ingestion_pipeline
        DTC->>Repo: getEntityByName(fqn)
        DTC->>PSC: runPipeline(pipeline, service)
        PSC-->>DTC: PipelineServiceClientResponse
    end

    DTC-->>MCP: Map<String, Object>
    MCP-->>Agent: CallToolResult (JSON)
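The dispatch step in the diagram (tool name in, `Map<String, Object>` out) can be sketched as below. This is a simplified illustration, not the PR's code: `ToolRouter` and its `register` method are hypothetical names, and the real `DefaultToolContext` uses a switch statement and also passes the authorizer and security context to each tool.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified router: maps a tool name to a handler that turns
// request params into a result map, mirroring callTool(toolName, params).
final class ToolRouter {
  private final Map<String, Function<Map<String, Object>, Map<String, Object>>> handlers =
      new LinkedHashMap<>();

  // Registers a handler under a tool name (assumed API, for illustration only).
  void register(
      String toolName, Function<Map<String, Object>, Map<String, Object>> handler) {
    handlers.put(toolName, handler);
  }

  // Looks up the handler and runs it; unknown tool names fail loudly.
  Map<String, Object> callTool(String toolName, Map<String, Object> params) {
    Function<Map<String, Object>, Map<String, Object>> handler = handlers.get(toolName);
    if (handler == null) {
      throw new IllegalArgumentException("Unknown tool: " + toolName);
    }
    return handler.apply(params);
  }
}
```

A map-based registry is used here only to keep the sketch short; it is not a claim about how the PR wires its tools.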

Files Changed

  • McpApplicationContext.java (new) — lightweight static holder for OpenMetadataApplicationConfig, populated at MCP server init so tools can access app config without reflection
  • McpServer.java — registers config into McpApplicationContext on startup
  • ListIngestionPipelinesTool.java (new) — lists ingestion pipelines with service/type filters
  • GetPipelineStatusTool.java (new) — fetches latest status + recent run history
  • TriggerIngestionPipelineTool.java (new) — triggers a pipeline run via PipelineServiceClientFactory
  • DefaultToolContext.java — wires the 3 new tools into the switch router
  • tools.json — adds JSON schema definitions for all 3 tools (LLM-facing descriptions + input schemas)
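A "lightweight static holder" of the kind described for `McpApplicationContext` might look like the following sketch. The names `McpConfigHolder` and `AppConfigStub` are hypothetical stand-ins (the stub replaces `OpenMetadataApplicationConfig` so the example compiles on its own), and the fail-fast behavior on an uninitialized read is an assumption, not something the PR description states.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for OpenMetadataApplicationConfig.
final class AppConfigStub {
  final String pipelineServiceUrl;

  AppConfigStub(String pipelineServiceUrl) {
    this.pipelineServiceUrl = pipelineServiceUrl;
  }
}

// Hypothetical static holder: set once at server init, read by tools later.
final class McpConfigHolder {
  private static final AtomicReference<AppConfigStub> CONFIG = new AtomicReference<>();

  private McpConfigHolder() {}

  // Called during server startup; rejects null so later reads are safe.
  static void setConfig(AppConfigStub config) {
    CONFIG.set(Objects.requireNonNull(config, "config"));
  }

  // Assumed behavior: fail fast if a tool runs before initialization.
  static AppConfigStub getConfig() {
    AppConfigStub config = CONFIG.get();
    if (config == null) {
      throw new IllegalStateException("Application config has not been initialized");
    }
    return config;
  }
}
```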

Example Agent Interactions

"List all ingestion pipelines for the bigquery_prod service"
→ list_ingestion_pipelines { service: "bigquery_prod" }

"What's the status of the last run for bigquery_prod.metadata_ingestion?"
→ get_pipeline_status { fqn: "bigquery_prod.metadata_ingestion" }

"Trigger the metadata ingestion for bigquery_prod right now"
→ trigger_ingestion_pipeline { fqn: "bigquery_prod.metadata_ingestion" }

Checklist

  • Follows existing tool patterns (McpTool interface, Entity.getEntityRepository, auth via authorizer.authorize)
  • LLM-optimized tool descriptions (verbose fields stripped, context-aware parameter descriptions)
  • Authorization enforced (VIEW_BASIC for list/status, EDIT_ALL for trigger)
  • tools.json schema updated with input validation and helpful descriptions
  • Graceful error when pipeline service client is not configured

Summary by Gitar

  • Refactored logic:
    • Added CommonUtils.parseLimit to centralize limit parameter parsing and validation.
    • Streamlined ListIngestionPipelinesTool and GetPipelineStatusTool to utilize the new helper.
  • Enhanced safety:
    • Added a deployment guard in TriggerIngestionPipelineTool to prevent triggering non-deployed pipelines.

This will update automatically on new commits.

…anagement

Addresses the Workflow Automation category from issue open-metadata#26609 (New MCP Tools).

Adds three new MCP tools:
- list_ingestion_pipelines: list pipelines by service/type with pagination
- get_pipeline_status: fetch latest run status and recent run history
- trigger_ingestion_pipeline: trigger an immediate pipeline run on demand

Also adds McpApplicationContext to store OpenMetadataApplicationConfig at
server init time, making it accessible from MCP tools without reflection.
Copilot AI review requested due to automatic review settings April 26, 2026 16:38
@github-actions
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

  case "create_metric":
    result = new CreateMetricTool().execute(authorizer, limits, securityContext, params);
    break;
  case "list_ingestion_pipelines":
💡 Quality: Raw string literals used as tool name constants in switch

Per project standards ('No Magic: Never use raw strings for constants or logic branches. Use Enums/Constants.'), the new tool names "list_ingestion_pipelines", "get_pipeline_status", and "trigger_ingestion_pipeline" are used as bare string literals in the switch statement. These should be declared as named constants (matching the pattern that should be used for the existing cases as well, but that's out of scope).
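One way to act on this suggestion is sketched below. `McpToolNames` and `ToolNameDemo` are hypothetical names chosen for illustration; the PR does not define them, and the strings returned by `describe` are placeholders rather than real tool output.

```java
// Hypothetical constants class: one named constant per tool name.
final class McpToolNames {
  static final String LIST_INGESTION_PIPELINES = "list_ingestion_pipelines";
  static final String GET_PIPELINE_STATUS = "get_pipeline_status";
  static final String TRIGGER_INGESTION_PIPELINE = "trigger_ingestion_pipeline";

  private McpToolNames() {}
}

// Illustrative switch that branches on the constants instead of raw literals.
// static final String fields initialized with literals are compile-time
// constants, so they are valid case labels.
final class ToolNameDemo {
  private ToolNameDemo() {}

  static String describe(String toolName) {
    switch (toolName) {
      case McpToolNames.LIST_INGESTION_PIPELINES:
        return "list";
      case McpToolNames.GET_PIPELINE_STATUS:
        return "status";
      case McpToolNames.TRIGGER_INGESTION_PIPELINE:
        return "trigger";
      default:
        throw new IllegalArgumentException("Unknown tool: " + toolName);
    }
  }
}
```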

Comment on lines +22 to +36
public Map<String, Object> execute(
    Authorizer authorizer, CatalogSecurityContext securityContext, Map<String, Object> params)
    throws IOException {
  String fqn = (String) params.get("fqn");
  if (fqn == null || fqn.isBlank()) {
    throw new IllegalArgumentException("Parameter 'fqn' is required");
  }

  int limit = 5;
  if (params.containsKey("limit")) {
    Object limitObj = params.get("limit");
    if (limitObj instanceof Number number) {
      limit = number.intValue();
    } else if (limitObj instanceof String s) {
      try {
💡 Quality: Methods exceed 15-line guideline

The execute() methods in GetPipelineStatusTool (45 lines), ListIngestionPipelinesTool (56 lines), and TriggerIngestionPipelineTool (40 lines) exceed the 15-line method limit from the project coding standards. Consider extracting parameter parsing, authorization, and response building into separate private methods.
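The extraction this comment suggests could take roughly the following shape. `StatusToolSketch` is a hypothetical class: authorization and repository calls are left out so the example stays self-contained, and string-valued limits are deliberately not handled here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a short execute() that delegates to private helpers.
// Authorization and repository access are omitted on purpose.
final class StatusToolSketch {
  private static final int DEFAULT_LIMIT = 5;

  Map<String, Object> execute(Map<String, Object> params) {
    String fqn = requireFqn(params);
    int limit = readLimit(params);
    return buildResponse(fqn, limit);
  }

  // Parameter validation lives in its own method.
  private static String requireFqn(Map<String, Object> params) {
    if (params.get("fqn") instanceof String s && !s.isBlank()) {
      return s;
    }
    throw new IllegalArgumentException("Parameter 'fqn' is required");
  }

  // Only numeric limits are read in this sketch; anything else falls back
  // to the default.
  private static int readLimit(Map<String, Object> params) {
    return params.get("limit") instanceof Number n ? n.intValue() : DEFAULT_LIMIT;
  }

  // Response assembly is isolated so it can change independently.
  private static Map<String, Object> buildResponse(String fqn, int limit) {
    Map<String, Object> response = new LinkedHashMap<>();
    response.put("fqn", fqn);
    response.put("limit", limit);
    return response;
  }
}
```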

…guard, apply google-java-format

- Extract duplicate limit-parsing logic to CommonUtils.parseLimit()
- Add deployed check in TriggerIngestionPipelineTool before attempting runPipeline
- Reformat all new files with google-java-format (GOOGLE style) to pass spotless check
gitar-bot Bot commented Apr 26, 2026

Code Review 👍 Approved with suggestions 3 resolved / 5 findings

Implements new MCP tools for ingestion pipeline management while resolving input parsing errors and missing deployment checks. Refactor the execute methods to meet length guidelines and replace raw string literals with defined constants to improve maintainability.

💡 Quality: Raw string literals used as tool name constants in switch

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/DefaultToolContext.java:86 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/DefaultToolContext.java:89 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/DefaultToolContext.java:92

Per project standards ('No Magic: Never use raw strings for constants or logic branches. Use Enums/Constants.'), the new tool names "list_ingestion_pipelines", "get_pipeline_status", and "trigger_ingestion_pipeline" are used as bare string literals in the switch statement. These should be declared as named constants (matching the pattern that should be used for the existing cases as well, but that's out of scope).

💡 Quality: Methods exceed 15-line guideline

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/GetPipelineStatusTool.java:22-36 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/ListIngestionPipelinesTool.java:34-48 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/TriggerIngestionPipelineTool.java:25-39

The execute() methods in GetPipelineStatusTool (45 lines), ListIngestionPipelinesTool (56 lines), and TriggerIngestionPipelineTool (40 lines) exceed the 15-line method limit from the project coding standards. Consider extracting parameter parsing, authorization, and response building into separate private methods.

✅ 3 resolved
Edge Case: NumberFormatException silently swallowed, invalid input ignored

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/GetPipelineStatusTool.java:38-39 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/ListIngestionPipelinesTool.java:49-50
In both GetPipelineStatusTool (line 38) and ListIngestionPipelinesTool (line 49), when the limit parameter is a non-numeric string, the NumberFormatException is silently caught and the limit is reset to the default. This means a caller passing "limit": "abc" gets no indication their input was invalid. Per project standards, empty catch blocks should be avoided and exceptions should be logged with context.

Quality: Duplicate limit-parsing logic should be extracted to a helper

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/GetPipelineStatusTool.java:30-42 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/ListIngestionPipelinesTool.java:41-53
GetPipelineStatusTool (lines 30-42) and ListIngestionPipelinesTool (lines 41-53) contain identical boilerplate for parsing an optional integer parameter from a Map<String, Object>. This violates DRY and will drift over time. Consider extracting a shared utility method, e.g. McpToolUtils.getIntParam(params, "limit", defaultValue).
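A shared helper along these lines might look like the sketch below. The PR's actual `CommonUtils.parseLimit` signature and behavior are not shown on this page, so `LimitParser.parseLimit` is a hypothetical version. It assumes invalid input should be rejected with a clear error rather than silently replaced by the default.

```java
import java.util.Map;

// Hypothetical helper; not the PR's CommonUtils.parseLimit.
final class LimitParser {
  private LimitParser() {}

  // Returns defaultValue when the key is absent; otherwise accepts a Number
  // or a numeric String and rejects everything else with context.
  static int parseLimit(Map<String, Object> params, String key, int defaultValue) {
    Object raw = params.get(key);
    if (raw == null) {
      return defaultValue;
    }
    int value;
    if (raw instanceof Number number) {
      value = number.intValue();
    } else if (raw instanceof String s) {
      try {
        value = Integer.parseInt(s.trim());
      } catch (NumberFormatException e) {
        throw new IllegalArgumentException(
            "Parameter '" + key + "' must be an integer, got: " + s, e);
      }
    } else {
      throw new IllegalArgumentException(
          "Parameter '" + key + "' has unsupported type: " + raw.getClass().getSimpleName());
    }
    if (value < 1) {
      throw new IllegalArgumentException("Parameter '" + key + "' must be positive");
    }
    return value;
  }
}
```

With this shape, a caller passing `"limit": "abc"` gets an `IllegalArgumentException` naming the parameter instead of a silent fallback.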

Bug: TriggerIngestionPipelineTool doesn't check if pipeline is deployed

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/TriggerIngestionPipelineTool.java:50-63
The trigger_ingestion_pipeline tool description states it 'Requires the pipeline to be deployed', but TriggerIngestionPipelineTool.execute() never checks pipeline.getDeployed() before calling pipelineServiceClient.runPipeline(). Triggering a non-deployed pipeline will likely produce a confusing error from the pipeline service backend instead of a clear user-facing message.
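The missing check could be expressed as a small guard like the one below. `DeploymentGuard` is a hypothetical helper, not the code added in the PR, and it assumes the deployed flag may be a nullable boxed `Boolean`, so null is treated the same as "not deployed".

```java
// Hypothetical guard; the PR's actual check may be written differently.
final class DeploymentGuard {
  private DeploymentGuard() {}

  // 'deployed' is assumed to be nullable, so anything other than TRUE is
  // treated as "not deployed" and produces a clear, user-facing message.
  static void requireDeployed(String pipelineFqn, Boolean deployed) {
    if (!Boolean.TRUE.equals(deployed)) {
      throw new IllegalStateException(
          "Pipeline '" + pipelineFqn + "' is not deployed; deploy it before triggering a run");
    }
  }
}
```

Calling a guard like this before `runPipeline` would surface the problem up front instead of relying on an error from the pipeline service backend.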

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
