agents

AI agent tooling for data engineering workflows. Includes an MCP server for Airflow, a CLI tool (af) for interacting with Airflow from your terminal, and skills that extend AI coding agents with specialized capabilities for working with Airflow and data warehouses. Works with Claude Code, Cursor, and other agentic coding tools.

Built by Astronomer. Apache 2.0 licensed and compatible with open-source Apache Airflow.

Installation

Quick Start

npx skills add astronomer/agents --skill '*'

This installs all Astronomer skills into your project via skills.sh. You'll be prompted to select which agents to install to. To also be prompted to choose skills individually, omit the --skill flag.

Claude Code users: We recommend using the plugin instead (see Claude Code section below) for better integration with MCP servers and hooks.

Compatibility

Skills: Works with 25+ AI coding agents including Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, Cline, and more.

MCP Server: Works with any MCP-compatible client including Claude Desktop, VS Code, and others.

Claude Code

# Add the marketplace and install the plugin
claude plugin marketplace add astronomer/agents
claude plugin install data@astronomer

The plugin includes the Airflow MCP server that runs via uvx from PyPI. Data warehouse queries are handled by the analyzing-data skill using a background Jupyter kernel.
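Because the server is launched on demand with uvx, uv needs to be on your PATH. If you don't have it yet, a minimal setup on macOS/Linux using the installer documented by the uv project looks like this (see the uv docs for other platforms):

# Install uv (which provides uvx), then confirm it's available
curl -LsSf https://astral.sh/uv/install.sh | sh
uv --version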

Cursor

Cursor supports both MCP servers and skills.

MCP Server - Click the "Add Airflow MCP to Cursor" button in the repository README to install it, or configure it manually (see below).

Skills - Install to your project:

npx skills add astronomer/agents --skill '*' -a cursor

This installs skills to .cursor/skills/ in your project.

Manual MCP configuration

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "airflow": {
      "command": "uvx",
      "args": ["astro-airflow-mcp", "--transport", "stdio"]
    }
  }
}
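Before wiring this into Cursor, you can optionally confirm that the server launches at all. It uses the stdio transport, so it will sit waiting for an MCP client on stdin; exit with Ctrl+C:

# Smoke test: the same command Cursor will run (Ctrl+C to exit)
uvx astro-airflow-mcp --transport stdio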

Enable hooks (skill suggestions, session management)

Create .cursor/hooks.json in your project:

{
  "version": 1,
  "hooks": {
    "beforeSubmitPrompt": [
      {
        "command": "$CURSOR_PROJECT_DIR/.cursor/skills/airflow/hooks/airflow-skill-suggester.sh",
        "timeout": 5
      }
    ],
    "stop": [
      {
        "command": "uv run $CURSOR_PROJECT_DIR/.cursor/skills/analyzing-data/scripts/cli.py stop",
        "timeout": 10
      }
    ]
  }
}

What these hooks do:

  • beforeSubmitPrompt: Suggests data skills when you mention Airflow keywords
  • stop: Cleans up kernel when session ends
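After installing the skills, you can run the hook commands by hand from the project root to confirm the paths resolve. Treat this as a rough smoke test only, since Cursor normally passes hook context as JSON on stdin:

# Paths assume the default .cursor/skills install location used above
bash .cursor/skills/airflow/hooks/airflow-skill-suggester.sh
uv run .cursor/skills/analyzing-data/scripts/cli.py stop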

Other MCP Clients

For any MCP-compatible client (Claude Desktop, VS Code, etc.):

# Airflow MCP
uvx astro-airflow-mcp --transport stdio

# With remote Airflow
AIRFLOW_API_URL=https://your-airflow.example.com \
AIRFLOW_USERNAME=admin \
AIRFLOW_PASSWORD=admin \
uvx astro-airflow-mcp --transport stdio

Features

The data plugin bundles an MCP server and skills into a single installable package.

MCP Server

Server Description
Airflow Full Airflow REST API integration via astro-airflow-mcp: DAG management, triggering, task logs, system health

Skills

Data Discovery & Analysis

Skill Description
init Initialize schema discovery - generates .astro/warehouse.md for instant lookups
analyzing-data SQL-based analysis to answer business questions (uses background Jupyter kernel)
checking-freshness Check how current your data is
profiling-tables Comprehensive table profiling and quality assessment

Data Lineage

Skill Description
tracing-downstream-lineage Analyze what breaks if you change something
tracing-upstream-lineage Trace where data comes from
annotating-task-lineage Add manual lineage to tasks using inlets/outlets
creating-openlineage-extractors Build custom OpenLineage extractors for operators

DAG Development

Skill Description
airflow Main entrypoint - routes to specialized Airflow skills
setting-up-astro-project Initialize and configure new Astro/Airflow projects
managing-astro-local-env Manage local Airflow environment (start, stop, logs, troubleshoot)
authoring-dags Create and validate Airflow DAGs with best practices
testing-dags Test and debug Airflow DAGs locally
debugging-dags Deep failure diagnosis and root cause analysis

Migration

Skill Description
migrating-airflow-2-to-3 Migrate DAGs from Airflow 2.x to 3.x

User Journeys

Data Analysis Flow

flowchart LR
    init["/data:init"] --> analyzing["/data:analyzing-data"]
    analyzing --> profiling["/data:profiling-tables"]
    analyzing --> freshness["/data:checking-freshness"]
  1. Initialize (/data:init) - One-time setup to generate warehouse.md with schema metadata
  2. Analyze (/data:analyzing-data) - Answer business questions with SQL
  3. Profile (/data:profiling-tables) - Deep dive into specific tables for statistics and quality
  4. Check freshness (/data:checking-freshness) - Verify data is up to date before using
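A typical session strings these together as slash commands inside your agent; the prompts in the comments are illustrative:

/data:init                  # one-time: generate .astro/warehouse.md
/data:analyzing-data        # "Show me revenue trends by product"
/data:profiling-tables      # dig into a specific table's statistics and quality
/data:checking-freshness    # confirm the source data is current before sharing results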

DAG Development Flow

flowchart LR
    setup["/data:setting-up-astro-project"] --> authoring["/data:authoring-dags"]
    setup --> env["/data:managing-astro-local-env"]
    authoring --> testing["/data:testing-dags"]
    testing --> debugging["/data:debugging-dags"]
  1. Setup (/data:setting-up-astro-project) - Initialize project structure and dependencies
  2. Environment (/data:managing-astro-local-env) - Start/stop local Airflow for development
  3. Author (/data:authoring-dags) - Write DAG code following best practices
  4. Test (/data:testing-dags) - Run DAGs and fix issues iteratively
  5. Debug (/data:debugging-dags) - Deep investigation for complex failures
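The development loop looks similar; authoring and testing iterate until the DAG runs cleanly (prompts illustrative):

/data:setting-up-astro-project    # scaffold the project
/data:managing-astro-local-env    # start local Airflow
/data:authoring-dags              # "Create a DAG that loads data from S3 to Snowflake daily"
/data:testing-dags                # run it locally and fix issues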

Airflow CLI (af)

The af command-line tool lets you interact with Airflow directly from your terminal. Run it with uvx:

uvx --from astro-airflow-mcp af --help

For frequent use, add an alias to your shell config (~/.bashrc or ~/.zshrc):

alias af='uvx --from astro-airflow-mcp af'

Then use it for quick operations like af health, af dags list, or af runs trigger <dag_id>.
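For example, a quick smoke test against your Airflow instance using the commands listed above (the DAG id is a placeholder):

af health                      # check that Airflow is reachable and healthy
af dags list                   # see which DAGs are deployed
af runs trigger example_dag    # example_dag is a placeholder DAG id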

See the full CLI documentation for all commands and instance management.

Configuration

Warehouse Connections

Configure data warehouse connections at ~/.astro/agents/warehouse.yml:

my_warehouse:
  type: snowflake
  account: ${SNOWFLAKE_ACCOUNT}
  user: ${SNOWFLAKE_USER}
  auth_type: private_key
  private_key_path: ~/.ssh/snowflake_key.p8
  private_key_passphrase: ${SNOWFLAKE_PRIVATE_KEY_PASSPHRASE}
  warehouse: COMPUTE_WH
  role: ANALYST
  databases:
    - ANALYTICS
    - RAW

Store credentials in ~/.astro/agents/.env:

SNOWFLAKE_ACCOUNT=xyz12345
SNOWFLAKE_USER=myuser
SNOWFLAKE_PRIVATE_KEY_PASSPHRASE=your-passphrase-here  # Only required if using an encrypted private key
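If you don't already have a key pair for Snowflake key-pair auth, the standard OpenSSL steps generate one; the output path matches the private_key_path in the example above, and the passphrase you set becomes SNOWFLAKE_PRIVATE_KEY_PASSPHRASE. Verify the details (cipher, registering the public key on your Snowflake user) against Snowflake's key-pair auth docs:

# Generate an encrypted private key and matching public key
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -v2 aes256 -out ~/.ssh/snowflake_key.p8
openssl rsa -in ~/.ssh/snowflake_key.p8 -pubout -out ~/.ssh/snowflake_key.pub
chmod 600 ~/.ssh/snowflake_key.p8
# Then register the public key on your Snowflake user (ALTER USER ... SET RSA_PUBLIC_KEY=...)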

Supported databases:

Type Package Description
snowflake Built-in Snowflake Data Cloud
postgres Built-in PostgreSQL
bigquery Built-in Google BigQuery
sqlalchemy Any SQLAlchemy driver Auto-detects packages for 25+ databases (see below)
Auto-detected SQLAlchemy databases

The connector automatically installs the correct driver packages for:

Database Dialect URL
PostgreSQL postgresql:// or postgres://
MySQL mysql:// or mysql+pymysql://
MariaDB mariadb://
SQLite sqlite:///
SQL Server mssql+pyodbc://
Oracle oracle://
Redshift redshift://
Snowflake snowflake://
BigQuery bigquery://
DuckDB duckdb:///
Trino trino://
ClickHouse clickhouse://
CockroachDB cockroachdb://
Databricks databricks://
Amazon Athena awsathena://
Cloud Spanner spanner://
Teradata teradata://
Vertica vertica://
SAP HANA hana://
IBM Db2 db2://

For unlisted databases, install the driver manually and use standard SQLAlchemy URLs.

Example configurations
# PostgreSQL
my_postgres:
  type: postgres
  host: localhost
  port: 5432
  user: analyst
  password: ${POSTGRES_PASSWORD}
  database: analytics

# BigQuery
my_bigquery:
  type: bigquery
  project: my-gcp-project
  credentials_path: ~/.config/gcloud/service_account.json

# SQLAlchemy (any supported database)
my_duckdb:
  type: sqlalchemy
  url: duckdb:///path/to/analytics.duckdb
  databases: [main]

# Redshift (via SQLAlchemy)
my_redshift:
  type: sqlalchemy
  url: redshift+redshift_connector://${REDSHIFT_USER}:${REDSHIFT_PASSWORD}@${REDSHIFT_HOST}:5439/${REDSHIFT_DATABASE}
  databases: [my_database]

Airflow

The Airflow MCP server auto-discovers your project when you run Claude Code from an Airflow project directory (one containing airflow.cfg or a dags/ folder).

For remote instances, set environment variables:

Variable Description
AIRFLOW_API_URL Airflow webserver URL
AIRFLOW_USERNAME Username
AIRFLOW_PASSWORD Password
AIRFLOW_AUTH_TOKEN Bearer token (alternative to username/password)
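For example, in your shell profile or just before launching the server (the URL and token path are placeholders):

# Point the MCP server at a remote Airflow deployment
export AIRFLOW_API_URL=https://airflow.internal.example.com
export AIRFLOW_AUTH_TOKEN=$(cat ~/.config/airflow/token)   # or set AIRFLOW_USERNAME / AIRFLOW_PASSWORD
uvx astro-airflow-mcp --transport stdio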

Usage

Skills are invoked automatically based on what you ask. You can also invoke them directly with /data:<skill-name>.

Getting Started

  1. Initialize your warehouse (recommended first step):

    /data:init
    

    This generates .astro/warehouse.md with schema metadata for faster queries.

  2. Ask questions naturally:

    • "What tables contain customer data?"
    • "Show me revenue trends by product"
    • "Create a DAG that loads data from S3 to Snowflake daily"
    • "Why did my etl_pipeline DAG fail yesterday?"

Development

See CLAUDE.md for plugin development guidelines.

Local Development Setup

# Clone the repo
git clone https://github.com/astronomer/agents.git
cd agents

# Test with local plugin
claude --plugin-dir .

# Or install from local marketplace
claude plugin marketplace add .
claude plugin install data@astronomer

Adding Skills

Create a new skill in skills/<name>/SKILL.md with YAML frontmatter:

---
name: my-skill
description: When to invoke this skill
---

# Skill instructions here...

After adding skills, reinstall the plugin:

claude plugin uninstall data@astronomer && claude plugin install data@astronomer
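End to end, adding a skill might look like this; the skill name is hypothetical:

# Scaffold a new skill, write its SKILL.md, then reinstall the plugin to pick it up
mkdir -p skills/validating-data-quality
$EDITOR skills/validating-data-quality/SKILL.md   # add the frontmatter shown above
claude plugin uninstall data@astronomer && claude plugin install data@astronomer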

Troubleshooting

Common Issues

Issue Solution
Skills not appearing Reinstall plugin: claude plugin uninstall data@astronomer && claude plugin install data@astronomer
Warehouse connection errors Check credentials in ~/.astro/agents/.env and connection config in warehouse.yml
Airflow not detected Ensure you're running from a directory with airflow.cfg or a dags/ folder
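A few quick checks that cover the table above (paths are the defaults used elsewhere in this README):

ls airflow.cfg dags/                 # Airflow project detection
cat ~/.astro/agents/warehouse.yml    # warehouse connection config
ls ~/.astro/agents/.env              # credentials referenced via ${...} placeholders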

Contributing

Contributions welcome! Please read our Code of Conduct and Contributing Guide before getting started.

Roadmap

Skills we're likely to build:

DAG Operations

  • CI/CD pipelines for DAG deployment
  • Performance optimization and tuning
  • Monitoring and alerting setup
  • Data quality and validation workflows

Astronomer Open Source

  • Cosmos - Run dbt projects as Airflow DAGs
  • DAG Factory - Generate DAGs from YAML
  • Other open source projects we maintain

Conference Learnings

  • Reviewing talks from Airflow Summit, Coalesce, Data Council, and other conferences to extract reusable skills and patterns

Broader Data Practitioner Skills

  • Churn prediction, data modeling, ML training, and other workflows that span DE/DS/analytics roles

Don't see a skill you want? Open an issue or submit a PR!

License

Apache 2.0


Made with ❤️ by Astronomer