AI agent tooling for data engineering workflows. Includes an MCP server for Airflow, a CLI tool (af) for interacting with Airflow from your terminal, and skills that extend AI coding agents with specialized capabilities for working with Airflow and data warehouses. Works with Claude Code, Cursor, and other agentic coding tools.
Built by Astronomer. Apache 2.0 licensed and compatible with open-source Apache Airflow.
```sh
npx skills add astronomer/agents --skill '*'
```

This installs all Astronomer skills into your project via skills.sh. You'll be prompted to select which agents to install to. To pick skills individually instead, omit the `--skill` flag.
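If you want just one skill, passing its name in place of the wildcard should work (an assumption based on the `--skill '*'` glob above; skill names come from the tables below):

```sh
# Assumption: --skill accepts a specific skill name, not only '*'
npx skills add astronomer/agents --skill authoring-dags
```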
Claude Code users: We recommend using the plugin instead (see Claude Code section below) for better integration with MCP servers and hooks.
Skills: Works with 25+ AI coding agents including Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, Cline, and more.
MCP Server: Works with any MCP-compatible client including Claude Desktop, VS Code, and others.
```sh
# Add the marketplace and install the plugin
claude plugin marketplace add astronomer/agents
claude plugin install data@astronomer
```

The plugin includes the Airflow MCP server that runs via uvx from PyPI. Data warehouse queries are handled by the analyzing-data skill using a background Jupyter kernel.
Cursor supports both MCP servers and skills.
MCP Server - Click to install:
Skills - Install to your project:
```sh
npx skills add astronomer/agents --skill '*' -a cursor
```

This installs skills to `.cursor/skills/` in your project.
Manual MCP configuration
Add to ~/.cursor/mcp.json:
```json
{
  "mcpServers": {
    "airflow": {
      "command": "uvx",
      "args": ["astro-airflow-mcp", "--transport", "stdio"]
    }
  }
}
```

Enable hooks (skill suggestions, session management)
Create .cursor/hooks.json in your project:
```json
{
  "version": 1,
  "hooks": {
    "beforeSubmitPrompt": [
      {
        "command": "$CURSOR_PROJECT_DIR/.cursor/skills/airflow/hooks/airflow-skill-suggester.sh",
        "timeout": 5
      }
    ],
    "stop": [
      {
        "command": "uv run $CURSOR_PROJECT_DIR/.cursor/skills/analyzing-data/scripts/cli.py stop",
        "timeout": 10
      }
    ]
  }
}
```

What these hooks do:

- `beforeSubmitPrompt`: Suggests data skills when you mention Airflow keywords
- `stop`: Cleans up the kernel when the session ends
For any MCP-compatible client (Claude Desktop, VS Code, etc.):
```sh
# Airflow MCP
uvx astro-airflow-mcp --transport stdio

# With remote Airflow
AIRFLOW_API_URL=https://your-airflow.example.com \
AIRFLOW_USERNAME=admin \
AIRFLOW_PASSWORD=admin \
uvx astro-airflow-mcp --transport stdio
```
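As a concrete instance of the generic setup above, a Claude Desktop entry in `claude_desktop_config.json` might look like this (a sketch; the URL and credentials are placeholders for your deployment):

```json
{
  "mcpServers": {
    "airflow": {
      "command": "uvx",
      "args": ["astro-airflow-mcp", "--transport", "stdio"],
      "env": {
        "AIRFLOW_API_URL": "https://your-airflow.example.com",
        "AIRFLOW_USERNAME": "admin",
        "AIRFLOW_PASSWORD": "admin"
      }
    }
  }
}
```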
The data plugin bundles an MCP server and skills into a single installable package.

| Server | Description |
|---|---|
| Airflow | Full Airflow REST API integration via astro-airflow-mcp: DAG management, triggering, task logs, system health |
| Skill | Description |
|---|---|
| init | Initialize schema discovery - generates .astro/warehouse.md for instant lookups |
| analyzing-data | SQL-based analysis to answer business questions (uses background Jupyter kernel) |
| checking-freshness | Check how current your data is |
| profiling-tables | Comprehensive table profiling and quality assessment |
| Skill | Description |
|---|---|
| tracing-downstream-lineage | Analyze what breaks if you change something |
| tracing-upstream-lineage | Trace where data comes from |
| annotating-task-lineage | Add manual lineage to tasks using inlets/outlets (see the example after this table) |
| creating-openlineage-extractors | Build custom OpenLineage extractors for operators |
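The inlets/outlets mechanism those skills build on is plain Airflow. A generic illustration, not output from the skill (the task and dataset URIs are placeholders; Airflow 2.x shown, where 3.x renames `Dataset` to `Asset`):

```python
from airflow import Dataset
from airflow.operators.bash import BashOperator

# Manual lineage annotation: this task reads raw_orders and writes clean_orders.
# The dataset URIs are placeholders for your own naming scheme.
transform = BashOperator(
    task_id="transform_orders",
    bash_command="./run_transform.sh",  # placeholder script
    inlets=[Dataset("snowflake://acme/analytics/raw_orders")],
    outlets=[Dataset("snowflake://acme/analytics/clean_orders")],
)
```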
| Skill | Description |
|---|---|
| airflow | Main entrypoint - routes to specialized Airflow skills |
| setting-up-astro-project | Initialize and configure new Astro/Airflow projects |
| managing-astro-local-env | Manage local Airflow environment (start, stop, logs, troubleshoot) |
| authoring-dags | Create and validate Airflow DAGs with best practices |
| testing-dags | Test and debug Airflow DAGs locally |
| debugging-dags | Deep failure diagnosis and root cause analysis |
| Skill | Description |
|---|---|
| migrating-airflow-2-to-3 | Migrate DAGs from Airflow 2.x to 3.x |
```mermaid
flowchart LR
    init["/data:init"] --> analyzing["/data:analyzing-data"]
    analyzing --> profiling["/data:profiling-tables"]
    analyzing --> freshness["/data:checking-freshness"]
```
- Initialize (`/data:init`) - One-time setup to generate `warehouse.md` with schema metadata
- Analyze (`/data:analyzing-data`) - Answer business questions with SQL
- Profile (`/data:profiling-tables`) - Deep dive into specific tables for statistics and quality
- Check freshness (`/data:checking-freshness`) - Verify data is up to date before using it
```mermaid
flowchart LR
    setup["/data:setting-up-astro-project"] --> authoring["/data:authoring-dags"]
    setup --> env["/data:managing-astro-local-env"]
    authoring --> testing["/data:testing-dags"]
    testing --> debugging["/data:debugging-dags"]
```
- Setup (`/data:setting-up-astro-project`) - Initialize project structure and dependencies
- Environment (`/data:managing-astro-local-env`) - Start/stop local Airflow for development
- Author (`/data:authoring-dags`) - Write DAG code following best practices (see the sketch after this list)
- Test (`/data:testing-dags`) - Run DAGs and fix issues iteratively
- Debug (`/data:debugging-dags`) - Deep investigation for complex failures
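To make the authoring step concrete, here is the shape of DAG the authoring-dags skill works toward (a generic TaskFlow sketch using Airflow 2.4+ syntax, not output from the skill itself):

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def etl_pipeline():
    @task
    def extract() -> list[int]:
        # Placeholder: a real task would read from S3, an API, or a warehouse.
        return [1, 2, 3]

    @task
    def load(records: list[int]) -> None:
        # Placeholder: a real task would write to a target table.
        print(f"loaded {len(records)} records")

    load(extract())


etl_pipeline()
```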
The `af` command-line tool lets you interact with Airflow directly from your terminal. Run it with:
```sh
uvx --from astro-airflow-mcp af --help
```

For frequent use, add an alias to your shell config (`~/.bashrc` or `~/.zshrc`):
```sh
alias af='uvx --from astro-airflow-mcp af'
```

Then use it for quick operations like `af health`, `af dags list`, or `af runs trigger <dag_id>`.
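For instance (the commands named above; `etl_pipeline` stands in for one of your own DAG ids):

```sh
af health                     # report Airflow system health
af dags list                  # list DAGs in the connected instance
af runs trigger etl_pipeline  # trigger a run of the DAG with this id
```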
See the full CLI documentation for all commands and instance management.
Configure data warehouse connections at ~/.astro/agents/warehouse.yml:
```yaml
my_warehouse:
  type: snowflake
  account: ${SNOWFLAKE_ACCOUNT}
  user: ${SNOWFLAKE_USER}
  auth_type: private_key
  private_key_path: ~/.ssh/snowflake_key.p8
  private_key_passphrase: ${SNOWFLAKE_PRIVATE_KEY_PASSPHRASE}
  warehouse: COMPUTE_WH
  role: ANALYST
  databases:
    - ANALYTICS
    - RAW
```

Store credentials in `~/.astro/agents/.env`:
```sh
SNOWFLAKE_ACCOUNT=xyz12345
SNOWFLAKE_USER=myuser
SNOWFLAKE_PRIVATE_KEY_PASSPHRASE=your-passphrase-here  # Only required if using an encrypted private key
```

Supported databases:
| Type | Package | Description |
|---|---|---|
| `snowflake` | Built-in | Snowflake Data Cloud |
| `postgres` | Built-in | PostgreSQL |
| `bigquery` | Built-in | Google BigQuery |
| `sqlalchemy` | Any SQLAlchemy driver | Auto-detects packages for 25+ databases (see below) |
Auto-detected SQLAlchemy databases
The connector automatically installs the correct driver packages for:
| Database | Dialect URL |
|---|---|
| PostgreSQL | postgresql:// or postgres:// |
| MySQL | mysql:// or mysql+pymysql:// |
| MariaDB | mariadb:// |
| SQLite | sqlite:/// |
| SQL Server | mssql+pyodbc:// |
| Oracle | oracle:// |
| Redshift | redshift:// |
| Snowflake | snowflake:// |
| BigQuery | bigquery:// |
| DuckDB | duckdb:/// |
| Trino | trino:// |
| ClickHouse | clickhouse:// |
| CockroachDB | cockroachdb:// |
| Databricks | databricks:// |
| Amazon Athena | awsathena:// |
| Cloud Spanner | spanner:// |
| Teradata | teradata:// |
| Vertica | vertica:// |
| SAP HANA | hana:// |
| IBM Db2 | db2:// |
For unlisted databases, install the driver manually and use standard SQLAlchemy URLs.
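For example, Presto is absent from the table but works through PyHive's SQLAlchemy dialect (a hypothetical config; install `pyhive[presto]` yourself and adjust the user, host, and catalog):

```yaml
my_presto:
  type: sqlalchemy
  url: presto://${PRESTO_USER}@presto.example.com:8080/hive  # hypothetical host and catalog
  databases: [hive]
```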
Example configurations
```yaml
# PostgreSQL
my_postgres:
  type: postgres
  host: localhost
  port: 5432
  user: analyst
  password: ${POSTGRES_PASSWORD}
  database: analytics

# BigQuery
my_bigquery:
  type: bigquery
  project: my-gcp-project
  credentials_path: ~/.config/gcloud/service_account.json

# SQLAlchemy (any supported database)
my_duckdb:
  type: sqlalchemy
  url: duckdb:///path/to/analytics.duckdb
  databases: [main]

# Redshift (via SQLAlchemy)
my_redshift:
  type: sqlalchemy
  url: redshift+redshift_connector://${REDSHIFT_USER}:${REDSHIFT_PASSWORD}@${REDSHIFT_HOST}:5439/${REDSHIFT_DATABASE}
  databases: [my_database]
```

The Airflow MCP auto-discovers your project when you run Claude Code from an Airflow project directory (one that contains `airflow.cfg` or a `dags/` folder).
For remote instances, set environment variables:
| Variable | Description |
|---|---|
| `AIRFLOW_API_URL` | Airflow webserver URL |
| `AIRFLOW_USERNAME` | Username |
| `AIRFLOW_PASSWORD` | Password |
| `AIRFLOW_AUTH_TOKEN` | Bearer token (alternative to username/password) |
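Token auth mirrors the username/password invocation shown earlier, for example (placeholder values):

```sh
AIRFLOW_API_URL=https://your-airflow.example.com \
AIRFLOW_AUTH_TOKEN=your-token-here \
uvx astro-airflow-mcp --transport stdio
```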
Skills are invoked automatically based on what you ask. You can also invoke them directly with /data:<skill-name>.
- Initialize your warehouse (recommended first step):

  ```
  /data:init
  ```

  This generates `.astro/warehouse.md` with schema metadata for faster queries.

- Ask questions naturally:
  - "What tables contain customer data?"
  - "Show me revenue trends by product"
  - "Create a DAG that loads data from S3 to Snowflake daily"
  - "Why did my etl_pipeline DAG fail yesterday?"
See CLAUDE.md for plugin development guidelines.
```sh
# Clone the repo
git clone https://github.com/astronomer/agents.git
cd agents

# Test with local plugin
claude --plugin-dir .

# Or install from local marketplace
claude plugin marketplace add .
claude plugin install data@astronomer
```

Create a new skill in `skills/<name>/SKILL.md` with YAML frontmatter:
```markdown
---
name: my-skill
description: When to invoke this skill
---

# Skill instructions here...
```
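A slightly fuller sketch of a skill body (hypothetical: the name, trigger description, and steps are illustrative, not a skill that ships with this repo):

```markdown
---
name: summarizing-dag-runs
description: Invoke when the user asks for a summary of recent DAG runs or failure counts
---

# Summarizing DAG runs

1. List recent runs via the `af` CLI or the Airflow MCP tools.
2. Group failures by DAG and pull the latest error from task logs.
3. Report counts per DAG and flag anything failing repeatedly.
```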
After adding skills, reinstall the plugin:

```sh
claude plugin uninstall data@astronomer && claude plugin install data@astronomer
```

| Issue | Solution |
|---|---|
| Skills not appearing | Reinstall plugin: claude plugin uninstall data@astronomer && claude plugin install data@astronomer |
| Warehouse connection errors | Check credentials in ~/.astro/agents/.env and connection config in warehouse.yml |
| Airflow not detected | Ensure you're running from a directory with airflow.cfg or a dags/ folder |
Contributions welcome! Please read our Code of Conduct and Contributing Guide before getting started.
Skills we're likely to build:
DAG Operations
- CI/CD pipelines for DAG deployment
- Performance optimization and tuning
- Monitoring and alerting setup
- Data quality and validation workflows
Astronomer Open Source
- Cosmos - Run dbt projects as Airflow DAGs
- DAG Factory - Generate DAGs from YAML
- Other open source projects we maintain
Conference Learnings
- Reviewing talks from Airflow Summit, Coalesce, Data Council, and other conferences to extract reusable skills and patterns
Broader Data Practitioner Skills
- Churn prediction, data modeling, ML training, and other workflows that span DE/DS/analytics roles
Don't see a skill you want? Open an issue or submit a PR!
Apache 2.0
Made with ❤️ by Astronomer