1 change: 1 addition & 0 deletions .gitignore
@@ -55,3 +55,4 @@ dolphinscheduler-master/logs
dolphinscheduler-api/logs
__pycache__
ds_schema_check_test
docs/superpowers/
161 changes: 161 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,161 @@
# CLAUDE.md — Apache DolphinScheduler

Apache DolphinScheduler is a distributed, visual DAG workflow-scheduling platform. This is the monorepo: backend servers (master / worker / api / alert), a Vue 3 frontend, plugin families for tasks / datasources / storage / alerting / scheduling, and the release tooling.

**This file is an index.** Each module has its own `CLAUDE.md` with the details — do not duplicate module contents here.

---

## Tech stack (project-wide)

- **Java 1.8** (do not assume 11+ APIs; `dolphinscheduler-api-test` is the only Java 11 island).
- **Spring Boot 2.6.1** across all servers, with **Jetty** as the embedded web server (Tomcat is excluded transitively).
- **MyBatis-Plus** for ORM; **HikariCP** for the metadata DB pool, **Druid** inside user-facing datasource plugins.
- **Quartz** for cron scheduling (via `scheduler-plugin`).
- **Netty / gRPC** for inter-server RPC (see `extract-base`).
- **Vue 3 + Vite + TypeScript + Naive UI** for the frontend.
- **Maven** multi-module reactor (26 modules in root `pom.xml` + 2 test modules).
- **Zookeeper 3.8** by default for the registry (Etcd and JDBC also supported).

## Runnable services

A production deployment runs **four independent services** (plus an external registry and metadata DB). A fifth entry point — `StandaloneServer` — embeds all four in one JVM for development.

| Service | Module | Main class | Default ports |
|---------|--------|------------|---------------|
| **API** | [`dolphinscheduler-api`](dolphinscheduler-api/CLAUDE.md) | `org.apache.dolphinscheduler.api.ApiApplicationServer` | `12345` (HTTP / UI + REST) |
| **Master** | [`dolphinscheduler-master`](dolphinscheduler-master/CLAUDE.md) | `org.apache.dolphinscheduler.server.master.MasterServer` | `5679` (RPC) |
| **Worker** | [`dolphinscheduler-worker`](dolphinscheduler-worker/CLAUDE.md) | `org.apache.dolphinscheduler.server.worker.WorkerServer` | `1235` (RPC) |
| **Alert** | [`dolphinscheduler-alert`](dolphinscheduler-alert/CLAUDE.md) (→ `-alert-server`) | `org.apache.dolphinscheduler.alert.AlertServer` | `50053` (HTTP), `50052` (RPC) |
| Standalone (dev only) | [`dolphinscheduler-standalone-server`](dolphinscheduler-standalone-server/CLAUDE.md) | `org.apache.dolphinscheduler.StandaloneServer` | `12345` + `50052` (API + alert; master/worker use in-JVM calls) |

Every service is a `@SpringBootApplication` on Jetty and implements `IStoppable`. Scale Master / Worker / Alert horizontally; coordination happens via the registry (Zookeeper by default). API is stateless and also scales horizontally behind a load balancer.

Ports can be overridden via `server.port` and the service-specific keys in each service's `application.yaml`.

## Build & run

```bash
# Full build (release profile; produces dist tarball)
./mvnw clean install -Prelease

# Zookeeper 3.4 legacy
./mvnw clean install -Prelease -Dzk-3.4

# Skip UI build (faster iteration on backend only)
./mvnw -pl '!dolphinscheduler-ui' clean install

# Build one module (+ its required siblings)
./mvnw -pl dolphinscheduler-master -am clean install

# Format (Spotless is configured)
./mvnw spotless:apply

# Standalone server (after building)
cd dolphinscheduler-standalone-server/target && ./bin/start.sh
```

Binary artifact: `dolphinscheduler-dist/target/apache-dolphinscheduler-*-bin.tar.gz`.

## Test

```bash
# Unit tests for one module
./mvnw -pl dolphinscheduler-master test

# API integration tests (separate reactor, requires Docker)
mvn -pl dolphinscheduler-api-test/dolphinscheduler-api-test-case test

# E2E browser tests (Selenium + Docker)
mvn -pl dolphinscheduler-e2e/dolphinscheduler-e2e-case test

# Apple Silicon: add -Dm1_chip=true to the Docker-driven suites
```

---

## Module index

Click into a module's `CLAUDE.md` for details. Each description is one line here on purpose.

### Core execution

- [`dolphinscheduler-master`](dolphinscheduler-master/CLAUDE.md) — workflow orchestration engine; consumes `Command`s, runs the DAG state machine, dispatches to workers.
- [`dolphinscheduler-worker`](dolphinscheduler-worker/CLAUDE.md) — runs physical tasks dispatched from master; hosts task plugins.
- [`dolphinscheduler-task-executor`](dolphinscheduler-task-executor/CLAUDE.md) — reusable task-lifecycle framework embedded by the worker.
- [`dolphinscheduler-alert`](dolphinscheduler-alert/CLAUDE.md) — alert server + channel plugins (email, Feishu, DingTalk, …).

### API layer

- [`dolphinscheduler-api`](dolphinscheduler-api/CLAUDE.md) — REST API server (entry point for UI, Python SDK, external clients).
- [`dolphinscheduler-api-test`](dolphinscheduler-api-test/CLAUDE.md) — integration tests against the REST API (Docker Compose + Testcontainers).
- [`dolphinscheduler-authentication`](dolphinscheduler-authentication/CLAUDE.md) — Actuator-endpoint auth + AWS credential helpers (NOT the main login path).

### Shared libraries

- [`dolphinscheduler-common`](dolphinscheduler-common/CLAUDE.md) — foundation utilities (everything depends on this).
- [`dolphinscheduler-dao`](dolphinscheduler-dao/CLAUDE.md) — MyBatis DAO layer + SQL migration scripts.
- [`dolphinscheduler-service`](dolphinscheduler-service/CLAUDE.md) — business logic between DAO and the servers.
- [`dolphinscheduler-spi`](dolphinscheduler-spi/CLAUDE.md) — Service-Provider Interface root (every plugin depends on this).
- [`dolphinscheduler-extract`](dolphinscheduler-extract/CLAUDE.md) — RPC interface contracts between servers.
- [`dolphinscheduler-eventbus`](dolphinscheduler-eventbus/CLAUDE.md) — in-process event-bus abstractions.
- [`dolphinscheduler-registry`](dolphinscheduler-registry/CLAUDE.md) — pluggable registry (Zookeeper / Etcd / JDBC).
- [`dolphinscheduler-meter`](dolphinscheduler-meter/CLAUDE.md) — metrics (Prometheus) + server load-protection primitives.

### Plugin families

- [`dolphinscheduler-task-plugin`](dolphinscheduler-task-plugin/CLAUDE.md) — task-type plugins (shell, SQL, Spark, Flink, K8s, EMR, …). 33 concrete plugins.
- [`dolphinscheduler-datasource-plugin`](dolphinscheduler-datasource-plugin/CLAUDE.md) — user-facing datasource plugins (MySQL, Hive, Trino, Snowflake, …). 28 concrete plugins.
- [`dolphinscheduler-storage-plugin`](dolphinscheduler-storage-plugin/CLAUDE.md) — resource storage (S3, HDFS, OSS, GCS, ABS, OBS, COS).
- [`dolphinscheduler-scheduler-plugin`](dolphinscheduler-scheduler-plugin/CLAUDE.md) — cron scheduler (Quartz today).
- [`dolphinscheduler-dao-plugin`](dolphinscheduler-dao-plugin/CLAUDE.md) — metadata-DB dialect support (MySQL / PostgreSQL / H2).

### Build, ops, tools

- [`dolphinscheduler-bom`](dolphinscheduler-bom/CLAUDE.md) — Maven BOM; central dependency version pinning.
- [`dolphinscheduler-dist`](dolphinscheduler-dist/CLAUDE.md) — assembles the release tarball + Docker images.
- [`dolphinscheduler-standalone-server`](dolphinscheduler-standalone-server/CLAUDE.md) — all-in-one JVM with H2 (dev / smoke tests).
- [`dolphinscheduler-tools`](dolphinscheduler-tools/CLAUDE.md) — CLIs for schema upgrade + resource / lineage migration.
- [`dolphinscheduler-microbench`](dolphinscheduler-microbench/CLAUDE.md) — JMH micro-benchmarks.
- [`dolphinscheduler-yarn-aop`](dolphinscheduler-yarn-aop/CLAUDE.md) — AspectJ weaver capturing YARN ApplicationIds.

### Frontend & E2E

- [`dolphinscheduler-ui`](dolphinscheduler-ui/CLAUDE.md) — Vue 3 frontend.
- [`dolphinscheduler-e2e`](dolphinscheduler-e2e/CLAUDE.md) — Selenium browser tests.

---

## Architecture overview (one paragraph)

A **user** hits the UI, which calls the API server. The API server writes to the **metadata DB** and, for runtime operations (start / kill / pause workflow), talks to the **master** over RPC. The master consumes `t_ds_command` rows, runs the workflow state machine, and dispatches tasks to **workers**. Workers execute task plugins (shell, SQL, Spark, …) and stream lifecycle events back to master. Failures and SLA breaches flow to the **alert server**, which fans out through alert plugins. **Registry** (Zookeeper / Etcd / JDBC) provides service discovery, leader election, and distributed locks. **Storage plugins** back the resource center and distributed-task artifacts. **Quartz** (via scheduler plugin) fires scheduled workflows, which become new `Command` rows.
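
The flow above can be pictured with a toy model. This is only a sketch under loud assumptions: every class and method name below is hypothetical, the command table is an in-memory queue, and the "DAG" is a plain ordered list. It shows the hand-off order (API writes a command, master consumes it, workers report events), not real DolphinScheduler code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Toy model of the command flow; all names are hypothetical.
class CommandFlowSketch {

    // Stands in for the t_ds_command table: the API inserts rows, the master consumes them.
    static final Queue<String> commandTable = new ArrayDeque<>();

    // API server side: a "start workflow" request becomes a command row.
    static void apiStartWorkflow(String workflowName) {
        commandTable.add(workflowName);
    }

    // Master side: take one command, walk the tasks in order, dispatch each to a worker.
    static List<String> masterConsumeOne(List<String> tasksInOrder) {
        List<String> events = new ArrayList<>();
        String workflow = commandTable.poll();
        if (workflow == null) {
            return events; // no pending command
        }
        for (String task : tasksInOrder) {
            events.add(workerExecute(workflow, task));
        }
        return events;
    }

    // Worker side: "run" the task and report a lifecycle event back to the master.
    static String workerExecute(String workflow, String task) {
        return workflow + "/" + task + ":SUCCESS";
    }

    public static void main(String[] args) {
        apiStartWorkflow("daily_etl");
        System.out.println(masterConsumeOne(Arrays.asList("extract", "load")));
    }
}
```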

## Where things live (quick lookup)

| Looking for… | Start here |
|--------------|------------|
| A REST endpoint | `dolphinscheduler-api/src/main/java/.../api/controller/` |
| Workflow execution logic | `dolphinscheduler-master/src/main/java/.../server/master/engine/` |
| Task execution logic | `dolphinscheduler-worker` + the specific `task-plugin/<type>` |
| How "X" is stored | `dolphinscheduler-dao/src/main/java/.../dao/entity/` |
| SQL schema / upgrade | `dolphinscheduler-dao/src/main/resources/sql/` |
| RPC contract between servers | `dolphinscheduler-extract/dolphinscheduler-extract-<role>` |
| UI page source | `dolphinscheduler-ui/src/views/<feature>/` |
| API call in the UI | `dolphinscheduler-ui/src/service/modules/<resource>.ts` |
| Version of a dependency | `dolphinscheduler-bom/pom.xml` |

## Project-wide conventions

- **Formatting**: `./mvnw spotless:apply`. CI will fail PRs that aren't formatted. Java imports are ordered; license headers are enforced.
- **Commit style**: `[Type-ISSUE_ID] [Scope] Subject`, e.g. `[Fix-18168] [Worker] ...`. Scopes match module names.
- **Branching**: `dev` is the main integration branch (not `main`/`master`).
- **PRs must link a GitHub issue** and keep their scope tight — one module / one concern.
- **Do not break wire / DB compatibility** silently. Changes to `extract-*` RPC interfaces, `dao` entities, enum values, and `spi.DbType` ripple to deployed clusters mid-upgrade.
- **Only one registry / storage / DB dialect is active at runtime**. Code paths that check "which one" belong inside the plugin SPI, not sprinkled through services.
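
As an illustration of the commit-style convention, a subject line could be checked with a regular expression along these lines. This is a sketch, not the project's actual CI check: the class name is hypothetical, and the allowed characters for the type and scope are assumptions inferred from the single example `[Fix-18168] [Worker] ...`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the character classes below are assumptions, not project rules.
class CommitStyleCheck {

    // Assumed shape: "[Type-ISSUE_ID] [Scope] Subject".
    static final Pattern COMMIT =
            Pattern.compile("^\\[([A-Za-z]+)-(\\d+)\\] \\[([A-Za-z0-9_-]+)\\] (\\S.*)$");

    // True when the subject line follows the assumed shape.
    static boolean isValid(String subjectLine) {
        return COMMIT.matcher(subjectLine).matches();
    }

    // Returns the scope (third capture group), or null when the line does not match.
    static String scopeOf(String subjectLine) {
        Matcher m = COMMIT.matcher(subjectLine);
        return m.matches() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        System.out.println(isValid("[Fix-18168] [Worker] Handle task kill timeout"));
        System.out.println(isValid("fix worker bug"));
    }
}
```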

## External references

- Release docs (version-specific): https://dolphinscheduler.apache.org/en-us/docs
- GitHub issues: https://github.com/apache/dolphinscheduler/issues
- Python SDK: https://dolphinscheduler.apache.org/python/main/index.html
- Contribution guide: [`docs/docs/en/contribute/join/contribute.md`](docs/docs/en/contribute/join/contribute.md)
44 changes: 44 additions & 0 deletions dolphinscheduler-alert/CLAUDE.md
@@ -0,0 +1,44 @@
# CLAUDE.md — dolphinscheduler-alert

Alert / notification subsystem. Master and worker publish alert events (task failure, timeout, SLA breach, …); this subsystem evaluates them and dispatches via configured channel plugins (email, webhook, Feishu, DingTalk, PagerDuty, …).

**This directory is a Maven parent POM.**

## Sub-modules

- **`dolphinscheduler-alert-server`** — the runnable alert server. `@SpringBootApplication` `AlertServer` + HA coordinator (`AlertHAServer`) + RPC server (`AlertRpcServer`) + event loop.
- **`dolphinscheduler-alert-plugins`** — parent of the concrete channel plugins:
  - `dolphinscheduler-alert-email`, `-http`, `-feishu`, `-dingtalk`, `-wechat`, `-webexteams`, `-pagerduty`, `-telegram`, `-slack`, `-script`, etc. (see directory listing for the current set). Each plugin implements the `AlertPlugin` SPI and is loaded dynamically at runtime.

## How alerting works (end-to-end)

1. Master/worker calls into `dolphinscheduler-extract-alert` RPC with an `AlertRequest`.
2. `AlertRpcServer` receives it and enqueues it onto the DB-backed `AlertEventPendingQueue`.
3. `AlertEventFetcher` pulls pending events in `AlertEventLoop`.
4. `AlertSender` looks up the plugins configured for the alert group and invokes each one.
5. Delivery result is recorded to the DB for audit.
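
The five steps can be sketched as a small in-memory pipeline. Everything here is an assumption made for illustration: the `Channel` interface, the field names, and the string-based audit log are hypothetical, and the real queue is DB-backed rather than an `ArrayDeque`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical, in-memory stand-in for the receive -> queue -> fetch -> send -> audit flow.
class AlertPipelineSketch {

    // Hypothetical channel abstraction standing in for an alert plugin.
    interface Channel {
        boolean send(String message);
    }

    final Queue<String[]> pending = new ArrayDeque<>();              // each entry: {group, message}
    final Map<String, List<Channel>> channelsByGroup = new HashMap<>();
    final List<String> auditLog = new ArrayList<>();

    // Steps 1-2: the RPC endpoint accepts a request and enqueues it.
    void receive(String group, String message) {
        pending.add(new String[] {group, message});
    }

    // Steps 3-5: drain the queue, fan out to the group's channels, record each result.
    void runOnce() {
        String[] event;
        while ((event = pending.poll()) != null) {
            List<Channel> channels = channelsByGroup.get(event[0]);
            if (channels == null) {
                auditLog.add(event[0] + ":NO_CHANNEL");
                continue;
            }
            for (Channel channel : channels) {
                boolean ok = channel.send(event[1]);
                auditLog.add(event[0] + ":" + (ok ? "SENT" : "FAILED"));
            }
        }
    }

    public static void main(String[] args) {
        AlertPipelineSketch sketch = new AlertPipelineSketch();
        sketch.channelsByGroup.put("ops", Arrays.<Channel>asList(msg -> true, msg -> false));
        sketch.receive("ops", "task failed");
        sketch.runOnce();
        System.out.println(sketch.auditLog);
    }
}
```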

## Extension points

- **`AlertPlugin` SPI** — new channel? Implement the interface in a new sub-module of `dolphinscheduler-alert-plugins` and it will be discovered by `AlertPluginManager`.
- Alert rule logic is currently fixed (type-based); extending rules means editing `AlertEventLoop` / `AlertSender`.
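
To make the plugin idea concrete, here is a minimal sketch of a channel plugin and a name-keyed registry. The interface shape, `ConsoleChannel`, and `Registry` are all hypothetical; the real `AlertPlugin` SPI and `AlertPluginManager` may look quite different, so check the module source before copying any signature.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical SPI shape for illustration only.
class AlertPluginSketch {

    interface ChannelPlugin {
        String name();
        boolean send(Map<String, String> params, String title, String content);
    }

    // Example plugin: "delivers" the alert by printing it.
    static class ConsoleChannel implements ChannelPlugin {
        public String name() {
            return "console";
        }

        public boolean send(Map<String, String> params, String title, String content) {
            System.out.println(params.getOrDefault("prefix", "") + title + ": " + content);
            return true;
        }
    }

    // Stand-in for a plugin manager: plugins are looked up by their declared name.
    static class Registry {
        private final Map<String, ChannelPlugin> byName = new LinkedHashMap<>();

        void register(ChannelPlugin plugin) {
            if (byName.putIfAbsent(plugin.name(), plugin) != null) {
                throw new IllegalStateException("duplicate plugin: " + plugin.name());
            }
        }

        ChannelPlugin get(String name) {
            return byName.get(name);
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register(new ConsoleChannel());
        Map<String, String> params = new HashMap<>();
        params.put("prefix", "[alert] ");
        registry.get("console").send(params, "Task failed", "workflow daily_etl");
    }
}
```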

## Gotchas

- **Alert server is separate from master/worker** by design. Running it as an embedded part of master is not supported — it has its own HA, its own port, its own `application.yaml`.
- **`AlertEventPendingQueue` is DB-backed** (not just in-memory) so restarting the alert server does not lose queued alerts.
- **HA pattern mirrors master/worker**: only the leader pulls from the queue; others stand by.
- **Plugin config is per-alert-group**, stored in the DB. Adding a plugin implementation does not enable it for anyone until an admin creates an alert group referencing it.
- **Channel-specific rate limits / retries** live *inside* each plugin, not in the server. Don't add a global retry loop in `AlertSender`.

## Tests

- `alert-server/src/test/java` — unit tests for config, RPC, event queue, sender.
- Each plugin has its own `src/test/java` with mocked channel calls.

## Related modules

- `dolphinscheduler-extract-alert` — RPC contracts this server implements.
- `dolphinscheduler-dao` — persists alert events + audit.
- `dolphinscheduler-registry-all`, `dolphinscheduler-meter`, `dolphinscheduler-spi`.
- Callers: `dolphinscheduler-master`, `dolphinscheduler-worker`.
54 changes: 54 additions & 0 deletions dolphinscheduler-api-test/CLAUDE.md
@@ -0,0 +1,54 @@
# CLAUDE.md — dolphinscheduler-api-test

Integration test harness for the REST API. **Not** bundled into the release — this module exists to run curl-style black-box tests against a full DolphinScheduler stack booted via Docker Compose + Testcontainers.

**This is a Maven parent POM.** This module is **not** declared in the root `pom.xml` `<modules>` — run it explicitly against its own POM (e.g. `mvn -f dolphinscheduler-api-test/pom.xml test`) or via CI workflows.

## Sub-modules

- **`dolphinscheduler-api-test-core`** — the reusable framework:
  - `@DolphinScheduler` annotation (marks a test class, accepts `composeFiles`).
  - `DolphinSchedulerExtension` — JUnit 5 extension that starts the Compose stack, injects `RequestClient` and page objects.
  - `RequestClient` — thin HTTP client with session handling.
  - Page-object base classes (`LoginPage`, `WorkflowDefinitionPage`, `ProjectPage`, `TenantPage`, `UserPage`, `WorkerGroupPage`).
- **`dolphinscheduler-api-test-case`** — the actual test classes. `WorkflowDefinitionAPITest`, `TenantAPITest`, `SchedulerAPITest`, etc. Each class uses `@DolphinScheduler(composeFiles = { "docker/basic/docker-compose.yaml" })` to declare its environment.
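
A class-level annotation carrying `composeFiles` can be read back by an extension through reflection. The sketch below shows that mechanism with a hypothetical `@ComposeEnv` annotation; only the `composeFiles` attribute name and the example path come from the text, and the real `@DolphinScheduler` annotation and `DolphinSchedulerExtension` are not reproduced here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

// Hypothetical stand-in showing how an extension could discover a test's Compose files.
class ComposeAnnotationSketch {

    // RUNTIME retention is required, otherwise reflection cannot see the annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface ComposeEnv {
        String[] composeFiles();
    }

    @ComposeEnv(composeFiles = {"docker/basic/docker-compose.yaml"})
    static class TenantApiTestExample {
    }

    // What a test extension might do before starting containers.
    static String[] composeFilesOf(Class<?> testClass) {
        ComposeEnv env = testClass.getAnnotation(ComposeEnv.class);
        if (env == null) {
            throw new IllegalArgumentException(testClass.getName() + " declares no compose files");
        }
        return env.composeFiles();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(composeFilesOf(TenantApiTestExample.class)));
    }
}
```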

## Stack bootstrap

Compose files live under `src/test/resources/docker/<scenario>/docker-compose.yaml`:

- `basic/` — standard API + Postgres.
- `tenant/` — scenario with multi-tenant configuration.
- `oidc-login/` — API with Keycloak for OIDC flow (`realm-export.json` alongside).

The JUnit extension uses Testcontainers Compose to wait for port readiness before returning.

## Running

```bash
# Full suite
mvn -pl dolphinscheduler-api-test/dolphinscheduler-api-test-case test

# Single class (Mac M1)
mvn -pl dolphinscheduler-api-test/dolphinscheduler-api-test-case test \
-Dtest=WorkflowDefinitionAPITest -Dm1_chip=true
```

Flags:

- `-Dm1_chip=true` — forces `arm64/v8` platform for Docker on Apple Silicon.
- `-Dlocal=true` — skips Testcontainers and points at a locally-running DolphinScheduler instead.
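
Both flags are ordinary JVM system properties, so a harness could read them roughly as follows. This is a sketch of the general technique only: the class, the method names, and the returned descriptions are hypothetical, and the actual extension may interpret the flags differently.

```java
// Hypothetical helper showing how -Dm1_chip / -Dlocal could be read inside the test JVM.
class HarnessFlagsSketch {

    // Boolean.getBoolean is true only when the system property exists and equals "true" (case-insensitive).
    static boolean isAppleSilicon() {
        return Boolean.getBoolean("m1_chip");
    }

    static boolean useLocalInstance() {
        return Boolean.getBoolean("local");
    }

    // -Dlocal takes precedence in this sketch because no containers are started at all.
    static String describe() {
        if (useLocalInstance()) {
            return "local instance, no containers";
        }
        return isAppleSilicon() ? "containers forced to arm64/v8" : "containers on the default platform";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```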

## Gotchas

- **Java 11 required** to compile this module (rest of the repo targets 1.8).
- **`@Order` is mandatory** on test methods within a class — tests are state-dependent and must run in declared order.
- **`@DisableIfTestFails` (from junit-pioneer)** cascades a failure to dependent tests in the same class; don't silently disable without understanding the chain.
- **`sessionId` is threaded through page objects**: `LoginPage.login(...)` returns a session ID that subsequent page objects must receive. Don't instantiate page objects without first logging in.
- **This module is excluded from the root reactor build** — CI drives it via a dedicated workflow. Check `.github/workflows/` before assuming a PR runs these.
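
The session-threading gotcha can be illustrated with two toy page objects. The classes below are hypothetical stand-ins (the real `LoginPage` and `TenantPage` talk to the API over HTTP); the sketch only shows why a page object built without a prior login has nothing valid to send.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical page objects; the in-memory session set replaces the real API server.
class SessionThreadingSketch {

    static final Set<String> activeSessions = new HashSet<>();

    static class LoginPageExample {
        // Returns the session ID that every later page object must receive.
        String login(String user, String password) {
            String sessionId = UUID.randomUUID().toString();
            activeSessions.add(sessionId);
            return sessionId;
        }
    }

    static class TenantPageExample {
        private final String sessionId;

        TenantPageExample(String sessionId) {
            if (sessionId == null || !activeSessions.contains(sessionId)) {
                throw new IllegalStateException("log in first and pass the returned session ID");
            }
            this.sessionId = sessionId;
        }

        String createTenant(String tenantCode) {
            return "created " + tenantCode + " using session " + sessionId;
        }
    }

    public static void main(String[] args) {
        String sessionId = new LoginPageExample().login("admin", "password");
        System.out.println(new TenantPageExample(sessionId).createTenant("tenant_a"));
    }
}
```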

## Related modules

- `dolphinscheduler-api` — the system under test.
- `dolphinscheduler-dao` — depended on to reuse entity DTOs.
- `dolphinscheduler-e2e` — complementary browser-level tests (Selenium against the UI).
61 changes: 61 additions & 0 deletions dolphinscheduler-api/CLAUDE.md
@@ -0,0 +1,61 @@
# CLAUDE.md — dolphinscheduler-api

REST API server. Entry point for the UI and external clients (curl, Python SDK). Uses Spring Boot with **Jetty** (not Tomcat) and springdoc-openapi for Swagger UI.

## Main package

`org.apache.dolphinscheduler.api`

## Entry point

`ApiApplicationServer` — `@SpringBootApplication`. On startup it:
1. Loads `DataSourcePluginManager` and `TaskPluginManager` (plugin discovery).
2. Binds to the port in `server.port` (default 12345).
3. Starts the Py4J gateway used by the Python SDK.

## Key sub-packages

- `api.controller` — 30+ `@RestController` classes, one per domain (workflows, tasks, users, projects, tenants, resources, data sources, alerts, monitoring, …). All URLs rooted at `/dolphinscheduler/*`.
- `api.service` / `api.service.impl` — business-logic layer wrapping `dolphinscheduler-service` and adding API-level concerns (auth checks, DTO mapping).
- `api.security` — pluggable authenticators: `PASSWORD` (default), `LDAP`, `OIDC`, `CASDOOR`, `SSO`. Selected by `security.authentication.type`.
- `api.interceptor` — `LoginHandlerInterceptor` (session cookie check), `RateLimitInterceptor`, `LocaleChangeInterceptor`.
- `api.configuration` — Spring config beans: Swagger, OAuth2, task-type catalog.
- `api.dto` — request/response DTOs.
- `api.exceptions` — `ApiExceptionHandler` (`@RestControllerAdvice`) maps exceptions → structured JSON responses.

## Extension points

- `Authenticator` — implement a new one to support additional login backends (added cases go into `AuthenticationType` enum + registered in `SecurityConfig`).
- Controllers auto-pick up new task types via `TaskTypeConfiguration` reading `task-type-config.yaml` / `dynamic-task-type-config.yaml`.
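
Selecting an authenticator from a configured type can be sketched with an enum and a lookup map. This is not the module's code: `AuthType` merely mirrors the type names listed above, and the `Authenticator` interface, the registry, and the hard-coded password check are hypothetical placeholders for whatever `AuthenticationType` and `SecurityConfig` actually do.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of "configured type -> authenticator" selection.
class AuthenticatorSelectionSketch {

    enum AuthType { PASSWORD, LDAP, OIDC, CASDOOR, SSO }

    interface Authenticator {
        boolean authenticate(String user, String credential);
    }

    static final Map<AuthType, Authenticator> REGISTRY = new EnumMap<>(AuthType.class);

    static {
        // Placeholder check; a real authenticator would consult the user store.
        REGISTRY.put(AuthType.PASSWORD, (user, credential) -> "secret".equals(credential));
    }

    // configuredType plays the role of the security.authentication.type value.
    static Authenticator select(String configuredType) {
        AuthType type = AuthType.valueOf(configuredType.trim().toUpperCase(Locale.ROOT));
        Authenticator authenticator = REGISTRY.get(type);
        if (authenticator == null) {
            throw new IllegalStateException("no authenticator registered for " + type);
        }
        return authenticator;
    }

    public static void main(String[] args) {
        System.out.println(select("password").authenticate("admin", "secret"));
    }
}
```

In this sketch, adding a login backend means adding an enum constant and one registry entry, which matches the two-step extension described above.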

## Configuration

`src/main/resources`:

- `application.yaml` — server port, datasource, registry, security mode, CORS, OpenAPI.
- `task-type-config.yaml`, `dynamic-task-type-config.yaml` — the catalog of task types exposed to the UI.
- `i18n/messages_*.properties` — English + Simplified Chinese server-side messages.
- `swagger.properties` — springdoc config (UI lives at `/dolphinscheduler/swagger-ui/index.html`).
- `logback-spring.xml` — logging.

## Gotchas

- **Jetty, not Tomcat**. `spring-boot-starter-tomcat` is excluded transitively via `dolphinscheduler-meter`. Avoid accidentally pulling Tomcat in.
- **Session-based auth**: `LoginHandlerInterceptor` reads the `sessionId` cookie. There is no JWT by default. OIDC/CASDOOR paths still set a session.
- **OIDC requires `casdoor-spring-boot-starter`**; `OAuth2Configuration` is `@ConditionalOnProperty(security.authentication.type = OIDC|CASDOOR)`.
- **Python SDK integration** uses Py4J gateway, not REST. If a Python SDK change misbehaves, check `api-server` logs for Py4J init messages.
- **Controllers mix `@PostMapping` with form params and JSON bodies inconsistently** — this is legacy. Follow whatever shape the adjacent endpoint uses rather than converting to JSON across the board.
- **Swagger annotations are required** on new endpoints (`@Operation`, `@Parameter`). Missing ones break auto-generated docs the UI team consumes.
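
For the session-based auth gotcha, the core of a cookie check can be sketched without any servlet dependency. The class and method names are hypothetical and the session store is a plain `Set`; the real `LoginHandlerInterceptor` works on the servlet request and the project's own session service.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, dependency-free sketch of a sessionId cookie check.
class SessionCookieSketch {

    // Extracts the sessionId value from a raw Cookie header, or returns null when absent.
    static String sessionIdFrom(String cookieHeader) {
        if (cookieHeader == null) {
            return null;
        }
        for (String part : cookieHeader.split(";")) {
            String[] keyValue = part.trim().split("=", 2);
            if (keyValue.length == 2 && keyValue[0].equals("sessionId")) {
                return keyValue[1];
            }
        }
        return null;
    }

    // A request passes only when it carries a session ID the server already knows.
    static boolean isAuthenticated(String cookieHeader, Set<String> knownSessions) {
        String sessionId = sessionIdFrom(cookieHeader);
        return sessionId != null && knownSessions.contains(sessionId);
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("abc123"));
        System.out.println(isAuthenticated("language=en_US; sessionId=abc123", known));
    }
}
```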

## Tests

- Unit tests in `src/test/java`.
- **Integration tests live in `dolphinscheduler-api-test`** (separate module, Docker Compose + Testcontainers).

## Related modules

- `dolphinscheduler-service`, `dolphinscheduler-dao` — primary deps.
- `dolphinscheduler-extract-master` / `-worker` / `-alert` — RPC into servers.
- `dolphinscheduler-authentication` (actuator sub-module) — secures `/actuator/**`.
- `dolphinscheduler-ui` — primary consumer.
- `dolphinscheduler-api-test` — integration harness.