43 changes: 43 additions & 0 deletions .claude/skills/adr/SKILL.md
@@ -0,0 +1,43 @@
---
name: adr
description: Generate a new Architecture Decision Record (ADR) for documenting architecture decisions.
---

# ADR Generation

Create a new Architecture Decision Record.

## Instructions

### Step 1: Check for Existing ADRs

Search `docs/adr/` to check if a similar decision already exists or find related ADRs to reference.

### Step 2: Gather Information

If the user provided a topic as argument (e.g., `/adr push based architecture`), use that as the title and skip the title question.

Otherwise, gather information **one question at a time**. Ask a single question, then STOP and wait for the user's response before asking the next question:

1. **First**: Ask for the **Title** - a short, descriptive title for the decision
2. **Second**: Ask for the **Context** - what problem requires a decision?
3. **Third**: Ask for the **Options** - what options are being considered? (2-3 minimum)
4. **Fourth** (optional): Ask if there is a **Recommendation** - is there already a preferred option?

Do NOT ask all questions at once. Ask one question, wait for the answer, then ask the next. Do NOT use the AskUserQuestion tool - just ask in plain text and wait for free text input from the user.

### Step 3: Create the ADR File

1. Read the template from `docs/adr/template.md`
2. Create file: `docs/adr/YYYY.MM.DD-kebab-case-title.md` (use today's date)
3. Fill in the template:
- Status: `proposed`
- Deciders: `checkup team` (or ask user if different)
- Proposal date: `DD/MM/YYYY` format
- Leave decision date empty

### Naming Convention

- Date: `YYYY.MM.DD` (e.g., `2025.10.14`)
- Title: kebab-case, lowercase (e.g., `push-based-architecture`)
- Example: `2025.10.14-push-based-architecture.md`
81 changes: 81 additions & 0 deletions docs/adr/2026.03.25-push-based-architecture.md
@@ -0,0 +1,81 @@
# Push-based Architecture

* Status: `proposed`
* Deciders: `checkup team`
* Proposal date: 25/03/2026
* Decision date:

## Context and problem statement

Checkup currently operates in a pull-based batch model: a central CheckHub orchestrator runs periodically, defines metrics and providers in code, and materializes results to a destination such as a database via `SQLAlchemyMaterializer`. This works well for a central data team managing metrics for multiple data products.

However, this model has limitations:
- Metrics are only as fresh as the batch schedule allows
- Every batch run recomputes all metrics in a full load, regardless of which product changed
- Developers cannot easily check their product's health themselves when making changes

We want to extend checkup to support a push-based model in which CI pipelines run checkup at build time and push results to a central materialization. This enables fresher data, decoupled scaling, and a better developer experience for data product developers.

## Considered options

1. **CLI with YAML configuration**: Build a `checkup` CLI that reads configuration from `checkup.yaml` files, computes metrics, and materializes results.

2. **Embedded library**: Data products import checkup as a library and write Python scripts to compute metrics. CI can run these scripts.

3. **Central service**: Build a service that CI pipelines call to trigger metric computation. The service fetches code/config and runs computation.

## Chosen option

We are choosing **option 1 (CLI with YAML configuration)** because:

- It provides a clean separation between configuration (YAML) and metric definitions (Python code in plugins)
- YAML configuration is accessible to data engineers who may not write Python
- The same CLI serves both the CI workflow and local developer runs for quick feedback
- It can reuse all existing checkup framework concepts (metrics, providers, materializers)
- Configuration can live in the repo, enabling version control and code review

**Option 2 not chosen** because it requires every data product to write Python code, which raises the barrier to adoption and makes it harder to enforce consistent configuration across products. Moreover, when metrics are defined in code inside each product, a central platform team cannot easily define and manage metrics across all products; with YAML configuration files, a baseline can be centrally defined and inherited.

**Option 3 not chosen** because it adds significant infrastructure complexity (remote execution, code fetching, sandboxing) without clear benefits over local CLI execution in CI.
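
To make the chosen option concrete, a hypothetical `checkup.yaml` might look like the following. All field names here are illustrative, not an agreed schema:

```yaml
# Illustrative configuration; field names are not a settled schema.
providers:
  warehouse:
    type: sqlalchemy
    url_env: CHECKUP_DB_URL   # resolved from the environment, never inlined

metrics:
  - name: row_count
    type: table_row_count
    provider: warehouse
    table: orders
  - name: freshness
    type: max_timestamp_age
    provider: warehouse
    table: orders
    column: updated_at

materializer:
  type: sqlalchemy
  url_env: CHECKUP_RESULTS_DB_URL
```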

## Consequences

With the chosen option, we see the following consequences requiring extra effort:

1. **Plugin discovery**: The CLI needs to discover and load metric/provider classes from installed plugins. We can use [Python entry points](https://packaging.python.org/en/latest/specifications/entry-points/).

2. **Config validation**: YAML configuration needs validation against available metrics and providers. We will provide clear error messages when config references unknown metrics or providers.

## Future considerations

**Client-server approach with CheckupMaterializer**: Instead of CI pipelines writing directly to the database via SQLAlchemyMaterializer, a future `CheckupMaterializer` could push metrics to a central HTTP service. The service would own database credentials and handle writes. Benefits:
- CI pipelines only need an API token, not database credentials
- Central service can validate, rate-limit, and audit incoming metrics
- API versioning decouples CLI from database schema changes
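
A minimal sketch of such a materializer, assuming a hypothetical `POST` endpoint and bearer-token auth (neither exists yet; the class and method names are illustrative):

```python
import json
import urllib.request


class CheckupMaterializer:
    """Pushes metric results to a central HTTP service instead of a database."""

    def __init__(self, endpoint: str, api_token: str):
        self.endpoint = endpoint
        self.api_token = api_token

    def _build_request(self, metrics: list[dict]) -> urllib.request.Request:
        # The CI pipeline only ever holds an API token, never DB credentials.
        payload = json.dumps({"metrics": metrics}).encode("utf-8")
        return urllib.request.Request(
            self.endpoint,
            data=payload,
            headers={
                "Authorization": f"Bearer {self.api_token}",
                "Content-Type": "application/json",
            },
            method="POST",
        )

    def materialize(self, metrics: list[dict]) -> int:
        with urllib.request.urlopen(self._build_request(metrics)) as resp:
            return resp.status
```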

## More information

### CLI commands

The CLI could include these commands:
- `checkup check`: Local development mode, outputs to console
- `checkup run`: CI mode, uses configured materializer

### Central team vs. developer-defined metrics

The addition of this push-based model supports a spectrum of metric ownership:

**Central team defines baseline metrics:**
- Add `checkup.yaml` to repo templates or monorepo root
- Defines the metrics the platform team considers important
- All products inherit these metrics automatically

**Developers extend or override:**
- Project-level `checkup.yaml` can add metrics the developer finds valuable
- Can override inherited config if a metric doesn't apply to their project
- Developers can experiment with new metrics before proposing them for platform-wide adoption, allowing for quick iteration and feedback.
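
This inherit-and-override behaviour could be implemented as a recursive merge with the project config taking precedence. A sketch only; the actual precedence rules (e.g. for lists) are still open:

```python
def merge_configs(baseline: dict, project: dict) -> dict:
    """Merge a project-level config over a baseline; project values win."""
    merged = dict(baseline)
    for key, value in project.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)  # deep-merge sections
        else:
            merged[key] = value  # project overrides or adds the key
    return merged
```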

### Related ADRs

- [Configuration File Design](./2026.03.25-configuration-file-design.md)
- [Credentials and Secrets](./2026.03.25-credentials-and-secrets.md)
29 changes: 29 additions & 0 deletions docs/adr/template.md
@@ -0,0 +1,29 @@
# Proposal name

* Status: `proposed|decided`
* Deciders: `list of names or well-known group`
* Proposal date:
* Decision date:

## Context and problem statement


## Considered options

1.
2.
3.

## Chosen option

We are choosing option X because ... (explain why).
Option X not chosen because ... (explain why, for every option not chosen).

## Consequences

With the chosen option, we see the following consequences requiring extra effort: describe how we will address the weak
points of this option.

## More information

Anything you can provide to simplify the decision.