From 6d9dfeb115fe101b47d9875771be4485e3afca8e Mon Sep 17 00:00:00 2001
From: rosspeili
Date: Sat, 13 Jun 2026 16:18:27 +0300
Subject: [PATCH] ci: run pytest skills/ in GitHub Actions alongside pytest tests/

Add bundle test step after lint; sync TESTING, CONTRIBUTING, agent
workflow, and PR template with two-step CI model.

Fixes #159
Closes #90
---
 .github/PULL_REQUEST_TEMPLATE.md        |  2 +-
 .github/workflows/ci.yml                | 11 ++++++++---
 CHANGELOG.md                            |  1 +
 CONTRIBUTING.md                         |  8 +++++---
 docs/TESTING.md                         | 16 +++++++++-------
 docs/contributing/ai_native_workflow.md |  5 +++--
 6 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
index 2ede702..c9ada7f 100644
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -28,7 +28,7 @@ Humans: Please describe what this PR does and why it's needed.
 ## Checklist (all PRs)
 
 - [ ] My code follows the **Agent Code of Conduct**.
-- [ ] I have run `python -m flake8 .` and `pytest tests/` locally (or the subset relevant to this change).
+- [ ] I have run `python -m flake8 .`, `pytest skills/`, and `pytest tests/` locally (or the subset relevant to this change).
 - [ ] `CHANGELOG.md` updated under `[Unreleased]` if this PR changes user-visible behavior.
 - [ ] `examples/README.md` is updated if this PR adds, renames, or removes a runnable script under `examples/`.
 
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 88d3591..6b6131c 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -36,9 +36,14 @@ jobs:
         # strict check against our config
         flake8 . --count --statistics
 
-    - name: Test with pytest
+    - name: Skill bundle tests
       env:
         ANTHROPIC_API_KEY: "dummy_key_for_ci"
         ETHERSCAN_API_KEY: "dummy_key_for_ci"
-      run: |
-        pytest tests/
+      run: pytest skills/
+
+    - name: Framework and maintainer tests
+      env:
+        ANTHROPIC_API_KEY: "dummy_key_for_ci"
+        ETHERSCAN_API_KEY: "dummy_key_for_ci"
+      run: pytest tests/
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 002b9fb..995c3c4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -12,6 +12,7 @@ Contributors add user-facing entries under `[Unreleased]` in the same PR. Mainta
 - **Tests**: Backfilled `test_skill.py` for six registry skills (`mica_module`, `pii_masker`, `synthetic_generator`, `wallet_screening`, `pdf_form_filler`, `prompt_rewriter`); all registry skills now ship co-located bundle tests. Fixed `prompt_rewriter` package export so pytest can collect the bundle (#158).
 
 ### Changed
+- **CI**: GitHub Actions runs `pytest skills/` then `pytest tests/` after lint (bundle + framework/maintainer tests; closes #90) (#159).
 - **CI**: CodeQL GitHub Action upgraded from v3 to v4.
 - **Dependencies**: Extended `[all]` with registry skill runtime deps (`web3`, `fastembed`, `numpy`); added `[defi]` and `[embeddings]` optional extras. Documented manifest ↔ `pyproject.toml` convention in CONTRIBUTING and TESTING.md.
 - **Documentation**: [TESTING.md](docs/TESTING.md), [CONTRIBUTING.md](CONTRIBUTING.md), [ai_native_workflow.md](docs/contributing/ai_native_workflow.md), and README architecture tree document the bundle / framework / maintainer / example testing model. Pytest collects `tests/` and `skills/` only (`examples/` ignored).
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 4144802..6d38b55 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -111,20 +111,21 @@ Follow the [Agent Code of Conduct](CODE_OF_CONDUCT.md): deterministic skill outp
 ### Tests and CI
 
 - Add or update tests in the correct layer when behavior changes (see [TESTING.md](docs/TESTING.md)).
-- **Skill bundle test** — `skills///test_skill.py` (required for new skills; ships in the wheel; run locally before skill PRs).
+- **Skill bundle test** — `skills///test_skill.py` (required for new skills; ships in the wheel; runs in CI via `pytest skills/`).
 - **Framework test** — `tests/test_*.py` at repo root (loader, CLI, issuer rules).
 - **Maintainer skill test** — optional `tests/skills//test_.py` for extra loader or edge-case coverage.
 - **Usage examples** — `examples/*.py` are not tests and are not run in CI.
-- **GitHub Actions** installs `pip install -e ".[dev,all]"`, runs `python -m black --check .`, then `flake8 .`, then **`pytest tests/`** (framework + maintainer tests). Do not add per-skill pip lines or test paths to `.github/workflows/ci.yml`.
+- **GitHub Actions** installs `pip install -e ".[dev,all]"`, runs `python -m black --check .`, then `flake8 .`, then **`pytest skills/`** (bundle tests), then **`pytest tests/`** (framework + maintainer tests). Do not add per-skill pip lines or hardcoded skill paths to `.github/workflows/ci.yml`.
 - Run locally before opening a PR:
 
   ```bash
   python -m black --check .
   python -m flake8 .
+  python -m pytest skills/
   python -m pytest tests/
   ```
 
-  For skill work, also run:
+  For a single skill:
 
   ```bash
   python -m pytest skills///test_skill.py
@@ -153,6 +154,7 @@ Agents must follow [Agent Contribution Workflow](docs/contributing/ai_native_wor
   ```bash
   python -m black --check .
   python -m flake8 .
+  pytest skills/
   pytest tests/
   ```
 
diff --git a/docs/TESTING.md b/docs/TESTING.md
index f0bf3ba..5660a84 100644
--- a/docs/TESTING.md
+++ b/docs/TESTING.md
@@ -22,7 +22,7 @@ pip install -r requirements.txt
 
 | Layer | Location | Shipped in pip wheel? | CI on PR? |
 | :--- | :--- | :---: | :---: |
-| **Skill bundle test** | `skills///test_skill.py` | Yes | No — run locally for skill PRs |
+| **Skill bundle test** | `skills///test_skill.py` | Yes | Yes |
 | **Framework test** | `tests/test_*.py` (not under `tests/skills/`) | No (clone only) | Yes |
 | **Maintainer skill test** | `tests/skills//test_.py` | No (clone only) | Yes when present |
 | **Usage example** | `examples/*.py` | No | No — not pytest |
@@ -62,7 +62,7 @@ pip install -r requirements.txt
 | Loader, CLI, registry issuer rules | Framework test | `tests/test_loader.py`, `tests/test_skill_issuer.py` |
 | End-to-end provider demo script | Usage example | `examples/gemini_tos_evaluator.py` |
 
-**Rule of thumb:** if it ships with the skill and must pass before merge → **bundle test** (run locally). If it is extra regression depth for clone-repo work → **maintainer test** (optional). If it proves provider integration → **example**, not pytest.
+**Rule of thumb:** if it ships with the skill and must pass before merge → **bundle test** (CI + local). If it is extra regression depth for clone-repo work → **maintainer test** (optional). If it proves provider integration → **example**, not pytest.
 
 ## 1. Code Formatting (Black)
 
@@ -121,20 +121,21 @@ GitHub Actions installs `pip install -e ".[dev,all]"`, then runs:
 ```bash
 python -m black --check .
 python -m flake8 .
+python -m pytest skills/
 python -m pytest tests/
 ```
 
-That covers **framework tests** and **maintainer skill tests** under `tests/`. It does not run `examples/` or skill bundle tests. Do not add per-skill pip lines or test paths to `.github/workflows/ci.yml`.
+That covers **skill bundle tests** under `skills/` and **framework + maintainer tests** under `tests/`. It does not run `examples/`. Do not add per-skill pip lines or hardcoded skill paths to `.github/workflows/ci.yml`.
 
 The `[all]` extra includes optional SDK groups plus registry skill runtime deps (`web3`, `fastembed`, `numpy`, …) so `pytest skills/` works after `pip install -e ".[dev,all]"`. When a skill adds new `manifest.yaml` `requirements`, add the same packages to the matching optional extra and to `[all]` in `pyproject.toml`.
 
 ### Local commands
 
-Match CI, and run bundle tests when you touch skills:
+Match CI:
 
 ```bash
-python -m pytest tests/
 python -m pytest skills/
+python -m pytest tests/
 ```
 
 Single skill bundle test:
@@ -164,5 +165,6 @@ Before pushing your code, run the following commands:
 1. `skillware list` (verify install and path resolution)
 2. `python -m black --check .` (verify formatting; use `python -m black .` to fix)
 3. `python -m flake8 .` (check quality)
-4. `python -m pytest tests/` (framework + maintainer tests — same scope as CI)
-5. `python -m pytest skills///test_skill.py` when your PR adds or changes a skill bundle test (or `pytest skills/` for broad skill changes)
+4. `python -m pytest skills/` (bundle tests — same scope as CI)
+5. `python -m pytest tests/` (framework + maintainer tests — same scope as CI)
+6. `python -m pytest skills///test_skill.py` when you want a single-skill subset
diff --git a/docs/contributing/ai_native_workflow.md b/docs/contributing/ai_native_workflow.md
index 34023b4..f86c223 100644
--- a/docs/contributing/ai_native_workflow.md
+++ b/docs/contributing/ai_native_workflow.md
@@ -132,6 +132,7 @@ You must:
 ```bash
 python -m black .
 python -m flake8 .
+pytest skills/
 pytest tests/
 ```
 
@@ -160,7 +161,7 @@ Run a **pre-PR audit** on yourself:
 1. Map every acceptance criterion in the issue to a file or test in your diff.
 2. Complete the [verification checklist](#verification-checklists-by-contribution-type) for your contribution type.
 3. If the change is user-visible, confirm [CHANGELOG.md](../../CHANGELOG.md) has entries under `[Unreleased]` (same rule as [CONTRIBUTING.md](../../CONTRIBUTING.md)).
-4. Run `flake8` and `pytest tests/`; for skill work also run the relevant `pytest skills/.../test_skill.py`. Report actual command output to your operator—do not claim success without evidence.
+4. Run `flake8`, `pytest skills/`, and `pytest tests/`; for skill work also run the relevant `pytest skills/.../test_skill.py`. Report actual command output to your operator—do not claim success without evidence.
 5. Draft PR template answers: check only boxes that apply; fill the skill section only if `skills/` changed.
 
 If anything fails, return to Stage 4, fix, and audit again.
@@ -200,7 +201,7 @@ You should:
 
 1. Draft the PR description (why, not only what; link the issue).
 2. Map changed files to the [pull request template](../../.github/PULL_REQUEST_TEMPLATE.md)—skill checklist only when `skills/` changed.
-3. Monitor CI (lint and `pytest tests/`). If checks fail, diagnose, fix in Stage 4, and push to the same branch.
+3. Monitor CI (lint, `pytest skills/`, and `pytest tests/`). If checks fail, diagnose, fix in Stage 4, and push to the same branch.
 4. Address review comments with focused follow-up commits.
 
 Do not force-push shared branches unless a maintainer instructs you.