Merged
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -28,7 +28,7 @@ Humans: Please describe what this PR does and why it's needed.
## Checklist (all PRs)

- [ ] My code follows the **Agent Code of Conduct**.
-- [ ] I have run `python -m flake8 .` and `pytest tests/` locally (or the subset relevant to this change).
+- [ ] I have run `python -m flake8 .`, `pytest skills/`, and `pytest tests/` locally (or the subset relevant to this change).
- [ ] `CHANGELOG.md` updated under `[Unreleased]` if this PR changes user-visible behavior.
- [ ] `examples/README.md` is updated if this PR adds, renames, or removes a runnable script under `examples/`.

11 changes: 8 additions & 3 deletions .github/workflows/ci.yml
@@ -36,9 +36,14 @@ jobs:
         # strict check against our config
         flake8 . --count --statistics

-    - name: Test with pytest
+    - name: Skill bundle tests
       env:
         ANTHROPIC_API_KEY: "dummy_key_for_ci"
         ETHERSCAN_API_KEY: "dummy_key_for_ci"
-      run: |
-        pytest tests/
+      run: pytest skills/
+
+    - name: Framework and maintainer tests
+      env:
+        ANTHROPIC_API_KEY: "dummy_key_for_ci"
+        ETHERSCAN_API_KEY: "dummy_key_for_ci"
+      run: pytest tests/
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -12,6 +12,7 @@ Contributors add user-facing entries under `[Unreleased]` in the same PR. Mainta
- **Tests**: Backfilled `test_skill.py` for six registry skills (`mica_module`, `pii_masker`, `synthetic_generator`, `wallet_screening`, `pdf_form_filler`, `prompt_rewriter`); all registry skills now ship co-located bundle tests. Fixed `prompt_rewriter` package export so pytest can collect the bundle (#158).

### Changed
+- **CI**: GitHub Actions runs `pytest skills/` then `pytest tests/` after lint (bundle + framework/maintainer tests; closes #90) (#159).
- **CI**: CodeQL GitHub Action upgraded from v3 to v4.
- **Dependencies**: Extended `[all]` with registry skill runtime deps (`web3`, `fastembed`, `numpy`); added `[defi]` and `[embeddings]` optional extras. Documented manifest ↔ `pyproject.toml` convention in CONTRIBUTING and TESTING.md.
- **Documentation**: [TESTING.md](docs/TESTING.md), [CONTRIBUTING.md](CONTRIBUTING.md), [ai_native_workflow.md](docs/contributing/ai_native_workflow.md), and README architecture tree document the bundle / framework / maintainer / example testing model. Pytest collects `tests/` and `skills/` only (`examples/` ignored).
8 changes: 5 additions & 3 deletions CONTRIBUTING.md
@@ -111,20 +111,21 @@ Follow the [Agent Code of Conduct](CODE_OF_CONDUCT.md): deterministic skill outp
### Tests and CI

- Add or update tests in the correct layer when behavior changes (see [TESTING.md](docs/TESTING.md)).
-- **Skill bundle test** — `skills/<category>/<name>/test_skill.py` (required for new skills; ships in the wheel; run locally before skill PRs).
+- **Skill bundle test** — `skills/<category>/<name>/test_skill.py` (required for new skills; ships in the wheel; runs in CI via `pytest skills/`).
- **Framework test** — `tests/test_*.py` at repo root (loader, CLI, issuer rules).
- **Maintainer skill test** — optional `tests/skills/<category>/test_<name>.py` for extra loader or edge-case coverage.
- **Usage examples** — `examples/*.py` are not tests and are not run in CI.
-- **GitHub Actions** installs `pip install -e ".[dev,all]"`, runs `python -m black --check .`, then `flake8 .`, then **`pytest tests/`** (framework + maintainer tests). Do not add per-skill pip lines or test paths to `.github/workflows/ci.yml`.
+- **GitHub Actions** installs `pip install -e ".[dev,all]"`, runs `python -m black --check .`, then `flake8 .`, then **`pytest skills/`** (bundle tests), then **`pytest tests/`** (framework + maintainer tests). Do not add per-skill pip lines or hardcoded skill paths to `.github/workflows/ci.yml`.
- Run locally before opening a PR:

```bash
python -m black --check .
python -m flake8 .
+python -m pytest skills/
python -m pytest tests/
```

-For skill work, also run:
+For a single skill:

```bash
python -m pytest skills/<category>/<skill_name>/test_skill.py
@@ -153,6 +154,7 @@ Agents must follow [Agent Contribution Workflow](docs/contributing/ai_native_wor
```bash
python -m black --check .
python -m flake8 .
+pytest skills/
pytest tests/
```

16 changes: 9 additions & 7 deletions docs/TESTING.md
@@ -22,7 +22,7 @@ pip install -r requirements.txt

| Layer | Location | Shipped in pip wheel? | CI on PR? |
| :--- | :--- | :---: | :---: |
-| **Skill bundle test** | `skills/<category>/<skill_name>/test_skill.py` | Yes | No — run locally for skill PRs |
+| **Skill bundle test** | `skills/<category>/<skill_name>/test_skill.py` | Yes | Yes |
| **Framework test** | `tests/test_*.py` (not under `tests/skills/`) | No (clone only) | Yes |
| **Maintainer skill test** | `tests/skills/<category>/test_<name>.py` | No (clone only) | Yes when present |
| **Usage example** | `examples/*.py` | No | No — not pytest |
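
The layer table above can be read as a path-classification rule. The following is a minimal sketch of that rule, not code from the repository: the function name `classify_test_layer` and the category name `compliance` used in the usage note are hypothetical, and the path patterns are taken directly from the table.

```python
from pathlib import PurePosixPath


def classify_test_layer(path: str) -> str:
    """Map a repo-relative path to a layer name from the table (illustrative only)."""
    parts = PurePosixPath(path).parts
    name = parts[-1] if parts else ""
    is_test_module = name.startswith("test_") and name.endswith(".py")

    # skills/<category>/<skill_name>/test_skill.py
    if len(parts) == 4 and parts[0] == "skills" and name == "test_skill.py":
        return "skill bundle test"
    # tests/skills/<category>/test_<name>.py
    if len(parts) == 4 and parts[:2] == ("tests", "skills") and is_test_module:
        return "maintainer skill test"
    # tests/test_*.py directly under tests/ (not under tests/skills/)
    if len(parts) == 2 and parts[0] == "tests" and is_test_module:
        return "framework test"
    # examples/*.py are usage examples, not pytest targets
    if len(parts) == 2 and parts[0] == "examples" and name.endswith(".py"):
        return "usage example"
    return "unclassified"
```

For instance, `classify_test_layer("tests/test_loader.py")` yields `"framework test"`, while a bundle path such as `skills/compliance/wallet_screening/test_skill.py` (category name assumed) yields `"skill bundle test"`.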
@@ -62,7 +62,7 @@ pip install -r requirements.txt
| Loader, CLI, registry issuer rules | Framework test | `tests/test_loader.py`, `tests/test_skill_issuer.py` |
| End-to-end provider demo script | Usage example | `examples/gemini_tos_evaluator.py` |

-**Rule of thumb:** if it ships with the skill and must pass before merge → **bundle test** (run locally). If it is extra regression depth for clone-repo work → **maintainer test** (optional). If it proves provider integration → **example**, not pytest.
+**Rule of thumb:** if it ships with the skill and must pass before merge → **bundle test** (CI + local). If it is extra regression depth for clone-repo work → **maintainer test** (optional). If it proves provider integration → **example**, not pytest.

## 1. Code Formatting (Black)

@@ -121,20 +121,21 @@ GitHub Actions installs `pip install -e ".[dev,all]"`, then runs:
```bash
python -m black --check .
python -m flake8 .
+python -m pytest skills/
python -m pytest tests/
```

-That covers **framework tests** and **maintainer skill tests** under `tests/`. It does not run `examples/` or skill bundle tests. Do not add per-skill pip lines or test paths to `.github/workflows/ci.yml`.
+That covers **skill bundle tests** under `skills/` and **framework + maintainer tests** under `tests/`. It does not run `examples/`. Do not add per-skill pip lines or hardcoded skill paths to `.github/workflows/ci.yml`.

The `[all]` extra includes optional SDK groups plus registry skill runtime deps (`web3`, `fastembed`, `numpy`, …) so `pytest skills/` works after `pip install -e ".[dev,all]"`. When a skill adds new `manifest.yaml` `requirements`, add the same packages to the matching optional extra and to `[all]` in `pyproject.toml`.
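
The manifest-to-`[all]` convention described above could be checked mechanically. The sketch below is an assumption-laden illustration, not part of this PR: the helper names `normalize` and `missing_from_all` are hypothetical, the inputs are plain lists of requirement strings (how they are read from `manifest.yaml` and `pyproject.toml` is left out), and the version pins in the usage note are invented for the example.

```python
import re


def normalize(requirement: str) -> str:
    """Reduce a requirement string such as 'Web3>=6.0' to a comparable package name."""
    # Cut at the first version specifier, marker, extra bracket, or whitespace.
    name = re.split(r"[<>=!~;\[\s]", requirement.strip(), maxsplit=1)[0]
    # Fold case and separators the way package indexes compare names.
    return re.sub(r"[-_.]+", "-", name).lower()


def missing_from_all(manifest_requirements, all_extra):
    """Return manifest packages that are absent from the `[all]` extra, sorted by name."""
    declared = {normalize(r) for r in all_extra}
    wanted = {normalize(r) for r in manifest_requirements}
    return sorted(wanted - declared)
```

Called as `missing_from_all(["web3>=6.0", "fastembed", "NumPy"], ["web3", "numpy>=1.26"])`, this returns `["fastembed"]`, which is the package a contributor would still need to add to `[all]`.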

### Local commands

-Match CI, and run bundle tests when you touch skills:
+Match CI:

```bash
-python -m pytest tests/
python -m pytest skills/
+python -m pytest tests/
```

Single skill bundle test:
@@ -164,5 +165,6 @@ Before pushing your code, run the following commands:
1. `skillware list` (verify install and path resolution)
2. `python -m black --check .` (verify formatting; use `python -m black .` to fix)
3. `python -m flake8 .` (check quality)
-4. `python -m pytest tests/` (framework + maintainer tests — same scope as CI)
-5. `python -m pytest skills/<category>/<skill_name>/test_skill.py` when your PR adds or changes a skill bundle test (or `pytest skills/` for broad skill changes)
+4. `python -m pytest skills/` (bundle tests — same scope as CI)
+5. `python -m pytest tests/` (framework + maintainer tests — same scope as CI)
+6. `python -m pytest skills/<category>/<skill_name>/test_skill.py` when you want a single-skill subset
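
The pre-push checklist above can be sketched as a small runner that executes the commands in order and stops at the first failure. This is an illustrative helper, not something the repository ships: `CHECKS` and `run_checks` are hypothetical names, and using `sys.executable` in place of a bare `python` is an assumption made so the same interpreter is reused.

```python
import subprocess
import sys

# Commands in the order given by the checklist (single-skill step omitted).
CHECKS = [
    ["skillware", "list"],
    [sys.executable, "-m", "black", "--check", "."],
    [sys.executable, "-m", "flake8", "."],
    [sys.executable, "-m", "pytest", "skills/"],
    [sys.executable, "-m", "pytest", "tests/"],
]


def run_checks(checks, runner=subprocess.run):
    """Run each command in order; return the first failing command, or None if all pass."""
    for cmd in checks:
        if runner(cmd).returncode != 0:
            return cmd  # stop early, like a failed CI step
    return None
```

From the repository root, `failed = run_checks(CHECKS)` returns `None` when every step passes; otherwise `failed` holds the command that should be fixed before pushing. The `runner` parameter exists only so the ordering logic can be exercised without invoking the real tools.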
5 changes: 3 additions & 2 deletions docs/contributing/ai_native_workflow.md
@@ -132,6 +132,7 @@ You must:
```bash
python -m black .
python -m flake8 .
+pytest skills/
pytest tests/
```

@@ -160,7 +161,7 @@ Run a **pre-PR audit** on yourself:
1. Map every acceptance criterion in the issue to a file or test in your diff.
2. Complete the [verification checklist](#verification-checklists-by-contribution-type) for your contribution type.
3. If the change is user-visible, confirm [CHANGELOG.md](../../CHANGELOG.md) has entries under `[Unreleased]` (same rule as [CONTRIBUTING.md](../../CONTRIBUTING.md)).
-4. Run `flake8` and `pytest tests/`; for skill work also run the relevant `pytest skills/.../test_skill.py`. Report actual command output to your operator—do not claim success without evidence.
+4. Run `flake8`, `pytest skills/`, and `pytest tests/`; for skill work also run the relevant `pytest skills/.../test_skill.py`. Report actual command output to your operator—do not claim success without evidence.
5. Draft PR template answers: check only boxes that apply; fill the skill section only if `skills/` changed.

If anything fails, return to Stage 4, fix, and audit again.
@@ -200,7 +201,7 @@ You should:

1. Draft the PR description (why, not only what; link the issue).
2. Map changed files to the [pull request template](../../.github/PULL_REQUEST_TEMPLATE.md)—skill checklist only when `skills/` changed.
-3. Monitor CI (lint and `pytest tests/`). If checks fail, diagnose, fix in Stage 4, and push to the same branch.
+3. Monitor CI (lint, `pytest skills/`, and `pytest tests/`). If checks fail, diagnose, fix in Stage 4, and push to the same branch.
4. Address review comments with focused follow-up commits.

Do not force-push shared branches unless a maintainer instructs you.