Canonical CI: grouped-tests.yml + root test/test_groups.toml #217

Merged
ChrisRackauckas merged 3 commits into SciML:main from ChrisRackauckas-Claude:grouped-tests-ci
Jun 15, 2026

@ChrisRackauckas-Claude

Summary

Converts the root test workflow (.github/workflows/Tests.yml, name: CI) to the canonical SciML grouped-tests.yml@v1 thin caller, with the group x version matrix declared once in a root test/test_groups.toml.

  • .github/workflows/Tests.yml — the matrix test job (which previously hand-maintained group: {CPU, QA} x version: {lts, 1} and called the legacy per-job tests.yml@v1) is replaced by a thin caller to SciML/.github/.github/workflows/grouped-tests.yml@v1 with secrets: inherit. on: and concurrency: are preserved verbatim; filename and name: kept. No with: overrides needed (root reads GROUP, coverage/check-bounds/coverage-directories all at defaults, no apt packages).
  • test/test_groups.toml (new, repo root) — [Core] and [QA], each on ["lts", "1"]. Linux-only (no os axis).
  • test/qa/Project.toml (new) — isolated QA environment containing only the QA tooling (Aqua, ExplicitImports, Documenter) plus the deps the QA checks reference (SciMLSensitivity, Lux, Random, OrdinaryDiffEq, NonlinearSolve, Test), with DeepEquilibriumNetworks brought in via [sources] path = "../.." and julia = "1.10". This keeps the heavy QA tooling out of the main test env and out of reverse-dependency resolution.
  • test/qa/qa.jl — the QA checks (Aqua, ExplicitImports, Doctests), moved verbatim from the old test/qa_tests.jl.
  • test/runtests.jl: GROUP == "QA" now calls Pkg.activate on test/qa, Pkg.develop on the package, and Pkg.instantiate, then includes qa/qa.jl. Functional tests run under the canonical GROUP Core (CPU/ALL still accepted for back-compat).
  • test/shared_testsetup.jl: BACKEND_GROUP == "core" now selects the CPU device, so the renamed Core group behaves exactly as the old CPU group did.
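
The GROUP dispatch described for test/runtests.jl can be sketched roughly as follows. This is a minimal illustration, not the file from the PR: the helper names (group_kind, run_qa, run_core, run_group) and the default GROUP of "Core" are assumptions, and the Core file names are taken from the later commit message.

```julia
using Pkg

# Hypothetical helper: classify a GROUP value. "CPU" and "ALL" are kept
# as back-compat aliases for the canonical "Core" group.
function group_kind(group::AbstractString)
    group == "QA" && return :qa
    group in ("Core", "CPU", "ALL") && return :core
    throw(ArgumentError("unknown test GROUP: $group"))
end

# QA runs in the isolated test/qa environment.
function run_qa(testdir::AbstractString)
    Pkg.activate(joinpath(testdir, "qa"))
    Pkg.develop(path = dirname(testdir))  # the package at the repo root
    Pkg.instantiate()
    return include(joinpath(testdir, "qa", "qa.jl"))
end

# Functional tests run in the ordinary test environment.
function run_core(testdir::AbstractString)
    include(joinpath(testdir, "utils_tests.jl"))
    return include(joinpath(testdir, "layers_tests.jl"))
end

function run_group(group::AbstractString, testdir::AbstractString)
    return group_kind(group) === :qa ? run_qa(testdir) : run_core(testdir)
end

# In test/runtests.jl this would be invoked as (default is an assumption):
# run_group(get(ENV, "GROUP", "Core"), @__DIR__)
```

Keeping the classification separate from the side-effecting Pkg calls makes the back-compat aliases easy to check without touching any environment.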

Matrix match

| Old caller matrix | New (test_groups.toml -> grouped-tests.yml) |
| --- | --- |
| (CPU, lts) | (Core, lts) |
| (CPU, 1) | (Core, 1) |
| (QA, lts) | (QA, lts) |
| (QA, 1) | (QA, 1) |

Same 4 cells: the functional CPU group is renamed to the canonical Core (QA unchanged), version set {lts, 1} preserved, Linux-only ubuntu-latest, no OS axis. Verified statically by running scripts/compute_affected_sublibraries.jl <root> --root-matrix from SciML/.github@v1, which emits exactly {(Core, lts), (Core, 1), (QA, lts), (QA, 1)}.
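
To illustrate how a group x version matrix follows from the TOML declaration, here is a small sketch that expands an inline copy of the two groups into cells. It is not the compute_affected_sublibraries.jl script; the function name matrix_cells is hypothetical, and the versions key is assumed to be the same one shown for the [GPU] group later in this thread.

```julia
using TOML

# Inline stand-in for test/test_groups.toml as described above.
const TEST_GROUPS = """
[Core]
versions = ["lts", "1"]

[QA]
versions = ["lts", "1"]
"""

# Expand every group into one (group, version) cell per listed version.
function matrix_cells(toml_text::AbstractString)
    groups = TOML.parse(toml_text)
    return [(group, version)
            for group in sort!(collect(keys(groups)))
            for version in groups[group]["versions"]]
end

cells = matrix_cells(TEST_GROUPS)
println(cells)
```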

Project.toml already satisfies the benign-metadata checks ([compat] julia = "1.10" LTS floor; every [extras] dep has a [compat] entry), so no metadata changes were needed.
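
The two metadata conditions named here (a julia entry in [compat], and a [compat] entry for every [extras] dependency) are simple to check mechanically. The sketch below runs against a hypothetical Project.toml fragment, not the repository's actual file, and extras_missing_compat is an illustrative name rather than part of the SciML tooling.

```julia
using TOML

# Hypothetical fragment for illustration only.
const PROJECT_FRAGMENT = """
[compat]
LuxCUDA = "0.3"
Test = "1.10"
julia = "1.10"

[extras]
LuxCUDA = "d0bbae9a-e099-4d5b-a835-1c6931763bda"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
"""

# Return the [extras] names that have no matching [compat] entry.
function extras_missing_compat(project_text::AbstractString)
    project = TOML.parse(project_text)
    compat = get(project, "compat", Dict{String, Any}())
    extras = get(project, "extras", Dict{String, Any}())
    return sort!([name for name in keys(extras) if !haskey(compat, name)])
end

missing_entries = extras_missing_compat(PROJECT_FRAGMENT)
has_julia_floor = haskey(TOML.parse(PROJECT_FRAGMENT)["compat"], "julia")
```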

Notes

The QA group's Aqua/JET-style checks run in CI under this wiring (Aqua, ExplicitImports, Doctests). No checks were skipped, silenced, or excluded; any QA failures surfaced in CI will be triaged in a follow-up.

This was a structural conversion only — tests/Aqua/JET were not run locally; the QA group runs in CI.

Ignore until reviewed by @ChrisRackauckas.

🤖 Generated with Claude Code

ChrisRackauckas and others added 2 commits June 9, 2026 17:58
Convert the root test workflow (Tests.yml, name: CI) to the canonical
SciML/.github grouped-tests.yml@v1 thin caller, with the group x version
matrix declared once in test/test_groups.toml at the repo root.

- .github/workflows/Tests.yml: replace the hand-maintained
  group({CPU,QA}) x version({lts,1}) matrix job (which called the legacy
  per-job tests.yml@v1) with a thin caller to grouped-tests.yml@v1.
  on: and concurrency: preserved verbatim; filename + name: kept.
- test/test_groups.toml (new, root): [Core] and [QA], each on [lts, 1].
  Reproduces the old matrix with the functional CPU group renamed to the
  canonical Core (QA unchanged). Linux-only, no os axis.
- test/qa/Project.toml (new): isolated QA env (Aqua, ExplicitImports,
  Documenter + the doctest/ExplicitImports support deps) with
  DeepEquilibriumNetworks via [sources] path = "../.." and julia = "1.10".
- test/qa/qa.jl: the QA checks, moved verbatim from test/qa_tests.jl
  (no checks added, removed, or excluded).
- test/runtests.jl: GROUP == "QA" now activates test/qa, develops the
  package, instantiates, then includes qa/qa.jl. Functional tests run
  under GROUP Core (CPU and ALL still accepted for back-compat).
- test/shared_testsetup.jl: treat BACKEND_GROUP "core" as CPU so the
  renamed Core group selects the CPU device.

Project.toml already satisfies the benign-metadata checks: [compat]
julia = "1.10" (LTS floor) and every [extras] dep has a [compat] entry.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The bespoke .github/workflows/GPU.yml ran the functional suite on the
self-hosted CUDA runner with BACKEND_GROUP=CUDA. That is now a [GPU] group in
test/test_groups.toml (versions ["1"], runner [self-hosted, Linux, X64, gpu],
timeout 240) executed by the grouped-tests.yml@v1 caller, with runtests.jl
setting BACKEND_GROUP=CUDA for GROUP=GPU. The workflow's gpu-docs job was a
duplicate of the centralized documentation.yml already on main, so GPU.yml is
deleted outright.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ChrisRackauckas-Claude

Pushed 8575f2d: folded the bespoke .github/workflows/GPU.yml into the canonical group matrix.

  • test/test_groups.toml gains [GPU]: versions = ["1"], runner = ["self-hosted", "Linux", "X64", "gpu"], timeout = 240 (the old workflow's exact runs-on labels and timeout-minutes).
  • test/runtests.jl: GROUP == "GPU" sets ENV["BACKEND_GROUP"] = "CUDA" and runs the same Utils/Layers suite — exactly what the old workflow's julia-runtest step with BACKEND_GROUP: CUDA did (shared_testsetup.jl reads BACKEND_GROUP to select the CUDA mode).
  • GPU.yml deleted. Its second job (gpu-docs) was a duplicate docs build/deploy: the centralized documentation.yml (SciML/.github documentation.yml@v1) is already on main and remains the docs pipeline.
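
The backend switch in the bullets above amounts to mapping GROUP to a BACKEND_GROUP value and letting the shared test setup branch on it. A rough sketch, under stated assumptions: the function names are hypothetical, the fallback value "core" is a guess, and the predicates only mirror the behavior described in this PR (core/CPU select the CPU device, CUDA selects the GPU).

```julia
# GROUP => BACKEND_GROUP: the GPU group forces CUDA; other groups keep
# whatever the environment provides (fallback "core" is an assumption).
function backend_group_for(group::AbstractString, env = ENV)
    return group == "GPU" ? "CUDA" : get(env, "BACKEND_GROUP", "core")
end

# Hypothetical predicates in the style of a shared test setup file.
cpu_testing(backend_group::AbstractString) = lowercase(backend_group) in ("cpu", "core")
cuda_testing(backend_group::AbstractString) = lowercase(backend_group) == "cuda"

gpu_backend = backend_group_for("GPU", Dict{String, String}())
core_backend = backend_group_for("Core", Dict{String, String}())
```

Because the same test files run in both lanes, only the predicates change which device the suite targets.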

Static verification: TOML parses, Meta.parseall on runtests.jl clean, and the v1 compute_affected_sublibraries.jl --root-matrix emits the GPU cell:

{"group":"GPU","version":"1","runner":["self-hosted","Linux","X64","gpu"],"timeout":240,"num_threads":1,"continue_on_error":false}
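
For readers who want to see how the [GPU] table relates to that cell, the following sketch rebuilds an equivalent dictionary from an inline copy of the TOML. It is an illustration only: group_cells and CELL_DEFAULTS are hypothetical names, and the defaults are simply the num_threads and continue_on_error values visible in the emitted cell, not a statement about how the v1 script computes them.

```julia
using TOML

# Inline copy of the [GPU] group described above.
const GPU_GROUP = """
[GPU]
versions = ["1"]
runner = ["self-hosted", "Linux", "X64", "gpu"]
timeout = 240
"""

# Assumed defaults, read off the emitted cell.
const CELL_DEFAULTS = Dict{String, Any}("num_threads" => 1, "continue_on_error" => false)

# One cell per (group, version); per-group options override the defaults.
function group_cells(toml_text::AbstractString)
    cells = Dict{String, Any}[]
    for (group, opts) in TOML.parse(toml_text), version in opts["versions"]
        cell = copy(CELL_DEFAULTS)
        cell["group"] = group
        cell["version"] = version
        for (key, value) in opts
            key == "versions" || (cell[key] = value)
        end
        push!(cells, cell)
    end
    return cells
end

gpu_cell = only(group_cells(GPU_GROUP))
```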

…ts conversion)

DEQ's GROUP semantics are backend/capability-based, not folder-partitioned: the
GPU group runs the *same* Core test files (utils_tests.jl + layers_tests.jl) with
shared_testsetup.jl's backend switched to CUDA via BACKEND_GROUP, so it cannot be
expressed as a separate folder of files. Use explicit-args run_tests (v1.2):
core = the two top-level Core files (each in its own @safetestset), GPU group =
same body with BACKEND_GROUP=CUDA, QA = test/qa sub-env. Curated all=["Core"] so
the self-hosted GPU lane and QA stay out of the aggregate. Drop Pkg from the test
deps (only the old harness used it; SciMLTesting handles activate/develop now) and
add SciMLTesting + SafeTestsets to the QA sub-env. test_groups.toml unchanged.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ChrisRackauckas-Claude

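Added the SciMLTesting v1.2 run_tests harness on top of this grouped-tests conversion, using the explicit-args form described in the commit message above (the GPU group reuses the Core test files, so it cannot be a separate folder); both merge as one PR (grouped-tests + SciMLTesting v1.2).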

ChrisRackauckas marked this pull request as ready for review June 15, 2026 04:57
ChrisRackauckas merged commit 0cca4d0 into SciML:main Jun 15, 2026
7 of 11 checks passed
ChrisRackauckas-Claude pushed a commit to ChrisRackauckas-Claude/DeepEquilibriumNetworks.jl that referenced this pull request Jun 16, 2026
The grouped-tests v1.2 conversion (SciML#217) dropped LuxCUDA from the test
[extras]/[targets]. The self-hosted GPU lane sets BACKEND_GROUP=CUDA, which
makes test/shared_testsetup.jl do `using LuxCUDA`; without the dependency the
GPU job errors with `ArgumentError: Package LuxCUDA not found in current path`.

Re-add LuxCUDA (compat 0.3) to [compat], [extras], and the test target, using
the correct registry UUID d0bbae9a-e099-4d5b-a835-1c6931763bda (the
pre-conversion Project.toml had carried a typo'd UUID ...1c6931f17571).

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ChrisRackauckas added a commit that referenced this pull request Jun 17, 2026
)

* Apply Runic formatting to test/runtests.jl

The Runic Format Check on main fails because core_body()'s final
expression (the Layers Tests @safetestset) is missing the explicit
return that Runic requires. Add it to make the format check pass.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Restore LuxCUDA test dependency dropped during v1.2 CI conversion

The grouped-tests v1.2 conversion (#217) dropped LuxCUDA from the test
[extras]/[targets]. The self-hosted GPU lane sets BACKEND_GROUP=CUDA, which
makes test/shared_testsetup.jl do `using LuxCUDA`; without the dependency the
GPU job errors with `ArgumentError: Package LuxCUDA not found in current path`.

Re-add LuxCUDA (compat 0.3) to [compat], [extras], and the test target, using
the correct registry UUID d0bbae9a-e099-4d5b-a835-1c6931763bda (the
pre-conversion Project.toml had carried a typo'd UUID ...1c6931f17571).

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix MNIST docs build: steady-state adjoint VJP shape + missing VCAB3 import

The Documentation build failed on two genuine bugs in the basic MNIST DEQ
tutorial (both reproduce on `main`, independent of the test/Project.toml
changes in this PR):

1. `DimensionMismatch` in the steady-state adjoint VJP. For a conv DEQ the
   rootfind state is a 4D array (e.g. `(13, 13, 64, 2)`). When the state has
   more than 50 elements, SciMLSensitivity's `SteadyStateAdjoint` takes the
   matrix-free `VecJacOperator` path and builds the operator from `vec(y)`, so
   it seeds the residual's pullback with a *flat* cotangent. Zygote's
   `ProjectTo` then rejects that flat cotangent when it is broadcast against the
   multi-dimensional `u` in the residual `y .- u`, raising
   `variable with size(x) == (13,13,64,2) cannot have a gradient with
   size(dx) == (21632,)`. The Core test suite never hit this because its conv
   states are <= 50 elements (dense-Jacobian path). Fix: compute the residual in
   flattened space, `reshape(vec(y) .- vec(u), size(u))` — a value-preserving
   no-op that keeps the broadcast operands consistent with the flat cotangent.

2. `UndefVarError: VCAB3`. The tutorial calls `VCAB3()` but only imports
   `OrdinaryDiffEq`, which (v7) no longer re-exports `VCAB3`. Added
   `OrdinaryDiffEqAdamsBashforthMoulton` to the docs project and an explicit
   `using OrdinaryDiffEqAdamsBashforthMoulton: VCAB3` in the tutorial.
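
A small check of the "value-preserving" claim in item 1, using plain arrays only. This sketch does not reproduce the Zygote or SciMLSensitivity behavior; it only shows that the flattened residual equals the broadcast residual and keeps the shape of `u`, and that the state size quoted above has 21632 elements. The function names are illustrative.

```julia
# Broadcast form of the residual vs. the flattened form from the fix.
residual_broadcast(y, u) = y .- u
residual_flat(y, u) = reshape(vec(y) .- vec(u), size(u))

# Small 4D stand-in for a conv DEQ state.
u = reshape(collect(1.0:24.0), 2, 3, 2, 2)
y = 2 .* u .+ 1

same_values = residual_flat(y, u) == residual_broadcast(y, u)
same_shape = size(residual_flat(y, u)) == size(u)

# Length of the flat cotangent for the state size in the error message.
flat_length = prod((13, 13, 64, 2))
```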

Verified locally on Julia 1.12:
- Full `docs/make.jl` builds end-to-end (both MNIST `@example` blocks run; only
  a transient ColPrac linkcheck 429 from the sandbox differs from CI).
- `GROUP=Core Pkg.test()` passes: Utils 11/11, Layers 1538/1538.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: ChrisRackauckas-Claude <accounts@chrisrackauckas.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>