ci: extract composite actions + run drift test against main's engine#22
ci: extract composite actions + run drift test against main's engine#22
Conversation
Three workflows (_build-matrix.yml, prebuild-cabal-store.yml,
prebuild-mumps.yml) were carrying near-identical platform-setup and
release-publish logic. Drift was already creeping in (e.g. msys2
`cache: true` set in one file, missing in two others). Extract into
three composite actions:
* read-versions: source versions.env and emit GHC + MUMPS pins as
step outputs. Replaces 4 inline `Read pinned versions` steps.
* setup-haskell-env: cross-platform MSYS2 / apt / brew toolchain
install, libquadmath workaround, GHCup bootstrap (with optional
actions/cache restore on Linux/macOS), C:\cabal -> D:\cabal
junction. Driven by `ghc-version`, `install-upx`, `cache-ghcup`
inputs so the same action serves both the engine build (UPX +
cache) and the prebuild workflows (no UPX, no cache, optionally
no GHC at all for the MUMPS-only build).
* publish-prebuilt-release: tag preflight + SHA256SUMS + `gh release
create` shared by prebuild-mumps and prebuild-cabal-store. Caller
strings (title, notes, glob) flow through env vars instead of
direct `${{ }}` shell interpolation, removing a quote-escaping
foot-gun.
Workflows shrink by ~300 lines net; behaviour is preserved (every
original comment block kept, every condition mirrored). The cache
save half of the GHCup cache stays in _build-matrix.yml because
composite actions cannot defer a step until after the caller's
downstream work — the action's outputs feed straight into the save.
Two unrelated hygiene improvements that don't warrant separate PRs:
_build-matrix.yml:
* Add `permissions: contents: read` at the workflow level. The
reusable matrix only checks out the repo and downloads from
public release assets; publishing happens in callers under their
own elevated scope, so dropping the default token to read-only
eliminates a write capability the matrix never uses.
* Set `timeout-minutes: 60` at the job level. Windows already has
a 25-min internal timeout for hung tests; Linux/macOS had none
and would burn the full 360-min default on a wedged process.
60 min covers a worst-case cold path (MSYS2 + GHCup + cold
cabal compile + tests ≈ 50 min) with ~10 min headroom.
pyvolca.yml:
* Wire up tests/test_drift.py, which had been skipped with a TODO
comment since it needs a built engine. Pull the linux-amd64
artefact from main's most recent successful build.yml run via
`gh run download`, drop it under dist-newstyle/ where the
live_spec fixture finds it, and let pytest run the drift checks
against that engine. If no main build exists yet the download
step exits 0 with a warning, the fixture returns None, and the
drift tests skip themselves — the PR is never blocked.
* Add `permissions: actions: read` so gh-CLI can list/download
artefacts from another workflow's run.
The previous PR commits added composite actions but the net line count
went up by ~76 — too much documentation and structure for the value.
Also: setup-haskell-env's description block contained a literal
\${{ steps.setup-haskell-env.outputs.ghcup-cache-key }} as a code
example, which GitHub tried to evaluate inside the action metadata
context (where `steps` doesn't exist) and refused to load — that's
what failed every job in the previous run.
Changes:
* Drop .github/actions/read-versions: 33 lines + 4 wire-ups to
replace ~25 lines of inline `source versions.env`. Net loss for
a marginal DRY win. The 5 call sites get back their inline reads
(the cabal-store publish job collapses two reads into one).
* setup-haskell-env: drop the 27-line description block (with the
template-in-metadata bug), tighten input descriptions to one line,
merge the cold/warm GHCup steps into one shell that branches on
whether the cache was hit. Fewer steps = less GH Actions overhead
on the warm path. From 260 to 149 lines.
* publish-prebuilt-release: same descriptor-trim treatment plus
inline two near-identical steps. From 86 to 62 lines.
Net workflow+action total drops from 1336 to 1177 lines (-159), and
versus origin/main the whole refactor is now -86 lines net (with the
new pyvolca drift-test feature included).
actions/upload-artifact@v4 strips the longest common parent path from the upload glob, so volca-linux-amd64 lands as <arch>/ghc-*/volca-*/... rather than the dist-newstyle/build/<arch>/... that conftest.py's live_spec fixture rglobs for. Drift tests were silently skipping. Recreate the dist-newstyle/build/ prefix and download into it.
Merging the cold/warm GHCup steps into one regressed cabal update: the previous structure had `Pin CABAL_DIR` as its own step that wrote to GITHUB_ENV and was therefore visible to the next step's `cabal update`. After merging, `cabal update` runs in the SAME step that writes GITHUB_ENV — but $GITHUB_ENV only takes effect for subsequent steps, so cabal update wrote the Hackage index under XDG_STATE_HOME (default for cabal-install >=3.10) while the build step (which DID see the new CABAL_DIR=~/.cabal) found no index and failed with "unknown package: vector (dependency of mumps-hs)". Add `export CABAL_DIR` alongside the GITHUB_ENV write so cabal in the same step uses the right path too.
ReviewCI green on all 4 build platforms + pyvolca, and the drift test is now exercising a real engine (verified against run Issues to fix1. PR description mentions a 2. Risks flagged in the description — assessed
Notes (no action)
Bottom line: behaviour-preserving where it claims to be, with one real gap closed (drift test). Fix the description discrepancy and good to merge. |
Restore the read-step + env: pattern previously used here. `source versions.env; export MUMPS_VERSION` works but couples the dependency to whoever happens to write the first two lines of the run-block — the explicit env: makes the dependency visible at the step header.
Summary
Refactor of
.github/workflows/to remove ~300 lines of cross-workflow duplication and close one real coverage gap..github/actions/:setup-haskell-env— cross-platform MSYS2/apt/brew + GHCup + libquadmath workaround. Driven byghc-version,install-upx,cache-ghcupinputs so the same action serves the engine build (UPX + cache) and the prebuild workflows (no UPX, no cache, optionally no GHC at all for the MUMPS-only build).publish-prebuilt-release— tag preflight + SHA256SUMS +gh release createshared byprebuild-mumpsandprebuild-cabal-store. Caller strings flow through env vars, removing a quote-escaping foot-gun._build-matrix.ymlhygiene:permissions: contents: read(the matrix only reads; publishing happens in callers under their own scope) andtimeout-minutes: 60(was relying on GH's default 360-min cap; Linux/macOS had no protection against hung tests).pyvolca.ymldrift test:tests/test_drift.pyhad been skipped with a TODO since it needs a built engine. Now it pulls the linux-amd64 artefact from main's most recent successfulbuild.ymlrun viagh run download, drops it underdist-newstyle/, and the existinglive_specfixture picks it up. Falls through to skip cleanly if no main build exists yet.Behaviour-preserving where possible — every original comment block kept, every condition mirrored. Touched workflows shrink from 1077 to 775 lines (-302); the two new actions add 215 lines, for a net reduction of 87 lines (and the bulk of the new lines is comments documenting the GHCup install layout, the C:\cabal junction, and the libquadmath quirk).
Risks I'm aware of
\${{ ... && pkg || '' }}ternary for the optional UPX packages. After substitution the string contains runs of multiple spaces;msys2/setup-msys2@v2's parser handles this iff it tokenizes on\\s+rather than literal single-space. Confirmed working in this PR's CI (both UPX-on and UPX-off paths exercised).shell: 'msys2 {0}'inside the composite action depends onmsys2/setup-msys2having registered the shell in a previous step of the same composite. Sequential composite steps share runner context so this works in practice (Windows job is green).gh run downloadpreserves thedist-newstyle/build/.../volcapath layout from the upload glob —conftest.pyrglobs foropt/build/volca/volca(default cabal-O1). Ifbuild.shever switches to-O0, drift test silently skips (same as today's behaviour).timeout-minutes: 60is conservative for the Windows cold path (~50 min in worst case). Bump if a real cold run shows it's tight.Test plan
build.ymlrun passes on all 4 platforms with cache hits comparable to recent main runs.pyvolca.ymldrift test either runs against main's engine or skips cleanly with the warning.prebuild-mumps.yml(after bumpingMUMPS_PREBUILT_REVISIONin a test branch) and confirm a release publishes withmumps-prebuilt-*artefacts.prebuild-cabal-store.yml(after bumpingCABAL_PREBUILT_REVISION) and confirm thecabal-store-prebuilt-*release publishes.