Consolidate CI workflows and fix test gaps #96

Merged
Oddly merged 5 commits into main from ci/workflow-improvements
Mar 11, 2026

Oddly commented Mar 11, 2026

Summary

This extracts the ~95 identical lines of molecule CI steps (checkout, deps, collection install, SSH, converge, verify, idempotence, diagnostics, cleanup) into a single reusable workflow at .github/workflows/molecule.yml. Seven caller workflows now pass scenarios, distros, and releases as inputs instead of repeating the full step sequence — each caller drops from ~165 lines to ~40. Net reduction of about 800 lines.
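
A minimal sketch of what such a reusable workflow might look like is shown here. It is not the merged file: the input types (JSON-encoded arrays), the runner label, and the environment variable names are assumptions made for illustration, and only a few of the listed steps are included.

```yaml
# .github/workflows/molecule.yml (illustrative sketch, not the merged file)
name: molecule

on:
  workflow_call:
    inputs:
      scenarios:
        description: JSON array of molecule scenario names (assumed encoding)
        required: true
        type: string
      distros:
        description: JSON array of distro identifiers (assumed encoding)
        required: true
        type: string
      releases:
        description: JSON array of Elastic Stack releases (assumed encoding)
        required: true
        type: string
      skip-idempotence:
        description: Skip the idempotence step
        required: false
        type: boolean
        default: false

jobs:
  molecule:
    runs-on: ubuntu-latest          # assumption: the runner label is not given in the PR
    strategy:
      fail-fast: false
      matrix:
        scenario: ${{ fromJSON(inputs.scenarios) }}
        distro: ${{ fromJSON(inputs.distros) }}
        release: ${{ fromJSON(inputs.releases) }}
    env:
      MOLECULE_DISTRO: ${{ matrix.distro }}    # hypothetical variable name
      ELASTIC_RELEASE: ${{ matrix.release }}   # hypothetical variable name
    steps:
      - uses: actions/checkout@v4
      - name: Converge
        run: molecule converge -s ${{ matrix.scenario }}
      - name: Verify
        run: molecule verify -s ${{ matrix.scenario }}
      - name: Idempotence
        if: ${{ !inputs.skip-idempotence }}
        run: molecule idempotence -s ${{ matrix.scenario }}
      - name: Cleanup
        if: always()
        run: molecule destroy -s ${{ matrix.scenario }}
```

Passing the lists as JSON strings and expanding them with `fromJSON` is one common way to drive a matrix from `workflow_call` inputs, since inputs cannot be arrays.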

The elasticsearch_diagnostics scenario gets its own reusable workflow call with skip-idempotence: true, decoupled from the ES matrix where it was awkwardly shoehorned as a matrix include:.
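
As a rough sketch, a caller with a separate diagnostics job could look like the following. The workflow file name and distro values are hypothetical, and passing the lists as JSON strings is an assumption; only the input names and the two scenario names come from this PR.

```yaml
# .github/workflows/test_role_elasticsearch.yml (hypothetical file name)
name: Test role elasticsearch

on:
  workflow_dispatch:

jobs:
  elasticsearch:
    uses: ./.github/workflows/molecule.yml
    with:
      scenarios: '["elasticsearch_default"]'
      distros: '["rockylinux9", "ubuntu2204"]'   # illustrative values
      releases: '["8", "9"]'

  # Separate call instead of a matrix `include:` entry, so idempotence
  # can be skipped for this scenario only.
  diagnostics:
    uses: ./.github/workflows/molecule.yml
    with:
      scenarios: '["elasticsearch_diagnostics"]'
      distros: '["rockylinux9"]'                 # illustrative value
      releases: '["9"]'                          # illustrative value
      skip-idempotence: true
```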

Also extracts the repeated certificate generation tasks from the two logstash SSL scenarios into molecule/shared/generate_test_certs.yml.
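
A scenario might pull in the shared file roughly like this. The playbook location and play name are assumptions; only the shared file path comes from this PR.

```yaml
# molecule/logstash_ssl/prepare.yml (hypothetical location for the include)
- name: Prepare
  hosts: all
  tasks:
    # The relative path is resolved from this playbook's directory.
    - name: Generate test certificates from the shared task file
      ansible.builtin.include_tasks: ../shared/generate_test_certs.yml
```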

Beyond the structural changes, this fixes two real test issues found during validation: the elasticsearch_default verify was unconditionally asserting cluster.logsdb.enabled which only exists in ES 9.x (now guarded), and the role's security packages step was getting OOM-killed during parallel runs because python3-cryptography and python3-packaging weren't pre-installed in prepare (now they are).
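
The guard on the LogsDB assertion can be sketched as below. This is an illustration under assumptions, not the actual verify play: `logsdb_enabled` and the `ELASTIC_RELEASE` environment variable are hypothetical names, and the real play presumably reads the setting from the cluster.

```yaml
# Sketch of a guarded verify play; names other than elasticstack_release
# are hypothetical.
- name: Verify
  hosts: all
  vars:
    # Verify runs as a separate play from converge, so the release is set
    # here as well. ELASTIC_RELEASE is an assumed environment variable.
    elasticstack_release: "{{ lookup('env', 'ELASTIC_RELEASE') | default('9', true) }}"
    # Stand-in for a value an earlier task would read from cluster settings.
    logsdb_enabled: true
  tasks:
    - name: Assert LogsDB is enabled (setting only exists in ES 9.x)
      ansible.builtin.assert:
        that:
          - logsdb_enabled | bool
      when: elasticstack_release | int >= 9
```

With the `when:` condition in place, ES 8 runs skip the task instead of failing on a setting that does not exist there.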

Path filters on all role workflows now include roles/elasticstack/**, molecule/shared/**, and .github/workflows/molecule.yml so changes to shared code actually trigger the right tests. Branch restrictions on PR triggers are removed entirely. Nightly schedules are staggered across the week to avoid running ~300 jobs in the same window.
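
For illustration, the staggered schedules could be expressed with cron triggers like the ones below. The day groupings follow the commit message for this change; the time of day (02:00 UTC) is an assumption, and each YAML document stands for the `on:` section of a different workflow file.

```yaml
# Sketch only: cron times are assumed, not taken from the repository.
---
# elasticsearch, logstash: Monday, Wednesday, Friday
on:
  schedule:
    - cron: "0 2 * * 1,3,5"
---
# full_stack, beats, custom_certs: Tuesday, Thursday, Saturday
on:
  schedule:
    - cron: "0 2 * * 2,4,6"
---
# repos, upgrade, modules, kibana: every day
on:
  schedule:
    - cron: "0 2 * * *"
```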

The test_full_stack and test_elasticsearch_upgrade workflows are left as-is — they have enough unique structure (gate jobs, conditional idempotence, molecule test mode) that forcing them into the reusable pattern would add complexity rather than remove it.

Validated with a full dispatch of all 9 workflows: 381 jobs total, all green.

Test plan

  • All 9 workflows dispatched via workflow_dispatch on this branch
  • 312/319 passed on first run (7 failures were the ES 8 verify + OOM bugs fixed in subsequent commits)
  • ES re-run after verify fix: 62/62 passed

🤖 Generated with Claude Code

Oddly added 5 commits March 11, 2026 09:14
All role and ES-specific workflows now trigger on changes to
roles/elasticstack/** and molecule/shared/** in addition to their own
role paths. Previously a change to the shared elasticstack role (password
fetching, cert distribution, package bootstrap) would only be caught by
the full_stack nightly run, not by any PR-triggered role workflow.
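
A sketch of the resulting trigger for one role workflow, assuming a role path of roles/logstash/** as the example (the shared paths and the reusable workflow path are the ones named in this PR):

```yaml
# Sketch of a PR trigger with path filters and no branch restriction.
on:
  pull_request:
    paths:
      - roles/logstash/**              # the role's own path (example role)
      - roles/elasticstack/**          # shared role
      - molecule/shared/**             # shared molecule tasks
      - .github/workflows/molecule.yml # reusable workflow
```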

The branch name restrictions (feature/**, fix/**) are removed — path
filters alone are sufficient and the old restrictions silently skipped
tests for PRs from ci/, chore/, refactor/, or deps/ branches.

Five heavy workflows (elasticsearch 60 jobs, logstash 72 jobs,
full_stack 48 jobs, beats 48 jobs, custom_certs 24 jobs) now alternate
across days instead of all firing every night. This keeps the nightly
peak under ~150 jobs for 20 runners instead of 316.

Mon/Wed/Fri: elasticsearch + logstash (~132 jobs + dailies)
Tue/Thu/Sat: full_stack + beats + custom_certs (~120 jobs + dailies)
Daily: repos (12), upgrade (4), modules (12), kibana (36)

Consolidate ~95 identical lines of molecule CI steps (checkout, deps,
collection install, SSH setup, converge, verify, idempotence, cleanup)
into a single reusable workflow at .github/workflows/molecule.yml. Seven
caller workflows now pass scenarios/distros/releases as inputs instead
of repeating the full step sequence. The elasticsearch_diagnostics
scenario gets its own reusable workflow call with skip-idempotence.

Also extract the repeated certificate generation tasks from
logstash_ssl and logstash_standalone_certs into
molecule/shared/generate_test_certs.yml.

Net reduction of ~800 lines across the CI configuration.

The LogsDB assertion was unconditional but cluster.logsdb.enabled only
exists in ES 9.x. Guard it with a release check so ES 8 runs skip it.
Also add the elasticstack_release var to the verify play since it runs
as a separate play from converge.

Move python3-cryptography and python3-packaging into prepare_common.yml
so they're installed during container setup rather than during converge.
This avoids a heavy dnf transaction during the role's packages.yml step,
which was getting OOM-killed under memory pressure when many molecule
jobs run in parallel.
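
A minimal sketch of the pre-install task, assuming prepare_common.yml is a plain task file and that the generic package module is used (the actual task may differ):

```yaml
# prepare_common.yml (sketch): install the Python packages during container
# setup so the role's packages.yml step has less to do during converge.
- name: Pre-install Python packages needed by the security packages step
  ansible.builtin.package:
    name:
      - python3-cryptography
      - python3-packaging
    state: present
```
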
Oddly merged commit 18b760b into main Mar 11, 2026
127 checks passed
Oddly deleted the ci/workflow-improvements branch March 11, 2026 14:04
Oddly added a commit that referenced this pull request Mar 12, 2026
…ion task

Merging main brought the task name prefix convention from PR #96. The new
directory creation task added by this branch now follows the same pattern.