
CI fixes: venv isolation, python 3.12, logstash memory#109

Merged
Oddly merged 8 commits into main from fix/logstash-container-memory
Mar 27, 2026

Conversation

Oddly (Owner) commented Mar 26, 2026

Three CI fixes:

Logstash molecule container memory bumped from 2GB to 3GB to prevent OOM during idempotence checks on rockylinux9 with ES8. The 512m JVM heap plus OS overhead exceeded 2GB on restart.

ansible-core 2.20+ requires Python 3.12. The sanity and compatibility test jobs now select python3.12 when testing against 2.20+.

All CI workflows (molecule, linting, plugins) switched from uv pip install --system to per-job venvs. The 20 self-hosted runners share a single filesystem, and concurrent --system installs were racing: one job's package upgrade would delete binaries another job was about to use. Each job now creates an isolated venv in RUNNER_TEMP.

Oddly added 3 commits March 26, 2026 20:45
The Logstash container was hitting OOM (rc=137) during idempotence on
rockylinux9 with ES8. With 2GB and a 512m JVM heap plus OS overhead,
there wasn't enough headroom for a clean restart. 3GB gives room for
the JVM to reinitialize without tripping the OOM killer.
ansible-core 2.20 requires Python >= 3.12. The self-hosted runners
ship with 3.11 as default, so the sanity and compatibility jobs were
failing. This adds a version check step that selects python3.12 when
testing against ansible-core 2.20+.
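The version-check step could look roughly like this; the matrix variable name and output key are assumptions, not taken from the actual workflow:

```yaml
# Hypothetical workflow step; matrix.ansible_version is an assumed name.
- name: Select Python interpreter
  id: pyver
  shell: bash
  run: |
    # ansible-core 2.20+ needs Python >= 3.12; older versions run on the
    # runners' default 3.11.
    case "${{ matrix.ansible_version }}" in
      2.2[0-9]*) echo "python=python3.12" >> "$GITHUB_OUTPUT" ;;
      *)         echo "python=python3.11" >> "$GITHUB_OUTPUT" ;;
    esac
```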
All CI workflows (molecule, linting, plugins) ran uv pip install --system
which shared a single Python environment across 20 concurrent runners.
Jobs racing to install different ansible-core versions would clobber
each other's binaries, causing command-not-found and permission errors.

Each job now creates its own venv in RUNNER_TEMP via uv venv. This
isolates all Python dependencies per job and eliminates the shared-state
race condition.
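The per-job venv setup might be sketched as the following workflow steps, assuming GitHub Actions semantics; the step names and the pinned package spec are illustrative:

```yaml
# Hypothetical setup steps; the actual workflow layout is not shown in this PR.
- name: Create isolated venv
  shell: bash
  run: |
    # RUNNER_TEMP is private to the runner and cleared between jobs, so
    # concurrent jobs on the shared filesystem no longer clobber each
    # other's interpreters or console scripts.
    uv venv "$RUNNER_TEMP/venv"
    echo "$RUNNER_TEMP/venv/bin" >> "$GITHUB_PATH"

- name: Install dependencies into the venv
  shell: bash
  run: uv pip install --python "$RUNNER_TEMP/venv/bin/python" ansible-core
```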
Oddly force-pushed the fix/logstash-container-memory branch from fef16df to 10d3916 on March 27, 2026 08:24
The rolling upgrade previously only ran when elasticstack_version was
pinned to a specific version. This left two gaps:

1. Changing elasticstack_release from 8 to 9 without pinning a version
   would install the new package but skip the node-by-node restart.
2. Running with state: latest would upgrade all nodes simultaneously
   through the normal handler, bringing the whole cluster down at once.

Now the rolling upgrade triggers in all cases:
- Pre-install: when the target version or major release differs from
  the installed version (pinned version or release change).
- Post-install: when the normal package task changed the package
  (covers the latest case where we can't predict pre-install).

A 10-second countdown with Ctrl+C abort option runs before any rolling
upgrade so users are aware of what is about to happen. The countdown
duration is configurable via elasticsearch_upgrade_countdown.
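The countdown described above could be implemented with a pause task along these lines; only the elasticsearch_upgrade_countdown variable is named in this PR, so the task wording is a sketch:

```yaml
- name: Countdown before rolling upgrade (Ctrl+C, then "a" to abort)
  ansible.builtin.pause:
    seconds: "{{ elasticsearch_upgrade_countdown }}"
    prompt: "Rolling upgrade of the cluster starts in {{ elasticsearch_upgrade_countdown }} seconds"
  when: elasticsearch_upgrade_countdown | int > 0
```

With the countdown set to 0 the `when` condition skips the task entirely, which is how CI avoids the interactive pause.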
Oddly force-pushed the fix/logstash-container-memory branch from 10d3916 to d8ad04e on March 27, 2026 08:32
Oddly added 4 commits March 27, 2026 09:40
When elasticstack_version is set to 'latest', every new minor or patch
release triggers a rolling restart. This is safe but may surprise users
who run the playbook frequently. Added a warning to the reference docs
and a code comment on the package install tasks explaining why.
The upgrade scenarios previously pinned elasticstack_version to a
specific 9.x version, which bypassed the new release-change detection.
Now they only set elasticstack_release: 9 without pinning a version,
exercising the real-world upgrade path where the role detects the
major version mismatch and triggers the rolling upgrade automatically.

Also sets elasticsearch_upgrade_countdown: 0 to skip the interactive
pause in CI.
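The scenario variables after this change might look like the following fragment; the file location and surrounding keys are assumptions:

```yaml
# Sketch of the molecule upgrade-scenario vars after this commit:
elasticstack_release: 9              # no elasticstack_version pin: the role
                                     # detects the 8 -> 9 major-release change
elasticsearch_upgrade_countdown: 0   # skip the interactive pause in CI
```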
After the 8→9 upgrade completes, re-runs the role with
elasticstack_version: latest. Since the package is already at the
latest 9.x, the package task should report no change and ES should
NOT be restarted. Verified by comparing the ES process PID before
and after the re-run.
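The PID comparison could be sketched with tasks like these; the task names and the systemd unit name are assumptions:

```yaml
- name: Record ES PID before the 'latest' re-run
  ansible.builtin.command: systemctl show -p MainPID --value elasticsearch
  register: pid_before
  changed_when: false

# ... re-run the role with elasticstack_version: latest ...

- name: Record ES PID after the re-run
  ansible.builtin.command: systemctl show -p MainPID --value elasticsearch
  register: pid_after
  changed_when: false

- name: Assert Elasticsearch was not restarted
  ansible.builtin.assert:
    that: pid_before.stdout == pid_after.stdout
```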
Without an explicit state, ansible.builtin.package defaults to
state: present, which means 'installed, don't upgrade'. When
elasticstack_version is not pinned, the package name is just
'elasticsearch' with no version suffix, so the package manager
sees it as already installed and does nothing.

All package tasks in the rolling upgrade now use state: latest
so the package manager actually installs the newest version from
the target release repository.
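As a task sketch (the task name is illustrative; `state: latest` is the documented way to make ansible.builtin.package upgrade an already-installed package):

```yaml
- name: Install or upgrade Elasticsearch from the target release repo
  ansible.builtin.package:
    name: elasticsearch
    state: latest   # 'present' would treat the unversioned name as
                    # already satisfied and never upgrade
```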
Oddly merged commit 22e9f69 into main Mar 27, 2026
132 checks passed
Oddly deleted the fix/logstash-container-memory branch March 27, 2026 12:55