feat(garm): apply configurator runner options via per-scaleset GARM templates (ISD-5728) by cbartz · Pull Request #238 · canonical/github-runner-operators

cbartz · 2026-06-16T13:58:38Z

What this PR does

Exposes six runner-behaviour options on the garm-configurator charm —
dockerhub-mirror, runner-http-proxy, aproxy-exclude-addresses,
aproxy-redirect-ports, otel-collector-endpoint, pre-job-script — and applies
them to GARM runners via the GARM template system (per ISD278), not
extra_specs.pre_install_scripts.

garm-configurator: a new RunnerConfig model validates the options and shares
them over the garm-configurator relation databag.
garm: reads them into ScalesetSpec.runner_config; the scaleset reconciler copies
GARM's system github_linux template, injects rendered pre-install and pre-job-hook
blocks (runner_template.py), and creates/updates a per-scaleset template via the
templates API, referencing it by template_id. New GarmClient template methods back this.

Stacked on #236 (feat/garm-scaleset-isd-5729) — base it there for a clean diff.
Includes two small GARM-charm enablement fixes needed for the reconcile path (see Review focus).

Why we need it

Operators need to customise runner proxy / mirror / telemetry / pre-job behaviour without
modifying charm code (ISD-5728).

Checklist

Test plan

Unit: 74 garm + 45 garm-configurator tests pass; ruff + pyright clean.
Integration (local microk8s, reused GARM rock): 11 passed, 4 skipped, 0 failed.
The scaleset-creation tests (including the new test_runner_options_render_into_scaleset_template)
skip until the provider-registration chain (feat(configurator): configurator integration #234) lands — the configurator derives
provider_name="openstack-<project>" while GARM only has the built-in openstack provider, so
no scaleset is created. Same skip-guard as feat(garm): sync scalesets via GARM REST API without service restart (ISD-5729) #236's own scaleset tests; the template rendering logic
itself is fully unit-covered.

Review focus

runner_template.py bash rendering (aproxy / docker mirror / otel / pre-job hook). Production-only
static steps (tmate proxy, apt-mirror self-test, post-job metrics) are stubbed with TODO(ISD-278)
pending the old charm's cloud-init source.
Enablement fixes (distinct from the feature): _get_garm_url now targets the in-pod GARM URL
(the previous config-based path read a non-existent option and always returned ""), and
_maybe_first_run registers the GARM admin via the existing GarmClient.first_run. These overlap
with Added generated OpenAPI client to GARM #232 (api-client / first-run) and will need reconciling if it lands first.
Per-scaleset template lifecycle in the reconciler (create, update-on-drift, delete-with-scaleset).

Potential breaking changes

None. All new config is optional; behaviour is unchanged when the options are unset.

New dependencies / APIs / modules

New module charms/garm/src/runner_template.py.
New GarmClient template methods (list/get/create/update/delete_template).
No new third-party dependencies.

🤖 Generated with Claude Code

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Use 'status 409' substring match to match the specific error format string rather than bare '409' which could appear in error bodies. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ation databag Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…armState Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…rator relation Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…efer Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…N for pre_install_scripts - scaleset_reconciler: track all desired names (not just ready ones) so existing scalesets are NOT deleted when their provider or credential is temporarily unavailable - charm.py / configurator charm.py: switch pre_install_scripts wire format from str()/ast.literal_eval to json.dumps()/json.loads for a well-defined serialization; add regression test for the delete-on-skip bug Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…emplates Expose six runner-behaviour options on the garm-configurator charm (dockerhub-mirror, runner-http-proxy, aproxy-exclude-addresses, aproxy-redirect-ports, otel-collector-endpoint, pre-job-script), validated by a new RunnerConfig model and shared over the garm-configurator relation. The GARM charm reads them into ScalesetSpec.runner_config and the scaleset reconciler copies GARM's system github_linux template, injects the rendered pre-install and pre-job hook blocks (runner_template.py), and creates/updates a per-scaleset template via the templates API, referencing it by template_id. Also enable the scaleset reconcile path on this branch: register the GARM admin via GarmClient.first_run (_maybe_first_run) and target the in-pod GARM URL, and align the integration test to authenticate with the charm-managed admin credentials.

Copilot

Pull request overview

This PR exposes runner-behaviour options on the garm-configurator charm and applies them to GARM runners via per-scaleset GARM templates (copying the system github_linux template and injecting rendered blocks), instead of using extra_specs.pre_install_scripts. It also includes small GARM-charm enablement fixes around first-run initialization and local API URL resolution.

Changes:

Add a RunnerConfig model and rendering pipeline to inject proxy/mirror/telemetry/pre-job behavior into per-scaleset GARM templates.
Extend the GARM API client and scaleset reconciler to manage template lifecycle (create/update/delete) and reference templates via template_id.
Update unit/integration tests to cover template API and template rendering (including an integration assertion against template content when available).

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
charms/tests/integration/test_garm.py	Uses charm-managed admin secret for first-run/login; adds integration check that runner options appear in rendered template content.
charms/garm/tests/unit/test_scaleset_reconciler.py	Adds unit coverage for template creation/update/delete behavior during reconcile.
charms/garm/tests/unit/test_runner_template.py	New unit tests for runner-install template rendering and `RunnerConfig` parsing.
charms/garm/tests/unit/test_garm_api.py	Adds unit tests for template CRUD/list API calls and base64 encoding behavior.
charms/garm/tests/unit/test_charm.py	Updates tests for reconcile path to account for `_maybe_first_run`; adds first-run tests.
charms/garm/src/scaleset_reconciler.py	Implements per-scaleset template ensure/update/delete and passes `template_id` into scaleset create/update payloads.
charms/garm/src/runner_template.py	New module rendering the injected bash blocks into a base template.
charms/garm/src/garm_api.py	Adds template list/get/create/update/delete methods.
charms/garm/src/charm.py	Wires runner config into `ScalesetSpec`, adds `_maybe_first_run`, fixes local API URL, triggers first-run retries.
charms/garm-configurator/tests/unit/test_charm.py	Adds validation tests for runner config options and asserts relation databag fields are written/absent appropriately.
charms/garm-configurator/tests/unit/test_charm_state.py	Updates tests for new `RunnerConfig` field in charm state.
charms/garm-configurator/src/charm.py	Writes runner-option fields into the `garm-configurator` relation databag.
charms/garm-configurator/src/charm_state.py	Adds `RunnerConfig` parsing/validation for new runner options.
charms/garm-configurator/charmcraft.yaml	Adds new runner-behaviour config options and descriptions.

…eared Address Copilot review on #238: - the reconciler now detaches the per-scaleset template (template_id -> 0) and deletes the unreferenced custom template when runner options are cleared, so operators can revert to the default template by clearing the config - aproxy-exclude-addresses rejects non-IP/CIDR tokens (they would fail in the nft ruleset and the failure would be swallowed) instead of silently accepting - align the runner-http-proxy and otel-collector-endpoint config descriptions with their http(s) URL validation

cbartz · 2026-06-16T14:25:56Z

Addressed all 5 review comments in 01a33bc:

1–2 (reconciler template detach): when runner options are cleared the reconciler now reverts the scaleset to GARM's default template (template_id → 0) and deletes the now-unreferenced custom template; added test_template_detached_when_runner_config_cleared.
3 (aproxy-exclude-addresses): now rejects non-IP/CIDR tokens instead of silently accepting them (they would fail in the nft ruleset); added a unit test.
4–5 (docs): updated the runner-http-proxy and otel-collector-endpoint config descriptions to state the http(s) URL requirement.

AI-generated message.

Copilot

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

Address Copilot review on #238: when runner options are set but the system github_linux template is transiently absent from list_templates, _ensure_template returned 0, which detached and deleted an existing per-scaleset template and lost the runner config. Now keep an existing custom template untouched in that case and only fall back to the default template when there is none to keep.

Copilot

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Address Copilot review on #238: - _template_bytes now caches data fetched via get_template back onto the template dict, avoiding redundant GETs when the system template is reused across specs - aproxy-exclude-addresses now rejects IPv6 (and hostnames): the rendered nft ruleset uses an IPv4-only `table ip`, so IPv6 would pass validation but fail at runtime; the config description is updated to say IPv4

Copilot

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Address Copilot review on #238: - _maybe_first_run now logs in first and only calls /first-run when login fails, avoiding repeated first-run calls (and the 409s they log) on every update-status tick while still self-healing if GARM's database is reset - the scaleset reconciler lists templates with the github_linux partial-name filter instead of pulling every template on the controller

Copilot

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

… creds helper Address Copilot review on #238: - block config that sets aproxy-exclude-addresses / aproxy-redirect-ports without runner-http-proxy, since the runner template only renders the aproxy block when a proxy is set (the options would otherwise silently no-op) - centralise the GARM admin-credentials secret lookup in a conftest helper (get_garm_admin_creds_from_secret) reused by garm_login_from_secret and the first-run integration helper

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

Address Copilot review on #238: the runner template renders relation databag values (aproxy redirect ports / exclude addresses) into a root-executed nft ruleset, so re-validate them at this trust boundary instead of relying on the configurator's validation alone — malformed and IPv6 tokens are dropped. Also document the unresolved ubuntu-vs-runner target for the static group-membership step under the existing ISD-278 TODO.

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.

Address Copilot review on #238: the consumer-side port sanitisation checked only token shape, so out-of-range (0, 99999) or inverted (443-80) ports could still reach the nft ruleset and break it. Now also enforce 1..65535 and N<=M.

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

Address Copilot review on #238: - _validate_http_url rejects URLs containing whitespace/control characters, which are later rendered into scripts and env files - the otel endpoint has CR/LF stripped before being written to the runner env file, so a databag value can't inject extra env entries - remove the now-unused garm_app_name parameter from garm_login_from_secret and update its call sites

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

…docs Address Copilot review on #238: - guard the update-status and configurator-relation handlers with is_ready() so the GARM client's 30s timeout can't hang a hook while the workload is starting - only call list_templates when a spec has runner options or an existing scaleset references a custom template, avoiding an API call per reconcile otherwise - derive collision-resistant heredoc delimiters so an operator pre-job-script containing the delimiter can't truncate the generated script/env file

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.

Address Copilot review on #238: _validate_http_url accepted URLs with invalid ports (e.g. http://host:bad) because urlparse does not validate them. Force evaluation of parsed.port (which raises on a bad port) and require a hostname.

florentianayuwono and others added 11 commits June 16, 2026 12:37

chore: add garm_configurator_v0 relation endpoints to both charms

2d471a1

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

feat(garm): add ScalesetReconciler with diff-and-apply logic

37b616a

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

feat(garm): add GarmClient REST API wrapper (garm_api.py)

6494b19

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

fix(garm): narrow 409 check in first_run to avoid false positives

9671594

Use 'status 409' substring match to match the specific error format string rather than bare '409' which could appear in error bodies. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

feat(garm-configurator): write scaleset data to garm-configurator rel…

b1e952e

…ation databag Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

feat(garm-configurator): add provider_name and credentials_name to Ch…

371f5df

…armState Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

feat(garm): add admin secrets and scaleset reconcile via garm-configu…

88a6983

…rator relation Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

test(garm): add integration tests for scaleset create/update/delete/d…

5d3df8e

…efer Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

style: fix import sort and blank line lint issues

dad0f08

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

cbartz requested a review from Copilot June 16, 2026 13:59

Copilot started reviewing on behalf of cbartz June 16, 2026 13:59 View session