[fix] provision_operator: derive identity + endpoints generically (ADR 0407) #25

Open
baditaflorin wants to merge 3 commits into main from claude/provision-operator-localize

@baditaflorin

Owner

Summary

scripts/provision_operator.py is the one-shot operator onboarding tool from
ADR 0318: it creates a Keycloak user, mints a Headscale VPN key, and emails
the new operator their SSO credentials, service list, and SSH/VPN tutorial.
It still carried pre-migration generic constants and never adopted the
ADR-0385/0407 identity-derivation pattern. Against the live deployment it
targeted a non-existent Keycloak realm (lv3, now 404) and unreachable
*.localhost hosts, and emailed a stale service list.

This PR makes the script work as a single command on any deployment without
introducing deployment-specific literals, which keeps the public mirror diff
minimal:

  • Identity derived at runtime from inventory + the .local overlay via
    scripts/identity_yaml.load_identity_vars(): realm = platform_domain.split('.')[0],
    admin ids <prefix>-bootstrap-admin / <prefix>-admin-runtime, and
    Keycloak / Headscale / step-ca URLs derived from the domain.
  • Welcome-email service list is now generated from
    config/service-capability-catalog.json (public entries), replacing the
    generic example.com with the live domain and grouping by catalog category,
    so there is no more hand-maintained stale list.
  • Mail send switched from the broken smtplib-to-docker-hostname path to the
    documented mail-gateway HTTP API (service_url("mail_platform") + /send,
    platform-transactional gateway key) reached via the SSH proxy.
  • --dry-run now prints the resolved identity and the fully rendered email.
  • PLATFORM_DOMAIN, LV3_KEYCLOAK_REALM, and the existing LV3_*_URL env
    overrides are preserved; an explicit PLATFORM_DOMAIN override now also
    drives the prefix/realm so they can't disagree.
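
To make the derivation rules in the list above concrete, here is a minimal sketch of how the identity could be resolved. It is not the code in this PR: `ResolvedIdentity` and `resolve_identity` are hypothetical names, the inventory lookup is reduced to a plain `inventory_domain` argument, and the precedence of the env overrides (and the assumption that the config prefix is the first label of the domain) is inferred from the description rather than confirmed.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ResolvedIdentity:
    """Hypothetical container for the values the script derives at runtime."""
    domain: str
    realm: str
    bootstrap_admin: str
    admin_client_id: str
    keycloak_url: str


def resolve_identity(
    inventory_domain: str, env: Optional[Mapping[str, str]] = None
) -> ResolvedIdentity:
    """Derive realm, admin ids and the Keycloak URL from the platform domain.

    Assumed precedence: PLATFORM_DOMAIN overrides the inventory value and
    drives the prefix, so prefix and default realm cannot disagree.
    """
    env = env or {}
    domain = env.get("PLATFORM_DOMAIN") or inventory_domain
    prefix = domain.split(".")[0]  # realm = platform_domain.split('.')[0]
    realm = env.get("LV3_KEYCLOAK_REALM") or prefix
    # The sso.<domain> host pattern is taken from the dry-run output below.
    keycloak_url = env.get("LV3_KEYCLOAK_URL") or f"https://sso.{domain}"
    return ResolvedIdentity(
        domain=domain,
        realm=realm,
        bootstrap_admin=f"{prefix}-bootstrap-admin",
        admin_client_id=f"{prefix}-admin-runtime",
        keycloak_url=keycloak_url,
    )


if __name__ == "__main__":
    print(resolve_identity("example.com"))
```

Passing the environment in as a mapping, instead of reading `os.environ` inside the function, keeps the resolution easy to unit-test.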

Test plan

  • pytest tests/test_provision_operator.py — 12 passed (added coverage for
    identity resolution, domain substitution in the service list, and payload URL
    derivation; updated the mail-gateway helper test).
  • --dry-run against the real overlay resolves realm=0mcp,
    https://sso.0mcp.com, and the catalog-driven service list with the live
    domain — and the committed file contains no deployment-specific literals.
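
For readers unfamiliar with the catalog-driven service list exercised by these tests, the rendering might look roughly like the sketch below. The catalog schema is an assumption: the field names `name`, `category`, `url`, and `public` are hypothetical, as is the `render_service_list` helper, and the real config/service-capability-catalog.json may be shaped differently.

```python
from collections import defaultdict
from typing import Iterable, Mapping


def render_service_list(
    entries: Iterable[Mapping[str, object]],
    live_domain: str,
    generic_domain: str = "example.com",
) -> str:
    """Render public catalog entries grouped by category.

    The generic domain committed in the catalog is replaced with the live
    domain at render time, so no deployment literal lives in the repository.
    """
    grouped = defaultdict(list)
    for entry in entries:
        if not entry.get("public"):  # hypothetical flag for public entries
            continue
        url = str(entry["url"]).replace(generic_domain, live_domain)
        grouped[str(entry["category"])].append(f"  - {entry['name']}: {url}")

    lines = []
    for category in sorted(grouped):
        lines.append(f"{category}:")
        lines.extend(sorted(grouped[category]))
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        {"name": "Grafana", "category": "observability",
         "url": "https://grafana.example.com", "public": True},
        {"name": "SSO", "category": "identity",
         "url": "https://sso.example.com", "public": True},
    ]
    print(render_service_list(sample, "acme.test"))
```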

🤖 Generated with Claude Code

baditaflorin and others added 3 commits June 8, 2026 22:36
Localize provision_operator.py to derive Keycloak realm, config prefix, admin
identifiers, and service endpoints from inventory + the .local identity overlay
(ADR 0407), instead of the stale lv3/*.localhost generic constants.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…R 0407)

provision_operator.py carried pre-migration generic constants (Keycloak realm
"lv3", *.localhost endpoints, a stale hardcoded email service list, and a fixed
SMTP relay). On the live deployment, the realm, endpoints, and admin
identifiers derive from platform_domain, so the script targeted a non-existent
realm and unreachable hosts.

Now everything is derived at runtime from inventory + the .local identity
overlay (scripts/identity_yaml.load_identity_vars), keeping the committed
script generic (no deployment literals):

- realm = platform_domain.split('.')[0]; admin ids = <prefix>-{bootstrap-admin,
  admin-runtime}; Keycloak/Headscale/step-ca URLs derived from the domain.
- Welcome-email service list is built from config/service-capability-catalog.json
  (public entries) with example.com -> live domain, grouped by category.
- Mail send switched from the broken smtplib-to-docker-hostname path to the
  documented mail-gateway HTTP API (service_url("mail_platform") + /send,
  platform-transactional gateway key) via the SSH proxy.
- --dry-run now prints resolved identity and the rendered welcome email.

PLATFORM_DOMAIN / LV3_KEYCLOAK_REALM / LV3_* URL env overrides preserved.
Tests updated + added for identity resolution and service rendering.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…back

operator_manager.py hardcoded KEYCLOAK_REALM="lv3" and
KEYCLOAK_BOOTSTRAP_ADMIN="lv3-bootstrap-admin", so the governed
recover-totp / reset-password / onboard flows targeted a non-existent realm on
the live deployment (realm is platform_domain.split('.')[0]).

- Derive realm, config prefix, bootstrap-admin and admin-client ids from
  inventory + the .local identity overlay (identity_yaml.load_identity_vars),
  matching the provision_operator.py fix. No deployment literals committed.
- KeycloakAdminAdapter now accepts an optional admin client id + secret loader
  and prefers the client-credentials grant, falling back to the bootstrap-admin
  password grant. This keeps admin auth working when the bootstrap password has
  been rotated (the live case). operator_manager wires in <prefix>-admin-runtime
  + .local/keycloak/admin-client-secret.txt.
- PLATFORM_DOMAIN / LV3_KEYCLOAK_REALM / LV3_KEYCLOAK_ADMIN_CLIENT_SECRET env
  overrides honored. Added tests for the auth fallback and identity resolution.

Note: admin REST API is internal-only (public edge returns 403), so these
commands must run from a tailnet/controller host (or with the internal Keycloak
URL forwarded via LV3_KEYCLOAK_URL).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@baditaflorin

Owner Author

Extended this PR with a second commit (ca9237b3f) that generalizes operator_manager.py's realm/admin derivation the same way and adds a client-credentials auth fallback to KeycloakAdminAdapter, so recover-totp / reset-password / onboard work against the live realm even when the bootstrap-admin password is stale. No deployment literals are committed. I verified that the module resolves the real realm and admin client, and 34 of 35 tests pass; the one failure, test_repo_operator_roster_validates, is a pre-existing generic-roster vs. real-name mismatch unrelated to this change.
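
The auth order described in the commit message (client-credentials grant first, bootstrap-admin password grant as fallback) can be sketched as follows. This is an illustration under stated assumptions, not the KeycloakAdminAdapter code: `fetch_admin_token`, `TokenError`, and the injected `post_form` transport are hypothetical names, and the use of the `admin-cli` client for the password grant and of the same realm for both grants are assumptions. Only the token endpoint path and the grant parameters follow standard Keycloak / OAuth 2.0 conventions.

```python
from typing import Callable, Mapping, Optional


class TokenError(Exception):
    """Raised by the (hypothetical) transport when a grant is rejected."""


# post_form(url, form_fields) -> decoded JSON response; injected so the
# sketch stays network-free and testable.
PostForm = Callable[[str, Mapping[str, str]], Mapping[str, str]]


def fetch_admin_token(
    post_form: PostForm,
    base_url: str,
    realm: str,
    bootstrap_user: str,
    load_bootstrap_password: Callable[[], str],
    admin_client_id: Optional[str] = None,
    load_client_secret: Optional[Callable[[], str]] = None,
) -> str:
    """Prefer the client-credentials grant, fall back to the password grant."""
    token_url = (
        f"{base_url.rstrip('/')}/realms/{realm}/protocol/openid-connect/token"
    )
    if admin_client_id and load_client_secret:
        try:
            form = {
                "grant_type": "client_credentials",
                "client_id": admin_client_id,
                "client_secret": load_client_secret(),
            }
            return post_form(token_url, form)["access_token"]
        except (TokenError, OSError):
            # Secret file missing or grant rejected: try the bootstrap admin.
            pass
    form = {
        "grant_type": "password",
        "client_id": "admin-cli",  # assumption: Keycloak's built-in CLI client
        "username": bootstrap_user,
        "password": load_bootstrap_password(),
    }
    return post_form(token_url, form)["access_token"]
```

Loading the secrets through callables means the bootstrap password is only read when the fallback is actually needed, which matters when that password has been rotated and the file is stale.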
