Skip to content

fix: clear error for hostname-less tenant listeners#220

Open
ecv wants to merge 1 commit into
mainfrom
fix/issue-219-listener-hostname-required
Open

fix: clear error for hostname-less tenant listeners#220
ecv wants to merge 1 commit into
mainfrom
fix/issue-219-listener-hostname-required

Conversation

@ecv

@ecv ecv commented Jun 24, 2026

Copy link
Copy Markdown

Summary

Closes #219.

Tenant Gateways (class datum-external-global-proxy) whose listeners use the managed ports (80/443) without a hostname were rejected at admission with the opaque upstream CEL message:

spec.listeners: Invalid value: Combination of port, protocol and hostname must be unique for each listener

— even for a single listener, which can't collide with anything.

Root cause

The mutating defaulter (GatewayCustomDefaulter.Default) injects default-http (80/HTTP) and default-https (443/HTTPS) with no hostname at creation (the canonical <uid>.<targetDomain> hostname is stamped later by the controller, since the UID isn't available in the create webhook). A tenant listener on the same port/protocol without a hostname produces a duplicate (port, protocol, hostname=∅) tuple, which the upstream Gateway API CRD CEL rejects.

Admission order is mutating webhook → CRD schema/CEL validation → validating webhook, so the CEL fires before validateListeners runs. That makes the existing validateListeners hostname-required branch (gateway_validation.go) effectively unreachable for tenant listeners at creation — the opaque CEL error always wins.

Fix

Reject hostname-less tenant (non-default) listeners in the defaulting webhook, before injecting the defaults — the defaulter returns error, which denies admission before the CEL stage, so the operator surfaces an actionable message:

spec.listeners[0].hostname: Required value: a hostname must be set; this controller
injects default HTTP/HTTPS listeners, so listeners you define must specify a
distinguishing hostname

default-named listeners are exempt (they're overwritten by defaulting and stamped by the controller), and bare gateways (no listeners) are unaffected.

This is the design-intent-aligned option from the issue discussion (#3): the design already mandates a distinguishing hostname on tenant listeners (ports are locked to 80/443, so hostname is the only uniqueness axis), and the injected defaults are meant to coexist with hostname-bearing tenant listeners. It does not dedupe/suppress the default listeners (#1, would drop the canonical hostname) and does not try to stamp a hostname at create (#2, UID unavailable then).

Tests

  • TestValidateTenantListenerHostnames: bare gateway, hostname-less tenant listener (rejected), hostname-bearing tenant listener (ok), default-named listeners (exempt).
  • go build ./..., go vet, and the validation package tests pass.

Validation note

Reproduced the original failure against a live project control plane with kubectl apply --dry-run=server: a single 80/HTTP listener with no hostname was rejected with the CEL "must be unique" message; adding a hostname made it apply cleanly — confirming the diagnosis.

Refs datum-cloud/infra#2804.

Tenant Gateways whose listeners use the managed ports (80/443) without a
hostname were rejected at admission with the opaque upstream CEL message
"Combination of port, protocol and hostname must be unique for each
listener", even for a single listener.

Root cause: the mutating defaulter injects default-http/default-https
listeners with no hostname at creation. A tenant listener on the same
port/protocol without a hostname produces a duplicate
(port, protocol, hostname) tuple, which the upstream Gateway API CRD CEL
rejects. CRD CEL validation runs after mutating webhooks but before
validating webhooks, so the equivalent check in validateListeners is
preempted and never surfaces at creation.

Reject hostname-less tenant (non-default) listeners in the defaulting
webhook, before injecting the defaults, so the operator returns an
actionable message instead of leaking the CEL error.

Closes #219
@ecv ecv marked this pull request as ready for review June 24, 2026 18:14
@ecv

ecv commented Jun 24, 2026

Copy link
Copy Markdown
Author

Setting to draft until the bug this fixes is reproduced with end to end tests in this repo.

@ecv

ecv commented Jun 24, 2026

Copy link
Copy Markdown
Author

E2E reproduction of the bug this PR fixes: #221 (separate branch, test-only, no production code). It reproduces the misleading must be unique rejection on unfixed main. When this fix merges, probes A/B in that test will need their expected message updated to the clear hostname-required message.

@ecv ecv marked this pull request as ready for review June 24, 2026 19:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Gateway: hostname-less tenant listeners rejected with misleading 'must be unique' (default-listener injection collision)

1 participant