Skip to content

controlplane: pgbouncer follow-ups — race fix, drift correction, hardening#456

Merged
fuziontech merged 1 commit intomainfrom
pgbouncer-followups
Apr 24, 2026
Merged

controlplane: pgbouncer follow-ups — race fix, drift correction, hardening#456
fuziontech merged 1 commit intomainfrom
pgbouncer-followups

Conversation

@fuziontech
Copy link
Copy Markdown
Member

Summary

Follow-ups from the #455 review, bundled into one PR.

  • Add configurable DuckDB extension loading #4 PUT race — new MutateManagedWarehouse store method wraps the merge in a transaction with SELECT ... FOR UPDATE, closing the read-modify-write race where concurrent PUTs could silently drop updates. Handler rewritten to use it; typed error routes validation failures to 400 vs. store errors to 500.
  • Update README with new features #5 Drift correction for spec.metadataStore.pgbouncer.enabled — controller now reconciles Ready warehouses. On drift, patches the Duckling CR with a JSON merge patch (RFC 7396) so sibling metadataStore fields stay intact. Scope is narrow by design: only this one field is reconciled today.
  • Add connection rate limiting to prevent brute-force attacks #3 Empty-endpoint guardducklingMetadataStoreAddress errors when both Endpoint and PgBouncerEndpoint are empty instead of letting an empty host propagate into the DSN.
  • Add pg_catalog compatibility and binary format support #2 JSON-tag coherence docstring — notes the invariant that managedWarehouseRequest json tags must stay aligned with ManagedWarehouse.

Scope decisions

  • Drift correction stays narrow (just pgbouncer.enabled). ACU, image, etc. aren't user-mutable via the admin API today, and aggressive drift correction would fight manual kubectl patches. Broader declarative reconciliation can land as its own PR if ops decide they want it.
  • No automatic retry on transaction conflict. SELECT FOR UPDATE blocks until the prior writer commits, so callers don't see conflicts in the normal path. If contention becomes a real problem we can add retry-on-40001.

Test plan

  • go build -tags kubernetes ./controlplane/... — clean
  • go vet -tags kubernetes ./controlplane/... — clean
  • go test -tags kubernetes ./controlplane/... — all green except the pre-existing docker-compose-dependent postgres integration tests (same note as feat(controlplane): per-warehouse PgBouncer opt-in for Duckling CR #455/feat(ducklake): route metadata catalog through per-Duckling PgBouncer #448)
  • New tests:
    • TestReconcileReadyPatchesCRWhenPgBouncerFlippedOn — enable path
    • TestReconcileReadyPatchesCRWhenPgBouncerFlippedOff — disable path
    • TestReconcileReadyNoDriftDoesNotPatch — no-op when in sync (asserts resourceVersion unchanged)
    • TestDucklingMetadataStoreAddressRejectsEmptyEndpoints — empty-endpoint guard
    • TestMutateManagedWarehouseSerializesConcurrentWriters — integration test: 8 goroutines increment a counter through Mutate; with plain Get+Upsert this loses updates, with the new transactional path it lands at 8. Runs in CI where docker-compose is available.
  • Manual canary after merge: flip pgbouncer_enabled on an existing Ready org via admin API; verify the controller patches the CR within one poll interval; confirm the Crossplane composition provisions/deprovisions the pooler accordingly.

Rollback

Revert. The new store method is additive — rolling it back returns the handler to Get+Upsert semantics (the race reopens, but correctness is preserved for non-concurrent callers). Drift correction is opt-in via the new Ready state in actionableStates; reverting removes that and Ready warehouses stop being reconciled, matching pre-PR behavior.

🤖 Generated with Claude Code

…ening

Follow-ups from the #455 review, bundled into one PR because #4 and #5
share code paths and reviewers.

**#4 — close the read-modify-write race on PUT /managed-warehouses/:id**

Adds `apiStore.MutateManagedWarehouse(orgID, mutate func(*ManagedWarehouse) error)`.
The gorm implementation runs inside `db.Transaction` and takes a row-level
`SELECT ... FOR UPDATE` on the warehouse row, so two concurrent PUTs on the
same org serialize cleanly instead of silently clobbering each other.
Validation runs inside the closure on the locked row; a typed
`warehouseBadRequestError` distinguishes caller errors (400) from store
errors (500). A new integration test in api_postgres_test.go fans 8
goroutines through Mutate and asserts no lost updates (runs in CI where
docker-compose is available).

**#5 — drift correction for spec.metadataStore.pgbouncer.enabled**

Adds Ready to the controller's actionable states and a new
`reconcileReady` path that reads the CR's current
`spec.metadataStore.pgbouncer.enabled`, compares against
`w.PgBouncer.Enabled` from the config store, and patches on mismatch using
a JSON merge patch (RFC 7396) so sibling metadataStore fields stay intact.

Scope is intentionally narrow: only this one field is reconciled. ACU,
image, and other spec fields are not user-mutable via the admin API today,
and aggressive drift correction would conflict with manual kubectl
patches. Three new tests cover flip-on, flip-off, and no-op-when-in-sync.

**#3 — empty-endpoint guard**

`ducklingMetadataStoreAddress` now errors when both `Endpoint` and
`PgBouncerEndpoint` are empty rather than silently returning ("", 5432,
false, nil) and letting the downstream DSN fail with an opaque connect
error.

**#2 — JSON-tag coherence docstring**

Notes that `managedWarehouseRequest` json tags must stay aligned with
`ManagedWarehouse` for the merge-via-unmarshal approach to work; adding a
field to the request struct without a matching tag on the target would
silently drop the value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@fuziontech fuziontech merged commit 46ec48b into main Apr 24, 2026
21 checks passed
@fuziontech fuziontech deleted the pgbouncer-followups branch April 24, 2026 01:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant