K8SPG-1057: Allow using etcd as patroni DCS #1647
I'll wait for the Jira ticket before opening the PR in the Helm charts repo.
One more note: this is my first OSS contribution, so be gentle 😄
egegunes
left a comment
@yoav-katz the implementation looks good to me in general, but we definitely need an e2e test that deploys etcd and configures PerconaPGCluster to use it.
Would love to see this merged!
…erator into etcd-dcs
QUESTIONS FOR REVIEWERS:
Operator-managed etcd (future consideration)
The current design requires users to supply an external etcd cluster via spec:
    patroni:
      dcs:
        type: etcd
        etcd:
          managed:                 # operator deploys etcd itself
            replicas: 3            # 1 for dev, 3 for production HA
            storage: 1Gi
            storageClass: standard
          # endpoints: omitted when managed: is set
The operator would create and reconcile an etcd StatefulSet (with PVCs) co-located with the PostgreSQL cluster. This raises a few design questions:
commit: 692e3e3
Let's start with (a).
I don't think it's crucial but would be a good addition.
To start with, I think it's better not to allow live migration. We can revisit this after receiving feedback.
Why is the longer-term goal to replace label-based routing with HAProxy? Also, the operator already creates a headless service covering all postgres pods.
I don't think we should ever have etcd managed by the operator. It should be the user who configures and manages the etcd infrastructure.
let's remove this step completely
	ctx context.Context, cluster *v1beta1.PostgresCluster,
) error {
	// With etcd DCS, Patroni stores distributed configuration in etcd, not k8s Endpoints.
	if dcs := cluster.Spec.Patroni.GetDCS(); dcs != nil && dcs.Type == v1beta1.PatroniDCSTypeEtcd {
Maybe we can extract this into a method of PostgresCluster and use it everywhere.
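A minimal sketch of what such a helper could look like. The method name `UsesEtcdDCS` is hypothetical, and the types below are simplified stand-ins for the operator's `v1beta1` types (the sketch assumes `Patroni` is an optional pointer field and that `GetDCS` is nil-safe, as the call in the hunk above suggests). It is not the code from this PR.

```go
package main

import "fmt"

// Simplified stand-ins for the operator's v1beta1 API types.
// The real definitions may differ; only the names used in the diff are mirrored.
type PatroniDCSType string

const (
	PatroniDCSTypeKubernetes PatroniDCSType = "kubernetes"
	PatroniDCSTypeEtcd       PatroniDCSType = "etcd"
)

type PatroniDCS struct {
	Type PatroniDCSType
}

type PatroniSpec struct {
	DCS *PatroniDCS
}

// GetDCS tolerates a nil receiver so callers can chain it off an unset Patroni spec.
func (p *PatroniSpec) GetDCS() *PatroniDCS {
	if p == nil {
		return nil
	}
	return p.DCS
}

type PostgresClusterSpec struct {
	Patroni *PatroniSpec
}

type PostgresCluster struct {
	Spec PostgresClusterSpec
}

// UsesEtcdDCS (hypothetical name) wraps the repeated check in one place:
// it reports true only when a DCS block is present and its type is etcd.
func (c *PostgresCluster) UsesEtcdDCS() bool {
	if c == nil {
		return false
	}
	dcs := c.Spec.Patroni.GetDCS()
	return dcs != nil && dcs.Type == PatroniDCSTypeEtcd
}

func main() {
	unset := &PostgresCluster{}
	etcd := &PostgresCluster{Spec: PostgresClusterSpec{
		Patroni: &PatroniSpec{DCS: &PatroniDCS{Type: PatroniDCSTypeEtcd}},
	}}
	fmt.Println(unset.UsesEtcdDCS(), etcd.UsesEtcdDCS())
}
```

Call sites would then shrink to `if cluster.UsesEtcdDCS() { ... }`, which keeps the nil handling out of each reconciler.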
CHANGE DESCRIPTION
Problem:
Patroni supports multiple DCS backends, but the operator hardcodes Kubernetes Endpoints as the only option. This blocks clusters on managed Kubernetes platforms where workloads cannot reach the control plane API.
Cause:
The kubernetes: stanza was hardcoded in the generated Patroni config with no mechanism to select a different backend.
Several other pieces of the operator also assumed k8s DCS: RBAC rules unconditionally granted Endpoints permissions, the primary service routed through Patroni-managed Endpoints objects, and pod role labels/annotations were expected to be set by Patroni itself (which only happens with k8s DCS).
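To illustrate the RBAC part of this, here is a rough sketch of making the Endpoints grant conditional on the DCS type. `PolicyRule` is a trimmed stand-in for the Kubernetes RBAC type, `endpointsRules` is a hypothetical helper, and the verb list is illustrative rather than copied from the operator.

```go
package main

import "fmt"

// PolicyRule is a trimmed-down stand-in for the Kubernetes RBAC policy rule type.
type PolicyRule struct {
	APIGroups []string
	Resources []string
	Verbs     []string
}

// DCSType mirrors the two backend values named in the change description.
type DCSType string

const (
	DCSKubernetes DCSType = "kubernetes"
	DCSEtcd       DCSType = "etcd"
)

// endpointsRules (hypothetical helper) returns the Endpoints permissions
// Patroni needs for leader election and configuration storage.
// With etcd as the DCS, Patroni does not touch Endpoints, so nothing is granted.
// The verbs listed here are an assumption for illustration only.
func endpointsRules(dcs DCSType) []PolicyRule {
	if dcs == DCSEtcd {
		return nil
	}
	return []PolicyRule{{
		APIGroups: []string{""},
		Resources: []string{"endpoints"},
		Verbs:     []string{"create", "get", "list", "patch", "watch"},
	}}
}

func main() {
	fmt.Println(len(endpointsRules(DCSKubernetes)), len(endpointsRules(DCSEtcd)))
}
```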
Solution:
Add a spec.patroni.dcs field (type: kubernetes is the default; type: etcd is the alternative). The field is immutable after cluster creation, enforced by a CEL validation rule on the CRD.
When type: etcd, the operator:
The Kubernetes DCS path is unchanged.
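For readers unfamiliar with CEL-based immutability, the following is a sketch of how such a field could be declared with controller-gen markers. The markers themselves are standard, but their placement, the message text, and the `effectiveType` helper are assumptions made for illustration; the actual rule in this PR is not shown in the conversation.

```go
package main

import "fmt"

type PatroniDCSType string

const (
	PatroniDCSTypeKubernetes PatroniDCSType = "kubernetes"
	PatroniDCSTypeEtcd       PatroniDCSType = "etcd"
)

// PatroniDCS sketches the new spec.patroni.dcs block.
// The XValidation marker below is a CEL transition rule: it compares the new
// object (self) with the stored one (oldSelf) and rejects any change.
//
// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="dcs is immutable after cluster creation"
type PatroniDCS struct {
	// +kubebuilder:validation:Enum={kubernetes,etcd}
	// +kubebuilder:default=kubernetes
	Type PatroniDCSType `json:"type,omitempty"`
}

// effectiveType (hypothetical helper) applies the same default in Go code,
// so a missing block or an empty type is treated as the Kubernetes DCS.
func effectiveType(dcs *PatroniDCS) PatroniDCSType {
	if dcs == nil || dcs.Type == "" {
		return PatroniDCSTypeKubernetes
	}
	return dcs.Type
}

func main() {
	fmt.Println(effectiveType(nil), effectiveType(&PatroniDCS{Type: PatroniDCSTypeEtcd}))
}
```

One caveat worth checking in review: a rule that references `oldSelf` is only evaluated when both an old and a new value exist, so a rule placed on an optional block may not stop the block from being added after creation unless a parent-level rule or a default covers that case.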
CHECKLIST
Jira
Needs Doc) and QA (Needs QA)?
Tests
Config/Logging/Testability