fix(kyverno-policies): generate-pdb creates duplicates that block node drain in client-app namespaces #2

@danielgines

Description

Context

The Kyverno ClusterPolicy generate-pdb (defined in components/kyverno-policies/templates/generate-pdb.yaml) automatically generates a {{ .metadata.name }}-pdb PodDisruptionBudget for every Deployment with replicas > 1 that is not in the excluded-namespace-list (_helpers.tpl).
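For orientation, a condensed sketch of what such a generate rule typically looks like. This is an assumption reconstructed from the description above, not the actual contents of generate-pdb.yaml; the real template may differ in names and structure:

```yaml
# Hypothetical condensed sketch of the generate-pdb ClusterPolicy
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: generate-pdb
spec:
  generateExisting: true          # also applies to Deployments that already exist
  rules:
    - name: create-pdb
      match:
        any:
          - resources:
              kinds: [Deployment]
      exclude:
        any:
          - resources:
              namespaces: [argocd, cert-manager, cnpg-system]  # excluded-namespace-list
      preconditions:
        all:
          - key: "{{ request.object.spec.replicas }}"
            operator: GreaterThan
            value: 1
      generate:
        synchronize: true          # re-creates the PDB if it is deleted
        apiVersion: policy/v1
        kind: PodDisruptionBudget
        name: "{{ request.object.metadata.name }}-pdb"
        namespace: "{{ request.object.metadata.namespace }}"
        data:
          spec:
            maxUnavailable: 1
            selector: "{{ request.object.spec.selector }}"
```

Note how the generated PDB inherits the Deployment's own selector, which is exactly why it overlaps with any chart-shipped PDB targeting the same pods.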

The excluded-namespace-list covers platform namespaces (argocd, cert-manager, cnpg-system, etc.) — all places where the rendered Helm chart already provides its own PDB. It does not cover client application namespaces.

Bug observed

During teardown validation on transfero-workload-azure-eastus2-hml, terraform destroy failed to drain an AKS node because the Dapr sidecar-injector pod had two overlapping PDBs:

| PDB | Source | Selector |
| --- | --- | --- |
| dapr-sidecar-injector-disruption-budget | Dapr Helm chart (dapr_sidecar_injector/templates/dapr_sidecar_injector_poddisruptionbudget.yaml) | dapr.io/control-plane=dapr-sidecar-injector |
| dapr-sidecar-injector-pdb | Kyverno generate-pdb policy (this chart) | same matchLabels propagated from Deployment.spec.selector |

Kubernetes eviction API returns:

This pod has more than one PodDisruptionBudget, which the eviction subresource does not support

Every replica eviction fails, node drain blocks indefinitely, AKS agent pool delete fails, Terraform times out. Manual remediation: kubectl delete pdb --all -A across the cluster.

The same duplication affected kong-kong-pdb (from Kong chart) vs kong-kong (from Kyverno).

Root causes

  1. Excluded namespaces only cover platform-owned namespaces. When a client installs a Helm chart in a namespace outside the platform set (e.g., dapr-system, kong), and that chart ships its own PDB, the Kyverno generator duplicates it.
  2. The generator has no pre-existence check. It does not skip Deployments that already have a PDB selecting them.
  3. synchronize: true + generateExisting: true means deleting the Kyverno-generated PDB does not help — it gets re-created on the next reconcile.

Options

A — Extend the excluded-namespace-list

Add client-app namespaces where upstream charts already provide PDBs (dapr-system, kong, ...).

Pros: trivial change, unblocks today's cases.
Cons: doesn't scale — every new client app with PDBs needs a chart bump. Pollutes the "platform" excluded list with client-specific names.
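The change itself would be a two-line addition to the helper. A sketch, assuming the excluded-namespace-list is a simple define in _helpers.tpl (the actual helper name and shape are assumptions):

```yaml
# _helpers.tpl — hypothetical sketch; real define name/structure may differ
{{- define "kyverno-policies.excludedNamespaces" -}}
- argocd
- cert-manager
- cnpg-system
# client-app namespaces whose upstream charts ship their own PDBs (Option A):
- dapr-system
- kong
{{- end -}}
```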

B — Add an opt-out label on the Deployment or Namespace

Policy skips if estabilis.io/pdb: unmanaged is present on the Deployment (or its Namespace). Client opts out when their chart ships PDBs.

Pros: scales — client self-serves. Clear ownership model.
Cons: requires documentation and client awareness.
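This maps directly onto the policy's exclude block (this is the exclude.any.resources.selector.matchLabels mechanism referenced in the acceptance criteria; exact placement in the rule is a sketch):

```yaml
# Hypothetical addition to the generate-pdb rule's exclude block:
exclude:
  any:
    - resources:
        selector:
          matchLabels:
            estabilis.io/pdb: unmanaged
```

A client whose chart already ships PDBs would then set that label on the affected Deployments (or, if namespace-level opt-out is implemented, on the Namespace) and the generator never fires.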

C — Pre-existence check in the policy

Use a Kyverno JMESPath or CEL expression to query existing PodDisruptionBudgets in the namespace and skip if one selects the current Deployment.

Pros: fully automatic — zero configuration burden on clients.
Cons: expensive (policy checks all PDBs in the namespace per Deployment event), Kyverno's query capabilities are limited, complexity in testing.
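For reference, a rough sketch of what this could look like using a Kyverno apiCall context plus a precondition. Note the simplification: this version only checks whether any PDB exists in the namespace, because matching an arbitrary PDB selector against the incoming Deployment's labels in JMESPath is exactly the limited-query-capability problem cited above. Treat this as an illustration of the approach, not a working rule:

```yaml
# Hypothetical sketch — skips generation if the namespace already has any PDB.
context:
  - name: existingPDBs
    apiCall:
      urlPath: "/apis/policy/v1/namespaces/{{ request.object.metadata.namespace }}/poddisruptionbudgets"
      jmesPath: "items"
preconditions:
  all:
    - key: "{{ existingPDBs | length(@) }}"
      operator: Equals
      value: 0
```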

D — Rename the generated PDB to a namespaced distinctive name

Instead of {{name}}-pdb, use estabilis-{{name}}-pdb. This does not solve the duplicate-PDB problem (both still select the same pod) but makes the Kyverno-owned PDB clearly distinguishable.

Pros: clarity of ownership.
Cons: doesn't fix the core bug.

Recommendation

Combine B (opt-out label) + A (immediate exclude for known cases).

  • Apply A now: add dapr-system and kong to the excluded-namespace-list as a quick fix
  • Implement B in a follow-up PR: add the opt-out label check in the match expression, document in the chart README

Longer-term, consider C when Kyverno's query API matures or when a generate-if-absent mutation pattern is available.

Acceptance criteria

  • dapr-system and kong added to the excluded-namespace-list in _helpers.tpl
  • Opt-out label estabilis.io/pdb: unmanaged honored by the generate-pdb policy via exclude.any.resources.selector.matchLabels
  • README updated with the opt-out mechanism and an example
  • Chart bumped (workload-bootstrap + values.yaml.repoVersion, per cross-repo rules)
  • Tested on a fresh workload cluster: Dapr charts install, no duplicate PDBs, kubectl drain works

Related

  • Teardown validation session (2026-04-14) — blocker for terraform destroy on workload cluster
  • kubernetes/kubernetes#72320 — canonical issue documenting eviction API limitation with multiple PDBs

Evidence

Deploy age at time of failure: 95m (initial workload-bootstrap) + 39m (re-sync after bridge refactor).

Kyverno-generated PDBs at time of drain failure:

  • dapr-operator-pdb
  • dapr-sentry-pdb
  • dapr-sidecar-injector-pdb
  • kong-kong-pdb

Dapr chart-generated PDBs:

  • dapr-operator-disruption-budget
  • dapr-sentry-budget
  • dapr-sidecar-injector-disruption-budget
  • (also dapr-placement-server-disruption-budget and dapr-scheduler-server-disruption-budget — these had no Kyverno duplicate because their workloads are StatefulSets, not Deployments, so the policy's Deployment match never fired)

Labels: bug