Skip to content

Add application credential finalizer management#1108

Open
Deydra71 wants to merge 1 commit intoopenstack-k8s-operators:mainfrom
Deydra71:appcred-finalizer
Open

Add application credential finalizer management#1108
Deydra71 wants to merge 1 commit intoopenstack-k8s-operators:mainfrom
Deydra71:appcred-finalizer

Conversation

@Deydra71
Copy link
Copy Markdown
Contributor

@Deydra71 Deydra71 commented Apr 27, 2026

Jira: OSPRH-29269

Application Credential dev-doc: https://github.com/openstack-k8s-operators/dev-docs/blob/main/application_credentials.md

  • Tracks the active AC secret name in Status.ApplicationCredentialSecret
  • Add openstack.org/nova-ac-consumer finalizer to the AC secret after service config is rendered
  • On AC rotation, move the finalizer from the old secret to the new one
  • On CR deletion, remove the consumer finalizer from the AC secret before cleaning up the CR

This ensures that the keystone-operator cannot revoke a rotated AC secret while Nova is still consuming it.

2026-04-28T12:00:02Z	INFO	Controllers.Nova	Added consumer finalizer	{"controller": "nova", "controllerGroup": "nova.openstack.org", "controllerKind": "Nova", "Nova": {"name":"nova","namespace":"openstack"}, "namespace": "openstack", "name": "nova", "reconcileID": "28bfd03d-0b09-4b34-a505-992dd89e029e", "object": "ac-nova-c91b6-secret", "finalizer": "openstack.org/nova-ac-consumer"}
2026-04-28T12:00:02Z	INFO	Controllers.Nova	Removed consumer finalizer	{"controller": "nova", "controllerGroup": "nova.openstack.org", "controllerKind": "Nova", "Nova": {"name":"nova","namespace":"openstack"}, "namespace": "openstack", "name": "nova", "reconcileID": "28bfd03d-0b09-4b34-a505-992dd89e029e", "object": "ac-nova-62906-secret", "finalizer": "openstack.org/nova-ac-consumer"}

Depends-On: openstack-k8s-operators/keystone-operator#685

Assisted-by: Claude Opus 4.6 noreply@anthropic.com

Signed-off-by: Veronika Fisarova <vfisarov@redhat.com>
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 27, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Deydra71
Once this PR has been reviewed and has the lgtm label, please assign kk7ds for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@softwarefactory-project-zuul
Copy link
Copy Markdown

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/09a52fa229b0443eae25a234ac4cd247

✔️ openstack-meta-content-provider SUCCESS in 4h 16m 43s
nova-operator-kuttl FAILURE in 59m 28s
✔️ nova-operator-tempest-multinode SUCCESS in 2h 12m 46s
✔️ nova-operator-tempest-multinode-ceph SUCCESS in 2h 39m 44s

@Deydra71
Copy link
Copy Markdown
Contributor Author

Note: this PR does not include EDPM-aware revocation blocking. Nova-operator and telemetry-operator will not need code changes for the EDPM tracking problem.

The EDPM credential lifecycle gap (old AC secrets being revoked while EDPM nodes still use them) will be handled entirely in keystone-operator, building on the SecretDeploymentStatus / AreAllNodesUpdated() tracking from openstack-operator PR #1781.

The plan:

  • keystone-operator's cleanupUnusedRotatedSecrets() will add a NodeSet status check only for EDPM-consuming ACs (ac-nova, ac-ceilometer). Before revoking an old Keystone AC or deleting its K8s Secret, it will list all OpenStackDataPlaneNodeSet CRs and check AreAllNodesUpdated(). If any NodeSet has pending node updates, revocation/deletion is skipped and retried on the next reconcile.
  • The same guard will apply during reconcileDelete() for Keystone-side revocation.
  • Nova-operator and telemetry-operator continue managing their control-plane consumer finalizers (openstack.org/nova-ac-consumer, openstack.org/ceilometer-ac-consumer) exactly as implemented in this PR. No EDPM specific finalizer or NodeSet watch is needed in the service operators.

This keeps the tracking responsibility centralized in keystone-operator (the credential owner) rather than duplicating NodeSet awareness across service operators. Same pattern as infra-operator's ae1787c for RabbitMQ user deletion.

@SeanMooney
Copy link
Copy Markdown
Contributor

The plan:

  • keystone-operator's cleanupUnusedRotatedSecrets() will add a NodeSet
    status check only for EDPM-consuming ACs (ac-nova, ac-ceilometer). Before
    revoking an old Keystone AC or deleting its K8s Secret, it will list all
    OpenStackDataPlaneNodeSet CRs and check AreAllNodesUpdated(). If any
    NodeSet has pending node updates, revocation/deletion is skipped and retried
    on the next reconcile.

Service operators are not allowed to depend on the OpenStackControlPlane or
OpenStackDataPlaneNodeSet CRs.

So this also breaks the architecture by creating a circular dependency between
the meta operator and the keystone operator. Service operators are not allowed to
read any input CRs reconsiled by the openstack operator like the node set cr.
That can and has resulted in reconcile loops in the past, so that is not a
sound approach to take and arguable a bigger regression then the issue you are trying
to solve. this is related to https://github.com/openstack-k8s-operators/dev-docs/blob/main/developer.md#external-crd-dependencies

A better approach is to follow how the RabbitMQ rotation works
where the infra operator will remove the finalizer from the application
credential CR once the update of the EDPM nodes has been completed.

We just need to provide the reference to those in the config maps that we
generate so it has the names and can add and remove the finalizer
automatically.

The keystone operator does not need to actually do anything special. It can
delete the CR and rely on K8s to not actually do the deletion until the last
finalizer is removed. When the CR is finally deleted, the keystone operator can
do any remaining work that is required, but we should rely on finalizers and
K8s native mechanisms to guard the actual deletion of the API resource.

@Deydra71
Copy link
Copy Markdown
Contributor Author

Hi @SeanMooney - thanks for flagging the concern, I want to clarify the planned approach since there's no keystone-operator code up for review yet regarding the EDMP tracking issue, and I want to make sure we align before publishing a PR.

What we're planning to do (and not do):
The design follows the same pattern as infra-operator's RabbitMQ user deletion blocking (infra-operator commit ae1787c4, which depends on openstack-operator PR #1781). Specifically:

  • keystone-operator would read OpenStackDataPlaneNodeSet.Status.SecretDeployment.AllNodesUpdated (read-only, same field infra-operator reads)
  • keystone-operator would not write, update, patch, or own any OpenStackDataPlaneNodeSet resource

With that, the check is unidirectional: NodeSet status flows into keystone-operator's decision on whether to revoke old rotated application credentials. There is no feedback path - keystone-operator's actions (revoking an AC, deleting a Secret) do not trigger NodeSet status changes and that is identical to what infra-operator does in isUserStillInUseByNodeSets() for RabbitMQ users.

On the referenced guideline:
The linked file in dev-doc AFAIU discusses how to reference external CRDs, it doesn't contain guidance about directional dependency constraints between operators or prohibitions on service operators reading CRs reconciled by openstack-operator. If there's a separate documented policy on this, could you point me to it? I want to make sure we both follow the right constraint.

On the reconciliation loop concern:
This is a valid concern in general,however the pattern here is a read-only status check with no write-back - the same pattern infra-operator already uses. If a NodeSet status change triggers a keystone-operator reconcile, the reconciler reads the status, decides whether cleanup is safe, and either proceeds or re-queues. It never modifies the NodeSet, so there's no opportunity for a loop.

With the above - my questions to resolve before proceeding:

  • Is the infra-operator pattern (commit ae1787c4 - reading NodeSet status from RabbitMQ user controller) considered acceptable? If so, replicating it in keystone-operator should be equally acceptable since it's the same read-only interaction
  • If this pattern is not acceptable for service operators in general, should infra-operator's approach also be revised?
  • If neither should read-only NodeSet CRs, what's the preferred mechanism?

@SeanMooney
Copy link
Copy Markdown
Contributor

so the issue is service operator like keyston are not allwo to read OpenStackDataPlaneNodeSet.Status

or interact with CRs of that kind/CRD at all.

the infra operator is not a service operator.

the keystone-oporatr shoudl jsut watch the ApplciationCredtial CR for the finaliser to be remvoed and the openstack operator shoudl do the removal of the filaiser from the applciation creditail cr when all edpm nodes are updated.

if the keystone operatort is aware of OpenStackDataPlaneNodeSet at all directly thats a problem

you are not ment to have refence to the CRDs contained in https://github.com/openstack-k8s-operators/openstack-operator in service operators, the infra operator was created so that service operator could depend on resouce form a common operatior that was not the openstack operator.

so my meta point is its not ok to implemnt the same pattern the infa operator does in teh keystone operator.

@Deydra71
Copy link
Copy Markdown
Contributor Author

Deydra71 commented May 4, 2026

@SeanMooney Alright, I now understand the full picture and the difference from infra-operator, thanks for explaining. With that the updated approach is:

  • openstack-operator will manage an EDPM finalizer on the KeystoneApplicationCredential CR for Nova and Ceilometer ACs. It will add the finalizer when the AC secret is deployed to EDPM nodes, and remove it once SecretDeploymentStatus.AllNodesUpdated is true (utilizing #1781)
  • keystone-operator remains completely unaware of NodeSets. It just respects the finalizer and K8s won't actually delete the KeystoneApplicationCredential CR until openstack-operator removes the EDPM finalizer, so keystone-operator's reconcileDelete (which revokes the credential in Keystone) won't run

Please let me know if it now makes sense.

@Deydra71
Copy link
Copy Markdown
Contributor Author

Deydra71 commented May 4, 2026

Note: The openstack-k8s-operators/openstack-operator#1781 was reworked, so the rpevious approach is no longer applicable. The PR now only guards NodeSet.Status.SecretHashes from being updated on ansibleLimit scoped deployments. I assume the new mechanism for blocking deletion would be to compare hash with the live secret content against NodeSet.Status.SecretHashes

@Deydra71
Copy link
Copy Markdown
Contributor Author

Deydra71 commented May 4, 2026

We could have this updated approach after #1781 lands - NodeSet.Status.SecretHashes won't be updated on ansibleLimit-scoped deployments which gives us a signal for whether all EDPM nodes have the latest secrets.

There are two paths where an old credential can be revoked while EDPM nodes still use it:

  1. AC CR deletion (user disables AC auth) - openstack-operator deletes the KeystoneApplicationCredential CR -> keystone-operator's reconcileDelete revokes in Keystone
  2. Rotation cleanup - keystone-operator automatically revokes old AC secrets once all consumer finalizers are removed by service operators, but EDPM nodes may not have been redeployed yet

A finalizer on the AC CR would only guard path 1. It does not guard path 2, because rotation cleanup operates on the AC secret - the CR stays alive, keystone-operator just revokes the old secret (immediately previous secret is still guarded too) underneath it once consumer finalizers are gone.

Proposed approach:

  • Path 2 (rotation): openstack-operator adds openstack.org/edpm-ac-consumer finalizer on AC secrets for Nova and Ceilometer, alongside the existing service operator consumer finalizers. openstack-operator removes this finalizer only when hash(live config secret) matches NodeSet.Status.SecretHashes for all NodeSets - meaning all nodes have the new credential. keystone-operator already respects consumer finalizers, so revocation is naturally blocked until EDPM is synced.
  • Path 1 (CR deletion): An additional guard in CleanupApplicationCredentialForService checks the same hash condition before deleting the AC CR.
    The config secret names are discovered dynamically by reading OpenStackDataPlaneService CRs dataSources.

Specifically in openstack-operator:

  • EDPM finalizer addition happens in EnsureApplicationCredentialForService - when the AC CR is ready and the AC secret name is returned (L161-163), openstack-operator adds the openstack.org/edpm-ac-consumer finalizer to that AC secret for Nova and Ceilometer services
  • EDPM finalizer removal happens in the same function during subsequent reconciles - compare the live config secret hash against NodeSet.Status.SecretHashes, and remove the EDPM finalizer from the old AC secret once all NodeSets are in sync
  • CR deletion guard is in CleanupApplicationCredentialForService - checks the hash condition before proceeding with the Delete call

@SeanMooney Could you please review this updated proposed approach whether it aligns with the intended architecture/design?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants