
[CONTP-1335] organize finalizers #2879

Open
nlchung wants to merge 9 commits into main from nlchung/CONTP-1335-organize-finalizers

Conversation

nlchung commented Apr 8, 2026

What does this PR do?

Migrates 5 controllers to use the shared internal/controller/finalizer/ package, eliminating duplicated finalizer boilerplate across DatadogAgent, DatadogAgentInternal, DatadogMonitor, DatadogDashboard, and DatadogGenericResource. (DatadogSLO already used the shared package.)

Each controller's handleFinalizer() and addFinalizer() methods are replaced with a deleteResource() callback that wraps the existing cleanup logic, passed to the shared finalizer.NewFinalizer(...).HandleFinalizer(...).
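A minimal sketch of the callback pattern, using only the standard library. `Object`, `Finalizer`, the simplified `NewFinalizer` arguments, and the `ResourceDeleteFunc` signature below are stand-ins assumed for illustration; the real package works with controller-runtime types, and its exact signatures are not shown in this description.

```go
package main

import "fmt"

// Object is a hypothetical stand-in for a Kubernetes custom resource.
type Object struct {
	Name       string
	Finalizers []string
	Deleting   bool // stands in for a non-nil deletion timestamp
}

// ResourceDeleteFunc is an assumed, simplified callback signature.
type ResourceDeleteFunc func(obj *Object) error

// Finalizer is a stand-in for the shared helper.
type Finalizer struct {
	name     string
	deleteFn ResourceDeleteFunc
}

func NewFinalizer(name string, deleteFn ResourceDeleteFunc) *Finalizer {
	return &Finalizer{name: name, deleteFn: deleteFn}
}

// HandleFinalizer adds the finalizer to live objects. For deleting objects
// it runs the callback and removes the finalizer only if the callback succeeds.
func (f *Finalizer) HandleFinalizer(obj *Object) error {
	idx := -1
	for i, fin := range obj.Finalizers {
		if fin == f.name {
			idx = i
		}
	}
	switch {
	case !obj.Deleting && idx < 0:
		obj.Finalizers = append(obj.Finalizers, f.name)
	case obj.Deleting && idx >= 0:
		if err := f.deleteFn(obj); err != nil {
			return err
		}
		obj.Finalizers = append(obj.Finalizers[:idx], obj.Finalizers[idx+1:]...)
	}
	return nil
}

// Reconciler is a hypothetical controller with pre-existing cleanup logic.
type Reconciler struct {
	cleaned []string
}

// cleanup represents the cleanup code a controller already had.
func (r *Reconciler) cleanup(obj *Object) error {
	r.cleaned = append(r.cleaned, obj.Name)
	return nil
}

// deleteResource wraps the existing cleanup in the callback type.
func (r *Reconciler) deleteResource() ResourceDeleteFunc {
	return func(obj *Object) error { return r.cleanup(obj) }
}

func main() {
	r := &Reconciler{}
	obj := &Object{Name: "example"}
	fin := NewFinalizer("finalizer.example.com", r.deleteResource())

	_ = fin.HandleFinalizer(obj) // live object: finalizer is added
	fmt.Println(obj.Finalizers)

	obj.Deleting = true
	_ = fin.HandleFinalizer(obj) // deleting object: cleanup runs, finalizer removed
	fmt.Println(obj.Finalizers, r.cleaned)
}
```

The controller keeps only the cleanup it already owned; the add/check/remove scaffolding lives in one place.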

Motivation

CONTP-1335: The finalizer package existed but only DatadogSLO used it. All other controllers copy-pasted the same ~40-60 lines of add/check/remove finalizer scaffolding. This consolidates them.

Changes

Shared HandleFinalizer owns the full lifecycle. It adds the finalizer on create; on deletion it runs the delete callback, removes the finalizer, and returns a requeue result so that ShouldReturn exits the reconcile early. Callers no longer need to check GetDeletionTimestamp() after HandleFinalizer. The post-deletion requeue uses the controller's configured defaultRequeuePeriod:

  • API-calling controllers (dashboard, monitor, genericresource, SLO) requeue with their defaultRequeuePeriod (60s), avoiding hot-loop reconcile churn on clusters with many terminating objects.
  • Agent controllers pass 0, getting immediate requeue (same as before).
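The requeue behaviour described above could look roughly like the following sketch. `Result` stands in for `ctrl.Result`, and `ShouldReturn` is written the way `utils.ShouldReturn` is assumed to behave; the `Finalizer` struct, its fields, and the decision to requeue on every deletion pass are illustrative assumptions, not the package's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// Result is a stand-in for controller-runtime's ctrl.Result.
type Result struct {
	Requeue      bool
	RequeueAfter time.Duration
}

// Object is a hypothetical, minimal custom resource.
type Object struct {
	Finalizers []string
	Deleting   bool // stands in for a non-nil deletion timestamp
}

// Finalizer is an assumed shape for the shared helper.
type Finalizer struct {
	Name          string
	DeleteFn      func(*Object) error
	RequeuePeriod time.Duration // 0 means "requeue immediately"
}

const (
	finalizerName    = "finalizer.example.com"
	apiRequeuePeriod = 60 * time.Second // the 60s default mentioned above
)

func noopDelete(*Object) error { return nil }

func indexOf(list []string, s string) int {
	for i, v := range list {
		if v == s {
			return i
		}
	}
	return -1
}

// HandleFinalizer returns an empty Result for live objects so the caller
// keeps reconciling, and a requeue Result on the deletion path.
func (f Finalizer) HandleFinalizer(obj *Object) (Result, error) {
	idx := indexOf(obj.Finalizers, f.Name)
	if !obj.Deleting {
		if idx < 0 {
			obj.Finalizers = append(obj.Finalizers, f.Name)
		}
		return Result{}, nil
	}
	if idx >= 0 {
		if err := f.DeleteFn(obj); err != nil {
			return Result{}, err
		}
		obj.Finalizers = append(obj.Finalizers[:idx], obj.Finalizers[idx+1:]...)
	}
	if f.RequeuePeriod > 0 {
		return Result{RequeueAfter: f.RequeuePeriod}, nil
	}
	return Result{Requeue: true}, nil
}

// ShouldReturn reports whether the reconcile loop should exit early
// (assumed behaviour of the operator's helper of the same name).
func ShouldReturn(res Result, err error) bool {
	return err != nil || res.Requeue || res.RequeueAfter > 0
}

func main() {
	api := Finalizer{Name: finalizerName, DeleteFn: noopDelete, RequeuePeriod: apiRequeuePeriod}
	agent := Finalizer{Name: finalizerName, DeleteFn: noopDelete}

	live := &Object{}
	res, err := api.HandleFinalizer(live)
	fmt.Println("live object, exit early:", ShouldReturn(res, err))

	res, err = api.HandleFinalizer(&Object{Finalizers: []string{finalizerName}, Deleting: true})
	fmt.Println("API controller, requeue after:", res.RequeueAfter)

	res, err = agent.HandleFinalizer(&Object{Finalizers: []string{finalizerName}, Deleting: true})
	fmt.Println("agent controller, immediate requeue:", res.Requeue, err)
}
```

Because the deletion path always yields a Result that `ShouldReturn` treats as terminal, the caller needs no separate deletion-timestamp check.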

deleteResource signature standardized. All controllers now use deleteResource(logger logr.Logger) finalizer.ResourceDeleteFunc (except DatadogAgentInternal, which gets its logger from ctrl.LoggerFrom(ctx) and uses deleteResource()).

ResourceDeleteFunc contract documented. Added a doc comment explaining the error-handling contract: returning an error blocks deletion and retries; returning nil allows deletion to proceed.
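To illustrate the contract, here is a small standard-library sketch in which a callback fails once and then succeeds. The zero-argument `ResourceDeleteFunc`, `finalize`, and `flakyDelete` are hypothetical simplifications; only the rule itself (an error keeps the finalizer, nil releases it) comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// ResourceDeleteFunc is a simplified, assumed callback signature.
type ResourceDeleteFunc func() error

// finalize applies the contract: the finalizer is removed only when
// deleteFn returns nil; on error the list is returned unchanged.
func finalize(finalizers []string, name string, deleteFn ResourceDeleteFunc) ([]string, error) {
	if err := deleteFn(); err != nil {
		return finalizers, fmt.Errorf("cleanup failed, keeping finalizer %q: %w", name, err)
	}
	kept := make([]string, 0, len(finalizers))
	for _, f := range finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	return kept, nil
}

// flakyDelete returns a callback that fails its first `failures` calls.
func flakyDelete(failures int) ResourceDeleteFunc {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return errors.New("transient API error")
		}
		return nil
	}
}

func main() {
	const name = "finalizer.example.com"
	fins := []string{name}
	del := flakyDelete(1)

	// Each loop iteration plays the role of one reconcile attempt.
	for attempt := 1; len(fins) > 0; attempt++ {
		var err error
		fins, err = finalize(fins, name, del)
		fmt.Printf("attempt %d: finalizers=%v err=%v\n", attempt, fins, err)
	}
}
```

While the finalizer remains on the object, Kubernetes keeps it in the Terminating state, which is why a callback that returns an error effectively blocks deletion until a retry succeeds.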

No extra requeue after adding finalizer. Previously, controllers returned Requeue: true after adding a finalizer to a new object, which triggered an extra reconcile cycle. The shared package returns ctrl.Result{} and lets the current reconcile continue. This is safe because client.Update writes the server's response back into the object it was given, so the in-memory object already carries the new ResourceVersion.
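A toy model of why continuing in the same reconcile is safe. `fakeClient` is a hypothetical in-memory stand-in that mimics optimistic concurrency with an integer version (real ResourceVersions are opaque strings); like controller-runtime's `client.Update`, it updates the object passed to it in place.

```go
package main

import (
	"errors"
	"fmt"
)

// Object is a hypothetical resource with an integer version for simplicity.
type Object struct {
	ResourceVersion int
	Finalizers      []string
	Status          string
}

// fakeClient stores one object and rejects writes based on a stale version.
type fakeClient struct {
	stored Object
}

// Update mimics an optimistic-concurrency write: it fails on a stale
// version and otherwise bumps the version on the caller's object in place.
func (c *fakeClient) Update(obj *Object) error {
	if obj.ResourceVersion != c.stored.ResourceVersion {
		return errors.New("conflict: stale resourceVersion")
	}
	obj.ResourceVersion++
	c.stored = *obj
	c.stored.Finalizers = append([]string(nil), obj.Finalizers...)
	return nil
}

// reconcileOnce adds a finalizer and keeps going in the same pass.
func reconcileOnce(c *fakeClient, obj *Object) error {
	obj.Finalizers = append(obj.Finalizers, "finalizer.example.com")
	if err := c.Update(obj); err != nil {
		return err
	}
	// No requeue here: obj already carries the version the server assigned,
	// so a second write in the same pass does not conflict.
	obj.Status = "reconciled"
	return c.Update(obj)
}

func main() {
	c := &fakeClient{}
	obj := &Object{}
	fmt.Println("reconcile error:", reconcileOnce(c, obj))
	fmt.Println("version after one pass:", obj.ResourceVersion)

	// A copy that never saw the updates is rejected.
	stale := &Object{}
	fmt.Println("stale write:", c.Update(stale))
}
```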

Error handling preserved per controller

  • Dashboard, GenericResource: deletion errors are logged and swallowed (return nil) — finalizer is removed even if the DD API delete fails
  • Monitor, SLO: deletion errors propagate — finalizer is NOT removed, reconcile requeues for retry (SLO also handles 404 gracefully)
  • Agent, AgentInternal: no external API calls during finalization, only k8s resource cleanup
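The three error policies can be pictured as thin wrappers around an API delete call. This is an illustrative sketch only: `DeleteFunc`, the wrapper names, and the sentinel errors are hypothetical, and the real controllers implement these policies inside their own deleteResource callbacks.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors standing in for Datadog API responses.
var (
	ErrNotFound = errors.New("404 not found")
	ErrServer   = errors.New("500 server error")
)

// DeleteFunc is a simplified, assumed callback signature.
type DeleteFunc func() error

// failWith builds a fake API delete that always returns err.
func failWith(err error) DeleteFunc {
	return func() error { return err }
}

// swallowErrors models the Dashboard/GenericResource policy:
// log the failure and return nil so the finalizer is still removed.
func swallowErrors(apiDelete DeleteFunc) DeleteFunc {
	return func() error {
		if err := apiDelete(); err != nil {
			fmt.Println("delete failed, removing finalizer anyway:", err)
		}
		return nil
	}
}

// propagateErrors models the Monitor policy:
// any failure is returned, so the finalizer stays and the reconcile retries.
func propagateErrors(apiDelete DeleteFunc) DeleteFunc {
	return func() error { return apiDelete() }
}

// ignoreNotFound models the SLO policy:
// a 404 counts as already deleted, other failures propagate.
func ignoreNotFound(apiDelete DeleteFunc) DeleteFunc {
	return func() error {
		if err := apiDelete(); err != nil && !errors.Is(err, ErrNotFound) {
			return err
		}
		return nil
	}
}

func main() {
	fmt.Println("swallow, server error:", swallowErrors(failWith(ErrServer))())
	fmt.Println("propagate, server error:", propagateErrors(failWith(ErrServer))())
	fmt.Println("ignoreNotFound, 404:", ignoreNotFound(failWith(ErrNotFound))())
	fmt.Println("ignoreNotFound, server error:", ignoreNotFound(failWith(ErrServer))())
}
```

The trade-off is between never leaving an object stuck in Terminating (swallow) and never orphaning a remote resource (propagate).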

Minimum Agent Versions

N/A — operator-side refactor only.

Describe your test plan

  • Shared HandleFinalizer unit tests cover all reachable paths (add finalizer, no-op, delete success, delete failure)
  • All existing per-controller finalizer tests updated and passing
  • Full test suite passes for all 7 affected packages
  • Deployed custom operator image to minikube: created a DatadogAgent CR (reconciled successfully), then deleted it — both DDA and DDAI finalizers executed cleanly, all cluster-level and namespaced resources were cleaned up, CR did not get stuck in Terminating state

Checklist

  • PR has at least one valid label: bug, enhancement, refactoring, documentation, tooling, and/or dependencies
  • PR has a milestone or the qa/skip-qa label
  • All commits are signed (see: signing commits)

nlchung requested a review from a team April 8, 2026 13:03
nlchung marked this pull request as draft April 8, 2026 13:04
nlchung self-assigned this Apr 8, 2026
nlchung added this to the v1.26.0 milestone Apr 8, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e2f8940593

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/controller/finalizer/finalizer.go Outdated
nlchung added the enhancement label Apr 8, 2026
nlchung added the refactoring and qa/skip-qa labels and removed the enhancement label Apr 8, 2026

codecov-commenter commented Apr 8, 2026

Codecov Report

❌ Patch coverage is 61.01695% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.02%. Comparing base (d09fcb3) to head (a93c8bc).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/controller/datadogmonitor/finalizer.go | 56.25% | 4 Missing and 3 partials ⚠️ |
| internal/controller/datadogdashboard/finalizer.go | 50.00% | 4 Missing and 1 partial ⚠️ |
| ...ernal/controller/datadogagentinternal/finalizer.go | 0.00% | 3 Missing ⚠️ |
| internal/controller/datadogagent/finalizer.go | 33.33% | 1 Missing and 1 partial ⚠️ |
| ...er/datadogagentinternal/controller_reconcile_v2.go | 0.00% | 2 Missing ⚠️ |
| ...nal/controller/datadoggenericresource/finalizer.go | 84.61% | 1 Missing and 1 partial ⚠️ |
| ...controller/datadogagent/controller_reconcile_v2.go | 50.00% | 0 Missing and 1 partial ⚠️ |
| internal/controller/datadogslo/controller.go | 66.66% | 1 Missing ⚠️ |
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2879      +/-   ##
==========================================
- Coverage   40.05%   40.02%   -0.03%     
==========================================
  Files         319      319              
  Lines       28052    27960      -92     
==========================================
- Hits        11237    11192      -45     
+ Misses      15993    15959      -34     
+ Partials      822      809      -13     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 40.02% <61.01%> (-0.03%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
| --- | --- |
| internal/controller/datadogdashboard/controller.go | 50.37% <100.00%> (+0.37%) ⬆️ |
| ...al/controller/datadoggenericresource/controller.go | 53.54% <100.00%> (+0.36%) ⬆️ |
| internal/controller/datadogmonitor/controller.go | 56.65% <100.00%> (+0.70%) ⬆️ |
| internal/controller/finalizer/finalizer.go | 86.66% <100.00%> (+7.35%) ⬆️ |
| ...controller/datadogagent/controller_reconcile_v2.go | 61.19% <50.00%> (+0.19%) ⬆️ |
| internal/controller/datadogslo/controller.go | 59.88% <66.66%> (ø) |
| internal/controller/datadogagent/finalizer.go | 71.73% <33.33%> (+12.03%) ⬆️ |
| ...er/datadogagentinternal/controller_reconcile_v2.go | 0.00% <0.00%> (ø) |
| ...nal/controller/datadoggenericresource/finalizer.go | 84.61% <84.61%> (+7.94%) ⬆️ |
| ...ernal/controller/datadogagentinternal/finalizer.go | 55.10% <0.00%> (+7.21%) ⬆️ |

... and 2 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d09fcb3...a93c8bc.

@tbavelier tbavelier modified the milestones: v1.26.0, v1.27.0 Apr 8, 2026

datadog-datadog-prod-us1 bot commented Apr 12, 2026

❌ Code Coverage


🛑 Gate Violations

🎯 1 Code Coverage issue detected

A Patch coverage percentage gate may be blocking this PR.

Patch coverage: 58.49% (threshold: 80.00%)

ℹ️ Info

🎯 Code Coverage (details)
Patch Coverage: 58.49%
Overall Coverage: 40.12% (-0.02%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: a93c8bc

@nlchung nlchung marked this pull request as ready for review April 13, 2026 12:45
nlchung and others added 3 commits April 13, 2026 08:45
…guards

HandleFinalizer was missing a return on the deletion path, causing
reconciliation to continue into create/update logic. Remove the
compensating GetDeletionTimestamp checks from all 5 callers, fix SLO
deleteResource closure capture, and improve finalizer test coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

  // 2. Handle finalizer logic.
- if result, err := r.handleFinalizer(reqLogger, instance, r.finalizeDadV2); utils.ShouldReturn(result, err) {
+ final := finalizer.NewFinalizer(reqLogger, r.client, r.deleteResource(reqLogger), 0, 0)
Contributor

Why don't we pass in an error re-queue period like what's done for datadogslo? Same for all the other migrated controllers.
