fix: retry VCS status on finalize for failed state#2716
Conversation
When postFinalStatus fails during reconciliation, the PipelineRun state is set to "failed" but FinalizeKind never retries the VCS update. This leaves commit statuses stuck at "pending" with no recovery path. Add a StateFailed handler in FinalizeKind that retries the status post one last time during PipelineRun cleanup. Signed-off-by: ab-ghosh <abghosh@redhat.com> Assisted-by: Claude Opus 4.6 (via Claude Code)
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2716 +/- ##
==========================================
- Coverage 59.25% 59.23% -0.03%
==========================================
Files 208 208
Lines 20573 20605 +32
==========================================
+ Hits 12191 12205 +14
- Misses 7610 7622 +12
- Partials 772 778 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Code Review
This pull request introduces a retry mechanism for updating the final status on the git provider when a PipelineRun enters a failed state during finalization. It includes new logic in the reconciler to fetch repository details and re-attempt the status post, along with updated unit tests to cover this scenario. A significant thread-safety concern was raised regarding the assignment of the logger to a shared client object, which could lead to data races in concurrent reconciliation loops.
I am having trouble creating individual review comments. Click here to see my feedback.
pkg/reconciler/finalizer.go (103-105)
Assigning to r.run.Clients.Log is not thread-safe because r.run is a shared object across multiple concurrent reconciliation loops. This creates a data race. Since postFinalStatus and other methods should ideally take the logger as an argument or use the one already present in the context, this mutation should be avoided. If the client requires a logger, consider if it can be initialized once or handled in a thread-safe manner.
chmouel
left a comment
There was a problem hiding this comment.
This can race with a newer rerun and put the wrong status back on the PR.
We already send the final status once in the normal flow. With this change, if that failed PipelineRun gets deleted later, the finalizer sends it again. The problem is that it uses the same check/status name as the rerun, so an older failed run can overwrite a newer successful one.
In general please tell your agent to be very careful when touching this part of the code, because race like this can be introduced.
|
If the intent is to solve VCS interminent, i think we may be able to do this, but it would need some quite a bit of thinking to make it right, feel free to join the PaC meeting to talk about it |
|
Thank you @chmouel, will join the PaC call to discuss the approach. Converting to draft for now |
ab-ghosh
left a comment
There was a problem hiding this comment.
📋 Review Summary
The PR successfully addresses the issue where VCS status updates are not retried when a PipelineRun fails and is then deleted (finalized). The implementation follows existing patterns in the reconciler.
🔍 General Feedback
- The logic is sound and correctly identifies the
StateFailedcondition during finalization. - Consider refactoring the repository retrieval logic in
FinalizeKindto reduce duplication. - Be cautious with side effects on the shared
Runobject (setting the logger).
| detectedProvider, event, err := r.initGitProviderClient(ctx, logger, repo, pr) | ||
| if err != nil { | ||
| return err | ||
| } |
There was a problem hiding this comment.
[MEDIUM] Setting r.run.Clients.Log modifies shared state within the Reconciler's Run object. While this ensures that methods like GetStatusFromTaskStatusOrFromAsking have a logger, it introduces a side effect that could affect other concurrent reconciliations. Since postFinalStatus and other methods already take a logger parameter, it would be safer to pass the logger down to where it's needed instead of relying on the shared client state.
| } | ||
| } | ||
|
|
||
| if state == kubeinteraction.StateFailed { |
There was a problem hiding this comment.
[LOW] This block for StateFailed repeats the repository retrieval logic used for StateQueued and StateStarted. Consider refactoring to fetch the repository once for all states that require it.
| } | ||
|
|
||
| if err := r.retryFinalStatusUpdate(ctx, repo, pr); err != nil { | ||
| logger.Errorf("failed to retry final status update on finalize: %v", err) |
There was a problem hiding this comment.
[LOW] Inconsistent error formatting. Line 47 uses %w while here %v is used. While logger.Errorf doesn't wrap, consistency across the file is preferred.
| }, | ||
| }, | ||
| { | ||
| name: "failed state retries final status", |
There was a problem hiding this comment.
[LOW] The test case "failed state retries final status" verifies that the function runs without error but doesn't assert that the status was actually retried or posted. Consider adding assertions for the expected side effects.
📝 Description of the Change
When the VCS update fails, PaC sets
statetofailedand never re-attempts during FinalizeKind when the PipelineRun is being deleted. This leaves the VCS commit status stuck atpending. This PR adds aStateFailedhandler inFinalizeKindthat retries the VCS status update one last time when the PipelineRun is deleted, following the same pattern as the existingreportPipelineRunAsCancelled🔗 Linked GitHub Issue
Fixes #
🧪 Testing Strategy
🤖 AI Assistance
AI assistance can be used for various tasks, such as code generation,
documentation, or testing.
Please indicate whether you have used AI assistance
for this PR and provide details if applicable.
Important
Slop will be simply rejected, if you are using AI assistance you need to make sure you
understand the code generated and that it meets the project's standards. you
need at least know how to run the code and deploy it (if needed). See
startpaac to make it easy
to deploy and test your code changes.
If the majority of the code in this PR was generated by an AI, please add a
Co-authored-bytrailer to your commit message.For example:
Co-authored-by: Claude noreply@anthropic.com
✅ Submitter Checklist
fix:,feat:) matches the "Type of Change" I selected above.make testandmake lintlocally to check for and fix anyissues. For an efficient workflow, I have considered installing
pre-commit and running
pre-commit installtoautomate these checks.