Unnecessary machine creation and immediate deletion fix #1069
gardener-prow[bot] merged 5 commits into gardener:master
Manual Testing to mock the error scenario

func (c *controller) manageReplicas(ctx context.Context, allMachines []*v1alpha1.Machine, machineSet *v1alpha1.MachineSet) error {
	machineSetKey, err := KeyFunc(machineSet)
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("Couldn't get key for %v %#v: %v", machineSet.Kind, machineSet, err))
		return nil
	}
	// Testing the timing issue: while this MachineSet is being reconciled,
	// scale the owning MachineDeployment down from 5 to 4 replicas and mark
	// allMachines[0] for deletion via the TriggerDeletionByMCM annotation.
	mcd := c.getMachineDeploymentsForMachineSet(machineSet)
	if mcd[0].Spec.Replicas == 5 {
		mcdCopy := mcd[0].DeepCopy()
		mcdCopy.Spec.Replicas = 4
		mcdCopy.Annotations[machineutils.TriggerDeletionByMCM] = allMachines[0].Name
		_, err = c.controlMachineClient.MachineDeployments(c.namespace).Update(ctx, mcdCopy, metav1.UpdateOptions{})
	}
	...
}

Logs
Previous logs
Code-fix logs

takoverflow
left a comment
Thanks for the careful consideration of in-place update scenarios. Have added some comments, PTAL!

@r4mek can you please summarize how the fix solves the issue (for posterity) and also add the release note?

Done.

LGTM label has been added. Git tree hash: d1d1085fdfd175291add6a31a0a581166f45ba28

[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: takoverflow

/cherry-pick rel-v0.61

@takoverflow: new pull request created: #1071

/cherry-pick rel-v0.60

@takoverflow: new pull request created: #1073

What this PR does / why we need it:
This PR fixes issue #1068, where MCM created a machine and then immediately deleted it. This happened because of a timing issue in MCM. Refer to the issue for further details.
Summarizing the approach:
This PR solves the issue by first letting the MachineSet controller take scaling decisions based on the view it currently has of the cluster. Stale machine termination is done at the end. In the next reconciliation, the MachineSet controller will also observe the deletion of the staleMachines done in the previous reconciliation. Hence, the MachineSet controller will at all times lag behind the MachineDeployment controller by at most one reconciliation.
Which issue(s) this PR fixes:
Fixes #1068
Special notes for your reviewer:
Release note: