Add flex CIDR allocator controller #535
Conversation
    func initOCIRetry() {
        initRetryOnce.Do(func() {
            p := common.DefaultRetryPolicy()
            common.GlobalRetry = &p
        })
    }

    func init() { initOCIRetry() }
common.GlobalRetry is the OCI SDK's process-wide default. Importing pkg/flexcidr anywhere (tests, tools, CCM main) silently changes retry behavior for CSI, CCM LB, tagging, and every other OCI call in the binary. This feels like unintended behavior.
If per-operation retry is wanted, passing a RequestMetadata{RetryPolicy: &p} explicitly on the specific requests would be an appropriate alternative.
    func init() { initOCIRetry() }

    func runWithRateLimitRetry(ctx context.Context, logger *zap.SugaredLogger, operation string, fn func(context.Context) error) error {
This function is currently only called in tests, not used otherwise. It should probably either be deleted or used in CreatePrivateIpWithRequest/CreateIpv6WithRequest, which is where rate limits actually apply.
    if len(node.Spec.PodCIDRs) > 0 && len(node.Spec.ProviderID) == 0 {
        logger.Debug("node already has podCIDRs but providerID is empty, skipping")
        return nil
    }
I think this skip condition is currently inverted from what it is intended to be. The log message sounds like an OR, but the test condition is an AND, meaning that we'll most likely never enter this block.
Also, there is currently no early exit for the "already has the expected PodCIDRs" case. This means that every node update event will trigger GetInstance, GetPrimaryVNIC and ListPrivateIps API calls. There's already such an early-exit pattern in the node info controller here:
> I think this skip condition is currently inverted from what it is intended to be. The log message sounds like an OR but the test condition is an AND, meaning that we'll most likely never enter this block.

len(node.Spec.PodCIDRs) > 0 => node already has podCIDRs
&& => but
len(node.Spec.ProviderID) == 0 => providerID is empty

The if block looks good to me.
> Also, there is currently no early exit for the "already has the expected PodCIDRs" case.

I agree, an early exit for already-processed nodes is desirable.
Good point, you're right, the if block is good 👍
On the early exit, the new expectedPodCIDRsCache works, but I think it can be improved even further:
- Currently, the 5 minute TTL means we re-do GetInstance + GetPrimaryVNICForInstance + ListPrivateIps for every already-reconciled node on every cache expiry, forever. A plain map[string][]string guarded by a mutex (or sync.Map), populated on successful reconcile and cleared from the existing DeleteFunc, would give zero API traffic in steady state without the TTL churn. The node informer's DeleteFunc is already wired up, so there's no leak risk.
- The if fcc.expectedPodCIDRsCache == nil guards in setExpectedPodCIDRs/deleteExpectedPodCIDRs are unreachable; NewFlexCIDRController always constructs the store.
- The expectedPodCIDRsEntry struct + expectedPodCIDRsCacheKeyFn exist only to satisfy cache.Store's interface; they disappear if you switch to a plain map.
Anyway, none of that is blocking, just minor suggestions ;)
Made the changes now, and also made the controller optional; it can be enabled via env vars.
        return family, nil
    }

    func PatchNodePodCIDRs(ctx context.Context, kubeClient kubernetes.Interface, nodeName string, podCIDRs []string, logger *zap.SugaredLogger) error {
This function does not actually do a "patch"; as currently implemented it does a double update: first a dry-run update, then a retry-on-conflict update, and a final post-update verification get.
For performance and consistency with existing controllers, it would probably be better to use StrategicMergePatch like here:
        "k8s.io/client-go/util/retry"
    )

    var retryRNG = rand.New(rand.NewSource(time.Now().UnixNano()))
This is not safe for concurrent use: with multiple workers calling runWithRateLimitRetry in parallel, this races.
The race can be avoided by using rand.Float64() (global, locked) or a sync.Mutex.
Doesn't seem like we use runWithRateLimitRetry().
- We could remove it.
- Or remove retryRNG by replacing retryRNG.Float64() with rand.Float64(), i.e. jitter := 1 + (rand.Float64()-0.5)*0.2
    if f.ClusterIpFamily.IPv4 != "" {
        ipv4FlexCIDR, err := f.CreateFlexCidr(primaryVnicID, true, false)
        if err != nil {
            return nil, err
        }
        flexCidrs = append(flexCidrs, ipv4FlexCIDR)
    }

    if f.ClusterIpFamily.IPv6 != "" {
        ipv6FlexCIDR, err := f.CreateFlexCidr(primaryVnicID, false, true)
        if err != nil {
            return nil, err
        }
        flexCidrs = append(flexCidrs, ipv6FlexCIDR)
    }
These per-family logic blocks currently run sequentially, but they are independent and could run in parallel.
    }

    func formatIpCidr(ip string, mask *int) string {
        return fmt.Sprintf("%s/%d", ip, *mask)
There's a risk of a nil pointer dereference here if mask is nil.
    type requireAssertions struct{}

    var require requireAssertions
This repository already imports testify, which offers all these testing utilities:
oci-cloud-controller-manager/go.mod, line 53 in 3d019ab
You can directly use functions like assert.EqualValues.
    @@ -0,0 +1,497 @@
    package flexcidr
I think you'll want to add the standard copyright header to all new files:
    for k, v := range instance.Metadata {
        if k == primaryVnicMetadataKey {
You don't need to loop over every metadata entry until you find the right key; you can directly look it up on the map with something like v, ok := instance.Metadata[primaryVnicMetadataKey]
        return nil
    }

    if err := fcc.instanceCache.Add(instance); err != nil {
Let's remove the instance cache since we do not use it. We would in any case need the latest LifecycleState of the instance for the immediately following condition.
@HadrienPatte & @l-technicore I have addressed your comments. Can you please take a look again while I do the manual testing?
Force-pushed b64bc46 to b042c50, then b042c50 to b3e5f01.
@HadrienPatte @l-technicore I have updated the PR to address all your comments, and also made the controller optional; it can be enabled via env vars. Manual testing also looked good. Can you folks give a final review so that we can merge it?
@AkarshES LGTM. I am fine with the latest changes. Any additional optimisations can be accepted via new PRs. |
HadrienPatte
left a comment
Looks great, thanks!
For using external CNIs in direct routing mode, we need to provide a way for IPs to be attached to the worker node. This change introduces a new controller that looks at the instance metadata for a well-known field and processes the allocation of IPs on the worker node.