feat(plugindefinitions): implement image replication for registry mirrors#1826
feat(plugindefinitions): implement image replication for registry mirrors#1826mikolajkucinski wants to merge 13 commits intomainfrom
Conversation
a4ed344 to
e151e07
Compare
e151e07 to
67e79a2
Compare
0c5be89 to
ab7a402
Compare
This comment was marked as spam.
This comment was marked as spam.
17895e0 to
458dcaf
Compare
…pport Extract image parsing and registry mirror config from the plugin controller into shared packages. Add on-demand image replication that pulls manifests to trigger mirror sync. Bump Go to 1.25.6 required by go-containerregistry. On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
Use go-containerregistry/pkg/name.ParseReference for image ref parsing. On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
Use DescribeTable for table-driven buildMirroredImageRef tests. On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
Move image parsing, registry mirror config, and replication code from internal/common and internal/replication into a single internal/imagemirror package. On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
PluginDefinition controller now replicates the Helm chart OCI artifact directly instead of templating the chart and replicating container images. Container image replication belongs in the Plugin controller which has the final option values. On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
458dcaf to
93cbfcf
Compare
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
93cbfcf to
9d5c163
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (7)
internal/ocimirror/ociref.go (2)
13-13: Regex may miss some valid YAML image field patterns.The regex pattern captures common cases but may not handle:
- Images specified with YAML block scalars
- Images in nested structures like
spec.containers[].imagewithout the literal- image:prefixFor the current use case of extracting images from Helm-rendered manifests, this is likely acceptable. If broader coverage is needed later, consider using a proper YAML parser.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/ocimirror/ociref.go` at line 13, The current regex variable imageFieldPattern may miss valid YAML image declarations (e.g., block scalars or nested keys like spec.containers[].image); update the extraction to use a YAML parser instead of regex for robustness: replace usages of imageFieldPattern with parsing the manifest via a YAML library (unmarshal each document and walk maps/arrays to collect values for the "image" key, including nested paths like spec.containers[].image), or if you must keep regex, add a clear comment on imageFieldPattern documenting its limitations and intended scope for Helm-rendered manifests only.
34-38: Silent failure on parse error may lead to incorrect behavior.When
name.ParseReferencefails, the function returns the originalimageRefas the repository with an emptytagOrDigest. This silent fallback could cause downstream issues where callers process invalid references without knowing the parse failed. Consider either returning an error or logging a warning.♻️ Alternative: return error to signal parse failure
-func SplitOCIRef(imageRef string) (registry, repository, tagOrDigest string) { +func SplitOCIRef(imageRef string) (registry, repository, tagOrDigest string, err error) { ref, err := name.ParseReference(imageRef) if err != nil { - return "docker.io", imageRef, "" + return "", "", "", fmt.Errorf("failed to parse OCI reference %q: %w", imageRef, err) }If the current fallback behavior is intentional for graceful degradation, consider adding a comment explaining why and what the expected caller behavior is.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/ocimirror/ociref.go` around lines 34 - 38, SplitOCIRef currently swallows name.ParseReference errors and returns a fallback ("docker.io", imageRef, "") which can hide invalid refs; change SplitOCIRef to surface parse failures by returning an error instead of silently falling back: update the SplitOCIRef signature to include an error return, propagate the parse error from name.ParseReference, and update all callers to handle the error (or, if you prefer graceful degradation, add a clear comment explaining the fallback and emit a warning log via the package logger when ParseReference fails). Locate the function SplitOCIRef and the name.ParseReference call to implement this change and update call sites accordingly.internal/ocimirror/ociref_test.go (1)
62-104: Consider adding test cases for error/edge scenarios.The tests cover common valid inputs well. Consider adding tests for edge cases:
- Empty string input
- Completely malformed references (e.g.,
":::invalid")- References with ports (e.g.,
"localhost:5000/myimage:tag")This would help document the expected behavior of
SplitOCIRefwhen parsing fails.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/ocimirror/ociref_test.go` around lines 62 - 104, Add unit tests for SplitOCIRef to cover edge cases: call ocimirror.SplitOCIRef with an empty string and assert it does not panic and returns empty reg/repo and empty tagOrDigest; call it with a malformed input like ":::invalid" and assert it returns a non-panicking result (e.g., reg == "" and repo == ":::invalid" or whatever the current implementation returns) to lock in current behavior; and call it with a registry containing a port "localhost:5000/myimage:tag" and assert reg == "localhost:5000", repo == "myimage", tagOrDigest == ":tag". Name the tests clearly and use the same Expect style as existing It blocks to document expected behavior of SplitOCIRef.internal/ocimirror/replicator.go (1)
105-115: Consider documenting theoptsslice mutation behavior.The
appendon line 108 mutates the originalextraOptsslice's backing array if it has spare capacity. This is typically safe here since callers pass variadic arguments, but documenting this or usingslices.Clonewould make the intent clearer.♻️ Optional: defensive copy of options
func (r *OCIReplicator) TriggerReplication(ctx context.Context, mirroredRef string, extraOpts ...crane.Option) error { log.FromContext(ctx).V(1).Info("triggering replication", "ref", mirroredRef) - opts := append([]crane.Option{crane.WithAuth(r.auth)}, extraOpts...) + opts := make([]crane.Option, 0, 1+len(extraOpts)) + opts = append(opts, crane.WithAuth(r.auth)) + opts = append(opts, extraOpts...) _, err := r.manifestFetcher(mirroredRef, opts...)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/ocimirror/replicator.go` around lines 105 - 115, TriggerReplication currently builds opts via opts := append([]crane.Option{crane.WithAuth(r.auth)}, extraOpts...) which can mutate the caller's extraOpts backing array; change this to make a defensive copy instead (e.g., allocate a new slice and copy extraOpts or use slices.Clone) so appending the auth option cannot alter the caller's slice, and update the comment to note that TriggerReplication creates an internal copy of options before appending; reference: TriggerReplication, extraOpts, opts, manifestFetcher.internal/controller/plugindefinition/utils.go (2)
235-237: Consider adding platform option for consistency.
TriggerReplicationis called here without the platform option, but inReplicateOCIArtifacts(replicator.go:75), the linux/amd64 platform is explicitly specified viacrane.WithPlatform.Per the PR objectives, the replication should use "linux/amd64 platform". Consider adding the platform option here for consistency:
♻️ Add platform option for consistency
+import ( + "github.com/google/go-containerregistry/pkg/crane" + v1 "github.com/google/go-containerregistry/pkg/v1" +) - if err := replicator.TriggerReplication(ctx, mirroredRef); err != nil { + if err := replicator.TriggerReplication(ctx, mirroredRef, crane.WithPlatform(&v1.Platform{OS: "linux", Architecture: "amd64"})); err != nil { return failReplication(err, "chart replication failed") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/plugindefinition/utils.go` around lines 235 - 237, The call to replicator.TriggerReplication(ctx, mirroredRef) omits the platform option and should replicate using the linux/amd64 platform like ReplicateOCIArtifacts does; update the TriggerReplication invocation (in utils.go) to pass the platform option (e.g., the same crane.WithPlatform("linux/amd64") or the replicator's platform option parameter) so the replicator uses the linux/amd64 platform consistently with ReplicateOCIArtifacts.
210-218: Consider a more robust skip-check mechanism.The current logic uses
strings.Contains(replicationCond.Message, chartRef)to determine if replication can be skipped. This is brittle because:
- Message format changes could break the check
- If a chart name is a substring of another (e.g.,
app:v1vsapp:v1.0), false matches could occurConsider storing the replicated chart reference in a dedicated status field or using exact message matching.
♻️ Safer alternative using exact suffix match
if replicationCond != nil && replicationCond.IsTrue() && replicationCond.Reason == greenhousev1alpha1.OCIReplicationSucceededReason && - strings.Contains(replicationCond.Message, chartRef) { + strings.HasSuffix(replicationCond.Message, chartRef) { logger.V(1).Info("chart already replicated, skipping", "chart", chartRef) return nil }Alternatively, consider adding a
ReplicatedChartReffield to the status for explicit tracking.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/plugindefinition/utils.go` around lines 210 - 218, Replace the brittle substring check using replicationCond.Message with an explicit, robust check: add or use a dedicated status field (e.g., Status.ReplicatedChartRef) on the plugin definition and change the skip logic in the function that calls pluginDef.GetConditions()/GetConditionByType(OCIReplicationReadyCondition) to first compare pluginDef.Status.ReplicatedChartRef == chartRef (exact equality), and only fall back to a strict exact message match (replicationCond.Message == expectedMessage) if the status field is not present; ensure updates to set Status.ReplicatedChartRef when replication succeeds so future checks use the exact equality.internal/controller/plugindefinition/plugindefinition_controller.go (1)
100-106: Add a comment explaining the intentional error handling pattern for chart replication failures.The replication error is intentionally swallowed here (
//nolint:nilerr), allowing the reconciliation to returnlifecycle.SuccesswhileensureChartReplicationsets theOCIReplicationReadyConditiontoFalseand logs the error internally. This enables "eventually consistent" behavior: theReadyConditionpropagates the replication failure (viasetReadyCondition), the resource requeues after 1 minute, and retry attempts continue without blocking the reconciliation.While this pattern is intentional and correct, consider adding an inline comment explaining why replication failures are non-blocking and how the condition propagation works, so future maintainers understand this is deliberate design rather than oversight.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/plugindefinition/plugindefinition_controller.go` around lines 100 - 106, Add an inline comment above the ensureChartReplication call explaining that replication errors are intentionally not returned to the controller: note that ensureChartReplication sets the OCIReplicationReadyCondition (via setReadyCondition) to False and logs errors internally, the reconciliation returns lifecycle.Success with nolint:nilerr and requeues after 1 minute to allow eventual consistency and retries, and therefore swallowing the error here is deliberate to avoid blocking other reconciliation logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/ocimirror/registry_mirror.go`:
- Around line 84-91: ResolveOCIRef lacks a nil-receiver guard and will panic if
called on a nil *RegistryMirrorConfig (e.g., when GetRegistryMirrorConfig
returns (nil, nil)); add an early check at the start of
RegistryMirrorConfig.ResolveOCIRef (if c == nil) to return nil, and optionally
also handle a nil c.RegistryMirrors map safely before lookup, so the method
returns nil instead of panicking.
---
Nitpick comments:
In `@internal/controller/plugindefinition/plugindefinition_controller.go`:
- Around line 100-106: Add an inline comment above the ensureChartReplication
call explaining that replication errors are intentionally not returned to the
controller: note that ensureChartReplication sets the
OCIReplicationReadyCondition (via setReadyCondition) to False and logs errors
internally, the reconciliation returns lifecycle.Success with nolint:nilerr and
requeues after 1 minute to allow eventual consistency and retries, and therefore
swallowing the error here is deliberate to avoid blocking other reconciliation
logic.
In `@internal/controller/plugindefinition/utils.go`:
- Around line 235-237: The call to replicator.TriggerReplication(ctx,
mirroredRef) omits the platform option and should replicate using the
linux/amd64 platform like ReplicateOCIArtifacts does; update the
TriggerReplication invocation (in utils.go) to pass the platform option (e.g.,
the same crane.WithPlatform("linux/amd64") or the replicator's platform option
parameter) so the replicator uses the linux/amd64 platform consistently with
ReplicateOCIArtifacts.
- Around line 210-218: Replace the brittle substring check using
replicationCond.Message with an explicit, robust check: add or use a dedicated
status field (e.g., Status.ReplicatedChartRef) on the plugin definition and
change the skip logic in the function that calls
pluginDef.GetConditions()/GetConditionByType(OCIReplicationReadyCondition) to
first compare pluginDef.Status.ReplicatedChartRef == chartRef (exact equality),
and only fall back to a strict exact message match (replicationCond.Message ==
expectedMessage) if the status field is not present; ensure updates to set
Status.ReplicatedChartRef when replication succeeds so future checks use the
exact equality.
In `@internal/ocimirror/ociref_test.go`:
- Around line 62-104: Add unit tests for SplitOCIRef to cover edge cases: call
ocimirror.SplitOCIRef with an empty string and assert it does not panic and
returns empty reg/repo and empty tagOrDigest; call it with a malformed input
like ":::invalid" and assert it returns a non-panicking result (e.g., reg == ""
and repo == ":::invalid" or whatever the current implementation returns) to lock
in current behavior; and call it with a registry containing a port
"localhost:5000/myimage:tag" and assert reg == "localhost:5000", repo ==
"myimage", tagOrDigest == ":tag". Name the tests clearly and use the same Expect
style as existing It blocks to document expected behavior of SplitOCIRef.
In `@internal/ocimirror/ociref.go`:
- Line 13: The current regex variable imageFieldPattern may miss valid YAML
image declarations (e.g., block scalars or nested keys like
spec.containers[].image); update the extraction to use a YAML parser instead of
regex for robustness: replace usages of imageFieldPattern with parsing the
manifest via a YAML library (unmarshal each document and walk maps/arrays to
collect values for the "image" key, including nested paths like
spec.containers[].image), or if you must keep regex, add a clear comment on
imageFieldPattern documenting its limitations and intended scope for
Helm-rendered manifests only.
- Around line 34-38: SplitOCIRef currently swallows name.ParseReference errors
and returns a fallback ("docker.io", imageRef, "") which can hide invalid refs;
change SplitOCIRef to surface parse failures by returning an error instead of
silently falling back: update the SplitOCIRef signature to include an error
return, propagate the parse error from name.ParseReference, and update all
callers to handle the error (or, if you prefer graceful degradation, add a clear
comment explaining the fallback and emit a warning log via the package logger
when ParseReference fails). Locate the function SplitOCIRef and the
name.ParseReference call to implement this change and update call sites
accordingly.
In `@internal/ocimirror/replicator.go`:
- Around line 105-115: TriggerReplication currently builds opts via opts :=
append([]crane.Option{crane.WithAuth(r.auth)}, extraOpts...) which can mutate
the caller's extraOpts backing array; change this to make a defensive copy
instead (e.g., allocate a new slice and copy extraOpts or use slices.Clone) so
appending the auth option cannot alter the caller's slice, and update the
comment to note that TriggerReplication creates an internal copy of options
before appending; reference: TriggerReplication, extraOpts, opts,
manifestFetcher.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2f793cfd-cb80-46df-98c5-3e21681f318c
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (18)
api/v1alpha1/plugindefinition_types.gocharts/manager/crds/greenhouse.sap_catalogs.yamlcharts/manager/crds/greenhouse.sap_clusterplugindefinitions.yamlgo.modinternal/controller/plugin/image_mirror.gointernal/controller/plugin/oci_mirror.gointernal/controller/plugin/oci_mirror_test.gointernal/controller/plugin/plugin_controller_flux.gointernal/controller/plugindefinition/cluster_plugindefinition_controller.gointernal/controller/plugindefinition/plugindefinition_controller.gointernal/controller/plugindefinition/utils.gointernal/ocimirror/ociref.gointernal/ocimirror/ociref_test.gointernal/ocimirror/registry_mirror.gointernal/ocimirror/registry_mirror_test.gointernal/ocimirror/replicator.gointernal/ocimirror/replicator_test.gointernal/ocimirror/suite_test.go
💤 Files with no reviewable changes (1)
- internal/controller/plugin/image_mirror.go
| func (c *RegistryMirrorConfig) ResolveOCIRef(imageRef string) *ResolvedOCIRef { | ||
| registry, repo, tagOrDigest := SplitOCIRef(imageRef) | ||
| mirror, ok := c.RegistryMirrors[registry] | ||
| if !ok { | ||
| return nil | ||
| } | ||
| return &ResolvedOCIRef{Registry: registry, Mirror: mirror, Repository: repo, TagOrDigest: tagOrDigest} | ||
| } |
There was a problem hiding this comment.
Add nil receiver check to prevent potential panic.
ResolveOCIRef is a method on *RegistryMirrorConfig but doesn't check for a nil receiver. If GetRegistryMirrorConfig returns (nil, nil) (which it does when no config exists), calling ResolveOCIRef on that nil pointer would panic.
🛡️ Proposed fix to add nil check
// ResolveOCIRef looks up the mirror configuration for an OCI reference.
func (c *RegistryMirrorConfig) ResolveOCIRef(imageRef string) *ResolvedOCIRef {
+ if c == nil {
+ return nil
+ }
registry, repo, tagOrDigest := SplitOCIRef(imageRef)
mirror, ok := c.RegistryMirrors[registry]
if !ok {
return nil
}
return &ResolvedOCIRef{Registry: registry, Mirror: mirror, Repository: repo, TagOrDigest: tagOrDigest}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func (c *RegistryMirrorConfig) ResolveOCIRef(imageRef string) *ResolvedOCIRef { | |
| registry, repo, tagOrDigest := SplitOCIRef(imageRef) | |
| mirror, ok := c.RegistryMirrors[registry] | |
| if !ok { | |
| return nil | |
| } | |
| return &ResolvedOCIRef{Registry: registry, Mirror: mirror, Repository: repo, TagOrDigest: tagOrDigest} | |
| } | |
| func (c *RegistryMirrorConfig) ResolveOCIRef(imageRef string) *ResolvedOCIRef { | |
| if c == nil { | |
| return nil | |
| } | |
| registry, repo, tagOrDigest := SplitOCIRef(imageRef) | |
| mirror, ok := c.RegistryMirrors[registry] | |
| if !ok { | |
| return nil | |
| } | |
| return &ResolvedOCIRef{Registry: registry, Mirror: mirror, Repository: repo, TagOrDigest: tagOrDigest} | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/ocimirror/registry_mirror.go` around lines 84 - 91, ResolveOCIRef
lacks a nil-receiver guard and will panic if called on a nil
*RegistryMirrorConfig (e.g., when GetRegistryMirrorConfig returns (nil, nil));
add an early check at the start of RegistryMirrorConfig.ResolveOCIRef (if c ==
nil) to return nil, and optionally also handle a nil c.RegistryMirrors map
safely before lookup, so the method returns nil instead of panicking.
Description
This PR adds chart replication support for registry mirrors. When a PluginDefinition's HelmChart becomes ready, the controller replicates the chart OCI artifact to the configured mirror registry via
crane.Manifest. Image parsing usesgo-containerregistry/pkg/namefor correct OCI reference handling. All mirror logic is consolidated ininternal/imagemirror.What type of PR is this? (check all applicable)
Related Tickets & Documents
Added tests?
Added to documentation?
Checklist
Summary by CodeRabbit
New Features
Chores