@@ -0,0 +1,16 @@
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: kube-rbac-proxy-crio-deny-all
  namespace: openshift-machine-config-operator
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
spec:
  podSelector:
    matchLabels:
      k8s-app: kube-rbac-proxy-crio
  policyTypes:
  - Ingress
  - Egress
Comment on lines +1 to +16

⚠️ Potential issue | 🟠 Major

Deny-all policy will block metrics scraping and API server communication.

This policy blocks all traffic for kube-rbac-proxy-crio pods. The kube-rbac-proxy needs:

  • Ingress: To receive metrics scrape requests from Prometheus/monitoring stack
  • Egress: To communicate with the API server for authorization (SubjectAccessReview)

Without allow rules, metrics collection will fail and the proxy cannot authorize requests.

Additionally, this file hardcodes the namespace openshift-machine-config-operator, while the other NetworkPolicy files in this PR use the {{.TargetNamespace}} template. Confirm whether this inconsistency is intentional (e.g., this file is applied directly while the others go through templating).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@install/0000_80_machine-config_00_networkpolicy-kube-rbac-proxy-crio.yaml`
around lines 1 - 16, The NetworkPolicy resource named
kube-rbac-proxy-crio-deny-all currently denies all traffic for pods with label
k8s-app: kube-rbac-proxy-crio; update it to allow required traffic instead of a
blanket deny by adding explicit allow rules: permit Ingress from the
Prometheus/monitoring service(s) or the monitoring namespace (e.g.,
label/selectors used by your Prometheus stack) so metrics scraping can reach the
kube-rbac-proxy-crio pods, and permit Egress to the API server (or to the
namespace/service account performing SubjectAccessReview calls) so authorization
requests succeed. Also reconcile the namespace field: replace the hardcoded
namespace openshift-machine-config-operator with the templated value
{{.TargetNamespace}} if other NetworkPolicy files use that template and this
file should be templated as well.
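
A minimal sketch of the allow rules described above, assuming the standard OpenShift namespace labels; the policy name and the metrics port are placeholders to verify against the actual kube-rbac-proxy-crio deployment:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: kube-rbac-proxy-crio-allow   # hypothetical name
  namespace: openshift-machine-config-operator
spec:
  podSelector:
    matchLabels:
      k8s-app: kube-rbac-proxy-crio
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from: # Prometheus metrics scraping
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: monitoring
    ports:
    - protocol: TCP
      port: 9001 # placeholder: use the port the proxy actually listens on
  egress:
  - to: # API server access for SubjectAccessReview
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: openshift-kube-apiserver
    ports:
    - protocol: TCP
      port: 6443
```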

12 changes: 12 additions & 0 deletions manifests/machineconfigcontroller/networkpolicy.yaml
@@ -0,0 +1,12 @@
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: machine-config-controller-deny-all
  namespace: {{.TargetNamespace}}
spec:
  podSelector:
    matchLabels:
      k8s-app: machine-config-controller
  policyTypes:
  - Ingress
  - Egress
12 changes: 12 additions & 0 deletions manifests/machineconfigdaemon/networkpolicy.yaml
@@ -0,0 +1,12 @@
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: machine-config-daemon-deny-all
  namespace: {{.TargetNamespace}}
spec:
  podSelector:
    matchLabels:
      k8s-app: machine-config-daemon
  policyTypes:
  - Ingress
  - Egress
Comment on lines +1 to +12

⚠️ Potential issue | 🟠 Major

Deny-all policy will break machine-config-daemon functionality.

Similar to the machine-config-server policy, this defines Ingress and Egress policyTypes without allow rules. The machine-config-daemon requires egress connectivity to:

  • Communicate with the Kubernetes API server
  • Pull container images
  • Report node status and configuration state

Blocking all egress will prevent MCD from functioning. You'll need to add appropriate egress rules for API server access and any other required destinations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@manifests/machineconfigdaemon/networkpolicy.yaml` around lines 1 - 12, The
NetworkPolicy resource machine-config-daemon-deny-all currently sets
spec.policyTypes to [Ingress, Egress] with no allow rules and will block
required outbound traffic; update the NetworkPolicy (metadata.name
machine-config-daemon-deny-all, podSelector matchLabels k8s-app:
machine-config-daemon) to include explicit spec.egress rules that permit egress
to the Kubernetes API server (control-plane IPs/FQDN and port 6443) and to image
registries/container runtime endpoints (HTTP(S) ports and registry CIDRs) and
any node reporting endpoints, or remove Egress from spec.policyTypes if you
intend to only restrict Ingress; ensure the rules are narrowly scoped (by CIDR,
namespaceSelector, or ipBlock) rather than wide-open.
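
A sketch of egress rules along these lines, assuming the standard openshift-kube-apiserver and openshift-dns namespace labels; registry egress is omitted because registry endpoints vary per cluster:

```yaml
  egress:
  - to: # Kubernetes API server
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: openshift-kube-apiserver
    ports:
    - protocol: TCP
      port: 6443
  - to: # DNS, so registry hostnames resolve
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: openshift-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```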

12 changes: 12 additions & 0 deletions manifests/machineconfigserver/networkpolicy.yaml
@@ -0,0 +1,12 @@
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: machine-config-server-deny-all
  namespace: {{.TargetNamespace}}
spec:
  podSelector:
    matchLabels:
      k8s-app: machine-config-server
  policyTypes:
  - Ingress
  - Egress
Comment on lines +1 to +12

⚠️ Potential issue | 🟠 Major

Deny-all policy will block machine-config-server traffic.

This NetworkPolicy defines Ingress and Egress policyTypes without any allow rules, which effectively blocks all inbound and outbound traffic for machine-config-server pods. Since machine-config-server serves Ignition configs to nodes during cluster bootstrap and provisioning, this policy will prevent nodes from fetching their configurations.

For a functional policy, you'll need to add ingress rules allowing traffic on the MCS port (typically 22623) from the appropriate sources (e.g., node CIDRs or the load balancer).

Example structure with ingress rules
 spec:
   podSelector:
     matchLabels:
       k8s-app: machine-config-server
   policyTypes:
   - Ingress
   - Egress
+  ingress:
+  - from:
+    - ipBlock:
+        cidr: <node-cidr>  # Or appropriate source selector
+    ports:
+    - protocol: TCP
+      port: 22623
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@manifests/machineconfigserver/networkpolicy.yaml` around lines 1 - 12, The
current NetworkPolicy (kind: NetworkPolicy, metadata.name:
machine-config-server-deny-all) defines only policyTypes: [Ingress, Egress] and
will block all traffic to pods selected by podSelector.matchLabels.k8s-app:
machine-config-server; update this resource to add explicit ingress rules that
allow TCP traffic on the MCS port (22623) from the appropriate sources (e.g.,
node CIDRs, kubelet IP ranges, and any load balancer CIDRs) and, if needed, add
egress rules to permit responses; specifically modify the spec to include an
ingress section with a port: 22623/protocol: TCP and from: entries for the node
networks or service/loadbalancer sources so nodes can fetch Ignition configs
while preserving other restrictions.

2 changes: 1 addition & 1 deletion pkg/operator/bootstrap.go
@@ -148,7 +148,7 @@ func buildSpec(dependencies *BootstrapDependencies, imgs *ctrlcommon.Images, rel
}

config := getRenderConfig("", dependencies.KubeAPIServerServingCA, spec,
&imgs.RenderConfigImages, dependencies.Infrastructure, nil, nil, "2")
&imgs.RenderConfigImages, dependencies.Infrastructure, nil, nil, "2", "")
return config, nil
}

1 change: 1 addition & 0 deletions pkg/operator/operator.go
@@ -570,6 +570,7 @@ func (optr *Operator) sync(key string) error {
// "RenderConfig" should be the first one to run (except OSImageStream) as it sets the renderConfig in
// the operator for the sync funcs below
{"RenderConfig", optr.syncRenderConfig},
{"NetworkPolicies", optr.syncNetworkPolicies},
{"MachineConfiguration", optr.syncMachineConfiguration},
{"MachineConfigNode", optr.syncMachineConfigNodes},
{"MachineConfigPools", optr.syncMachineConfigPools},
1 change: 1 addition & 0 deletions pkg/operator/render.go
@@ -52,6 +52,7 @@ type renderConfig struct {
TLSMinVersion string
TLSCipherSuites []string
LogLevel string
ClusterNetworkCIDR string
}

type assetRenderer struct {
75 changes: 73 additions & 2 deletions pkg/operator/sync.go
@@ -653,7 +653,13 @@ func (optr *Operator) syncRenderConfig(_ *renderConfig, _ *configv1.ClusterOpera
optr.setOperatorLogLevel(mcop.Spec.OperatorLogLevel)
}

optr.renderConfig = getRenderConfig(optr.namespace, string(kubeAPIServerServingCABytes), spec, &imgs.RenderConfigImages, infra, pointerConfigData, apiServer, fmt.Sprintf("%d", optr.logLevel))
// Get the Cluster Network CIDR for the MCC's allow NetworkPolicy
clusterNetworkCIDR, err := optr.getClusterNetworkCIDR()
if err != nil {
return err
}

optr.renderConfig = getRenderConfig(optr.namespace, string(kubeAPIServerServingCABytes), spec, &imgs.RenderConfigImages, infra, pointerConfigData, apiServer, fmt.Sprintf("%d", optr.logLevel), clusterNetworkCIDR)
Comment on lines +656 to +662

⚠️ Potential issue | 🟠 Major

Network CIDR fetch failure will block entire operator sync.

If the cluster Network resource doesn't have Status.ClusterNetwork populated (e.g., during early bringup before CNI is fully configured), this will cause syncRenderConfig to fail, blocking all subsequent sync operations and potentially degrading the operator.

Consider adding graceful handling similar to how proxy is handled (lines 2112-2114) by tolerating missing/empty state during bringup:

🛡️ Proposed fix for graceful degradation
 	// Get the Cluster Network CIDR for the MCC's allow NetworkPolicy
 	clusterNetworkCIDR, err := optr.getClusterNetworkCIDR()
 	if err != nil {
-		return err
+		// During early cluster bringup, the network status may not be populated yet.
+		// Log a warning but continue with empty CIDR; network policies will be applied
+		// on subsequent syncs once the network is ready.
+		if optr.inClusterBringup {
+			klog.Warningf("Could not get cluster network CIDR during bringup: %v", err)
+			clusterNetworkCIDR = ""
+		} else {
+			return err
+		}
 	}

Alternatively, if network policies are critical, consider moving syncNetworkPolicies to only run after inClusterBringup is false, or adding a check within syncNetworkPolicies to skip if CIDR is empty.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/sync.go` around lines 656 - 662, The call to
getClusterNetworkCIDR in syncRenderConfig can return empty/missing data during
early bringup and currently aborts the entire sync; change
syncRenderConfig/optr.renderConfig construction to tolerate a missing CIDR by
catching the error from getClusterNetworkCIDR and treating it as non-fatal
(e.g., set clusterNetworkCIDR to "" and log/debug the condition) or gate
syncNetworkPolicies to only run when inClusterBringup is false; specifically
modify the code around getClusterNetworkCIDR and the optr.renderConfig
assignment in syncRenderConfig to not return the error directly, and ensure
syncNetworkPolicies checks for an empty clusterNetworkCIDR or inClusterBringup
before applying NetworkPolicy logic.


return nil
}
@@ -1994,6 +2000,21 @@ func (optr *Operator) getOsImageURLs(namespace, osImageStreamName string) (strin
return cfg.BaseOSContainerImage, cfg.BaseOSExtensionsContainerImage, nil
}

func (optr *Operator) getClusterNetworkCIDR() (string, error) {
// Fetch the cluster-wide Network configuration
network, err := optr.networkLister.Get("cluster")
if err != nil {
return "", err
}

// Always use Status as it represents the current reality of the CNI
if len(network.Status.ClusterNetwork) > 0 {
return network.Status.ClusterNetwork[0].CIDR, nil
}

return "", fmt.Errorf("no cluster network CIDR found in status")
}

func (optr *Operator) getCAsFromConfigMap(namespace, name, key string) ([]byte, error) {
cm, err := optr.clusterCmLister.ConfigMaps(namespace).Get(name)
if err != nil {
@@ -2138,7 +2159,7 @@ func setGVK(obj runtime.Object, scheme *runtime.Scheme) error {
return nil
}

func getRenderConfig(tnamespace, kubeAPIServerServingCA string, ccSpec *mcfgv1.ControllerConfigSpec, imgs *ctrlcommon.RenderConfigImages, infra *configv1.Infrastructure, pointerConfigData []byte, apiServer *configv1.APIServer, logLevel string) *renderConfig {
func getRenderConfig(tnamespace, kubeAPIServerServingCA string, ccSpec *mcfgv1.ControllerConfigSpec, imgs *ctrlcommon.RenderConfigImages, infra *configv1.Infrastructure, pointerConfigData []byte, apiServer *configv1.APIServer, logLevel, clusterNetworkCIDR string) *renderConfig {
tlsMinVersion, tlsCipherSuites := ctrlcommon.GetSecurityProfileCiphersFromAPIServer(apiServer)
return &renderConfig{
TargetNamespace: tnamespace,
@@ -2153,6 +2174,7 @@ func getRenderConfig(tnamespace, kubeAPIServerServingCA string, ccSpec *mcfgv1.C
TLSMinVersion: tlsMinVersion,
TLSCipherSuites: tlsCipherSuites,
LogLevel: logLevel,
ClusterNetworkCIDR: clusterNetworkCIDR,
}
}

@@ -2610,3 +2632,52 @@ func (optr *Operator) getOCPVersionFromClusterVersion() string {
}
return fmt.Sprintf("%d.%d.%d", parsedVersion.Major(), parsedVersion.Minor(), parsedVersion.Patch())
}

func (optr *Operator) syncNetworkPolicies(config *renderConfig, _ *configv1.ClusterOperator) error {
// 1. Define your manifest list
manifests := []string{
// "common/network-policies/00-default-deny-all.yaml",
"common/network-policies/machine-config-controller-allow.yaml",
// "common/network-policies/03-allow-operator.yaml",
}

// if config.IsImageBuildEnabled {
// manifests = append(manifests, "common/network-policies/02-allow-os-builder.yaml")
// }

// 2. Iterate and Apply using Server-Side Apply
for _, path := range manifests {
// Render the template using the operator's internal asset renderer
npBytes, err := renderAsset(config, path)
if err != nil {
return fmt.Errorf("failed to render %s: %w", path, err)
}
Comment on lines +2649 to +2654

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain (condensed)

🏁 Scripts executed against openshift/machine-config-operator:

#!/bin/bash
# Locate the referenced manifest
fd -t f "machine-config-controller-allow" templates/ manifests/
# -> found only under templates/common/network-policies/, not manifests/

# Check what the embedded filesystem serves
rg -n "//go:embed" manifests/manifests.go
rg -n "func renderAsset" pkg/operator/ -A 10
# -> renderAsset loads assets via manifests.ReadFile, which only serves
#    files embedded from the manifests/ directory

# Confirm the path reference and the missing target directory
rg -n "common/network-policies/machine-config-controller-allow" .
ls -la manifests/common/network-policies/
# -> directory does not exist in manifests/

The manifest path will fail at runtime because the file is not embedded in the manifests directory.

The path "common/network-policies/machine-config-controller-allow.yaml" references a file that exists in templates/common/network-policies/ but is not included in the embedded filesystem. The manifests/ directory contains no common/network-policies/ subdirectory, and manifests.ReadFile() only reads from files embedded via the //go:embed * directive in manifests/manifests.go. This will cause the code to fail when renderAsset() attempts to load the file.

Either move the file to manifests/common/network-policies/ or update the path to match an existing manifest location (e.g., "machineconfigcontroller/machine-config-controller-allow.yaml" if placing it in manifests/machineconfigcontroller/).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/sync.go` around lines 2649 - 2654, The manifest path
"common/network-policies/machine-config-controller-allow.yaml" used when calling
renderAsset(config, path) is not embedded in the manifests FS and will fail at
runtime; fix by either moving the template file into the embedded manifests tree
(e.g., add it under manifests/common/network-policies/) or change the path
string to the actual embedded location (for example
"machineconfigcontroller/machine-config-controller-allow.yaml") so that
renderAsset (and any manifests.ReadFile usage) can find the file at runtime.


// Convert raw bytes to a NetworkPolicy object
netPol := resourceread.ReadNetworkPolicyV1OrDie(npBytes)

// Marshal the NetworkPolicy to JSON for Server-Side Apply
netPolBytes, err := json.Marshal(netPol)
if err != nil {
return fmt.Errorf("failed to marshal network policy %s: %w", netPol.Name, err)
}

// Apply with ForceOwnership to ensure the Operator is the source of truth
// Apply via Server-Side Apply with ForceOwnership
// This ensures the MCO 'owns' these fields and reverts manual drifts
_, err = optr.kubeClient.NetworkingV1().NetworkPolicies(netPol.Namespace).Patch(
context.TODO(),
netPol.Name,
types.ApplyPatchType,
netPolBytes,
metav1.PatchOptions{
FieldManager: "machine-config-operator",
Force: ptr.To(true),
},
)
if err != nil {
return fmt.Errorf("failed to apply network policy %s: %w", netPol.Name, err)
}
}
return nil
}
@@ -0,0 +1,43 @@
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mco-allow-controller
  namespace: {{.TargetNamespace}}
spec:
  podSelector:
    matchLabels: # TODO: see if the following should be `k8s-app: {{.ControllerAppLabel}}` instead
      k8s-app: machine-config-controller
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from: # Allow Prometheus Metrics Scraping
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: monitoring
    ports:
    - protocol: TCP
      port: 9001
  - from: # Allow Kubelet Health Probes (from Node Network)
    - ipBlock:
        cidr: {{.ClusterNetworkCIDR}}
    ports:
    - protocol: TCP
      port: 9443
Comment on lines +21 to +26

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain (condensed)

🏁 Scripts executed against openshift/machine-config-operator:

#!/bin/bash
# Find where ClusterNetworkCIDR is defined and how it is populated
rg "ClusterNetworkCIDR" --type go --type yaml -B 3 -A 3
sed -n '/func (optr \*Operator) getClusterNetworkCIDR/,/^}/p' pkg/operator/sync.go
# -> populated from network.Status.ClusterNetwork[0].CIDR (the pod overlay network)

# Look for node/machine network alternatives and port 9443 consumers
rg "NodeNetwork|MachineNetwork|node.*cidr" --type go -A 3 -B 3
rg "9443|kubelet.*probe|health.*probe" --type go --type yaml
head -n 30 templates/common/network-policies/machine-config-controller-allow.yaml
# -> the comment says "from Node Network" while the rule uses the pod network CIDR

Use the correct CIDR source for the intended access.

The comment says "from Node Network" but the rule uses {{.ClusterNetworkCIDR}}, which represents the pod network overlay, not node infrastructure. This makes the ingress rule much broader than intended—allowing every pod in the cluster to access port 9443—while potentially missing the actual kubelet/node source if probes originate from outside the pod network. Either use the appropriate node/machine network CIDR, or validate and update the comment to match the pod network intent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@templates/common/network-policies/machine-config-controller-allow.yaml`
around lines 21 - 26, The ingress rule labelled "Allow Kubelet Health Probes
(from Node Network)" currently uses {{.ClusterNetworkCIDR}} which is the pod
overlay and is too broad; update the rule to use the correct node/machine
network CIDR (e.g., replace {{.ClusterNetworkCIDR}} with the appropriate
variable for node/machine network such as {{.MachineNetworkCIDR}} or
{{.NodeNetworkCIDR}}) so only node IPs can access TCP port 9443, or if the
intent was to allow pod-to-controller access, change the human-facing comment to
say "from Cluster (pod) Network" to match {{.ClusterNetworkCIDR}}. Ensure the
change touches the ipBlock.cidr entry and/or the comment text associated with
port 9443.

  egress:
  - to: # Allow API Server Access
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: openshift-kube-apiserver
    ports:
    - protocol: TCP
      port: 6443
  - to: # Allow DNS Lookups
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: openshift-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
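
If the node-network intent flagged in the review finding above stands, the kubelet-probe ingress entry could instead reference a machine-network value. {{.MachineNetworkCIDR}} below is a hypothetical template variable, not something that exists in renderConfig today; it would need to be plumbed through getRenderConfig the same way ClusterNetworkCIDR was:

```yaml
  ingress:
  - from: # Allow Kubelet Health Probes (from Machine/Node Network)
    - ipBlock:
        cidr: {{.MachineNetworkCIDR}} # hypothetical variable, not yet in renderConfig
    ports:
    - protocol: TCP
      port: 9443
```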