feat(orchestrator): add OOTB CRD collection for Gateway API, service mesh, and ingress controllers by eliottness · Pull Request #48966 · DataDog/datadog-agent

eliottness · 2026-04-07T16:36:53Z

What does this PR do?

Adds 22 new builtin CRD entries to the orchestrator explorer so Cloud Security can determine internet-reachability paths for k8s workloads. Uses a hybrid collection strategy: resource-specific entries for high-volume vendors (Istio, NGINX, Traefik) and group-level entries for less common vendors.

Three new per-family config flags (all opt-in, default: false):

orchestrator_explorer.custom_resources.ootb.gateway_api
orchestrator_explorer.custom_resources.ootb.service_mesh
orchestrator_explorer.custom_resources.ootb.ingress_controllers

New families:

Gateway API (5 resource-specific): gateways, httproutes, grpcroutes, tlsroutes, listenersets
Service mesh — Istio (5 resource-specific): virtualservices, gateways, destinationrules, serviceentries, sidecars (with v1beta1 fallback)
Service mesh — others (6 group-level): Envoy Gateway, Traefik (legacy), Linkerd, Consul, Consul Mesh, Kuma
Ingress controllers — NGINX (2 resource-specific): virtualservers, virtualserverroutes
Ingress controllers — Traefik (1 resource-specific): ingressroutes
Ingress controllers — others (3 group-level): Kong, HAProxy Core, HAProxy v1

Motivation

Cloud Security needs to tell customers which container workloads are internet-reachable. Today, the agent collects standard Ingress and Service objects, covering ~16% of EKS customers. Over 36% use service meshes or non-standard ingress controllers whose exposure paths go through CRDs we don't collect.

RFC: https://datadoghq.atlassian.net/wiki/x/4IOyfAE
Technical implementation: https://datadoghq.atlassian.net/wiki/x/EgO6fAE

Describe how you validated your changes

All existing tests pass: TestNewBuiltinCRDConfigs, TestImportBuiltinCollectors, TestGetDatadogCustomResourceCollectors, TestFilterCRCollectorsByPermission
New test TestNewBuiltinCRDConfigsPerFamilyFlags verifies each per-family flag independently disables its family, and the global OOTB flag disables everything
Package compiles cleanly with go build -tags "kubeapiserver orchestrator"

Additional Notes

All three flags default to false (opt-in). Collection is only activated when RBAC is granted (via helm/operator) and the corresponding flag is set to true.
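
The activation rule above can be sketched in Go as a conjunction of three conditions. This is a minimal illustration, not the agent's code: the `Settings` type, its field names, and `shouldCollect` are hypothetical, and the global OOTB flag is modelled as a plain boolean.

```go
package main

import "fmt"

// Family identifies one of the three CRD families named in this PR.
type Family string

const (
	GatewayAPI         Family = "gateway_api"
	ServiceMesh        Family = "service_mesh"
	IngressControllers Family = "ingress_controllers"
)

// Settings is a hypothetical, simplified view of the inputs.
// It is not the agent's real configuration type.
type Settings struct {
	GlobalOOTB  bool            // global OOTB flag; false disables every family
	FamilyFlags map[Family]bool // per-family flags; a missing key reads as false (opt-in)
	RBACGranted map[Family]bool // whether helm/operator granted access for the family
}

// shouldCollect reports whether a family would be collected:
// the global flag, the family flag, and RBAC must all be in place.
func shouldCollect(s Settings, f Family) bool {
	return s.GlobalOOTB && s.FamilyFlags[f] && s.RBACGranted[f]
}

func main() {
	s := Settings{
		GlobalOOTB:  true,
		FamilyFlags: map[Family]bool{GatewayAPI: true},
		RBACGranted: map[Family]bool{GatewayAPI: true, ServiceMesh: true},
	}
	for _, f := range []Family{GatewayAPI, ServiceMesh, IngressControllers} {
		fmt.Printf("%s: collect=%v\n", f, shouldCollect(s, f))
	}
}
```

In this sketch, service_mesh has RBAC but no flag and ingress_controllers has neither, so only gateway_api is collected.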

Merge order: The backend allowlist PR (dd-go) must be deployed before this PR merges, otherwise collected CRs will be silently dropped.

Backend allowlist deployed: DataDog/dd-go#230589
Helm chart RBAC merged: [datadog] Add RBAC for network topology CRD collection helm-charts#2541
Operator RBAC merged: Add RBAC for Gateway API, service mesh, and ingress controller CRDs datadog-operator#2874

Related PRs

Repo	PR	Purpose
DataDog/dd-go	DataDog/dd-go#230589	Backend allowlist (deploy FIRST)
DataDog/helm-charts	DataDog/helm-charts#2541	Helm chart RBAC
DataDog/datadog-operator	DataDog/datadog-operator#2874	Operator RBAC

…mesh, and ingress controllers

Add 22 new builtin CRD entries to the orchestrator explorer so Cloud Security can determine internet-reachability paths for k8s workloads. Uses a hybrid collection strategy: resource-specific entries for high-volume vendors (Istio, NGINX, Traefik) to control isForbidden API call volume, and group-level entries for less common vendors.

Three new per-family config flags allow operators to disable specific families independently:

- orchestrator_explorer.custom_resources.ootb.gateway_api
- orchestrator_explorer.custom_resources.ootb.service_mesh
- orchestrator_explorer.custom_resources.ootb.ingress_controllers

Constraint: Must reuse existing K8sCR pipeline (no new entity kinds)
Constraint: isForbidden does a live List API call per resource, so resource-specific entries used for top vendors to limit call count
Rejected: All group-level entries | isForbidden scaling (60-100+ API calls worst case)
Rejected: Single global flag only | operators need per-family opt-out
Confidence: high
Scope-risk: moderate
Not-tested: clusters running all supported meshes simultaneously (theoretical 48-54 isForbidden calls)
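
To illustrate why the hybrid strategy limits isForbidden calls, here is a rough Go sketch that counts permission probes under the stated constraint of one live List call per resource. The `Entry` type, `probeCount`, and the served-resource list are hypothetical and only illustrative; they are not taken from the agent code or from a real cluster.

```go
package main

import "fmt"

// Entry is a hypothetical stand-in for one builtin CRD entry.
// An empty Resources slice means "group-level": every resource in the group.
type Entry struct {
	Group     string
	Resources []string
}

// probeCount estimates how many permission probes a set of entries triggers,
// assuming one probe per resource. served maps an API group to the resources
// the cluster exposes in that group.
func probeCount(entries []Entry, served map[string][]string) int {
	total := 0
	for _, e := range entries {
		if len(e.Resources) > 0 {
			// Resource-specific entry: only the listed resources are probed.
			total += len(e.Resources)
			continue
		}
		// Group-level entry: every served resource in the group is probed.
		total += len(served[e.Group])
	}
	return total
}

func main() {
	// Illustrative data only: a group that serves more resources than we need.
	served := map[string][]string{
		"networking.istio.io": {
			"virtualservices", "gateways", "destinationrules", "serviceentries",
			"sidecars", "envoyfilters", "workloadentries", "workloadgroups", "proxyconfigs",
		},
	}
	specific := []Entry{{
		Group:     "networking.istio.io",
		Resources: []string{"virtualservices", "gateways", "destinationrules", "serviceentries", "sidecars"},
	}}
	groupLevel := []Entry{{Group: "networking.istio.io"}}

	fmt.Println("resource-specific probes:", probeCount(specific, served))
	fmt.Println("group-level probes:", probeCount(groupLevel, served))
}
```

Under these assumptions the resource-specific entry costs a fixed number of probes, while the group-level entry grows with whatever the vendor installs in the group.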

Gateway API, service mesh, and ingress controller collection should be opt-in, not opt-out. Users must explicitly enable via config or RBAC (helm/operator) before these CRDs are collected.

Constraint: RBAC must be granted before collection can work

agent-platform-auto-pr · 2026-04-07T17:30:27Z

Files inventory check summary

File checks results against ancestor 00ba5786:

Results for datadog-agent_7.79.0~devel.git.472.82a8f86.pipeline.106602452-1_amd64.deb:

No change detected

cit-pr-commenter-54b7da · 2026-04-07T17:51:57Z

Regression Detector

Regression Detector Results

Metrics dashboard
Target profiles
Run ID: 92089b94-dfc6-4b79-b137-0bc259e536e3

Baseline: eed7901
Comparison: 8834b4b
Diff

Optimization Goals: ✅ No significant changes detected

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
➖	docker_containers_cpu	% cpu utilization	-1.19	[-4.12, +1.75]	1	Logs

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
➖	quality_gate_metrics_logs	memory utilization	+2.16	[+1.92, +2.40]	1	Logs bounds checks dashboard
➖	quality_gate_logs	% cpu utilization	+1.68	[+0.04, +3.33]	1	Logs bounds checks dashboard
➖	file_tree	memory utilization	+0.75	[+0.70, +0.81]	1	Logs
➖	otlp_ingest_metrics	memory utilization	+0.45	[+0.29, +0.61]	1	Logs
➖	ddot_logs	memory utilization	+0.30	[+0.22, +0.37]	1	Logs
➖	docker_containers_memory	memory utilization	+0.29	[+0.21, +0.37]	1	Logs
➖	quality_gate_idle	memory utilization	+0.18	[+0.12, +0.23]	1	Logs bounds checks dashboard
➖	uds_dogstatsd_20mb_12k_contexts_20_senders	memory utilization	+0.11	[+0.05, +0.17]	1	Logs
➖	tcp_syslog_to_blackhole	ingress throughput	+0.08	[-0.09, +0.26]	1	Logs
➖	file_to_blackhole_1000ms_latency	egress throughput	+0.03	[-0.40, +0.46]	1	Logs
➖	ddot_metrics_sum_delta	memory utilization	+0.03	[-0.14, +0.20]	1	Logs
➖	file_to_blackhole_100ms_latency	egress throughput	+0.02	[-0.10, +0.14]	1	Logs
➖	tcp_dd_logs_filter_exclude	ingress throughput	+0.00	[-0.11, +0.11]	1	Logs
➖	uds_dogstatsd_to_api	ingress throughput	-0.01	[-0.20, +0.19]	1	Logs
➖	file_to_blackhole_500ms_latency	egress throughput	-0.01	[-0.41, +0.39]	1	Logs
➖	uds_dogstatsd_to_api_v3	ingress throughput	-0.02	[-0.23, +0.19]	1	Logs
➖	quality_gate_idle_all_features	memory utilization	-0.02	[-0.06, +0.01]	1	Logs bounds checks dashboard
➖	file_to_blackhole_0ms_latency	egress throughput	-0.11	[-0.68, +0.45]	1	Logs
➖	ddot_metrics_sum_cumulativetodelta_exporter	memory utilization	-0.22	[-0.44, +0.01]	1	Logs
➖	ddot_metrics	memory utilization	-0.30	[-0.47, -0.12]	1	Logs
➖	otlp_ingest_logs	memory utilization	-0.33	[-0.43, -0.23]	1	Logs
➖	ddot_metrics_sum_cumulative	memory utilization	-0.62	[-0.76, -0.48]	1	Logs
➖	docker_containers_cpu	% cpu utilization	-1.19	[-4.12, +1.75]	1	Logs

Bounds Checks: ✅ Passed

perf	experiment	bounds_check_name	replicates_passed	observed_value	links
✅	docker_containers_cpu	simple_check_run	10/10	696 ≥ 26
✅	docker_containers_memory	memory_usage	10/10	276.50MiB ≤ 370MiB
✅	docker_containers_memory	simple_check_run	10/10	647 ≥ 26
✅	file_to_blackhole_0ms_latency	memory_usage	10/10	0.19GiB ≤ 1.20GiB
✅	file_to_blackhole_0ms_latency	missed_bytes	10/10	0B = 0B
✅	file_to_blackhole_1000ms_latency	memory_usage	10/10	0.23GiB ≤ 1.20GiB
✅	file_to_blackhole_1000ms_latency	missed_bytes	10/10	0B = 0B
✅	file_to_blackhole_100ms_latency	memory_usage	10/10	0.20GiB ≤ 1.20GiB
✅	file_to_blackhole_100ms_latency	missed_bytes	10/10	0B = 0B
✅	file_to_blackhole_500ms_latency	memory_usage	10/10	0.21GiB ≤ 1.20GiB
✅	file_to_blackhole_500ms_latency	missed_bytes	10/10	0B = 0B
✅	quality_gate_idle	intake_connections	10/10	3 = 3	bounds checks dashboard
✅	quality_gate_idle	memory_usage	10/10	176.21MiB ≤ 181MiB	bounds checks dashboard
✅	quality_gate_idle_all_features	intake_connections	10/10	3 = 3	bounds checks dashboard
✅	quality_gate_idle_all_features	memory_usage	10/10	495.55MiB ≤ 550MiB	bounds checks dashboard
✅	quality_gate_logs	intake_connections	10/10	4 ≤ 6	bounds checks dashboard
✅	quality_gate_logs	memory_usage	10/10	207.23MiB ≤ 220MiB	bounds checks dashboard
✅	quality_gate_logs	missed_bytes	10/10	0B = 0B	bounds checks dashboard
✅	quality_gate_metrics_logs	cpu_usage	10/10	357.93 ≤ 2000	bounds checks dashboard
✅	quality_gate_metrics_logs	intake_connections	10/10	4 ≤ 6	bounds checks dashboard
✅	quality_gate_metrics_logs	memory_usage	10/10	432.62MiB ≤ 475MiB	bounds checks dashboard
✅	quality_gate_metrics_logs	missed_bytes	10/10	0B = 0B	bounds checks dashboard

Explanation

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".
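
The three criteria above can be expressed as a small predicate. The following Go sketch is an illustration of the stated rule only; the `Experiment` type and `isRegression` function are hypothetical names, not part of the Regression Detector.

```go
package main

import (
	"fmt"
	"math"
)

// Experiment holds the columns used by the decision rule (hypothetical type).
type Experiment struct {
	Name         string
	DeltaMeanPct float64 // estimated Δ mean %
	CILow        float64 // lower bound of the Δ mean % confidence interval
	CIHigh       float64 // upper bound of the Δ mean % confidence interval
	Erratic      bool    // true if the experiment is configured as erratic
}

// isRegression applies the three criteria: the effect is at least the
// tolerance, the confidence interval excludes zero, and the experiment
// is not marked erratic.
func isRegression(e Experiment, tolerancePct float64) bool {
	bigEnough := math.Abs(e.DeltaMeanPct) >= tolerancePct
	ciExcludesZero := e.CILow > 0 || e.CIHigh < 0
	return bigEnough && ciExcludesZero && !e.Erratic
}

func main() {
	// Two rows taken from the tables above, checked against the 5.00% tolerance.
	experiments := []Experiment{
		{Name: "quality_gate_metrics_logs", DeltaMeanPct: 2.16, CILow: 1.92, CIHigh: 2.40},
		{Name: "docker_containers_cpu", DeltaMeanPct: -1.19, CILow: -4.12, CIHigh: 1.75, Erratic: true},
	}
	for _, e := range experiments {
		fmt.Printf("%s: regression=%v\n", e.Name, isRegression(e, 5.0))
	}
}
```

For example, quality_gate_metrics_logs has an interval that excludes zero, but its +2.16% change is below the 5.00% tolerance, so it is not flagged.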

CI Pass/Fail Decision

✅ Passed. All Quality Gates passed.

quality_gate_metrics_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check cpu_usage: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
quality_gate_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.

agent-platform-auto-pr · 2026-04-07T18:06:10Z

Static quality checks

✅ Please find below the results from static quality gates
Comparison made with ancestor 00ba578
📊 Static Quality Gates Dashboard
🔗 SQG Job

Successful checks

Info

	Quality gate	Change	Size (prev → curr → max)
✅	agent_deb_amd64	+3.88 KiB (0.00% increase)	753.303 → 753.307 → 753.380
✅	agent_deb_amd64_fips	+3.88 KiB (0.00% increase)	710.215 → 710.219 → 713.900
✅	agent_heroku_amd64	+4.0 KiB (0.00% increase)	313.459 → 313.462 → 320.580
✅	agent_msi	+11.5 KiB (0.00% increase)	605.191 → 605.202 → 651.440
✅	agent_rpm_amd64	+3.88 KiB (0.00% increase)	753.287 → 753.291 → 753.350
✅	agent_rpm_amd64_fips	+3.88 KiB (0.00% increase)	710.199 → 710.203 → 713.880
✅	agent_rpm_arm64	+7.39 KiB (0.00% increase)	731.663 → 731.670 → 735.290
✅	agent_rpm_arm64_fips	+7.39 KiB (0.00% increase)	691.627 → 691.634 → 696.840
✅	agent_suse_amd64	+3.88 KiB (0.00% increase)	753.287 → 753.291 → 753.350
✅	agent_suse_amd64_fips	+3.88 KiB (0.00% increase)	710.199 → 710.203 → 713.880
✅	agent_suse_arm64	+7.39 KiB (0.00% increase)	731.663 → 731.670 → 735.290
✅	agent_suse_arm64_fips	+7.39 KiB (0.00% increase)	691.627 → 691.634 → 696.840
✅	docker_agent_amd64	+3.89 KiB (0.00% increase)	813.575 → 813.579 → 815.700
✅	docker_agent_arm64	+7.39 KiB (0.00% increase)	816.753 → 816.760 → 821.970
✅	docker_agent_jmx_amd64	+3.88 KiB (0.00% increase)	1004.491 → 1004.494 → 1006.580
✅	docker_agent_jmx_arm64	+7.39 KiB (0.00% increase)	996.447 → 996.454 → 1001.570
✅	docker_cluster_agent_amd64	+4.01 KiB (0.00% increase)	204.103 → 204.107 → 206.270

14 successful checks with minimal change (< 2 KiB)

	Quality gate	Current Size
✅	docker_cluster_agent_arm64	218.547 MiB
✅	docker_cws_instrumentation_amd64	7.142 MiB
✅	docker_cws_instrumentation_arm64	6.689 MiB
✅	docker_dogstatsd_amd64	39.262 MiB
✅	docker_dogstatsd_arm64	37.507 MiB
✅	dogstatsd_deb_amd64	29.913 MiB
✅	dogstatsd_deb_arm64	28.062 MiB
✅	dogstatsd_rpm_amd64	29.913 MiB
✅	dogstatsd_suse_amd64	29.913 MiB
✅	iot_agent_deb_amd64	43.297 MiB
✅	iot_agent_deb_arm64	40.344 MiB
✅	iot_agent_deb_armhf	41.092 MiB
✅	iot_agent_rpm_amd64	43.298 MiB
✅	iot_agent_suse_amd64	43.298 MiB

On-wire sizes (compressed)

	Quality gate	Change	Size (prev → curr → max)
✅	agent_deb_amd64	+30.55 KiB (0.02% increase)	174.781 → 174.810 → 178.360
✅	agent_deb_amd64_fips	-27.59 KiB (0.02% reduction)	165.449 → 165.422 → 172.790
✅	agent_heroku_amd64	neutral	75.043 MiB → 79.970
✅	agent_msi	neutral	138.469 MiB → 146.220
✅	agent_rpm_amd64	+28.46 KiB (0.02% increase)	177.607 → 177.634 → 181.830
✅	agent_rpm_amd64_fips	+12.75 KiB (0.01% increase)	167.736 → 167.748 → 173.370
✅	agent_rpm_arm64	-19.06 KiB (0.01% reduction)	159.633 → 159.614 → 163.060
✅	agent_rpm_arm64_fips	-38.54 KiB (0.02% reduction)	151.509 → 151.471 → 156.170
✅	agent_suse_amd64	+28.46 KiB (0.02% increase)	177.607 → 177.634 → 181.830
✅	agent_suse_amd64_fips	+12.75 KiB (0.01% increase)	167.736 → 167.748 → 173.370
✅	agent_suse_arm64	-19.06 KiB (0.01% reduction)	159.633 → 159.614 → 163.060
✅	agent_suse_arm64_fips	-38.54 KiB (0.02% reduction)	151.509 → 151.471 → 156.170
✅	docker_agent_amd64	-7.14 KiB (0.00% reduction)	268.257 → 268.250 → 272.480
✅	docker_agent_arm64	+2.84 KiB (0.00% increase)	255.447 → 255.449 → 261.060
✅	docker_agent_jmx_amd64	-8.01 KiB (0.00% reduction)	336.907 → 336.900 → 341.100
✅	docker_agent_jmx_arm64	+3.76 KiB (0.00% increase)	320.084 → 320.088 → 325.620
✅	docker_cluster_agent_amd64	+7.88 KiB (0.01% increase)	71.417 → 71.425 → 72.920
✅	docker_cluster_agent_arm64	+4.0 KiB (0.01% increase)	67.039 → 67.042 → 68.220
✅	docker_cws_instrumentation_amd64	neutral	2.999 MiB → 3.330
✅	docker_cws_instrumentation_arm64	neutral	2.729 MiB → 3.090
✅	docker_dogstatsd_amd64	neutral	15.175 MiB → 15.820
✅	docker_dogstatsd_arm64	+4.49 KiB (0.03% increase)	14.495 → 14.499 → 14.830
✅	dogstatsd_deb_amd64	neutral	7.899 MiB → 8.790
✅	dogstatsd_deb_arm64	neutral	6.785 MiB → 7.710
✅	dogstatsd_rpm_amd64	neutral	7.909 MiB → 8.800
✅	dogstatsd_suse_amd64	neutral	7.909 MiB → 8.800
✅	iot_agent_deb_amd64	neutral	11.407 MiB → 13.040
✅	iot_agent_deb_arm64	neutral	9.722 MiB → 11.450
✅	iot_agent_deb_armhf	neutral	9.946 MiB → 11.620
✅	iot_agent_rpm_amd64	neutral	11.424 MiB → 13.060
✅	iot_agent_suse_amd64	neutral	11.424 MiB → 13.060

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 82a8f86af3

rahulkaukuntla

config options LGTM

…agent

When networkCRDs.enabled=true, set all three per-family agent config flags on the cluster-agent pod so collection is activated alongside the RBAC:

DD_ORCHESTRATOR_EXPLORER_CUSTOM_RESOURCES_OOTB_GATEWAY_API=true
DD_ORCHESTRATOR_EXPLORER_CUSTOM_RESOURCES_OOTB_SERVICE_MESH=true
DD_ORCHESTRATOR_EXPLORER_CUSTOM_RESOURCES_OOTB_INGRESS_CONTROLLERS=true

These map to orchestrator_explorer.custom_resources.ootb.{gateway_api,service_mesh,ingress_controllers} in the agent config (DataDog/datadog-agent#48966).
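
The mapping between the config keys and the environment variables listed above follows a visible pattern: prefix `DD_`, upper-case the key, and replace dots with underscores. A minimal Go sketch of that pattern follows; `envVarFor` is a hypothetical helper written for illustration, not a function from the agent.

```go
package main

import (
	"fmt"
	"strings"
)

// envVarFor derives the environment variable name for a dotted config key,
// following the pattern seen in the three variables above.
func envVarFor(key string) string {
	return "DD_" + strings.ToUpper(strings.ReplaceAll(key, ".", "_"))
}

func main() {
	keys := []string{
		"orchestrator_explorer.custom_resources.ootb.gateway_api",
		"orchestrator_explorer.custom_resources.ootb.service_mesh",
		"orchestrator_explorer.custom_resources.ootb.ingress_controllers",
	}
	for _, k := range keys {
		fmt.Printf("%s=true\n", envVarFor(k))
	}
}
```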

dd-octo-sts bot added the internal label (Identify a non-fork PR) Apr 7, 2026

github-actions bot added the medium review label (PR review might take time) Apr 7, 2026

dd-octo-sts bot added the team/kubernetes-experiences and team/agent-configuration labels Apr 7, 2026

This was referenced Apr 7, 2026

[datadog] Add RBAC for network topology CRD collection DataDog/helm-charts#2541

Draft

Add RBAC for Gateway API, service mesh, and ingress controller CRDs DataDog/datadog-operator#2874

Open

eliottness added the qa/done label (QA done before merge and regressions are covered by tests) Apr 7, 2026

Add release note for OOTB CRD collection feature

0835e5b

Add Istio v1alpha3 fallback for older clusters

82a8f86

Older Istio clusters (pre-1.22) only serve networking.istio.io/v1alpha3. Add it as a fallback version so those clusters are not silently excluded.
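
A version fallback of this kind can be sketched as "pick the first preferred version that the cluster serves". The Go snippet below is an illustration under assumptions: `pickVersion` is a hypothetical helper, and the preference order v1, v1beta1, v1alpha3 is assumed from the fallbacks mentioned in this PR rather than copied from the agent code.

```go
package main

import "fmt"

// pickVersion returns the first version in preferred that the cluster serves.
// The boolean is false when none of the preferred versions is served.
func pickVersion(preferred []string, served map[string]bool) (string, bool) {
	for _, v := range preferred {
		if served[v] {
			return v, true
		}
	}
	return "", false
}

func main() {
	// Assumed preference order for networking.istio.io: newest first.
	preferred := []string{"v1", "v1beta1", "v1alpha3"}

	// An older cluster that only serves v1alpha3.
	oldCluster := map[string]bool{"v1alpha3": true}
	if v, ok := pickVersion(preferred, oldCluster); ok {
		fmt.Println("older cluster uses networking.istio.io/" + v)
	}

	// A newer cluster that serves all three versions.
	newCluster := map[string]bool{"v1": true, "v1beta1": true, "v1alpha3": true}
	if v, ok := pickVersion(preferred, newCluster); ok {
		fmt.Println("newer cluster uses networking.istio.io/" + v)
	}
}
```

Without the last entry in the preference list, the older cluster in this sketch would match nothing and be skipped, which is the silent exclusion the commit addresses.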

eliottness marked this pull request as ready for review April 8, 2026 13:15

eliottness requested review from a team as code owners April 8, 2026 13:15

eliottness requested a review from rahulkaukuntla April 8, 2026 13:15

chatgpt-codex-connector bot reviewed Apr 8, 2026

Comment thread pkg/collector/corechecks/cluster/orchestrator/collector_bundle.go

rahulkaukuntla approved these changes Apr 8, 2026

kangyili approved these changes Apr 9, 2026

gh-worker-dd-mergequeue-cf854d bot merged commit 8834b4b into main Apr 9, 2026
293 checks passed

gh-worker-dd-mergequeue-cf854d bot deleted the eliottness/ootb-crd-gateway-mesh-ingress branch April 9, 2026 21:09

github-actions bot added this to the 7.79.0 milestone Apr 9, 2026

gh-worker-dd-mergequeue-cf854d[bot] merged 4 commits into main from eliottness/ootb-crd-gateway-mesh-ingress
