Skip to content

[NO-TICKET] Convert as much direct env var access to config DSL as possible#5559

Open
vpellan wants to merge 16 commits intomasterfrom
vpellan/env-vars-to-config-dsl
Open

[NO-TICKET] Convert as much direct env var access to config DSL as possible#5559
vpellan wants to merge 16 commits intomasterfrom
vpellan/env-vars-to-config-dsl

Conversation

@vpellan
Copy link
Copy Markdown
Contributor

@vpellan vpellan commented Apr 3, 2026

What does this PR do?

This PR converts as much configurations that are accessed directly through environment variable to our config DSL as possible. Some cannot be converted (env vars accessed before initializing the config DSL, or when booting a single piece of the tracer such as DI or AppSec)

Motivation:

By doing this, customers will not only be able to configure them through Datadog.configure blocks, but also through Stable Config. These configs will also be correctly reported through telemetry thanks to the work done for config visibility.

Change log entry

Yes. Enable the configuration of DD_GIT_REPOSITORY_URL, DD_GIT_COMMIT_SHA, DD_TRACE_AGENT_URL, DD_TRACE_AGENT_TIMEOUT_SECONDS, DD_METRIC_AGENT_PORT and DD_DYNAMIC_INSTRUMENTATION_PROBE_FILE through Datadog.configure blocks and in the Datadog UI

Additional Notes:

I'll open a PR on datadog-ci-rb so that DD_GIT_REPOSITORY_URL, DD_GIT_COMMIT_SHA are supported.

How to test the change?

@github-actions github-actions bot added core Involves Datadog core libraries integrations Involves tracing integrations tracing labels Apr 3, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 3, 2026

Typing analysis

Ignored files

This PR clears 5 ignored files. It increases the percentage of typed files from 45.99% to 46.48% (+0.49%).

Ignored files (+0-5)Cleared:
lib/datadog/core/environment/variable_helpers.rb
lib/datadog/core/metrics/client.rb
lib/datadog/core/runtime/ext.rb
lib/datadog/core/runtime/metrics.rb
lib/datadog/core/workers/runtime_metrics.rb

Note: Ignored files are excluded from the next sections.

steep:ignore comments

This PR introduces 5 steep:ignore comments, and clears 4 steep:ignore comments.

steep:ignore comments (+5-4)Introduced:
lib/datadog/core/telemetry/event/app_started.rb:130
lib/datadog/core/telemetry/event/app_started.rb:136
lib/datadog/core/telemetry/event/app_started.rb:151
lib/datadog/core/telemetry/event/app_started.rb:162
lib/datadog/core/telemetry/event/app_started.rb:172
Cleared:
lib/datadog/core/telemetry/event/app_started.rb:138
lib/datadog/core/telemetry/event/app_started.rb:144
lib/datadog/core/telemetry/event/app_started.rb:155
lib/datadog/core/telemetry/event/app_started.rb:165

Untyped methods

This PR introduces 23 untyped methods and 12 partially typed methods, and clears 1 untyped method. It decreases the percentage of typed methods from 61.89% to 60.88% (-1.01%).

Untyped methods (+23-1)Introduced:
sig/datadog/core/metrics/client.rbs:22
└── def supported?: () -> untyped
sig/datadog/core/metrics/client.rbs:24
└── def enabled?: () -> untyped
sig/datadog/core/metrics/client.rbs:26
└── def enabled=: (untyped enabled) -> untyped
sig/datadog/core/metrics/client.rbs:28
└── def default_hostname: () -> untyped
sig/datadog/core/metrics/client.rbs:30
└── def default_port: () -> untyped
sig/datadog/core/metrics/client.rbs:32
└── def default_statsd_client: () -> untyped
sig/datadog/core/metrics/client.rbs:36
└── def send_stats?: () -> untyped
sig/datadog/core/metrics/client.rbs:38
└── def count: (untyped stat, ?untyped? value, ?untyped? options) { () -> untyped } -> untyped
sig/datadog/core/metrics/client.rbs:40
└── def distribution: (untyped stat, ?untyped? value, ?untyped? options) ?{ () -> untyped } -> untyped
sig/datadog/core/metrics/client.rbs:42
└── def increment: (untyped stat, ?untyped? options) { () -> untyped } -> untyped
sig/datadog/core/metrics/client.rbs:44
└── def gauge: (untyped stat, ?untyped? value, ?untyped? options) ?{ () -> untyped } -> untyped
sig/datadog/core/metrics/client.rbs:46
└── def time: (untyped stat, ?untyped? options) { () -> untyped } -> untyped
sig/datadog/core/metrics/client.rbs:48
└── def send_metrics: (untyped metrics) -> untyped
sig/datadog/core/metrics/client.rbs:54
└── def dogstatsd_version: () -> untyped
sig/datadog/core/metrics/client.rbs:58
└── def ignored_statsd_warning: () -> untyped
sig/datadog/core/runtime/metrics.rbs:11
└── def gc_metrics: () -> untyped
sig/datadog/core/runtime/metrics.rbs:13
└── def try_flush: () { () -> untyped } -> untyped
sig/datadog/core/runtime/metrics.rbs:15
└── def default_metric_options: () -> untyped
sig/datadog/core/runtime/metrics.rbs:25
└── def compile_service_tags!: () -> untyped
sig/datadog/core/runtime/metrics.rbs:27
└── def nested_gc_metric: (untyped prefix, untyped k, untyped v) -> untyped
sig/datadog/core/runtime/metrics.rbs:29
└── def to_metric_name: (untyped str) -> untyped
sig/datadog/core/telemetry/worker.rbs:69
└── def buffer_klass: () -> untyped
sig/datadog/core/workers/runtime_metrics.rbs:16
└── def register_service: (untyped service) -> untyped
Cleared:
sig/datadog/core/telemetry/worker.rbs:67
└── def buffer_klass: () -> untyped
Partially typed methods (+12-0)Introduced:
sig/datadog/core/environment/variable_helpers.rbs:7
└── def env_to_bool: (untyped var, ?untyped? default, ?deprecation_warning: bool) -> untyped
sig/datadog/core/environment/variable_helpers.rbs:11
└── def decode_array: (untyped var, bool deprecation_warning) -> untyped
sig/datadog/core/metrics/client.rbs:20
└── def initialize: (telemetry: Core::Telemetry::Component, ?port: ::Integer, ?logger: Datadog::Core::Logger, ?statsd: untyped?, ?enabled: bool, **untyped _) -> void
sig/datadog/core/metrics/client.rbs:34
└── def configure: (?::Hash[untyped, untyped] options) -> untyped
sig/datadog/core/metrics/client.rbs:50
└── def close: () -> (untyped | nil)
sig/datadog/core/runtime/metrics.rbs:7
└── def initialize: (telemetry: Core::Telemetry::Component, port: ::Integer, **untyped options) -> void
sig/datadog/core/runtime/metrics.rbs:8
└── def register_service: (untyped service) -> (nil | untyped)
sig/datadog/core/runtime/metrics.rbs:9
└── def flush: () -> (nil | untyped)
sig/datadog/core/runtime/metrics.rbs:31
└── def gauge_if_not_nil: (untyped metric_name, untyped metric_value) -> (untyped | nil)
sig/datadog/core/telemetry/worker.rbs:47
└── def stop: (?bool force_stop, ?::Integer | ::Float timeout) -> untyped
sig/datadog/core/workers/runtime_metrics.rbs:12
└── def initialize: (telemetry: Core::Telemetry::Component, port: ::Integer, **untyped) -> void
sig/datadog/core/workers/runtime_metrics.rbs:17
└── def stop: (*untyped args, ?close_metrics: bool) -> untyped

Untyped other declarations

This PR introduces 6 untyped other declarations. It increases the percentage of typed other declarations from 77.96% to 78.06% (+0.1%).

Untyped other declarations (+6-0)Introduced:
sig/datadog/core/metrics/client.rbs:16
└── attr_reader statsd: untyped
sig/datadog/core/metrics/client.rbs:56
└── IGNORED_STATSD_ONLY_ONCE: untyped
sig/datadog/core/runtime/metrics.rbs:19
└── attr_reader service_tags: untyped
sig/datadog/core/runtime/metrics.rbs:21
└── attr_reader services: untyped
sig/datadog/core/runtime/metrics.rbs:23
└── attr_reader process_tags_enabled: untyped
sig/datadog/core/workers/runtime_metrics.rbs:10
└── attr_reader metrics: untyped

If you believe a method or an attribute is rightfully untyped or partially typed, you can add # untyped:accept on the line before the definition to remove it from the stats.

@pr-commenter
Copy link
Copy Markdown

pr-commenter bot commented Apr 3, 2026

Benchmarks

Benchmark execution time: 2026-04-15 13:06:29

Comparing candidate commit bc8f773 in PR branch vpellan/env-vars-to-config-dsl with baseline commit 9e57ef7 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 45 metrics, 1 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'

@datadog-prod-us1-6
Copy link
Copy Markdown

datadog-prod-us1-6 bot commented Apr 3, 2026

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 96.56%
Overall Coverage: 95.35% (-0.02%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: bc8f773 | Docs | Datadog PR Page | Was this helpful? React with 👍/👎 or give us feedback!

Comment on lines 58 to 61
# DEV 3.0: In dd-trace-rs, DD_TRACE_AGENT_URL takes precedence over DD_AGENT_HOST and DD_TRACE_AGENT_PORT
# Before releasing dd-trace-rb 3.0, we should consider following that logic.
# (Add the env vars to c.agent.host and c.agent.port, and prioritize parsed_http_url)
return @configured_hostname if defined?(@configured_hostname)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this part, is there an internal reference for it?

I was hoping that, if we change this, we'd align on a direction that all libraries are actively planning to move towards, rather than "before we were consistent with java, now we're consistent with python" kinda vibe (in particular, I want to avoid ever needing to change this again in the future).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the reference in the comment. It is in the public documentation here : https://docs.datadoghq.com/tracing/trace_collection/library_config/#agent . There's also internal doc for that but it basically state the same thing.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, public docs is the best source for this, ty!

@github-actions github-actions bot added the otel OpenTelemetry-related changes label Apr 8, 2026
@vpellan vpellan marked this pull request as ready for review April 8, 2026 13:26
@vpellan vpellan requested review from a team as code owners April 8, 2026 13:26
@vpellan vpellan requested a review from mabdinur April 8, 2026 13:26
@vpellan vpellan force-pushed the vpellan/env-vars-to-config-dsl branch from d172bf5 to cb9f2d6 Compare April 8, 2026 13:36
options[:statsd] = settings.health_metrics.statsd unless settings.health_metrics.statsd.nil?

Core::Diagnostics::Health::Metrics.new(telemetry: telemetry, logger: logger, **options)
Core::Diagnostics::Health::Metrics.new(telemetry: telemetry, port: settings.agent.statsd.port, logger: logger, **options)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if this change is related to config DSL; also could settings.agent.statsd be nil?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems unrelated but Health::Metrics was getting the port directly through DATADOG_ENV access. By doing this, we provide the port through the config DSL

also could settings.agent.statsd be nil?

This should not be possible or it means the configuration file has an error or failed loading at all

Comment on lines +112 to +122
settings :statsd do
# Agent Statsd UDP port.
# @configure_with {Datadog::Statsd}
# @default `DD_METRIC_AGENT_PORT` environment variable, otherwise `8125`
# @return [Integer,nil]
option :port do |o|
o.type :int
o.env Datadog::Core::Configuration::Ext::Metrics::ENV_DEFAULT_PORT
o.default Datadog::Core::Configuration::Ext::Metrics::DEFAULT_PORT
end
end
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this deserves a mention in the changelog

Comment on lines +1038 to +1057
settings :git do
# The URL of the git repository.
# @default `DD_GIT_REPOSITORY_URL` environment variable, otherwise `nil`
# @return [String,nil]
option :repository_url do |o|
o.type :string, nilable: true
o.env Core::Git::Ext::ENV_REPOSITORY_URL
# Sensitive informations are filtered before being manually sent through telemetry
# in core/telemetry/event/app_started.rb
o.skip_telemetry true
end

# The SHA of the git commit.
# @default `DD_GIT_COMMIT_SHA` environment variable, otherwise `nil`
# @return [String,nil]
option :commit_sha do |o|
o.type :string, nilable: true
o.env Core::Git::Ext::ENV_COMMIT_SHA
end
end
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this also should be mentioned in the changelog? I would probably even add a separate entry for every new config there

Comment thread lib/datadog/core/environment/git.rb Outdated

@git_commit_sha = DATADOG_ENV[Datadog::Core::Git::Ext::ENV_COMMIT_SHA]
def self.git_repository_url(settings)
Utils::Url.filter_basic_auth(settings.git.repository_url)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the memoization removed here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that this setting can be set programmatically, it can also be reconfigured at runtime using a Datadog.configure block. Memoization would not catch this change

@git_repository_url = Utils::Url.filter_basic_auth(DATADOG_ENV[Datadog::Core::Git::Ext::ENV_REPOSITORY_URL])
end

def self.git_commit_sha
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it kinda feels odd that we will be accessing repo url differently from commit sha, just because of this basic auth filter. Why don't we change this method to read commit sha from settings instead?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed that file following your suggestions in another comment

Comment thread lib/datadog/core/metrics/client.rb Outdated
attr_reader :statsd, :logger, :telemetry

def initialize(telemetry:, logger: Datadog.logger, statsd: nil, enabled: true, **_)
def initialize(telemetry:, port:, logger: Datadog.logger, statsd: nil, enabled: true, **_)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this definitely feels like a big breaking change here, even though this method is not marked as public. Is this really necessary to add new mandatory keyword argument here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a default value in 28823da, although we should use the default value set by settings.agent.statsd.port, hence why I made it mandatory at first

Comment thread lib/datadog/core/environment/git.rb Outdated
return @git_commit_sha if defined?(@git_commit_sha)

@git_commit_sha = DATADOG_ENV[Datadog::Core::Git::Ext::ENV_COMMIT_SHA]
def self.git_repository_url(settings)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method name doesn't seem right now - why do we even need it, when we could call filter_basic_auth directly where needed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deleted the file and directly called that method where needed in 28823da

conf_value(
'DD_INJECT_FORCE',
Core::Environment::VariableHelpers.env_to_bool('DD_INJECT_FORCE', false),
%w[1 true].include?(DATADOG_ENV.fetch('DD_INJECT_FORCE', 'false').strip.downcase),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we not using env_to_bool here anymore? could we still have a helper for converting string to boolean?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally removed it to push for using the DSL instead as o.type :bool and o.env would have the same effect on env vars but would also work with stable config and programmatic configs, and used include here because that env var is not from dd-trace-rb, but datadog-injector. I reverted the change in 28823da and reintroduced env_to_bool but we should avoid using it whenever possible. Another use case could be for configs set before loading the config DSL (e.g. in boot.rb for DI)

Copy link
Copy Markdown
Member

@p-datadog p-datadog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things caught my eye — both with apply-able suggestions.

Comment thread spec/datadog/tracing/diagnostics/health_spec.rb Outdated
Comment thread lib/datadog/tracing/contrib/extensions.rb Outdated
Comment thread lib/datadog/core/environment/git.rb Outdated
Copy link
Copy Markdown
Member

@p-datadog p-datadog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question (non-blocking): DI probe file loader changes

DD_DYNAMIC_INSTRUMENTATION_PROBE_FILE is an undocumented internal/development env var — it loads probe definitions from a local JSON file instead of remote configuration. It's not something customers would configure, so the Stable Config and Datadog.configure benefits don't really apply here.

The original code deliberately reads the env var directly because the probe file loader runs at boot time, before the normal component lifecycle. The reordering means Datadog::DI.component (which triggers DI component initialization) now runs before checking whether a probe file path is even set. Today this is gated by the env var check in boot.rb:42, but the coupling feels more fragile than the original.

Would it make sense to leave this file as-is and skip the probe_file DSL option? The telemetry visibility benefit feels minimal for an internal-only env var. Happy to be wrong on this — curious if there's a reason I'm not seeing for including it.

Copy link
Copy Markdown
Contributor

@buraizu buraizu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving with one update request for description consistency (and also subject-verb agreement).

Comment thread docs/GettingStarted.md Outdated

This allows you to keep important spans when trace-level sampling is applied. Is is not possible to drop spans using Single Span Sampling.

You can provide single span sampling rules inline with `tracing.sampling.span_rules`, or configure `tracing.sampling.span_rules_file` with a path to a file containing the rules. When both are configured, `tracing.sampling.span_rules` takes precedence.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
You can provide single span sampling rules inline with `tracing.sampling.span_rules`, or configure `tracing.sampling.span_rules_file` with a path to a file containing the rules. When both are configured, `tracing.sampling.span_rules` takes precedence.
You can provide single span sampling rules inline with `tracing.sampling.span_rules`, or configure `tracing.sampling.span_rules_file` with a path to a file containing the rules. When both are configured, inline rules take precedence.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Applied in 28823da

'Both DD_SPAN_SAMPLING_RULES and DD_SPAN_SAMPLING_RULES_FILE were provided: only ' \
'DD_SPAN_SAMPLING_RULES will be used. Please do not provide DD_SPAN_SAMPLING_RULES_FILE when ' \
'also providing DD_SPAN_SAMPLING_RULES as their configuration conflicts. ' \
"DD_SPAN_SAMPLING_RULES_FILE=#{span_rules_file} DD_SPAN_SAMPLING_RULES=#{value}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

span_rules_file references the contents of the file (and not the file name). So this log message will not show the value set in DD_SPAN_SAMPLING_RULES_FILE. Is this expected?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that, updated in 28823da

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I removed it completely in d06d386, I did not liked that it was printing the path only when the env var was set. I don't think that it was providing a lot of value to log the values of the configs there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Involves Datadog core libraries integrations Involves tracing integrations otel OpenTelemetry-related changes tracing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants