[DRAFT] NOT MERGING POC:: Send Go runtime metrics via OTLP using OTel-native naming #4611

Draft
link04 wants to merge 3 commits into main from maximo/otlp-runtime-metrics-poc

link04 commented Mar 27, 2026

Do not review — initial POC only.

Sends go.memory.*, go.goroutine.count, go.processor.limit, go.config.gogc via OTLP. Related: DataDog/dd-trace-dotnet#8299

🤖 Generated with Claude Code

Adds go.memory.used, go.memory.limit, go.memory.allocated,
go.memory.allocations, go.memory.gc.goal, go.goroutine.count,
go.processor.limit, go.config.gogc as OTel instruments on the
existing OTLP metrics pipeline.
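
As a rough illustration of where values for these eight names could come from, the sketch below reads them from Go's standard `runtime/metrics` package. The mapping from OTel names to `runtime/metrics` sample names is an assumption based on the OTel Go runtime conventions, not code taken from this PR; `otelToRuntime`, `readUint64`, and `snapshot` are hypothetical names, and the sketch omits attributes and the OTel instruments themselves.

```go
package main

import (
	"fmt"
	"runtime/metrics"
	"sort"
)

// otelToRuntime maps OTel-style metric names to runtime/metrics sample names.
// This mapping is an assumption made for this sketch, not taken from the PR.
var otelToRuntime = map[string]string{
	"go.memory.limit":       "/gc/gomemlimit:bytes",
	"go.memory.allocated":   "/gc/heap/allocs:bytes",
	"go.memory.allocations": "/gc/heap/allocs:objects",
	"go.memory.gc.goal":     "/gc/heap/goal:bytes",
	"go.goroutine.count":    "/sched/goroutines:goroutines",
	"go.processor.limit":    "/sched/gomaxprocs:threads",
	"go.config.gogc":        "/gc/gogc:percent",
}

// readUint64 reads one sample and reports whether the running Go version
// supports it as a uint64 value.
func readUint64(name string) (uint64, bool) {
	s := []metrics.Sample{{Name: name}}
	metrics.Read(s)
	if s[0].Value.Kind() != metrics.KindUint64 {
		return 0, false
	}
	return s[0].Value.Uint64(), true
}

// snapshot collects the current values keyed by OTel-style name.
// Names whose samples are unsupported are left out of the result.
func snapshot() map[string]uint64 {
	out := make(map[string]uint64)
	for otelName, rtName := range otelToRuntime {
		if v, ok := readUint64(rtName); ok {
			out[otelName] = v
		}
	}
	// go.memory.used is derived here as total mapped memory minus memory
	// already released to the OS (an assumption of this sketch).
	total, okTotal := readUint64("/memory/classes/total:bytes")
	released, okReleased := readUint64("/memory/classes/heap/released:bytes")
	if okTotal && okReleased && total >= released {
		out["go.memory.used"] = total - released
	}
	return out
}

func main() {
	snap := snapshot()
	names := make([]string, 0, len(snap))
	for name := range snap {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s = %d\n", name, snap[name])
	}
}
```

In a real pipeline these reads would typically sit inside observable-instrument callbacks so that values are sampled at collection time.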

Enabled with DD_RUNTIME_METRICS_ENABLED=true + DD_METRICS_OTEL_ENABLED=true.
Falls back to DogStatsD if OTLP init fails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
return
}
if err := ddmetric.Shutdown(context.Background(), o.provider); err != nil {
log.Error("Error shutting down OTLP runtime metrics: %v", err)

🚫 [golangci] reported by reviewdog 🐶
ruleguard: suggestion: log.Error("Error shutting down OTLP runtime metrics: %v", err.Error()) (gocritic)

Comment thread ddtrace/tracer/tracer.go
log.Debug("Runtime metrics enabled via OTLP with OTel-native naming.")
orm, err := startOTLPRuntimeMetrics()
if err != nil {
log.Error("Failed to start OTLP runtime metrics, falling back to DogStatsD: %v", err)

🚫 [golangci] reported by reviewdog 🐶
ruleguard: suggestion: log.Error("Failed to start OTLP runtime metrics, falling back to DogStatsD: %v", err.Error()) (gocritic)

datadog-prod-us1-5 bot commented Mar 27, 2026

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 7.69%
Overall Coverage: 59.96% (+3.82%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: a903e74

pr-commenter bot commented Mar 27, 2026

Benchmarks

Benchmark execution time: 2026-04-09 17:22:41

Comparing candidate commit a903e74 in PR branch maximo/otlp-runtime-metrics-poc with baseline commit 92711d0 in branch main.

Found 0 performance improvements and 2 performance regressions! Performance is the same for 215 metrics; 7 metrics are unstable.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because changes are often small:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
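
The significance rule illustrated by the two diagrams can be sketched as a small predicate. This assumes the threshold defines a symmetric band around 0% and that all values are relative differences in percent; `significant` is a hypothetical helper, not the benchmarking platform's code.

```go
package main

import "fmt"

// significant reports whether the confidence interval [lo, hi] lies entirely
// outside the band [-threshold, +threshold]. All values are in percent.
func significant(lo, hi, threshold float64) bool {
	return lo > threshold || hi < -threshold
}

func main() {
	// First diagram: the CI [-0.6%, +1.2%] straddles 0%, so it is not significant.
	fmt.Println(significant(-0.6, 1.2, 1.0)) // prints "false"

	// Second diagram: the CI [1.3%, 3.1%] sits above the 1% threshold.
	fmt.Println(significant(1.3, 3.1, 1.0)) // prints "true"
}
```

Whether a significant change counts as better or worse depends on the metric; for execution time, an interval above the threshold means a regression.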

scenario:BenchmarkConfig/scenario_WithStartSpanConfig-25

  • 🟥 execution_time [+1.487ns; +5.149ns] or [+2.189%; +7.580%]

scenario:BenchmarkPayloadVersions/simple_1spans/v1_0-25

  • 🟥 execution_time [+32.454ns; +46.346ns] or [+2.975%; +4.248%]

link04 changed the title from "POC: Send Go runtime metrics via OTLP using OTel-native naming" to "[DRAFT] NOT MERGING POC:: Send Go runtime metrics via OTLP using OTel-native naming" on Mar 27, 2026
link04 and others added 2 commits April 3, 2026 14:10
Verifies all 8 OTel Go semantic convention names are expected
and none accidentally use DD-proprietary naming. Mirrors the
TestReportRuntimeMetrics pattern for DogStatsD metrics.
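
A naming check of this kind might look like the sketch below. It is not the test from this commit: `expectedNames` and `checkNames` are hypothetical, and treating the `runtime.go.` prefix as the DD-proprietary marker is an assumption drawn from the names mentioned elsewhere in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// expectedNames lists the eight OTel-native metric names from the PR description.
var expectedNames = []string{
	"go.memory.used",
	"go.memory.limit",
	"go.memory.allocated",
	"go.memory.allocations",
	"go.memory.gc.goal",
	"go.goroutine.count",
	"go.processor.limit",
	"go.config.gogc",
}

// checkNames returns an error for the first name that uses the DD-proprietary
// "runtime.go." prefix or that lacks the OTel-native "go." prefix.
func checkNames(names []string) error {
	for _, n := range names {
		if strings.HasPrefix(n, "runtime.go.") {
			return fmt.Errorf("%q uses DD-proprietary naming", n)
		}
		if !strings.HasPrefix(n, "go.") {
			return fmt.Errorf("%q is not an OTel-native go.* name", n)
		}
	}
	return nil
}

func main() {
	if err := checkNames(expectedNames); err != nil {
		fmt.Println("naming check failed:", err)
		return
	}
	fmt.Printf("all %d names use OTel-native naming\n", len(expectedNames))
}
```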

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Uses ManualReader + real instruments + callbacks to verify all 8
go.* metrics produce positive values with correct tags. Mirrors
TestReportRuntimeMetrics pattern for DogStatsD. 3/3 runs, no flakes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
link04 added a commit to DataDog/system-tests that referenced this pull request Apr 9, 2026
New scenario OTLP_RUNTIME_METRICS that sets DD_METRICS_OTEL_ENABLED=true
alongside DD_RUNTIME_METRICS_ENABLED=true. Tests verify OTel-native metric
names (dotnet.*, jvm.*, go.*, v8js.*) appear in OTLP payloads and that
DD-proprietary names (runtime.dotnet.*, runtime.go.*) do not.

All languages marked as missing_feature in manifests until POC PRs are merged:
- .NET: DataDog/dd-trace-dotnet#8299
- Go: DataDog/dd-trace-go#4611
- Node.js: DataDog/dd-trace-js#7869
- Java: DataDog/dd-trace-java#10985

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>