Improve peer-tag calculation fast-path performance #8445

Open
andrewlock wants to merge 8 commits into andrew/client-side-stats/fnvhash from andrew/client-side-stats/peer-tag-improvements
@andrewlock
Member

Summary of changes

Improves the performance of peer-tag hash calculation for "fast path" cases (where a bucket already exists for the span's key).

Reason for change

We have to calculate the hash of all peer tags as part of the client-side-stats bucketing calculations, which involves encoding them as UTF-8. Additionally, when we send the buckets, we have to send the tags as UTF-8.

In the initial CSS 1.2.0 implementation, we encoded the tags every time we ran a calculation. That allocated for every span that had peer tags, which is generally quite expensive. In this PR, we split the encoding into two passes: the first is a zero-allocation implementation (amortized zero on .NET Framework) that only calculates the hash, and the second, which runs only if we need the "real" encoded tags, does the encoding again to produce them.

Implementation details

  • Split the peer tags work in two: one pass to calculate the peer tags hash, and one pass to get the actual tags as key:value
  • In the hash-calculation stage, we can use stackalloc on .NET Core, and an array pool implementation on .NET Framework etc.
  • As additional optimizations, we also
    • Pre-encode the peer tag keys to UTF-8, so that we only need to encode each key once for the hash calculation.
    • Pass key details from BuildKey to GetEncodedPeerTags (i.e. is this a "base service"-only tag and, if so, what's the tag value; otherwise, how big does the peer tag list need to be)
    • Skip all of the pre-calculation if client-side stats is disabled (or for trace filters)
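
The hashing stage described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `PeerTagHashSketch` class, its `Append` and `HashTag` methods, and the 256-byte stack limit are all hypothetical names and values, and it assumes a 64-bit FNV-1a hash and a runtime that has the span-based `Encoding.GetBytes` overload (.NET Core 2.1 or later). The key is assumed to be pre-encoded once up front, so only the tag value is encoded per span.

```csharp
using System;
using System.Buffers;
using System.Text;

// Hypothetical sketch: hash a peer tag without allocating on the common path.
public static class PeerTagHashSketch
{
    // Standard 64-bit FNV-1a constants.
    public const ulong FnvOffsetBasis = 14695981039346656037UL;
    private const ulong FnvPrime = 1099511628211UL;

    // Assumed threshold below which the scratch buffer lives on the stack.
    private const int StackLimit = 256;

    // Folds a run of bytes into a running FNV-1a hash.
    public static ulong Append(ulong hash, ReadOnlySpan<byte> bytes)
    {
        foreach (var b in bytes)
        {
            hash = unchecked((hash ^ b) * FnvPrime);
        }

        return hash;
    }

    // Hashes a pre-encoded UTF-8 key followed by the UTF-8 bytes of the value.
    public static ulong HashTag(ulong hash, byte[] preEncodedKey, string value)
    {
        hash = Append(hash, preEncodedKey);

        int maxBytes = Encoding.UTF8.GetMaxByteCount(value.Length);
        byte[] rented = null;

        // Small values are encoded into stack memory; large ones rent from the pool.
        Span<byte> buffer = maxBytes <= StackLimit
            ? stackalloc byte[StackLimit]
            : (rented = ArrayPool<byte>.Shared.Rent(maxBytes));

        try
        {
            int written = Encoding.UTF8.GetBytes(value.AsSpan(), buffer);
            return Append(hash, buffer.Slice(0, written));
        }
        finally
        {
            if (rented != null)
            {
                ArrayPool<byte>.Shared.Return(rented);
            }
        }
    }
}
```

Because FNV-1a is a streaming hash, feeding the key bytes and then the value bytes gives the same result as hashing the concatenated buffer, which is why the encoded key:value string never has to be materialized just to find the bucket.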

There are some possible future optimizations, not implemented in this PR:

  • Track which tags require IP quantization to avoid re-doing it. (Maybe we should allow-list the quantization anyway, so that it only applies to specific tags? Or, alternatively, use a block-list?)
  • Instead of encoding tag:value, allow writing the pre-computed byte[] to MessagePackBinary and appending the value. This is doable, but requires updating the MessagePackBinary implementation to support it, so I considered it out of scope for now.

Test coverage

Mostly covered by existing tests, but added some additional unit tests that directly compare the hashing to values used in Go agent tests.

I also did some benchmarking. The key thing is that the ClientSpanWithPeerTags path is now zero-allocation (and it's nice that the slow path, GetEncodedPeerTags, still allocates less than before, even if it's slower overall).

| Method | Runtime | Mean | Error | Allocated |
|---|---|---|---|---|
| BuildKey_SimpleSpan_Before | .NET 10.0 | 44.34 ns | 0.906 ns | - |
| BuildKey_SimpleSpan_After | .NET 10.0 | 39.17 ns | 0.445 ns | - |
| BuildKey_SimpleSpan_Before | .NET 6.0 | 93.13 ns | 1.837 ns | - |
| BuildKey_SimpleSpan_After | .NET 6.0 | 87.56 ns | 0.839 ns | - |
| BuildKey_SimpleSpan_Before | .NET Core 3.1 | 192.53 ns | 2.572 ns | - |
| BuildKey_SimpleSpan_After | .NET Core 3.1 | 217.32 ns | 4.362 ns | - |
| BuildKey_SimpleSpan_Before | .NET Framework 4.8 | 131.98 ns | 2.621 ns | - |
| BuildKey_SimpleSpan_After | .NET Framework 4.8 | 175.20 ns | 7.647 ns | - |
| BuildKey_ClientSpanNoPeerTags_Before | .NET 10.0 | 160.62 ns | 2.622 ns | - |
| BuildKey_ClientSpanNoPeerTags_After | .NET 10.0 | 151.27 ns | 1.995 ns | - |
| BuildKey_ClientSpanNoPeerTags_Before | .NET 6.0 | 243.62 ns | 3.199 ns | - |
| BuildKey_ClientSpanNoPeerTags_After | .NET 6.0 | 232.55 ns | 2.975 ns | - |
| BuildKey_ClientSpanNoPeerTags_Before | .NET Core 3.1 | 550.47 ns | 9.998 ns | - |
| BuildKey_ClientSpanNoPeerTags_After | .NET Core 3.1 | 680.42 ns | 10.022 ns | - |
| BuildKey_ClientSpanNoPeerTags_Before | .NET Framework 4.8 | 442.88 ns | 8.854 ns | - |
| BuildKey_ClientSpanNoPeerTags_After | .NET Framework 4.8 | 682.12 ns | 13.468 ns | - |
| BuildKey_ClientSpanWithPeerTags_Before | .NET 10.0 | 800.82 ns | 15.062 ns | 840 B |
| BuildKey_ClientSpanWithPeerTags_After | .NET 10.0 | 575.04 ns | 5.604 ns | - |
| BuildKey_ClientSpanWithPeerTags_Before | .NET 6.0 | 1,076.07 ns | 13.124 ns | 840 B |
| BuildKey_ClientSpanWithPeerTags_After | .NET 6.0 | 1,440.26 ns | 28.755 ns | - |
| BuildKey_ClientSpanWithPeerTags_Before | .NET Core 3.1 | 1,531.27 ns | 13.509 ns | 840 B |
| BuildKey_ClientSpanWithPeerTags_After | .NET Core 3.1 | 1,774.35 ns | 89.601 ns | - |
| BuildKey_ClientSpanWithPeerTags_Before | .NET Framework 4.8 | 1,682.15 ns | 17.082 ns | 859 B |
| BuildKey_ClientSpanWithPeerTags_After | .NET Framework 4.8 | 2,040.90 ns | 162.895 ns | - |
| BuildKey_ClientSpanWithPeerTags_GetEncodedPeerTags | .NET 10.0 | 1,141.85 ns | 9.980 ns | 760 B |
| BuildKey_ClientSpanWithPeerTags_GetEncodedPeerTags | .NET 6.0 | 2,720.55 ns | 53.097 ns | 760 B |
| BuildKey_ClientSpanWithPeerTags_GetEncodedPeerTags | .NET Core 3.1 | 3,534.28 ns | 141.234 ns | 760 B |
| BuildKey_ClientSpanWithPeerTags_GetEncodedPeerTags | .NET Framework 4.8 | 3,438.80 ns | 84.352 ns | 778 B |

Details

// <copyright file="StatsAggregatorBenchmark.cs" company="Datadog">
// Unless explicitly stated otherwise all files in this repository are licensed under the Apache 2 License.
// This product includes software developed at Datadog (https://www.datadoghq.com/). Copyright 2017 Datadog, Inc.
// </copyright>

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using Datadog.Trace;
using Datadog.Trace.Agent;
using Datadog.Trace.Agent.DiscoveryService;
using Datadog.Trace.Configuration;
using Datadog.Trace.Tagging;

namespace Benchmarks.Trace;

/// <summary>
/// StatsAggregator.BuildKey benchmarks
/// </summary>
[MemoryDiagnoser]
[BenchmarkCategory(Constants.TracerCategory)]
public class StatsAggregatorBenchmark
{
    private static readonly List<StatsAggregator.PeerTagKey> PeerTagKeys =
    [
        new("_dd.base_service"),
        new("aws.queue.name"),
        new("aws.queue.url"),
        new("aws.s3.bucket"),
        new("aws.stream.name"),
        new("bucketname"),
        new("db.couchbase.seed.nodes"),
        new("db.hostname"),
        new("db.instance"),
        new("db.system"),
        new("messaging.destination"),
        new("messaging.kafka.bootstrap.servers"),
        new("messaging.rabbitmq.exchange"),
        new("messaging.system"),
        new("network.destination.name"),
        new("peer.hostname"),
        new("peer.service"),
        new("server.address"),
        new("topic"),
    ];

    private StatsAggregator _aggregator;
    private Span _simpleSpan;
    private Span _clientSpanNoPeerTags;
    private Span _clientSpanWithPeerTags;
    private StatsAggregationKey _key;
    private List<byte[]> _encoded;

    [GlobalSetup]
    public void GlobalSetup()
    {
        _aggregator = new StatsAggregator(
            new NoOpApi(),
            new TracerSettings(),
            NullDiscoveryService.Instance,
            isOtlp: false);

        var now = DateTimeOffset.UtcNow;

        // Simple span: no span kind, no peer tags — exercises the "internal" fast path
        _simpleSpan = CreateSpan(now, "web-service", "web.request", "GET /api/users", "web");

        // Client span with SpanKind but no matching peer tags — iterates peer tag keys but finds nothing
        var clientTags = new HttpTags { HttpMethod = "GET", HttpStatusCode = "200" };
        _clientSpanNoPeerTags = CreateSpan(now, "web-service", "http.request", "GET /api/orders", "http", clientTags);

        // Client span with several peer tags set — the most expensive path
        var peerTags = new HttpTags { HttpMethod = "POST", HttpStatusCode = "200" };
        _clientSpanWithPeerTags = CreateSpan(now, "web-service", "http.request", "POST /api/data", "http", peerTags);
        _clientSpanWithPeerTags.Tags.SetTag("peer.service", "remote-service");
        _clientSpanWithPeerTags.Tags.SetTag("db.instance", "i-1234");
        _clientSpanWithPeerTags.Tags.SetTag("db.system", "postgres");
        _clientSpanWithPeerTags.Tags.SetTag("server.address", "db.example.com");
        _clientSpanWithPeerTags.Tags.SetTag("network.destination.name", "db.example.com");

    }

    [GlobalCleanup]
    public void GlobalCleanup()
    {
        _aggregator.DisposeAsync().GetAwaiter().GetResult();
        var value = _key;
        var encoded = _encoded;
    }

    /// <summary>
    /// BuildKey for a simple span with no span kind (internal fast-path, no peer tag iteration).
    /// </summary>
    [Benchmark]
    public void BuildKey_SimpleSpan()
    {
        _key = _aggregator.BuildKey(_simpleSpan, PeerTagKeys, out _);
    }

    /// <summary>
    /// BuildKey for a client span that has no matching peer tags (iterates all peer tag keys, finds none).
    /// </summary>
    [Benchmark]
    public void BuildKey_ClientSpanNoPeerTags()
    {
        _key = _aggregator.BuildKey(_clientSpanNoPeerTags, PeerTagKeys, out _);
    }

    /// <summary>
    /// BuildKey for a client span with several peer tags set (UTF-8 encoding + FNV hashing).
    /// </summary>
    [Benchmark]
    public void BuildKey_ClientSpanWithPeerTags()
    {
        _key = _aggregator.BuildKey(_clientSpanWithPeerTags, PeerTagKeys, out var results);
    }

    /// <summary>
    /// BuildKey followed by GetEncodedPeerTags for a client span with several peer tags set (the slow path, which also materializes the UTF-8 encoded tags).
    /// </summary>
    [Benchmark]
    public void BuildKey_ClientSpanWithPeerTags_GetEncodedPeerTags()
    {
        _key = _aggregator.BuildKey(_clientSpanWithPeerTags, PeerTagKeys, out var results);
        _encoded = StatsAggregator.GetEncodedPeerTags(_clientSpanWithPeerTags, PeerTagKeys, in results);
    }

    private static Span CreateSpan(DateTimeOffset start, string serviceName, string operationName, string resourceName, string type, ITags tags = null)
    {
        var tracer = Benchmarks.Trace.Asm.EmptyDatadogTracer.Instance;
        var traceContext = new TraceContext(tracer);
        var context = new SpanContext(null, traceContext, serviceName);
        var span = new Span(context, start, tags);
        span.OperationName = operationName;
        span.ResourceName = resourceName;
        span.Type = type;
        return span;
    }

    private sealed class NoOpApi : IApi
    {
        public TracesEncoding TracesEncoding => TracesEncoding.DatadogV0_4;

        public Task<bool> Ping() => Task.FromResult(true);

        public Task<bool> SendTracesAsync(ArraySegment<byte> traces, int numberOfTraces, bool statsComputationEnabled, long numberOfDroppedP0Traces, long numberOfDroppedP0Spans, bool apmTracingEnabled = true)
            => Task.FromResult(true);

        public Task<bool> SendStatsAsync(StatsBuffer stats, long bucketDuration, int tracerObfuscationVersion)
            => Task.FromResult(true);
    }
}

Other details

Part of a stack:

@andrewlock added labels Apr 13, 2026:
  • area:tracer: The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations)
  • type:performance: Performance, speed, latency, resource usage (CPU, memory)
@andrewlock requested a review from a team as a code owner April 13, 2026 12:34
@dd-trace-dotnet-ci-bot

dd-trace-dotnet-ci-bot bot commented Apr 13, 2026

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing This PR (8445) and master.

✅ No regressions detected - check the details below

Full Metrics Comparison

FakeDbCommand

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
|---|---|---|---|---|
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 72.00 ± (71.93 - 72.25) ms | 71.12 ± (71.17 - 71.47) ms | -1.2% | |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 75.72 ± (75.64 - 76.02) ms | 76.45 ± (76.26 - 76.64) ms | +1.0% | ✅⬆️ |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1069.15 ± (1071.14 - 1080.01) ms | 1063.89 ± (1064.62 - 1069.88) ms | -0.5% | |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 22.29 ± (22.26 - 22.32) ms | 22.19 ± (22.15 - 22.22) ms | -0.5% | |
| process.time_to_main_ms | 83.82 ± (83.62 - 84.01) ms | 83.37 ± (83.17 - 83.58) ms | -0.5% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.92 ± (10.92 - 10.93) MB | 10.93 ± (10.93 - 10.93) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 12 ± (12 - 12) | 12 ± (12 - 12) | +0.0% | |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 22.31 ± (22.27 - 22.34) ms | 22.16 ± (22.12 - 22.20) ms | -0.6% | |
| process.time_to_main_ms | 85.12 ± (84.96 - 85.29) ms | 83.92 ± (83.72 - 84.12) ms | -1.4% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.96 ± (10.96 - 10.97) MB | 10.95 ± (10.95 - 10.96) MB | -0.1% | |
| runtime.dotnet.threads.count | 13 ± (13 - 13) | 13 ± (13 - 13) | +0.0% | |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 227.08 ± (225.83 - 228.34) ms | 226.96 ± (225.56 - 228.35) ms | -0.1% | |
| process.time_to_main_ms | 518.37 ± (517.13 - 519.61) ms | 518.60 ± (517.33 - 519.86) ms | +0.0% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 48.42 ± (48.39 - 48.45) MB | 48.45 ± (48.41 - 48.48) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.0% | |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 21.04 ± (21.00 - 21.08) ms | 20.82 ± (20.79 - 20.85) ms | -1.0% | |
| process.time_to_main_ms | 72.40 ± (72.24 - 72.56) ms | 71.76 ± (71.61 - 71.92) ms | -0.9% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.63 ± (10.63 - 10.63) MB | 10.63 ± (10.63 - 10.64) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 21.00 ± (20.97 - 21.04) ms | 20.90 ± (20.87 - 20.94) ms | -0.5% | |
| process.time_to_main_ms | 73.79 ± (73.63 - 73.95) ms | 73.45 ± (73.29 - 73.61) ms | -0.5% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.68 ± (10.68 - 10.69) MB | 10.75 ± (10.74 - 10.75) MB | +0.6% | ✅⬆️ |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 385.43 ± (383.54 - 387.32) ms | 383.76 ± (381.85 - 385.66) ms | -0.4% | |
| process.time_to_main_ms | 517.75 ± (516.81 - 518.69) ms | 519.45 ± (518.53 - 520.37) ms | +0.3% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 50.08 ± (50.05 - 50.11) MB | 50.11 ± (50.08 - 50.13) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.0% | ✅⬆️ |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 19.18 ± (19.14 - 19.21) ms | 19.16 ± (19.12 - 19.19) ms | -0.1% | |
| process.time_to_main_ms | 71.25 ± (71.08 - 71.42) ms | 71.36 ± (71.21 - 71.51) ms | +0.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.69 ± (7.68 - 7.70) MB | 7.69 ± (7.69 - 7.70) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 19.17 ± (19.13 - 19.20) ms | 19.19 ± (19.16 - 19.23) ms | +0.2% | ✅⬆️ |
| process.time_to_main_ms | 72.34 ± (72.21 - 72.48) ms | 72.39 ± (72.26 - 72.52) ms | +0.1% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.73 ± (7.72 - 7.74) MB | 7.74 ± (7.73 - 7.75) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 306.20 ± (303.93 - 308.48) ms | 304.76 ± (302.54 - 306.98) ms | -0.5% | |
| process.time_to_main_ms | 478.84 ± (477.97 - 479.70) ms | 478.44 ± (477.72 - 479.15) ms | -0.1% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 37.11 ± (37.09 - 37.14) MB | 37.14 ± (37.11 - 37.17) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 27 ± (27 - 27) | 27 ± (27 - 27) | -0.5% | |

HttpMessageHandler

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
|---|---|---|---|---|
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 190.29 ± (190.30 - 190.99) ms | 191.69 ± (191.98 - 192.72) ms | +0.7% | ✅⬆️ |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 193.46 ± (193.52 - 193.92) ms | 195.28 ± (195.39 - 196.23) ms | +0.9% | ✅⬆️ |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1139.43 ± (1138.81 - 1144.75) ms | 1135.47 ± (1137.27 - 1144.12) ms | -0.3% | |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 185.20 ± (184.92 - 185.49) ms | 185.06 ± (184.82 - 185.30) ms | -0.1% | |
| process.time_to_main_ms | 79.54 ± (79.37 - 79.70) ms | 79.64 ± (79.50 - 79.79) ms | +0.1% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 16.18 ± (16.16 - 16.21) MB | 16.10 ± (16.07 - 16.12) MB | -0.5% | |
| runtime.dotnet.threads.count | 20 ± (20 - 20) | 20 ± (20 - 20) | +0.4% | ✅⬆️ |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 184.57 ± (184.34 - 184.81) ms | 184.84 ± (184.57 - 185.10) ms | +0.1% | ✅⬆️ |
| process.time_to_main_ms | 80.97 ± (80.86 - 81.09) ms | 81.27 ± (81.12 - 81.41) ms | +0.4% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 16.16 ± (16.07 - 16.26) MB | 16.19 ± (16.16 - 16.22) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 21 ± (20 - 21) | 21 ± (21 - 21) | +0.6% | ✅⬆️ |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 392.10 ± (390.63 - 393.58) ms | 392.30 ± (390.95 - 393.65) ms | +0.1% | ✅⬆️ |
| process.time_to_main_ms | 504.22 ± (503.13 - 505.31) ms | 505.26 ± (503.70 - 506.81) ms | +0.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 58.64 ± (58.42 - 58.86) MB | 58.87 ± (58.66 - 59.08) MB | +0.4% | ✅⬆️ |
| runtime.dotnet.threads.count | 30 ± (30 - 30) | 30 ± (30 - 30) | +0.3% | ✅⬆️ |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 189.67 ± (189.36 - 189.98) ms | 189.72 ± (189.48 - 189.95) ms | +0.0% | ✅⬆️ |
| process.time_to_main_ms | 69.29 ± (69.14 - 69.45) ms | 69.54 ± (69.42 - 69.67) ms | +0.4% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 15.86 ± (15.68 - 16.03) MB | 15.89 ± (15.73 - 16.06) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 18 ± (18 - 18) | 18 ± (18 - 18) | +0.3% | ✅⬆️ |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 188.26 ± (188.11 - 188.41) ms | 188.52 ± (188.28 - 188.76) ms | +0.1% | ✅⬆️ |
| process.time_to_main_ms | 70.24 ± (70.19 - 70.30) ms | 70.37 ± (70.31 - 70.44) ms | +0.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.00 ± (15.83 - 16.17) MB | 15.80 ± (15.63 - 15.96) MB | -1.3% | |
| runtime.dotnet.threads.count | 19 ± (18 - 19) | 19 ± (19 - 19) | +0.6% | ✅⬆️ |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 594.85 ± (591.84 - 597.87) ms | 599.00 ± (596.50 - 601.50) ms | +0.7% | ✅⬆️ |
| process.time_to_main_ms | 508.09 ± (507.18 - 508.99) ms | 509.54 ± (508.65 - 510.44) ms | +0.3% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 61.67 ± (61.58 - 61.77) MB | 61.58 ± (61.48 - 61.68) MB | -0.2% | |
| runtime.dotnet.threads.count | 30 ± (30 - 30) | 30 ± (30 - 30) | -0.3% | |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 185.99 ± (185.74 - 186.25) ms | 186.84 ± (186.59 - 187.09) ms | +0.5% | ✅⬆️ |
| process.time_to_main_ms | 68.37 ± (68.25 - 68.49) ms | 68.91 ± (68.76 - 69.05) ms | +0.8% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 11.72 ± (11.65 - 11.79) MB | 11.73 ± (11.67 - 11.80) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 18 ± (18 - 18) | 18 ± (17 - 18) | -2.1% | |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 185.71 ± (185.49 - 185.92) ms | 186.11 ± (185.93 - 186.28) ms | +0.2% | ✅⬆️ |
| process.time_to_main_ms | 69.62 ± (69.56 - 69.69) ms | 69.77 ± (69.72 - 69.83) ms | +0.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 11.67 ± (11.58 - 11.77) MB | 11.65 ± (11.54 - 11.75) MB | -0.2% | |
| runtime.dotnet.threads.count | 19 ± (18 - 19) | 18 ± (18 - 19) | -0.7% | |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 521.08 ± (518.52 - 523.65) ms | 517.25 ± (514.60 - 519.90) ms | -0.7% | |
| process.time_to_main_ms | 466.62 ± (465.98 - 467.26) ms | 467.41 ± (466.69 - 468.12) ms | +0.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 50.75 ± (50.72 - 50.78) MB | 50.81 ± (50.78 - 50.84) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 30 ± (30 - 30) | 30 ± (30 - 30) | +0.1% | ✅⬆️ |

Comparison explanation

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:

  • Welch's t-test with a significance level of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

Duration charts
FakeDbCommand (.NET Framework 4.8)
gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8445) - mean (71ms)  : 69, 74
    master - mean (72ms)  : 70, 74

    section Bailout
    This PR (8445) - mean (76ms)  : 75, 78
    master - mean (76ms)  : 74, 78

    section CallTarget+Inlining+NGEN
    This PR (8445) - mean (1,067ms)  : 1029, 1105
    master - mean (1,076ms)  : 1010, 1141

FakeDbCommand (.NET Core 3.1)
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8445) - mean (112ms)  : 108, 116
    master - mean (114ms)  : 110, 117

    section Bailout
    This PR (8445) - mean (112ms)  : 110, 115
    master - mean (115ms)  : 112, 118

    section CallTarget+Inlining+NGEN
    This PR (8445) - mean (782ms)  : 761, 803
    master - mean (784ms)  : 764, 805

FakeDbCommand (.NET 6)
gantt
    title Execution time (ms) FakeDbCommand (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8445) - mean (99ms)  : 95, 102
    master - mean (100ms)  : 96, 103

    section Bailout
    This PR (8445) - mean (101ms)  : 98, 103
    master - mean (101ms)  : 98, 104

    section CallTarget+Inlining+NGEN
    This PR (8445) - mean (931ms)  : 896, 967
    master - mean (932ms)  : 899, 964

FakeDbCommand (.NET 8)
gantt
    title Execution time (ms) FakeDbCommand (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8445) - mean (98ms)  : 95, 101
    master - mean (98ms)  : 94, 103

    section Bailout
    This PR (8445) - mean (99ms)  : 97, 101
    master - mean (99ms)  : 97, 101

    section CallTarget+Inlining+NGEN
    This PR (8445) - mean (814ms)  : 777, 851
    master - mean (814ms)  : 781, 848

HttpMessageHandler (.NET Framework 4.8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8445) - mean (192ms)  : 187, 198
    master - mean (191ms)  : 187, 194

    section Bailout
    This PR (8445) - mean (196ms)  : 192, 200
    master - mean (194ms)  : 192, 196

    section CallTarget+Inlining+NGEN
    This PR (8445) - mean (1,141ms)  : 1092, 1189
    master - mean (1,142ms)  : 1100, 1184

HttpMessageHandler (.NET Core 3.1)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8445) - mean (273ms)  : 269, 276
    master - mean (273ms)  : 269, 277

    section Bailout
    This PR (8445) - mean (274ms)  : 270, 278
    master - mean (273ms)  : 270, 276

    section CallTarget+Inlining+NGEN
    This PR (8445) - mean (924ms)  : 901, 947
    master - mean (924ms)  : 895, 953

HttpMessageHandler (.NET 6)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8445) - mean (267ms)  : 264, 270
    master - mean (267ms)  : 262, 272

    section Bailout
    This PR (8445) - mean (267ms)  : 264, 270
    master - mean (266ms)  : 264, 268

    section CallTarget+Inlining+NGEN
    This PR (8445) - mean (1,137ms)  : 1096, 1178
    master - mean (1,133ms)  : 1090, 1176

HttpMessageHandler (.NET 8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8445) - mean (265ms)  : 261, 269
    master - mean (264ms)  : 261, 267

    section Bailout
    This PR (8445) - mean (265ms)  : 263, 268
    master - mean (265ms)  : 261, 268

    section CallTarget+Inlining+NGEN
    This PR (8445) - mean (1,016ms)  : 981, 1052
    master - mean (1,019ms)  : 980, 1058


@pr-commenter

pr-commenter bot commented Apr 13, 2026

Benchmarks

Benchmark execution time: 2026-04-16 09:06:29

Comparing candidate commit 7817c48 in PR branch andrew/client-side-stats/peer-tag-improvements with baseline commit b176976 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 27 metrics, 0 unstable metrics, 87 known flaky benchmarks.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
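
The rule illustrated by the two diagrams above can be sketched as a small check. This is only an illustration under stated assumptions, not the benchmarking platform's code: the `SignificanceSketch` class, the `Verdict` enum, and `ClassifyExecutionTime` are hypothetical names, the threshold is assumed to apply symmetrically around 0%, and the metric is assumed to be one where lower is better (such as execution time).

```csharp
using System;

// Hypothetical sketch of the "CI entirely outside the threshold" rule.
public static class SignificanceSketch
{
    public enum Verdict
    {
        NotSignificant,
        Better,
        Worse,
    }

    // All arguments are relative differences in percent, with the baseline as reference.
    public static Verdict ClassifyExecutionTime(double ciLowerPct, double ciUpperPct, double thresholdPct)
    {
        if (ciLowerPct > ciUpperPct)
        {
            throw new ArgumentException("The lower bound must not exceed the upper bound.");
        }

        // The whole interval lies above +threshold: the candidate is slower.
        if (ciLowerPct > thresholdPct)
        {
            return Verdict.Worse;
        }

        // The whole interval lies below -threshold: the candidate is faster.
        if (ciUpperPct < -thresholdPct)
        {
            return Verdict.Better;
        }

        // The interval overlaps the [-threshold, +threshold] band.
        return Verdict.NotSignificant;
    }
}
```

With an assumed 1% threshold, the second diagram's interval of [1.3%, 3.1%] classifies as `Worse`, while the first diagram's interval of [-0.6%, 1.2%] overlaps the band and classifies as `NotSignificant`.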

Known flaky benchmarks

These benchmarks are marked as flaky and will not trigger a failure. Modify FLAKY_BENCHMARKS_REGEX to control which benchmarks are marked as flaky.

scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.008%; +0.005%]
  • ignore execution_time [-1003.130µs; -30.441µs] or [-0.498%; -0.015%]
  • ignore throughput [-920.364op/s; -362.962op/s] or [-1.091%; -0.430%]

scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.006%; +0.007%]
  • ignore execution_time [-1481.501µs; +2054.290µs] or [-0.739%; +1.025%]
  • 🟩 throughput [+10249.162op/s; +12507.939op/s] or [+8.615%; +10.513%]

scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.003%; +0.007%]
  • ignore execution_time [+0.743ms; +2.938ms] or [+0.374%; +1.478%]
  • ignore throughput [-1524.100op/s; -317.290op/s] or [-1.550%; -0.323%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net472

  • ignore allocated_mem [-20 bytes; -19 bytes] or [-0.613%; -0.600%]
  • 🟥 execution_time [+304.183ms; +305.830ms] or [+150.946%; +151.763%]
  • ignore throughput [+10.954op/s; +14.634op/s] or [+1.971%; +2.633%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.009%; +0.002%]
  • 🟥 execution_time [+381.903ms; +384.841ms] or [+301.726%; +304.048%]
  • ignore throughput [+2.208op/s; +5.743op/s] or [+0.291%; +0.757%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1

  • ignore allocated_mem [+1 bytes; +2 bytes] or [+0.065%; +0.075%]
  • 🟥 execution_time [+396.674ms; +399.576ms] or [+351.041%; +353.609%]
  • ignore throughput [-5.612op/s; -1.776op/s] or [-0.793%; -0.251%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody net472

  • 🟥 allocated_mem [+1.308KB; +1.308KB] or [+27.529%; +27.541%]
  • ignore execution_time [+156.932µs; +775.678µs] or [+0.078%; +0.387%]
  • ignore throughput [-4208.471op/s; -3805.513op/s] or [-3.274%; -2.961%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody net6.0

  • 🟥 allocated_mem [+471 bytes; +472 bytes] or [+9.977%; +9.987%]
  • 🟩 execution_time [-15.982ms; -11.795ms] or [-7.464%; -5.508%]
  • ignore throughput [+5026.561op/s; +7863.829op/s] or [+3.669%; +5.740%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody netcoreapp3.1

  • 🟥 allocated_mem [+1.272KB; +1.272KB] or [+27.502%; +27.510%]
  • ignore execution_time [-11.563ms; -7.435ms] or [-5.506%; -3.541%]
  • ignore throughput [-1646.426op/s; +633.517op/s] or [-1.489%; +0.573%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody net472

  • 🟥 allocated_mem [+1.307KB; +1.307KB] or [+105.746%; +105.759%]
  • ignore execution_time [-1104.883µs; -354.616µs] or [-0.550%; -0.177%]
  • 🟥 throughput [-248575.897op/s; -245257.623op/s] or [-25.381%; -25.042%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody net6.0

  • 🟥 allocated_mem [+471 bytes; +472 bytes] or [+38.558%; +38.566%]
  • 🟩 execution_time [-25.833ms; -20.928ms] or [-11.520%; -9.333%]
  • ignore throughput [-60591.383op/s; -37524.573op/s] or [-6.473%; -4.009%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody netcoreapp3.1

  • 🟥 allocated_mem [+1.272KB; +1.272KB] or [+105.292%; +105.304%]
  • ignore execution_time [+0.969ms; +5.277ms] or [+0.484%; +2.634%]
  • 🟥 throughput [-151960.329op/s; -135786.835op/s] or [-21.834%; -19.510%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.007%; +0.003%]
  • ignore execution_time [-1014.543µs; +8.122µs] or [-0.506%; +0.004%]
  • ignore throughput [-534.259op/s; +283.915op/s] or [-0.360%; +0.191%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.004%; +0.003%]
  • ignore execution_time [-1163.272µs; +2369.451µs] or [-0.587%; +1.196%]
  • 🟩 throughput [+11807.959op/s; +14754.222op/s] or [+7.513%; +9.388%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.007%; +0.003%]
  • ignore execution_time [-8.556ms; +1.355ms] or [-4.362%; +0.691%]
  • ignore throughput [+6204.760op/s; +9160.707op/s] or [+4.943%; +7.298%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.008%; +0.004%]
  • ignore execution_time [-276.980µs; -61.787µs] or [-0.138%; -0.031%]
  • ignore throughput [+70313.442op/s; +82821.295op/s] or [+2.139%; +2.520%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.006%; +0.008%]
  • ignore execution_time [-3.170ms; -2.138ms] or [-1.567%; -1.057%]
  • 🟩 throughput [+488750.505op/s; +506466.879op/s] or [+16.297%; +16.888%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.004%; +0.004%]
  • 🟩 execution_time [-19.012ms; -14.641ms] or [-8.764%; -6.749%]
  • 🟩 throughput [+177678.089op/s; +232286.081op/s] or [+7.053%; +9.220%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs net472

  • ignore allocated_mem [+0 bytes; +2 bytes] or [-0.001%; +0.007%]
  • 🟥 execution_time [+299.543ms; +300.168ms] or [+149.672%; +149.984%]
  • ignore throughput [+68.391op/s; +109.018op/s] or [+0.755%; +1.204%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs net6.0

  • ignore allocated_mem [-1 bytes; +2 bytes] or [-0.004%; +0.008%]
  • 🟥 execution_time [+299.082ms; +302.207ms] or [+150.828%; +152.404%]
  • ignore throughput [+401.540op/s; +611.416op/s] or [+3.071%; +4.676%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs netcoreapp3.1

  • ignore allocated_mem [-1 bytes; +2 bytes] or [-0.004%; +0.008%]
  • 🟥 execution_time [+300.223ms; +302.697ms] or [+151.229%; +152.475%]
  • ignore throughput [+86.837op/s; +215.923op/s] or [+0.838%; +2.085%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net472

  • ignore allocated_mem [+5 bytes; +6 bytes] or [+0.277%; +0.290%]
  • 🟥 execution_time [+296.509ms; +297.326ms] or [+145.633%; +146.035%]
  • ignore throughput [-14.079op/s; -5.154op/s] or [-0.373%; -0.137%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.004%; +0.009%]
  • 🟥 execution_time [+294.624ms; +296.207ms] or [+144.031%; +144.805%]
  • ignore throughput [+118.893op/s; +157.765op/s] or [+1.727%; +2.292%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.004%; +0.009%]
  • 🟥 execution_time [+300.263ms; +301.559ms] or [+150.071%; +150.719%]
  • ignore throughput [+52.401op/s; +72.428op/s] or [+1.040%; +1.438%]

scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+nan%; +nan%]
  • ignore execution_time [+2.658µs; +6.118µs] or [+0.546%; +1.256%]
  • ignore throughput [-25.351op/s; -11.010op/s] or [-1.235%; -0.536%]

scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.000%; +0.010%]
  • ignore execution_time [+17.740µs; +44.379µs] or [+4.069%; +10.178%]
  • ignore throughput [-218.709op/s; -98.846op/s] or [-9.509%; -4.297%]

scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.003%; +0.003%]
  • ignore execution_time [+7.652µs; +29.586µs] or [+1.639%; +6.339%]
  • ignore throughput [-146.117op/s; -65.486op/s] or [-6.745%; -3.023%]

scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+nan%; +nan%]
  • ignore execution_time [-6.170µs; -2.216µs] or [-1.666%; -0.598%]
  • ignore throughput [+17.042op/s; +45.596op/s] or [+0.631%; +1.689%]

scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.003%; +0.007%]
  • 🟥 execution_time [+21.504µs; +45.074µs] or [+6.865%; +14.390%]
  • 🟥 throughput [-420.882op/s; -222.514op/s] or [-13.120%; -6.936%]

scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.003%; +0.003%]
  • ignore execution_time [-12.611µs; +9.829µs] or [-3.450%; +2.689%]
  • ignore throughput [-106.811op/s; +27.285op/s] or [-3.833%; +0.979%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+nan%; +nan%]
  • 🟥 execution_time [+300.000ms; +300.655ms] or [+149.731%; +150.058%]
  • ignore throughput [-1556283.201op/s; -992576.013op/s] or [-0.780%; -0.497%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net6.0

  • ignore allocated_mem [+113 bytes; +115 bytes] or [+0.633%; +0.644%]
  • unstable execution_time [+372.738ms; +403.324ms] or [+404.995%; +438.228%]
  • 🟩 throughput [+1113.601op/s; +1264.601op/s] or [+9.151%; +10.391%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest netcoreapp3.1

  • ignore allocated_mem [+20 bytes; +22 bytes] or [+0.099%; +0.110%]
  • unstable execution_time [+288.243ms; +326.602ms] or [+218.860%; +247.986%]
  • 🟩 throughput [+707.439op/s; +907.444op/s] or [+6.848%; +8.785%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net472

  • ignore allocated_mem [+2.787KB; +2.791KB] or [+4.951%; +4.959%]
  • unstable execution_time [+275.780ms; +331.500ms] or [+126.801%; +152.420%]
  • 🟥 throughput [-470.021op/s; -414.700op/s] or [-42.588%; -37.576%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net6.0

  • ignore allocated_mem [-1.270KB; -1.268KB] or [-2.995%; -2.990%]
  • unstable execution_time [+181.865ms; +316.744ms] or [+77.503%; +134.983%]
  • 🟥 throughput [-741.137op/s; -657.722op/s] or [-49.434%; -43.870%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1

  • 🟥 allocated_mem [+2.304KB; +2.308KB] or [+5.441%; +5.449%]
  • 🟥 execution_time [+346.745ms; +355.765ms] or [+207.394%; +212.789%]
  • 🟥 throughput [-425.575op/s; -388.938op/s] or [-29.632%; -27.081%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+nan%; +nan%]
  • ignore execution_time [-59.596µs; -45.584µs] or [-2.999%; -2.294%]
  • ignore throughput [+11.973op/s; +15.711op/s] or [+2.379%; +3.122%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+nan%; +nan%]
  • ignore execution_time [-51.296µs; -39.203µs] or [-3.524%; -2.693%]
  • ignore throughput [+19.342op/s; +25.279op/s] or [+2.815%; +3.680%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+nan%; +nan%]
  • ignore execution_time [-179.866µs; -83.346µs] or [-6.257%; -2.900%]
  • ignore throughput [+11.393op/s; +30.695op/s] or [+3.275%; +8.823%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+nan%; +nan%]
  • ignore execution_time [-12.612µs; -8.641µs] or [-1.089%; -0.746%]
  • ignore throughput [+6.528op/s; +9.518op/s] or [+0.756%; +1.102%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+nan%; +nan%]
  • ignore execution_time [-57.236µs; -48.176µs] or [-5.308%; -4.468%]
  • ignore throughput [+43.869op/s; +52.157op/s] or [+4.730%; +5.624%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+nan%; +nan%]
  • ignore execution_time [-81.200µs; +19.440µs] or [-4.350%; +1.041%]
  • unstable throughput [-0.362op/s; +62.625op/s] or [-0.068%; +11.689%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net472

  • ignore allocated_mem [-43 bytes; +21 bytes] or [-0.007%; +0.003%]
  • ignore execution_time [+21.787µs; +43.701µs] or [+0.851%; +1.707%]
  • ignore throughput [-6.365op/s; -3.159op/s] or [-1.630%; -0.809%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net6.0

  • ignore allocated_mem [-38 bytes; +46 bytes] or [-0.006%; +0.007%]
  • 🟩 execution_time [-160.617µs; -121.984µs] or [-8.136%; -6.179%]
  • 🟩 throughput [+35.331op/s; +45.531op/s] or [+6.975%; +8.988%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice netcoreapp3.1

  • ignore allocated_mem [-42 bytes; +23 bytes] or [-0.007%; +0.004%]
  • ignore execution_time [-126.752µs; -89.488µs] or [-3.214%; -2.269%]
  • ignore throughput [+6.114op/s; +8.476op/s] or [+2.411%; +3.342%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.001%; +0.008%]
  • 🟥 execution_time [+301.792ms; +303.524ms] or [+151.977%; +152.849%]
  • ignore throughput [-1491.010op/s; +177.565op/s] or [-0.480%; +0.057%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.006%; +0.004%]
  • 🟥 execution_time [+299.586ms; +300.751ms] or [+150.123%; +150.707%]
  • ignore throughput [+20083.398op/s; +23765.929op/s] or [+3.166%; +3.747%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.006%; +0.004%]
  • 🟥 execution_time [+301.575ms; +304.953ms] or [+151.498%; +153.195%]
  • ignore throughput [+15324.629op/s; +23799.018op/s] or [+3.228%; +5.013%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.008%; +0.004%]
  • 🟥 execution_time [+303.928ms; +305.517ms] or [+152.622%; +153.420%]
  • ignore throughput [+3311.554op/s; +5022.570op/s] or [+1.109%; +1.683%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.007%; +0.006%]
  • 🟥 execution_time [+299.274ms; +300.709ms] or [+147.978%; +148.687%]
  • ignore throughput [+12516.609op/s; +18338.595op/s] or [+2.017%; +2.955%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.006%; +0.004%]
  • 🟥 execution_time [+303.749ms; +307.304ms] or [+153.953%; +155.755%]
  • ignore throughput [-9573.710op/s; -1396.343op/s] or [-2.067%; -0.302%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync net472

  • ignore allocated_mem [+0 bytes; +1 bytes] or [+0.108%; +0.119%]
  • 🟥 execution_time [+301.990ms; +303.528ms] or [+151.572%; +152.344%]
  • ignore throughput [+2728.891op/s; +4517.427op/s] or [+0.708%; +1.172%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.006%; +0.006%]
  • 🟥 execution_time [+299.091ms; +300.771ms] or [+149.069%; +149.907%]
  • 🟩 throughput [+56641.495op/s; +61387.793op/s] or [+11.247%; +12.190%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.006%; +0.006%]
  • 🟥 execution_time [+299.742ms; +302.433ms] or [+149.119%; +150.458%]
  • ignore throughput [-13598.146op/s; -8439.399op/s] or [-3.219%; -1.998%]

scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.003%; +0.006%]
  • ignore execution_time [-1002.132µs; -164.320µs] or [-0.498%; -0.082%]
  • ignore throughput [-3313.859op/s; -2118.005op/s] or [-1.333%; -0.852%]

scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.005%; +0.003%]
  • 🟩 execution_time [-16.664ms; -13.013ms] or [-7.749%; -6.051%]
  • 🟩 throughput [+23939.803op/s; +30745.874op/s] or [+6.567%; +8.434%]

scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.004%; +0.008%]
  • ignore execution_time [-0.745ms; +3.195ms] or [-0.374%; +1.603%]
  • ignore throughput [+9468.345op/s; +15126.733op/s] or [+3.456%; +5.522%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark net472

  • ignore allocated_mem [-4.459KB; -4.431KB] or [-1.623%; -1.613%]
  • ignore execution_time [+5.892µs; +46.147µs] or [+1.455%; +11.399%]
  • ignore throughput [-243.031op/s; -38.706op/s] or [-9.780%; -1.558%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark net6.0

  • 🟩 allocated_mem [-21.041KB; -21.020KB] or [-7.675%; -7.668%]
  • unstable execution_time [-32.924µs; +20.231µs] or [-6.507%; +3.999%]
  • ignore throughput [-67.161op/s; +118.013op/s] or [-3.351%; +5.889%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark netcoreapp3.1

  • 🟩 allocated_mem [-19.782KB; -19.767KB] or [-7.211%; -7.206%]
  • ignore execution_time [-44.637µs; +12.222µs] or [-7.735%; +2.118%]
  • ignore throughput [-24.777op/s; +130.357op/s] or [-1.416%; +7.447%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net472

  • ignore allocated_mem [-2 bytes; +2 bytes] or [-0.005%; +0.006%]
  • ignore execution_time [-95.980ns; +1380.647ns] or [-0.166%; +2.392%]
  • ignore throughput [-384.320op/s; +33.669op/s] or [-2.218%; +0.194%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net6.0

  • ignore allocated_mem [-4 bytes; +0 bytes] or [-0.010%; -0.001%]
  • unstable execution_time [+6.215µs; +10.688µs] or [+14.690%; +25.263%]
  • 🟥 throughput [-4666.900op/s; -2850.978op/s] or [-19.646%; -12.002%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark netcoreapp3.1

  • ignore allocated_mem [-1 bytes; +1 bytes] or [-0.002%; +0.002%]
  • unstable execution_time [-15.575µs; -8.373µs] or [-24.165%; -12.990%]
  • 🟩 throughput [+2304.556op/s; +3787.317op/s] or [+14.139%; +23.236%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog net472

  • ignore allocated_mem [+1 bytes; +2 bytes] or [+0.039%; +0.050%]
  • 🟥 execution_time [+302.658ms; +303.699ms] or [+152.980%; +153.506%]
  • ignore throughput [-123.399op/s; -101.668op/s] or [-2.062%; -1.699%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog net6.0

  • ignore allocated_mem [-1 bytes; +0 bytes] or [-0.027%; -0.017%]
  • 🟥 execution_time [+301.438ms; +304.200ms] or [+153.431%; +154.837%]
  • ignore throughput [-76.786op/s; -1.388op/s] or [-0.952%; -0.017%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.005%; +0.005%]
  • 🟥 execution_time [+300.089ms; +301.891ms] or [+150.231%; +151.134%]
  • ignore throughput [-154.203op/s; -90.126op/s] or [-1.964%; -1.148%]

scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.005%; +0.005%]
  • ignore execution_time [-581.830µs; +85.909µs] or [-0.290%; +0.043%]
  • ignore throughput [-3530.032op/s; -1932.534op/s] or [-0.977%; -0.535%]

scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.004%; +0.007%]
  • ignore execution_time [-131.633µs; +600.373µs] or [-0.066%; +0.300%]
  • 🟩 throughput [+42322.660op/s; +44893.601op/s] or [+8.011%; +8.497%]

scenario:Benchmarks.Trace.RedisBenchmark.SendReceive netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.004%; +0.006%]
  • ignore execution_time [+1.148ms; +4.832ms] or [+0.582%; +2.449%]
  • ignore throughput [+2778.660op/s; +11027.752op/s] or [+0.658%; +2.610%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.005%; +0.006%]
  • 🟥 execution_time [+300.317ms; +301.987ms] or [+149.681%; +150.513%]
  • ignore throughput [-7224.746op/s; -6229.126op/s] or [-4.771%; -4.113%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+0.000%; +0.009%]
  • 🟥 execution_time [+302.548ms; +303.933ms] or [+151.925%; +152.620%]
  • ignore throughput [+1966.946op/s; +3415.213op/s] or [+0.855%; +1.485%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.004%; +0.003%]
  • 🟥 execution_time [+303.422ms; +305.894ms] or [+153.876%; +155.130%]
  • ignore throughput [+703.157op/s; +2543.603op/s] or [+0.396%; +1.433%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+nan%; +nan%]
  • 🟥 execution_time [+299.766ms; +300.517ms] or [+149.525%; +149.900%]
  • 🟩 throughput [+61308833.654op/s; +61561566.949op/s] or [+44.649%; +44.833%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net6.0

  • ignore allocated_mem [+84 bytes; +86 bytes] or [+0.495%; +0.505%]
  • 🟥 execution_time [+421.384ms; +424.344ms] or [+524.066%; +527.748%]
  • 🟩 throughput [+928.582op/s; +1099.722op/s] or [+7.178%; +8.501%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [+nan%; +nan%]
  • 🟥 execution_time [+299.310ms; +300.228ms] or [+149.289%; +149.747%]
  • ignore throughput [+1750055.130op/s; +2689765.173op/s] or [+0.775%; +1.191%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.005%; +0.006%]
  • ignore execution_time [+23.613µs; +490.134µs] or [+0.012%; +0.245%]
  • ignore throughput [+8553.207op/s; +12513.759op/s] or [+0.955%; +1.397%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.005%; +0.007%]
  • ignore execution_time [-4.527ms; -3.417ms] or [-2.217%; -1.674%]
  • 🟩 throughput [+94202.719op/s; +103266.625op/s] or [+8.795%; +9.642%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.003%; +0.006%]
  • ignore execution_time [-0.338ms; +3.777ms] or [-0.171%; +1.911%]
  • 🟩 throughput [+57217.485op/s; +76900.492op/s] or [+6.623%; +8.901%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.000%; +0.009%]
  • ignore execution_time [+0.975µs; +693.050µs] or [+0.000%; +0.346%]
  • ignore throughput [+1003.290op/s; +5158.077op/s] or [+0.092%; +0.472%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.004%; +0.004%]
  • ignore execution_time [+6.244ms; +10.348ms] or [+3.253%; +5.391%]
  • 🟩 throughput [+90105.810op/s; +120607.555op/s] or [+6.974%; +9.335%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.001%; +0.008%]
  • ignore execution_time [-4.303ms; -2.858ms] or [-2.114%; -1.404%]
  • 🟩 throughput [+86881.499op/s; +96757.279op/s] or [+8.629%; +9.610%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.008%; +0.002%]
  • ignore execution_time [-901.036µs; +363.801µs] or [-0.448%; +0.181%]
  • ignore throughput [+7593.354op/s; +10633.719op/s] or [+1.692%; +2.369%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.004%; +0.009%]
  • ignore execution_time [-512.547µs; +1194.192µs] or [-0.256%; +0.596%]
  • 🟩 throughput [+53178.305op/s; +59284.684op/s] or [+9.656%; +10.765%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.006%; +0.004%]
  • ignore execution_time [-0.588ms; +3.531ms] or [-0.295%; +1.774%]
  • 🟩 throughput [+25608.172op/s; +35274.978op/s] or [+5.732%; +7.896%]

scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin net472

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.005%; +0.006%]
  • ignore execution_time [-1283.782µs; -315.773µs] or [-0.640%; -0.157%]
  • ignore throughput [-10721.860op/s; -6036.092op/s] or [-1.569%; -0.883%]

scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin net6.0

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.005%; +0.007%]
  • ignore execution_time [-1015.538µs; +2493.065µs] or [-0.508%; +1.247%]
  • 🟩 throughput [+57135.165op/s; +74486.524op/s] or [+6.383%; +8.322%]

scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin netcoreapp3.1

  • ignore allocated_mem [+0 bytes; +0 bytes] or [-0.005%; +0.005%]
  • ignore execution_time [+1.787ms; +5.687ms] or [+0.907%; +2.888%]
  • ignore throughput [+24712.856op/s; +39440.068op/s] or [+3.451%; +5.507%]

if (config.PeerTags is not null)
if (CanComputeStats.Value)
{
Log.Debug("Stats computation has been enabled.");
Member

Suggested change
- Log.Debug("Stats computation has been enabled.");
+ Log.Debug("Stats computation enabled.");

Member Author

😂 That's the same log as was there before, I just moved it, but sure

else
{
Interlocked.Exchange(ref _peerTagKeys, config.PeerTags);
Log.Warning("Stats computation disabled because the detected agent does not support this feature.");
Member

We're almost certainly going to need an override for this when using the Rust agents, but that's a separate issue...

Member Author

Yeah, how we identify the Rust agent is still an open question...

var spanMetaStructs = jObject["span_meta_structs"]?.Value<bool>() ?? false;
var spanEvents = jObject["span_events"]?.Value<bool>() ?? false;
var peerTags = (jObject["peer_tags"] as JArray)?.Values<string>().Where(x => !string.IsNullOrEmpty(x)).Distinct().OrderBy(x => x).ToList();
var peerTags = (jObject["peer_tags"] as JArray)?.Values<string>().ToList();
Member

:tears-of-joy:

@andrewlock force-pushed the andrew/client-side-stats/peer-tag-improvements branch from 58d4473 to 9daade0 on April 14, 2026 10:08
@andrewlock requested review from a team as code owners on April 14, 2026 10:08
@andrewlock force-pushed the andrew/client-side-stats/fnvhash branch from e94c3c9 to 9b41576 on April 14, 2026 10:08
Contributor

@zacharycmontoya left a comment

LGTM!

@andrewlock force-pushed the andrew/client-side-stats/peer-tag-improvements branch from 9daade0 to 7817c48 on April 16, 2026 07:59
@andrewlock requested review from a team as code owners on April 16, 2026 07:59
@andrewlock force-pushed the andrew/client-side-stats/fnvhash branch from 9b41576 to 25dabec on April 16, 2026 07:59
Labels

area:tracer: The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations)
type:performance: Performance, speed, latency, resource usage (CPU, memory)

3 participants