chore(benchmarks): add PR performance quality gate #220
Benchmarks
Benchmark execution time: 2025-06-12 08:49:29
Comparing candidate commit 178f867 in PR branch
Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.
|
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #220   +/-   ##
=======================================
  Coverage   86.45%   86.45%
=======================================
  Files          80       80
  Lines        5251     5251
=======================================
  Hits         4540     4540
  Misses        711      711
|
07796a0 to 65544df
65544df to 178f867
|
Two questions @dubloom :
|
|
I agree that 20% is a lot (especially because it is from one PR to another). However, according to the ticket description (https://datadoghq.atlassian.net/browse/APMSP-2048), it seems to be the standard percentage. I think @ddyurchenko could give a more relevant answer to that. |
|
👋 Hi folks, Based on the behavior of your benchmark that I see in https://benchmarking.us1.prod.dog/trends?timeStart=1741776100909&timeEnd=1749721300909&trendsTab=per_scenario&projectId=7&branch=main&scenario=BM_TraceTinyCCSource&trendsType=scenario, you can indeed reduce the block threshold to 5% if you want to. I wouldn't reduce it below 3.5%, though. The precision at which BP works depends on a couple of factors (primarily, how stable the hardware is and how stable the benchmark is); we have usually observed that regressions smaller than 2%-3% are not catchable without additional measures. 20% is the safe threshold that won't block you randomly; at the same time, if you get something really disruptive, it will block, and you will have confidence that you need to do something about it. |
|
@dmehala The threshold has been decreased to 5%. It can be increased easily if needed. |
Description
Adds a performance quality gate to dd-trace-cpp. Now, if a PR adds more than 20% overhead to microbenchmarks, a job called check-big-regressions will fail and prevent the PR from being merged.
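For illustration only, the comparison such a gate performs can be sketched as below. This is not the actual check-big-regressions implementation, which is not shown in this PR page. The function names, the nanosecond inputs, and the single-number comparison are all assumptions. The 20% default mirrors the threshold stated above, and the threshold is a parameter so that other values can be tried.

```python
def relative_overhead_pct(baseline_ns: float, candidate_ns: float) -> float:
    """Return the candidate's execution-time change relative to the baseline, in percent.

    Hypothetical helper: positive values mean the candidate is slower.
    """
    if baseline_ns <= 0:
        raise ValueError("baseline_ns must be positive")
    return (candidate_ns - baseline_ns) / baseline_ns * 100.0


def gate_passes(baseline_ns: float, candidate_ns: float,
                threshold_pct: float = 20.0) -> bool:
    """Return True when the overhead does not exceed the blocking threshold.

    Hypothetical helper: a real gate would compare benchmark statistics,
    not two single measurements.
    """
    return relative_overhead_pct(baseline_ns, candidate_ns) <= threshold_pct


if __name__ == "__main__":
    # A 19% slowdown passes a 20% gate; a 25% slowdown does not.
    print(gate_passes(100.0, 119.0))
    print(gate_passes(100.0, 125.0))
    # The same 7% slowdown passes at 20% but is blocked at 5%.
    print(gate_passes(100.0, 107.0))
    print(gate_passes(100.0, 107.0, threshold_pct=5.0))
```

A lower threshold catches smaller regressions, but the gate is also more likely to block on benchmark noise.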
Motivation
Performance quality gates initiative.
Additional Notes
Jira ticket: APMSP-2048