chore(benchmarks): add PR performance quality gate #220

Merged
dubloom merged 4 commits into main from dubloom/chore/pr-perf-quality-gate
Jun 16, 2025

Conversation

@dubloom
Contributor

@dubloom dubloom commented Jun 10, 2025

Description

Adds a performance quality gate to dd-trace-cpp.
If a PR adds more than 20% overhead to the microbenchmarks, a job called check-big-regressions fails and prevents the PR from being merged.
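
For illustration only, the kind of comparison such a gate performs can be sketched as below. This is a minimal sketch, not the actual check-big-regressions job: the function names, the `(baseline, candidate)` result layout, and the timings are all hypothetical, and the real job may compute overhead differently.

```python
# Hypothetical sketch of a threshold-based regression gate.
# Nothing here is taken from the real check-big-regressions job.

def relative_overhead(baseline_ns: float, candidate_ns: float) -> float:
    """Return the candidate's slowdown relative to the baseline (0.20 means 20% slower)."""
    if baseline_ns <= 0:
        raise ValueError("baseline must be positive")
    return (candidate_ns - baseline_ns) / baseline_ns


def find_big_regressions(results: dict, threshold: float = 0.20) -> list:
    """Return the names of benchmarks whose overhead exceeds the threshold.

    `results` maps a benchmark name to a (baseline_ns, candidate_ns) pair.
    """
    return [
        name
        for name, (baseline_ns, candidate_ns) in results.items()
        if relative_overhead(baseline_ns, candidate_ns) > threshold
    ]


if __name__ == "__main__":
    # Made-up timings; only the scenario name BM_TraceTinyCCSource appears in this PR.
    sample = {
        "BM_TraceTinyCCSource": (1000.0, 1250.0),  # 25% slower
        "BM_Hypothetical": (1000.0, 1100.0),       # 10% slower
    }
    failing = find_big_regressions(sample)
    print("regressions over threshold:", failing)
    # A CI job would exit non-zero here when `failing` is not empty.
```

Passing a different `threshold` (for example `0.05`) tightens the gate without changing the comparison itself.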

Motivation

Performance quality gates initiative.

Additional Notes

Jira ticket: APMSP-2048

@pr-commenter

pr-commenter Bot commented Jun 10, 2025

Benchmarks

Benchmark execution time: 2025-06-12 08:49:29

Comparing candidate commit 178f867 in PR branch dubloom/chore/pr-perf-quality-gate with baseline commit cf59df3 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.

@codecov-commenter

codecov-commenter commented Jun 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.45%. Comparing base (cf59df3) to head (178f867).
⚠️ Report is 66 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #220   +/-   ##
=======================================
  Coverage   86.45%   86.45%           
=======================================
  Files          80       80           
  Lines        5251     5251           
=======================================
  Hits         4540     4540           
  Misses        711      711           


@dubloom dubloom force-pushed the dubloom/chore/pr-perf-quality-gate branch from 07796a0 to 65544df Compare June 12, 2025 08:36
@dubloom dubloom requested a review from a team June 12, 2025 08:44
@dubloom dubloom marked this pull request as ready for review June 12, 2025 08:45
@dubloom dubloom requested a review from a team as a code owner June 12, 2025 08:45
@dubloom dubloom requested review from pablomartinezbernardo and removed request for a team June 12, 2025 08:45
@dubloom dubloom force-pushed the dubloom/chore/pr-perf-quality-gate branch from 65544df to 178f867 Compare June 12, 2025 08:46
@dmehala
Contributor

dmehala commented Jun 12, 2025

Two questions @dubloom :

  • Where does the 20% overhead come from?
  • Can we lower it to 5%?

@dubloom
Contributor Author

dubloom commented Jun 12, 2025

@dmehala.

I agree that 20% is a lot (especially because it is measured from one PR to the next). However, judging from the ticket description (https://datadoghq.atlassian.net/browse/APMSP-2048), it seems to be the standard percentage.

I think @ddyurchenko could have a more relevant answer to that.

@ddyurchenko
Contributor

ddyurchenko commented Jun 12, 2025

👋 Hi folks,

Based on the behavior of your benchmark that I see in https://benchmarking.us1.prod.dog/trends?timeStart=1741776100909&timeEnd=1749721300909&trendsTab=per_scenario&projectId=7&branch=main&scenario=BM_TraceTinyCCSource&trendsType=scenario, you can indeed reduce the block threshold to 5% if you want to. I wouldn't go lower than 3.5%, though.

The precision at which BP works depends on a couple of factors (primarily how stable the hardware is and how stable the benchmark is). We have usually observed that regressions smaller than 2%-3% are not catchable without additional measures. 20% is the safe threshold that won't block you randomly; at the same time, if you get something really disruptive, it will block, and you will have confidence that you need to do something about it.
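
As a rough illustration of why a threshold should sit above the run-to-run noise, the sketch below estimates the relative spread of repeated timings of the same commit and checks a candidate threshold against it. This is a hypothetical sketch under simple assumptions (median-based spread, made-up timings); it is not how BP measures stability.

```python
# Hypothetical sketch: compare a blocking threshold with measured run-to-run noise.
import statistics


def noise_floor(samples_ns: list) -> float:
    """Largest deviation from the median, expressed as a fraction of the median."""
    median = statistics.median(samples_ns)
    return max(abs(sample - median) for sample in samples_ns) / median


def threshold_is_above_noise(samples_ns: list, threshold: float) -> bool:
    """True when the threshold is larger than the spread seen between identical runs."""
    return threshold > noise_floor(samples_ns)


if __name__ == "__main__":
    # Made-up timings (ns) for five runs of the same commit.
    runs = [1000.0, 1010.0, 990.0, 1025.0, 1005.0]
    print(f"noise floor: {noise_floor(runs):.2%}")
    print("5% threshold above noise:", threshold_is_above_noise(runs, 0.05))
    print("1% threshold above noise:", threshold_is_above_noise(runs, 0.01))
```

With these made-up numbers the spread is just under 2%, so a 5% threshold clears it while a 1% threshold would fire on noise alone.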

@dubloom
Contributor Author

dubloom commented Jun 12, 2025

@dmehala The threshold has been decreased to 5%. It can be increased easily if needed.

@dubloom dubloom requested a review from dmehala June 12, 2025 12:58
Contributor

@dmehala dmehala left a comment


LGTM. Good job. :shipit:

@dubloom dubloom merged commit dbe6292 into main Jun 16, 2025
24 checks passed
@dubloom dubloom deleted the dubloom/chore/pr-perf-quality-gate branch June 16, 2025 06:54

4 participants