
Fix microbenchmarks crash caused by suspended GitHub App #5538

Closed
p-datadog wants to merge 1 commit into master from fix/microbenchmarks-baseline-fallback

p-datadog (Member) commented Apr 1, 2026

What does this PR do?

Sets MANUAL_BASELINE_BRANCH before calling bp-runner in the microbenchmarks job, bypassing the github-find-merge-into-branch tool that crashes with a 403 error due to a suspended GitHub App installation (incident-51987).

Motivation:

All dd-trace-rb PRs that touch lib/, ext/, benchmarks/, datadog.gemspec, or .gitlab/ are currently failing microbenchmarks. The failure chain:

  1. The microbenchmarks GitLab job runs bp-runner bp-runner.yml, which uses the shared run_microbenchmarks template from benchmarking-platform-tools.
  2. Because dd-trace-rb's bp-runner.yml has parallelize:, the template uses a separate setup script (run-microbenchmarks__setup.sh).
  3. During setup, github-find-merge-into-branch attempts to authenticate via a GitHub App installation to find the PR's target branch. That App is suspended:
    github.GithubException.GithubException: 403 {"message": "This installation has been suspended"}
    
  4. The setup script exits non-zero. bp-runner catches this as NonZeroExitCodeError and fails the job.
  5. With microbenchmarks failed, dd-gitlab/default-pipeline reports failure, and all-jobs-are-green fails on the GitHub side.

The shared template already has an escape hatch: if MANUAL_BASELINE_BRANCH is set, it skips github-find-merge-into-branch entirely. This PR sets that variable to the repo's default branch (detected from the clone's remote), which is master for dd-trace-rb.
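
A minimal sketch of this approach, not the actual diff from this PR. It assumes the clone's remote is named origin and that refs/remotes/origin/HEAD is set. `detect_default_branch` is a hypothetical helper, and the `master` fallback is an assumption that suits dd-trace-rb.

```shell
#!/bin/sh
# Hypothetical helper: print the default branch of the clone's "origin" remote.
# Assumes refs/remotes/origin/HEAD exists; falls back to "master" otherwise.
detect_default_branch() {
  ref=$(git symbolic-ref --quiet refs/remotes/origin/HEAD 2>/dev/null) || ref=""
  if [ -n "$ref" ]; then
    # Strip the "refs/remotes/origin/" prefix, leaving e.g. "master".
    printf '%s\n' "${ref#refs/remotes/origin/}"
  else
    printf '%s\n' "master"
  fi
}

# Setting the variable before bp-runner starts lets the shared template take
# its MANUAL_BASELINE_BRANCH path instead of calling
# github-find-merge-into-branch.
MANUAL_BASELINE_BRANCH=$(detect_default_branch)
export MANUAL_BASELINE_BRANCH
echo "Using baseline branch: $MANUAL_BASELINE_BRANCH"

# bp-runner bp-runner.yml   # the real job would invoke bp-runner here
```

The fallback keeps the job from failing when the remote HEAD is not recorded in the clone, which can happen with shallow or freshly initialized CI checkouts.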

Why this approach:

Other tracers already solved this at their own level rather than depending on the shared template:

  • Go, Cpp, PHP: Call github-find-merge-into-branch with a trailing || : (so a failure is ignored) in their own custom run-benchmarks.sh scripts
  • Python: Has a separate baseline:detect job using dd-octo-sts auth with || echo "main" fallback (in production since March 2025)
  • Java: Hardcodes origin/master as baseline
  • Dotnet: Downloads baseline from S3
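
The two shell guard patterns in this list can be sketched as follows. `find_merge_into_branch` is a stand-in stub that always fails, simulating the 403. The real tool's arguments and output format are not shown in this PR and are not assumed here.

```shell
#!/bin/sh
# Stand-in for github-find-merge-into-branch: always fails, like the 403 case.
# (Hypothetical stub; the real tool's interface is not reproduced here.)
find_merge_into_branch() {
  echo "403: This installation has been suspended" >&2
  return 1
}

# Pattern 1 (Go/Cpp/PHP style): "|| :" swallows the failure, so a script
# running under "set -e" keeps going, here with an empty result.
baseline=$(find_merge_into_branch 2>/dev/null) || :
echo "guarded result: '${baseline}'"

# Pattern 2 (Python style): fall back to a fixed branch name on failure.
baseline_with_default=$(find_merge_into_branch 2>/dev/null || echo "main")
echo "fallback result: '${baseline_with_default}'"
```

In both cases the lookup failure no longer propagates as a non-zero exit, which is what the unguarded shared template lacks.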

dd-trace-rb is the only tracer still using the unguarded shared template. This one-line fix matches the resilience pattern every other tracer already has.

Why benchmarking-platform PR #254 didn't fix this:

benchmarking-platform PR #254 was merged as a workaround for incident-51987, but it only patched steps/post-pr-comment.sh (the PR commenting step). The actual failure is in github-find-merge-into-branch during the setup step, which is a different code path that uses the same suspended App.

Trade-off:

MANUAL_BASELINE_BRANCH is set to the repo's default branch (master), not the PR's actual target branch. For PRs targeting non-default branches, the baseline comparison would be against master instead of the actual target. This is the same trade-off Go/Cpp/PHP make. For dd-trace-rb, where nearly all PRs target master, the resulting baseline is correct in practice.

Change log entry

None.

Additional Notes:

Verified affected pipelines:

How to test the change?

Push a PR that touches lib/ or ext/ (to trigger the changes: rules for microbenchmarks) and verify the microbenchmarks jobs succeed. The bp-runner log should show MANUAL_BASELINE_BRANCH being used instead of calling github-find-merge-into-branch.
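
Before pushing, it may help to check locally whether a change set would be expected to trigger the job. This sketch assumes the trigger paths are exactly the ones listed under Motivation; the real changes: globs in the GitLab configuration may differ. `touches_benchmarked_paths` is a hypothetical helper, and the file names passed to it are made up.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any given path falls under the directories
# or files that the PR description says trigger microbenchmarks.
touches_benchmarked_paths() {
  for path in "$@"; do
    case "$path" in
      lib/*|ext/*|benchmarks/*|.gitlab/*|datadog.gemspec) return 0 ;;
    esac
  done
  return 1
}

# Example with made-up file names. In a real checkout the arguments could come
# from: git diff --name-only master...HEAD (assumes a local "master" ref).
if touches_benchmarked_paths lib/example.rb README.md; then
  echo "microbenchmarks expected to run"
fi
```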

Set MANUAL_BASELINE_BRANCH before bp-runner so the shared template
skips github-find-merge-into-branch, which fails with 403 because
the GitHub App installation is suspended (incident-51987).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
p-datadog added the AI Generated label (Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos) Apr 1, 2026
@p-datadog p-datadog requested a review from a team as a code owner April 1, 2026 16:43
@p-datadog
Member Author

See also #5539 for a more robust alternative that detects the actual PR target branch via dd-octo-sts + gh CLI (modeled on dd-trace-py). This PR (#5538) is the simpler/faster fix; #5539 is the long-term solution.

@datadog-datadog-prod-us1
Contributor

datadog-datadog-prod-us1 bot commented Apr 1, 2026

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 95.33% (-0.04%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: d94d32a

p-datadog pushed a commit that referenced this pull request Apr 1, 2026
* origin/fix/microbenchmarks-baseline-fallback:
  Fix microbenchmarks crash by bypassing suspended GitHub App
@p-datadog p-datadog closed this Apr 1, 2026
@p-datadog p-datadog deleted the fix/microbenchmarks-baseline-fallback branch April 1, 2026 18:08