Exchange test statistics through S3 #11579

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 2 commits into master from bdu/test-count-jobs-s3
Jun 5, 2026
Conversation

@bric3
Contributor

@bric3 bric3 commented Jun 5, 2026

What Does This Do

Uploads each CI test job's test statistics JSON report to S3 under a pipeline-specific prefix, then has the aggregate job download those files before building the summary report.

This replaces the previous mechanism, which relied on GitLab artifacts and sequentially downloaded the full artifact content of every test job (including files that are useless here). It also re-enables test count aggregation for branch pipelines.

On a PR branch, the job should now take ~52 seconds (~50 seconds of which is container setup), down from ~10 minutes (it could take longer on the merge queue).
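
As a rough illustration, the upload and download steps could be sketched in shell as follows. The `test-counts/<pipeline id>` prefix and the `test_counts_<job id>.json` file name follow the `.gitlab-ci.yml` lines quoted in the review thread; the function names, the download directory, and the exact `aws s3` invocations are assumptions for illustration, not the scripts from this PR.

```shell
#!/usr/bin/env bash
set -eu

# Build the S3 URI for one job's report:
#   s3://<bucket>/test-counts/<pipeline id>/test_counts_<job id>.json
test_counts_s3_uri() {
  local bucket="$1" pipeline_id="$2" job_id="$3"
  printf 's3://%s/test-counts/%s/test_counts_%s.json\n' "$bucket" "$pipeline_id" "$job_id"
}

# Test job side (hypothetical helper): upload this job's report. Requires the AWS CLI.
upload_test_counts() {
  local file="$1" uri="$2"
  aws s3 cp "$file" "$uri"
}

# Aggregate job side (hypothetical helper): fetch only this pipeline's report
# files in a single call, instead of downloading every job's full artifacts.
download_test_counts() {
  local bucket="$1" pipeline_id="$2" dest="$3"
  mkdir -p "$dest"
  aws s3 cp "s3://${bucket}/test-counts/${pipeline_id}/" "$dest" \
    --recursive --exclude '*' --include 'test_counts_*.json'
}
```

A test job would then call something like `upload_test_counts "$TEST_COUNTS_FILE" "$(test_counts_s3_uri "$TEST_COUNTS_S3_BUCKET" "$CI_PIPELINE_ID" "$CI_JOB_ID")"`, and the aggregate job would call `download_test_counts` once before building the summary.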

Motivation

The aggregate job needs test count files from many test jobs, but relying on GitLab's default artifact download is very slow: in the merge queue it can take as long as 20 minutes, and this job is on the critical path. Using S3 keeps the aggregate input explicit and lets branch pipelines produce the same test count summary.

Additional Notes

Follow-up

Jira ticket: [PROJ-IDENT]

@datadog-datadog-prod-us1-2
Contributor

datadog-datadog-prod-us1-2 Bot commented Jun 5, 2026

Pipelines

⚠️ Warnings

🚦 1 Pipeline job failed

Check pull requests | Check pull requests   View in Datadog   GitHub Actions

See error Missing required labels on pull request. Please add at least one type and one component or instrumentation label.

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 1cb05e3 | Docs | Datadog PR Page | Give us feedback!

@dd-octo-sts
Contributor

dd-octo-sts Bot commented Jun 5, 2026

🟢 Java Benchmark SLOs — All performance SLOs passed

| Suite | Status |
| --- | --- |
| Startup | 🟢 pass |

SLO thresholds are defined here based on automatically generated metrics. A warning is raised when results are within 5% of the threshold.

PR vs. master results
| Scenario | Candidate | master | Δ (95% CI of mean) |
| --- | --- | --- | --- |
| startup:insecure-bank:iast:Agent | 13.97 s | 13.88 s | [-0.7%; +2.0%] (no difference) |
| startup:insecure-bank:tracing:Agent | 12.85 s | 12.97 s | [-2.3%; +0.4%] (no difference) |
| startup:petclinic:appsec:Agent | 16.40 s | 16.34 s | [-0.8%; +1.5%] (no difference) |
| startup:petclinic:iast:Agent | 16.64 s | 16.70 s | [-1.5%; +0.8%] (no difference) |
| startup:petclinic:profiling:Agent | 16.41 s | 16.59 s | [-3.0%; +0.9%] (no difference) |
| startup:petclinic:tracing:Agent | 15.68 s | 15.86 s | [-2.7%; +0.5%] (no difference) |

Commit: 1cb05e35 · CI Pipeline · Benchmarking Platform UI


Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion.

@bric3 bric3 marked this pull request as ready for review June 5, 2026 12:52
@bric3 bric3 requested review from a team as code owners June 5, 2026 12:52
@bric3 bric3 requested review from AlexeyKuznetsov-DD and erikayasuda and removed request for a team June 5, 2026 12:52
@dd-octo-sts
Contributor

dd-octo-sts Bot commented Jun 5, 2026

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1cb05e35a6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .gitlab-ci.yml
- .gitlab/count_tests.sh "$GRADLE_TARGET" "$testJvm" "./results" "./test_counts_${CI_JOB_ID}.json"
- export TEST_COUNTS_S3_PREFIX="test-counts/${CI_PIPELINE_ID}"
- export TEST_COUNTS_FILE="./test_counts_${CI_JOB_ID}.json"
- export TEST_COUNTS_S3_URI="s3://${TEST_COUNTS_S3_BUCKET}/${TEST_COUNTS_S3_PREFIX}/test_counts_${CI_JOB_ID}.json"

P2: Deduplicate retried test jobs before aggregating

When a test job is retried manually or by the template's retry: max: 2, the retry stays in the same CI_PIPELINE_ID but gets a new CI_JOB_ID, so this key leaves the first attempt's test_counts_<old job id>.json in the same S3 prefix while the retry uploads another file. The aggregate job then downloads every test_counts_*.json from that prefix, which double-counts tests and can preserve stale failed/zero-test attempts in the summary; use a stable per-job key that is overwritten by retries or filter to the latest attempt before aggregation.

Contributor

Nice finding... BTW
@bric3 WDYT? maybe we should use pipelineId + test name?
like: 1223456789_test_base: [17, 1/4] ?
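
For illustration only, a retry-stable key along these lines could be derived from the pipeline ID and the job name instead of `CI_JOB_ID`, so a retried job overwrites its earlier upload. The helper names and the slug rules below are assumptions, and this is not necessarily what the follow-up PR does.

```shell
#!/usr/bin/env bash
set -eu

# Turn a job name into a key-safe slug: lowercase, with every run of
# characters other than a-z and 0-9 collapsed into a single "-".
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

# Retry-stable object key (hypothetical layout): it depends on the pipeline ID
# and the job name only, so a retry with a new CI_JOB_ID maps to the same key.
stable_test_counts_key() {
  local pipeline_id="$1" job_name="$2"
  printf 'test-counts/%s/test_counts_%s.json\n' "$pipeline_id" "$(slugify "$job_name")"
}

stable_test_counts_key 1223456789 'test_base: [17, 1/4]'
# prints: test-counts/1223456789/test_counts_test-base-17-1-4.json
```

GitLab also exposes a predefined `CI_JOB_NAME_SLUG` variable in recent versions, which might make the custom `slugify` helper unnecessary; whether it is available on this runner setup is not confirmed here.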

Contributor Author

@bric3 bric3 Jun 5, 2026

Yup, I saw that. I made a follow-up PR: #11580

@bric3 bric3 requested a review from jpbempel June 5, 2026 12:54
@bric3 bric3 changed the title from "Exchange test counts through S3" to "Exchange test statistics through S3" Jun 5, 2026
Contributor

@AlexeyKuznetsov-DD AlexeyKuznetsov-DD left a comment

LGTM. The only question I have - is there any cleanup needed for S3?

  1. Set some expiration date?
  2. Or just delete after aggregation?
  3. Cron job to cleanup every month?
  4. ...
  5. PROFIT :)

@bric3
Contributor Author

bric3 commented Jun 5, 2026

@AlexeyKuznetsov-DD None of the above: the bucket has a 3-day expiration (or at least, option 1 is the closest).
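
For readers unfamiliar with S3 expiration, a 3-day rule of this kind is typically expressed as a bucket lifecycle configuration. The sketch below shows what such a rule might look like; the rule ID, the `test-counts/` prefix filter, and the helper names are assumptions, and the actual configuration of this bucket is not shown in this PR.

```shell
#!/usr/bin/env bash
set -eu

# Write a lifecycle configuration that expires objects under test-counts/
# three days after creation (hypothetical rule, for illustration only).
write_lifecycle_config() {
  cat > "$1" <<'JSON'
{
  "Rules": [
    {
      "ID": "expire-test-counts",
      "Status": "Enabled",
      "Filter": { "Prefix": "test-counts/" },
      "Expiration": { "Days": 3 }
    }
  ]
}
JSON
}

# Apply the configuration to a bucket. Requires the AWS CLI and permission
# to change the bucket's lifecycle settings.
apply_lifecycle_config() {
  local bucket="$1" config_file="$2"
  aws s3api put-bucket-lifecycle-configuration \
    --bucket "$bucket" \
    --lifecycle-configuration "file://${config_file}"
}
```

With a rule like this in place, neither the test jobs nor the aggregate job needs to delete anything explicitly.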

@bric3 bric3 added the tag: no release notes (Changes to exclude from release notes) and comp: tooling (Build & Tooling) labels Jun 5, 2026
@bric3
Contributor Author

bric3 commented Jun 5, 2026

/merge -f --reason="Do not impact shipped code"

@gh-worker-devflow-routing-ef8351

gh-worker-devflow-routing-ef8351 Bot commented Jun 5, 2026

View all feedbacks in Devflow UI.

2026-06-05 13:04:28 UTC ℹ️ Start processing command /merge -f --reason="Do not impact shipped code"


2026-06-05 13:04:33 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in master is approximately 0s (p90).


2026-06-05 13:04:43 UTC ℹ️ MergeQueue: This merge request was merged

Warning

This change was merged without running any pre merge CI checks

Reason: Do not impact shipped code

@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit 61cb4e0 into master Jun 5, 2026
585 of 591 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the bdu/test-count-jobs-s3 branch June 5, 2026 13:04
@github-actions github-actions Bot added this to the 1.64.0 milestone Jun 5, 2026

Labels

comp: tooling (Build & Tooling)
tag: no release notes (Changes to exclude from release notes)

3 participants