Use O(log n/eps) budget for CPU distributed sketch by RAMitchell · Pull Request #12150 · dmlc/xgboost

RAMitchell · 2026-04-09T08:30:03Z

Summary

Use count-aware O(log n / eps) sizing for CPU distributed quantile sketch merge/prune.

This change makes WQuantileSketch track the number of represented elements, serializes those per-feature counts through the CPU distributed sketch payload, and recomputes SketchSummaryBudget(...) after merge using the summed counts from both sides.
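To make the sizing rule concrete, here is a minimal sketch of a count-aware budget and of recomputing it after a merge. `CountAwareBudget` and `MergedBudget` are hypothetical names used for illustration only; the real `SketchSummaryBudget(...)` signature, constants, and rounding are not shown in this PR description and may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (NOT the XGBoost SketchSummaryBudget signature).
// Returns a summary size proportional to log2(n) / eps.
inline std::size_t CountAwareBudget(std::uint64_t n, double eps) {
  assert(eps > 0.0 && eps < 1.0);
  // Clamp n to at least 2 so that the log term is never below 1.
  double clamped = static_cast<double>(std::max<std::uint64_t>(n, 2));
  double levels = std::ceil(std::log2(clamped));
  return static_cast<std::size_t>(std::ceil(levels / eps));
}

// After merging two sketches, size the pruned summary from the summed
// element counts of both sides rather than from either side alone.
inline std::size_t MergedBudget(std::uint64_t n_left, std::uint64_t n_right,
                                double eps) {
  return CountAwareBudget(n_left + n_right, eps);
}
```

Under these assumptions the budget grows only logarithmically with the number of represented elements, so doubling the data adds a constant number of summary entries per unit of 1 / eps.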

Testing

cmake -S <worktree> -B <worktree>/build-cpu -DUSE_CUDA=OFF -DGOOGLE_TEST=ON -DUSE_DMLC_GTEST=ON
cmake --build <worktree>/build-cpu --target testxgboost -j35
<worktree>/build-cpu/testxgboost --gtest_filter='Quantile.*'

Copilot

Pull request overview

Updates CPU distributed quantile sketching to size merge/prune budgets using a count-aware O(log n / eps) summary budget, by tracking and propagating per-feature represented element counts through the allreduce payload.

Changes:

Add WQuantileSketch::NumElements() and track represented element counts during Push/PushSorted.
Extend the distributed sketch allreduce payload to serialize/merge per-feature element counts and recompute SketchSummaryBudget(...) after merges.
Add unit tests to validate element-count tracking behavior.
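The count tracking and propagation described in the changes above can be illustrated with a toy model. `ToySketch` and `MergeCounts` are hypothetical stand-ins, not the `WQuantileSketch` class or the actual allreduce payload code, and the assumption that each pushed entry counts as one element may not match the semantics the PR settled on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a per-feature sketch; only the counting is modeled.
struct ToySketch {
  std::vector<float> values;
  std::uint64_t num_elements{0};

  // Assumption: every pushed entry counts as one represented element.
  void Push(float v) {
    values.push_back(v);
    ++num_elements;
  }

  std::uint64_t NumElements() const { return num_elements; }
};

// Element-wise sum of per-feature counts from two workers, as a merge
// step might do before recomputing each feature's budget.
inline std::vector<std::uint64_t> MergeCounts(
    std::vector<std::uint64_t> const& lhs,
    std::vector<std::uint64_t> const& rhs) {
  assert(lhs.size() == rhs.size());
  std::vector<std::uint64_t> out(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    out[i] = lhs[i] + rhs[i];
  }
  return out;
}
```

Keeping one count per feature lets each feature's merged summary be sized independently, which matters when features have very different numbers of valid entries.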

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `tests/cpp/common/test_quantile.cc` | Adds tests for the new `NumElements()` tracking API. |
| `src/common/quantile.h` | Introduces `num_elements_` tracking and a `NumElements()` accessor in `WQuantileSketch`. |
| `src/common/quantile.cc` | Updates distributed sketch serialization/allreduce merge logic to propagate element counts and recompute per-feature budgets. |

src/common/quantile.h

tests/cpp/common/test_quantile.cc

Use O(log n/eps) budget for CPU distributed sketch

f5b3486

RAMitchell requested a review from Copilot April 9, 2026 08:30

Copilot started reviewing on behalf of RAMitchell April 9, 2026 08:31

Copilot AI reviewed Apr 9, 2026

src/common/quantile.h (outdated, resolved)

tests/cpp/common/test_quantile.cc (resolved)

Align sketch element counting semantics

dfeea19

RAMitchell requested a review from trivialfis April 9, 2026 10:15

trivialfis approved these changes Apr 9, 2026

RAMitchell merged commit 01c70d7 into dmlc:master Apr 9, 2026
78 checks passed

RAMitchell merged 2 commits into dmlc:master from RAMitchell:cpu-distributed-count-budget
