Use O(log n/eps) budget for CPU distributed sketch #12150

Merged
RAMitchell merged 2 commits into dmlc:master from RAMitchell:cpu-distributed-count-budget on Apr 9, 2026

@RAMitchell
Member

Summary

Use count-aware O(log n / eps) sizing for CPU distributed quantile sketch merge/prune.

This change makes WQuantileSketch track the number of represented elements, serializes those per-feature counts through the CPU distributed sketch payload, and recomputes SketchSummaryBudget(...) after merge using the summed counts from both sides.
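The post-merge budget recomputation described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ExampleSummaryBudget`, `ExampleFeatureSketch`, and `ExampleMerge` are hypothetical names, and the formula `ceil(log2(n) / eps)` is an assumed stand-in for whatever `SketchSummaryBudget(...)` actually computes, since its signature and constants are not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for SketchSummaryBudget(...). The exact formula is an
// assumption; it only illustrates an O(log n / eps) growth rate.
inline std::size_t ExampleSummaryBudget(std::uint64_t n_elements, double eps) {
  assert(eps > 0.0);
  // Clamp n to at least 2 so log2 is positive and the budget is never zero.
  double n = static_cast<double>(std::max<std::uint64_t>(n_elements, 2));
  return static_cast<std::size_t>(std::ceil(std::log2(n) / eps));
}

// Hypothetical per-feature state: how many elements the summary represents
// and how many summary entries it may keep after pruning.
struct ExampleFeatureSketch {
  std::uint64_t num_elements;
  std::size_t budget;
};

// Merge two sides: sum the represented counts, then size the pruned summary
// from the combined count rather than from either side alone.
inline ExampleFeatureSketch ExampleMerge(ExampleFeatureSketch const& a,
                                         ExampleFeatureSketch const& b,
                                         double eps) {
  std::uint64_t total = a.num_elements + b.num_elements;
  return {total, ExampleSummaryBudget(total, eps)};
}
```

Under these assumptions, merging two workers that each saw 512 elements yields a summary sized for 1024 elements, so the budget grows logarithmically with the combined count instead of staying fixed at the per-worker size.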

Testing

  • cmake -S <worktree> -B <worktree>/build-cpu -DUSE_CUDA=OFF -DGOOGLE_TEST=ON -DUSE_DMLC_GTEST=ON
  • cmake --build <worktree>/build-cpu --target testxgboost -j35
  • <worktree>/build-cpu/testxgboost --gtest_filter='Quantile.*'

Contributor

Copilot AI left a comment

Pull request overview

Updates CPU distributed quantile sketching to size merge/prune budgets using a count-aware O(log n / eps) summary budget, by tracking and propagating per-feature represented element counts through the allreduce payload.

Changes:

  • Add WQuantileSketch::NumElements() and track represented element counts during Push/PushSorted.
  • Extend the distributed sketch allreduce payload to serialize/merge per-feature element counts and recompute SketchSummaryBudget(...) after merges.
  • Add unit tests to validate element-count tracking behavior.
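To make the first two changes concrete, here is a toy sketch of count tracking and count propagation. It is not the `WQuantileSketch` implementation: `ExampleCountingSketch`, `ExamplePackCounts`, and `ExampleSumCounts` are hypothetical, the assumption that every `Push` counts as one represented element regardless of weight is mine, and the real payload layout in `src/common/quantile.cc` is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy sketch that only records inputs and counts them.
class ExampleCountingSketch {
 public:
  // Assumption: each pushed entry counts as one represented element.
  void Push(float value, float weight = 1.0f) {
    values_.push_back(value);
    weights_.push_back(weight);
    ++num_elements_;
  }
  std::uint64_t NumElements() const { return num_elements_; }

 private:
  std::vector<float> values_;
  std::vector<float> weights_;
  std::uint64_t num_elements_{0};
};

// Collect one count per feature into a flat buffer that could travel
// alongside the serialized summaries (layout is an assumption).
inline std::vector<std::uint64_t> ExamplePackCounts(
    std::vector<ExampleCountingSketch> const& sketches) {
  std::vector<std::uint64_t> counts;
  counts.reserve(sketches.size());
  for (auto const& s : sketches) {
    counts.push_back(s.NumElements());
  }
  return counts;
}

// Combine the per-feature counts from two sides by element-wise sum.
inline std::vector<std::uint64_t> ExampleSumCounts(
    std::vector<std::uint64_t> const& lhs,
    std::vector<std::uint64_t> const& rhs) {
  assert(lhs.size() == rhs.size());
  std::vector<std::uint64_t> out(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    out[i] = lhs[i] + rhs[i];
  }
  return out;
}
```

The summed per-feature counts are what a count-aware budget function would take as its `n` after a merge.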

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/cpp/common/test_quantile.cc | Adds tests for the new NumElements() tracking API. |
| src/common/quantile.h | Introduces num_elements_ tracking and a NumElements() accessor in WQuantileSketch. |
| src/common/quantile.cc | Updates distributed sketch serialization/allreduce merge logic to propagate element counts and recompute per-feature budgets. |

@RAMitchell RAMitchell requested a review from trivialfis April 9, 2026 10:15
@RAMitchell RAMitchell merged commit 01c70d7 into dmlc:master Apr 9, 2026
78 checks passed
3 participants