Use O(log n/eps) budget for CPU distributed sketch #12150
Merged
RAMitchell merged 2 commits into dmlc:master on Apr 9, 2026
Pull request overview
Updates CPU distributed quantile sketching to size merge/prune budgets using a count-aware O(log n / eps) summary budget, by tracking and propagating per-feature represented element counts through the allreduce payload.
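To make the count-aware sizing concrete, here is a minimal sketch of a budget that grows with log n / eps and is recomputed from summed counts after a merge. `IllustrativeBudget` and `MergedBudget` are hypothetical names, and the base-2 logarithm, the ceiling, and the floor of 1 are assumptions; the real `SketchSummaryBudget(...)` signature and constants are not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, NOT XGBoost's SketchSummaryBudget: returns a summary
// size proportional to log2(n) / eps, never smaller than 1.
inline std::size_t IllustrativeBudget(std::uint64_t n, double eps) {
  assert(eps > 0.0);
  if (n <= 1) {
    return 1;  // nothing to compress for empty or single-element inputs
  }
  double levels = std::log2(static_cast<double>(n));
  auto budget = static_cast<std::size_t>(std::ceil(levels / eps));
  return std::max<std::size_t>(1, budget);
}

// After merging two summaries, size the result from the combined element
// count rather than from either side alone.
inline std::size_t MergedBudget(std::uint64_t n_lhs, std::uint64_t n_rhs,
                                double eps) {
  return IllustrativeBudget(n_lhs + n_rhs, eps);
}
```

Under these assumptions, doubling the represented element count adds only about 1/eps entries to the budget, which is why the counts need to travel with the summaries through the merge.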
Changes:
- Add `WQuantileSketch::NumElements()` and track represented element counts during `Push`/`PushSorted`.
- Extend the distributed sketch allreduce payload to serialize/merge per-feature element counts and recompute `SketchSummaryBudget(...)` after merges.
- Add unit tests to validate element-count tracking behavior.
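The element-count tracking in the first bullet can be pictured with a small stand-in class. `CountingSketch` is hypothetical and models only the bookkeeping, not the summary; the method names mirror those in the PR, but the signatures and the choice to count every pushed entry (including each value in a sorted batch) are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a weighted quantile sketch. Only the
// element-count bookkeeping is modeled here.
class CountingSketch {
 public:
  // Record one (value, weight) entry and count it.
  void Push(float value, float weight = 1.0f) {
    buffer_.emplace_back(value, weight);
    ++num_elements_;
  }

  // Record a batch of already sorted values with unit weight each.
  void PushSorted(std::vector<float> const& sorted_values) {
    for (float v : sorted_values) {
      buffer_.emplace_back(v, 1.0f);
    }
    num_elements_ += sorted_values.size();
  }

  // Number of input elements represented so far.
  std::uint64_t NumElements() const { return num_elements_; }

 private:
  std::vector<std::pair<float, float>> buffer_;
  std::uint64_t num_elements_{0};
};
```

A caller would read `NumElements()` once per feature before communication, so that the count can be shipped alongside that feature's summary.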
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/cpp/common/test_quantile.cc | Adds tests for the new `NumElements()` tracking API. |
| src/common/quantile.h | Introduces `num_elements_` tracking and a `NumElements()` accessor in `WQuantileSketch`. |
| src/common/quantile.cc | Updates distributed sketch serialization/allreduce merge logic to propagate element counts and recompute per-feature budgets. |
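As a rough illustration of propagating per-feature counts through a payload, the sketch below serializes a vector of counts to bytes, restores it, and merges two sides by element-wise addition. The `CountPayload` type, the function names, and the flat native-byte-order layout are all assumptions made for this example; the actual payload layout in `src/common/quantile.cc` is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical payload: one represented-element count per feature.
using CountPayload = std::vector<std::uint64_t>;

// Copy the counts into a flat byte buffer (native byte order, an assumption).
inline std::vector<std::uint8_t> SerializeCounts(CountPayload const& counts) {
  std::vector<std::uint8_t> bytes(counts.size() * sizeof(std::uint64_t));
  if (!counts.empty()) {
    std::memcpy(bytes.data(), counts.data(), bytes.size());
  }
  return bytes;
}

// Restore the counts from a buffer produced by SerializeCounts.
inline CountPayload DeserializeCounts(std::vector<std::uint8_t> const& bytes) {
  assert(bytes.size() % sizeof(std::uint64_t) == 0);
  CountPayload counts(bytes.size() / sizeof(std::uint64_t));
  if (!bytes.empty()) {
    std::memcpy(counts.data(), bytes.data(), bytes.size());
  }
  return counts;
}

// Merge step: per-feature counts from both sides are summed, so a budget can
// be recomputed for each feature from the combined count.
inline CountPayload MergeCounts(CountPayload const& lhs,
                                CountPayload const& rhs) {
  assert(lhs.size() == rhs.size());
  CountPayload out(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    out[i] = lhs[i] + rhs[i];
  }
  return out;
}
```

Summing is associative and commutative, so under this model the merged counts do not depend on the order in which workers are combined.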
trivialfis approved these changes on Apr 9, 2026
Summary
Use count-aware `O(log n / eps)` sizing for CPU distributed quantile sketch merge/prune.
This change makes `WQuantileSketch` track the number of represented elements, serializes those per-feature counts through the CPU distributed sketch payload, and recomputes `SketchSummaryBudget(...)` after merge using the summed counts from both sides.
Testing
- `cmake -S <worktree> -B <worktree>/build-cpu -DUSE_CUDA=OFF -DGOOGLE_TEST=ON -DUSE_DMLC_GTEST=ON`
- `cmake --build <worktree>/build-cpu --target testxgboost -j35`
- `<worktree>/build-cpu/testxgboost --gtest_filter='Quantile.*'`