Skip to content

Predefine statistics formatters#976

Open
madsbk wants to merge 2 commits intorapidsai:mainfrom
madsbk:redesign-statistics
Open

Predefine statistics formatters#976
madsbk wants to merge 2 commits intorapidsai:mainfrom
madsbk:redesign-statistics

Conversation

@madsbk
Copy link
Copy Markdown
Member

@madsbk madsbk commented Apr 17, 2026

Previously, Statistics allowed callers to register arbitrary std::function formatters via register_formatter(). While flexible, this prevented full serialization: lambdas cannot be serialized, so Statistics::serialize() dropped formatter information.

This PR replaces dynamic formatters with a closed set of predefined render functions, identified by a new Statistics::Formatter enum (Default, Bytes, Duration, Ratio, MemCopy, MemAlloc).

In addition, Statistics is now fully picklable. As a result, the gather_statistics API is removed. Instead, cudf-polars is responsible for pickling and communicating Statistics objects directly.

@madsbk madsbk self-assigned this Apr 17, 2026
@madsbk madsbk added breaking Introduces a breaking change improvement Improves an existing functionality labels Apr 17, 2026
@madsbk madsbk force-pushed the redesign-statistics branch from ff22c6a to dc5895c Compare April 17, 2026 15:17
@madsbk madsbk force-pushed the redesign-statistics branch from dc5895c to 9541e64 Compare April 17, 2026 15:30
@madsbk madsbk marked this pull request as ready for review April 17, 2026 17:30
@madsbk madsbk requested review from a team as code owners April 17, 2026 17:30
@rapidsai rapidsai deleted a comment from copy-pr-bot bot Apr 17, 2026
Copy link
Copy Markdown
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One tiny question, otherwise LGTM.

Comment thread cpp/src/statistics.cpp
Comment on lines +233 to +235
if (!has_report_entry(name)) {
add_report_entry(name, {name}, Formatter::Bytes);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you ever not wrapping add_report_entry in a has_report_entry check? Isn't that redundant given the semantics that you've established where add does nothing if there's already something present?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking Introduces a breaking change improvement Improves an existing functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants