Changed reduce implementation and added tests#6877
Changed reduce implementation and added tests#6877BhoomishGupta wants to merge 17 commits intoTheHPXProject:masterfrom
Conversation
|
Can one of the admins verify this patch? |
hkaiser
left a comment
There was a problem hiding this comment.
Could you please separate the formatting changes into a different PR? It's close to impossible to review your actual changes here.
Also, for the formatting - are you sure you're using clang-format V20?
|
Yeah, I was not using V20. |
Simply keep (force) pushing to your branch, this wil update the PR. |
22a1f44 to
25bccb7
Compare
3e0fbc4 to
4c3a5d6
Compare
|
@BhoomishGupta FWIW, the reduce tests now seem to fail. Also, please rebase thise onto master to resolve the conflicts. |
3a9e8c3 to
c162a8e
Compare
Working on it |
c162a8e to
e232c91
Compare
|
@BhoomishGupta Please rebase your changes onto master to reduce the amount of unrelated changes (currently more than 300 files). Otherwise it's close to impossible to review your PR. |
8cdfc5f to
4595683
Compare
There was a problem hiding this comment.
@BhoomishGupta Thanks for investigating and attempting to implement a new customization point for the execution parameters.
You have implemented part of the necessary logic. Please have a look at the existing customization points to see what's necessary additionally.
I have the impression, that you have added the new functionality for all algorithms, as the new functionality will be invoked always. I'd suggest making sure, that the new functionality is used only for the reduction algorithms.
@hkaiser Thanks for the review. I will check the existing customization points and add the missing things. Regarding the scope: I did intentionally add it to the generic utility because I thought adjusting chunk sizes could be beneficial for other algorithms down the line. Or do you strongly prefer I hard-restrict it to reduce only? Just to clarify on the current behavior: even though it's in a shared file, it currently acts as a no-op for other algorithms. The default implementation returns {0, 0}, and the call sites only apply adjustments when the return values are non-zero. So it shouldn't be impacting non-reduce algorithms right now, but I will make sure the trait-detection aligns perfectly with the rest of the codebase. |
The default behavior is needed for all algorithms (as a fall-back, most likely), while the customization should apply to the reduction algorithms only.
If the default used by all algorithms (except reduce) is now |
I don,t think that the {0, 0} default changes the current behavior for any algorithm. The call site in auto [adj_chunk, adj_max] =
hpx::execution::experimental::adjust_chunk_size_and_max_chunks(
policy.parameters(), policy.executor(),
max_chunks, chunk_size);
if (adj_chunk != 0)
chunk_size = adj_chunk;
if (adj_max != 0)
max_chunks = adj_max;
else if (adj_chunk == 0)
adjust_chunk_size_and_max_chunks(
cores, count, max_chunks, chunk_size);works when the Customization Point returns the default {0, 0}, both adj_chunk and adj_max are zero, so we go through else if (adj_chunk == 0) which calls the existing adjust_chunk_size_and_max_chunks free function, which is exactly the same code path as before. |
1a239ce to
94c773c
Compare
I have seen that. This code does something different from what was in place before. The default just should be the previous As said, please have a look at the other parameters customizations to see what's missing, e.g., |
There was a problem hiding this comment.
There is one thing missing still. Currently you use policy.with() to replace the current set of parameter customizations, which will cause for any user-supplied customizations to be lost. I think we need to use something like the new API rebind_executor_parameters (like for instance proposed here: #6935) instead.
Otherwise: very nice! Thanks for working on this!
Let's wait for #6935 to be merged, then you can adapt your PR. |
|
@BhoomishGupta On MacOS the new tests are failing. Inspect reports an issue as well. Please pay attention to the CIs yourself. |
Ah, my bad, I had my wires crossed and was looking at the CI for a different PR. I will check the MacOS and Inspect logs for this one tomorrow and will push a fix soon. I'll pay closer attention to the CI status here going forward. |
|
@BhoomishGupta ping? |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Fixes reduce partition initialization for cases where value_type is not convertible to T (issue #6647) by ensuring partitions have at least two elements and by adding a regression test. Also introduces an adjust_chunk_size_and_max_chunks executor-parameters customization point with dispatch/rebind support and corresponding unit tests.
Changes:
- Fix parallel
reducepartition reduction to avoid assuming*firstconverts toT, and add a regression test (reduce_6647). - Add
adjust_chunk_size_and_max_chunkscustomization point + rebind/wrapping support, and update chunking code to use it. - Extend unit tests around executor-parameters dispatch/rebinding (including dispatch order guarantees).
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| libs/core/execution/include/hpx/execution/executors/rebind_executor_parameters.hpp | Adds support for rebinding/wrapping adjust_chunk_size_and_max_chunks and fixes mark_begin_execution double-dispatch. |
| libs/core/execution/include/hpx/execution/executors/execution_parameters_fwd.hpp | Declares the adjust_chunk_size_and_max_chunks CPO. |
| libs/core/execution/include/hpx/execution/executors/execution_parameters.hpp | Implements default adjustment logic + property plumbing for adjust_chunk_size_and_max_chunks. |
| libs/core/execution/tests/unit/rebind_executor_parameters.cpp | Adds rebind tests for adjust_chunk_size_and_max_chunks and dispatch order test for mark_begin_execution. |
| libs/core/execution/tests/unit/executor_parameters_dispatching.cpp | Adds dispatch precedence tests for adjust_chunk_size_and_max_chunks between parameters and executor. |
| libs/core/algorithms/include/hpx/parallel/util/detail/chunk_size.hpp | Switches chunk-size/max-chunks consistency logic to the new CPO. |
| libs/core/algorithms/include/hpx/parallel/algorithms/reduce.hpp | Fixes parallel reduce partition reduction by enforcing partition sizes >= 2 and adding a helper that doesn’t require *first convertible to T. |
| libs/core/algorithms/tests/regressions/reduce_6647.cpp | Adds regression test covering the reported reduce bug case. |
| libs/core/algorithms/tests/regressions/CMakeLists.txt | Registers reduce_6647 regression test. |
| .circleci/tests.unit1.container_algorithms | Adds a test entry to CI list. |
| components/parcel_plugins/coalescing/src/coalescing_message_handler.cpp | Formatting-only change (line wrap). |
| components/containers/partitioned_vector/tests/unit/is_iterator_partitioned_vector.cpp | Formatting-only change (include order). |
| auto rebound_params = | ||
| hpx::execution::experimental::rebind_executor_parameters( | ||
| policy.parameters(), reduce_executor_parameters{}); |
There was a problem hiding this comment.
The PR description says changes were limited to a few algorithm files/tests, but this PR also introduces a new executor-parameters customization point (adjust_chunk_size_and_max_chunks) plus rebind/dispatch wiring and adds/updates several execution unit tests and CI test lists. Please update the PR description (or narrow the PR scope) so reviewers can accurately assess the broader API and behavior changes included here.
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
…max_chunks functionality Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
dfe3334 to
4ad03fb
Compare
Hi @hkaiser, sorry for the delay, I had academic tests. I have pushed updates and resolved the conflicts; please take another look. I will also take a look into the snippets copilot reviewed. |
|
@BhoomishGupta please also have a look at the failing test: rebind_executor_parameters. |
Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
| // Helper function to reduce a partition without requiring an init value. | ||
| // Assumes partition size >= 2 (enforced by reduce_executor_parameters). | ||
| template <typename ExPolicy, typename FwdIterB, typename T, typename Reduce> | ||
| T reduce_partition( | ||
| FwdIterB part_begin, std::size_t part_size, Reduce const& r) | ||
| { | ||
| HPX_ASSERT(part_size >= 2); | ||
|
|
There was a problem hiding this comment.
reduce_partition assumes part_size >= 2 (asserts it) and the chunk-size adjustment only targets the fixed-size partitioning path. Execution policies using has_variable_chunk_size parameters (e.g. guided_chunk_size, dynamic_chunk_size, etc.) build variable-sized shapes and can still produce 1-element partitions, which would trip this assert (or worse in release builds). Consider also wrapping/overriding get_chunk_size (and/or the variable-shape generation) for reduce to guarantee no 1-element partitions, or otherwise handle part_size == 1 safely without double-counting elements.
| // Helper function to reduce a partition without requiring an init value. | |
| // Assumes partition size >= 2 (enforced by reduce_executor_parameters). | |
| template <typename ExPolicy, typename FwdIterB, typename T, typename Reduce> | |
| T reduce_partition( | |
| FwdIterB part_begin, std::size_t part_size, Reduce const& r) | |
| { | |
| HPX_ASSERT(part_size >= 2); | |
| // Helper function to reduce a non-empty partition without requiring an | |
| // init value. | |
| template <typename ExPolicy, typename FwdIterB, typename T, typename Reduce> | |
| T reduce_partition( | |
| FwdIterB part_begin, std::size_t part_size, Reduce const& r) | |
| { | |
| HPX_ASSERT(part_size != 0); | |
| if (part_size == 1) | |
| { | |
| return *part_begin; | |
| } |
| max_chunks = | ||
| (std::min) (cores_times_4, num_elements); // -V112 | ||
|
|
||
| chunk_size = (std::max) (chunk_size, | ||
| (num_elements + max_chunks - 1) / max_chunks); | ||
| } |
There was a problem hiding this comment.
adjust_chunk_size_and_max_chunks_default can divide by zero when num_elements == 0 and both max_chunks and chunk_size are 0: max_chunks becomes 0 via min(cores_times_4, num_elements), and then (num_elements + max_chunks - 1) / max_chunks is evaluated. Add an early return/guard for num_elements == 0 (or ensure max_chunks is set to at least 1 before dividing) to keep the default implementation safe for empty ranges.
| #include <hpx/config.hpp> | ||
| #include <hpx/assert.hpp> | ||
| #include <hpx/modules/concepts.hpp> | ||
| #include <hpx/modules/execution.hpp> | ||
| #include <hpx/modules/executors.hpp> | ||
| #include <hpx/modules/functional.hpp> | ||
| #include <hpx/modules/iterator_support.hpp> | ||
| #include <hpx/modules/pack_traversal.hpp> | ||
| #include <hpx/parallel/algorithms/detail/accumulate.hpp> |
There was a problem hiding this comment.
The PR description says the only non-formatting changes are in reduce.hpp, the regressions CMakeLists, and the new reduce_6647.cpp, but this PR also introduces functional changes in the execution module (execution_parameters*, rebind_executor_parameters.hpp) and updates execution unit tests. Please update the PR description to reflect these additional code changes so reviewers have the right scope in mind.
Fixes #6647
Proposed Changes
Created a helper function that reduces the partition value without the need of initial value..
Added a test case that was mentioned in the issue #6647.
Corrected a typo in documentation.
I only did changes in :
And created this :
Any background context you want to provide?
Previously, the implementation assumed that the *first was convertible to T. This assumption does not hold for the actual function logic required here. This PR removes that dependency, ensuring the type handling is correct.
Checklist
Not all points below apply to all pull requests.