Skip to content

Changed reduce implementation and added tests#6877

Open
BhoomishGupta wants to merge 17 commits intoTheHPXProject:masterfrom
BhoomishGupta:fix/reduce-6647
Open

Changed reduce implementation and added tests#6877
BhoomishGupta wants to merge 17 commits intoTheHPXProject:masterfrom
BhoomishGupta:fix/reduce-6647

Conversation

@BhoomishGupta
Copy link
Copy Markdown
Contributor

Fixes #6647

Proposed Changes

Created a helper function that reduces the partition value without the need of initial value..
Added a test case that was mentioned in the issue #6647.
Corrected a typo in documentation.

I only did changes in :

  1. hpx\libs\core\algorithms\include\hpx\parallel\algorithms\reduce.hpp
  2. hpx\libs\core\algorithms\tests\regressions\CMakeLists.txt
  3. the rest were changes due to clang and editor.config

And created this :

  1. hpx\libs\core\algorithms\tests\regressions\reduce_6647.cpp

Any background context you want to provide?

Previously, the implementation assumed that the *first was convertible to T. This assumption does not hold for the actual function logic required here. This PR removes that dependency, ensuring the type handling is correct.

Checklist

Not all points below apply to all pull requests.

  • I have added a new feature and have added tests to go along with it.
  • I have fixed a bug and have added a regression test.
  • I have added a test using random numbers; I have made sure it uses a seed, and that random numbers generated are valid inputs for the tests.

@StellarBot
Copy link
Copy Markdown
Collaborator

Can one of the admins verify this patch?

Copy link
Copy Markdown
Contributor

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please separate the formatting changes into a different PR? It's close to impossible to review your actual changes here.

Also, for the formatting - are you sure you're using clang-format V20?

@BhoomishGupta
Copy link
Copy Markdown
Contributor Author

Yeah, I was not using V20.
So, should I close this PR and open a new one, that is only with the changes?

@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Feb 4, 2026

Yeah, I was not using V20. So, should I close this PR and open a new one, that is only with the changes?

Simply keep (force) pushing to your branch, this wil update the PR.

@BhoomishGupta BhoomishGupta force-pushed the fix/reduce-6647 branch 2 times, most recently from 22a1f44 to 25bccb7 Compare February 4, 2026 20:34
Comment thread libs/core/algorithms/include/hpx/parallel/algorithms/reduce.hpp Outdated
Comment thread libs/core/algorithms/include/hpx/parallel/algorithms/reduce.hpp Outdated
@BhoomishGupta BhoomishGupta force-pushed the fix/reduce-6647 branch 3 times, most recently from 3e0fbc4 to 4c3a5d6 Compare February 10, 2026 23:44
@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Feb 11, 2026

@BhoomishGupta FWIW, the reduce tests now seem to fail. Also, please rebase thise onto master to resolve the conflicts.

@BhoomishGupta BhoomishGupta force-pushed the fix/reduce-6647 branch 2 times, most recently from 3a9e8c3 to c162a8e Compare February 11, 2026 00:35
@BhoomishGupta
Copy link
Copy Markdown
Contributor Author

@BhoomishGupta FWIW, the reduce tests now seem to fail. Also, please rebase thise onto master to resolve the conflicts.

Working on it

@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Feb 12, 2026

@BhoomishGupta Please rebase your changes onto master to reduce the amount of unrelated changes (currently more than 300 files). Otherwise it's close to impossible to review your PR.

@BhoomishGupta BhoomishGupta force-pushed the fix/reduce-6647 branch 2 times, most recently from 8cdfc5f to 4595683 Compare February 12, 2026 23:45
Copy link
Copy Markdown
Contributor

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BhoomishGupta Thanks for investigating and attempting to implement a new customization point for the execution parameters.

You have implemented part of the necessary logic. Please have a look at the existing customization points to see what's necessary additionally.

I have the impression, that you have added the new functionality for all algorithms, as the new functionality will be invoked always. I'd suggest making sure, that the new functionality is used only for the reduction algorithms.

@BhoomishGupta
Copy link
Copy Markdown
Contributor Author

BhoomishGupta commented Feb 17, 2026

@BhoomishGupta Thanks for investigating and attempting to implement a new customization point for the execution parameters.

You have implemented part of the necessary logic. Please have a look at the existing customization points to see what's necessary additionally.

I have the impression, that you have added the new functionality for all algorithms, as the new functionality will be invoked always. I'd suggest making sure, that the new functionality is used only for the reduction algorithms.

@hkaiser Thanks for the review. I will check the existing customization points and add the missing things.

Regarding the scope: I did intentionally add it to the generic utility because I thought adjusting chunk sizes could be beneficial for other algorithms down the line. Or do you strongly prefer I hard-restrict it to reduce only?

Just to clarify on the current behavior: even though it's in a shared file, it currently acts as a no-op for other algorithms. The default implementation returns {0, 0}, and the call sites only apply adjustments when the return values are non-zero. So it shouldn't be impacting non-reduce algorithms right now, but I will make sure the trait-detection aligns perfectly with the rest of the codebase.

@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Feb 17, 2026

@hkaiser Thanks for the review. I will check the existing customization points and add the missing things.

Regarding the scope: I did intentionally add it to the generic utility because I thought adjusting chunk sizes could be beneficial for other algorithms down the line. Or do you strongly prefer I hard-restrict it to reduce only?

The default behavior is needed for all algorithms (as a fall-back, most likely), while the customization should apply to the reduction algorithms only.

Just to clarify on the current behavior: even though it's in a shared file, it currently acts as a no-op for other algorithms. The default implementation returns {0, 0}, and the call sites only apply adjustments when the return values are non-zero. So it shouldn't be impacting non-reduce algorithms right now, but I will make sure the trait-detection aligns perfectly with the rest of the codebase.

If the default used by all algorithms (except reduce) is now {0, 0}, wouldn't that change the current behavior?

@BhoomishGupta
Copy link
Copy Markdown
Contributor Author

BhoomishGupta commented Feb 18, 2026

@hkaiser Thanks for the review. I will check the existing customization points and add the missing things.
Regarding the scope: I did intentionally add it to the generic utility because I thought adjusting chunk sizes could be beneficial for other algorithms down the line. Or do you strongly prefer I hard-restrict it to reduce only?

The default behavior is needed for all algorithms (as a fall-back, most likely), while the customization should apply to the reduction algorithms only.

Just to clarify on the current behavior: even though it's in a shared file, it currently acts as a no-op for other algorithms. The default implementation returns {0, 0}, and the call sites only apply adjustments when the return values are non-zero. So it shouldn't be impacting non-reduce algorithms right now, but I will make sure the trait-detection aligns perfectly with the rest of the codebase.

If the default used by all algorithms (except reduce) is now {0, 0}, wouldn't that change the current behavior?

I don,t think that the {0, 0} default changes the current behavior for any algorithm.

The call site in chunk_size.hpp

auto [adj_chunk, adj_max] =
    hpx::execution::experimental::adjust_chunk_size_and_max_chunks(
        policy.parameters(), policy.executor(),
        max_chunks, chunk_size);
if (adj_chunk != 0)
    chunk_size = adj_chunk;
if (adj_max != 0)
    max_chunks = adj_max;
else if (adj_chunk == 0)
    adjust_chunk_size_and_max_chunks(
        cores, count, max_chunks, chunk_size);

works when the Customization Point returns the default {0, 0}, both adj_chunk and adj_max are zero, so we go through else if (adj_chunk == 0) which calls the existing adjust_chunk_size_and_max_chunks free function, which is exactly the same code path as before.
Only an execution parameters object that provides a custom adjust_chunk_size_and_max_chunks member will return non-zero values and override the defaults.

@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Feb 18, 2026

I don,t think that the {0, 0} default changes the current behavior for any algorithm.

The call site in chunk_size.hpp

auto [adj_chunk, adj_max] =
    hpx::execution::experimental::adjust_chunk_size_and_max_chunks(
        policy.parameters(), policy.executor(),
        max_chunks, chunk_size);
if (adj_chunk != 0)
    chunk_size = adj_chunk;
if (adj_max != 0)
    max_chunks = adj_max;
else if (adj_chunk == 0)
    adjust_chunk_size_and_max_chunks(
        cores, count, max_chunks, chunk_size);

works when the Customization Point returns the default {0, 0}, both adj_chunk and adj_max are zero, so we go through else if (adj_chunk == 0) which calls the existing adjust_chunk_size_and_max_chunks free function, which is exactly the same code path as before. Only an execution parameters object that provides a custom adjust_chunk_size_and_max_chunks member will return non-zero values and override the defaults.

I have seen that. This code does something different from what was in place before. The default just should be the previous adjust_chunk_size_and_max_chunks (most likely to be invoked from the CPO fallback).

As said, please have a look at the other parameters customizations to see what's missing, e.g., get_chunk_size here: https://github.com/STEllAR-GROUP/hpx/blob/b514f4ab6274d21a716e3acaac08f825b9a93776/libs/core/execution/include/hpx/execution/executors/execution_parameters_fwd.hpp#L79-L127.

Copy link
Copy Markdown
Contributor

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one thing missing still. Currently you use policy.with() to replace the current set of parameter customizations, which will cause for any user-supplied customizations to be lost. I think we need to use something like the new API rebind_executor_parameters (like for instance proposed here: #6935) instead.

Otherwise: very nice! Thanks for working on this!

Comment thread libs/core/algorithms/include/hpx/parallel/util/detail/chunk_size.hpp Outdated
@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Feb 24, 2026

There is one thing missing still. Currently you use policy.with() to replace the current set of parameter customizations, which will cause for any user-supplied customizations to be lost. I think we need to use something like the new API rebind_executor_parameters (like for instance proposed here: #6935) instead.

Otherwise: very nice! Thanks for working on this!

Let's wait for #6935 to be merged, then you can adapt your PR.

Comment thread libs/core/algorithms/include/hpx/parallel/algorithms/reduce.hpp Outdated
@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Apr 11, 2026

@BhoomishGupta On MacOS the new tests are failing. Inspect reports an issue as well. Please pay attention to the CIs yourself.

@BhoomishGupta
Copy link
Copy Markdown
Contributor Author

@BhoomishGupta On MacOS the new tests are failing. Inspect reports an issue as well. Please pay attention to the CIs yourself.

Ah, my bad, I had my wires crossed and was looking at the CI for a different PR. I will check the MacOS and Inspect logs for this one tomorrow and will push a fix soon. I'll pay closer attention to the CI status here going forward.

@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Apr 14, 2026

@BhoomishGupta ping?

Copilot AI review requested due to automatic review settings April 15, 2026 17:29
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes reduce partition initialization for cases where value_type is not convertible to T (issue #6647) by ensuring partitions have at least two elements and by adding a regression test. Also introduces an adjust_chunk_size_and_max_chunks executor-parameters customization point with dispatch/rebind support and corresponding unit tests.

Changes:

  • Fix parallel reduce partition reduction to avoid assuming *first converts to T, and add a regression test (reduce_6647).
  • Add adjust_chunk_size_and_max_chunks customization point + rebind/wrapping support, and update chunking code to use it.
  • Extend unit tests around executor-parameters dispatch/rebinding (including dispatch order guarantees).

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
libs/core/execution/include/hpx/execution/executors/rebind_executor_parameters.hpp Adds support for rebinding/wrapping adjust_chunk_size_and_max_chunks and fixes mark_begin_execution double-dispatch.
libs/core/execution/include/hpx/execution/executors/execution_parameters_fwd.hpp Declares the adjust_chunk_size_and_max_chunks CPO.
libs/core/execution/include/hpx/execution/executors/execution_parameters.hpp Implements default adjustment logic + property plumbing for adjust_chunk_size_and_max_chunks.
libs/core/execution/tests/unit/rebind_executor_parameters.cpp Adds rebind tests for adjust_chunk_size_and_max_chunks and dispatch order test for mark_begin_execution.
libs/core/execution/tests/unit/executor_parameters_dispatching.cpp Adds dispatch precedence tests for adjust_chunk_size_and_max_chunks between parameters and executor.
libs/core/algorithms/include/hpx/parallel/util/detail/chunk_size.hpp Switches chunk-size/max-chunks consistency logic to the new CPO.
libs/core/algorithms/include/hpx/parallel/algorithms/reduce.hpp Fixes parallel reduce partition reduction by enforcing partition sizes >= 2 and adding a helper that doesn’t require *first convertible to T.
libs/core/algorithms/tests/regressions/reduce_6647.cpp Adds regression test covering the reported reduce bug case.
libs/core/algorithms/tests/regressions/CMakeLists.txt Registers reduce_6647 regression test.
.circleci/tests.unit1.container_algorithms Adds a test entry to CI list.
components/parcel_plugins/coalescing/src/coalescing_message_handler.cpp Formatting-only change (line wrap).
components/containers/partitioned_vector/tests/unit/is_iterator_partitioned_vector.cpp Formatting-only change (include order).

Comment thread libs/core/algorithms/include/hpx/parallel/util/detail/chunk_size.hpp Outdated
Comment thread libs/core/algorithms/include/hpx/parallel/algorithms/reduce.hpp
Comment thread libs/core/algorithms/include/hpx/parallel/algorithms/reduce.hpp Outdated
Comment thread libs/core/algorithms/include/hpx/parallel/algorithms/reduce.hpp
Comment thread libs/core/execution/include/hpx/execution/executors/execution_parameters.hpp Outdated
Comment on lines +522 to +524
auto rebound_params =
hpx::execution::experimental::rebind_executor_parameters(
policy.parameters(), reduce_executor_parameters{});
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says changes were limited to a few algorithm files/tests, but this PR also introduces a new executor-parameters customization point (adjust_chunk_size_and_max_chunks) plus rebind/dispatch wiring and adds/updates several execution unit tests and CI test lists. Please update the PR description (or narrow the PR scope) so reviewers can accurately assess the broader API and behavior changes included here.

Copilot uses AI. Check for mistakes.
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
Signed-off-by: Bhoomish <bhoomishguptabits@gmail.com>
…max_chunks functionality

Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
@BhoomishGupta
Copy link
Copy Markdown
Contributor Author

@BhoomishGupta ping?

Hi @hkaiser, sorry for the delay, I had academic tests. I have pushed updates and resolved the conflicts; please take another look.

I will also take a look into the snippets copilot reviewed.

@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Apr 15, 2026

@BhoomishGupta please also have a look at the failing test: rebind_executor_parameters.

Signed-off-by: BhoomishGupta <bhoomishguptabits@gmail.com>
Copilot AI review requested due to automatic review settings April 17, 2026 15:22
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Comment on lines +476 to +483
// Helper function to reduce a partition without requiring an init value.
// Assumes partition size >= 2 (enforced by reduce_executor_parameters).
template <typename ExPolicy, typename FwdIterB, typename T, typename Reduce>
T reduce_partition(
FwdIterB part_begin, std::size_t part_size, Reduce const& r)
{
HPX_ASSERT(part_size >= 2);

Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reduce_partition assumes part_size >= 2 (asserts it) and the chunk-size adjustment only targets the fixed-size partitioning path. Execution policies using has_variable_chunk_size parameters (e.g. guided_chunk_size, dynamic_chunk_size, etc.) build variable-sized shapes and can still produce 1-element partitions, which would trip this assert (or worse in release builds). Consider also wrapping/overriding get_chunk_size (and/or the variable-shape generation) for reduce to guarantee no 1-element partitions, or otherwise handle part_size == 1 safely without double-counting elements.

Suggested change
// Helper function to reduce a partition without requiring an init value.
// Assumes partition size >= 2 (enforced by reduce_executor_parameters).
template <typename ExPolicy, typename FwdIterB, typename T, typename Reduce>
T reduce_partition(
FwdIterB part_begin, std::size_t part_size, Reduce const& r)
{
HPX_ASSERT(part_size >= 2);
// Helper function to reduce a non-empty partition without requiring an
// init value.
template <typename ExPolicy, typename FwdIterB, typename T, typename Reduce>
T reduce_partition(
FwdIterB part_begin, std::size_t part_size, Reduce const& r)
{
HPX_ASSERT(part_size != 0);
if (part_size == 1)
{
return *part_begin;
}

Copilot uses AI. Check for mistakes.
Comment on lines +623 to +628
max_chunks =
(std::min) (cores_times_4, num_elements); // -V112

chunk_size = (std::max) (chunk_size,
(num_elements + max_chunks - 1) / max_chunks);
}
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adjust_chunk_size_and_max_chunks_default can divide by zero when num_elements == 0 and both max_chunks and chunk_size are 0: max_chunks becomes 0 via min(cores_times_4, num_elements), and then (num_elements + max_chunks - 1) / max_chunks is evaluated. Add an early return/guard for num_elements == 0 (or ensure max_chunks is set to at least 1 before dividing) to keep the default implementation safe for empty ranges.

Copilot uses AI. Check for mistakes.
Comment on lines 358 to 366
#include <hpx/config.hpp>
#include <hpx/assert.hpp>
#include <hpx/modules/concepts.hpp>
#include <hpx/modules/execution.hpp>
#include <hpx/modules/executors.hpp>
#include <hpx/modules/functional.hpp>
#include <hpx/modules/iterator_support.hpp>
#include <hpx/modules/pack_traversal.hpp>
#include <hpx/parallel/algorithms/detail/accumulate.hpp>
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says the only non-formatting changes are in reduce.hpp, the regressions CMakeLists, and the new reduce_6647.cpp, but this PR also introduces functional changes in the execution module (execution_parameters*, rebind_executor_parameters.hpp) and updates execution unit tests. Please update the PR description to reflect these additional code changes so reviewers have the right scope in mind.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect reduce implementation

4 participants