Is your feature request related to a problem or challenge?
Currently we use hash % num_partition to divide rows among partitions, based on the hash.
However the % operation is slow on all platforms (especially x86, but ARM as well) and makes RepartitionExec slower.
Describe the solution you'd like
We can consider using fastrange https://github.com/lemire/fastrange instead.
Describe alternatives you've considered
We can apply strength reduction (https://en.wikipedia.org/wiki/Strength_reduction) or the crate
https://docs.rs/strength_reduce/latest/strength_reduce/
to make the hash % partition calculation faster, as partition is a constant over the batch session config.
Additional context
No response
Is your feature request related to a problem or challenge?
Currently we use
hash % num_partitionto divide rows among partitions, based on the hash.However the % operation is slow on all platforms (especially x86, but ARM as well) and makes
RepartitionExecslower.Describe the solution you'd like
We can consider using fastrange https://github.com/lemire/fastrange instead.
Describe alternatives you've considered
We can apply strength reduction (https://en.wikipedia.org/wiki/Strength_reduction) or the crate
https://docs.rs/strength_reduce/latest/strength_reduce/
to make the
hash % partitioncalculation faster, as partition is a constant over the batch session config.Additional context
No response