Looks interesting - I am wondering if we can't do the same in arrow-rs without relying on a new dependency? |
|
|
||
| impl<'a> KeyAccessor for FixedSizeBinaryAccessor<'a> { | ||
| #[inline(always)] | ||
| fn get_key(&self, index: usize) -> &[u8] { |
| let mut valids: Vec<(u32, u32, u64)> = value_indices | ||
| .into_iter() | ||
| .map(|idx| unsafe { | ||
| // Build (index, 8-byte prefix) tuples for prefix-accelerated comparison sort |
What is the improvement without orasort on this?
|
run benchmark sort_kernels |
This comment was marked as outdated.
|
🤖 |
|
Benchmark script failed with exit code 101. |
|
run benchmark sort_kernel |
|
🤖 |
|
🤖: Benchmark completed
|
|
run benchmark sort_kernel |
|
🤖 |
|
🤖: Benchmark completed
|
|
Seems not to reproduce on the VM 🤔 perhaps machine-dependent |
|
I don't understand the benchmarks, can someone explain it to me? |
Which issue does this PR close?
Rationale for this change
Orasort partitions the keys by prefix and applies radix sort within each resulting chunk.
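To illustrate the prefix idea, here is a minimal sketch of the "(index, 8-byte prefix) tuples" approach mentioned in the diff comment. It is not the Orasort implementation and does not use its API: `prefix8` and `sort_indices_by_prefix` are hypothetical names, and the standard library's comparison sort stands in for the radix pass. It assumes keys are plain byte slices compared lexicographically.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: pack the first (up to) 8 bytes of a key into a
/// big-endian u64, zero-padding short keys, so that comparing the integers
/// agrees with comparing the first 8 bytes lexicographically.
fn prefix8(key: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    let n = key.len().min(8);
    buf[..n].copy_from_slice(&key[..n]);
    u64::from_be_bytes(buf)
}

/// Hypothetical helper: return the indices of `keys` in ascending
/// lexicographic order. Most comparisons are settled by one u64 compare;
/// the full keys are only touched when two prefixes are equal.
fn sort_indices_by_prefix(keys: &[&[u8]]) -> Vec<u32> {
    // Build (prefix, index) tuples once, up front.
    let mut tagged: Vec<(u64, u32)> = keys
        .iter()
        .enumerate()
        .map(|(i, k)| (prefix8(k), i as u32))
        .collect();

    tagged.sort_unstable_by(|a, b| match a.0.cmp(&b.0) {
        // Equal prefixes (including zero-padded short keys): fall back
        // to comparing the full byte slices.
        Ordering::Equal => keys[a.1 as usize].cmp(keys[b.1 as usize]),
        other => other,
    });

    tagged.into_iter().map(|(_, i)| i).collect()
}
```

The benefit, if any, comes from keeping the sort's working set small and contiguous: the tuples are compared without dereferencing into the value buffer unless prefixes tie. Whether that helps in practice depends on key length and how often prefixes collide, which is what the benchmarks are meant to show.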
What changes are included in this PR?
Adds Orasort-based sorting, adapted to operate on prefix slices of array buffers.
Are these changes tested?
Yes, the existing tests already cover it.
In addition, extra benchmarks are added to demonstrate the gain.
Are there any user-facing changes?
No.
Bench Results (main vs this branch)
Orasort core implementation: https://github.com/psila-ai/orasort
Orasort performance defaults: https://github.com/psila-ai/orasort?tab=readme-ov-file#performance