Conversation
|
run benchmark filter_kernels boolean_kernels arrow_reader arrow_reader_clickbench |
|
🤖 Hi @Dandandan, thanks for the request (#9284 (comment)).
Please choose one or more of these with |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
|
run benchmark boolean_kernels filter_kernels |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
|
run benchmark boolean_kernels |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
run benchmark boolean_kernels |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖: Benchmark completed Details
|
|
run benchmark filter_kernels boolean_kernels arrow_reader_clickbench |
|
🤖 |
|
FYI @alamb this is starting to look good perf wise |
| } | ||
|
|
||
| // both buffers have the same offset, we can use UnalignedBitChunk for both | ||
| let left_chunks = UnalignedBitChunk::new(left, left_offset_in_bits, len_in_bits); |
I think we actually don't have to use this part (use the byte aligned one above and set the correct offset).
arrow-buffer/src/buffer/boolean.rs
Outdated
| result.truncate(chunks.num_bytes()); | ||
| } | ||
| let src = src.as_ref(); | ||
| let chunks = UnalignedBitChunk::new(src, offset_in_bits, len_in_bits); |
Probably not needed now that we have a fast byte aligned version.
| } | ||
|
|
||
| #[inline] | ||
| fn fold<B, F>(mut self, init: B, mut f: F) -> B |
The idea here is to provide a specialized fold so that from_trusted_len_iter is fast (I used AI assistance to come up with the implementation, but it looks normal to me).
| let mut dst = buffer.data.as_ptr(); | ||
| for item in iterator { | ||
| let mut dst = buffer.data.as_ptr() as *mut T; | ||
| iterator.for_each(|item| { |
for_each uses fold, so it can pick up a more efficient implementation if the iterator provides one.
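To illustrate the point about for_each and fold, here is a minimal sketch. It is not the code from this PR: `Bits` and `count_set_bits` are hypothetical names made up for the example, and the only thing it relies on is that the standard library's default `Iterator::for_each` is written in terms of `fold`.

```rust
/// Hypothetical bit iterator (not the type from this PR): yields the bits of
/// `bytes` at positions `pos..len`, least-significant bit first.
struct Bits<'a> {
    bytes: &'a [u8],
    pos: usize,
    len: usize,
}

impl<'a> Iterator for Bits<'a> {
    type Item = bool;

    // Generic path: one bounds check and one Option per item.
    fn next(&mut self) -> Option<bool> {
        if self.pos >= self.len {
            return None;
        }
        let bit = (self.bytes[self.pos / 8] >> (self.pos % 8)) & 1 == 1;
        self.pos += 1;
        Some(bit)
    }

    // Specialized fold: a single tight loop over the remaining positions,
    // without going through `next` for every item.
    fn fold<B, F>(self, init: B, mut f: F) -> B
    where
        F: FnMut(B, bool) -> B,
    {
        let mut acc = init;
        for i in self.pos..self.len {
            acc = f(acc, (self.bytes[i / 8] >> (i % 8)) & 1 == 1);
        }
        acc
    }
}

/// The default `for_each` delegates to `fold`, so this call runs the
/// specialized loop above rather than calling `next` repeatedly.
fn count_set_bits(bytes: &[u8], len: usize) -> usize {
    let mut count = 0;
    Bits { bytes, pos: 0, len }.for_each(|bit| count += bit as usize);
    count
}
```

A plain `for item in iterator` loop always drives the iterator through `next`, which is why switching the caller to `for_each` can matter when the iterator overrides `fold`.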
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
run benchmark coalesce_kernels |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
|
run benchmark boolean_kernels |
|
run benchmark boolean_kernels |
|
run benchmark boolean_kernels |
|
🤖 |
|
(sorry runner had been restarted for some reason) |
|
🤖: Benchmark completed Details
|
Which issue does this PR close?
Rationale for this change
We can speed up bitwise operations (and / or / etc.) on boolean data with non-zero bit offsets.
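As a rough sketch of the kind of fast path discussed in the review comments (byte-wise operation plus carrying the bit offset over to the result), under the assumption that bits are stored least-significant bit first: `bitand_with_offsets` and `get_bit` are hypothetical helpers written for this illustration, not functions from arrow-buffer.

```rust
/// Reads bit `i` of `bytes` (LSB-first packing is an assumption of this sketch).
fn get_bit(bytes: &[u8], i: usize) -> bool {
    (bytes[i / 8] >> (i % 8)) & 1 == 1
}

/// ANDs `len` bits of `left` (starting at `left_offset`) with `len` bits of
/// `right` (starting at `right_offset`).
/// Returns `(bytes, bit_offset)`: the result bits start at `bit_offset` in `bytes`.
fn bitand_with_offsets(
    left: &[u8],
    left_offset: usize,
    right: &[u8],
    right_offset: usize,
    len: usize,
) -> (Vec<u8>, usize) {
    if len == 0 {
        return (Vec::new(), 0);
    }

    let shift = left_offset % 8;
    if shift == right_offset % 8 {
        // Fast path: both inputs have the same alignment within a byte, so
        // whole bytes can be combined directly. Instead of shifting the
        // result, keep `shift` as the offset of the output.
        let n_bytes = (shift + len + 7) / 8;
        let l = &left[left_offset / 8..][..n_bytes];
        let r = &right[right_offset / 8..][..n_bytes];
        let out: Vec<u8> = l.iter().zip(r).map(|(a, b)| a & b).collect();
        return (out, shift);
    }

    // Slow path: alignments differ, so combine bit by bit into a fresh
    // buffer that starts at offset 0.
    let mut out = vec![0u8; (len + 7) / 8];
    for i in 0..len {
        if get_bit(left, left_offset + i) && get_bit(right, right_offset + i) {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    (out, 0)
}
```

In the fast path the bits outside `bit_offset..bit_offset + len` in the first and last output bytes are left as whatever the byte-wise AND produced; callers are expected to ignore them, in the same way they ignore bits outside the offset and length of the inputs.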
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?