Introduce PrimitiveArrayBuilder::build(), avoid use of ArrayData#9305
Introduce PrimitiveArrayBuilder::build(), avoid use of ArrayData#9305alamb wants to merge 4 commits intoapache:mainfrom
PrimitiveArrayBuilder::build(), avoid use of ArrayData#9305Conversation
b3bbc1c to
12525b3
Compare
PrimitiveArrayBuilder::build(), avoid use of ArrayData
|
run benchmark builder |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
run benchmark builder |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤔 seems to get slower. Will investigate |
|
run benchmark builder |
|
🤖 |
| values: ScalarBuffer<T::Native>, | ||
| nulls: Option<NullBuffer>, | ||
| ) -> Self { | ||
| Self::assert_compatible(&data_type); |
There was a problem hiding this comment.
This does an extra check that wasn't there before when using build_unchecked?
There was a problem hiding this comment.
Probably not relevant if we removed all build
There was a problem hiding this comment.
yeah, I think I have a way to remove this
|
🤖: Benchmark completed Details
|
|
And it's the other way around (so noise confirmed) |
Yeah. I thought of some way to make this faster though, so working on that now |
fcebc95 to
11d7c80
Compare
6b8125c to
6895106
Compare
| values_builder: Vec<T::Native>, | ||
| null_buffer_builder: NullBufferBuilder, | ||
| data_type: DataType, | ||
| /// Optional data type override (e.g. to add timezone or precision/scale) |
There was a problem hiding this comment.
This time I tried switching to use Option so we only pay the extra type check when there is actually a datatype override
| /// not [PrimitiveArray::is_compatible] with the builder's primitive type | ||
| /// `T`. | ||
| pub fn with_data_type(self, data_type: DataType) -> Self { | ||
| assert!( |
There was a problem hiding this comment.
this assert is done as part of the PrimitiveArray::with_type later on
I also updated the tests to show that
|
run benchmark builder |
|
🤖 |
|
🤖: Benchmark completed Details
|
This reverts commit 6895106.
|
run benchmark builder |
|
🤖 |
|
(testing without changes to the benchmark) |
|
🤖: Benchmark completed Details
|

Which issue does this PR close?
ArrayDatawith direct Array construction, when possible #9298Note if this looks good I will file a ticket about adding
buildto the other array buildersRationale for this change
While reviewing #9303 from @Dandandan I noticed that using the primitive builders to create arrays was non ideal for two reasons:
DataType(which while not super expensive is total overhead)Vecunecessairly)If this approach is accepted, I will make a ticket to track adding
buildto the other buildersWhat changes are included in this PR?
finishandfinish_clonedto avoid using ArrayDatabuildwhich consumes the builderThis is similar to the build methods added to the other builders here
{Null,Boolean,}BufferBuilder#9155Are these changes tested?
Yes by CI and new doc tests
I will also run benchmarks
Are there any user-facing changes?