Is your feature request related to a problem or challenge? Please describe what you are trying to do.
As we begin to contemplate how to read and write shredded variants, we will need some way to construct Arrow arrays that contain shredded variants.
Physically, these will be Arrow StructArrays with two or three fields:
- Non-shredded: (2 fields)
STRUCT { "metadata": Binary, "value": Binary }
- Shredded: (3 fields)
STRUCT { "metadata": Binary, "value": Binary, "typed_value": STRUCT { ... } }
More information on how to represent Variants as Arrow arrays can be found in the proposal:
Describe the solution you'd like
I would like some way to construct such shredded arrays easily and efficiently, in idiomatic Rust style.
Describe alternatives you've considered
One idea from @zeroshade (thank you!) is to create a VariantArrayBuilder that is responsible for building the correct StructArrays from variants, including shredding out any columns. To create a shredded output, you would provide the shredded schema up front.
For example (based on the Go implementation and @scovich's comment here), to create a shredded Arrow array that shreds out columns "foo" and "bar" from any variant objects, we would need this schema:
STRUCT {
  metadata: BinaryView,
  value: BinaryView,
  typed_value: STRUCT {
    foo: Int64,
    bar: Int32
  }
}
The code would look like this:
// Create an Arrow Field that describes the desired shredded output schema.
// The field name "variant" and the nullability flags are placeholders.
let shredded_schema = Field::new_struct(
    "variant",
    vec![
        Field::new("metadata", DataType::BinaryView, false),
        Field::new("value", DataType::BinaryView, true),
        Field::new_struct(
            "typed_value",
            vec![
                Field::new("foo", DataType::Int64, true),
                Field::new("bar", DataType::Int32, true),
            ],
            true,
        ),
    ],
    false,
);
// Create a builder for an array (batch) of Variant values
let mut array_builder = VariantArrayBuilder::new(shredded_schema);
// Append a row to the builder
let mut object = array_builder.new_object();
// ... add appropriate fields ...
// use like a normal ObjectBuilder(??)
object.finish();
// Append a second row (has no foo or bar fields)
array_builder.append_value(43);
// ...
// Finalize the builder
let variant_array: StructArray = array_builder.build()?;
// variant_array is a shredded variant
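To make the intended shredding behavior concrete, here is a minimal std-only sketch of what such a builder might do per row. Everything in it is hypothetical and for illustration only: `Variant`, `ToyVariantArrayBuilder`, `ShreddedArray`, and `build_example` are invented names, plain `Vec`s stand in for Arrow arrays, only `i64` fields are shredded, and the nested per-field `value`/`typed_value` layout and the `metadata` column are left out.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a Variant value (the real type is a binary encoding).
#[derive(Debug, Clone, PartialEq)]
enum Variant {
    Int(i64),
    Object(BTreeMap<String, Variant>),
}

/// Toy output: a residual `value` column plus one typed column per shredded field.
#[derive(Debug, PartialEq)]
struct ShreddedArray {
    value: Vec<Option<Variant>>,
    typed_value: BTreeMap<String, Vec<Option<i64>>>,
}

/// Hypothetical builder: the set of shredded field names is fixed up front.
struct ToyVariantArrayBuilder {
    value: Vec<Option<Variant>>,
    typed_value: BTreeMap<String, Vec<Option<i64>>>,
}

impl ToyVariantArrayBuilder {
    fn new(shredded_fields: &[&str]) -> Self {
        let typed_value = shredded_fields
            .iter()
            .map(|name| (name.to_string(), Vec::new()))
            .collect();
        ToyVariantArrayBuilder { value: Vec::new(), typed_value }
    }

    /// Append one row, moving matching integer fields into the typed columns.
    fn append_value(&mut self, variant: Variant) {
        let mut residual = match variant {
            Variant::Object(fields) => fields,
            other => {
                // Not an object: nothing to shred, so the typed columns get nulls.
                for column in self.typed_value.values_mut() {
                    column.push(None);
                }
                self.value.push(Some(other));
                return;
            }
        };
        for (name, column) in self.typed_value.iter_mut() {
            // Shred the field only if it is present and has the expected type.
            let typed = match residual.get(name) {
                Some(Variant::Int(v)) => Some(*v),
                _ => None,
            };
            if typed.is_some() {
                residual.remove(name);
            }
            column.push(typed);
        }
        // Fields that were not shredded stay in the residual `value` column.
        self.value.push(if residual.is_empty() {
            None
        } else {
            Some(Variant::Object(residual))
        });
    }

    fn build(self) -> ShreddedArray {
        ShreddedArray { value: self.value, typed_value: self.typed_value }
    }
}

/// Builds two rows: {"foo": 1, "bar": 2, "baz": 3} and the plain integer 43.
fn build_example() -> ShreddedArray {
    let mut builder = ToyVariantArrayBuilder::new(&["foo", "bar"]);
    let row: BTreeMap<String, Variant> = vec![("foo", 1), ("bar", 2), ("baz", 3)]
        .into_iter()
        .map(|(k, v)| (k.to_string(), Variant::Int(v)))
        .collect();
    builder.append_value(Variant::Object(row));
    builder.append_value(Variant::Int(43));
    builder.build()
}
```

In this sketch, the first row ends up with `foo` and `bar` in the typed columns and only `baz` in the residual value, while the second row keeps `43` in the residual value with nulls in both typed columns. A real implementation would also need to handle type mismatches per the shredding rules, nested objects, and the metadata dictionary.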
I think a VariantArrayBuilder will be helpful for use cases other than Variant, and @harshmotw-db has created a version of one here:
Prior Art
Golang implementation:
Additional context