Starting with the parquet_read_parallel example from arrow2, I am trying to deserialize a Chunk into a Vec of structs.
Using the deserialize_parallel function as defined in the above example, the following code currently works for me:
pub struct Document {
    content: String,
}
...
let chunk = deserialize_parallel(&mut columns)?;
let array = StructArray::new(
    DataType::Struct(fields.clone()),
    chunk.arrays().to_vec(),
    None,
);
let documents: Vec<Document> = array.to_boxed().try_into_collection().unwrap();
Questions:
- With the APIs currently exposed by arrow2 and arrow2-convert, is there a better way to convert the Chunk into a Vec of structs? I suspect the intermediate conversion from Chunk to StructArray, followed by to_boxed, is not the most efficient route.
- Would it be possible to expose TryIntoCollection::try_into_collection directly on the Chunk as well?
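To make the second question concrete, here is a rough sketch of the shape I have in mind, written with std-only stand-in types so that it compiles on its own. Everything in it is hypothetical: ColumnChunk, FromRow, TryIntoRows and try_into_rows are names made up for illustration and are not part of arrow2 or arrow2-convert, and the stand-in chunk only holds String columns. The point is just that the conversion is a method on the chunk itself, with no intermediate struct array built by the caller.

```rust
// Stand-in for a deserialized chunk: one Vec per column, all of equal length.
// This is NOT arrow2::chunk::Chunk; it only mimics the column-oriented layout.
#[derive(Debug, Clone, PartialEq)]
pub struct ColumnChunk {
    pub columns: Vec<Vec<String>>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Document {
    pub content: String,
}

// Hypothetical per-row constructor: builds one value from the cells of one row.
pub trait FromRow: Sized {
    fn from_row(row: &[&str]) -> Result<Self, String>;
}

impl FromRow for Document {
    fn from_row(row: &[&str]) -> Result<Self, String> {
        match row {
            // Document has exactly one field, so exactly one column is expected.
            [content] => Ok(Document { content: (*content).to_owned() }),
            _ => Err(format!("expected 1 column, found {}", row.len())),
        }
    }
}

// Hypothetical extension trait: the conversion is exposed on the chunk itself.
pub trait TryIntoRows {
    fn try_into_rows<T: FromRow>(&self) -> Result<Vec<T>, String>;
}

impl TryIntoRows for ColumnChunk {
    fn try_into_rows<T: FromRow>(&self) -> Result<Vec<T>, String> {
        // Row count is taken from the first column; an empty chunk has no rows.
        let len = self.columns.first().map_or(0, |c| c.len());
        if self.columns.iter().any(|c| c.len() != len) {
            return Err("columns have different lengths".to_string());
        }
        // Walk the columns row by row and build one T per row.
        (0..len)
            .map(|i| {
                let row: Vec<&str> = self.columns.iter().map(|c| c[i].as_str()).collect();
                T::from_row(&row)
            })
            .collect()
    }
}
```

With something like this, the call site would reduce to a single line such as `let documents: Vec<Document> = chunk.try_into_rows()?;`. Whether an equivalent can be written against the real Chunk and TryIntoCollection types without an extra copy is exactly what I am unsure about.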