From cddf7c4d0f24de2a3b7fb3ae66f70304fc00a045 Mon Sep 17 00:00:00 2001
From: Zehua Zou
Date: Thu, 11 Jun 2026 14:46:47 +0800
Subject: [PATCH] Correct equivalent parquet type of binary in variant documentation

---
 VariantEncoding.md  | 2 +-
 VariantShredding.md | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/VariantEncoding.md b/VariantEncoding.md
index b78c02ef..9b86216c 100644
--- a/VariantEncoding.md
+++ b/VariantEncoding.md
@@ -419,7 +419,7 @@ The Decimal type contains a scale, but no precision. The implied precision of a
 | Timestamp | timestamp | `12` | TIMESTAMP(isAdjustedToUTC=true, MICROS) | 8-byte little-endian |
 | TimestampNTZ | timestamp without time zone | `13` | TIMESTAMP(isAdjustedToUTC=false, MICROS) | 8-byte little-endian |
 | Float | float | `14` | FLOAT | IEEE little-endian |
-| Binary | binary | `15` | BINARY | 4 byte little-endian size, followed by bytes |
+| Binary | binary | `15` | BYTE_ARRAY | 4 byte little-endian size, followed by bytes |
 | String | string | `16` | STRING | 4 byte little-endian size, followed by UTF-8 encoded bytes |
 | TimeNTZ | time without time zone | `17` | TIME(isAdjustedToUTC=false, MICROS) | 8-byte little-endian |
 | Timestamp | timestamp with time zone | `18` | TIMESTAMP(isAdjustedToUTC=true, NANOS) | 8-byte little-endian |
diff --git a/VariantShredding.md b/VariantShredding.md
index 4f7d6142..60f1422a 100644
--- a/VariantShredding.md
+++ b/VariantShredding.md
@@ -100,8 +100,8 @@ Shredded values must use the following Parquet types:
 | timestamptz(9) | INT64 | TIMESTAMP(true, NANOS) |
 | timestampntz(6) | INT64 | TIMESTAMP(false, MICROS) |
 | timestampntz(9) | INT64 | TIMESTAMP(false, NANOS) |
-| binary | BINARY | |
-| string | BINARY | STRING |
+| binary | BYTE_ARRAY | |
+| string | BYTE_ARRAY | STRING |
 | uuid | FIXED_LEN_BYTE_ARRAY[len=16] | UUID |
 | array | GROUP; see Arrays below | LIST |
 | object | GROUP; see Objects below | |