diff --git a/.gitignore b/.gitignore index 6a93bb1..ca9b044 100644 --- a/.gitignore +++ b/.gitignore @@ -27,3 +27,5 @@ npm-debug.log* # Test/temp files *.bin /tmp/ + +.claude/ \ No newline at end of file diff --git a/CLAUDE.md b/CLAUDE.md index 85e8a1f..5e9b670 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -56,6 +56,10 @@ src/ ├── store/ │ └── store.ts # Zustand store (query, parsed data, UI state) └── styles/ # CSS files +docs/ +├── rowbinaryspec.md # RowBinary wire format specification +├── nativespec.md # Native wire format specification +└── jsonspec.md # JSON type specification docker/ ├── nginx.conf # Proxies /clickhouse to ClickHouse server ├── users.xml # Read-only ClickHouse user @@ -64,8 +68,9 @@ docker/ ## Wire Format Docs - * RowBinary: rowbinaryspec.md - * Native: nativespec.md + * RowBinary: docs/rowbinaryspec.md + * Native: docs/nativespec.md + * JSON: docs/jsonspec.md ## Key Concepts diff --git a/QA_ISSUES.md b/QA_ISSUES.md deleted file mode 100644 index 084ea97..0000000 --- a/QA_ISSUES.md +++ /dev/null @@ -1,304 +0,0 @@ -# QA Issues Found - RowBinary/Native Decoder Edge Cases - -This document summarizes bugs found during QA testing of the RowBinary and Native decoders. - -## Critical Issues (Parsing Failures) - -### 1. Enum8 with Negative Values - Both Decoders - -**Severity:** High -**Location:** `rowbinary-decoder.ts:697` and equivalent in native-decoder.ts - -**Description:** -Enum8 can store values from -128 to 127 (Int8), but the decoder reads enum values as `UInt8` and looks them up in the value map. When the enum has negative values (e.g., `'neg' = -128`), the decoder reads `128` instead of `-128`, failing to find the enum name. - -**Query to reproduce:** -```sql -SELECT 'neg'::Enum8('neg' = -128, 'pos' = 127) as val -``` - -**Expected:** `'neg'` -**Actual:** `''` - -**Root Cause:** -`decodeEnum8()` uses `this.reader.readUInt8()` but should use `this.reader.readInt8()` since Enum8 values are stored as signed Int8. - -The parser correctly stores enum values like -128 as Map keys (`type-parser.ts:321`). However, the decoder reads the byte 0x80 as unsigned 128 instead of signed -128, so the Map lookup fails. - -```typescript -// Current (buggy): -const { value, range } = this.reader.readUInt8(); // reads 0x80 as 128 - -// Should be: -const { value, range } = this.reader.readInt8(); // reads 0x80 as -128 -``` - -**Fix Location:** -- `src/core/decoder/rowbinary-decoder.ts:698` - Change `readUInt8()` to `readInt8()` -- `src/core/decoder/native-decoder.ts:763` - Change `readUInt8()` to `readInt8()` - ---- - -### 2. JSON with Nested Array of Objects - RowBinary Decoder - -**Severity:** High -**Location:** `rowbinary-decoder.ts:1110` - -**Description:** -When decoding JSON that contains arrays of objects, the decoder throws `Unknown BinaryTypeIndex: 0x30`. This indicates the decoder doesn't handle all BinaryTypeIndex values used in the JSON Dynamic encoding. - -**Query to reproduce:** -```sql -SELECT '{"items": [{"id": 1}, {"id": 2}]}'::JSON as val --- Requires: allow_experimental_json_type=1 -``` - -**Expected:** Successfully decode the nested JSON structure -**Actual:** `Error: Unknown BinaryTypeIndex: 0x30` - -**Root Cause:** -The `decodeDynamicType()` switch statement is missing the case for BinaryTypeIndex `0x30` which is **JSON**. When JSON contains nested objects/arrays, those nested JSON values are encoded with type index `0x30`, but the decoder doesn't handle this recursive case. - -Per the spec (`jsonspec.md:538` and `nativespec.md:2718`): -``` -| JSON | 0x30 | -``` - -**Fix Location:** -- `src/core/decoder/rowbinary-decoder.ts:942-1111` - Add case for `0x30` (JSON) in `decodeDynamicType()` - ---- - -### 3. JSON with Nested Array of Objects - Native Decoder - -**Severity:** High -**Location:** `native-decoder.ts:2219` - -**Description:** -When decoding JSON that contains arrays of objects, the Native decoder throws `RangeError: Offset is outside the bounds of the DataView`. This indicates the decoder is reading past the end of the buffer. - -**Query to reproduce:** -```sql -SELECT '{"items": [{"id": 1}, {"id": 2}]}'::JSON as val --- Requires: allow_experimental_json_type=1 -``` - -**Expected:** Successfully decode the nested JSON structure -**Actual:** `RangeError: Offset is outside the bounds of the DataView` at `decodeJSONColumnV1` - -**Root Cause:** -The JSON V1 decoder appears to miscalculate offsets or misparse the structure when dealing with arrays containing objects, leading to reading past buffer boundaries. - -**Fix Location:** -- `src/core/decoder/native-decoder.ts:2144` and `native-decoder.ts:2219` - ---- - -## Minor Issues (Cosmetic / Non-Failures) - -### 4. Negative Zero Float64 Preservation - -**Severity:** Low (Not a bug - correct IEEE 754 behavior) -**Location:** Both decoders - -**Description:** -The decoders correctly preserve IEEE 754 negative zero (`-0.0`) rather than normalizing it to positive zero. This is actually correct behavior per the IEEE 754 specification, but may be unexpected. - -**Query:** -```sql -SELECT -0.0::Float64 as val -``` - -**Result:** `-0` (correct IEEE 754 preservation) - -**Notes:** No fix needed - this is correct behavior. Tests should account for this. - ---- - -### 5. DateTime Timezone Not in typeString (Native) - -**Severity:** Low -**Location:** Native decoder - -**Description:** -When querying `DateTime('Pacific/Kiritimati')`, the Native format column's `typeString` shows just `DateTime` without the timezone parameter. This may be ClickHouse returning the type differently in Native format. - -**Query:** -```sql -SELECT '2024-01-15 12:30:45'::DateTime('Pacific/Kiritimati') as val -``` - -**Expected:** typeString contains `Pacific/Kiritimati` -**Actual:** typeString is just `DateTime` - -**Notes:** May be ClickHouse behavior rather than decoder issue. Needs investigation. - ---- - -## Test Issues (Not Decoder Bugs) - -### 6. Variant with Multiple Similar Integer Types - -**Status:** Test issue, not decoder bug - -ClickHouse rejects Variants with similar types (e.g., `Variant(UInt8, UInt16, UInt32)`) unless `allow_suspicious_variant_types=1` is set. Tests need to add this setting. - -### 7. Nested Type Cast Syntax - -**Status:** Test issue, not decoder bug - -The test used incorrect syntax `[1, 2]::Nested(id UInt8)`. Nested types require table creation and INSERT to test properly. - ---- - ---- - -## Dynamic Type Issues - -### 4. Array inside Dynamic has extra nesting level - RowBinary - -**Severity:** High -**Location:** `rowbinary-decoder.ts` - `decodeDynamic()` - -**Description:** -When decoding an Array type wrapped in Dynamic, the resulting AST has an extra nesting level. The value property contains `[[1, 2, 3]]` instead of `[1, 2, 3]`. - -**Query to reproduce:** -```sql -CREATE TABLE test (id UInt32, val Dynamic) ENGINE = Memory; -INSERT INTO test VALUES (1, [1, 2, 3]::Array(UInt8)); -SELECT val FROM test; -``` - -**Expected:** `children[1].value = 1` (direct array elements after length node) -**Actual:** `children[1].value = [1, 2, 3]` (array wrapped in another level) - ---- - -### 5. DateTime64 in Dynamic shows raw timestamp - RowBinary - -**Severity:** Medium -**Location:** `rowbinary-decoder.ts` - `decodeDynamic()` - -**Description:** -When DateTime64 is stored in a Dynamic column and decoded, the displayValue shows the raw Unix timestamp (e.g., `1705321845.123`) instead of a formatted date string. - -**Query to reproduce:** -```sql -CREATE TABLE test (id UInt32, val Dynamic) ENGINE = Memory; -INSERT INTO test VALUES (1, toDateTime64('2024-01-15 12:30:45.123', 3)); -SELECT val FROM test; -``` - -**Expected:** displayValue contains `2024-01-15` -**Actual:** displayValue is `1705321845.123` - ---- - -### 6. Native decoder - Complex types in Dynamic have garbled values - -**Severity:** High -**Location:** `native-decoder.ts` - Dynamic column decoding - -**Description:** -When decoding Array or Tuple types inside Dynamic columns in Native format, the values are garbled or incorrectly structured. Arrays show empty, Tuples show raw bytes. - -**Query to reproduce:** -```sql -SELECT [1, 2, 3]::Array(UInt32)::Dynamic as val -- Native format -SELECT (42, 'test')::Tuple(UInt32, String)::Dynamic as val -- Native format -``` - -**Expected:** Properly decoded array/tuple values -**Actual:** Empty arrays, garbled tuple values (raw bytes instead of proper types) - ---- - -### 7. Deeply nested arrays in Dynamic fail - Both decoders - -**Severity:** Medium -**Location:** Both decoders - -**Description:** -Deeply nested arrays (3+ levels) inside Dynamic columns fail to decode properly. RowBinary shows incorrect nesting, Native crashes with undefined children. - -**Query to reproduce:** -```sql -SELECT [[[1]]]::Array(Array(Array(UInt8)))::Dynamic as val -``` - ---- - -### 8. Named Tuple labels sorted alphabetically - RowBinary - -**Severity:** Medium -**Location:** `rowbinary-decoder.ts` - Dynamic Tuple decoding - -**Description:** -When decoding a Named Tuple inside Dynamic, the field labels appear to be sorted alphabetically rather than in their original definition order. - -**Query to reproduce:** -```sql -SELECT CAST((1, 'x'), 'Tuple(id UInt32, name String)')::Dynamic as val -``` - -**Expected:** First child has label `id`, second has label `name` -**Actual:** Labels are reordered (e.g., `name` before `id` or `type` instead of `id`) - ---- - -### 9. Tuple values misaligned in Dynamic - RowBinary - -**Severity:** High -**Location:** `rowbinary-decoder.ts` - Dynamic Tuple decoding - -**Description:** -When decoding a Tuple inside Dynamic, the values in children nodes don't match the expected tuple elements. The decoder appears to mix type metadata with actual values. - -**Query to reproduce:** -```sql -SELECT (42, 'test')::Tuple(UInt32, String)::Dynamic as val -``` - -**Expected:** `children[0].value = 42`, `children[1].value = 'test'` -**Actual:** `children[0].value = 31` (or other unexpected value) - ---- - -## Summary - -| Issue | Severity | Decoder | Status | -|-------|----------|---------|--------| -| Enum8 negative values | High | Both | **Fix Required** | -| JSON array of objects | High | RowBinary | **Fix Required** | -| JSON array of objects | High | Native | **Fix Required** | -| Array in Dynamic extra nesting | High | RowBinary | **Fix Required** | -| DateTime64 in Dynamic raw timestamp | Medium | RowBinary | **Fix Required** | -| Complex types in Dynamic garbled | High | Native | **Fix Required** | -| Deeply nested arrays in Dynamic | Medium | Both | **Fix Required** | -| Named Tuple labels sorted | Medium | RowBinary | **Fix Required** | -| Tuple values misaligned in Dynamic | High | RowBinary | **Fix Required** | -| Negative zero | Low | Both | Working as intended | -| DateTime timezone | Low | Native | Investigation needed | - ---- - -## Test Files - -**Edge case tests:** -`src/core/decoder/qa-edge-cases.integration.test.ts` - -Run with: -```bash -npm test -- --testNamePattern="QA Edge Case" -``` - -**Dynamic type exhaustive tests:** -`src/core/decoder/dynamic-exhaustive.integration.test.ts` - -Run with: -```bash -npm test -- src/core/decoder/dynamic-exhaustive.integration.test.ts -``` - -All known-failing tests are marked with `it.fails()` to document the decoder bugs while keeping the test suite green. diff --git a/all_types_query.sql b/all_types_query.sql deleted file mode 100644 index 4f4e368..0000000 --- a/all_types_query.sql +++ /dev/null @@ -1,222 +0,0 @@ --- Comprehensive query selecting every ClickHouse type --- Use with: FORMAT RowBinaryWithNamesAndTypes or FORMAT Native --- Required settings are at the bottom - -SELECT - -- =========================================== - -- INTEGERS - -- =========================================== - 42::UInt8 as uint8_val, - 1234::UInt16 as uint16_val, - 123456::UInt32 as uint32_val, - 9223372036854775807::UInt64 as uint64_val, - 170141183460469231731687303715884105727::UInt128 as uint128_val, - toUInt256(12345678901234567890) as uint256_val, - toInt8(-42) as int8_val, - toInt16(-1234) as int16_val, - toInt32(-123456) as int32_val, - toInt64(-9223372036854775807) as int64_val, - toInt128('-123456789012345678901234567890') as int128_val, - toInt256('-12345678901234567890123456789012345678901234567890') as int256_val, - - -- =========================================== - -- FLOATS - -- =========================================== - 3.14::Float32 as float32_val, - 3.141592653589793::Float64 as float64_val, - 1.5::BFloat16 as bfloat16_val, - inf::Float32 as float32_inf, - nan::Float64 as float64_nan, - - -- =========================================== - -- STRINGS - -- =========================================== - 'hello world'::String as string_val, - 'Привет мир 你好 🎉'::String as string_unicode, - 'abc'::FixedString(5) as fixedstring_val, - - -- =========================================== - -- BOOLEAN - -- =========================================== - true::Bool as bool_true, - false::Bool as bool_false, - - -- =========================================== - -- DATE AND TIME - -- =========================================== - '2024-01-15'::Date as date_val, - '1960-06-15'::Date32 as date32_val, - '2024-01-15 12:30:45'::DateTime as datetime_val, - '2024-01-15 12:30:45'::DateTime('UTC') as datetime_tz, - '2024-01-15 12:30:45.123456'::DateTime64(6) as datetime64_val, - '2024-01-15 12:30:45.123'::DateTime64(3, 'America/New_York') as datetime64_tz, - '12:30:45'::Time as time_val, - '12:30:45.123456'::Time64(6) as time64_val, - - -- =========================================== - -- SPECIAL TYPES - -- =========================================== - '550e8400-e29b-41d4-a716-446655440000'::UUID as uuid_val, - '192.168.1.1'::IPv4 as ipv4_val, - '2001:db8::1'::IPv6 as ipv6_val, - - -- =========================================== - -- DECIMALS - -- =========================================== - 123.45::Decimal32(2) as decimal32_val, - 12345.6789::Decimal64(4) as decimal64_val, - 123456789.123456789::Decimal128(9) as decimal128_val, - 0::Decimal256(20) as decimal256_val, - - -- =========================================== - -- ENUMS - -- =========================================== - 'hello'::Enum8('hello' = 1, 'world' = 2) as enum8_val, - 'world'::Enum16('hello' = 1, 'world' = 1000) as enum16_val, - - -- =========================================== - -- NULLABLE - -- =========================================== - 42::Nullable(UInt32) as nullable_int, - NULL::Nullable(String) as nullable_null, - 'test'::Nullable(String) as nullable_string, - - -- =========================================== - -- ARRAYS - -- =========================================== - []::Array(UInt8) as array_empty, - [1, 2, 3]::Array(UInt32) as array_int, - ['a', 'b', 'c']::Array(String) as array_string, - [[1, 2], [3, 4, 5]]::Array(Array(UInt32)) as array_nested, - [[[1, 2], [3]], [[4, 5, 6]]]::Array(Array(Array(UInt8))) as array_3d, - [1, NULL, 3]::Array(Nullable(UInt32)) as array_nullable, - - -- =========================================== - -- TUPLES - -- =========================================== - (42, 'hello')::Tuple(UInt32, String) as tuple_simple, - CAST((1, 'test', 3.14), 'Tuple(id UInt32, name String, value Float64)') as tuple_named, - ((1, 2), ('a', 'b'))::Tuple(Tuple(UInt8, UInt8), Tuple(String, String)) as tuple_nested, - (1, [2, 3], ('inner', 4.5))::Tuple(UInt8, Array(UInt8), Tuple(String, Float32)) as tuple_complex, - - -- =========================================== - -- MAPS - -- =========================================== - map()::Map(String, UInt32) as map_empty, - map('a', 1, 'b', 2, 'c', 3)::Map(String, UInt32) as map_string_int, - map(1, 'one', 2, 'two')::Map(UInt32, String) as map_int_string, - map('nested', map('inner', 42))::Map(String, Map(String, UInt32)) as map_nested, - - -- =========================================== - -- LOWCARDINALITY - -- =========================================== - 'hello'::LowCardinality(String) as lowcard_string, - NULL::LowCardinality(Nullable(String)) as lowcard_nullable, - - -- =========================================== - -- VARIANT - -- =========================================== - 'hello'::Variant(String, UInt64) as variant_string, - 42::Variant(String, UInt64) as variant_int, - NULL::Variant(String, UInt64) as variant_null, - - -- =========================================== - -- DYNAMIC - -- =========================================== - 42::Dynamic as dynamic_int, - 'hello'::Dynamic as dynamic_string, - NULL::Dynamic as dynamic_null, - toDateTime64('2024-01-15 12:30:00', 3)::Dynamic as dynamic_datetime, - - -- =========================================== - -- JSON - -- =========================================== - '{"name": "test", "value": 42}'::JSON as json_simple, - '{"id": 1, "nested": {"x": 10, "y": 20}}'::JSON as json_nested, - '{"id": 1, "nested": {"x": 10, "y": 20}}'::JSON(a Int32) as json_partially_typed, - '{"id": 1 }'::JSON(a Int32) as json_fully_typed, - '{"ip": "127.0.0.1"}'::JSON(ip IPv4) as json_ipv4, - - -- =========================================== - -- GEO TYPES - -- =========================================== - (1.5, 2.5)::Point as point_val, - [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 0.0)]::Ring as ring_val, - [[(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0), (0.0, 0.0)]]::Polygon as polygon_val, - [[[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]]]::MultiPolygon as multipolygon_val, - [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]::LineString as linestring_val, - [[(0.0, 0.0), (1.0, 1.0)], [(2.0, 2.0), (3.0, 3.0)]]::MultiLineString as multilinestring_val, - ((1.0, 2.0)::Point)::Geometry as geometry_point, - ([(0.0, 0.0), (1.0, 1.0)]::LineString)::Geometry as geometry_linestring, - - -- =========================================== - -- QBIT (Quantized Bit Vector) - -- =========================================== - [1.0, 2.0, 3.0]::QBit(Float32, 3) as qbit_float32, - - -- =========================================== - -- COMPLEX NESTED COMBINATIONS - -- =========================================== - -- Array of Tuples - [(1, 'a'), (2, 'b'), (3, 'c')]::Array(Tuple(UInt8, String)) as array_of_tuples, - - -- Array of Maps - [map('x', 1), map('y', 2, 'z', 3)]::Array(Map(String, UInt32)) as array_of_maps, - - -- Map of Arrays - map('nums', [1, 2, 3], 'more', [4, 5])::Map(String, Array(UInt32)) as map_of_arrays, - - -- Tuple with Array and Map - ( - 42::UInt32, - [1, 2, 3]::Array(UInt8), - map('key', 'value')::Map(String, String) - )::Tuple(id UInt32, numbers Array(UInt8), metadata Map(String, String)) as tuple_with_collections, - - -- Deeply nested Array (4D) - [[[[1, 2], [3, 4]], [[5, 6]]]]::Array(Array(Array(Array(UInt8)))) as array_4d, - - -- Map with Tuple values - map('point1', (1.0, 2.0), 'point2', (3.0, 4.0))::Map(String, Tuple(Float64, Float64)) as map_of_tuples, - - -- Tuple containing Nullable Array - (42, [1, NULL, 3]::Array(Nullable(UInt32)))::Tuple(UInt32, Array(Nullable(UInt32))) as tuple_nullable_array, - - -- Map of Maps - map('level1', map('level2a', 1, 'level2b', 2))::Map(String, Map(String, UInt32)) as map_of_maps, - - -- Complex real-world-like structure (user record) - ( - 'user_123'::String, - ['admin', 'editor', 'viewer']::Array(String), - map( - 'profile', (30, 'John Doe', true)::Tuple(age UInt8, name String, active Bool), - 'settings', (25, 'Jane Smith', false)::Tuple(age UInt8, name String, active Bool) - )::Map(String, Tuple(age UInt8, name String, active Bool)), - [(1.0, 2.0), (3.0, 4.0)]::Array(Point) - )::Tuple( - user_id String, - roles Array(String), - profiles Map(String, Tuple(age UInt8, name String, active Bool)), - locations Array(Point) - ) as complex_user_record, - - -- Tuple with Variant - (42, 'hello'::Variant(String, UInt64))::Tuple(UInt32, Variant(String, UInt64)) as tuple_with_variant, - - -- Array of Geo Points with labels - [((0.0, 0.0), 'origin'), ((1.0, 1.0), 'point_a')]::Array(Tuple(Point, String)) as labeled_points, - - -- Nested Maps with Arrays - map( - 'group1', map('items', [1, 2, 3]::Array(UInt32)), - 'group2', map('items', [4, 5, 6]::Array(UInt32)) - )::Map(String, Map(String, Array(UInt32))) as deeply_nested_map - -SETTINGS - allow_experimental_variant_type = 1, - allow_experimental_dynamic_type = 1, - allow_experimental_json_type = 1, - allow_suspicious_variant_types = 1, - allow_experimental_qbit_type = 1, - allow_suspicious_low_cardinality_types = 1 diff --git a/jsonspec.md b/docs/jsonspec.md similarity index 98% rename from jsonspec.md rename to docs/jsonspec.md index 4ba4d38..032456b 100644 --- a/jsonspec.md +++ b/docs/jsonspec.md @@ -1,643 +1,643 @@ -### 6.9 JSON - -The `JSON` type stores semi-structured JSON data with automatic schema inference. It uses an object-oriented storage model where JSON paths are stored as separate columns, similar to how `Nested` columns work. - -**Type Definition:** -```sql -JSON -JSON(max_dynamic_paths=N) -JSON(a Int64, b String) -- with typed paths -``` - -**Storage Model:** - -`JSON` columns store data using three categories of paths: - -1. **Typed Paths**: Paths with known, fixed types (declared in type definition) -2. **Dynamic Paths**: Frequently occurring paths discovered at runtime (up to `max_dynamic_paths`) -3. **Shared Data**: Remaining paths stored in a shared `Dynamic` column - -**Stream Architecture:** - -Similar to `Dynamic`, `JSON` uses multiple streams: - -1. **ObjectStructure** stream: - - Serialization version (UInt64) - - Structure metadata (list of dynamic paths, statistics, etc.) - -2. **ObjectData** stream: - - **ObjectTypedPath** substreams: One per typed path - - **ObjectDynamicPath** substreams: One per dynamic path - - **ObjectSharedData** substream: For overflow paths - -**Serialization Versions:** - -**V1 (version = 0):** -- `` (VarUInt) -- `` (VarUInt) -- `` (String for each) -- Optional statistics - -**V2 (version = 2):** -- Same as V1 without `max_dynamic_paths` - -**V3 (version = 4):** -- Like V2 with binary type encoding and optional statistics flag - -> **Note:** JSON Object version numbers are: V1=0, STRING=1, V2=2, FLATTENED=3, V3=4. - -> **Important:** `Dynamic` columns (used for dynamic paths) have **different** version numbers: V1=1, V2=2, FLATTENED=3, V3=4. Don't confuse JSON V1 (=0) with Dynamic V1 (=1). - -**STRING (version = 1):** -- Special mode for Native format only -- JSON stored as plain String (JSON text) -- Controlled by `output_format_native_write_json_as_string=1` - -**FLATTENED (version = 3):** -- Native format only, for easier client parsing -- All paths (typed + dynamic + shared) serialized as separate columns - -**Example: JSON with STRING serialization** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1&output_format_native_write_json_as_string=1" \ - --data-binary "SELECT '{\"a\": 1}'::JSON AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 044a 534f 4e01 0000 0000 ...col.JSON..... -00000010: 0000 0007 7b22 6122 3a31 7d ....{"a":1} -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column 0: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x04, 0x4a, 0x53, 0x4f, 0x4e, // Column type = "JSON" - - // ObjectStructure stream: - 0x01, 0x00, 0x00, 0x00, // Version = 1 (STRING) - 0x00, 0x00, 0x00, 0x00, - - // ObjectData stream (String column): - 0x07, // String length = 7 - 0x7b, 0x22, 0x61, 0x22, 0x3a, // '{"a":1}' - 0x31, 0x7d, -]); -``` - -**Example: JSON with standard serialization** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ - --data-binary "SELECT '{\"a\": 1, \"b\": \"hello\"}'::JSON AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 044a 534f 4e00 0000 0000 ...col.JSON..... -00000010: 0000 0002 0201 6101 6201 0000 0000 0000 ......a.b....... -00000020: 0001 0105 496e 7436 3400 0000 0000 0000 ....Int64....... -00000030: 0001 0000 0000 0000 0001 0106 5374 7269 ............Stri -00000040: 6e67 0000 0000 0000 0000 0001 0000 0000 ng.............. -00000050: 0000 0001 0568 656c 6c6f 0000 0000 0000 .....hello...... -00000060: 0000 .. -``` - -Breakdown (simplified structure): -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column metadata... - 0x03, 0x63, 0x6f, 0x6c, // "col" - 0x04, 0x4a, 0x53, 0x4f, 0x4e, // "JSON" - - // ObjectStructure stream: - 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1) - 0x00, 0x00, 0x00, 0x00, - - 0x02, // num_dynamic_paths = 2 (VarUInt) - 0x02, // Actual count = 2 - - // Sorted dynamic paths: - 0x01, 0x61, // Path 0: "a" (length=1) - 0x01, 0x62, // Path 1: "b" (length=1) - - // Shared data serialization version: - 0x01, 0x00, 0x00, 0x00, // Version info - 0x00, 0x00, 0x00, 0x00, - - // ObjectData stream: - // Path "a" stored as Dynamic with inferred type Int64: - 0x01, // num_types = 1 - 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // "Int64" - // ... Variant discriminators ... - 0x00, 0x00, 0x00, 0x00, // Mode - 0x00, 0x00, 0x00, 0x00, - 0x01, // Discriminator - // ... Value ... - 0x01, 0x00, 0x00, 0x00, // 1 (as Int64) - 0x00, 0x00, 0x00, 0x00, - - // Path "b" stored as Dynamic with type String: - 0x01, // num_types = 1 - 0x06, 0x53, 0x74, 0x72, 0x69, // "String" - 0x6e, 0x67, - // ... Variant discriminators ... - 0x00, 0x00, 0x00, 0x00, // Mode - 0x00, 0x00, 0x00, 0x00, - 0x01, // Discriminator - // ... Value ... - 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f, // "hello" -]); -``` - -Note: Each dynamic path is stored as a `Dynamic` column, so it includes the full `Dynamic` serialization (structure + variant data). - -**Example: JSON with typed and dynamic paths** - -This example demonstrates a `JSON` column with both a **typed path** (declared type) and a **dynamic path** (inferred type). Typed paths are serialized directly using their native format, while dynamic paths are wrapped in `Dynamic` serialization. - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ - --data-binary "SELECT '{\"a\": 42, \"b\": \"hi\"}'::JSON(a UInt32) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0e4a 534f 4e28 6120 5549 ...col.JSON(a UI -00000010: 6e74 3332 2900 0000 0000 0000 0001 0101 nt32)........... -00000020: 6201 0000 0000 0000 0001 0106 5374 7269 b...........Stri -00000030: 6e67 0000 0000 0000 0000 2a00 0000 0102 ng........*..... -00000040: 6869 0000 0000 0000 0000 hi........ -``` - -Full breakdown: -```typescript -const block = new Uint8Array([ - // ═══════════════════════════════════════════════════════════════════ - // BLOCK HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x01, // NumColumns = 1 (VarUInt) - 0x01, // NumRows = 1 (VarUInt) - - // ═══════════════════════════════════════════════════════════════════ - // COLUMN HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (len=3) - 0x0e, // Type name length = 14 - 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x61, // "JSON(a UInt32)" - 0x20, 0x55, 0x49, 0x6e, 0x74, 0x33, // - typed path "a" with type UInt32 - 0x32, 0x29, // - typed paths are part of the type name - - // ═══════════════════════════════════════════════════════════════════ - // OBJECT STRUCTURE STREAM (prefix for JSON column) - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1 serialization) - 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) - - // V1 structure metadata: - 0x01, // max_dynamic_paths = 1 (VarUInt) - 0x01, // num_dynamic_paths = 1 (VarUInt) - - // Sorted list of dynamic path names: - 0x01, 0x62, // Path 0: "b" (len=1, then "b") - // - typed path "a" is NOT listed here - // - only runtime-discovered paths appear - - // NOTE: In Native format, statistics mode defaults to NONE, so no - // statistics are written here. The next bytes are the Dynamic prefix. - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC STRUCTURE STREAM (prefix for dynamic path "b") - // ═══════════════════════════════════════════════════════════════════ - 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) - 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) - // NOTE: Dynamic V1=1, V2=2, FLATTENED=3, V3=4 - - // V1 Dynamic structure: - 0x01, // max_dynamic_types = 1 (VarUInt) - 0x01, // num_dynamic_types = 1 (VarUInt) - - // Type names (not SharedVariant, which is implicit): - 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (len=6) - 0x6e, 0x67, - - // NOTE: Dynamic statistics also use NONE mode in Native format, - // so no statistics written here either. - - // ═══════════════════════════════════════════════════════════════════ - // VARIANT STRUCTURE STREAM (prefix for Dynamic's Variant) - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) - 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) - - // ═══════════════════════════════════════════════════════════════════ - // OBJECT DATA STREAM - // ═══════════════════════════════════════════════════════════════════ - - // --- TYPED PATH "a" (UInt32) --- - // Serialized directly using UInt32's native format (no Dynamic wrapper!) - 0x2a, 0x00, 0x00, 0x00, // value = 42 (UInt32 little-endian) - - // --- DYNAMIC PATH "b" (as Dynamic -> Variant) --- - // Variant discriminator column (1 row): - 0x01, // Row 0: discriminator = 1 (String variant) - // Variants sorted alphabetically: SharedVariant=0, String=1 - // (255 = NULL) - - // String variant data: - 0x02, 0x68, 0x69, // value = "hi" (len=2, then "hi") - - // ═══════════════════════════════════════════════════════════════════ - // SHARED DATA STREAM (overflow paths, empty in this case) - // ═══════════════════════════════════════════════════════════════════ - // Shared data is Array(Tuple(path: String, value: String)) - // For 1 row with 0 shared paths: - 0x00, 0x00, 0x00, 0x00, // Array offsets: [0] (UInt64) - 0x00, 0x00, 0x00, 0x00, // - row 0 has 0 elements -]); -``` - -**Key observations:** - -1. **Typed path "a"** (UInt32): - - Declared in type definition: `JSON(a UInt32)` - - Type name includes the typed path specification - - NOT listed in the dynamic paths list - - Serialized directly as UInt32 (4 bytes, little-endian): `2a 00 00 00` = 42 - - No `Dynamic` wrapper overhead - -2. **Dynamic path "b"** (inferred as String): - - Discovered at runtime from the JSON data - - Listed in the dynamic paths list in ObjectStructure - - Serialized as a full `Dynamic` column: - - DynamicStructure: version, types list, statistics - - VariantStructure: mode - - VariantData: discriminators + type-specific data - -3. **Serialization order in ObjectData**: - - Typed paths first (sorted alphabetically by path name) - - Dynamic paths second (sorted alphabetically by path name) - - Shared data last - -4. **Space efficiency**: Typed paths save significant space by avoiding the `Dynamic` overhead. For path "a", we use 4 bytes (UInt32) instead of ~20+ bytes (Dynamic structure + Variant + value). - -**Example: JSON with typed and dynamic paths (multiple rows)** - -This example shows the same structure with 2 rows, demonstrating how column data is laid out contiguously for each path. - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ - --data-binary "SELECT ('{\"a\": ' || toString(number * 10) || ', \"b\": \"row' || toString(number) || '\"}')::JSON(a UInt32) AS col FROM system.numbers LIMIT 2" | xxd -``` - -This produces rows: `{"a": 0, "b": "row0"}` and `{"a": 10, "b": "row1"}`. - -``` -00000000: 0102 0363 6f6c 0e4a 534f 4e28 6120 5549 ...col.JSON(a UI -00000010: 6e74 3332 2900 0000 0000 0000 0001 0101 nt32)........... -00000020: 6201 0000 0000 0000 0001 0106 5374 7269 b...........Stri -00000030: 6e67 0000 0000 0000 0000 0000 0000 0a00 ng.............. -00000040: 0000 0101 0472 6f77 3004 726f 7731 0000 .....row0.row1.. -00000050: 0000 0000 0000 0000 0000 0000 0000 .............. -``` - -Full breakdown: -```typescript -const block = new Uint8Array([ - // ═══════════════════════════════════════════════════════════════════ - // BLOCK HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x01, // NumColumns = 1 (VarUInt) - 0x02, // NumRows = 2 (VarUInt) <-- 2 rows! - - // ═══════════════════════════════════════════════════════════════════ - // COLUMN HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (len=3) - 0x0e, // Type name length = 14 - 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x61, // "JSON(a UInt32)" - 0x20, 0x55, 0x49, 0x6e, 0x74, 0x33, - 0x32, 0x29, - - // ═══════════════════════════════════════════════════════════════════ - // OBJECT STRUCTURE STREAM - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1 serialization) - 0x00, 0x00, 0x00, 0x00, - - 0x01, // max_dynamic_paths = 1 (VarUInt) - 0x01, // num_dynamic_paths = 1 (VarUInt) - 0x01, 0x62, // Path 0: "b" (len=1) - - // NOTE: No statistics in Native format (NONE mode is default) - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC STRUCTURE STREAM (for path "b") - // ═══════════════════════════════════════════════════════════════════ - 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) - 0x00, 0x00, 0x00, 0x00, // (Dynamic V1=1, V2=2, V3=4) - - 0x01, // max_dynamic_types = 1 (VarUInt) - 0x01, // num_dynamic_types = 1 (VarUInt) - 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (len=6) - 0x6e, 0x67, - - // NOTE: No Dynamic statistics in Native format (NONE mode) - - // ═══════════════════════════════════════════════════════════════════ - // VARIANT STRUCTURE STREAM - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) - 0x00, 0x00, 0x00, 0x00, - - // ═══════════════════════════════════════════════════════════════════ - // OBJECT DATA STREAM - // ═══════════════════════════════════════════════════════════════════ - - // --- TYPED PATH "a" (UInt32) - ALL ROWS CONTIGUOUS --- - 0x00, 0x00, 0x00, 0x00, // Row 0: a = 0 - 0x0a, 0x00, 0x00, 0x00, // Row 1: a = 10 - - // --- DYNAMIC PATH "b" - Variant discriminators (ALL ROWS) --- - // Variants sorted alphabetically: SharedVariant=0, String=1 (255=NULL) - 0x01, // Row 0: discriminator = 1 (String) - 0x01, // Row 1: discriminator = 1 (String) - - // --- DYNAMIC PATH "b" - String variant data (ALL ROWS) --- - 0x04, 0x72, 0x6f, 0x77, 0x30, // Row 0: "row0" (len=4) - 0x04, 0x72, 0x6f, 0x77, 0x31, // Row 1: "row1" (len=4) - - // ═══════════════════════════════════════════════════════════════════ - // SHARED DATA STREAM (Array offsets for each row) - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Row 0 offset: 0 elements - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, // Row 1 offset: 0 elements - 0x00, 0x00, 0x00, 0x00, -]); -``` - -**Key observations for multi-row data:** - -1. **Columnar layout**: All values for each path are stored contiguously: - - Typed path "a": `[0, 10]` as two consecutive UInt32 values - - Dynamic path "b" discriminators: `[1, 1]` (both rows are String) - - Dynamic path "b" String data: `["row0", "row1"]` stored sequentially - -2. **No per-row overhead**: The structure metadata (version, paths, types) is written once in the prefix, not repeated for each row. Only the actual data values scale with row count. - -3. **Variant discriminators**: For the Dynamic path, discriminators for ALL rows come first, then the actual variant data for all rows. This enables efficient columnar processing. - -4. **Shared data offsets**: One UInt64 offset per row indicating how many shared path entries that row has. With 0 shared paths, all offsets are 0. - -**Example: JSON with exceeded max_dynamic_paths (shared data)** - -This example shows what happens when the number of paths exceeds `max_dynamic_paths`. Overflow paths are stored in the **shared data** section. - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ - --data-binary "SELECT '{\"a\": 1, \"b\": 2, \"c\": 3}'::JSON(max_dynamic_paths=2) AS col" | xxd -``` - -With `max_dynamic_paths=2` and 3 paths in the JSON: -- Paths "a" and "b" become **dynamic paths** (stored as `Dynamic` columns) -- Path "c" overflows to **shared data** - -``` -00000000: 0101 0363 6f6c 194a 534f 4e28 6d61 785f ...col.JSON(max_ -00000010: 6479 6e61 6d69 635f 7061 7468 733d 3229 dynamic_paths=2) -00000020: 0000 0000 0000 0000 0202 0161 0162 0100 ...........a.b.. -00000030: 0000 0000 0000 0101 0549 6e74 3634 0000 .........Int64.. -00000040: 0000 0000 0000 0100 0000 0000 0000 0101 ................ -00000050: 0549 6e74 3634 0000 0000 0000 0000 0001 .Int64.......... -00000060: 0000 0000 0000 0000 0200 0000 0000 0000 ................ -00000070: 0100 0000 0000 0000 0163 090a 0300 0000 .........c...... -00000080: 0000 0000 .... -``` - -Full breakdown: -```typescript -const block = new Uint8Array([ - // ═══════════════════════════════════════════════════════════════════ - // BLOCK HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // ═══════════════════════════════════════════════════════════════════ - // COLUMN HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x19, // Type name length = 25 - // "JSON(max_dynamic_paths=2)" - 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x6d, 0x61, 0x78, 0x5f, - 0x64, 0x79, 0x6e, 0x61, 0x6d, 0x69, 0x63, 0x5f, - 0x70, 0x61, 0x74, 0x68, 0x73, 0x3d, 0x32, 0x29, - - // ═══════════════════════════════════════════════════════════════════ - // OBJECT STRUCTURE STREAM - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1) - 0x00, 0x00, 0x00, 0x00, - - 0x02, // max_dynamic_paths = 2 - 0x02, // num_dynamic_paths = 2 - 0x01, 0x61, // Path 0: "a" - 0x01, 0x62, // Path 1: "b" - // Note: "c" is NOT here - it's in shared data - - // NOTE: No statistics in Native format (NONE mode is default) - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC PATH "a" STRUCTURE - // ═══════════════════════════════════════════════════════════════════ - 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) - 0x00, 0x00, 0x00, 0x00, // (Dynamic V1=1, V2=2, V3=4) - 0x01, // max_dynamic_types = 1 - 0x01, // num_dynamic_types = 1 - 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // Type: "Int64" - // NOTE: No Dynamic stats (NONE mode) - 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) - 0x00, 0x00, 0x00, 0x00, - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC PATH "b" STRUCTURE (similar to "a") - // ═══════════════════════════════════════════════════════════════════ - 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) - 0x00, 0x00, 0x00, 0x00, - 0x01, // max_dynamic_types = 1 - 0x01, // num_dynamic_types = 1 - 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // Type: "Int64" - 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) - 0x00, 0x00, 0x00, 0x00, - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC PATH "a" DATA (Int64 value = 1) - // ═══════════════════════════════════════════════════════════════════ - // Variants sorted alphabetically: Int64=0, SharedVariant=1 (255=NULL) - 0x00, // Discriminator = 0 (Int64 variant) - 0x01, 0x00, 0x00, 0x00, // value = 1 (Int64 little-endian) - 0x00, 0x00, 0x00, 0x00, - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC PATH "b" DATA (Int64 value = 2) - // ═══════════════════════════════════════════════════════════════════ - 0x00, // Discriminator = 0 (Int64 variant) - 0x02, 0x00, 0x00, 0x00, // value = 2 (Int64 little-endian) - 0x00, 0x00, 0x00, 0x00, - - // ═══════════════════════════════════════════════════════════════════ - // SHARED DATA STREAM - // This is where path "c" lives (overflow from max_dynamic_paths) - // Format: Array(Tuple(path: String, value: String)) - // ═══════════════════════════════════════════════════════════════════ - - // Array offsets (one UInt64 per row): - 0x01, 0x00, 0x00, 0x00, // Row 0: offset = 1 (has 1 element) - 0x00, 0x00, 0x00, 0x00, - - // Shared data element 0 (path "c"): - 0x01, 0x63, // Path name: "c" (len=1) - - // Binary-encoded Dynamic value for "c": - 0x09, // Value string length = 9 bytes - 0x0a, // BinaryTypeIndex = 0x0a (Int64) - 0x03, 0x00, 0x00, 0x00, // value = 3 (Int64 little-endian) - 0x00, 0x00, 0x00, 0x00, -]); -``` - -**Key observations for shared data:** - -1. **Overflow mechanism**: When paths exceed `max_dynamic_paths`, extra paths go to shared data. Here "a" and "b" are dynamic paths, "c" overflows. - -2. **Shared data format**: `Array(Tuple(path: String, value: String))` - - Array offsets indicate how many shared paths each row has - - Each element is a (path_name, binary_encoded_value) tuple - -3. **Binary-encoded values in shared data**: The value is stored as a length-prefixed string containing: - - `BinaryTypeIndex` (1 byte): Type identifier (0x0a = Int64) - - Native value: The value in its type's binary format - -4. **BinaryTypeIndex values** (common types): - | Type | Index | - |------|-------| - | Nothing | 0x00 | - | UInt8 | 0x01 | - | UInt32 | 0x03 | - | UInt64 | 0x04 | - | Int64 | 0x0a | - | String | 0x15 | - | Array | 0x1e | - | JSON | 0x30 | - -5. **Space tradeoff**: Shared data is less efficient than dynamic paths because each value includes its path name and type encoding. Use `max_dynamic_paths` wisely for your data. - -**Example: Variant discriminator ordering (multiple types)** - -Variant types are sorted **alphabetically by type name** to determine discriminator values. SharedVariant is included in this sorting. Here's an example showing how discriminators are assigned: - -```sql --- Insert multiple rows with different JSON value types -INSERT INTO test_json VALUES -('{"x": true}'), -- Bool -('{"x": 3.14}'), -- Float64 -('{"x": "hello"}'), -- String -('{"x": [1,2,3]}'), -- Array(Nullable(Int64)) -('{"x": null}') -- NULL -``` - -The Dynamic column for path "x" will have variant types sorted alphabetically: - -| Index | Type | Notes | -|-------|------|-------| -| 0 | Array(Nullable(Int64)) | "A" < "B" < ... | -| 1 | Bool | | -| 2 | Float64 | | -| 3 | SharedVariant | Implicit, always present | -| 4 | String | "Sh..." < "St..." | -| 255 | (NULL) | Special NULL_DISCRIMINATOR | - -The discriminator bytes in the serialized data will be: -``` -0x01, // Bool (index 1) -0x02, // Float64 (index 2) -0x04, // String (index 4, after SharedVariant=3) -0x00, // Array (index 0) -0xff, // NULL (255) -``` - -**Key rule**: For any type T in a Dynamic column, its discriminator = index in the alphabetically sorted list of [all_types + SharedVariant]. - -**Path Naming:** - -JSON paths use dot notation: -- `a` - top-level key "a" -- `a.b` - nested key "b" inside "a" -- `a.b.c` - deeply nested - -ClickHouse can escape dots in JSON keys when `json_type_escape_dots_in_keys=1` is set. - -**Typed Paths:** - -When a `JSON` type has typed paths declared: -```sql -JSON(user_id UInt64, name String) -``` - -These paths: -1. Always have the specified type (no inference needed) -2. Are serialized directly in their native type format (not as `Dynamic`) -3. Appear in `ObjectData/ObjectTypedPath` substreams - -**Dynamic Paths:** - -Dynamic paths are: -1. Discovered at insertion time -2. Limited by `max_dynamic_paths` (default 1024) -3. Stored as `Dynamic` columns with inferred types -4. Appear in `ObjectData/ObjectDynamicPath` substreams - -**Shared Data:** - -When more than `max_dynamic_paths` unique paths exist: -- Overflow paths are stored in a shared `Dynamic` column -- Each value in shared data includes its full path + value -- Uses binary serialization similar to `Dynamic`'s shared variant - -**Format Setting: `output_format_native_write_json_as_string`:** - -When set to `1`: -- Uses STRING serialization (version = 1) -- Entire JSON object serialized as text String -- Easier for clients that don't support complex `JSON` deserialization -- Default: `0` (use structured serialization) - -**Format Setting: `output_format_native_use_flattened_dynamic_and_json_serialization`:** - -When set to `1`: -- Uses FLATTENED serialization (version = 3) -- All paths serialized as individual columns -- No `Dynamic` wrapper, just direct column data -- Easier for some clients to parse -- Default: `0` - -**Reading JSON Columns:** - -To deserialize a `JSON` column in standard mode: - -1. Read `ObjectStructure` to get version and path list -2. For each typed path: deserialize using its known type -3. For each dynamic path: deserialize as `Dynamic` column -4. If shared data exists: deserialize the shared `Dynamic` column -5. Reconstruct JSON objects by merging values from all paths - -In STRING mode: just read the String value containing JSON text. - -In FLATTENED mode: read each path as a separate column, with a NULL indicator for missing paths. +### 6.9 JSON + +The `JSON` type stores semi-structured JSON data with automatic schema inference. It uses an object-oriented storage model where JSON paths are stored as separate columns, similar to how `Nested` columns work. + +**Type Definition:** +```sql +JSON +JSON(max_dynamic_paths=N) +JSON(a Int64, b String) -- with typed paths +``` + +**Storage Model:** + +`JSON` columns store data using three categories of paths: + +1. **Typed Paths**: Paths with known, fixed types (declared in type definition) +2. **Dynamic Paths**: Frequently occurring paths discovered at runtime (up to `max_dynamic_paths`) +3. **Shared Data**: Remaining paths stored in a shared `Dynamic` column + +**Stream Architecture:** + +Similar to `Dynamic`, `JSON` uses multiple streams: + +1. **ObjectStructure** stream: + - Serialization version (UInt64) + - Structure metadata (list of dynamic paths, statistics, etc.) + +2. **ObjectData** stream: + - **ObjectTypedPath** substreams: One per typed path + - **ObjectDynamicPath** substreams: One per dynamic path + - **ObjectSharedData** substream: For overflow paths + +**Serialization Versions:** + +**V1 (version = 0):** +- `` (VarUInt) +- `` (VarUInt) +- `` (String for each) +- Optional statistics + +**V2 (version = 2):** +- Same as V1 without `max_dynamic_paths` + +**V3 (version = 4):** +- Like V2 with binary type encoding and optional statistics flag + +> **Note:** JSON Object version numbers are: V1=0, STRING=1, V2=2, FLATTENED=3, V3=4. + +> **Important:** `Dynamic` columns (used for dynamic paths) have **different** version numbers: V1=1, V2=2, FLATTENED=3, V3=4. Don't confuse JSON V1 (=0) with Dynamic V1 (=1). + +**STRING (version = 1):** +- Special mode for Native format only +- JSON stored as plain String (JSON text) +- Controlled by `output_format_native_write_json_as_string=1` + +**FLATTENED (version = 3):** +- Native format only, for easier client parsing +- All paths (typed + dynamic + shared) serialized as separate columns + +**Example: JSON with STRING serialization** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1&output_format_native_write_json_as_string=1" \ + --data-binary "SELECT '{\"a\": 1}'::JSON AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 044a 534f 4e01 0000 0000 ...col.JSON..... +00000010: 0000 0007 7b22 6122 3a31 7d ....{"a":1} +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column 0: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x04, 0x4a, 0x53, 0x4f, 0x4e, // Column type = "JSON" + + // ObjectStructure stream: + 0x01, 0x00, 0x00, 0x00, // Version = 1 (STRING) + 0x00, 0x00, 0x00, 0x00, + + // ObjectData stream (String column): + 0x07, // String length = 7 + 0x7b, 0x22, 0x61, 0x22, 0x3a, // '{"a":1}' + 0x31, 0x7d, +]); +``` + +**Example: JSON with standard serialization** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ + --data-binary "SELECT '{\"a\": 1, \"b\": \"hello\"}'::JSON AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 044a 534f 4e00 0000 0000 ...col.JSON..... +00000010: 0000 0002 0201 6101 6201 0000 0000 0000 ......a.b....... +00000020: 0001 0105 496e 7436 3400 0000 0000 0000 ....Int64....... +00000030: 0001 0000 0000 0000 0001 0106 5374 7269 ............Stri +00000040: 6e67 0000 0000 0000 0000 0001 0000 0000 ng.............. +00000050: 0000 0001 0568 656c 6c6f 0000 0000 0000 .....hello...... +00000060: 0000 .. +``` + +Breakdown (simplified structure): +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column metadata... + 0x03, 0x63, 0x6f, 0x6c, // "col" + 0x04, 0x4a, 0x53, 0x4f, 0x4e, // "JSON" + + // ObjectStructure stream: + 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1) + 0x00, 0x00, 0x00, 0x00, + + 0x02, // num_dynamic_paths = 2 (VarUInt) + 0x02, // Actual count = 2 + + // Sorted dynamic paths: + 0x01, 0x61, // Path 0: "a" (length=1) + 0x01, 0x62, // Path 1: "b" (length=1) + + // Shared data serialization version: + 0x01, 0x00, 0x00, 0x00, // Version info + 0x00, 0x00, 0x00, 0x00, + + // ObjectData stream: + // Path "a" stored as Dynamic with inferred type Int64: + 0x01, // num_types = 1 + 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // "Int64" + // ... Variant discriminators ... + 0x00, 0x00, 0x00, 0x00, // Mode + 0x00, 0x00, 0x00, 0x00, + 0x01, // Discriminator + // ... Value ... + 0x01, 0x00, 0x00, 0x00, // 1 (as Int64) + 0x00, 0x00, 0x00, 0x00, + + // Path "b" stored as Dynamic with type String: + 0x01, // num_types = 1 + 0x06, 0x53, 0x74, 0x72, 0x69, // "String" + 0x6e, 0x67, + // ... Variant discriminators ... + 0x00, 0x00, 0x00, 0x00, // Mode + 0x00, 0x00, 0x00, 0x00, + 0x01, // Discriminator + // ... Value ... + 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f, // "hello" +]); +``` + +Note: Each dynamic path is stored as a `Dynamic` column, so it includes the full `Dynamic` serialization (structure + variant data). + +**Example: JSON with typed and dynamic paths** + +This example demonstrates a `JSON` column with both a **typed path** (declared type) and a **dynamic path** (inferred type). Typed paths are serialized directly using their native format, while dynamic paths are wrapped in `Dynamic` serialization. + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ + --data-binary "SELECT '{\"a\": 42, \"b\": \"hi\"}'::JSON(a UInt32) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0e4a 534f 4e28 6120 5549 ...col.JSON(a UI +00000010: 6e74 3332 2900 0000 0000 0000 0001 0101 nt32)........... +00000020: 6201 0000 0000 0000 0001 0106 5374 7269 b...........Stri +00000030: 6e67 0000 0000 0000 0000 2a00 0000 0102 ng........*..... +00000040: 6869 0000 0000 0000 0000 hi........ +``` + +Full breakdown: +```typescript +const block = new Uint8Array([ + // ═══════════════════════════════════════════════════════════════════ + // BLOCK HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x01, // NumColumns = 1 (VarUInt) + 0x01, // NumRows = 1 (VarUInt) + + // ═══════════════════════════════════════════════════════════════════ + // COLUMN HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (len=3) + 0x0e, // Type name length = 14 + 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x61, // "JSON(a UInt32)" + 0x20, 0x55, 0x49, 0x6e, 0x74, 0x33, // - typed path "a" with type UInt32 + 0x32, 0x29, // - typed paths are part of the type name + + // ═══════════════════════════════════════════════════════════════════ + // OBJECT STRUCTURE STREAM (prefix for JSON column) + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1 serialization) + 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) + + // V1 structure metadata: + 0x01, // max_dynamic_paths = 1 (VarUInt) + 0x01, // num_dynamic_paths = 1 (VarUInt) + + // Sorted list of dynamic path names: + 0x01, 0x62, // Path 0: "b" (len=1, then "b") + // - typed path "a" is NOT listed here + // - only runtime-discovered paths appear + + // NOTE: In Native format, statistics mode defaults to NONE, so no + // statistics are written here. The next bytes are the Dynamic prefix. + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC STRUCTURE STREAM (prefix for dynamic path "b") + // ═══════════════════════════════════════════════════════════════════ + 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) + 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) + // NOTE: Dynamic V1=1, V2=2, FLATTENED=3, V3=4 + + // V1 Dynamic structure: + 0x01, // max_dynamic_types = 1 (VarUInt) + 0x01, // num_dynamic_types = 1 (VarUInt) + + // Type names (not SharedVariant, which is implicit): + 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (len=6) + 0x6e, 0x67, + + // NOTE: Dynamic statistics also use NONE mode in Native format, + // so no statistics written here either. + + // ═══════════════════════════════════════════════════════════════════ + // VARIANT STRUCTURE STREAM (prefix for Dynamic's Variant) + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) + 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) + + // ═══════════════════════════════════════════════════════════════════ + // OBJECT DATA STREAM + // ═══════════════════════════════════════════════════════════════════ + + // --- TYPED PATH "a" (UInt32) --- + // Serialized directly using UInt32's native format (no Dynamic wrapper!) + 0x2a, 0x00, 0x00, 0x00, // value = 42 (UInt32 little-endian) + + // --- DYNAMIC PATH "b" (as Dynamic -> Variant) --- + // Variant discriminator column (1 row): + 0x01, // Row 0: discriminator = 1 (String variant) + // Variants sorted alphabetically: SharedVariant=0, String=1 + // (255 = NULL) + + // String variant data: + 0x02, 0x68, 0x69, // value = "hi" (len=2, then "hi") + + // ═══════════════════════════════════════════════════════════════════ + // SHARED DATA STREAM (overflow paths, empty in this case) + // ═══════════════════════════════════════════════════════════════════ + // Shared data is Array(Tuple(path: String, value: String)) + // For 1 row with 0 shared paths: + 0x00, 0x00, 0x00, 0x00, // Array offsets: [0] (UInt64) + 0x00, 0x00, 0x00, 0x00, // - row 0 has 0 elements +]); +``` + +**Key observations:** + +1. **Typed path "a"** (UInt32): + - Declared in type definition: `JSON(a UInt32)` + - Type name includes the typed path specification + - NOT listed in the dynamic paths list + - Serialized directly as UInt32 (4 bytes, little-endian): `2a 00 00 00` = 42 + - No `Dynamic` wrapper overhead + +2. **Dynamic path "b"** (inferred as String): + - Discovered at runtime from the JSON data + - Listed in the dynamic paths list in ObjectStructure + - Serialized as a full `Dynamic` column: + - DynamicStructure: version, types list, statistics + - VariantStructure: mode + - VariantData: discriminators + type-specific data + +3. **Serialization order in ObjectData**: + - Typed paths first (sorted alphabetically by path name) + - Dynamic paths second (sorted alphabetically by path name) + - Shared data last + +4. **Space efficiency**: Typed paths save significant space by avoiding the `Dynamic` overhead. For path "a", we use 4 bytes (UInt32) instead of ~20+ bytes (Dynamic structure + Variant + value). + +**Example: JSON with typed and dynamic paths (multiple rows)** + +This example shows the same structure with 2 rows, demonstrating how column data is laid out contiguously for each path. + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ + --data-binary "SELECT ('{\"a\": ' || toString(number * 10) || ', \"b\": \"row' || toString(number) || '\"}')::JSON(a UInt32) AS col FROM system.numbers LIMIT 2" | xxd +``` + +This produces rows: `{"a": 0, "b": "row0"}` and `{"a": 10, "b": "row1"}`. + +``` +00000000: 0102 0363 6f6c 0e4a 534f 4e28 6120 5549 ...col.JSON(a UI +00000010: 6e74 3332 2900 0000 0000 0000 0001 0101 nt32)........... +00000020: 6201 0000 0000 0000 0001 0106 5374 7269 b...........Stri +00000030: 6e67 0000 0000 0000 0000 0000 0000 0a00 ng.............. +00000040: 0000 0101 0472 6f77 3004 726f 7731 0000 .....row0.row1.. +00000050: 0000 0000 0000 0000 0000 0000 0000 .............. +``` + +Full breakdown: +```typescript +const block = new Uint8Array([ + // ═══════════════════════════════════════════════════════════════════ + // BLOCK HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x01, // NumColumns = 1 (VarUInt) + 0x02, // NumRows = 2 (VarUInt) <-- 2 rows! + + // ═══════════════════════════════════════════════════════════════════ + // COLUMN HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (len=3) + 0x0e, // Type name length = 14 + 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x61, // "JSON(a UInt32)" + 0x20, 0x55, 0x49, 0x6e, 0x74, 0x33, + 0x32, 0x29, + + // ═══════════════════════════════════════════════════════════════════ + // OBJECT STRUCTURE STREAM + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1 serialization) + 0x00, 0x00, 0x00, 0x00, + + 0x01, // max_dynamic_paths = 1 (VarUInt) + 0x01, // num_dynamic_paths = 1 (VarUInt) + 0x01, 0x62, // Path 0: "b" (len=1) + + // NOTE: No statistics in Native format (NONE mode is default) + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC STRUCTURE STREAM (for path "b") + // ═══════════════════════════════════════════════════════════════════ + 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) + 0x00, 0x00, 0x00, 0x00, // (Dynamic V1=1, V2=2, V3=4) + + 0x01, // max_dynamic_types = 1 (VarUInt) + 0x01, // num_dynamic_types = 1 (VarUInt) + 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (len=6) + 0x6e, 0x67, + + // NOTE: No Dynamic statistics in Native format (NONE mode) + + // ═══════════════════════════════════════════════════════════════════ + // VARIANT STRUCTURE STREAM + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) + 0x00, 0x00, 0x00, 0x00, + + // ═══════════════════════════════════════════════════════════════════ + // OBJECT DATA STREAM + // ═══════════════════════════════════════════════════════════════════ + + // --- TYPED PATH "a" (UInt32) - ALL ROWS CONTIGUOUS --- + 0x00, 0x00, 0x00, 0x00, // Row 0: a = 0 + 0x0a, 0x00, 0x00, 0x00, // Row 1: a = 10 + + // --- DYNAMIC PATH "b" - Variant discriminators (ALL ROWS) --- + // Variants sorted alphabetically: SharedVariant=0, String=1 (255=NULL) + 0x01, // Row 0: discriminator = 1 (String) + 0x01, // Row 1: discriminator = 1 (String) + + // --- DYNAMIC PATH "b" - String variant data (ALL ROWS) --- + 0x04, 0x72, 0x6f, 0x77, 0x30, // Row 0: "row0" (len=4) + 0x04, 0x72, 0x6f, 0x77, 0x31, // Row 1: "row1" (len=4) + + // ═══════════════════════════════════════════════════════════════════ + // SHARED DATA STREAM (Array offsets for each row) + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Row 0 offset: 0 elements + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, // Row 1 offset: 0 elements + 0x00, 0x00, 0x00, 0x00, +]); +``` + +**Key observations for multi-row data:** + +1. **Columnar layout**: All values for each path are stored contiguously: + - Typed path "a": `[0, 10]` as two consecutive UInt32 values + - Dynamic path "b" discriminators: `[1, 1]` (both rows are String) + - Dynamic path "b" String data: `["row0", "row1"]` stored sequentially + +2. **No per-row overhead**: The structure metadata (version, paths, types) is written once in the prefix, not repeated for each row. Only the actual data values scale with row count. + +3. **Variant discriminators**: For the Dynamic path, discriminators for ALL rows come first, then the actual variant data for all rows. This enables efficient columnar processing. + +4. **Shared data offsets**: One UInt64 offset per row indicating how many shared path entries that row has. With 0 shared paths, all offsets are 0. + +**Example: JSON with exceeded max_dynamic_paths (shared data)** + +This example shows what happens when the number of paths exceeds `max_dynamic_paths`. Overflow paths are stored in the **shared data** section. + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ + --data-binary "SELECT '{\"a\": 1, \"b\": 2, \"c\": 3}'::JSON(max_dynamic_paths=2) AS col" | xxd +``` + +With `max_dynamic_paths=2` and 3 paths in the JSON: +- Paths "a" and "b" become **dynamic paths** (stored as `Dynamic` columns) +- Path "c" overflows to **shared data** + +``` +00000000: 0101 0363 6f6c 194a 534f 4e28 6d61 785f ...col.JSON(max_ +00000010: 6479 6e61 6d69 635f 7061 7468 733d 3229 dynamic_paths=2) +00000020: 0000 0000 0000 0000 0202 0161 0162 0100 ...........a.b.. +00000030: 0000 0000 0000 0101 0549 6e74 3634 0000 .........Int64.. +00000040: 0000 0000 0000 0100 0000 0000 0000 0101 ................ +00000050: 0549 6e74 3634 0000 0000 0000 0000 0001 .Int64.......... +00000060: 0000 0000 0000 0000 0200 0000 0000 0000 ................ +00000070: 0100 0000 0000 0000 0163 090a 0300 0000 .........c...... +00000080: 0000 0000 .... +``` + +Full breakdown: +```typescript +const block = new Uint8Array([ + // ═══════════════════════════════════════════════════════════════════ + // BLOCK HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // ═══════════════════════════════════════════════════════════════════ + // COLUMN HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x19, // Type name length = 25 + // "JSON(max_dynamic_paths=2)" + 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x6d, 0x61, 0x78, 0x5f, + 0x64, 0x79, 0x6e, 0x61, 0x6d, 0x69, 0x63, 0x5f, + 0x70, 0x61, 0x74, 0x68, 0x73, 0x3d, 0x32, 0x29, + + // ═══════════════════════════════════════════════════════════════════ + // OBJECT STRUCTURE STREAM + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1) + 0x00, 0x00, 0x00, 0x00, + + 0x02, // max_dynamic_paths = 2 + 0x02, // num_dynamic_paths = 2 + 0x01, 0x61, // Path 0: "a" + 0x01, 0x62, // Path 1: "b" + // Note: "c" is NOT here - it's in shared data + + // NOTE: No statistics in Native format (NONE mode is default) + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC PATH "a" STRUCTURE + // ═══════════════════════════════════════════════════════════════════ + 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) + 0x00, 0x00, 0x00, 0x00, // (Dynamic V1=1, V2=2, V3=4) + 0x01, // max_dynamic_types = 1 + 0x01, // num_dynamic_types = 1 + 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // Type: "Int64" + // NOTE: No Dynamic stats (NONE mode) + 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) + 0x00, 0x00, 0x00, 0x00, + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC PATH "b" STRUCTURE (similar to "a") + // ═══════════════════════════════════════════════════════════════════ + 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) + 0x00, 0x00, 0x00, 0x00, + 0x01, // max_dynamic_types = 1 + 0x01, // num_dynamic_types = 1 + 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // Type: "Int64" + 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) + 0x00, 0x00, 0x00, 0x00, + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC PATH "a" DATA (Int64 value = 1) + // ═══════════════════════════════════════════════════════════════════ + // Variants sorted alphabetically: Int64=0, SharedVariant=1 (255=NULL) + 0x00, // Discriminator = 0 (Int64 variant) + 0x01, 0x00, 0x00, 0x00, // value = 1 (Int64 little-endian) + 0x00, 0x00, 0x00, 0x00, + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC PATH "b" DATA (Int64 value = 2) + // ═══════════════════════════════════════════════════════════════════ + 0x00, // Discriminator = 0 (Int64 variant) + 0x02, 0x00, 0x00, 0x00, // value = 2 (Int64 little-endian) + 0x00, 0x00, 0x00, 0x00, + + // ═══════════════════════════════════════════════════════════════════ + // SHARED DATA STREAM + // This is where path "c" lives (overflow from max_dynamic_paths) + // Format: Array(Tuple(path: String, value: String)) + // ═══════════════════════════════════════════════════════════════════ + + // Array offsets (one UInt64 per row): + 0x01, 0x00, 0x00, 0x00, // Row 0: offset = 1 (has 1 element) + 0x00, 0x00, 0x00, 0x00, + + // Shared data element 0 (path "c"): + 0x01, 0x63, // Path name: "c" (len=1) + + // Binary-encoded Dynamic value for "c": + 0x09, // Value string length = 9 bytes + 0x0a, // BinaryTypeIndex = 0x0a (Int64) + 0x03, 0x00, 0x00, 0x00, // value = 3 (Int64 little-endian) + 0x00, 0x00, 0x00, 0x00, +]); +``` + +**Key observations for shared data:** + +1. **Overflow mechanism**: When paths exceed `max_dynamic_paths`, extra paths go to shared data. Here "a" and "b" are dynamic paths, "c" overflows. + +2. **Shared data format**: `Array(Tuple(path: String, value: String))` + - Array offsets indicate how many shared paths each row has + - Each element is a (path_name, binary_encoded_value) tuple + +3. **Binary-encoded values in shared data**: The value is stored as a length-prefixed string containing: + - `BinaryTypeIndex` (1 byte): Type identifier (0x0a = Int64) + - Native value: The value in its type's binary format + +4. **BinaryTypeIndex values** (common types): + | Type | Index | + |------|-------| + | Nothing | 0x00 | + | UInt8 | 0x01 | + | UInt32 | 0x03 | + | UInt64 | 0x04 | + | Int64 | 0x0a | + | String | 0x15 | + | Array | 0x1e | + | JSON | 0x30 | + +5. **Space tradeoff**: Shared data is less efficient than dynamic paths because each value includes its path name and type encoding. Use `max_dynamic_paths` wisely for your data. + +**Example: Variant discriminator ordering (multiple types)** + +Variant types are sorted **alphabetically by type name** to determine discriminator values. SharedVariant is included in this sorting. Here's an example showing how discriminators are assigned: + +```sql +-- Insert multiple rows with different JSON value types +INSERT INTO test_json VALUES +('{"x": true}'), -- Bool +('{"x": 3.14}'), -- Float64 +('{"x": "hello"}'), -- String +('{"x": [1,2,3]}'), -- Array(Nullable(Int64)) +('{"x": null}') -- NULL +``` + +The Dynamic column for path "x" will have variant types sorted alphabetically: + +| Index | Type | Notes | +|-------|------|-------| +| 0 | Array(Nullable(Int64)) | "A" < "B" < ... | +| 1 | Bool | | +| 2 | Float64 | | +| 3 | SharedVariant | Implicit, always present | +| 4 | String | "Sh..." < "St..." | +| 255 | (NULL) | Special NULL_DISCRIMINATOR | + +The discriminator bytes in the serialized data will be: +``` +0x01, // Bool (index 1) +0x02, // Float64 (index 2) +0x04, // String (index 4, after SharedVariant=3) +0x00, // Array (index 0) +0xff, // NULL (255) +``` + +**Key rule**: For any type T in a Dynamic column, its discriminator = index in the alphabetically sorted list of [all_types + SharedVariant]. + +**Path Naming:** + +JSON paths use dot notation: +- `a` - top-level key "a" +- `a.b` - nested key "b" inside "a" +- `a.b.c` - deeply nested + +ClickHouse can escape dots in JSON keys when `json_type_escape_dots_in_keys=1` is set. + +**Typed Paths:** + +When a `JSON` type has typed paths declared: +```sql +JSON(user_id UInt64, name String) +``` + +These paths: +1. Always have the specified type (no inference needed) +2. Are serialized directly in their native type format (not as `Dynamic`) +3. Appear in `ObjectData/ObjectTypedPath` substreams + +**Dynamic Paths:** + +Dynamic paths are: +1. Discovered at insertion time +2. Limited by `max_dynamic_paths` (default 1024) +3. Stored as `Dynamic` columns with inferred types +4. Appear in `ObjectData/ObjectDynamicPath` substreams + +**Shared Data:** + +When more than `max_dynamic_paths` unique paths exist: +- Overflow paths are stored in a shared `Dynamic` column +- Each value in shared data includes its full path + value +- Uses binary serialization similar to `Dynamic`'s shared variant + +**Format Setting: `output_format_native_write_json_as_string`:** + +When set to `1`: +- Uses STRING serialization (version = 1) +- Entire JSON object serialized as text String +- Easier for clients that don't support complex `JSON` deserialization +- Default: `0` (use structured serialization) + +**Format Setting: `output_format_native_use_flattened_dynamic_and_json_serialization`:** + +When set to `1`: +- Uses FLATTENED serialization (version = 3) +- All paths serialized as individual columns +- No `Dynamic` wrapper, just direct column data +- Easier for some clients to parse +- Default: `0` + +**Reading JSON Columns:** + +To deserialize a `JSON` column in standard mode: + +1. Read `ObjectStructure` to get version and path list +2. For each typed path: deserialize using its known type +3. For each dynamic path: deserialize as `Dynamic` column +4. If shared data exists: deserialize the shared `Dynamic` column +5. Reconstruct JSON objects by merging values from all paths + +In STRING mode: just read the String value containing JSON text. + +In FLATTENED mode: read each path as a separate column, with a NULL indicator for missing paths. diff --git a/nativespec.md b/docs/nativespec.md similarity index 97% rename from nativespec.md rename to docs/nativespec.md index e40489e..19beaf2 100644 --- a/nativespec.md +++ b/docs/nativespec.md @@ -1,3822 +1,3822 @@ -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -!!!!! NOTE: DOCS GENERATED BY CLAUDE VIA EXAMINING CORE CODEBASE AND GATHERING DATA FROM QUERIES. HAS NOT BEEN VERIFIED BY A HUMAN. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - -# Native Format Specification - -## 1. Introduction - -### Overview - -The Native format is ClickHouse's columnar binary format designed for efficient data exchange. Unlike row-based formats (like RowBinary) where each row is serialized sequentially, Native organizes data by column - all values for the first column are written together, then all values for the second column, and so on. - -**Columnar Layout (Native):** -``` -Column1: [row1_val, row2_val, row3_val, ...] -Column2: [row1_val, row2_val, row3_val, ...] -... -``` - -**Row-Based Layout (RowBinary):** -``` -Row1: [col1_val, col2_val, col3_val, ...] -Row2: [col1_val, col2_val, col3_val, ...] -... -``` - -The columnar layout enables efficient compression (similar values are adjacent), better CPU cache utilization, and aligns with ClickHouse's internal storage format. - -### Native Format vs TCP Native Protocol - -The term "Native" is used in two related but distinct contexts: - -1. **Native Format**: A binary wire format for serializing blocks of columnar data. This document specifies the Native format. - -2. **TCP Native Protocol**: ClickHouse's proprietary client-server protocol that uses the Native format for data transfer, plus additional protocol messages for queries, progress, exceptions, etc. - -**Key Differences:** - -| Aspect | Native Format (HTTP) | TCP Native Protocol | -|--------|---------------------|---------------------| -| Transport | HTTP POST body | TCP socket | -| BlockInfo | Not included | Included (when `client_revision > 0`) | -| Streaming | Single response | Multi-block streaming with progress | -| Authentication | HTTP headers | Protocol handshake | -| Use Case | Simple integrations | Native clients, high-performance | - -When using HTTP with `FORMAT Native`, you receive pure Native format blocks without the TCP protocol overhead. The TCP protocol wraps Native format blocks with additional metadata like `BlockInfo` (see Section 3). - -### Comparison with RowBinary - -| Feature | Native | RowBinary | -|---------|--------|-----------| -| **Layout** | Columnar (all values per column together) | Row-based (all columns per row together) | -| **Block Header** | NumColumns, NumRows, column names & types | None (schema must be known) | -| **Type Info** | Embedded in stream | External (defined by INSERT or FORMAT clause) | -| **Streaming** | Block-at-a-time | Row-at-a-time | -| **Memory** | Higher (full column buffers) | Lower (single row) | -| **Compression** | Better (similar values adjacent) | Moderate | -| **LowCardinality** | Dictionary-encoded | Materialized values | -| **Nullable** | Null bitmap + values | Per-row null flags + values | -| **Complex Types** | Stream-based (Array sizes + elements) | Inline per-row | -| **Best For** | Bulk transfer, server-to-server | Simple clients, streaming | - -**Size Comparison Example:** - -For a column of 1000 `LowCardinality(String)` values with 10 unique strings: - -- **Native**: Dictionary (10 strings) + 1000 index bytes ≈ small -- **RowBinary**: 1000 full strings ≈ much larger - -For simple `UInt64` data without repetition, both formats produce similar sizes. - ---- - -## 2. Block Structure - -The Native format organizes data into blocks. Each block is a self-contained unit with metadata followed by columnar data. When reading over HTTP (not using the TCP protocol with `client_revision > 0`), blocks do not include `BlockInfo`. - -### Block Dimensions - -A block starts with two VarUInt values that specify its dimensions: - -- **NumColumns** (VarUInt): Number of columns in the block -- **NumRows** (VarUInt): Number of rows in the block - -VarUInt is an unsigned LEB128 (Little Endian Base 128) encoding where each byte uses 7 bits for data and 1 bit as a continuation flag. Values 0-127 encode as a single byte. - -**Example: Simple single-column, single-row block** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT 42::UInt32 AS num" | xxd -``` - -``` -00000000: 0101 036e 756d 0655 496e 7433 322a 0000 ...num.UInt32*.. -00000010: 00 . -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 (VarUInt) - 0x01, // NumRows = 1 (VarUInt) - - // Column 0: - 0x03, 0x6e, 0x75, 0x6d, // Column name = "num" (String: length=3, bytes="num") - 0x06, 0x55, 0x49, 0x6e, // Column type = "UInt32" (String: length=6, bytes="UInt32") - 0x74, 0x33, 0x32, // (continuation of "UInt32") - 0x2a, 0x00, 0x00, 0x00, // Column data: 42 as little-endian UInt32 -]); -``` - -**Example: Two columns, one row** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT 'hello'::String AS msg, 100::UInt8 AS id" | xxd -``` - -``` -00000000: 0201 036d 7367 0653 7472 696e 6705 6865 ...msg.String.he -00000010: 6c6c 6f02 6964 0555 496e 7438 64 llo.id.UInt8d -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x02, // NumColumns = 2 (VarUInt) - 0x01, // NumRows = 1 (VarUInt) - - // Column 0: - 0x03, 0x6d, 0x73, 0x67, // Column name = "msg" (String) - 0x06, 0x53, 0x74, 0x72, // Column type = "String" (String) - 0x69, 0x6e, 0x67, // (continuation) - 0x05, 0x68, 0x65, 0x6c, // Column data: "hello" (String: length=5) - 0x6c, 0x6f, // (continuation) - - // Column 1: - 0x02, 0x69, 0x64, // Column name = "id" (String: length=2) - 0x05, 0x55, 0x49, 0x6e, // Column type = "UInt8" (String: length=5) - 0x74, 0x38, // (continuation) - 0x64, // Column data: 100 as UInt8 -]); -``` - -**Example: One column, three rows (columnar layout)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT number::UInt64 AS n FROM numbers(3)" | xxd -``` - -``` -00000000: 0103 016e 0655 496e 7436 3400 0000 0000 ...n.UInt64..... -00000010: 0000 0001 0000 0000 0000 0002 0000 0000 ................ -00000020: 0000 00 ... -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 (VarUInt) - 0x03, // NumRows = 3 (VarUInt) - - // Column 0: - 0x01, 0x6e, // Column name = "n" (String: length=1) - 0x06, 0x55, 0x49, 0x6e, // Column type = "UInt64" (String: length=6) - 0x74, 0x36, 0x34, // (continuation) - - // Column data (all 3 rows, columnar): - 0x00, 0x00, 0x00, 0x00, // Row 0: 0 as UInt64 (little-endian) - 0x00, 0x00, 0x00, 0x00, - 0x01, 0x00, 0x00, 0x00, // Row 1: 1 as UInt64 (little-endian) - 0x00, 0x00, 0x00, 0x00, - 0x02, 0x00, 0x00, 0x00, // Row 2: 2 as UInt64 (little-endian) - 0x00, 0x00, 0x00, 0x00, -]); -``` - -### Per-Column Structure - -Each column in the block is serialized sequentially with the following structure: - -1. **Column Name** (String): VarUInt length + UTF-8 bytes -2. **Column Type** (String or binary-encoded): - - By default: String representation (e.g., `"UInt32"`, `"Nullable(String)"`) - - If `output_format_native_encode_types_in_binary_format=1`: Binary-encoded type structure -3. **Serialization Info** (conditional, revision >= 54454): - - Present only when `client_revision >= DBMS_MIN_REVISION_WITH_CUSTOM_SERIALIZATION` (54454) - - `UInt8 has_custom`: 0 or 1 indicating if custom serialization is used - - If `has_custom == 1`: Binary-encoded serialization kind stack (for sparse columns, etc.) -4. **Column Data** (columnar, type-specific): - - All values for this column across all rows - - Serialization depends on the data type (see subsequent sections) - - If `NumRows == 0`, no data bytes are written (even for variable-length types) - -**Empty block behavior:** - -When a query returns 0 rows, the entire response is empty (0 bytes). The block structure is only written when there is at least one row or when explicitly required by the protocol. - -```bash -# Query with 0 rows produces empty output -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT number::UInt64 AS n FROM numbers(0)" | wc -c -# Output: 0 -``` - ---- - -## 3. BlockInfo (TCP Protocol Only) - -`BlockInfo` contains additional metadata about the block, used primarily for distributed query execution and aggregation. It is **only present when using the TCP protocol** with `client_revision > 0`. When reading Native format over HTTP, `BlockInfo` is not included. - -### When BlockInfo is Present - -`BlockInfo` appears at the very beginning of each block, before the block dimensions, when: -- The block is transmitted over the TCP (Native) protocol -- `client_revision > 0` (i.e., not reading from an old client) - -In the HTTP Native format context, `BlockInfo` is never serialized, and blocks start directly with NumColumns/NumRows. - -### Field-Based Serialization Format - -`BlockInfo` uses a field-number-based binary format (similar to protobuf but simpler). The structure is: - -``` -[Field1_Num(VarUInt), Field1_Value(binary)] -[Field2_Num(VarUInt), Field2_Value(binary)] -... -[0x00] // Terminator byte -``` - -Each field is serialized only if the protocol revision supports it. The serialization always ends with a terminator byte (`0x00`). - -### Fields - -`BlockInfo` defines three fields (as of protocol revision 54480+): - -#### Field 1: `is_overflows` (bool) - -- **Field Number**: 1 (VarUInt) -- **Type**: `bool` (UInt8: 0 = false, 1 = true) -- **Minimum Revision**: 0 (always present if `client_revision > 0`) -- **Purpose**: Indicates if this block contains overflow rows from `GROUP BY ... WITH TOTALS` when `max_rows_to_group_by` is exceeded with `group_by_overflow_mode = 'any'`. When `true`, this block contains aggregated values that didn't fit within the limit. - -**Encoding:** -```typescript -// If is_overflows = true: -const field1 = new Uint8Array([ - 0x01, // Field number 1 (VarUInt) - 0x01, // is_overflows = true (bool as UInt8) -]); -``` - -#### Field 2: `bucket_num` (Int32) - -- **Field Number**: 2 (VarUInt) -- **Type**: `Int32` (4 bytes, little-endian) -- **Minimum Revision**: 0 (always present if `client_revision > 0`) -- **Purpose**: Used in two-level aggregation to indicate which bucket this block's data belongs to. When using two-level aggregation, data with different key groups are scattered across different buckets. This field helps optimize merging during distributed aggregation. -- **Default Value**: -1 (indicates no bucket assignment) - -**Encoding:** -```typescript -// If bucket_num = 5: -const field2 = new Uint8Array([ - 0x02, // Field number 2 (VarUInt) - 0x05, 0x00, 0x00, 0x00, // bucket_num = 5 (Int32, little-endian) -]); -``` - -#### Field 3: `out_of_order_buckets` (vector) - -- **Field Number**: 3 (VarUInt) -- **Type**: `std::vector` (VarUInt size + array of Int32 values) -- **Minimum Revision**: 54480 (`DBMS_MIN_REVISION_WITH_OUT_OF_ORDER_BUCKETS_IN_AGGREGATION`) -- **Purpose**: List of bucket IDs that were delayed by `ConvertingAggregatedToChunksTransform` on the current node. Used for coordinating out-of-order bucket processing in distributed aggregation. -- **Default Value**: Empty vector - -**Encoding:** -```typescript -// Vector encoding: VarUInt(size) + elements -const writeVector = (values: Int32[]) => { - const size = values.length; // VarUInt - const elements = values.flatMap(v => - [v & 0xFF, (v >> 8) & 0xFF, (v >> 16) & 0xFF, (v >> 24) & 0xFF] - ); - return [size, ...elements]; -}; - -// If out_of_order_buckets = [10, 20]: -const field3 = new Uint8Array([ - 0x03, // Field number 3 (VarUInt) - 0x02, // Vector size = 2 (VarUInt) - 0x0A, 0x00, 0x00, 0x00, // Element 0: 10 (Int32, little-endian) - 0x14, 0x00, 0x00, 0x00, // Element 1: 20 (Int32, little-endian) -]); -``` - -### Terminator Byte - -The `BlockInfo` serialization always ends with a terminator byte of `0x00` (VarUInt encoding of 0), indicating no more fields follow. - -### Complete BlockInfo Example - -Here's a complete example of `BlockInfo` with all fields set (hypothetical, as this would only occur in TCP protocol context): - -```typescript -const blockInfo = new Uint8Array([ - // Field 1: is_overflows = false - 0x01, // Field number 1 - 0x00, // false - - // Field 2: bucket_num = 3 - 0x02, // Field number 2 - 0x03, 0x00, 0x00, 0x00, // bucket_num = 3 - - // Field 3: out_of_order_buckets = [5, 7] (only if revision >= 54480) - 0x03, // Field number 3 - 0x02, // Vector size = 2 - 0x05, 0x00, 0x00, 0x00, // 5 - 0x07, 0x00, 0x00, 0x00, // 7 - - // Terminator - 0x00, // End of BlockInfo - - // After BlockInfo, the block dimensions and columns follow: - // 0x01, 0x01, ... // NumColumns, NumRows, ... -]); -``` - -### Reading BlockInfo - -When reading a Native format stream over the TCP protocol with `server_revision > 0`, the reader must: - -1. Read field numbers (VarUInt) in a loop -2. For each non-zero field number, read the corresponding field value based on its known type -3. Stop when field number `0x00` is encountered -4. Then proceed to read the block dimensions (NumColumns, NumRows) and column data - -**Note**: The HTTP Native format never includes `BlockInfo`, so HTTP readers should skip directly to reading NumColumns/NumRows. - ---- - -## 4. Stream Architecture Concepts - -The Native format uses a **stream-based architecture** for complex data types. This is a conceptual model where certain types decompose their binary representation into multiple logical "streams" that are **interleaved in the actual binary output**. Understanding this concept is essential for correctly decoding complex types over HTTP. - -### What are Streams? - -In ClickHouse's serialization architecture, a "stream" is a logical subdivision of a column's binary data. Simple types like `UInt32` or `String` use a single stream (called `Regular`), where all values are written contiguously. Complex types decompose into multiple named streams, each carrying a specific piece of structural information. - -**Important**: When reading Native format over HTTP, streams are not separate physical byte sequences. They are **conceptual divisions** that appear **sequentially interleaved** in the binary output. The serialization process writes each stream's data in a specific order, and you must read them in that exact order. - -### Why Native Uses Streams - -The stream architecture exists because: - -1. **Columnar Storage Optimization**: In ClickHouse's storage engine (MergeTree), these streams map to separate physical files on disk (e.g., `column.bin` for data, `column.size0.bin` for array sizes). This allows selective reading and efficient compression. - -2. **Flexible Composition**: Complex types can nest arbitrarily (e.g., `Array(Array(LowCardinality(String)))`). Each type layer adds its own streams to the path hierarchy without knowing about parent or child types. - -3. **Shared Structures**: Some streams (like null maps in `Nullable` or dictionaries in `LowCardinality`) can be shared across granules or reused, reducing redundancy. - -### How Streams Relate to HTTP Decoding - -When you read Native format over HTTP: - -1. Each column's data is written by calling `serializeBinaryBulkWithMultipleStreams` -2. The serialization code calls a "stream getter" function for each logical stream path -3. All streams write to the **same physical TCP/HTTP output buffer**, one after another -4. Your decoder must understand the **stream order** to parse the bytes correctly - -**Practical Effect**: You don't read from multiple simultaneous streams. Instead, you read **sequential chunks** of binary data, where each chunk corresponds to a logical stream. The order is determined by the type's `enumerateStreams` method. - -### Stream Path Hierarchy - -Each logical stream is identified by a **SubstreamPath**: a sequence of `Substream` components that describe the nesting structure. The path is built up as types compose. - -**Example SubstreamPath representations**: - -| Type | SubstreamPath Components | Description | -|------|-------------------------|-------------| -| `UInt32` | `{Regular}` | Single stream for values | -| `Array(UInt32)` | `{ArraySizes}` | Array sizes (offsets) | -| | `{ArrayElements, Regular}` | Element values | -| `Nullable(String)` | `{NullMap}` | Null indicator bytes | -| | `{NullableElements, Regular}` | String values (including for NULLs) | -| `Array(Nullable(String))` | `{ArraySizes}` | Array offsets | -| | `{ArrayElements, NullMap}` | Null indicators for all elements | -| | `{ArrayElements, NullableElements, Regular}` | String data | -| `LowCardinality(String)` | `{DictionaryKeys}` | Unique dictionary values | -| | `{DictionaryIndexes}` | Row-to-dictionary index mappings | - -### Detailed Stream Examples - -#### Array(UInt32) - -For `Array(UInt32)` with 3 rows: `[1,2]`, `[3]`, `[]`: - -**Streams**: -1. **ArraySizes** (stream path: `{ArraySizes}`): - - Contains array sizes for each row - - Written as `UInt64` values - - Bytes: `02 00 00 00 00 00 00 00` (size 2), `01 00 00 00 00 00 00 00` (size 1), `00 00 00 00 00 00 00 00` (size 0) - -2. **ArrayElements** (stream path: `{ArrayElements, Regular}`): - - Contains flattened element values: 1, 2, 3 - - Written as contiguous `UInt32` values - - Bytes: `01 00 00 00` (1), `02 00 00 00` (2), `03 00 00 00` (3) - -**Binary layout** (streams written sequentially): -``` -[ArraySizes: 24 bytes for 3 offsets] [ArrayElements: 12 bytes for 3 UInt32 values] -``` - -#### LowCardinality(String) - -For `LowCardinality(String)` with 4 rows: `"hello"`, `"world"`, `"hello"`, `"world"`: - -**Streams**: -1. **DictionaryKeys** (stream path: `{DictionaryKeys}`): - - Prefix: version marker (UInt64 in prefix stream) - - Dictionary values: unique strings `"hello"`, `"world"` - -2. **DictionaryIndexes** (stream path: `{DictionaryIndexes}`): - - Serialization type byte (encodes index type: UInt8/16/32/64, flags) - - Additional keys count (if any overflow keys) - - Row count (UInt64) - - Index values: `0, 1, 0, 1` (indexes into dictionary) - -#### Array(LowCardinality(String)) - -For nested types, streams compose hierarchically: - -**Streams** (in order): -1. `{ArraySizes}` - array offsets -2. `{ArrayElements, DictionaryKeys}` - dictionary for all elements -3. `{ArrayElements, DictionaryIndexes}` - indexes for all elements - -The `ArrayElements` prefix indicates these streams belong to the array's elements, allowing proper reconstruction of the nested structure. - -### Reading Order - -When deserializing, you must: - -1. **Follow the enumeration order**: Types define their stream order via `enumerateStreams`. This order is deterministic. - -2. **Read each stream completely**: Don't interleave reads between different streams of the same column. Read one stream, then the next. - -3. **Respect nesting**: For `Array(T)`, first read array sizes, then recursively read the element stream(s) for type `T`. - -4. **Handle prefixes/suffixes**: Some types write metadata in `serializeBinaryBulkStatePrefix` (before data) and `serializeBinaryBulkStateSuffix` (after data). In Native format over HTTP, prefixes typically appear inline with the first granule's data. - -### Stream Interleaving with Multiple Columns - -In a block with multiple columns, the **entire column** (all its streams) is written before the next column: - -``` -Block: - Column 0: [all streams for column 0] - Column 1: [all streams for column 1] - Column 2: [all streams for column 2] -``` - -Within a column, streams appear in the order determined by `enumerateStreams`. - -### Summary for Client Developers - -When decoding Native format over HTTP: - -- **Streams are conceptual, not physical**: You read a single byte stream, but must mentally track which logical stream you're in -- **Order is critical**: Stream order is type-specific and deterministic. Follow the patterns shown above. -- **Nested types compose paths**: Each nesting level adds components to the stream path (e.g., `ArrayElements` → `NullableElements` → `Regular`) -- **Start with simple types**: Implement `Array` and `Nullable` first to understand the pattern, then tackle `LowCardinality`, `Map`, `Tuple`, etc. - -Understanding streams as "logical subdivisions written sequentially" rather than "parallel data channels" is the key mental model for successfully implementing a Native format decoder. - ---- - -## 5. Simple Data Types - -These types have the same columnar encoding as their RowBinary counterparts (values written consecutively for all rows). - -### 5.1 Integer Types - -Integer types are serialized as fixed-width little-endian values. For a column with N rows, N consecutive integer values are written. - -**Supported types:** -- `Int8` / `UInt8`: 1 byte per value -- `Int16` / `UInt16`: 2 bytes per value (little-endian) -- `Int32` / `UInt32`: 4 bytes per value (little-endian) -- `Int64` / `UInt64`: 8 bytes per value (little-endian) -- `Int128` / `UInt128`: 16 bytes per value (little-endian) -- `Int256` / `UInt256`: 32 bytes per value (little-endian) - -**Example: UInt32 single value** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT 42::UInt32 AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0655 496e 7433 322a 0000 ...col.UInt32*.. -00000010: 00 . -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x06, 0x55, 0x49, 0x6e, // Column type = "UInt32" - 0x74, 0x33, 0x32, - 0x2a, 0x00, 0x00, 0x00, // Data: 42 (little-endian UInt32) -]); -``` - -**Example: Multiple rows (columnar)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT number::UInt8 AS col FROM numbers(3)" | xxd -``` - -``` -00000000: 0103 0363 6f6c 0555 496e 7438 0001 02 ...col.UInt8... -``` - -The column data contains three consecutive UInt8 values: `0x00`, `0x01`, `0x02`. - -**Example: Int8 negative value** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT -1::Int8 AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0449 6e74 38ff ...col.Int8. -``` - -The value `-1` is encoded as `0xff` (two's complement representation). - -**Example: Int128 large value** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT 170141183460469231731687303715884105727::Int128 AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0649 6e74 3132 38ff ffff ...col.Int128... -00000010: ffff ffff ffff ffff ffff ffff 7f ............. -``` - -The 16-byte value is little-endian: `0x7fffffffffffffffffffffffffffffff`. - -**Example: UInt256** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT 1::UInt256 AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0755 496e 7432 3536 0100 ...col.UInt256.. -00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................ -00000020: 0000 0000 0000 0000 0000 0000 00 ............. -``` - -The 32-byte value is little-endian with the value `1` in the first byte. - -### 5.2 Floating Point Types - -Floating point types use IEEE 754 standard encoding in little-endian byte order. - -**Supported types:** -- `Float32`: 4 bytes, IEEE 754 single-precision -- `Float64`: 8 bytes, IEEE 754 double-precision -- `BFloat16`: 2 bytes, brain floating point (16-bit truncation of Float32) - -**Example: Float32** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT 1.25::Float32 AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0746 6c6f 6174 3332 0000 ...col.Float32.. -00000010: a03f .? -``` - -The value `1.25` is encoded as `0x3fa00000` (little-endian IEEE 754). - -**Example: Float64** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT 3.14159265358979::Float64 AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0746 6c6f 6174 3634 112d ...col.Float64.- -00000010: 4454 fb21 0940 DT.!.@ -``` - -The value π is encoded as `0x400921fb54442d11` (little-endian IEEE 754 double). - -**Example: BFloat16** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT 1.25::BFloat16 AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0842 466c 6f61 7431 36a0 ...col.BFloat16. -00000010: 3f ? -``` - -`BFloat16` stores the top 16 bits of the Float32 representation: `0x3fa0` (little-endian). This corresponds to the first 2 bytes of the Float32 encoding `0x3fa00000`, reversed to little-endian order. - -### 5.3 Bool - -The `Bool` type is encoded as a single byte per value: -- `0x00` = false -- `0x01` = true - -**Example: true** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT true::Bool AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0442 6f6f 6c01 ...col.Bool. -``` - -**Example: false** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT false::Bool AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0442 6f6f 6c00 ...col.Bool. -``` - -### 5.4 String & FixedString - -**String:** Each string value is encoded as a VarUInt length followed by the UTF-8 bytes. - -- **VarUInt length**: Number of bytes in the string (LEB128 encoding) -- **Bytes**: Raw UTF-8 data - -**Example: String** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT 'hello'::String AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0653 7472 696e 6705 6865 ...col.String.he -00000010: 6c6c 6f llo -``` - -The string `"hello"` is encoded as: -- `0x05`: Length = 5 bytes (VarUInt) -- `0x68 0x65 0x6c 0x6c 0x6f`: "hello" in UTF-8 - -**Example: Empty string** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT ''::String AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0653 7472 696e 6700 ...col.String. -``` - -Empty string is encoded as `0x00` (VarUInt length = 0). - -**Example: Multiple strings (columnar)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT arrayJoin(['hello', 'world']) AS col" | xxd -``` - -``` -00000000: 0102 0363 6f6c 0653 7472 696e 6705 6865 ...col.String.he -00000010: 6c6c 6f05 776f 726c 64 llo.world -``` - -Each string is serialized consecutively: -- Row 0: `0x05` + "hello" -- Row 1: `0x05` + "world" - -**Example: Long string with multi-byte VarUInt** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT repeat('x', 300) AS col" | xxd | head -5 -``` - -``` -00000000: 0101 0363 6f6c 0653 7472 696e 67ac 0278 ...col.String..x -00000010: 7878 7878 7878 7878 7878 7878 7878 7878 xxxxxxxxxxxxxxxx -00000020: 7878 7878 7878 7878 7878 7878 7878 7878 xxxxxxxxxxxxxxxx -00000030: 7878 7878 7878 7878 7878 7878 7878 7878 xxxxxxxxxxxxxxxx -00000040: 7878 7878 7878 7878 7878 7878 7878 7878 xxxxxxxxxxxxxxxx -``` - -The length is encoded as `0xac 0x02`: -- Byte 1: `0xac` = `0b10101100` → 7 data bits = `0b0101100`, continue bit set -- Byte 2: `0x02` = `0b00000010` → 7 data bits = `0b0000010`, continue bit not set -- Value = `(0b0101100) | (0b0000010 << 7)` = `44 | 256` = `300` - -**FixedString(N):** Each value occupies exactly N bytes, zero-padded if the string is shorter. - -**Example: FixedString(5)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT 'abc'::FixedString(5) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0e46 6978 6564 5374 7269 ...col.FixedStri -00000010: 6e67 2835 2961 6263 0000 ng(5)abc.. -``` - -The value `"abc"` is padded to 5 bytes: `0x61 0x62 0x63 0x00 0x00`. - -### 5.5 Date & Time Types - -Date and time types are encoded as integer values representing specific time units since their respective epochs. - -**Date:** UInt16 representing days since `1970-01-01`. - -**Example: Date** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDate('2023-12-25') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0444 6174 6504 4d ...col.Date.M -``` - -The value is `0x4d04` (little-endian UInt16) = 19716 days since 1970-01-01. - -**Example: Date epoch** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDate('1970-01-01') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0444 6174 6500 00 ...col.Date.. -``` - -Epoch value is `0x0000` (0 days). - ---- - -**Date32:** Int32 representing days since `1900-01-01` (supports dates before 1970). - -**Example: Date32** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDate32('2023-12-25') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0644 6174 6533 3204 4d00 ...col.Date32.M. -00000010: 00 . -``` - -The value is `0x00004d04` (little-endian Int32) = 45284 days since 1900-01-01. - -**Example: Date32 epoch (1900-01-01)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDate32('1900-01-01') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0644 6174 6533 3221 9cff ...col.Date32!.. -00000010: ff . -``` - -The value is `0xffff9c21` (little-endian Int32) = -25567 days. Despite the documentation saying "days since 1900-01-01", the actual epoch is 1970-01-01 (value 0). Dates before 1970 are represented as negative values. The date 1900-01-01 is -25567 days before 1970-01-01. - ---- - -**DateTime:** UInt32 representing seconds since Unix epoch (`1970-01-01 00:00:00 UTC`). - -**Example: DateTime** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDateTime('2023-12-25 10:30:45') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0844 6174 6554 696d 65d5 ...col.DateTime. -00000010: 5989 65 Y.e -``` - -The value is `0x658959d5` (little-endian UInt32) = 1703502645 seconds since epoch. - -**Example: DateTime epoch** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDateTime(0) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0844 6174 6554 696d 6500 ...col.DateTime. -00000010: 0000 00 ... -``` - -Epoch value is `0x00000000`. - ---- - -**DateTime64(P):** Int64 representing time units since epoch with precision P (0-9). -- Precision 0: seconds -- Precision 3: milliseconds -- Precision 6: microseconds -- Precision 9: nanoseconds - -The type string includes the precision: `DateTime64(3)`. - -**Example: DateTime64(3) - milliseconds** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDateTime64('2023-12-25 10:30:45.123', 3) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0d44 6174 6554 696d 6536 ...col.DateTime6 -00000010: 3428 3329 83e8 86a0 8c01 0000 4(3)........ -``` - -The value is `0x00000001a086e883` (little-endian Int64) = 1703502645123 milliseconds since epoch. - -**Example: DateTime64 epoch** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDateTime64(0, 3) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0d44 6174 6554 696d 6536 ...col.DateTime6 -00000010: 3428 3329 0000 0000 0000 0000 4(3)........ -``` - -Epoch value is `0x0000000000000000`. - -### 5.6 Decimal Types - -Decimal types store fixed-point numbers as little-endian integers scaled by a power of 10. The scale (number of decimal places) is encoded in the type string. - -**Supported types:** -- `Decimal32(S)` / `Decimal(P, S)` where P ≤ 9: 4 bytes (Int32) -- `Decimal64(S)` / `Decimal(P, S)` where P ≤ 18: 8 bytes (Int64) -- `Decimal128(S)` / `Decimal(P, S)` where P ≤ 38: 16 bytes (Int128) -- `Decimal256(S)` / `Decimal(P, S)` where P ≤ 76: 32 bytes (Int256) - -The stored value equals the decimal number multiplied by 10^S. - -**Example: Decimal32(2)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDecimal32(123.45, 2) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0d44 6563 696d 616c 2839 ...col.Decimal(9 -00000010: 2c20 3229 3090 0000 , 2)0... -``` - -The type string in the wire format is always `Decimal(P, S)` notation (e.g., `Decimal(9, 2)`), even when the SQL uses `Decimal32(S)`. The precision 9 indicates this is a `Decimal32` (P ≤ 9 → 4 bytes). - -The value `123.45` is stored as `12345` (123.45 × 10^2). -Encoded as `0x00003090` (little-endian Int32) = 12345. - -**Example: Decimal64(4)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDecimal64(123.45, 4) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0e44 6563 696d 616c 2831 ...col.Decimal(1 -00000010: 382c 2034 2944 d612 0000 0000 00 8, 4)D....... -``` - -The type string is `Decimal(18, 4)`. -The value `123.45` is stored as `1234500` (123.45 × 10^4). -Encoded as `0x0000000012d644` (little-endian Int64) = 1234500. - -**Example: Decimal128(6)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDecimal128(123.45, 6) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0e44 6563 696d 616c 2833 ...col.Decimal(3 -00000010: 382c 2036 2990 b25b 0700 0000 0000 0000 8, 6)..[........ -00000020: 0000 0000 00 ..... -``` - -The type string is `Decimal(38, 6)`. -The value `123.45` is stored as `123450000` (123.45 × 10^6). -Encoded as `0x00000000000000000000000007 5bb290` (little-endian Int128) = 123450000. - -**Example: Decimal256(8)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toDecimal256(123.45, 8) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0e44 6563 696d 616c 2837 ...col.Decimal(7 -00000010: 362c 2038 2940 c0d1 df02 0000 0000 0000 6, 8)@.......... -00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ................ -00000030: 0000 0000 00 ..... -``` - -The type string is `Decimal(76, 8)`. -The value `123.45` is stored as `12345000000` (123.45 × 10^8). -Encoded as 32-byte little-endian Int256 = 12345000000. - -### 5.7 UUID - -`UUID` values are encoded as 16 bytes in a specific byte order that differs from the standard UUID string representation. - -ClickHouse stores UUIDs in a format optimized for indexing, where bytes are reordered from the standard UUID string format `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`. - -**Byte order:** The UUID is stored with the first and second groups swapped compared to RFC 4122 string representation. - -**Example: UUID** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toUUID('61f0c404-5cb3-11e7-907b-a6006ad3dba0') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0455 5549 44e7 11b3 5c04 ...col.UUID...\. -00000010: c4f0 61a0 dbd3 6a00 a67b 90 ..a...j..{. -``` - -The UUID string `61f0c404-5cb3-11e7-907b-a6006ad3dba0` is encoded as: - -``` -e7 11 b3 5c 04 c4 f0 61 a0 db d3 6a 00 a6 7b 90 -``` - -This represents the bytes reordered as: -- Bytes 6-7 of string (`11e7`) → bytes 0-1 of storage -- Bytes 4-5 of string (`5cb3`) → bytes 2-3 of storage -- Bytes 0-3 of string (`61f0c404`) → bytes 4-7 of storage -- Bytes 8-15 of string (`907ba6006ad3dba0`) → bytes 8-15 of storage - -The reordering optimizes sorting and range queries on time-based UUIDs (version 1). - -### 5.8 IPv4 & IPv6 - -**IPv4:** Stored as 4 bytes (UInt32) in reversed byte order from the standard dotted-decimal notation. - -**Example: IPv4** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toIPv4('192.168.1.1') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0449 5076 3401 01a8 c0 ...col.IPv4.... -``` - -The address `192.168.1.1` is encoded as `0x0101a8c0`: -- Byte 0: `0x01` (last octet: 1) -- Byte 1: `0x01` (third octet: 1) -- Byte 2: `0xa8` (second octet: 168) -- Byte 3: `0xc0` (first octet: 192) - -This is little-endian UInt32 interpretation of the reversed IP bytes. - -**Example: IPv4 localhost** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toIPv4('127.0.0.1') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0449 5076 3401 0000 7f ...col.IPv4.... -``` - -The address `127.0.0.1` is encoded as `0x0100007f` (reversed bytes). - -**Example: Multiple IPv4 values** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toIPv4('192.168.' || toString(number) || '.1') AS col FROM numbers(3)" | xxd -``` - -``` -00000000: 0103 0363 6f6c 0449 5076 3401 00a8 c001 ...col.IPv4..... -00000010: 01a8 c001 02a8 c0 ....... -``` - -Three consecutive 4-byte values for `192.168.0.1`, `192.168.1.1`, `192.168.2.1`. - ---- - -**IPv6:** Stored as 16 bytes in network byte order (big-endian). - -**Example: IPv6** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toIPv6('2001:db8::1') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0449 5076 3620 010d b800 ...col.IPv6 .... -00000010: 0000 0000 0000 0000 0000 01 ........... -``` - -The address `2001:db8::1` is encoded as 16 bytes in network byte order: -``` -20 01 0d b8 00 00 00 00 00 00 00 00 00 00 00 01 -``` - -**Example: IPv6 localhost** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toIPv6('::1') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0449 5076 3600 0000 0000 ...col.IPv6..... -00000010: 0000 0000 0000 0000 0000 01 ........... -``` - -The IPv6 loopback address `::1` is 15 zero bytes followed by `0x01`. - -### 5.9 Enum8 & Enum16 - -Enum types are encoded as integer values representing the enum value's assigned number. The enum definition is included in the type string. - -**Enum8:** Stored as 1 byte (Int8) per value. - -**Example: Enum8** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT CAST('a', 'Enum8(''a''=1, ''b''=2)') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 1745 6e75 6d38 2827 6127 ...col.Enum8('a' -00000010: 203d 2031 2c20 2762 2720 3d20 3229 01 = 1, 'b' = 2). -``` - -Breakdown: -- Type string: `Enum8('a' = 1, 'b' = 2)` (23 bytes) -- Data: `0x01` (the assigned value for 'a') - -The enum value `'a'` is encoded as its assigned integer `1`. - ---- - -**Enum16:** Stored as 2 bytes (Int16, little-endian) per value. - -**Example: Enum16** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT CAST('hello', 'Enum16(''hello''=1000, ''world''=2000)') AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 2645 6e75 6d31 3628 2768 ...col&Enum16('h -00000010: 656c 6c6f 2720 3d20 3130 3030 2c20 2777 ello' = 1000, 'w -00000020: 6f72 6c64 2720 3d20 3230 3030 29e8 03 orld' = 2000).. -``` - -Breakdown: -- Type string: `Enum16('hello' = 1000, 'world' = 2000)` (38 bytes) -- Data: `0x03e8` (little-endian Int16) = 1000 - -The enum value `'hello'` is encoded as its assigned integer `1000` in little-endian format. - ---- - -## 6. Complex Data Types (Native-Specific) - -### 6.1 Nullable - -`Nullable(T)` wraps any type `T` to allow `NULL` values. It uses a two-stream architecture: - -1. **NullMap Stream**: Null indicator bytes (1 byte per row) -2. **NullableElements Stream**: The nested type's data (all rows, including NULLs) - -**Stream Layout:** - -The serialization writes two streams sequentially: - -``` -[NullMap: N bytes] [NestedData: serialized as type T] -``` - -**Null Map Encoding:** - -For N rows, N consecutive bytes are written: -- `0x00` = non-NULL (value is valid) -- `0x01` = NULL (value should be ignored) - -**Important**: Even for NULL rows, the nested type still writes placeholder data. When deserializing, if the null map indicates NULL, the corresponding value in the nested data should be ignored and replaced with NULL. - -**Example: Nullable(UInt64) with mixed NULL values** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT if(number % 2 = 0, number, NULL)::Nullable(UInt64) AS col FROM numbers(5)" | xxd -``` - -``` -00000000: 0105 0363 6f6c 104e 756c 6c61 626c 6528 ...col.Nullable( -00000010: 5549 6e74 3634 2900 0100 0100 0000 0000 UInt64)......... -00000020: 0000 0000 0100 0000 0000 0000 0200 0000 ................ -00000030: 0000 0000 0300 0000 0000 0000 0400 0000 ................ -00000040: 0000 0000 .... -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x05, // NumRows = 5 - - // Column metadata - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (String: length=3) - 0x10, 0x4e, 0x75, 0x6c, // Column type = "Nullable(UInt64)" (String: length=16) - 0x6c, 0x61, 0x62, 0x6c, - 0x65, 0x28, 0x55, 0x49, - 0x6e, 0x74, 0x36, 0x34, - 0x29, - - // Stream 1: NullMap (5 bytes for 5 rows) - 0x00, // Row 0: not NULL (value = 0) - 0x01, // Row 1: NULL - 0x00, // Row 2: not NULL (value = 2) - 0x01, // Row 3: NULL - 0x00, // Row 4: not NULL (value = 4) - - // Stream 2: NullableElements (5 x UInt64 values) - // Note: All 5 values are written, even for NULL rows - 0x00, 0x00, 0x00, 0x00, // Row 0: 0 (valid, NullMap[0]=0x00) - 0x00, 0x00, 0x00, 0x00, - 0x01, 0x00, 0x00, 0x00, // Row 1: 1 (IGNORED, NullMap[1]=0x01) - 0x00, 0x00, 0x00, 0x00, - 0x02, 0x00, 0x00, 0x00, // Row 2: 2 (valid, NullMap[2]=0x00) - 0x00, 0x00, 0x00, 0x00, - 0x03, 0x00, 0x00, 0x00, // Row 3: 3 (IGNORED, NullMap[3]=0x01) - 0x00, 0x00, 0x00, 0x00, - 0x04, 0x00, 0x00, 0x00, // Row 4: 4 (valid, NullMap[4]=0x00) - 0x00, 0x00, 0x00, 0x00, -]); - -// Resulting column values: [0, NULL, 2, NULL, 4] -``` - -**Example: All NULL values** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT NULL::Nullable(UInt8) AS col FROM numbers(3)" | xxd -``` - -``` -00000000: 0103 0363 6f6c 0f4e 756c 6c61 626c 6528 ...col.Nullable( -00000010: 5549 6e74 3829 0101 0100 0000 UInt8)...... -``` - -Breakdown: -- NullMap: `0x01 0x01 0x01` (all three rows are NULL) -- Data: `0x00 0x00 0x00` (placeholder UInt8 values, all ignored) - -**Example: Nullable(String)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT if(number = 1, NULL, toString(number))::Nullable(String) AS col FROM numbers(3)" | xxd -``` - -``` -00000000: 0103 0363 6f6c 0f4e 756c 6c61 626c 6528 ...col.Nullable( -00000010: 5374 7269 6e67 2900 0100 0130 0030 0132 String)....0.0.2 -``` - -Breakdown: -- NullMap: `0x00 0x01 0x00` (rows 0 and 2 are valid, row 1 is NULL) -- Data: - - Row 0: `0x01 0x30` (String "0", valid) - - Row 1: `0x00` (empty string placeholder, IGNORED) - - Row 2: `0x01 0x32` (String "2", valid) - -**Nested Nullable Types:** - -For complex types like `Array(Nullable(T))`, the stream paths compose hierarchically: -- `{ArraySizes}` - array offsets -- `{ArrayElements, NullMap}` - null indicators for all array elements -- `{ArrayElements, NullableElements, Regular}` - element values - -### 6.2 LowCardinality - -`LowCardinality(T)` is a dictionary-encoded column type that stores unique values once in a dictionary and uses integer indexes to reference them. This dramatically reduces storage and memory usage for columns with low cardinality (few distinct values). - -**Stream Architecture:** - -`LowCardinality` uses two main streams that are written sequentially: - -1. **DictionaryKeys Stream**: Dictionary metadata (version) -2. **DictionaryIndexes Stream**: Index metadata, dictionary values (as "additional keys"), and row indexes - -**Complete Binary Structure:** - -For Native format over HTTP, the typical structure is: - -``` -// DictionaryKeys Stream: -[KeysVersion (UInt64, little-endian)] - -// DictionaryIndexes Stream: -[IndexesSerializationType (UInt64, little-endian)] -[NumAdditionalKeys (UInt64, little-endian)] (if HasAdditionalKeysBit flag is set) -[AdditionalKeys data] (if HasAdditionalKeysBit flag is set) -[NumRows (UInt64, little-endian)] -[Index values: N integers] -``` - -**DictionaryKeys Stream:** - -- **KeysVersion** (UInt64, little-endian): Serialization version - - Always `0x01` = `SharedDictionariesWithAdditionalKeys` - -In Native format over HTTP, the dictionary is typically sent as "additional keys" in the DictionaryIndexes stream rather than in a global dictionary here, because `low_cardinality_use_single_dictionary_for_part` is enabled by default. - -**DictionaryIndexes Stream:** - -**1. IndexesSerializationType (UInt64, little-endian):** - -Encodes the index integer type and flags: - -**Bits 0-7 (Index Type after flag masking):** -- `0x00` = UInt8 indexes -- `0x01` = UInt16 indexes -- `0x02` = UInt32 indexes -- `0x03` = UInt64 indexes - -**Flag Bits:** -- Bit 8 (`0x0100`): `NeedGlobalDictionaryBit` - Read dictionary from separate global dictionary (0 for Native HTTP) -- Bit 9 (`0x0200`): `HasAdditionalKeysBit` - Additional keys follow -- Bit 10 (`0x0400`): `NeedUpdateDictionary` - Dictionary changed (typically 1 for Native HTTP) - -**Decoding:** - -The `IndexesSerializationType` is a UInt64 where: -- Bits 0-7 contain the index type (0-3) -- Bits 8-10 contain the flags - -```cpp -uint64_t value = read_uint64_le(); -uint8_t index_type = value & 0xFF; // Extract bits 0-7 (type is 0-3) -bool has_additional_keys = (value >> 9) & 1; // Bit 9 -bool need_global_dictionary = (value >> 8) & 1; // Bit 8 -bool need_update_dictionary = (value >> 10) & 1; // Bit 10 -``` - -Note: The index type uses only bits 0-1 (values 0-3), but the full byte is reserved. - -**2. Additional Keys (conditional):** - -If `HasAdditionalKeysBit` is set: - -- **NumAdditionalKeys** (UInt64, little-endian): Number of dictionary entries -- **Keys Data**: Serialized using inner type's bulk serialization - - For `String`: sequence of (VarUInt length + UTF-8 bytes) - - For `UInt32`: sequence of UInt32 values - - etc. - -**3. Row Indexes:** - -- **NumRows** (UInt64, little-endian): Number of rows -- **Index Values**: NumRows consecutive integers (type from IndexesSerializationType) - - Each index references a position in the dictionary - - Index semantics depend on flags: - - If `!need_global_dictionary`: index directly into additional_keys - - If `need_global_dictionary`: index < dict_size → global dict, index >= dict_size → additional_keys[index - dict_size] - -**Example: LowCardinality(String) with repeated values** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toLowCardinality(toString(number % 3)) AS col FROM numbers(6)" | xxd -``` - -``` -00000000: 0106 0363 6f6c 164c 6f77 4361 7264 696e ...col.LowCardin -00000010: 616c 6974 7928 5374 7269 6e67 2901 0000 ality(String)... -00000020: 0000 0000 0000 0600 0000 0000 0004 0000 ................ -00000030: 0000 0000 0000 0130 0131 0132 0600 0000 .......0.1.2.... -00000040: 0000 0000 0102 0301 0203 .......... -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - // Block header - 0x01, // NumColumns = 1 - 0x06, // NumRows = 6 - - // Column metadata - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x16, 0x4c, 0x6f, 0x77, 0x43, 0x61, 0x72, 0x64, 0x69, // Type = "LowCardinality(String)" - 0x6e, 0x61, 0x6c, 0x69, 0x74, 0x79, 0x28, 0x53, - 0x74, 0x72, 0x69, 0x6e, 0x67, 0x29, - - // DictionaryKeys stream: - 0x01, 0x00, 0x00, 0x00, // KeysVersion = 1 (UInt64) - 0x00, 0x00, 0x00, 0x00, - - // DictionaryIndexes stream: - 0x06, 0x00, 0x00, 0x00, // IndexesSerializationType = 0x06 (UInt64) - 0x00, 0x00, 0x00, 0x00, // Exact interpretation requires flag decoding - // Empirically: appears to encode type + flags - - 0x04, 0x00, 0x00, 0x00, // NumAdditionalKeys = 4 (UInt64) - 0x00, 0x00, 0x00, 0x00, // Note: 4 keys for 3 distinct values suggests - // index 0 may be reserved (e.g., for NULL) - - // Additional keys (String values): - 0x00, // Key[0]: empty string (length=0) - 0x01, 0x30, // Key[1]: "0" (length=1, data=0x30) - 0x01, 0x31, // Key[2]: "1" (length=1, data=0x31) - 0x01, 0x32, // Key[3]: "2" (length=1, data=0x32) - - 0x06, 0x00, 0x00, 0x00, // NumRows = 6 (UInt64) - 0x00, 0x00, 0x00, 0x00, - - // Row indexes (6 values): - 0x01, // Row 0: index 1 → "0" (number % 3 = 0) - 0x02, // Row 1: index 2 → "1" (number % 3 = 1) - 0x03, // Row 2: index 3 → "2" (number % 3 = 2) - 0x01, // Row 3: index 1 → "0" (number % 3 = 0) - 0x02, // Row 4: index 2 → "1" (number % 3 = 1) - 0x03, // Row 5: index 3 → "2" (number % 3 = 2) -]); - -// Result: column values = ["0", "1", "2", "0", "1", "2"] -``` - -**Key Observations:** - -1. **Index 0 Convention**: The dictionary often includes a 0-index entry (here, an empty string) that may serve as a default or NULL placeholder. Actual data indexes start at 1. - -2. **Dictionary Compression**: Only 3 distinct values ("0", "1", "2") are stored, even though there are 6 rows. The indexes array (6 bytes) plus dictionary (6 bytes total for keys) is much smaller than 6 separate string serializations would be. - -3. **Index Type Selection**: The index type (UInt8/16/32/64) is chosen based on the dictionary size to minimize space. - -**Example: Simple LowCardinality(String) without repetition** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT toLowCardinality(toString(number)) AS col FROM numbers(3)" | xxd -``` - -``` -00000000: 0103 0363 6f6c 164c 6f77 4361 7264 696e ...col.LowCardin -00000010: 616c 6974 7928 5374 7269 6e67 2901 0000 ality(String)... -00000020: 0000 0000 0000 0600 0000 0000 0004 0000 ................ -00000030: 0000 0000 0000 0130 0131 0132 0300 0000 .......0.1.2.... -00000040: 0000 0000 0102 03 .... -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, 0x03, // 1 column, 3 rows - - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x16, /* "LowCardinality(String)" */, - - // DictionaryKeys: - 0x01, 0x00, 0x00, 0x00, // KeysVersion = 1 - 0x00, 0x00, 0x00, 0x00, - - // DictionaryIndexes: - 0x06, 0x00, 0x00, 0x00, // IndexesSerializationType - 0x00, 0x00, 0x00, 0x00, - - 0x04, 0x00, 0x00, 0x00, // NumAdditionalKeys = 4 - 0x00, 0x00, 0x00, 0x00, - - 0x00, // Key[0]: "" (empty/default) - 0x01, 0x30, // Key[1]: "0" - 0x01, 0x31, // Key[2]: "1" - 0x01, 0x32, // Key[3]: "2" - - 0x03, 0x00, 0x00, 0x00, // NumRows = 3 - 0x00, 0x00, 0x00, 0x00, - - 0x01, 0x02, 0x03, // Indexes: [1, 2, 3] → ["0", "1", "2"] -]); -``` - -**Nested LowCardinality:** - -For complex types like `Array(LowCardinality(String))`, the stream paths compose: -- `{ArraySizes}` - array offsets -- `{ArrayElements, DictionaryKeys}` - dictionary for all array elements -- `{ArrayElements, DictionaryIndexes}` - indexes for all array elements - -**Additional Keys vs. Global Dictionary:** - -- **Global Dictionary Mode** (`need_global_dictionary=true`): Dictionary is shared across granules, written once in DictionaryKeys stream or at end in suffix -- **Additional Keys Mode** (`has_additional_keys=true`): Dictionary sent inline with indexes stream, common in Native HTTP format - -For most queries, Native HTTP format uses additional keys mode, as each HTTP response is self-contained. - -### 6.3 Array - -The `Array` type uses a two-stream architecture to serialize variable-length arrays efficiently in columnar format: - -1. **ArraySizes stream**: Contains the size of each array (serialized as `UInt64` per row) -2. **ArrayElements stream**: Contains all elements from all arrays, flattened consecutively - -This design allows the array structure (sizes) to be separated from the element data, enabling efficient compression and selective reading in storage engines. - -**Stream Enumeration Order:** - -For `Array(T)`, the streams appear in this order: -1. Stream path `{ArraySizes}`: Array sizes (one `UInt64` per row) -2. Stream path `{ArrayElements, ...}`: Element data (recursively serialized according to type `T`) - -**Serialization Details:** - -The array sizes are written as cumulative offsets in the Native format. Each offset is a `UInt64` value in little-endian format that represents the total number of elements from all arrays up to and including the current row. - -The element data follows immediately after all array sizes, with all elements from all arrays written consecutively. For example, if you have 3 arrays `[1,2]`, `[3]`, `[]`, the offsets would be `2, 3, 3` and the element stream would contain just `1, 2, 3` (3 elements total). - -**Example: Simple Array(UInt32)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT [1, 2, 3]::Array(UInt32) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0d41 7272 6179 2855 496e ...col.Array(UIn -00000010: 7433 3229 0300 0000 0000 0000 0100 0000 t32)............ -00000020: 0200 0000 0300 0000 ........ -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 (VarUInt) - 0x01, // NumRows = 1 (VarUInt) - - // Column metadata: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (String: length=3) - 0x0d, 0x41, 0x72, 0x72, // Column type = "Array(UInt32)" (String: length=13) - 0x61, 0x79, 0x28, 0x55, - 0x49, 0x6e, 0x74, 0x33, - 0x32, 0x29, - - // ArraySizes stream (1 array with 3 elements): - 0x03, 0x00, 0x00, 0x00, // Array offset: 3 (UInt64, little-endian) - 0x00, 0x00, 0x00, 0x00, - - // ArrayElements stream (3 UInt32 values): - 0x01, 0x00, 0x00, 0x00, // Element 0: 1 (UInt32) - 0x02, 0x00, 0x00, 0x00, // Element 1: 2 (UInt32) - 0x03, 0x00, 0x00, 0x00, // Element 2: 3 (UInt32) -]); -``` - -**Example: Multiple arrays with varying sizes** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT arrayJoin([[1,2], [3], []]) AS col" | xxd -``` - -``` -00000000: 0103 0363 6f6c 0c41 7272 6179 2855 496e ...col.Array(UIn -00000010: 7438 2902 0000 0000 0000 0003 0000 0000 t8)............. -00000020: 0000 0003 0000 0000 0000 0001 0203 .............. -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x03, // NumRows = 3 - - // Column metadata: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x0c, 0x41, 0x72, 0x72, // Column type = "Array(UInt8)" (length=12) - 0x61, 0x79, 0x28, 0x55, - 0x49, 0x6e, 0x74, 0x38, - 0x29, - - // ArraySizes stream (3 arrays): - 0x02, 0x00, 0x00, 0x00, // Row 0: offset = 2 (cumulative: 2 elements) - 0x00, 0x00, 0x00, 0x00, - 0x03, 0x00, 0x00, 0x00, // Row 1: offset = 3 (cumulative: 3 elements total) - 0x00, 0x00, 0x00, 0x00, - 0x03, 0x00, 0x00, 0x00, // Row 2: offset = 3 (cumulative: 3 elements total) - 0x00, 0x00, 0x00, 0x00, - - // ArrayElements stream (3 total elements: 1, 2, 3): - 0x01, // Element from row 0: 1 (UInt8) - 0x02, // Element from row 0: 2 (UInt8) - 0x03, // Element from row 1: 3 (UInt8) - // Row 2 has no elements (empty array) -]); -``` - -**Understanding Cumulative Offsets:** - -In the Native format, the `ArraySizes` stream contains cumulative offsets, not individual sizes. The offset for row `i` indicates the total number of elements from all arrays in rows 0 through `i`. - -In the example above: -- Row 0: size = 2 - 0 = 2 elements (array `[1, 2]`) -- Row 1: size = 3 - 2 = 1 element (array `[3]`) -- Row 2: size = 3 - 3 = 0 elements (array `[]`) - -**Nested Arrays:** - -For nested arrays like `Array(Array(T))`, the structure repeats hierarchically: -- Outer `ArraySizes` stream: cumulative offsets for the outer arrays -- Inner `ArraySizes` stream: cumulative offsets for all inner arrays (flattened) -- `ArrayElements` stream: all leaf elements (flattened) - -The stream paths would be: -1. `{ArraySizes}` - outer array offsets -2. `{ArrayElements, ArraySizes}` - inner array offsets -3. `{ArrayElements, ArrayElements, Regular}` - element data - -### 6.4 Map - -The `Map(K, V)` type is internally represented as `Array(Tuple(K, V))` where each map entry is a key-value pair stored as a tuple. This means `Map` inherits the same stream structure as `Array`, but with nested streams for the tuple elements. - -**Stream Enumeration Order:** - -For `Map(K, V)`, the streams appear in this order: -1. Stream path `{ArraySizes}`: Map sizes (number of key-value pairs per row) -2. Stream path `{ArrayElements, TupleElement0, ...}`: All keys, serialized according to type `K` -3. Stream path `{ArrayElements, TupleElement1, ...}`: All values, serialized according to type `V` - -**Serialization Process:** - -1. The number of key-value pairs in each map is written to the `ArraySizes` stream (as cumulative offsets) -2. All keys from all maps are flattened and written consecutively -3. All values from all maps are flattened and written consecutively - -**Example: Map(String, UInt32)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT map('a', 1, 'b', 2)::Map(String, UInt32) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 134d 6170 2853 7472 696e ...col.Map(Strin -00000010: 672c 2055 496e 7433 3229 0200 0000 0000 g, UInt32)...... -00000020: 0000 0161 0162 0100 0000 0200 0000 ...a.b........ -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column metadata: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x13, 0x4d, 0x61, 0x70, // Column type = "Map(String, UInt32)" (length=19) - 0x28, 0x53, 0x74, 0x72, - 0x69, 0x6e, 0x67, 0x2c, - 0x20, 0x55, 0x49, 0x6e, - 0x74, 0x33, 0x32, 0x29, - - // ArraySizes stream (1 map with 2 entries): - 0x02, 0x00, 0x00, 0x00, // Map offset: 2 key-value pairs (UInt64) - 0x00, 0x00, 0x00, 0x00, - - // TupleElement0 stream (Keys - 2 String values): - 0x01, 0x61, // Key 0: "a" (String: length=1) - 0x01, 0x62, // Key 1: "b" (String: length=1) - - // TupleElement1 stream (Values - 2 UInt32 values): - 0x01, 0x00, 0x00, 0x00, // Value 0: 1 (UInt32) - 0x02, 0x00, 0x00, 0x00, // Value 1: 2 (UInt32) -]); -``` - -**Stream Path Hierarchy:** - -``` -Map(String, UInt32) -└── ArraySizes [2] (one entry with offset 2) - └── ArrayElements (Tuple) - ├── TupleElement0 (Keys) ["a", "b"] - └── TupleElement1 (Values) [1, 2] -``` - -**Empty Maps:** - -An empty map contributes no keys or values to the element streams. Only the offset in the `ArraySizes` stream reflects its presence (the offset doesn't increase from the previous row). - -### 6.5 Tuple - -The `Tuple` type serializes each element sequentially in separate streams. Unlike `Array` or `Map`, there is no size metadata because tuples have a fixed number of elements known at schema definition time. - -**Stream Enumeration Order:** - -For `Tuple(T1, T2, ..., Tn)`, each element is serialized in its own stream path: -1. Stream path `{TupleElement0, ...}`: First element, serialized according to type `T1` -2. Stream path `{TupleElement1, ...}`: Second element, serialized according to type `T2` -3. ... -n. Stream path `{TupleElement(n-1), ...}`: Nth element, serialized according to type `Tn` - -**Serialization Details:** - -Each tuple element is serialized as if it were an independent column. For a block with N rows, each element stream contains N values of its respective type. The element streams are written consecutively in the order of the tuple elements. - -There is no explicit tuple structure metadata in the wire format - the tuple structure is defined by the type string, and each element is simply serialized in order. - -**Example: Tuple(UInt32, String)** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT (42, 'hello')::Tuple(UInt32, String) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 1554 7570 6c65 2855 496e ...col.Tuple(UIn -00000010: 7433 322c 2053 7472 696e 6729 2a00 0000 t32, String)*... -00000020: 0568 656c 6c6f .hello -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column metadata: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x15, 0x54, 0x75, 0x70, // Column type = "Tuple(UInt32, String)" (length=21) - 0x6c, 0x65, 0x28, 0x55, - 0x49, 0x6e, 0x74, 0x33, - 0x32, 0x2c, 0x20, 0x53, - 0x74, 0x72, 0x69, 0x6e, - 0x67, 0x29, - - // TupleElement0 stream (UInt32): - 0x2a, 0x00, 0x00, 0x00, // Element 0: 42 (UInt32) - - // TupleElement1 stream (String): - 0x05, 0x68, 0x65, 0x6c, // Element 1: "hello" (String: length=5) - 0x6c, 0x6f, -]); -``` - -**Multiple Rows:** - -For multiple rows, each element stream contains all values for that element position across all rows: - -```sql -SELECT arrayJoin([(1, 'a'), (2, 'b'), (3, 'c')])::Tuple(UInt8, String) AS col -``` - -Would serialize as: -- `TupleElement0` stream: `[1, 2, 3]` (3 UInt8 values) -- `TupleElement1` stream: `['a', 'b', 'c']` (3 String values) - -**Nested Tuples:** - -For nested tuples like `Tuple(UInt32, Tuple(String, UInt8))`, the streams are flattened hierarchically: - -1. `{TupleElement0, Regular}` - First element (UInt32) -2. `{TupleElement1, TupleElement0, Regular}` - Nested tuple's first element (String) -3. `{TupleElement1, TupleElement1, Regular}` - Nested tuple's second element (UInt8) - -**Named Tuples:** - -Named tuples (e.g., `Tuple(x UInt32, y String)`) serialize identically to unnamed tuples. The element names are only reflected in the type string and do not affect the binary wire format. The serialization order matches the definition order. - -### 6.6 Nested - -`Nested` is a special column type that represents a structured collection of named fields, where each field is represented as an `Array`. Unlike `Tuple`, which stores data row-by-row, `Nested` stores data columnar-fashion with synchronized array lengths. - -**Conceptual Model:** -- `Nested(a T1, b T2, ...)` is equivalent to multiple `Array(T1)`, `Array(T2)`, ... columns -- All arrays must have the same length for each row -- Fields are accessed using dot notation: `column.field` - -**Wire Format:** -In the Native format, `Nested` columns are **not serialized as a single unit**. Instead, they are represented as separate `Array` columns with a special naming convention: - -1. **Column Naming**: Each nested field becomes a separate column named `parent.field` -2. **Type**: Each column has type `Array(FieldType)` -3. **Serialization**: Each array column is serialized independently using `Array` serialization (see section 6.3) - -**Example: Nested structure** - -```bash -# Note: Using Array(Tuple(...)) as Nested is mostly used internally in ClickHouse -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT [(1, 'a'), (2, 'b')]::Array(Tuple(id UInt32, name String)) AS n" | xxd -``` - -``` -00000000: 0101 016e 2441 7272 6179 2854 7570 6c65 ...n$Array(Tuple -00000010: 2869 6420 5549 6e74 3332 2c20 6e61 6d65 (id UInt32, name -00000020: 2053 7472 696e 6729 2902 0000 0000 0000 String))....... -00000030: 0001 0000 0002 0000 0001 6101 62 ..........a.b -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column 0: - 0x01, 0x6e, // Column name = "n" (length=1) - 0x24, // Type name length = 36 - 0x41, 0x72, 0x72, 0x61, 0x79, 0x28, // "Array(Tuple(id UInt32, name String))" - 0x54, 0x75, 0x70, 0x6c, 0x65, 0x28, - 0x69, 0x64, 0x20, 0x55, 0x49, 0x6e, - 0x74, 0x33, 0x32, 0x2c, 0x20, 0x6e, - 0x61, 0x6d, 0x65, 0x20, 0x53, 0x74, - 0x72, 0x69, 0x6e, 0x67, 0x29, 0x29, - - // Array stream: ArraySizes - 0x02, 0x00, 0x00, 0x00, // Array size = 2 (UInt64 little-endian) - 0x00, 0x00, 0x00, 0x00, - - // Array stream: ArrayElements (Tuple serialization - sequential) - // Tuple element 0 (id UInt32): - 0x01, 0x00, 0x00, 0x00, // Value 1 - 0x02, 0x00, 0x00, 0x00, // Value 2 - - // Tuple element 1 (name String): - 0x01, 0x61, // "a" (length=1, data) - 0x01, 0x62, // "b" (length=1, data) -]); -``` - -**Implementation Note:** -- True `Nested` columns (when created with explicit `Nested` type) would be split into separate columns: `n.id Array(UInt32)` and `n.name Array(String)` -- Each would appear as a separate column in the block with the dotted name -- The synchronization of array lengths is enforced at insertion time, not in the wire format - -### 6.7 Variant - -The `Variant` type represents a value that can be one of several different types, similar to a tagged union. It stores a discriminator indicating which type the value has, along with the actual value in a sparse columnar format. - -**Type Definition:** -```sql -Variant(T1, T2, T3, ...) -``` - -**Stream Architecture:** - -`Variant` uses multiple logical streams: - -1. **VariantDiscriminators** (prefix stream for version byte + data stream for discriminators) - - Serialization mode byte (UInt64): `0` = BASIC, `1` = COMPACT - - Discriminator values identifying which variant type each row contains - -2. **VariantElements** (wrapper stream path component) - - **VariantElement** (one substream per variant type) - - Each variant's data stored sparsely (only values for rows with that discriminator) - -**Discriminator Values:** -- `0xFF` (255): NULL discriminator - the row contains NULL -- `0` to `N-1`: Global discriminator for variant types 0 through N-1 -- Discriminators are always serialized in **global order** (the order types appear in the type definition) - -**Serialization Modes:** - -The discriminators stream has two modes, chosen based on the `use_compact_variant_discriminators_serialization` setting: - -**BASIC Mode (mode = 0):** -- All discriminators written sequentially, one per row -- Each discriminator is a single `UInt8` byte -- Simple but may be inefficient when granules have uniform types - -**COMPACT Mode (mode = 1):** -- Optimizes storage when granules have uniform discriminators -- Format per granule: - ``` - - - - ``` -- Format byte values: - - `0` (PLAIN): Mixed discriminators, all written as in BASIC mode - - `1` (COMPACT): Single discriminator for all rows, written once - -**Example: Variant with String and UInt64** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_variant_type=1" \ - --data-binary "SELECT 'hello'::Variant(String, UInt64) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 1756 6172 6961 6e74 2853 ...col.Variant(S -00000010: 7472 696e 672c 2055 496e 7436 3429 0000 tring, UInt64).. -00000020: 0000 0000 0000 0005 6865 6c6c 6f ........hello -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column 0: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x17, // Type name length = 23 - 0x56, 0x61, 0x72, 0x69, 0x61, 0x6e, // "Variant(String, UInt64)" - 0x74, 0x28, 0x53, 0x74, 0x72, 0x69, - 0x6e, 0x67, 0x2c, 0x20, 0x55, 0x49, - 0x6e, 0x74, 0x36, 0x34, 0x29, - - // VariantDiscriminatorsPrefix stream: - 0x00, 0x00, 0x00, 0x00, // Serialization mode = 0 (BASIC) - 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) - - // VariantDiscriminators stream (BASIC mode): - 0x00, // Discriminator = 0 (String type) - - // VariantElements/VariantElement[0] (String): - 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f, // String: length=5, "hello" - - // VariantElements/VariantElement[1] (UInt64): - // (empty - no values with discriminator 1) -]); -``` - -**Example: Variant with multiple rows and NULL** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_variant_type=1" \ - --data-binary "SELECT arrayJoin([42::Variant(String, UInt64), 'hello'::Variant(String, UInt64)]) AS col" | xxd -``` - -``` -00000000: 0102 0363 6f6c 1756 6172 6961 6e74 2853 ...col.Variant(S -00000010: 7472 696e 672c 2055 496e 7436 3429 0000 tring, UInt64).. -00000020: 0000 0000 0000 0100 0568 656c 6c6f 2a00 .........hello*. -00000030: 0000 0000 0000 ...... -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x02, // NumRows = 2 - - // Column metadata... - 0x03, 0x63, 0x6f, 0x6c, // "col" - 0x17, /* "Variant(String, UInt64)" */, - - // VariantDiscriminatorsPrefix: - 0x00, 0x00, 0x00, 0x00, // Mode = 0 (BASIC) - 0x00, 0x00, 0x00, 0x00, - - // VariantDiscriminators (BASIC mode, 2 rows): - 0x01, // Row 0: discriminator = 1 (UInt64) - 0x00, // Row 1: discriminator = 0 (String) - - // VariantElements/VariantElement[0] (String): - 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f, // 1 value: "hello" - - // VariantElements/VariantElement[1] (UInt64): - 0x2a, 0x00, 0x00, 0x00, // 1 value: 42 - 0x00, 0x00, 0x00, 0x00, -]); -``` - -Note: The data for each variant is stored **sparsely** - only the actual values are written, in the order they appear. The discriminators determine which variant column each row's value came from. - -**Example: Variant with NULL** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_variant_type=1" \ - --data-binary "SELECT [NULL::Variant(String, UInt64), 'a'::Variant(String, UInt64), 1::Variant(String, UInt64)] AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 1e41 7272 6179 2856 6172 ...col.Array(Var -00000010: 6961 6e74 2853 7472 696e 672c 2055 496e iant(String, UIn -00000020: 7436 3429 2900 0000 0000 0000 0003 0000 t64))........... -00000030: 0000 0000 00ff 0001 0561 0100 0000 0000 .........a...... -00000040: 0000 .. -``` - -Key points in this example: -- Discriminator `0xff` indicates NULL -- String variant (discriminator `0x00`) has 1 value: "a" -- UInt64 variant (discriminator `0x01`) has 1 value: 1 -- Readers reconstruct the row-to-value mapping by counting discriminators per variant type - -**Sparse Variant Storage:** - -Each variant type's data column contains only the values for rows with that discriminator: -- If a `Variant(String, UInt64)` column has 100 rows where 30 are String, 60 are UInt64, and 10 are NULL -- The String variant column has 30 values -- The UInt64 variant column has 60 values -- The discriminators column has 100 discriminator bytes - -Internally, ClickHouse maintains an offsets column to map each row to its position in the appropriate variant column. This offsets column is part of the in-memory representation only and is **not serialized** in the Native wire format. Readers must reconstruct this mapping by iterating through discriminators and counting values per variant type. - -### 6.8 Dynamic - -The `Dynamic` type can store values of different types, dynamically determining the type at insertion time. It's implemented internally using a `Variant` type with an additional "shared variant" for overflow types. - -**Type Definition:** -```sql -Dynamic -Dynamic(max_types=N) -``` - -Where `max_types` (default 32) determines the maximum number of distinct types that can be stored as separate variants. Additional types are stored in a "shared variant" in binary-serialized form. - -**Stream Architecture:** - -`Dynamic` uses two main streams: - -1. **DynamicStructure** stream: - - Serialization version (UInt64) - - Type list (structure metadata) - - Optional statistics - -2. **DynamicData** stream: - - Contains the internal `Variant` column data (in standard mode) - - OR contains indexes + flattened type columns (in FLATTENED mode) - -**Serialization Versions:** - -The `DynamicStructure` stream begins with a version byte (UInt64) determining the format: - -**V1 (version = 1):** -- `` (VarUInt) - historical, now equals num_dynamic_types -- `` (VarUInt) - actual number of variant types -- `` (String for each type, excluding SharedVariant) -- Optional statistics (in MergeTree serialization) - -**V2 (version = 2):** -- Same as V1 but without the `max_dynamic_types` parameter -- Type names as strings - -**V3 (version = 4):** -- Like V2 but type names in binary format (via `encodeDataType`) -- Statistics prefixed with a bool flag indicating presence - -> **Note:** Version numbers are not sequential. V1=1, V2=2, V3=4. There is no version 3. - -**FLATTENED (version = 3):** -- Used only in Native format for client compatibility -- Structure: - ``` - ` (VarUInt) - ` (each type in binary format or as string) - ``` -- Data format: - ``` - ` (UInt8/16/32/64 column, depending on num_types + 1 for NULL) - - - ... - ``` - -**Example: Dynamic with single value** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_dynamic_type=1" \ - --data-binary "SELECT 42::Dynamic AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0744 796e 616d 6963 0100 ...col.Dynamic.. -00000010: 0000 0000 0000 0101 0653 7472 696e 6700 .........String. -00000020: 0000 0000 0000 0001 0234 32 .........42 -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column 0: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x07, 0x44, 0x79, 0x6e, 0x61, // Column type = "Dynamic" - 0x6d, 0x69, 0x63, - - // DynamicStructure stream: - 0x01, 0x00, 0x00, 0x00, // Version = 1 (V1) - 0x00, 0x00, 0x00, 0x00, - - 0x01, // num_dynamic_types = 1 (VarUInt) - 0x01, // Number of types in structure = 1 - 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (length=6) - 0x6e, 0x67, - - // DynamicData stream (internal Variant column): - // VariantDiscriminatorsPrefix: - 0x00, 0x00, 0x00, 0x00, // Mode = 0 (BASIC) - 0x00, 0x00, 0x00, 0x00, - - // VariantDiscriminators: - 0x01, // Discriminator = 1 (SharedVariant) - - // VariantElements/VariantElement[0] (String): - // (empty - no values with discriminator 0) - - // VariantElements/VariantElement[1] (SharedVariant): - 0x01, // Number of values = 1 - 0x02, 0x34, 0x32, // Binary: type + value - // Actually: encodeDataType(String) + "42" -]); -``` - -Wait, let me recheck this with the actual output. The value 42 was inferred as String type "42", and it's stored in the shared variant. - -**Example: Dynamic with NULL** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_dynamic_type=1" \ - --data-binary "SELECT NULL::Dynamic AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0744 796e 616d 6963 0100 ...col.Dynamic.. -00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................ -00000020: ff . -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column metadata... - 0x03, 0x63, 0x6f, 0x6c, // "col" - 0x07, 0x44, 0x79, 0x6e, 0x61, // "Dynamic" - 0x6d, 0x69, 0x63, - - // DynamicStructure stream: - 0x01, 0x00, 0x00, 0x00, // Version = 1 - 0x00, 0x00, 0x00, 0x00, - - 0x00, // num_dynamic_types = 0 (no types) - 0x00, // Number of actual types = 0 - - // DynamicData stream (Variant): - 0x00, 0x00, 0x00, 0x00, // Mode = 0 (BASIC) - 0x00, 0x00, 0x00, 0x00, - - 0xff, // Discriminator = 0xff (NULL) -]); -``` - -**Example: Dynamic with FLATTENED serialization** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&output_format_native_use_flattened_dynamic_and_json_serialization=1&allow_experimental_dynamic_type=1" \ - --data-binary "SELECT 42::Dynamic AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0744 796e 616d 6963 0300 ...col.Dynamic.. -00000010: 0000 0000 0000 0106 5374 7269 6e67 0002 ........String.. -00000020: 3432 42 -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column metadata... - 0x03, 0x63, 0x6f, 0x6c, // "col" - 0x07, 0x44, 0x79, 0x6e, 0x61, // "Dynamic" - 0x6d, 0x69, 0x63, - - // DynamicStructure stream: - 0x03, 0x00, 0x00, 0x00, // Version = 3 (FLATTENED) - 0x00, 0x00, 0x00, 0x00, - - 0x01, // Number of types = 1 (VarUInt) - 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (length=6) - 0x6e, 0x67, - - // DynamicData stream (FLATTENED format): - // Type indexes column (UInt8, since num_types + 1 = 2 <= 256): - 0x00, // Row 0: type index = 0 (String) - - // Data for type 0 (String): - 0x02, 0x34, 0x32, // String: length=2, "42" -]); -``` - -**FLATTENED Mode Details:** - -In FLATTENED mode, the data is densely packed: -1. **Type Indexes**: A column of UInt8/16/32/64 (chosen based on number of types) where: - - `0xFF...FF` (all bits set) = NULL - - `0` to `N-1` = index into the types list - -2. **Type Data Columns**: One column per type, containing only the values of that type - - Values appear in row order, but only for rows of that type - - Reader must iterate through indexes to know which type column to read next value from - -**Standard Mode vs FLATTENED Mode:** - -- **Standard Mode** (V1/V2/V3): Uses internal `Variant` column with sparse storage, more efficient for storage -- **FLATTENED Mode**: Easier to parse for clients, all data is densely packed, index array tells which type each row has - -**Shared Variant:** - -When `Dynamic` has more distinct types than `max_types`: -- The first `max_types` types get dedicated variant columns -- Additional types are stored in binary form in a "SharedVariant" type (always the last variant) -- Each value in SharedVariant is: `encodeDataType(T) + serializeBinary(value, T)` -- On deserialization, the type is decoded from the binary prefix and the value is deserialized -### 6.9 JSON - -The `JSON` type stores semi-structured JSON data with automatic schema inference. It uses an object-oriented storage model where JSON paths are stored as separate columns, similar to how `Nested` columns work. - -**Type Definition:** -```sql -JSON -JSON(max_dynamic_paths=N) -JSON(a Int64, b String) -- with typed paths -``` - -**Storage Model:** - -`JSON` columns store data using three categories of paths: - -1. **Typed Paths**: Paths with known, fixed types (declared in type definition) -2. **Dynamic Paths**: Frequently occurring paths discovered at runtime (up to `max_dynamic_paths`) -3. **Shared Data**: Remaining paths stored in a shared `Dynamic` column - -**Stream Architecture:** - -Similar to `Dynamic`, `JSON` uses multiple streams: - -1. **ObjectStructure** stream: - - Serialization version (UInt64) - - Structure metadata (list of dynamic paths, statistics, etc.) - -2. **ObjectData** stream: - - **ObjectTypedPath** substreams: One per typed path - - **ObjectDynamicPath** substreams: One per dynamic path - - **ObjectSharedData** substream: For overflow paths - -**Serialization Versions:** - -**V1 (version = 0):** -- `` (VarUInt) -- `` (VarUInt) -- `` (String for each) -- Optional statistics - -**V2 (version = 2):** -- Same as V1 without `max_dynamic_paths` - -**V3 (version = 4):** -- Like V2 with binary type encoding and optional statistics flag - -> **Note:** JSON Object version numbers are: V1=0, STRING=1, V2=2, FLATTENED=3, V3=4. - -> **Important:** `Dynamic` columns (used for dynamic paths) have **different** version numbers: V1=1, V2=2, FLATTENED=3, V3=4. Don't confuse JSON V1 (=0) with Dynamic V1 (=1). - -**STRING (version = 1):** -- Special mode for Native format only -- JSON stored as plain String (JSON text) -- Controlled by `output_format_native_write_json_as_string=1` - -**FLATTENED (version = 3):** -- Native format only, for easier client parsing -- All paths (typed + dynamic + shared) serialized as separate columns - -**Example: JSON with STRING serialization** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1&output_format_native_write_json_as_string=1" \ - --data-binary "SELECT '{\"a\": 1}'::JSON AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 044a 534f 4e01 0000 0000 ...col.JSON..... -00000010: 0000 0007 7b22 6122 3a31 7d ....{"a":1} -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column 0: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x04, 0x4a, 0x53, 0x4f, 0x4e, // Column type = "JSON" - - // ObjectStructure stream: - 0x01, 0x00, 0x00, 0x00, // Version = 1 (STRING) - 0x00, 0x00, 0x00, 0x00, - - // ObjectData stream (String column): - 0x07, // String length = 7 - 0x7b, 0x22, 0x61, 0x22, 0x3a, // '{"a":1}' - 0x31, 0x7d, -]); -``` - -**Example: JSON with standard serialization** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ - --data-binary "SELECT '{\"a\": 1, \"b\": \"hello\"}'::JSON AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 044a 534f 4e00 0000 0000 ...col.JSON..... -00000010: 0000 0002 0201 6101 6201 0000 0000 0000 ......a.b....... -00000020: 0001 0105 496e 7436 3400 0000 0000 0000 ....Int64....... -00000030: 0001 0000 0000 0000 0001 0106 5374 7269 ............Stri -00000040: 6e67 0000 0000 0000 0000 0001 0000 0000 ng.............. -00000050: 0000 0001 0568 656c 6c6f 0000 0000 0000 .....hello...... -00000060: 0000 .. -``` - -Breakdown (simplified structure): -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column metadata... - 0x03, 0x63, 0x6f, 0x6c, // "col" - 0x04, 0x4a, 0x53, 0x4f, 0x4e, // "JSON" - - // ObjectStructure stream: - 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1) - 0x00, 0x00, 0x00, 0x00, - - 0x02, // num_dynamic_paths = 2 (VarUInt) - 0x02, // Actual count = 2 - - // Sorted dynamic paths: - 0x01, 0x61, // Path 0: "a" (length=1) - 0x01, 0x62, // Path 1: "b" (length=1) - - // Shared data serialization version: - 0x01, 0x00, 0x00, 0x00, // Version info - 0x00, 0x00, 0x00, 0x00, - - // ObjectData stream: - // Path "a" stored as Dynamic with inferred type Int64: - 0x01, // num_types = 1 - 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // "Int64" - // ... Variant discriminators ... - 0x00, 0x00, 0x00, 0x00, // Mode - 0x00, 0x00, 0x00, 0x00, - 0x01, // Discriminator - // ... Value ... - 0x01, 0x00, 0x00, 0x00, // 1 (as Int64) - 0x00, 0x00, 0x00, 0x00, - - // Path "b" stored as Dynamic with type String: - 0x01, // num_types = 1 - 0x06, 0x53, 0x74, 0x72, 0x69, // "String" - 0x6e, 0x67, - // ... Variant discriminators ... - 0x00, 0x00, 0x00, 0x00, // Mode - 0x00, 0x00, 0x00, 0x00, - 0x01, // Discriminator - // ... Value ... - 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f, // "hello" -]); -``` - -Note: Each dynamic path is stored as a `Dynamic` column, so it includes the full `Dynamic` serialization (structure + variant data). - -**Example: JSON with typed and dynamic paths** - -This example demonstrates a `JSON` column with both a **typed path** (declared type) and a **dynamic path** (inferred type). Typed paths are serialized directly using their native format, while dynamic paths are wrapped in `Dynamic` serialization. - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ - --data-binary "SELECT '{\"a\": 42, \"b\": \"hi\"}'::JSON(a UInt32) AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0e4a 534f 4e28 6120 5549 ...col.JSON(a UI -00000010: 6e74 3332 2900 0000 0000 0000 0001 0101 nt32)........... -00000020: 6201 0000 0000 0000 0001 0106 5374 7269 b...........Stri -00000030: 6e67 0000 0000 0000 0000 2a00 0000 0102 ng........*..... -00000040: 6869 0000 0000 0000 0000 hi........ -``` - -Full breakdown: -```typescript -const block = new Uint8Array([ - // ═══════════════════════════════════════════════════════════════════ - // BLOCK HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x01, // NumColumns = 1 (VarUInt) - 0x01, // NumRows = 1 (VarUInt) - - // ═══════════════════════════════════════════════════════════════════ - // COLUMN HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (len=3) - 0x0e, // Type name length = 14 - 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x61, // "JSON(a UInt32)" - 0x20, 0x55, 0x49, 0x6e, 0x74, 0x33, // - typed path "a" with type UInt32 - 0x32, 0x29, // - typed paths are part of the type name - - // ═══════════════════════════════════════════════════════════════════ - // OBJECT STRUCTURE STREAM (prefix for JSON column) - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1 serialization) - 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) - - // V1 structure metadata: - 0x01, // max_dynamic_paths = 1 (VarUInt) - 0x01, // num_dynamic_paths = 1 (VarUInt) - - // Sorted list of dynamic path names: - 0x01, 0x62, // Path 0: "b" (len=1, then "b") - // - typed path "a" is NOT listed here - // - only runtime-discovered paths appear - - // NOTE: In Native format, statistics mode defaults to NONE, so no - // statistics are written here. The next bytes are the Dynamic prefix. - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC STRUCTURE STREAM (prefix for dynamic path "b") - // ═══════════════════════════════════════════════════════════════════ - 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) - 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) - // NOTE: Dynamic V1=1, V2=2, FLATTENED=3, V3=4 - - // V1 Dynamic structure: - 0x01, // max_dynamic_types = 1 (VarUInt) - 0x01, // num_dynamic_types = 1 (VarUInt) - - // Type names (not SharedVariant, which is implicit): - 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (len=6) - 0x6e, 0x67, - - // NOTE: Dynamic statistics also use NONE mode in Native format, - // so no statistics written here either. - - // ═══════════════════════════════════════════════════════════════════ - // VARIANT STRUCTURE STREAM (prefix for Dynamic's Variant) - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) - 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) - - // ═══════════════════════════════════════════════════════════════════ - // OBJECT DATA STREAM - // ═══════════════════════════════════════════════════════════════════ - - // --- TYPED PATH "a" (UInt32) --- - // Serialized directly using UInt32's native format (no Dynamic wrapper!) - 0x2a, 0x00, 0x00, 0x00, // value = 42 (UInt32 little-endian) - - // --- DYNAMIC PATH "b" (as Dynamic -> Variant) --- - // Variant discriminator column (1 row): - 0x01, // Row 0: discriminator = 1 (String variant) - // Variants sorted alphabetically: SharedVariant=0, String=1 - // (255 = NULL) - - // String variant data: - 0x02, 0x68, 0x69, // value = "hi" (len=2, then "hi") - - // ═══════════════════════════════════════════════════════════════════ - // SHARED DATA STREAM (overflow paths, empty in this case) - // ═══════════════════════════════════════════════════════════════════ - // Shared data is Array(Tuple(path: String, value: String)) - // For 1 row with 0 shared paths: - 0x00, 0x00, 0x00, 0x00, // Array offsets: [0] (UInt64) - 0x00, 0x00, 0x00, 0x00, // - row 0 has 0 elements -]); -``` - -**Key observations:** - -1. **Typed path "a"** (UInt32): - - Declared in type definition: `JSON(a UInt32)` - - Type name includes the typed path specification - - NOT listed in the dynamic paths list - - Serialized directly as UInt32 (4 bytes, little-endian): `2a 00 00 00` = 42 - - No `Dynamic` wrapper overhead - -2. **Dynamic path "b"** (inferred as String): - - Discovered at runtime from the JSON data - - Listed in the dynamic paths list in ObjectStructure - - Serialized as a full `Dynamic` column: - - DynamicStructure: version, types list, statistics - - VariantStructure: mode - - VariantData: discriminators + type-specific data - -3. **Serialization order in ObjectData**: - - Typed paths first (sorted alphabetically by path name) - - Dynamic paths second (sorted alphabetically by path name) - - Shared data last - -4. **Space efficiency**: Typed paths save significant space by avoiding the `Dynamic` overhead. For path "a", we use 4 bytes (UInt32) instead of ~20+ bytes (Dynamic structure + Variant + value). - -**Example: JSON with typed and dynamic paths (multiple rows)** - -This example shows the same structure with 2 rows, demonstrating how column data is laid out contiguously for each path. - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ - --data-binary "SELECT ('{\"a\": ' || toString(number * 10) || ', \"b\": \"row' || toString(number) || '\"}')::JSON(a UInt32) AS col FROM system.numbers LIMIT 2" | xxd -``` - -This produces rows: `{"a": 0, "b": "row0"}` and `{"a": 10, "b": "row1"}`. - -``` -00000000: 0102 0363 6f6c 0e4a 534f 4e28 6120 5549 ...col.JSON(a UI -00000010: 6e74 3332 2900 0000 0000 0000 0001 0101 nt32)........... -00000020: 6201 0000 0000 0000 0001 0106 5374 7269 b...........Stri -00000030: 6e67 0000 0000 0000 0000 0000 0000 0a00 ng.............. -00000040: 0000 0101 0472 6f77 3004 726f 7731 0000 .....row0.row1.. -00000050: 0000 0000 0000 0000 0000 0000 0000 .............. -``` - -Full breakdown: -```typescript -const block = new Uint8Array([ - // ═══════════════════════════════════════════════════════════════════ - // BLOCK HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x01, // NumColumns = 1 (VarUInt) - 0x02, // NumRows = 2 (VarUInt) <-- 2 rows! - - // ═══════════════════════════════════════════════════════════════════ - // COLUMN HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (len=3) - 0x0e, // Type name length = 14 - 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x61, // "JSON(a UInt32)" - 0x20, 0x55, 0x49, 0x6e, 0x74, 0x33, - 0x32, 0x29, - - // ═══════════════════════════════════════════════════════════════════ - // OBJECT STRUCTURE STREAM - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1 serialization) - 0x00, 0x00, 0x00, 0x00, - - 0x01, // max_dynamic_paths = 1 (VarUInt) - 0x01, // num_dynamic_paths = 1 (VarUInt) - 0x01, 0x62, // Path 0: "b" (len=1) - - // NOTE: No statistics in Native format (NONE mode is default) - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC STRUCTURE STREAM (for path "b") - // ═══════════════════════════════════════════════════════════════════ - 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) - 0x00, 0x00, 0x00, 0x00, // (Dynamic V1=1, V2=2, V3=4) - - 0x01, // max_dynamic_types = 1 (VarUInt) - 0x01, // num_dynamic_types = 1 (VarUInt) - 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (len=6) - 0x6e, 0x67, - - // NOTE: No Dynamic statistics in Native format (NONE mode) - - // ═══════════════════════════════════════════════════════════════════ - // VARIANT STRUCTURE STREAM - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) - 0x00, 0x00, 0x00, 0x00, - - // ═══════════════════════════════════════════════════════════════════ - // OBJECT DATA STREAM - // ═══════════════════════════════════════════════════════════════════ - - // --- TYPED PATH "a" (UInt32) - ALL ROWS CONTIGUOUS --- - 0x00, 0x00, 0x00, 0x00, // Row 0: a = 0 - 0x0a, 0x00, 0x00, 0x00, // Row 1: a = 10 - - // --- DYNAMIC PATH "b" - Variant discriminators (ALL ROWS) --- - // Variants sorted alphabetically: SharedVariant=0, String=1 (255=NULL) - 0x01, // Row 0: discriminator = 1 (String) - 0x01, // Row 1: discriminator = 1 (String) - - // --- DYNAMIC PATH "b" - String variant data (ALL ROWS) --- - 0x04, 0x72, 0x6f, 0x77, 0x30, // Row 0: "row0" (len=4) - 0x04, 0x72, 0x6f, 0x77, 0x31, // Row 1: "row1" (len=4) - - // ═══════════════════════════════════════════════════════════════════ - // SHARED DATA STREAM (Array offsets for each row) - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Row 0 offset: 0 elements - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, // Row 1 offset: 0 elements - 0x00, 0x00, 0x00, 0x00, -]); -``` - -**Key observations for multi-row data:** - -1. **Columnar layout**: All values for each path are stored contiguously: - - Typed path "a": `[0, 10]` as two consecutive UInt32 values - - Dynamic path "b" discriminators: `[1, 1]` (both rows are String) - - Dynamic path "b" String data: `["row0", "row1"]` stored sequentially - -2. **No per-row overhead**: The structure metadata (version, paths, types) is written once in the prefix, not repeated for each row. Only the actual data values scale with row count. - -3. **Variant discriminators**: For the Dynamic path, discriminators for ALL rows come first, then the actual variant data for all rows. This enables efficient columnar processing. - -4. **Shared data offsets**: One UInt64 offset per row indicating how many shared path entries that row has. With 0 shared paths, all offsets are 0. - -**Example: JSON with exceeded max_dynamic_paths (shared data)** - -This example shows what happens when the number of paths exceeds `max_dynamic_paths`. Overflow paths are stored in the **shared data** section. - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ - --data-binary "SELECT '{\"a\": 1, \"b\": 2, \"c\": 3}'::JSON(max_dynamic_paths=2) AS col" | xxd -``` - -With `max_dynamic_paths=2` and 3 paths in the JSON: -- Paths "a" and "b" become **dynamic paths** (stored as `Dynamic` columns) -- Path "c" overflows to **shared data** - -``` -00000000: 0101 0363 6f6c 194a 534f 4e28 6d61 785f ...col.JSON(max_ -00000010: 6479 6e61 6d69 635f 7061 7468 733d 3229 dynamic_paths=2) -00000020: 0000 0000 0000 0000 0202 0161 0162 0100 ...........a.b.. -00000030: 0000 0000 0000 0101 0549 6e74 3634 0000 .........Int64.. -00000040: 0000 0000 0000 0100 0000 0000 0000 0101 ................ -00000050: 0549 6e74 3634 0000 0000 0000 0000 0001 .Int64.......... -00000060: 0000 0000 0000 0000 0200 0000 0000 0000 ................ -00000070: 0100 0000 0000 0000 0163 090a 0300 0000 .........c...... -00000080: 0000 0000 .... -``` - -Full breakdown: -```typescript -const block = new Uint8Array([ - // ═══════════════════════════════════════════════════════════════════ - // BLOCK HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // ═══════════════════════════════════════════════════════════════════ - // COLUMN HEADER - // ═══════════════════════════════════════════════════════════════════ - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x19, // Type name length = 25 - // "JSON(max_dynamic_paths=2)" - 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x6d, 0x61, 0x78, 0x5f, - 0x64, 0x79, 0x6e, 0x61, 0x6d, 0x69, 0x63, 0x5f, - 0x70, 0x61, 0x74, 0x68, 0x73, 0x3d, 0x32, 0x29, - - // ═══════════════════════════════════════════════════════════════════ - // OBJECT STRUCTURE STREAM - // ═══════════════════════════════════════════════════════════════════ - 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1) - 0x00, 0x00, 0x00, 0x00, - - 0x02, // max_dynamic_paths = 2 - 0x02, // num_dynamic_paths = 2 - 0x01, 0x61, // Path 0: "a" - 0x01, 0x62, // Path 1: "b" - // Note: "c" is NOT here - it's in shared data - - // NOTE: No statistics in Native format (NONE mode is default) - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC PATH "a" STRUCTURE - // ═══════════════════════════════════════════════════════════════════ - 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) - 0x00, 0x00, 0x00, 0x00, // (Dynamic V1=1, V2=2, V3=4) - 0x01, // max_dynamic_types = 1 - 0x01, // num_dynamic_types = 1 - 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // Type: "Int64" - // NOTE: No Dynamic stats (NONE mode) - 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) - 0x00, 0x00, 0x00, 0x00, - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC PATH "b" STRUCTURE (similar to "a") - // ═══════════════════════════════════════════════════════════════════ - 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) - 0x00, 0x00, 0x00, 0x00, - 0x01, // max_dynamic_types = 1 - 0x01, // num_dynamic_types = 1 - 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // Type: "Int64" - 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) - 0x00, 0x00, 0x00, 0x00, - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC PATH "a" DATA (Int64 value = 1) - // ═══════════════════════════════════════════════════════════════════ - // Variants sorted alphabetically: Int64=0, SharedVariant=1 (255=NULL) - 0x00, // Discriminator = 0 (Int64 variant) - 0x01, 0x00, 0x00, 0x00, // value = 1 (Int64 little-endian) - 0x00, 0x00, 0x00, 0x00, - - // ═══════════════════════════════════════════════════════════════════ - // DYNAMIC PATH "b" DATA (Int64 value = 2) - // ═══════════════════════════════════════════════════════════════════ - 0x00, // Discriminator = 0 (Int64 variant) - 0x02, 0x00, 0x00, 0x00, // value = 2 (Int64 little-endian) - 0x00, 0x00, 0x00, 0x00, - - // ═══════════════════════════════════════════════════════════════════ - // SHARED DATA STREAM - // This is where path "c" lives (overflow from max_dynamic_paths) - // Format: Array(Tuple(path: String, value: String)) - // ═══════════════════════════════════════════════════════════════════ - - // Array offsets (one UInt64 per row): - 0x01, 0x00, 0x00, 0x00, // Row 0: offset = 1 (has 1 element) - 0x00, 0x00, 0x00, 0x00, - - // Shared data element 0 (path "c"): - 0x01, 0x63, // Path name: "c" (len=1) - - // Binary-encoded Dynamic value for "c": - 0x09, // Value string length = 9 bytes - 0x0a, // BinaryTypeIndex = 0x0a (Int64) - 0x03, 0x00, 0x00, 0x00, // value = 3 (Int64 little-endian) - 0x00, 0x00, 0x00, 0x00, -]); -``` - -**Key observations for shared data:** - -1. **Overflow mechanism**: When paths exceed `max_dynamic_paths`, extra paths go to shared data. Here "a" and "b" are dynamic paths, "c" overflows. - -2. **Shared data format**: `Array(Tuple(path: String, value: String))` - - Array offsets indicate how many shared paths each row has - - Each element is a (path_name, binary_encoded_value) tuple - -3. **Binary-encoded values in shared data**: The value is stored as a length-prefixed string containing: - - `BinaryTypeIndex` (1 byte): Type identifier (0x0a = Int64) - - Native value: The value in its type's binary format - -4. **BinaryTypeIndex values** (common types): - | Type | Index | - |------|-------| - | Nothing | 0x00 | - | UInt8 | 0x01 | - | UInt32 | 0x03 | - | UInt64 | 0x04 | - | Int64 | 0x0a | - | String | 0x15 | - | Array | 0x1e | - | JSON | 0x30 | - -5. **Space tradeoff**: Shared data is less efficient than dynamic paths because each value includes its path name and type encoding. Use `max_dynamic_paths` wisely for your data. - -**Example: Variant discriminator ordering (multiple types)** - -Variant types are sorted **alphabetically by type name** to determine discriminator values. SharedVariant is included in this sorting. Here's an example showing how discriminators are assigned: - -```sql --- Insert multiple rows with different JSON value types -INSERT INTO test_json VALUES -('{"x": true}'), -- Bool -('{"x": 3.14}'), -- Float64 -('{"x": "hello"}'), -- String -('{"x": [1,2,3]}'), -- Array(Nullable(Int64)) -('{"x": null}') -- NULL -``` - -The Dynamic column for path "x" will have variant types sorted alphabetically: - -| Index | Type | Notes | -|-------|------|-------| -| 0 | Array(Nullable(Int64)) | "A" < "B" < ... | -| 1 | Bool | | -| 2 | Float64 | | -| 3 | SharedVariant | Implicit, always present | -| 4 | String | "Sh..." < "St..." | -| 255 | (NULL) | Special NULL_DISCRIMINATOR | - -The discriminator bytes in the serialized data will be: -``` -0x01, // Bool (index 1) -0x02, // Float64 (index 2) -0x04, // String (index 4, after SharedVariant=3) -0x00, // Array (index 0) -0xff, // NULL (255) -``` - -**Key rule**: For any type T in a Dynamic column, its discriminator = index in the alphabetically sorted list of [all_types + SharedVariant]. - -**Path Naming:** - -JSON paths use dot notation: -- `a` - top-level key "a" -- `a.b` - nested key "b" inside "a" -- `a.b.c` - deeply nested - -ClickHouse can escape dots in JSON keys when `json_type_escape_dots_in_keys=1` is set. - -**Typed Paths:** - -When a `JSON` type has typed paths declared: -```sql -JSON(user_id UInt64, name String) -``` - -These paths: -1. Always have the specified type (no inference needed) -2. Are serialized directly in their native type format (not as `Dynamic`) -3. Appear in `ObjectData/ObjectTypedPath` substreams - -**Dynamic Paths:** - -Dynamic paths are: -1. Discovered at insertion time -2. Limited by `max_dynamic_paths` (default 1024) -3. Stored as `Dynamic` columns with inferred types -4. Appear in `ObjectData/ObjectDynamicPath` substreams - -**Shared Data:** - -When more than `max_dynamic_paths` unique paths exist: -- Overflow paths are stored in a shared `Dynamic` column -- Each value in shared data includes its full path + value -- Uses binary serialization similar to `Dynamic`'s shared variant - -**Format Setting: `output_format_native_write_json_as_string`:** - -When set to `1`: -- Uses STRING serialization (version = 1) -- Entire JSON object serialized as text String -- Easier for clients that don't support complex `JSON` deserialization -- Default: `0` (use structured serialization) - -**Format Setting: `output_format_native_use_flattened_dynamic_and_json_serialization`:** - -When set to `1`: -- Uses FLATTENED serialization (version = 3) -- All paths serialized as individual columns -- No `Dynamic` wrapper, just direct column data -- Easier for some clients to parse -- Default: `0` - -**Reading JSON Columns:** - -To deserialize a `JSON` column in standard mode: - -1. Read `ObjectStructure` to get version and path list -2. For each typed path: deserialize using its known type -3. For each dynamic path: deserialize as `Dynamic` column -4. If shared data exists: deserialize the shared `Dynamic` column -5. Reconstruct JSON objects by merging values from all paths - -In STRING mode: just read the String value containing JSON text. - -In FLATTENED mode: read each path as a separate column, with a NULL indicator for missing paths. - ---- - -## 7. Geo Types - -Geographic data types in ClickHouse are type aliases that compose existing types (`Tuple`, `Array`) to represent geometric shapes. While their underlying representation uses standard composite types, these geo types have distinct names in the Native format's type string. - -**Important:** The type NAME in the Native format wire protocol is the geo type alias (`Point`, `Ring`, etc.), not the underlying composite type (`Tuple(Float64, Float64)`, `Array(Point)`, etc.). This allows parsers to recognize geographic types explicitly. - -### 7.1 Point - -`Point` represents a single 2D coordinate as a tuple of two `Float64` values (X, Y). - -**Type Alias:** -```sql -Point = Tuple(Float64, Float64) -``` - -**Wire Format:** - -The `Point` type serializes as a `Tuple` with two `Float64` elements: -1. Type string: `"Point"` (not `"Tuple(Float64, Float64)"`) -2. Stream `{TupleElement0}`: X coordinate (`Float64`, 8 bytes little-endian) -3. Stream `{TupleElement1}`: Y coordinate (`Float64`, 8 bytes little-endian) - -**Example: Single Point** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT (1.5, 2.5)::Point AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0550 6f69 6e74 0000 0000 ...col.Point.... -00000010: 0000 f83f 0000 0000 0000 0440 ...?.......@ -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column metadata: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (String: length=3) - 0x05, 0x50, 0x6f, 0x69, // Column type = "Point" (String: length=5) - 0x6e, 0x74, // (continuation) - - // TupleElement0 stream (X coordinate): - 0x00, 0x00, 0x00, 0x00, // X = 1.5 (Float64, little-endian) - 0x00, 0x00, 0xf8, 0x3f, // IEEE 754: 0x3ff8000000000000 - - // TupleElement1 stream (Y coordinate): - 0x00, 0x00, 0x00, 0x00, // Y = 2.5 (Float64, little-endian) - 0x00, 0x00, 0x04, 0x40, // IEEE 754: 0x4004000000000000 -]); -``` - -### 7.2 Ring - -`Ring` represents a closed loop of points, typically used to define polygon boundaries. It is an array of `Point` values. - -**Type Alias:** -```sql -Ring = Array(Point) -``` - -**Wire Format:** - -The `Ring` type serializes as an `Array(Point)`: -1. Type string: `"Ring"` (not `"Array(Point)"`) -2. Stream `{ArraySizes}`: Cumulative offsets (`UInt64` per row) -3. Stream `{ArrayElements, TupleElement0}`: X coordinates of all points (flattened `Float64` values) -4. Stream `{ArrayElements, TupleElement1}`: Y coordinates of all points (flattened `Float64` values) - -**Example: Ring with 4 points** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 0.0)]::Ring AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0452 696e 6704 0000 0000 ...col.Ring..... -00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................ -00000020: 00f0 3f00 0000 0000 00f0 3f00 0000 0000 ..?.......?..... -00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ................ -00000040: 0000 0000 0000 0000 00f0 3f00 0000 0000 ..........?..... -00000050: 0000 00 ... -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column metadata: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x04, 0x52, 0x69, 0x6e, // Column type = "Ring" (length=4) - 0x67, - - // ArraySizes stream: - 0x04, 0x00, 0x00, 0x00, // Array size = 4 points (UInt64) - 0x00, 0x00, 0x00, 0x00, - - // ArrayElements/TupleElement0 stream (X coordinates, 4 Float64 values): - 0x00, 0x00, 0x00, 0x00, // Point[0].X = 0.0 - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, // Point[1].X = 1.0 - 0x00, 0x00, 0xf0, 0x3f, - 0x00, 0x00, 0x00, 0x00, // Point[2].X = 1.0 - 0x00, 0x00, 0xf0, 0x3f, - 0x00, 0x00, 0x00, 0x00, // Point[3].X = 0.0 - 0x00, 0x00, 0x00, 0x00, - - // ArrayElements/TupleElement1 stream (Y coordinates, 4 Float64 values): - 0x00, 0x00, 0x00, 0x00, // Point[0].Y = 0.0 - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, // Point[1].Y = 0.0 - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, // Point[2].Y = 1.0 - 0x00, 0x00, 0xf0, 0x3f, - 0x00, 0x00, 0x00, 0x00, // Point[3].Y = 0.0 - 0x00, 0x00, 0x00, 0x00, -]); -``` - -**Note:** A valid ring should have its first and last points identical (closed loop), though this is not enforced at the wire format level. - -### 7.3 Polygon - -`Polygon` represents a polygon consisting of an outer boundary ring and optional holes (inner rings). It is an array of `Ring` values. - -**Type Alias:** -```sql -Polygon = Array(Ring) -``` - -**Wire Format:** - -The `Polygon` type serializes as `Array(Ring)`, which nests as `Array(Array(Point))`: -1. Type string: `"Polygon"` (not `"Array(Ring)"`) -2. Stream `{ArraySizes}`: Offsets for outer array (number of rings) -3. Stream `{ArrayElements, ArraySizes}`: Offsets for inner arrays (points per ring) -4. Stream `{ArrayElements, ArrayElements, TupleElement0}`: All X coordinates -5. Stream `{ArrayElements, ArrayElements, TupleElement1}`: All Y coordinates - -**Example: Simple Polygon with one ring** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT [[(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0), (0.0, 0.0)]]::Polygon AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0750 6f6c 7967 6f6e 0100 ...col.Polygon.. -00000010: 0000 0000 0000 0500 0000 0000 0000 0000 ................ -00000020: 0000 0000 0000 0000 0000 0000 2440 0000 ............$@.. -00000030: 0000 0000 2440 0000 0000 0000 0000 0000 ....$@.......... -00000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................ -00000050: 0000 0000 0000 0000 0000 0000 2440 0000 ............$@.. -00000060: 0000 0000 2440 0000 0000 0000 0000 ....$@........ -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column metadata: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x07, 0x50, 0x6f, 0x6c, // Column type = "Polygon" (length=7) - 0x79, 0x67, 0x6f, 0x6e, - - // Outer ArraySizes (number of rings): - 0x01, 0x00, 0x00, 0x00, // 1 ring (UInt64) - 0x00, 0x00, 0x00, 0x00, - - // Inner ArraySizes (points per ring): - 0x05, 0x00, 0x00, 0x00, // Ring[0] has 5 points (UInt64) - 0x00, 0x00, 0x00, 0x00, - - // All X coordinates (5 Float64 values): - 0x00, 0x00, 0x00, 0x00, // Point[0].X = 0.0 - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, // Point[1].X = 10.0 - 0x00, 0x00, 0x24, 0x40, // (0x4024000000000000) - 0x00, 0x00, 0x00, 0x00, // Point[2].X = 10.0 - 0x00, 0x00, 0x24, 0x40, - 0x00, 0x00, 0x00, 0x00, // Point[3].X = 0.0 - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, // Point[4].X = 0.0 - 0x00, 0x00, 0x00, 0x00, - - // All Y coordinates (5 Float64 values): - 0x00, 0x00, 0x00, 0x00, // Point[0].Y = 0.0 - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, // Point[1].Y = 0.0 - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, // Point[2].Y = 10.0 - 0x00, 0x00, 0x24, 0x40, - 0x00, 0x00, 0x00, 0x00, // Point[3].Y = 10.0 - 0x00, 0x00, 0x24, 0x40, - 0x00, 0x00, 0x00, 0x00, // Point[4].Y = 0.0 - 0x00, 0x00, 0x00, 0x00, -]); -``` - -**Convention:** The first ring in a polygon is the outer boundary, and subsequent rings (if any) represent holes. - -### 7.4 MultiPolygon - -`MultiPolygon` represents a collection of polygons. It is an array of `Polygon` values. - -**Type Alias:** -```sql -MultiPolygon = Array(Polygon) -``` - -**Wire Format:** - -The `MultiPolygon` type serializes as `Array(Polygon)`, nesting three levels deep as `Array(Array(Array(Point)))`: -1. Type string: `"MultiPolygon"` (not `"Array(Polygon)"`) -2. Stream `{ArraySizes}`: Offsets for polygons -3. Stream `{ArrayElements, ArraySizes}`: Offsets for rings within each polygon -4. Stream `{ArrayElements, ArrayElements, ArraySizes}`: Offsets for points within each ring -5. Stream `{ArrayElements, ArrayElements, ArrayElements, TupleElement0}`: All X coordinates -6. Stream `{ArrayElements, ArrayElements, ArrayElements, TupleElement1}`: All Y coordinates - -**Example: MultiPolygon with two simple triangular polygons** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT [[[(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 0.0)]], [[(20.0, 20.0), (30.0, 20.0), (30.0, 30.0), (20.0, 20.0)]]]::MultiPolygon AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0c4d 756c 7469 506f 6c79 ...col.MultiPoly -00000010: 676f 6e02 0000 0000 0000 0001 0000 0000 gon............. -00000020: 0000 0002 0000 0000 0000 0004 0000 0000 ................ -00000030: 0000 0008 0000 0000 0000 0000 0000 0000 ................ -00000040: 0000 0000 0000 0000 0024 4000 0000 0000 .........$@..... -00000050: 0024 4000 0000 0000 0000 0000 0000 0000 .$@............. -00000060: 0034 4000 0000 0000 003e 4000 0000 0000 .4@......>@..... -00000070: 003e 4000 0000 0000 0034 4000 0000 0000 .>@......4@..... -00000080: 0000 0000 0000 0000 0000 0000 0000 0000 ................ -00000090: 0024 4000 0000 0000 0000 0000 0000 0000 .$@............. -000000a0: 0034 4000 0000 0000 0034 4000 0000 0000 .4@......4@..... -000000b0: 003e 4000 0000 0000 0034 40 .>@......4@ -``` - -Breakdown omitted for brevity (follows nested array pattern with 3 levels of offsets). - -### 7.5 LineString - -`LineString` represents a sequence of connected line segments as an array of points. - -**Type Alias:** -```sql -LineString = Array(Point) -``` - -**Wire Format:** - -The `LineString` type serializes identically to `Ring` (both are `Array(Point)`): -1. Type string: `"LineString"` (not `"Array(Point)"`) -2. Stream `{ArraySizes}`: Cumulative offsets -3. Stream `{ArrayElements, TupleElement0}`: X coordinates -4. Stream `{ArrayElements, TupleElement1}`: Y coordinates - -**Example: LineString with 3 points** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]::LineString AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0a4c 696e 6553 7472 696e ...col.LineStrin -00000010: 6703 0000 0000 0000 0000 0000 0000 00f0 g............... -00000020: 3f00 0000 0000 0008 4000 0000 0000 0014 ?.......@....... -00000030: 4000 0000 0000 0000 4000 0000 0000 0010 @.......@....... -00000040: 4000 0000 0000 0018 40 @.......@ -``` - -Breakdown omitted (follows same pattern as `Ring`). - -**Note:** Unlike `Ring`, a `LineString` is not required to form a closed loop (first and last points need not match). - -### 7.6 MultiLineString - -`MultiLineString` represents a collection of line strings. It is an array of `LineString` values. - -**Type Alias:** -```sql -MultiLineString = Array(LineString) -``` - -**Wire Format:** - -The `MultiLineString` type serializes as `Array(LineString)`, nesting as `Array(Array(Point))`: -1. Type string: `"MultiLineString"` (not `"Array(LineString)"`) -2. Stream `{ArraySizes}`: Offsets for line strings -3. Stream `{ArrayElements, ArraySizes}`: Offsets for points within each line string -4. Stream `{ArrayElements, ArrayElements, TupleElement0}`: All X coordinates -5. Stream `{ArrayElements, ArrayElements, TupleElement1}`: All Y coordinates - -**Example: MultiLineString with two line strings** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT [[(1.0, 2.0), (3.0, 4.0)], [(5.0, 6.0), (7.0, 8.0)]]::MultiLineString AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0f4d 756c 7469 4c69 6e65 ...col.MultiLine -00000010: 5374 7269 6e67 0200 0000 0000 0000 0200 String.......... -00000020: 0000 0000 0000 0400 0000 0000 0000 0000 ................ -00000030: 0000 0000 f03f 0000 0000 0000 0840 0000 .....?.......@.. -00000040: 0000 0000 1440 0000 0000 0000 1c40 0000 .....@.......@.. -00000050: 0000 0000 0040 0000 0000 0000 1040 0000 .....@.......@.. -00000060: 0000 0000 1840 0000 0000 0000 2040 .....@...... @ -``` - -Breakdown omitted (follows nested array pattern). - -### 7.7 Geometry - -`Geometry` is a variant type that can hold any of the geographic types. It uses the `Variant` type structure with specific discriminant ordering for geo types. - -**Type Definition:** -```sql -Geometry = Variant(LineString, MultiLineString, MultiPolygon, Point, Polygon, Ring) -``` - -**Discriminant Mapping:** - -| Discriminant Index | Geo Type | -|--------------------|----------| -| 0 | LineString | -| 1 | MultiLineString | -| 2 | MultiPolygon | -| 3 | Point | -| 4 | Polygon | -| 5 | Ring | - -**Wire Format:** - -The `Geometry` type follows standard `Variant` serialization (see section 6.7): -1. Type string: `"Geometry"` (not `"Variant(LineString, ...)"`) -2. Stream `{VariantDiscriminatorsPrefix}`: Serialization mode (`UInt64`) -3. Stream `{VariantDiscriminators}`: Discriminator indices (mode-dependent) -4. Streams `{VariantElements, VariantElement[i], ...}`: Sparse data for each variant type - -**Important Configuration:** - -The `Geometry` type contains structurally similar variants (`LineString` and `Ring` are both `Array(Point)`), which ClickHouse considers "suspicious" for potential ambiguity. To use `Geometry`, you must enable the setting: - -```sql -SET allow_suspicious_variant_types = 1; -``` - -**Example: Geometry containing a Point** - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native&allow_suspicious_variant_types=1" \ - --data-binary "SELECT (1.0, 2.0)::Point::Geometry AS col" | xxd -``` - -``` -00000000: 0101 0363 6f6c 0847 656f 6d65 7472 7900 ...col.Geometry. -00000010: 0000 0000 0000 0003 0000 0000 0000 f03f ...............? -00000020: 0000 0000 0000 0040 .......@ -``` - -Breakdown: -```typescript -const block = new Uint8Array([ - 0x01, // NumColumns = 1 - 0x01, // NumRows = 1 - - // Column metadata: - 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" - 0x08, 0x47, 0x65, 0x6f, // Column type = "Geometry" (length=8) - 0x6d, 0x65, 0x74, 0x72, - 0x79, - - // VariantDiscriminatorsPrefix stream: - 0x00, 0x00, 0x00, 0x00, // Serialization mode = 0 (BASIC) - 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) - - // VariantDiscriminators stream (BASIC mode): - 0x03, // Discriminator = 3 (Point type) - - // VariantElements/VariantElement[3] (Point data): - // TupleElement0 (X coordinate): - 0x00, 0x00, 0x00, 0x00, // X = 1.0 (Float64) - 0x00, 0x00, 0xf0, 0x3f, - // TupleElement1 (Y coordinate): - 0x00, 0x00, 0x00, 0x00, // Y = 2.0 (Float64) - 0x00, 0x00, 0x00, 0x40, -]); -``` - -**Key Observations:** - -1. **Type Name vs. Underlying Type**: The wire format uses the geo type alias name (e.g., `"Point"`, `"Geometry"`) in the type string, not the underlying composite type name. This allows parsers to distinguish geographic types from generic tuples and arrays. - -2. **Structural Composition**: Geo types compose existing Native format features: - - `Point` uses `Tuple` streams - - `Ring`, `LineString` use `Array` streams - - `Polygon`, `MultiPolygon`, `MultiLineString` use nested `Array` streams - - `Geometry` uses `Variant` streams - -3. **Sparse Variant Storage**: In `Geometry`, only the variant element corresponding to the discriminator contains data. Other variant elements are empty. - -4. **Ambiguity Warning**: `LineString` and `Ring` have identical underlying types (`Array(Point)`), which creates potential ambiguity in `Geometry`. This is why `allow_suspicious_variant_types=1` is required. - ---- - -## 8. Vector Types (QBit) - -`QBit` is a specialized vector data type designed for efficient similarity search operations. It stores vectors of floating-point numbers in a bit-transposed format for optimized SIMD operations during distance calculations. - -#### Type Definition - -```sql -QBit(element_type, dimension) -``` - -Where: -- `element_type`: The numeric type of vector elements (`Float32`, `Float64`, or `BFloat16`) -- `dimension`: The number of elements in the vector (fixed size) - -#### Wire Format - -**Important:** The wire format differs significantly between Native and RowBinary formats. - -##### RowBinary Format - -In RowBinary format, `QBit` is serialized exactly like an `Array` of the element type: - -1. **Size** (VarUInt): Number of elements in the vector (must match the declared dimension) -2. **Elements**: Sequential float values in little-endian format - -**Example: QBit(Float32, 3) with values [1.0, 2.0, 3.0] in RowBinaryWithNamesAndTypes** - -```bash -curl -s "http://localhost:8123/?allow_experimental_qbit_type=1" \ - --data-binary "SELECT [1.0, 2.0, 3.0]::QBit(Float32, 3) AS vec FORMAT RowBinaryWithNamesAndTypes" | xxd -``` - -``` -00000000: 0103 7665 6310 5142 6974 2846 6c6f 6174 ..vec.QBit(Float -00000010: 3332 2c20 3329 0300 0080 3f00 0000 4000 32, 3)....?...@. -00000020: 0040 40 .@@ -``` - -The data portion (after type metadata) is: -``` -0x03 # Size = 3 (VarUInt) -0x00 0x00 0x80 0x3F # 1.0 as little-endian Float32 -0x00 0x00 0x00 0x40 # 2.0 as little-endian Float32 -0x00 0x00 0x40 0x40 # 3.0 as little-endian Float32 -``` - -##### Native Format (Bit-Transposed) - -In Native format, `QBit` uses a **bit-transposed encoding** that is NOT the same as Array. This format organizes data across multiple `FixedString` columns as a `Tuple` for SIMD-optimized distance calculations. - -**Example: QBit(Float32, 3) in Native format** - -```bash -curl -s "http://localhost:8123/?allow_experimental_qbit_type=1" \ - --data-binary "SELECT [1.0, 2.0, 3.0]::QBit(Float32, 3) AS vec FORMAT Native" | xxd -``` - -``` -00000000: 0101 0376 6563 1051 4269 7428 466c 6f61 ...vec.QBit(Floa -00000010: 7433 322c 2033 2900 0601 0101 0101 0101 t32, 3)......... -00000020: 0400 0000 0000 0000 0000 0000 0000 0000 ................ -00000030: 0000 0000 0000 00 ....... -``` - -The bit transposition maps: -- Bit j of element i → Bit i of bit-plane j - -This internal format is significantly more complex than simple array serialization and is designed for optimal SIMD performance during vector similarity computations. - -#### Type Encoding - -When using binary type encoding (with `output_format_native_encode_types_in_binary_format=1`): - -``` -0x36 # BinaryTypeIndex::QBit - # Encoded element type (Float32=0x0D, Float64=0x0E, BFloat16=0x31) - # Vector dimension -``` - ---- - -## 9. AggregateFunction Types - -The `AggregateFunction` type stores the intermediate state of an aggregate function. This enables incremental aggregation, state merging across distributed systems, and materialized aggregate views. - -### Type Definition - -```sql -AggregateFunction(function_name[(param1, param2, ...)], arg_type1, arg_type2, ...) -``` - -Examples: -- `AggregateFunction(sum, UInt64)` - State for `sum` over `UInt64` values -- `AggregateFunction(quantile(0.5), Float64)` - State for median calculation -- `AggregateFunction(uniqExact, String)` - State for exact unique count - -### Version Management - -Starting from revision **54452** (`DBMS_MIN_REVISION_WITH_AGGREGATE_FUNCTIONS_VERSIONING`), aggregate function states include version information. The version number corresponds to the `client_revision` and allows backward-compatible changes to serialization formats. - -**Before revision 54452:** -- States were serialized without version metadata -- Format changes required careful coordination - -**From revision 54452 onwards:** -- Each state includes a version number -- Version is used during both serialization and deserialization -- Allows gradual rollout of format improvements - -### Wire Format - -When serializing `AggregateFunction` values in Native format: - -**Type Name Format:** -``` -"AggregateFunction(function_name(param1, param2, ...), Type1, Type2, ...)" -``` - -**Binary Type Encoding (when `output_format_native_encode_types_in_binary_format=1`):** -``` -0x25 # BinaryTypeIndex::AggregateFunction - # Version number (e.g., client revision) - # Length of function name - # Function name as UTF-8 bytes - # Number of parameters - # Parameter 1 (Field binary encoding) -... - # Parameter N - # Number of argument types - # Argument type 1 (recursive type encoding) -... - # Argument type N -``` - -**State Data Format:** -Each aggregate function state is serialized as an opaque binary blob. The exact format is function-specific and controlled by the `IAggregateFunction::serialize()` and `IAggregateFunction::deserialize()` methods. - -When writing a single value: -``` - # Function-specific binary data -``` - -The state data includes: -1. Internal counters, accumulators, or intermediate values -2. May include embedded data structures (e.g., hash tables for `uniq`, sketches for `quantile`) -3. Format is optimized for the specific aggregate function - -### Example: sumState - -```bash -curl -s -XPOST "http://localhost:8123?default_format=Native" \ - --data-binary "SELECT sumState(number) AS col FROM numbers(5)" | xxd -``` - -The result contains: -``` -01 # NumColumns = 1 -01 # NumRows = 1 -03 63 6f 6c # Column name = "col" (length=3) - # Type name = "AggregateFunction(sum, UInt64)" - - # For sum(0,1,2,3,4) = 10, stored as UInt64 -``` - -For `sum` over integers, the state is simply the accumulated value. For complex functions like `quantile` or `uniq`, the state contains sophisticated data structures. - -### Bulk Serialization - -When serializing multiple aggregate function states (e.g., in a column), the format uses `serializeBatch`: - -``` - - -... - -``` - -Each state is serialized sequentially, with the aggregate function controlling the exact format through its versioned serialization methods. - -### Function-Specific State Examples - -**sum/avg**: Simple accumulator values -**uniq**: Hash table or HyperLogLog sketch -**quantile**: T-Digest or other quantile estimation structure -**groupArray**: Array of accumulated values -**topK**: Heap of top K elements with counts - ---- - -## 10. Format Settings - -The Native format behavior can be customized using several format-specific settings. These settings affect both input parsing and output serialization. - -### input_format_native_allow_types_conversion - -**Type:** Bool -**Default:** `true` -**Applies to:** Input only - -Allows automatic type conversion when reading Native format data. When enabled, ClickHouse will attempt to convert column types from the input data to match the target table schema. - -**Example Use Cases:** -- Reading older data with slightly different type definitions -- Importing data where numeric types don't exactly match (e.g., `UInt32` to `UInt64`) -- Upgrading tables with changed column types - -**When disabled (`false`):** -- Type mismatches cause errors -- Strict type checking enforced -- Better for ensuring data consistency - -```sql -SET input_format_native_allow_types_conversion = 1; -INSERT INTO table SELECT * FROM input('Native', 'col UInt64'); -``` - -### input_format_native_decode_types_in_binary_format - -**Type:** Bool -**Default:** `false` -**Applies to:** Input only - -When enabled, expects type names to be encoded in binary format (as described in `DataTypesBinaryEncoding.h`) rather than as human-readable strings. - -**Binary format benefits:** -- More compact representation (single byte for simple types) -- Faster parsing (no string parsing required) -- Unambiguous encoding - -**String format (default):** -- Human-readable type names (e.g., `"UInt32"`, `"String"`) -- More compatible with external tools -- Easier debugging - -**Example:** -```sql --- Reading data with binary-encoded types -SET input_format_native_decode_types_in_binary_format = 1; -SELECT * FROM file('data.native', 'Native'); -``` - -### output_format_native_encode_types_in_binary_format - -**Type:** Bool -**Default:** `false` -**Applies to:** Output only - -When enabled, writes type names in binary format rather than as strings. This is the output counterpart to `input_format_native_decode_types_in_binary_format`. - -**Benefits of binary encoding:** -- Reduced output size (especially for complex nested types) -- Faster serialization -- Precise encoding of all type parameters - -**Type encoding table:** See `DataTypesBinaryEncoding.h` for the complete mapping (excerpted in section 8.1). - -**Example:** -```sql -SET output_format_native_encode_types_in_binary_format = 1; -SELECT * FROM table FORMAT Native; -``` - -**Binary vs String comparison:** -- `UInt32`: Binary = 1 byte (`0x03`), String = 11 bytes (`"UInt32"` with length prefix) -- `Array(UInt8)`: Binary = 2 bytes (`0x1E 0x01`), String = 16 bytes -- Complex nested types: Binary can be 10x smaller - -### output_format_native_write_json_as_string - -**Type:** Bool -**Default:** `false` -**Applies to:** Output only, `JSON` data type - -Controls how the new `JSON` data type is serialized in Native format. - -**When disabled (default `false`):** -- `JSON` columns use their native binary serialization format -- Preserves full type structure with typed paths -- Supports efficient querying of JSON subcolumns -- Uses internal representation with dynamic types - -**When enabled (`true`):** -- `JSON` columns are serialized as `String` columns -- Each value is a JSON string -- Simpler format, easier interoperability -- Loses type information for paths - -**Example:** -```sql -CREATE TABLE json_test (data JSON) ENGINE = Memory; -INSERT INTO json_test VALUES ('{"id": 1, "name": "test"}'); - --- Native JSON serialization (structured) -SET output_format_native_write_json_as_string = 0; -SELECT * FROM json_test FORMAT Native; - --- As string (JSON text) -SET output_format_native_write_json_as_string = 1; -SELECT * FROM json_test FORMAT Native; -``` - -**Use when:** -- Exporting to systems that don't understand ClickHouse's `JSON` type -- Preserving exact JSON text representation -- Simplifying data pipeline integration - -### output_format_native_use_flattened_dynamic_and_json_serialization - -**Type:** Bool -**Default:** `false` -**Applies to:** Output only, `JSON` and `Dynamic` data types - -Controls whether `JSON` and `Dynamic` columns are serialized in a flattened format where all types/paths are written as separate subcolumns. - -**When disabled (default `false`):** -- Uses compact serialization optimized for the Native format -- More efficient for storage and transmission -- Uses version 2 serialization (revision >= 54473) - -**When enabled (`true`):** -- Each typed path in `JSON` becomes a separate subcolumn -- Each possible type in `Dynamic` is stored separately -- More verbose but easier to introspect -- Better compatibility with older versions or external tools - -**Flattened format structure:** -For `JSON`: -- Each discovered path becomes a subcolumn (e.g., `data.id`, `data.name`) -- Shared offsets for nested structures -- Null maps for optional paths - -For `Dynamic`: -- Discriminator column for type selection -- Separate column for each possible type variant -- Null maps for unused variants - -**Example:** -```sql -CREATE TABLE dynamic_test (value Dynamic) ENGINE = Memory; -INSERT INTO dynamic_test VALUES (42), ('text'), ([1,2,3]); - --- Compact serialization -SET output_format_native_use_flattened_dynamic_and_json_serialization = 0; -SELECT * FROM dynamic_test FORMAT Native; - --- Flattened serialization -SET output_format_native_use_flattened_dynamic_and_json_serialization = 1; -SELECT * FROM dynamic_test FORMAT Native; -``` - -**Trade-offs:** -- Flattened: Easier to understand, more compatible, larger size -- Compact: More efficient, requires newer client support - ---- - -## 11. Protocol Revision Reference - -The Native format behavior varies based on the protocol revision (also called `client_revision` in the code). This enables backward compatibility while allowing the format to evolve with new features. - -### Revision Thresholds - -| Revision | Constant Name | Feature | -|----------|---------------|---------| -| 54405 | DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE | LowCardinality support | -| 54452 | DBMS_MIN_REVISION_WITH_AGGREGATE_FUNCTIONS_VERSIONING | AggFunc versioning | -| 54454 | DBMS_MIN_REVISION_WITH_CUSTOM_SERIALIZATION | Custom serialization info | -| 54465 | DBMS_MIN_REVISION_WITH_SPARSE_SERIALIZATION | Sparse columns | -| 54473 | DBMS_MIN_REVISION_WITH_V2_DYNAMIC_AND_JSON_SERIALIZATION | Dynamic/JSON v2 | -| 54480 | DBMS_MIN_REVISION_WITH_OUT_OF_ORDER_BUCKETS_IN_AGGREGATION | Out of order bucket IDs | -| 54482 | DBMS_MIN_REVISION_WITH_REPLICATED_SERIALIZATION | Replicated columns | -| 54483 | DBMS_MIN_REVISION_WITH_NULLABLE_SPARSE_SERIALIZATION | Nullable sparse | - -### Detailed Revision Explanations - -#### 54405: LowCardinality Support - -**Feature:** Introduces the `LowCardinality` type wrapper for efficient storage of low-cardinality string columns. - -**Wire Format Impact:** -- Clients with `revision >= 54405`: Receive `LowCardinality(T)` types with dictionary-encoded values -- Clients with `revision < 54405`: Server automatically converts to the underlying type `T` - -**Backward Compatibility:** -The server checks `client_revision` and calls `recursiveRemoveLowCardinality()` before sending data to old clients, ensuring they receive materialized values instead of dictionary references. - -**Example:** -```sql --- Server with LowCardinality column -CREATE TABLE test (country LowCardinality(String)) ENGINE = Memory; - --- Old client (revision < 54405): Receives as String --- New client (revision >= 54405): Receives with dictionary encoding -``` - -#### 54452: Aggregate Functions Versioning - -**Feature:** Adds version metadata to `AggregateFunction` state serialization. - -**Wire Format Impact:** -- Enables versioned serialization of aggregate function states -- Version number passed to `IAggregateFunction::serialize()` and `deserialize()` -- Allows incremental improvements to state formats - -**Before this revision:** -- States serialized without version information -- Format changes required careful coordination across clusters -- Upgrades had to be all-or-nothing - -**After this revision:** -- Each state includes version metadata -- Functions can support multiple format versions -- Gradual rollout of format improvements possible - -**Version Parameter:** -The version is typically set to `client_revision` when serializing, allowing functions to choose appropriate format based on client capabilities. - -See Section 9 for detailed `AggregateFunction` serialization format. - -#### 54454: Custom Serialization Info - -**Feature:** Enables custom serialization strategies based on column statistics. - -**Wire Format Impact:** -- Server analyzes column data and derives `SerializationInfo` -- Serialization chosen based on actual data characteristics -- Examples: sparse serialization for columns with many default values - -**Process:** -1. Server calls `DataType::getSerializationInfo(column)` to analyze data -2. Server calls `DataType::getSerialization(info)` to get optimized serialization -3. Server writes serialization metadata if needed -4. Client reads metadata and uses appropriate deserialization - -**Before this revision:** -- All columns used default serialization -- No optimization based on data characteristics - -**After this revision:** -- Sparse serialization for columns with mostly defaults -- Custom strategies for specific data patterns -- Significant storage and network savings for sparse data - -**Client Impact:** -Old clients (`revision < 54454`) receive data with `getDefaultSerialization()` and all sparse columns are materialized. - -#### 54465: Sparse Serialization - -**Feature:** Introduces sparse column format for efficient storage of columns with many default/zero values. - -**Wire Format Impact:** -Sparse columns are encoded as: -1. **Offsets array**: Positions of non-default values -2. **Values array**: Only non-default values - -This dramatically reduces size for sparse data. - -**Example:** -``` -Regular column: [0, 0, 0, 0, 42, 0, 0, 0, 99, 0] -Sparse encoding: - offsets = [4, 8] - values = [42, 99] -``` - -**Backward Compatibility:** -- Clients with `revision < 54465`: Server calls `recursiveRemoveSparse()` to materialize all values -- Clients with `revision >= 54465`: Receive sparse format when beneficial - -**Applies to:** -- Numeric columns with many zeros -- Nullable columns with many NULLs -- Any column where most values equal the default - -#### 54473: Dynamic/JSON Serialization v2 - -**Feature:** Improved serialization format for `Dynamic` and `JSON` types. - -**Wire Format Impact:** -- **V1 (before 54473):** Less efficient, simpler format -- **V2 (from 54473):** Optimized binary encoding, better compression - -**Settings Affected:** -When `client_revision < 54473`, the server sets: -```cpp -settings.dynamic_serialization_version = MergeTreeDynamicSerializationVersion::V1; -settings.object_serialization_version = MergeTreeObjectSerializationVersion::V1; -``` - -**V2 Improvements:** -- More compact type discriminators -- Better handling of mixed-type data -- Optimized path encoding for JSON -- Reduced metadata overhead - -**Backward Compatibility:** -Automatic downgrade to V1 for old clients, ensuring seamless upgrades. - -#### 54480: Out of Order Buckets in Aggregation - -**Feature:** Allows bucket IDs to arrive in any order during distributed aggregation. - -**Wire Format Impact:** -This primarily affects distributed aggregation protocol rather than basic Native format. When aggregating across multiple nodes: - -**Before 54480:** -- Bucket IDs expected in sorted order -- Required sorting/buffering - -**After 54480:** -- Buckets can arrive in any order -- Reduced latency in distributed queries -- Better parallelism - -**Native Format Impact:** -Minimal direct impact on basic Native format structure, but affects how aggregate states are transmitted during distributed two-level aggregation. - -#### 54482: Replicated Serialization - -**Feature:** Support for replicated column optimization in distributed queries. - -**Wire Format Impact:** -`Replicated` columns share data across multiple identical copies to save memory. - -**Before this revision:** -- All replicated columns converted to full columns: `convertToFullColumnIfReplicated()` -- Increased memory usage - -**From this revision:** -- Clients with `revision >= 54482`: Receive replicated columns as-is -- Clients with `revision < 54482`: Server materializes replicated columns - -**Use Case:** -Common in `JOIN` operations where the same small dimension table is used repeatedly. - -#### 54483: Nullable Sparse Serialization - -**Feature:** Extends sparse serialization to `Nullable` types. - -**Wire Format Impact:** -Enables sparse encoding for nullable columns, combining NULL map sparsity with value sparsity. - -**Efficiency Gains:** -``` -Example: Nullable(UInt64) with mostly NULLs and zeros -Regular: [NULL, NULL, 0, NULL, 42, NULL, 0, 0, NULL] -Sparse: - null_offsets = [2, 4, 6, 7] # non-NULL positions - null_values = [0, 42, 0, 0] - value_offsets = [1] # non-default positions within non-NULLs - value_values = [42] -``` - -**Backward Compatibility:** -- `revision < 54483`: Server materializes sparse nullable columns -- `revision >= 54483`: Can receive optimized sparse format - -### Revision Check Logic - -The server determines serialization format based on `client_revision`: - -```cpp -if (client_revision >= DBMS_MIN_REVISION_WITH_CUSTOM_SERIALIZATION) { - // Use optimized serialization - if (client_revision < DBMS_MIN_REVISION_WITH_REPLICATED_SERIALIZATION) - column = column->convertToFullColumnIfReplicated(); - if (client_revision < DBMS_MIN_REVISION_WITH_SPARSE_SERIALIZATION) - column = recursiveRemoveSparse(column); - // etc. -} -``` - -This cascading check ensures maximum compatibility while enabling new optimizations for capable clients. - -### Current Protocol Version - -As of the source code, the current TCP protocol version is **54483** (`DBMS_TCP_PROTOCOL_VERSION`), supporting all features listed above. +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! +!!!!! NOTE: DOCS GENERATED BY CLAUDE VIA EXAMINING CORE CODEBASE AND GATHERING DATA FROM QUERIES. HAS NOT BEEN VERIFIED BY A HUMAN. +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! + + +# Native Format Specification + +## 1. Introduction + +### Overview + +The Native format is ClickHouse's columnar binary format designed for efficient data exchange. Unlike row-based formats (like RowBinary) where each row is serialized sequentially, Native organizes data by column - all values for the first column are written together, then all values for the second column, and so on. + +**Columnar Layout (Native):** +``` +Column1: [row1_val, row2_val, row3_val, ...] +Column2: [row1_val, row2_val, row3_val, ...] +... +``` + +**Row-Based Layout (RowBinary):** +``` +Row1: [col1_val, col2_val, col3_val, ...] +Row2: [col1_val, col2_val, col3_val, ...] +... +``` + +The columnar layout enables efficient compression (similar values are adjacent), better CPU cache utilization, and aligns with ClickHouse's internal storage format. + +### Native Format vs TCP Native Protocol + +The term "Native" is used in two related but distinct contexts: + +1. **Native Format**: A binary wire format for serializing blocks of columnar data. This document specifies the Native format. + +2. **TCP Native Protocol**: ClickHouse's proprietary client-server protocol that uses the Native format for data transfer, plus additional protocol messages for queries, progress, exceptions, etc. + +**Key Differences:** + +| Aspect | Native Format (HTTP) | TCP Native Protocol | +|--------|---------------------|---------------------| +| Transport | HTTP POST body | TCP socket | +| BlockInfo | Not included | Included (when `client_revision > 0`) | +| Streaming | Single response | Multi-block streaming with progress | +| Authentication | HTTP headers | Protocol handshake | +| Use Case | Simple integrations | Native clients, high-performance | + +When using HTTP with `FORMAT Native`, you receive pure Native format blocks without the TCP protocol overhead. The TCP protocol wraps Native format blocks with additional metadata like `BlockInfo` (see Section 3). + +### Comparison with RowBinary + +| Feature | Native | RowBinary | +|---------|--------|-----------| +| **Layout** | Columnar (all values per column together) | Row-based (all columns per row together) | +| **Block Header** | NumColumns, NumRows, column names & types | None (schema must be known) | +| **Type Info** | Embedded in stream | External (defined by INSERT or FORMAT clause) | +| **Streaming** | Block-at-a-time | Row-at-a-time | +| **Memory** | Higher (full column buffers) | Lower (single row) | +| **Compression** | Better (similar values adjacent) | Moderate | +| **LowCardinality** | Dictionary-encoded | Materialized values | +| **Nullable** | Null bitmap + values | Per-row null flags + values | +| **Complex Types** | Stream-based (Array sizes + elements) | Inline per-row | +| **Best For** | Bulk transfer, server-to-server | Simple clients, streaming | + +**Size Comparison Example:** + +For a column of 1000 `LowCardinality(String)` values with 10 unique strings: + +- **Native**: Dictionary (10 strings) + 1000 index bytes ≈ small +- **RowBinary**: 1000 full strings ≈ much larger + +For simple `UInt64` data without repetition, both formats produce similar sizes. + +--- + +## 2. Block Structure + +The Native format organizes data into blocks. Each block is a self-contained unit with metadata followed by columnar data. When reading over HTTP (not using the TCP protocol with `client_revision > 0`), blocks do not include `BlockInfo`. + +### Block Dimensions + +A block starts with two VarUInt values that specify its dimensions: + +- **NumColumns** (VarUInt): Number of columns in the block +- **NumRows** (VarUInt): Number of rows in the block + +VarUInt is an unsigned LEB128 (Little Endian Base 128) encoding where each byte uses 7 bits for data and 1 bit as a continuation flag. Values 0-127 encode as a single byte. + +**Example: Simple single-column, single-row block** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT 42::UInt32 AS num" | xxd +``` + +``` +00000000: 0101 036e 756d 0655 496e 7433 322a 0000 ...num.UInt32*.. +00000010: 00 . +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 (VarUInt) + 0x01, // NumRows = 1 (VarUInt) + + // Column 0: + 0x03, 0x6e, 0x75, 0x6d, // Column name = "num" (String: length=3, bytes="num") + 0x06, 0x55, 0x49, 0x6e, // Column type = "UInt32" (String: length=6, bytes="UInt32") + 0x74, 0x33, 0x32, // (continuation of "UInt32") + 0x2a, 0x00, 0x00, 0x00, // Column data: 42 as little-endian UInt32 +]); +``` + +**Example: Two columns, one row** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT 'hello'::String AS msg, 100::UInt8 AS id" | xxd +``` + +``` +00000000: 0201 036d 7367 0653 7472 696e 6705 6865 ...msg.String.he +00000010: 6c6c 6f02 6964 0555 496e 7438 64 llo.id.UInt8d +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x02, // NumColumns = 2 (VarUInt) + 0x01, // NumRows = 1 (VarUInt) + + // Column 0: + 0x03, 0x6d, 0x73, 0x67, // Column name = "msg" (String) + 0x06, 0x53, 0x74, 0x72, // Column type = "String" (String) + 0x69, 0x6e, 0x67, // (continuation) + 0x05, 0x68, 0x65, 0x6c, // Column data: "hello" (String: length=5) + 0x6c, 0x6f, // (continuation) + + // Column 1: + 0x02, 0x69, 0x64, // Column name = "id" (String: length=2) + 0x05, 0x55, 0x49, 0x6e, // Column type = "UInt8" (String: length=5) + 0x74, 0x38, // (continuation) + 0x64, // Column data: 100 as UInt8 +]); +``` + +**Example: One column, three rows (columnar layout)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT number::UInt64 AS n FROM numbers(3)" | xxd +``` + +``` +00000000: 0103 016e 0655 496e 7436 3400 0000 0000 ...n.UInt64..... +00000010: 0000 0001 0000 0000 0000 0002 0000 0000 ................ +00000020: 0000 00 ... +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 (VarUInt) + 0x03, // NumRows = 3 (VarUInt) + + // Column 0: + 0x01, 0x6e, // Column name = "n" (String: length=1) + 0x06, 0x55, 0x49, 0x6e, // Column type = "UInt64" (String: length=6) + 0x74, 0x36, 0x34, // (continuation) + + // Column data (all 3 rows, columnar): + 0x00, 0x00, 0x00, 0x00, // Row 0: 0 as UInt64 (little-endian) + 0x00, 0x00, 0x00, 0x00, + 0x01, 0x00, 0x00, 0x00, // Row 1: 1 as UInt64 (little-endian) + 0x00, 0x00, 0x00, 0x00, + 0x02, 0x00, 0x00, 0x00, // Row 2: 2 as UInt64 (little-endian) + 0x00, 0x00, 0x00, 0x00, +]); +``` + +### Per-Column Structure + +Each column in the block is serialized sequentially with the following structure: + +1. **Column Name** (String): VarUInt length + UTF-8 bytes +2. **Column Type** (String or binary-encoded): + - By default: String representation (e.g., `"UInt32"`, `"Nullable(String)"`) + - If `output_format_native_encode_types_in_binary_format=1`: Binary-encoded type structure +3. **Serialization Info** (conditional, revision >= 54454): + - Present only when `client_revision >= DBMS_MIN_REVISION_WITH_CUSTOM_SERIALIZATION` (54454) + - `UInt8 has_custom`: 0 or 1 indicating if custom serialization is used + - If `has_custom == 1`: Binary-encoded serialization kind stack (for sparse columns, etc.) +4. **Column Data** (columnar, type-specific): + - All values for this column across all rows + - Serialization depends on the data type (see subsequent sections) + - If `NumRows == 0`, no data bytes are written (even for variable-length types) + +**Empty block behavior:** + +When a query returns 0 rows, the entire response is empty (0 bytes). The block structure is only written when there is at least one row or when explicitly required by the protocol. + +```bash +# Query with 0 rows produces empty output +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT number::UInt64 AS n FROM numbers(0)" | wc -c +# Output: 0 +``` + +--- + +## 3. BlockInfo (TCP Protocol Only) + +`BlockInfo` contains additional metadata about the block, used primarily for distributed query execution and aggregation. It is **only present when using the TCP protocol** with `client_revision > 0`. When reading Native format over HTTP, `BlockInfo` is not included. + +### When BlockInfo is Present + +`BlockInfo` appears at the very beginning of each block, before the block dimensions, when: +- The block is transmitted over the TCP (Native) protocol +- `client_revision > 0` (i.e., not reading from an old client) + +In the HTTP Native format context, `BlockInfo` is never serialized, and blocks start directly with NumColumns/NumRows. + +### Field-Based Serialization Format + +`BlockInfo` uses a field-number-based binary format (similar to protobuf but simpler). The structure is: + +``` +[Field1_Num(VarUInt), Field1_Value(binary)] +[Field2_Num(VarUInt), Field2_Value(binary)] +... +[0x00] // Terminator byte +``` + +Each field is serialized only if the protocol revision supports it. The serialization always ends with a terminator byte (`0x00`). + +### Fields + +`BlockInfo` defines three fields (as of protocol revision 54480+): + +#### Field 1: `is_overflows` (bool) + +- **Field Number**: 1 (VarUInt) +- **Type**: `bool` (UInt8: 0 = false, 1 = true) +- **Minimum Revision**: 0 (always present if `client_revision > 0`) +- **Purpose**: Indicates if this block contains overflow rows from `GROUP BY ... WITH TOTALS` when `max_rows_to_group_by` is exceeded with `group_by_overflow_mode = 'any'`. When `true`, this block contains aggregated values that didn't fit within the limit. + +**Encoding:** +```typescript +// If is_overflows = true: +const field1 = new Uint8Array([ + 0x01, // Field number 1 (VarUInt) + 0x01, // is_overflows = true (bool as UInt8) +]); +``` + +#### Field 2: `bucket_num` (Int32) + +- **Field Number**: 2 (VarUInt) +- **Type**: `Int32` (4 bytes, little-endian) +- **Minimum Revision**: 0 (always present if `client_revision > 0`) +- **Purpose**: Used in two-level aggregation to indicate which bucket this block's data belongs to. When using two-level aggregation, data with different key groups are scattered across different buckets. This field helps optimize merging during distributed aggregation. +- **Default Value**: -1 (indicates no bucket assignment) + +**Encoding:** +```typescript +// If bucket_num = 5: +const field2 = new Uint8Array([ + 0x02, // Field number 2 (VarUInt) + 0x05, 0x00, 0x00, 0x00, // bucket_num = 5 (Int32, little-endian) +]); +``` + +#### Field 3: `out_of_order_buckets` (vector) + +- **Field Number**: 3 (VarUInt) +- **Type**: `std::vector` (VarUInt size + array of Int32 values) +- **Minimum Revision**: 54480 (`DBMS_MIN_REVISION_WITH_OUT_OF_ORDER_BUCKETS_IN_AGGREGATION`) +- **Purpose**: List of bucket IDs that were delayed by `ConvertingAggregatedToChunksTransform` on the current node. Used for coordinating out-of-order bucket processing in distributed aggregation. +- **Default Value**: Empty vector + +**Encoding:** +```typescript +// Vector encoding: VarUInt(size) + elements +const writeVector = (values: Int32[]) => { + const size = values.length; // VarUInt + const elements = values.flatMap(v => + [v & 0xFF, (v >> 8) & 0xFF, (v >> 16) & 0xFF, (v >> 24) & 0xFF] + ); + return [size, ...elements]; +}; + +// If out_of_order_buckets = [10, 20]: +const field3 = new Uint8Array([ + 0x03, // Field number 3 (VarUInt) + 0x02, // Vector size = 2 (VarUInt) + 0x0A, 0x00, 0x00, 0x00, // Element 0: 10 (Int32, little-endian) + 0x14, 0x00, 0x00, 0x00, // Element 1: 20 (Int32, little-endian) +]); +``` + +### Terminator Byte + +The `BlockInfo` serialization always ends with a terminator byte of `0x00` (VarUInt encoding of 0), indicating no more fields follow. + +### Complete BlockInfo Example + +Here's a complete example of `BlockInfo` with all fields set (hypothetical, as this would only occur in TCP protocol context): + +```typescript +const blockInfo = new Uint8Array([ + // Field 1: is_overflows = false + 0x01, // Field number 1 + 0x00, // false + + // Field 2: bucket_num = 3 + 0x02, // Field number 2 + 0x03, 0x00, 0x00, 0x00, // bucket_num = 3 + + // Field 3: out_of_order_buckets = [5, 7] (only if revision >= 54480) + 0x03, // Field number 3 + 0x02, // Vector size = 2 + 0x05, 0x00, 0x00, 0x00, // 5 + 0x07, 0x00, 0x00, 0x00, // 7 + + // Terminator + 0x00, // End of BlockInfo + + // After BlockInfo, the block dimensions and columns follow: + // 0x01, 0x01, ... // NumColumns, NumRows, ... +]); +``` + +### Reading BlockInfo + +When reading a Native format stream over the TCP protocol with `server_revision > 0`, the reader must: + +1. Read field numbers (VarUInt) in a loop +2. For each non-zero field number, read the corresponding field value based on its known type +3. Stop when field number `0x00` is encountered +4. Then proceed to read the block dimensions (NumColumns, NumRows) and column data + +**Note**: The HTTP Native format never includes `BlockInfo`, so HTTP readers should skip directly to reading NumColumns/NumRows. + +--- + +## 4. Stream Architecture Concepts + +The Native format uses a **stream-based architecture** for complex data types. This is a conceptual model where certain types decompose their binary representation into multiple logical "streams" that are **interleaved in the actual binary output**. Understanding this concept is essential for correctly decoding complex types over HTTP. + +### What are Streams? + +In ClickHouse's serialization architecture, a "stream" is a logical subdivision of a column's binary data. Simple types like `UInt32` or `String` use a single stream (called `Regular`), where all values are written contiguously. Complex types decompose into multiple named streams, each carrying a specific piece of structural information. + +**Important**: When reading Native format over HTTP, streams are not separate physical byte sequences. They are **conceptual divisions** that appear **sequentially interleaved** in the binary output. The serialization process writes each stream's data in a specific order, and you must read them in that exact order. + +### Why Native Uses Streams + +The stream architecture exists because: + +1. **Columnar Storage Optimization**: In ClickHouse's storage engine (MergeTree), these streams map to separate physical files on disk (e.g., `column.bin` for data, `column.size0.bin` for array sizes). This allows selective reading and efficient compression. + +2. **Flexible Composition**: Complex types can nest arbitrarily (e.g., `Array(Array(LowCardinality(String)))`). Each type layer adds its own streams to the path hierarchy without knowing about parent or child types. + +3. **Shared Structures**: Some streams (like null maps in `Nullable` or dictionaries in `LowCardinality`) can be shared across granules or reused, reducing redundancy. + +### How Streams Relate to HTTP Decoding + +When you read Native format over HTTP: + +1. Each column's data is written by calling `serializeBinaryBulkWithMultipleStreams` +2. The serialization code calls a "stream getter" function for each logical stream path +3. All streams write to the **same physical TCP/HTTP output buffer**, one after another +4. Your decoder must understand the **stream order** to parse the bytes correctly + +**Practical Effect**: You don't read from multiple simultaneous streams. Instead, you read **sequential chunks** of binary data, where each chunk corresponds to a logical stream. The order is determined by the type's `enumerateStreams` method. + +### Stream Path Hierarchy + +Each logical stream is identified by a **SubstreamPath**: a sequence of `Substream` components that describe the nesting structure. The path is built up as types compose. + +**Example SubstreamPath representations**: + +| Type | SubstreamPath Components | Description | +|------|-------------------------|-------------| +| `UInt32` | `{Regular}` | Single stream for values | +| `Array(UInt32)` | `{ArraySizes}` | Array sizes (offsets) | +| | `{ArrayElements, Regular}` | Element values | +| `Nullable(String)` | `{NullMap}` | Null indicator bytes | +| | `{NullableElements, Regular}` | String values (including for NULLs) | +| `Array(Nullable(String))` | `{ArraySizes}` | Array offsets | +| | `{ArrayElements, NullMap}` | Null indicators for all elements | +| | `{ArrayElements, NullableElements, Regular}` | String data | +| `LowCardinality(String)` | `{DictionaryKeys}` | Unique dictionary values | +| | `{DictionaryIndexes}` | Row-to-dictionary index mappings | + +### Detailed Stream Examples + +#### Array(UInt32) + +For `Array(UInt32)` with 3 rows: `[1,2]`, `[3]`, `[]`: + +**Streams**: +1. **ArraySizes** (stream path: `{ArraySizes}`): + - Contains array sizes for each row + - Written as `UInt64` values + - Bytes: `02 00 00 00 00 00 00 00` (size 2), `01 00 00 00 00 00 00 00` (size 1), `00 00 00 00 00 00 00 00` (size 0) + +2. **ArrayElements** (stream path: `{ArrayElements, Regular}`): + - Contains flattened element values: 1, 2, 3 + - Written as contiguous `UInt32` values + - Bytes: `01 00 00 00` (1), `02 00 00 00` (2), `03 00 00 00` (3) + +**Binary layout** (streams written sequentially): +``` +[ArraySizes: 24 bytes for 3 offsets] [ArrayElements: 12 bytes for 3 UInt32 values] +``` + +#### LowCardinality(String) + +For `LowCardinality(String)` with 4 rows: `"hello"`, `"world"`, `"hello"`, `"world"`: + +**Streams**: +1. **DictionaryKeys** (stream path: `{DictionaryKeys}`): + - Prefix: version marker (UInt64 in prefix stream) + - Dictionary values: unique strings `"hello"`, `"world"` + +2. **DictionaryIndexes** (stream path: `{DictionaryIndexes}`): + - Serialization type byte (encodes index type: UInt8/16/32/64, flags) + - Additional keys count (if any overflow keys) + - Row count (UInt64) + - Index values: `0, 1, 0, 1` (indexes into dictionary) + +#### Array(LowCardinality(String)) + +For nested types, streams compose hierarchically: + +**Streams** (in order): +1. `{ArraySizes}` - array offsets +2. `{ArrayElements, DictionaryKeys}` - dictionary for all elements +3. `{ArrayElements, DictionaryIndexes}` - indexes for all elements + +The `ArrayElements` prefix indicates these streams belong to the array's elements, allowing proper reconstruction of the nested structure. + +### Reading Order + +When deserializing, you must: + +1. **Follow the enumeration order**: Types define their stream order via `enumerateStreams`. This order is deterministic. + +2. **Read each stream completely**: Don't interleave reads between different streams of the same column. Read one stream, then the next. + +3. **Respect nesting**: For `Array(T)`, first read array sizes, then recursively read the element stream(s) for type `T`. + +4. **Handle prefixes/suffixes**: Some types write metadata in `serializeBinaryBulkStatePrefix` (before data) and `serializeBinaryBulkStateSuffix` (after data). In Native format over HTTP, prefixes typically appear inline with the first granule's data. + +### Stream Interleaving with Multiple Columns + +In a block with multiple columns, the **entire column** (all its streams) is written before the next column: + +``` +Block: + Column 0: [all streams for column 0] + Column 1: [all streams for column 1] + Column 2: [all streams for column 2] +``` + +Within a column, streams appear in the order determined by `enumerateStreams`. + +### Summary for Client Developers + +When decoding Native format over HTTP: + +- **Streams are conceptual, not physical**: You read a single byte stream, but must mentally track which logical stream you're in +- **Order is critical**: Stream order is type-specific and deterministic. Follow the patterns shown above. +- **Nested types compose paths**: Each nesting level adds components to the stream path (e.g., `ArrayElements` → `NullableElements` → `Regular`) +- **Start with simple types**: Implement `Array` and `Nullable` first to understand the pattern, then tackle `LowCardinality`, `Map`, `Tuple`, etc. + +Understanding streams as "logical subdivisions written sequentially" rather than "parallel data channels" is the key mental model for successfully implementing a Native format decoder. + +--- + +## 5. Simple Data Types + +These types have the same columnar encoding as their RowBinary counterparts (values written consecutively for all rows). + +### 5.1 Integer Types + +Integer types are serialized as fixed-width little-endian values. For a column with N rows, N consecutive integer values are written. + +**Supported types:** +- `Int8` / `UInt8`: 1 byte per value +- `Int16` / `UInt16`: 2 bytes per value (little-endian) +- `Int32` / `UInt32`: 4 bytes per value (little-endian) +- `Int64` / `UInt64`: 8 bytes per value (little-endian) +- `Int128` / `UInt128`: 16 bytes per value (little-endian) +- `Int256` / `UInt256`: 32 bytes per value (little-endian) + +**Example: UInt32 single value** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT 42::UInt32 AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0655 496e 7433 322a 0000 ...col.UInt32*.. +00000010: 00 . +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x06, 0x55, 0x49, 0x6e, // Column type = "UInt32" + 0x74, 0x33, 0x32, + 0x2a, 0x00, 0x00, 0x00, // Data: 42 (little-endian UInt32) +]); +``` + +**Example: Multiple rows (columnar)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT number::UInt8 AS col FROM numbers(3)" | xxd +``` + +``` +00000000: 0103 0363 6f6c 0555 496e 7438 0001 02 ...col.UInt8... +``` + +The column data contains three consecutive UInt8 values: `0x00`, `0x01`, `0x02`. + +**Example: Int8 negative value** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT -1::Int8 AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0449 6e74 38ff ...col.Int8. +``` + +The value `-1` is encoded as `0xff` (two's complement representation). + +**Example: Int128 large value** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT 170141183460469231731687303715884105727::Int128 AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0649 6e74 3132 38ff ffff ...col.Int128... +00000010: ffff ffff ffff ffff ffff ffff 7f ............. +``` + +The 16-byte value is little-endian: `0x7fffffffffffffffffffffffffffffff`. + +**Example: UInt256** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT 1::UInt256 AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0755 496e 7432 3536 0100 ...col.UInt256.. +00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................ +00000020: 0000 0000 0000 0000 0000 0000 00 ............. +``` + +The 32-byte value is little-endian with the value `1` in the first byte. + +### 5.2 Floating Point Types + +Floating point types use IEEE 754 standard encoding in little-endian byte order. + +**Supported types:** +- `Float32`: 4 bytes, IEEE 754 single-precision +- `Float64`: 8 bytes, IEEE 754 double-precision +- `BFloat16`: 2 bytes, brain floating point (16-bit truncation of Float32) + +**Example: Float32** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT 1.25::Float32 AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0746 6c6f 6174 3332 0000 ...col.Float32.. +00000010: a03f .? +``` + +The value `1.25` is encoded as `0x3fa00000` (little-endian IEEE 754). + +**Example: Float64** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT 3.14159265358979::Float64 AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0746 6c6f 6174 3634 112d ...col.Float64.- +00000010: 4454 fb21 0940 DT.!.@ +``` + +The value π is encoded as `0x400921fb54442d11` (little-endian IEEE 754 double). + +**Example: BFloat16** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT 1.25::BFloat16 AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0842 466c 6f61 7431 36a0 ...col.BFloat16. +00000010: 3f ? +``` + +`BFloat16` stores the top 16 bits of the Float32 representation: `0x3fa0` (little-endian). This corresponds to the first 2 bytes of the Float32 encoding `0x3fa00000`, reversed to little-endian order. + +### 5.3 Bool + +The `Bool` type is encoded as a single byte per value: +- `0x00` = false +- `0x01` = true + +**Example: true** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT true::Bool AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0442 6f6f 6c01 ...col.Bool. +``` + +**Example: false** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT false::Bool AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0442 6f6f 6c00 ...col.Bool. +``` + +### 5.4 String & FixedString + +**String:** Each string value is encoded as a VarUInt length followed by the UTF-8 bytes. + +- **VarUInt length**: Number of bytes in the string (LEB128 encoding) +- **Bytes**: Raw UTF-8 data + +**Example: String** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT 'hello'::String AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0653 7472 696e 6705 6865 ...col.String.he +00000010: 6c6c 6f llo +``` + +The string `"hello"` is encoded as: +- `0x05`: Length = 5 bytes (VarUInt) +- `0x68 0x65 0x6c 0x6c 0x6f`: "hello" in UTF-8 + +**Example: Empty string** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT ''::String AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0653 7472 696e 6700 ...col.String. +``` + +Empty string is encoded as `0x00` (VarUInt length = 0). + +**Example: Multiple strings (columnar)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT arrayJoin(['hello', 'world']) AS col" | xxd +``` + +``` +00000000: 0102 0363 6f6c 0653 7472 696e 6705 6865 ...col.String.he +00000010: 6c6c 6f05 776f 726c 64 llo.world +``` + +Each string is serialized consecutively: +- Row 0: `0x05` + "hello" +- Row 1: `0x05` + "world" + +**Example: Long string with multi-byte VarUInt** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT repeat('x', 300) AS col" | xxd | head -5 +``` + +``` +00000000: 0101 0363 6f6c 0653 7472 696e 67ac 0278 ...col.String..x +00000010: 7878 7878 7878 7878 7878 7878 7878 7878 xxxxxxxxxxxxxxxx +00000020: 7878 7878 7878 7878 7878 7878 7878 7878 xxxxxxxxxxxxxxxx +00000030: 7878 7878 7878 7878 7878 7878 7878 7878 xxxxxxxxxxxxxxxx +00000040: 7878 7878 7878 7878 7878 7878 7878 7878 xxxxxxxxxxxxxxxx +``` + +The length is encoded as `0xac 0x02`: +- Byte 1: `0xac` = `0b10101100` → 7 data bits = `0b0101100`, continue bit set +- Byte 2: `0x02` = `0b00000010` → 7 data bits = `0b0000010`, continue bit not set +- Value = `(0b0101100) | (0b0000010 << 7)` = `44 | 256` = `300` + +**FixedString(N):** Each value occupies exactly N bytes, zero-padded if the string is shorter. + +**Example: FixedString(5)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT 'abc'::FixedString(5) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0e46 6978 6564 5374 7269 ...col.FixedStri +00000010: 6e67 2835 2961 6263 0000 ng(5)abc.. +``` + +The value `"abc"` is padded to 5 bytes: `0x61 0x62 0x63 0x00 0x00`. + +### 5.5 Date & Time Types + +Date and time types are encoded as integer values representing specific time units since their respective epochs. + +**Date:** UInt16 representing days since `1970-01-01`. + +**Example: Date** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDate('2023-12-25') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0444 6174 6504 4d ...col.Date.M +``` + +The value is `0x4d04` (little-endian UInt16) = 19716 days since 1970-01-01. + +**Example: Date epoch** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDate('1970-01-01') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0444 6174 6500 00 ...col.Date.. +``` + +Epoch value is `0x0000` (0 days). + +--- + +**Date32:** Int32 representing days since `1900-01-01` (supports dates before 1970). + +**Example: Date32** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDate32('2023-12-25') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0644 6174 6533 3204 4d00 ...col.Date32.M. +00000010: 00 . +``` + +The value is `0x00004d04` (little-endian Int32) = 45284 days since 1900-01-01. + +**Example: Date32 epoch (1900-01-01)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDate32('1900-01-01') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0644 6174 6533 3221 9cff ...col.Date32!.. +00000010: ff . +``` + +The value is `0xffff9c21` (little-endian Int32) = -25567 days. Despite the documentation saying "days since 1900-01-01", the actual epoch is 1970-01-01 (value 0). Dates before 1970 are represented as negative values. The date 1900-01-01 is -25567 days before 1970-01-01. + +--- + +**DateTime:** UInt32 representing seconds since Unix epoch (`1970-01-01 00:00:00 UTC`). + +**Example: DateTime** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDateTime('2023-12-25 10:30:45') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0844 6174 6554 696d 65d5 ...col.DateTime. +00000010: 5989 65 Y.e +``` + +The value is `0x658959d5` (little-endian UInt32) = 1703502645 seconds since epoch. + +**Example: DateTime epoch** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDateTime(0) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0844 6174 6554 696d 6500 ...col.DateTime. +00000010: 0000 00 ... +``` + +Epoch value is `0x00000000`. + +--- + +**DateTime64(P):** Int64 representing time units since epoch with precision P (0-9). +- Precision 0: seconds +- Precision 3: milliseconds +- Precision 6: microseconds +- Precision 9: nanoseconds + +The type string includes the precision: `DateTime64(3)`. + +**Example: DateTime64(3) - milliseconds** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDateTime64('2023-12-25 10:30:45.123', 3) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0d44 6174 6554 696d 6536 ...col.DateTime6 +00000010: 3428 3329 83e8 86a0 8c01 0000 4(3)........ +``` + +The value is `0x00000001a086e883` (little-endian Int64) = 1703502645123 milliseconds since epoch. + +**Example: DateTime64 epoch** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDateTime64(0, 3) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0d44 6174 6554 696d 6536 ...col.DateTime6 +00000010: 3428 3329 0000 0000 0000 0000 4(3)........ +``` + +Epoch value is `0x0000000000000000`. + +### 5.6 Decimal Types + +Decimal types store fixed-point numbers as little-endian integers scaled by a power of 10. The scale (number of decimal places) is encoded in the type string. + +**Supported types:** +- `Decimal32(S)` / `Decimal(P, S)` where P ≤ 9: 4 bytes (Int32) +- `Decimal64(S)` / `Decimal(P, S)` where P ≤ 18: 8 bytes (Int64) +- `Decimal128(S)` / `Decimal(P, S)` where P ≤ 38: 16 bytes (Int128) +- `Decimal256(S)` / `Decimal(P, S)` where P ≤ 76: 32 bytes (Int256) + +The stored value equals the decimal number multiplied by 10^S. + +**Example: Decimal32(2)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDecimal32(123.45, 2) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0d44 6563 696d 616c 2839 ...col.Decimal(9 +00000010: 2c20 3229 3090 0000 , 2)0... +``` + +The type string in the wire format is always `Decimal(P, S)` notation (e.g., `Decimal(9, 2)`), even when the SQL uses `Decimal32(S)`. The precision 9 indicates this is a `Decimal32` (P ≤ 9 → 4 bytes). + +The value `123.45` is stored as `12345` (123.45 × 10^2). +Encoded as `0x00003090` (little-endian Int32) = 12345. + +**Example: Decimal64(4)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDecimal64(123.45, 4) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0e44 6563 696d 616c 2831 ...col.Decimal(1 +00000010: 382c 2034 2944 d612 0000 0000 00 8, 4)D....... +``` + +The type string is `Decimal(18, 4)`. +The value `123.45` is stored as `1234500` (123.45 × 10^4). +Encoded as `0x0000000012d644` (little-endian Int64) = 1234500. + +**Example: Decimal128(6)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDecimal128(123.45, 6) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0e44 6563 696d 616c 2833 ...col.Decimal(3 +00000010: 382c 2036 2990 b25b 0700 0000 0000 0000 8, 6)..[........ +00000020: 0000 0000 00 ..... +``` + +The type string is `Decimal(38, 6)`. +The value `123.45` is stored as `123450000` (123.45 × 10^6). +Encoded as `0x00000000000000000000000007 5bb290` (little-endian Int128) = 123450000. + +**Example: Decimal256(8)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toDecimal256(123.45, 8) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0e44 6563 696d 616c 2837 ...col.Decimal(7 +00000010: 362c 2038 2940 c0d1 df02 0000 0000 0000 6, 8)@.......... +00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ................ +00000030: 0000 0000 00 ..... +``` + +The type string is `Decimal(76, 8)`. +The value `123.45` is stored as `12345000000` (123.45 × 10^8). +Encoded as 32-byte little-endian Int256 = 12345000000. + +### 5.7 UUID + +`UUID` values are encoded as 16 bytes in a specific byte order that differs from the standard UUID string representation. + +ClickHouse stores UUIDs in a format optimized for indexing, where bytes are reordered from the standard UUID string format `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`. + +**Byte order:** The UUID is stored with the first and second groups swapped compared to RFC 4122 string representation. + +**Example: UUID** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toUUID('61f0c404-5cb3-11e7-907b-a6006ad3dba0') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0455 5549 44e7 11b3 5c04 ...col.UUID...\. +00000010: c4f0 61a0 dbd3 6a00 a67b 90 ..a...j..{. +``` + +The UUID string `61f0c404-5cb3-11e7-907b-a6006ad3dba0` is encoded as: + +``` +e7 11 b3 5c 04 c4 f0 61 a0 db d3 6a 00 a6 7b 90 +``` + +This represents the bytes reordered as: +- Bytes 6-7 of string (`11e7`) → bytes 0-1 of storage +- Bytes 4-5 of string (`5cb3`) → bytes 2-3 of storage +- Bytes 0-3 of string (`61f0c404`) → bytes 4-7 of storage +- Bytes 8-15 of string (`907ba6006ad3dba0`) → bytes 8-15 of storage + +The reordering optimizes sorting and range queries on time-based UUIDs (version 1). + +### 5.8 IPv4 & IPv6 + +**IPv4:** Stored as 4 bytes (UInt32) in reversed byte order from the standard dotted-decimal notation. + +**Example: IPv4** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toIPv4('192.168.1.1') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0449 5076 3401 01a8 c0 ...col.IPv4.... +``` + +The address `192.168.1.1` is encoded as `0x0101a8c0`: +- Byte 0: `0x01` (last octet: 1) +- Byte 1: `0x01` (third octet: 1) +- Byte 2: `0xa8` (second octet: 168) +- Byte 3: `0xc0` (first octet: 192) + +This is little-endian UInt32 interpretation of the reversed IP bytes. + +**Example: IPv4 localhost** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toIPv4('127.0.0.1') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0449 5076 3401 0000 7f ...col.IPv4.... +``` + +The address `127.0.0.1` is encoded as `0x0100007f` (reversed bytes). + +**Example: Multiple IPv4 values** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toIPv4('192.168.' || toString(number) || '.1') AS col FROM numbers(3)" | xxd +``` + +``` +00000000: 0103 0363 6f6c 0449 5076 3401 00a8 c001 ...col.IPv4..... +00000010: 01a8 c001 02a8 c0 ....... +``` + +Three consecutive 4-byte values for `192.168.0.1`, `192.168.1.1`, `192.168.2.1`. + +--- + +**IPv6:** Stored as 16 bytes in network byte order (big-endian). + +**Example: IPv6** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toIPv6('2001:db8::1') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0449 5076 3620 010d b800 ...col.IPv6 .... +00000010: 0000 0000 0000 0000 0000 01 ........... +``` + +The address `2001:db8::1` is encoded as 16 bytes in network byte order: +``` +20 01 0d b8 00 00 00 00 00 00 00 00 00 00 00 01 +``` + +**Example: IPv6 localhost** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toIPv6('::1') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0449 5076 3600 0000 0000 ...col.IPv6..... +00000010: 0000 0000 0000 0000 0000 01 ........... +``` + +The IPv6 loopback address `::1` is 15 zero bytes followed by `0x01`. + +### 5.9 Enum8 & Enum16 + +Enum types are encoded as integer values representing the enum value's assigned number. The enum definition is included in the type string. + +**Enum8:** Stored as 1 byte (Int8) per value. + +**Example: Enum8** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT CAST('a', 'Enum8(''a''=1, ''b''=2)') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 1745 6e75 6d38 2827 6127 ...col.Enum8('a' +00000010: 203d 2031 2c20 2762 2720 3d20 3229 01 = 1, 'b' = 2). +``` + +Breakdown: +- Type string: `Enum8('a' = 1, 'b' = 2)` (23 bytes) +- Data: `0x01` (the assigned value for 'a') + +The enum value `'a'` is encoded as its assigned integer `1`. + +--- + +**Enum16:** Stored as 2 bytes (Int16, little-endian) per value. + +**Example: Enum16** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT CAST('hello', 'Enum16(''hello''=1000, ''world''=2000)') AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 2645 6e75 6d31 3628 2768 ...col&Enum16('h +00000010: 656c 6c6f 2720 3d20 3130 3030 2c20 2777 ello' = 1000, 'w +00000020: 6f72 6c64 2720 3d20 3230 3030 29e8 03 orld' = 2000).. +``` + +Breakdown: +- Type string: `Enum16('hello' = 1000, 'world' = 2000)` (38 bytes) +- Data: `0x03e8` (little-endian Int16) = 1000 + +The enum value `'hello'` is encoded as its assigned integer `1000` in little-endian format. + +--- + +## 6. Complex Data Types (Native-Specific) + +### 6.1 Nullable + +`Nullable(T)` wraps any type `T` to allow `NULL` values. It uses a two-stream architecture: + +1. **NullMap Stream**: Null indicator bytes (1 byte per row) +2. **NullableElements Stream**: The nested type's data (all rows, including NULLs) + +**Stream Layout:** + +The serialization writes two streams sequentially: + +``` +[NullMap: N bytes] [NestedData: serialized as type T] +``` + +**Null Map Encoding:** + +For N rows, N consecutive bytes are written: +- `0x00` = non-NULL (value is valid) +- `0x01` = NULL (value should be ignored) + +**Important**: Even for NULL rows, the nested type still writes placeholder data. When deserializing, if the null map indicates NULL, the corresponding value in the nested data should be ignored and replaced with NULL. + +**Example: Nullable(UInt64) with mixed NULL values** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT if(number % 2 = 0, number, NULL)::Nullable(UInt64) AS col FROM numbers(5)" | xxd +``` + +``` +00000000: 0105 0363 6f6c 104e 756c 6c61 626c 6528 ...col.Nullable( +00000010: 5549 6e74 3634 2900 0100 0100 0000 0000 UInt64)......... +00000020: 0000 0000 0100 0000 0000 0000 0200 0000 ................ +00000030: 0000 0000 0300 0000 0000 0000 0400 0000 ................ +00000040: 0000 0000 .... +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x05, // NumRows = 5 + + // Column metadata + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (String: length=3) + 0x10, 0x4e, 0x75, 0x6c, // Column type = "Nullable(UInt64)" (String: length=16) + 0x6c, 0x61, 0x62, 0x6c, + 0x65, 0x28, 0x55, 0x49, + 0x6e, 0x74, 0x36, 0x34, + 0x29, + + // Stream 1: NullMap (5 bytes for 5 rows) + 0x00, // Row 0: not NULL (value = 0) + 0x01, // Row 1: NULL + 0x00, // Row 2: not NULL (value = 2) + 0x01, // Row 3: NULL + 0x00, // Row 4: not NULL (value = 4) + + // Stream 2: NullableElements (5 x UInt64 values) + // Note: All 5 values are written, even for NULL rows + 0x00, 0x00, 0x00, 0x00, // Row 0: 0 (valid, NullMap[0]=0x00) + 0x00, 0x00, 0x00, 0x00, + 0x01, 0x00, 0x00, 0x00, // Row 1: 1 (IGNORED, NullMap[1]=0x01) + 0x00, 0x00, 0x00, 0x00, + 0x02, 0x00, 0x00, 0x00, // Row 2: 2 (valid, NullMap[2]=0x00) + 0x00, 0x00, 0x00, 0x00, + 0x03, 0x00, 0x00, 0x00, // Row 3: 3 (IGNORED, NullMap[3]=0x01) + 0x00, 0x00, 0x00, 0x00, + 0x04, 0x00, 0x00, 0x00, // Row 4: 4 (valid, NullMap[4]=0x00) + 0x00, 0x00, 0x00, 0x00, +]); + +// Resulting column values: [0, NULL, 2, NULL, 4] +``` + +**Example: All NULL values** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT NULL::Nullable(UInt8) AS col FROM numbers(3)" | xxd +``` + +``` +00000000: 0103 0363 6f6c 0f4e 756c 6c61 626c 6528 ...col.Nullable( +00000010: 5549 6e74 3829 0101 0100 0000 UInt8)...... +``` + +Breakdown: +- NullMap: `0x01 0x01 0x01` (all three rows are NULL) +- Data: `0x00 0x00 0x00` (placeholder UInt8 values, all ignored) + +**Example: Nullable(String)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT if(number = 1, NULL, toString(number))::Nullable(String) AS col FROM numbers(3)" | xxd +``` + +``` +00000000: 0103 0363 6f6c 0f4e 756c 6c61 626c 6528 ...col.Nullable( +00000010: 5374 7269 6e67 2900 0100 0130 0030 0132 String)....0.0.2 +``` + +Breakdown: +- NullMap: `0x00 0x01 0x00` (rows 0 and 2 are valid, row 1 is NULL) +- Data: + - Row 0: `0x01 0x30` (String "0", valid) + - Row 1: `0x00` (empty string placeholder, IGNORED) + - Row 2: `0x01 0x32` (String "2", valid) + +**Nested Nullable Types:** + +For complex types like `Array(Nullable(T))`, the stream paths compose hierarchically: +- `{ArraySizes}` - array offsets +- `{ArrayElements, NullMap}` - null indicators for all array elements +- `{ArrayElements, NullableElements, Regular}` - element values + +### 6.2 LowCardinality + +`LowCardinality(T)` is a dictionary-encoded column type that stores unique values once in a dictionary and uses integer indexes to reference them. This dramatically reduces storage and memory usage for columns with low cardinality (few distinct values). + +**Stream Architecture:** + +`LowCardinality` uses two main streams that are written sequentially: + +1. **DictionaryKeys Stream**: Dictionary metadata (version) +2. **DictionaryIndexes Stream**: Index metadata, dictionary values (as "additional keys"), and row indexes + +**Complete Binary Structure:** + +For Native format over HTTP, the typical structure is: + +``` +// DictionaryKeys Stream: +[KeysVersion (UInt64, little-endian)] + +// DictionaryIndexes Stream: +[IndexesSerializationType (UInt64, little-endian)] +[NumAdditionalKeys (UInt64, little-endian)] (if HasAdditionalKeysBit flag is set) +[AdditionalKeys data] (if HasAdditionalKeysBit flag is set) +[NumRows (UInt64, little-endian)] +[Index values: N integers] +``` + +**DictionaryKeys Stream:** + +- **KeysVersion** (UInt64, little-endian): Serialization version + - Always `0x01` = `SharedDictionariesWithAdditionalKeys` + +In Native format over HTTP, the dictionary is typically sent as "additional keys" in the DictionaryIndexes stream rather than in a global dictionary here, because `low_cardinality_use_single_dictionary_for_part` is enabled by default. + +**DictionaryIndexes Stream:** + +**1. IndexesSerializationType (UInt64, little-endian):** + +Encodes the index integer type and flags: + +**Bits 0-7 (Index Type after flag masking):** +- `0x00` = UInt8 indexes +- `0x01` = UInt16 indexes +- `0x02` = UInt32 indexes +- `0x03` = UInt64 indexes + +**Flag Bits:** +- Bit 8 (`0x0100`): `NeedGlobalDictionaryBit` - Read dictionary from separate global dictionary (0 for Native HTTP) +- Bit 9 (`0x0200`): `HasAdditionalKeysBit` - Additional keys follow +- Bit 10 (`0x0400`): `NeedUpdateDictionary` - Dictionary changed (typically 1 for Native HTTP) + +**Decoding:** + +The `IndexesSerializationType` is a UInt64 where: +- Bits 0-7 contain the index type (0-3) +- Bits 8-10 contain the flags + +```cpp +uint64_t value = read_uint64_le(); +uint8_t index_type = value & 0xFF; // Extract bits 0-7 (type is 0-3) +bool has_additional_keys = (value >> 9) & 1; // Bit 9 +bool need_global_dictionary = (value >> 8) & 1; // Bit 8 +bool need_update_dictionary = (value >> 10) & 1; // Bit 10 +``` + +Note: The index type uses only bits 0-1 (values 0-3), but the full byte is reserved. + +**2. Additional Keys (conditional):** + +If `HasAdditionalKeysBit` is set: + +- **NumAdditionalKeys** (UInt64, little-endian): Number of dictionary entries +- **Keys Data**: Serialized using inner type's bulk serialization + - For `String`: sequence of (VarUInt length + UTF-8 bytes) + - For `UInt32`: sequence of UInt32 values + - etc. + +**3. Row Indexes:** + +- **NumRows** (UInt64, little-endian): Number of rows +- **Index Values**: NumRows consecutive integers (type from IndexesSerializationType) + - Each index references a position in the dictionary + - Index semantics depend on flags: + - If `!need_global_dictionary`: index directly into additional_keys + - If `need_global_dictionary`: index < dict_size → global dict, index >= dict_size → additional_keys[index - dict_size] + +**Example: LowCardinality(String) with repeated values** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toLowCardinality(toString(number % 3)) AS col FROM numbers(6)" | xxd +``` + +``` +00000000: 0106 0363 6f6c 164c 6f77 4361 7264 696e ...col.LowCardin +00000010: 616c 6974 7928 5374 7269 6e67 2901 0000 ality(String)... +00000020: 0000 0000 0000 0600 0000 0000 0004 0000 ................ +00000030: 0000 0000 0000 0130 0131 0132 0600 0000 .......0.1.2.... +00000040: 0000 0000 0102 0301 0203 .......... +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + // Block header + 0x01, // NumColumns = 1 + 0x06, // NumRows = 6 + + // Column metadata + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x16, 0x4c, 0x6f, 0x77, 0x43, 0x61, 0x72, 0x64, 0x69, // Type = "LowCardinality(String)" + 0x6e, 0x61, 0x6c, 0x69, 0x74, 0x79, 0x28, 0x53, + 0x74, 0x72, 0x69, 0x6e, 0x67, 0x29, + + // DictionaryKeys stream: + 0x01, 0x00, 0x00, 0x00, // KeysVersion = 1 (UInt64) + 0x00, 0x00, 0x00, 0x00, + + // DictionaryIndexes stream: + 0x06, 0x00, 0x00, 0x00, // IndexesSerializationType = 0x06 (UInt64) + 0x00, 0x00, 0x00, 0x00, // Exact interpretation requires flag decoding + // Empirically: appears to encode type + flags + + 0x04, 0x00, 0x00, 0x00, // NumAdditionalKeys = 4 (UInt64) + 0x00, 0x00, 0x00, 0x00, // Note: 4 keys for 3 distinct values suggests + // index 0 may be reserved (e.g., for NULL) + + // Additional keys (String values): + 0x00, // Key[0]: empty string (length=0) + 0x01, 0x30, // Key[1]: "0" (length=1, data=0x30) + 0x01, 0x31, // Key[2]: "1" (length=1, data=0x31) + 0x01, 0x32, // Key[3]: "2" (length=1, data=0x32) + + 0x06, 0x00, 0x00, 0x00, // NumRows = 6 (UInt64) + 0x00, 0x00, 0x00, 0x00, + + // Row indexes (6 values): + 0x01, // Row 0: index 1 → "0" (number % 3 = 0) + 0x02, // Row 1: index 2 → "1" (number % 3 = 1) + 0x03, // Row 2: index 3 → "2" (number % 3 = 2) + 0x01, // Row 3: index 1 → "0" (number % 3 = 0) + 0x02, // Row 4: index 2 → "1" (number % 3 = 1) + 0x03, // Row 5: index 3 → "2" (number % 3 = 2) +]); + +// Result: column values = ["0", "1", "2", "0", "1", "2"] +``` + +**Key Observations:** + +1. **Index 0 Convention**: The dictionary often includes a 0-index entry (here, an empty string) that may serve as a default or NULL placeholder. Actual data indexes start at 1. + +2. **Dictionary Compression**: Only 3 distinct values ("0", "1", "2") are stored, even though there are 6 rows. The indexes array (6 bytes) plus dictionary (6 bytes total for keys) is much smaller than 6 separate string serializations would be. + +3. **Index Type Selection**: The index type (UInt8/16/32/64) is chosen based on the dictionary size to minimize space. + +**Example: Simple LowCardinality(String) without repetition** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT toLowCardinality(toString(number)) AS col FROM numbers(3)" | xxd +``` + +``` +00000000: 0103 0363 6f6c 164c 6f77 4361 7264 696e ...col.LowCardin +00000010: 616c 6974 7928 5374 7269 6e67 2901 0000 ality(String)... +00000020: 0000 0000 0000 0600 0000 0000 0004 0000 ................ +00000030: 0000 0000 0000 0130 0131 0132 0300 0000 .......0.1.2.... +00000040: 0000 0000 0102 03 .... +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, 0x03, // 1 column, 3 rows + + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x16, /* "LowCardinality(String)" */, + + // DictionaryKeys: + 0x01, 0x00, 0x00, 0x00, // KeysVersion = 1 + 0x00, 0x00, 0x00, 0x00, + + // DictionaryIndexes: + 0x06, 0x00, 0x00, 0x00, // IndexesSerializationType + 0x00, 0x00, 0x00, 0x00, + + 0x04, 0x00, 0x00, 0x00, // NumAdditionalKeys = 4 + 0x00, 0x00, 0x00, 0x00, + + 0x00, // Key[0]: "" (empty/default) + 0x01, 0x30, // Key[1]: "0" + 0x01, 0x31, // Key[2]: "1" + 0x01, 0x32, // Key[3]: "2" + + 0x03, 0x00, 0x00, 0x00, // NumRows = 3 + 0x00, 0x00, 0x00, 0x00, + + 0x01, 0x02, 0x03, // Indexes: [1, 2, 3] → ["0", "1", "2"] +]); +``` + +**Nested LowCardinality:** + +For complex types like `Array(LowCardinality(String))`, the stream paths compose: +- `{ArraySizes}` - array offsets +- `{ArrayElements, DictionaryKeys}` - dictionary for all array elements +- `{ArrayElements, DictionaryIndexes}` - indexes for all array elements + +**Additional Keys vs. Global Dictionary:** + +- **Global Dictionary Mode** (`need_global_dictionary=true`): Dictionary is shared across granules, written once in DictionaryKeys stream or at end in suffix +- **Additional Keys Mode** (`has_additional_keys=true`): Dictionary sent inline with indexes stream, common in Native HTTP format + +For most queries, Native HTTP format uses additional keys mode, as each HTTP response is self-contained. + +### 6.3 Array + +The `Array` type uses a two-stream architecture to serialize variable-length arrays efficiently in columnar format: + +1. **ArraySizes stream**: Contains the size of each array (serialized as `UInt64` per row) +2. **ArrayElements stream**: Contains all elements from all arrays, flattened consecutively + +This design allows the array structure (sizes) to be separated from the element data, enabling efficient compression and selective reading in storage engines. + +**Stream Enumeration Order:** + +For `Array(T)`, the streams appear in this order: +1. Stream path `{ArraySizes}`: Array sizes (one `UInt64` per row) +2. Stream path `{ArrayElements, ...}`: Element data (recursively serialized according to type `T`) + +**Serialization Details:** + +The array sizes are written as cumulative offsets in the Native format. Each offset is a `UInt64` value in little-endian format that represents the total number of elements from all arrays up to and including the current row. + +The element data follows immediately after all array sizes, with all elements from all arrays written consecutively. For example, if you have 3 arrays `[1,2]`, `[3]`, `[]`, the offsets would be `2, 3, 3` and the element stream would contain just `1, 2, 3` (3 elements total). + +**Example: Simple Array(UInt32)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT [1, 2, 3]::Array(UInt32) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0d41 7272 6179 2855 496e ...col.Array(UIn +00000010: 7433 3229 0300 0000 0000 0000 0100 0000 t32)............ +00000020: 0200 0000 0300 0000 ........ +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 (VarUInt) + 0x01, // NumRows = 1 (VarUInt) + + // Column metadata: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (String: length=3) + 0x0d, 0x41, 0x72, 0x72, // Column type = "Array(UInt32)" (String: length=13) + 0x61, 0x79, 0x28, 0x55, + 0x49, 0x6e, 0x74, 0x33, + 0x32, 0x29, + + // ArraySizes stream (1 array with 3 elements): + 0x03, 0x00, 0x00, 0x00, // Array offset: 3 (UInt64, little-endian) + 0x00, 0x00, 0x00, 0x00, + + // ArrayElements stream (3 UInt32 values): + 0x01, 0x00, 0x00, 0x00, // Element 0: 1 (UInt32) + 0x02, 0x00, 0x00, 0x00, // Element 1: 2 (UInt32) + 0x03, 0x00, 0x00, 0x00, // Element 2: 3 (UInt32) +]); +``` + +**Example: Multiple arrays with varying sizes** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT arrayJoin([[1,2], [3], []]) AS col" | xxd +``` + +``` +00000000: 0103 0363 6f6c 0c41 7272 6179 2855 496e ...col.Array(UIn +00000010: 7438 2902 0000 0000 0000 0003 0000 0000 t8)............. +00000020: 0000 0003 0000 0000 0000 0001 0203 .............. +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x03, // NumRows = 3 + + // Column metadata: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x0c, 0x41, 0x72, 0x72, // Column type = "Array(UInt8)" (length=12) + 0x61, 0x79, 0x28, 0x55, + 0x49, 0x6e, 0x74, 0x38, + 0x29, + + // ArraySizes stream (3 arrays): + 0x02, 0x00, 0x00, 0x00, // Row 0: offset = 2 (cumulative: 2 elements) + 0x00, 0x00, 0x00, 0x00, + 0x03, 0x00, 0x00, 0x00, // Row 1: offset = 3 (cumulative: 3 elements total) + 0x00, 0x00, 0x00, 0x00, + 0x03, 0x00, 0x00, 0x00, // Row 2: offset = 3 (cumulative: 3 elements total) + 0x00, 0x00, 0x00, 0x00, + + // ArrayElements stream (3 total elements: 1, 2, 3): + 0x01, // Element from row 0: 1 (UInt8) + 0x02, // Element from row 0: 2 (UInt8) + 0x03, // Element from row 1: 3 (UInt8) + // Row 2 has no elements (empty array) +]); +``` + +**Understanding Cumulative Offsets:** + +In the Native format, the `ArraySizes` stream contains cumulative offsets, not individual sizes. The offset for row `i` indicates the total number of elements from all arrays in rows 0 through `i`. + +In the example above: +- Row 0: size = 2 - 0 = 2 elements (array `[1, 2]`) +- Row 1: size = 3 - 2 = 1 element (array `[3]`) +- Row 2: size = 3 - 3 = 0 elements (array `[]`) + +**Nested Arrays:** + +For nested arrays like `Array(Array(T))`, the structure repeats hierarchically: +- Outer `ArraySizes` stream: cumulative offsets for the outer arrays +- Inner `ArraySizes` stream: cumulative offsets for all inner arrays (flattened) +- `ArrayElements` stream: all leaf elements (flattened) + +The stream paths would be: +1. `{ArraySizes}` - outer array offsets +2. `{ArrayElements, ArraySizes}` - inner array offsets +3. `{ArrayElements, ArrayElements, Regular}` - element data + +### 6.4 Map + +The `Map(K, V)` type is internally represented as `Array(Tuple(K, V))` where each map entry is a key-value pair stored as a tuple. This means `Map` inherits the same stream structure as `Array`, but with nested streams for the tuple elements. + +**Stream Enumeration Order:** + +For `Map(K, V)`, the streams appear in this order: +1. Stream path `{ArraySizes}`: Map sizes (number of key-value pairs per row) +2. Stream path `{ArrayElements, TupleElement0, ...}`: All keys, serialized according to type `K` +3. Stream path `{ArrayElements, TupleElement1, ...}`: All values, serialized according to type `V` + +**Serialization Process:** + +1. The number of key-value pairs in each map is written to the `ArraySizes` stream (as cumulative offsets) +2. All keys from all maps are flattened and written consecutively +3. All values from all maps are flattened and written consecutively + +**Example: Map(String, UInt32)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT map('a', 1, 'b', 2)::Map(String, UInt32) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 134d 6170 2853 7472 696e ...col.Map(Strin +00000010: 672c 2055 496e 7433 3229 0200 0000 0000 g, UInt32)...... +00000020: 0000 0161 0162 0100 0000 0200 0000 ...a.b........ +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column metadata: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x13, 0x4d, 0x61, 0x70, // Column type = "Map(String, UInt32)" (length=19) + 0x28, 0x53, 0x74, 0x72, + 0x69, 0x6e, 0x67, 0x2c, + 0x20, 0x55, 0x49, 0x6e, + 0x74, 0x33, 0x32, 0x29, + + // ArraySizes stream (1 map with 2 entries): + 0x02, 0x00, 0x00, 0x00, // Map offset: 2 key-value pairs (UInt64) + 0x00, 0x00, 0x00, 0x00, + + // TupleElement0 stream (Keys - 2 String values): + 0x01, 0x61, // Key 0: "a" (String: length=1) + 0x01, 0x62, // Key 1: "b" (String: length=1) + + // TupleElement1 stream (Values - 2 UInt32 values): + 0x01, 0x00, 0x00, 0x00, // Value 0: 1 (UInt32) + 0x02, 0x00, 0x00, 0x00, // Value 1: 2 (UInt32) +]); +``` + +**Stream Path Hierarchy:** + +``` +Map(String, UInt32) +└── ArraySizes [2] (one entry with offset 2) + └── ArrayElements (Tuple) + ├── TupleElement0 (Keys) ["a", "b"] + └── TupleElement1 (Values) [1, 2] +``` + +**Empty Maps:** + +An empty map contributes no keys or values to the element streams. Only the offset in the `ArraySizes` stream reflects its presence (the offset doesn't increase from the previous row). + +### 6.5 Tuple + +The `Tuple` type serializes each element sequentially in separate streams. Unlike `Array` or `Map`, there is no size metadata because tuples have a fixed number of elements known at schema definition time. + +**Stream Enumeration Order:** + +For `Tuple(T1, T2, ..., Tn)`, each element is serialized in its own stream path: +1. Stream path `{TupleElement0, ...}`: First element, serialized according to type `T1` +2. Stream path `{TupleElement1, ...}`: Second element, serialized according to type `T2` +3. ... +n. Stream path `{TupleElement(n-1), ...}`: Nth element, serialized according to type `Tn` + +**Serialization Details:** + +Each tuple element is serialized as if it were an independent column. For a block with N rows, each element stream contains N values of its respective type. The element streams are written consecutively in the order of the tuple elements. + +There is no explicit tuple structure metadata in the wire format - the tuple structure is defined by the type string, and each element is simply serialized in order. + +**Example: Tuple(UInt32, String)** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT (42, 'hello')::Tuple(UInt32, String) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 1554 7570 6c65 2855 496e ...col.Tuple(UIn +00000010: 7433 322c 2053 7472 696e 6729 2a00 0000 t32, String)*... +00000020: 0568 656c 6c6f .hello +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column metadata: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x15, 0x54, 0x75, 0x70, // Column type = "Tuple(UInt32, String)" (length=21) + 0x6c, 0x65, 0x28, 0x55, + 0x49, 0x6e, 0x74, 0x33, + 0x32, 0x2c, 0x20, 0x53, + 0x74, 0x72, 0x69, 0x6e, + 0x67, 0x29, + + // TupleElement0 stream (UInt32): + 0x2a, 0x00, 0x00, 0x00, // Element 0: 42 (UInt32) + + // TupleElement1 stream (String): + 0x05, 0x68, 0x65, 0x6c, // Element 1: "hello" (String: length=5) + 0x6c, 0x6f, +]); +``` + +**Multiple Rows:** + +For multiple rows, each element stream contains all values for that element position across all rows: + +```sql +SELECT arrayJoin([(1, 'a'), (2, 'b'), (3, 'c')])::Tuple(UInt8, String) AS col +``` + +Would serialize as: +- `TupleElement0` stream: `[1, 2, 3]` (3 UInt8 values) +- `TupleElement1` stream: `['a', 'b', 'c']` (3 String values) + +**Nested Tuples:** + +For nested tuples like `Tuple(UInt32, Tuple(String, UInt8))`, the streams are flattened hierarchically: + +1. `{TupleElement0, Regular}` - First element (UInt32) +2. `{TupleElement1, TupleElement0, Regular}` - Nested tuple's first element (String) +3. `{TupleElement1, TupleElement1, Regular}` - Nested tuple's second element (UInt8) + +**Named Tuples:** + +Named tuples (e.g., `Tuple(x UInt32, y String)`) serialize identically to unnamed tuples. The element names are only reflected in the type string and do not affect the binary wire format. The serialization order matches the definition order. + +### 6.6 Nested + +`Nested` is a special column type that represents a structured collection of named fields, where each field is represented as an `Array`. Unlike `Tuple`, which stores data row-by-row, `Nested` stores data columnar-fashion with synchronized array lengths. + +**Conceptual Model:** +- `Nested(a T1, b T2, ...)` is equivalent to multiple `Array(T1)`, `Array(T2)`, ... columns +- All arrays must have the same length for each row +- Fields are accessed using dot notation: `column.field` + +**Wire Format:** +In the Native format, `Nested` columns are **not serialized as a single unit**. Instead, they are represented as separate `Array` columns with a special naming convention: + +1. **Column Naming**: Each nested field becomes a separate column named `parent.field` +2. **Type**: Each column has type `Array(FieldType)` +3. **Serialization**: Each array column is serialized independently using `Array` serialization (see section 6.3) + +**Example: Nested structure** + +```bash +# Note: Using Array(Tuple(...)) as Nested is mostly used internally in ClickHouse +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT [(1, 'a'), (2, 'b')]::Array(Tuple(id UInt32, name String)) AS n" | xxd +``` + +``` +00000000: 0101 016e 2441 7272 6179 2854 7570 6c65 ...n$Array(Tuple +00000010: 2869 6420 5549 6e74 3332 2c20 6e61 6d65 (id UInt32, name +00000020: 2053 7472 696e 6729 2902 0000 0000 0000 String))....... +00000030: 0001 0000 0002 0000 0001 6101 62 ..........a.b +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column 0: + 0x01, 0x6e, // Column name = "n" (length=1) + 0x24, // Type name length = 36 + 0x41, 0x72, 0x72, 0x61, 0x79, 0x28, // "Array(Tuple(id UInt32, name String))" + 0x54, 0x75, 0x70, 0x6c, 0x65, 0x28, + 0x69, 0x64, 0x20, 0x55, 0x49, 0x6e, + 0x74, 0x33, 0x32, 0x2c, 0x20, 0x6e, + 0x61, 0x6d, 0x65, 0x20, 0x53, 0x74, + 0x72, 0x69, 0x6e, 0x67, 0x29, 0x29, + + // Array stream: ArraySizes + 0x02, 0x00, 0x00, 0x00, // Array size = 2 (UInt64 little-endian) + 0x00, 0x00, 0x00, 0x00, + + // Array stream: ArrayElements (Tuple serialization - sequential) + // Tuple element 0 (id UInt32): + 0x01, 0x00, 0x00, 0x00, // Value 1 + 0x02, 0x00, 0x00, 0x00, // Value 2 + + // Tuple element 1 (name String): + 0x01, 0x61, // "a" (length=1, data) + 0x01, 0x62, // "b" (length=1, data) +]); +``` + +**Implementation Note:** +- True `Nested` columns (when created with explicit `Nested` type) would be split into separate columns: `n.id Array(UInt32)` and `n.name Array(String)` +- Each would appear as a separate column in the block with the dotted name +- The synchronization of array lengths is enforced at insertion time, not in the wire format + +### 6.7 Variant + +The `Variant` type represents a value that can be one of several different types, similar to a tagged union. It stores a discriminator indicating which type the value has, along with the actual value in a sparse columnar format. + +**Type Definition:** +```sql +Variant(T1, T2, T3, ...) +``` + +**Stream Architecture:** + +`Variant` uses multiple logical streams: + +1. **VariantDiscriminators** (prefix stream for version byte + data stream for discriminators) + - Serialization mode byte (UInt64): `0` = BASIC, `1` = COMPACT + - Discriminator values identifying which variant type each row contains + +2. **VariantElements** (wrapper stream path component) + - **VariantElement** (one substream per variant type) + - Each variant's data stored sparsely (only values for rows with that discriminator) + +**Discriminator Values:** +- `0xFF` (255): NULL discriminator - the row contains NULL +- `0` to `N-1`: Global discriminator for variant types 0 through N-1 +- Discriminators are always serialized in **global order** (the order types appear in the type definition) + +**Serialization Modes:** + +The discriminators stream has two modes, chosen based on the `use_compact_variant_discriminators_serialization` setting: + +**BASIC Mode (mode = 0):** +- All discriminators written sequentially, one per row +- Each discriminator is a single `UInt8` byte +- Simple but may be inefficient when granules have uniform types + +**COMPACT Mode (mode = 1):** +- Optimizes storage when granules have uniform discriminators +- Format per granule: + ``` + + + + ``` +- Format byte values: + - `0` (PLAIN): Mixed discriminators, all written as in BASIC mode + - `1` (COMPACT): Single discriminator for all rows, written once + +**Example: Variant with String and UInt64** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_variant_type=1" \ + --data-binary "SELECT 'hello'::Variant(String, UInt64) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 1756 6172 6961 6e74 2853 ...col.Variant(S +00000010: 7472 696e 672c 2055 496e 7436 3429 0000 tring, UInt64).. +00000020: 0000 0000 0000 0005 6865 6c6c 6f ........hello +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column 0: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x17, // Type name length = 23 + 0x56, 0x61, 0x72, 0x69, 0x61, 0x6e, // "Variant(String, UInt64)" + 0x74, 0x28, 0x53, 0x74, 0x72, 0x69, + 0x6e, 0x67, 0x2c, 0x20, 0x55, 0x49, + 0x6e, 0x74, 0x36, 0x34, 0x29, + + // VariantDiscriminatorsPrefix stream: + 0x00, 0x00, 0x00, 0x00, // Serialization mode = 0 (BASIC) + 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) + + // VariantDiscriminators stream (BASIC mode): + 0x00, // Discriminator = 0 (String type) + + // VariantElements/VariantElement[0] (String): + 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f, // String: length=5, "hello" + + // VariantElements/VariantElement[1] (UInt64): + // (empty - no values with discriminator 1) +]); +``` + +**Example: Variant with multiple rows and NULL** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_variant_type=1" \ + --data-binary "SELECT arrayJoin([42::Variant(String, UInt64), 'hello'::Variant(String, UInt64)]) AS col" | xxd +``` + +``` +00000000: 0102 0363 6f6c 1756 6172 6961 6e74 2853 ...col.Variant(S +00000010: 7472 696e 672c 2055 496e 7436 3429 0000 tring, UInt64).. +00000020: 0000 0000 0000 0100 0568 656c 6c6f 2a00 .........hello*. +00000030: 0000 0000 0000 ...... +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x02, // NumRows = 2 + + // Column metadata... + 0x03, 0x63, 0x6f, 0x6c, // "col" + 0x17, /* "Variant(String, UInt64)" */, + + // VariantDiscriminatorsPrefix: + 0x00, 0x00, 0x00, 0x00, // Mode = 0 (BASIC) + 0x00, 0x00, 0x00, 0x00, + + // VariantDiscriminators (BASIC mode, 2 rows): + 0x01, // Row 0: discriminator = 1 (UInt64) + 0x00, // Row 1: discriminator = 0 (String) + + // VariantElements/VariantElement[0] (String): + 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f, // 1 value: "hello" + + // VariantElements/VariantElement[1] (UInt64): + 0x2a, 0x00, 0x00, 0x00, // 1 value: 42 + 0x00, 0x00, 0x00, 0x00, +]); +``` + +Note: The data for each variant is stored **sparsely** - only the actual values are written, in the order they appear. The discriminators determine which variant column each row's value came from. + +**Example: Variant with NULL** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_variant_type=1" \ + --data-binary "SELECT [NULL::Variant(String, UInt64), 'a'::Variant(String, UInt64), 1::Variant(String, UInt64)] AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 1e41 7272 6179 2856 6172 ...col.Array(Var +00000010: 6961 6e74 2853 7472 696e 672c 2055 496e iant(String, UIn +00000020: 7436 3429 2900 0000 0000 0000 0003 0000 t64))........... +00000030: 0000 0000 00ff 0001 0561 0100 0000 0000 .........a...... +00000040: 0000 .. +``` + +Key points in this example: +- Discriminator `0xff` indicates NULL +- String variant (discriminator `0x00`) has 1 value: "a" +- UInt64 variant (discriminator `0x01`) has 1 value: 1 +- Readers reconstruct the row-to-value mapping by counting discriminators per variant type + +**Sparse Variant Storage:** + +Each variant type's data column contains only the values for rows with that discriminator: +- If a `Variant(String, UInt64)` column has 100 rows where 30 are String, 60 are UInt64, and 10 are NULL +- The String variant column has 30 values +- The UInt64 variant column has 60 values +- The discriminators column has 100 discriminator bytes + +Internally, ClickHouse maintains an offsets column to map each row to its position in the appropriate variant column. This offsets column is part of the in-memory representation only and is **not serialized** in the Native wire format. Readers must reconstruct this mapping by iterating through discriminators and counting values per variant type. + +### 6.8 Dynamic + +The `Dynamic` type can store values of different types, dynamically determining the type at insertion time. It's implemented internally using a `Variant` type with an additional "shared variant" for overflow types. + +**Type Definition:** +```sql +Dynamic +Dynamic(max_types=N) +``` + +Where `max_types` (default 32) determines the maximum number of distinct types that can be stored as separate variants. Additional types are stored in a "shared variant" in binary-serialized form. + +**Stream Architecture:** + +`Dynamic` uses two main streams: + +1. **DynamicStructure** stream: + - Serialization version (UInt64) + - Type list (structure metadata) + - Optional statistics + +2. **DynamicData** stream: + - Contains the internal `Variant` column data (in standard mode) + - OR contains indexes + flattened type columns (in FLATTENED mode) + +**Serialization Versions:** + +The `DynamicStructure` stream begins with a version byte (UInt64) determining the format: + +**V1 (version = 1):** +- `` (VarUInt) - historical, now equals num_dynamic_types +- `` (VarUInt) - actual number of variant types +- `` (String for each type, excluding SharedVariant) +- Optional statistics (in MergeTree serialization) + +**V2 (version = 2):** +- Same as V1 but without the `max_dynamic_types` parameter +- Type names as strings + +**V3 (version = 4):** +- Like V2 but type names in binary format (via `encodeDataType`) +- Statistics prefixed with a bool flag indicating presence + +> **Note:** Version numbers are not sequential. V1=1, V2=2, V3=4. There is no version 3. + +**FLATTENED (version = 3):** +- Used only in Native format for client compatibility +- Structure: + ``` + ` (VarUInt) + ` (each type in binary format or as string) + ``` +- Data format: + ``` + ` (UInt8/16/32/64 column, depending on num_types + 1 for NULL) + + + ... + ``` + +**Example: Dynamic with single value** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_dynamic_type=1" \ + --data-binary "SELECT 42::Dynamic AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0744 796e 616d 6963 0100 ...col.Dynamic.. +00000010: 0000 0000 0000 0101 0653 7472 696e 6700 .........String. +00000020: 0000 0000 0000 0001 0234 32 .........42 +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column 0: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x07, 0x44, 0x79, 0x6e, 0x61, // Column type = "Dynamic" + 0x6d, 0x69, 0x63, + + // DynamicStructure stream: + 0x01, 0x00, 0x00, 0x00, // Version = 1 (V1) + 0x00, 0x00, 0x00, 0x00, + + 0x01, // num_dynamic_types = 1 (VarUInt) + 0x01, // Number of types in structure = 1 + 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (length=6) + 0x6e, 0x67, + + // DynamicData stream (internal Variant column): + // VariantDiscriminatorsPrefix: + 0x00, 0x00, 0x00, 0x00, // Mode = 0 (BASIC) + 0x00, 0x00, 0x00, 0x00, + + // VariantDiscriminators: + 0x01, // Discriminator = 1 (SharedVariant) + + // VariantElements/VariantElement[0] (String): + // (empty - no values with discriminator 0) + + // VariantElements/VariantElement[1] (SharedVariant): + 0x01, // Number of values = 1 + 0x02, 0x34, 0x32, // Binary: type + value + // Actually: encodeDataType(String) + "42" +]); +``` + +Wait, let me recheck this with the actual output. The value 42 was inferred as String type "42", and it's stored in the shared variant. + +**Example: Dynamic with NULL** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_dynamic_type=1" \ + --data-binary "SELECT NULL::Dynamic AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0744 796e 616d 6963 0100 ...col.Dynamic.. +00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................ +00000020: ff . +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column metadata... + 0x03, 0x63, 0x6f, 0x6c, // "col" + 0x07, 0x44, 0x79, 0x6e, 0x61, // "Dynamic" + 0x6d, 0x69, 0x63, + + // DynamicStructure stream: + 0x01, 0x00, 0x00, 0x00, // Version = 1 + 0x00, 0x00, 0x00, 0x00, + + 0x00, // num_dynamic_types = 0 (no types) + 0x00, // Number of actual types = 0 + + // DynamicData stream (Variant): + 0x00, 0x00, 0x00, 0x00, // Mode = 0 (BASIC) + 0x00, 0x00, 0x00, 0x00, + + 0xff, // Discriminator = 0xff (NULL) +]); +``` + +**Example: Dynamic with FLATTENED serialization** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&output_format_native_use_flattened_dynamic_and_json_serialization=1&allow_experimental_dynamic_type=1" \ + --data-binary "SELECT 42::Dynamic AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0744 796e 616d 6963 0300 ...col.Dynamic.. +00000010: 0000 0000 0000 0106 5374 7269 6e67 0002 ........String.. +00000020: 3432 42 +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column metadata... + 0x03, 0x63, 0x6f, 0x6c, // "col" + 0x07, 0x44, 0x79, 0x6e, 0x61, // "Dynamic" + 0x6d, 0x69, 0x63, + + // DynamicStructure stream: + 0x03, 0x00, 0x00, 0x00, // Version = 3 (FLATTENED) + 0x00, 0x00, 0x00, 0x00, + + 0x01, // Number of types = 1 (VarUInt) + 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (length=6) + 0x6e, 0x67, + + // DynamicData stream (FLATTENED format): + // Type indexes column (UInt8, since num_types + 1 = 2 <= 256): + 0x00, // Row 0: type index = 0 (String) + + // Data for type 0 (String): + 0x02, 0x34, 0x32, // String: length=2, "42" +]); +``` + +**FLATTENED Mode Details:** + +In FLATTENED mode, the data is densely packed: +1. **Type Indexes**: A column of UInt8/16/32/64 (chosen based on number of types) where: + - `0xFF...FF` (all bits set) = NULL + - `0` to `N-1` = index into the types list + +2. **Type Data Columns**: One column per type, containing only the values of that type + - Values appear in row order, but only for rows of that type + - Reader must iterate through indexes to know which type column to read next value from + +**Standard Mode vs FLATTENED Mode:** + +- **Standard Mode** (V1/V2/V3): Uses internal `Variant` column with sparse storage, more efficient for storage +- **FLATTENED Mode**: Easier to parse for clients, all data is densely packed, index array tells which type each row has + +**Shared Variant:** + +When `Dynamic` has more distinct types than `max_types`: +- The first `max_types` types get dedicated variant columns +- Additional types are stored in binary form in a "SharedVariant" type (always the last variant) +- Each value in SharedVariant is: `encodeDataType(T) + serializeBinary(value, T)` +- On deserialization, the type is decoded from the binary prefix and the value is deserialized +### 6.9 JSON + +The `JSON` type stores semi-structured JSON data with automatic schema inference. It uses an object-oriented storage model where JSON paths are stored as separate columns, similar to how `Nested` columns work. + +**Type Definition:** +```sql +JSON +JSON(max_dynamic_paths=N) +JSON(a Int64, b String) -- with typed paths +``` + +**Storage Model:** + +`JSON` columns store data using three categories of paths: + +1. **Typed Paths**: Paths with known, fixed types (declared in type definition) +2. **Dynamic Paths**: Frequently occurring paths discovered at runtime (up to `max_dynamic_paths`) +3. **Shared Data**: Remaining paths stored in a shared `Dynamic` column + +**Stream Architecture:** + +Similar to `Dynamic`, `JSON` uses multiple streams: + +1. **ObjectStructure** stream: + - Serialization version (UInt64) + - Structure metadata (list of dynamic paths, statistics, etc.) + +2. **ObjectData** stream: + - **ObjectTypedPath** substreams: One per typed path + - **ObjectDynamicPath** substreams: One per dynamic path + - **ObjectSharedData** substream: For overflow paths + +**Serialization Versions:** + +**V1 (version = 0):** +- `` (VarUInt) +- `` (VarUInt) +- `` (String for each) +- Optional statistics + +**V2 (version = 2):** +- Same as V1 without `max_dynamic_paths` + +**V3 (version = 4):** +- Like V2 with binary type encoding and optional statistics flag + +> **Note:** JSON Object version numbers are: V1=0, STRING=1, V2=2, FLATTENED=3, V3=4. + +> **Important:** `Dynamic` columns (used for dynamic paths) have **different** version numbers: V1=1, V2=2, FLATTENED=3, V3=4. Don't confuse JSON V1 (=0) with Dynamic V1 (=1). + +**STRING (version = 1):** +- Special mode for Native format only +- JSON stored as plain String (JSON text) +- Controlled by `output_format_native_write_json_as_string=1` + +**FLATTENED (version = 3):** +- Native format only, for easier client parsing +- All paths (typed + dynamic + shared) serialized as separate columns + +**Example: JSON with STRING serialization** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1&output_format_native_write_json_as_string=1" \ + --data-binary "SELECT '{\"a\": 1}'::JSON AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 044a 534f 4e01 0000 0000 ...col.JSON..... +00000010: 0000 0007 7b22 6122 3a31 7d ....{"a":1} +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column 0: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x04, 0x4a, 0x53, 0x4f, 0x4e, // Column type = "JSON" + + // ObjectStructure stream: + 0x01, 0x00, 0x00, 0x00, // Version = 1 (STRING) + 0x00, 0x00, 0x00, 0x00, + + // ObjectData stream (String column): + 0x07, // String length = 7 + 0x7b, 0x22, 0x61, 0x22, 0x3a, // '{"a":1}' + 0x31, 0x7d, +]); +``` + +**Example: JSON with standard serialization** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ + --data-binary "SELECT '{\"a\": 1, \"b\": \"hello\"}'::JSON AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 044a 534f 4e00 0000 0000 ...col.JSON..... +00000010: 0000 0002 0201 6101 6201 0000 0000 0000 ......a.b....... +00000020: 0001 0105 496e 7436 3400 0000 0000 0000 ....Int64....... +00000030: 0001 0000 0000 0000 0001 0106 5374 7269 ............Stri +00000040: 6e67 0000 0000 0000 0000 0001 0000 0000 ng.............. +00000050: 0000 0001 0568 656c 6c6f 0000 0000 0000 .....hello...... +00000060: 0000 .. +``` + +Breakdown (simplified structure): +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column metadata... + 0x03, 0x63, 0x6f, 0x6c, // "col" + 0x04, 0x4a, 0x53, 0x4f, 0x4e, // "JSON" + + // ObjectStructure stream: + 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1) + 0x00, 0x00, 0x00, 0x00, + + 0x02, // num_dynamic_paths = 2 (VarUInt) + 0x02, // Actual count = 2 + + // Sorted dynamic paths: + 0x01, 0x61, // Path 0: "a" (length=1) + 0x01, 0x62, // Path 1: "b" (length=1) + + // Shared data serialization version: + 0x01, 0x00, 0x00, 0x00, // Version info + 0x00, 0x00, 0x00, 0x00, + + // ObjectData stream: + // Path "a" stored as Dynamic with inferred type Int64: + 0x01, // num_types = 1 + 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // "Int64" + // ... Variant discriminators ... + 0x00, 0x00, 0x00, 0x00, // Mode + 0x00, 0x00, 0x00, 0x00, + 0x01, // Discriminator + // ... Value ... + 0x01, 0x00, 0x00, 0x00, // 1 (as Int64) + 0x00, 0x00, 0x00, 0x00, + + // Path "b" stored as Dynamic with type String: + 0x01, // num_types = 1 + 0x06, 0x53, 0x74, 0x72, 0x69, // "String" + 0x6e, 0x67, + // ... Variant discriminators ... + 0x00, 0x00, 0x00, 0x00, // Mode + 0x00, 0x00, 0x00, 0x00, + 0x01, // Discriminator + // ... Value ... + 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f, // "hello" +]); +``` + +Note: Each dynamic path is stored as a `Dynamic` column, so it includes the full `Dynamic` serialization (structure + variant data). + +**Example: JSON with typed and dynamic paths** + +This example demonstrates a `JSON` column with both a **typed path** (declared type) and a **dynamic path** (inferred type). Typed paths are serialized directly using their native format, while dynamic paths are wrapped in `Dynamic` serialization. + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ + --data-binary "SELECT '{\"a\": 42, \"b\": \"hi\"}'::JSON(a UInt32) AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0e4a 534f 4e28 6120 5549 ...col.JSON(a UI +00000010: 6e74 3332 2900 0000 0000 0000 0001 0101 nt32)........... +00000020: 6201 0000 0000 0000 0001 0106 5374 7269 b...........Stri +00000030: 6e67 0000 0000 0000 0000 2a00 0000 0102 ng........*..... +00000040: 6869 0000 0000 0000 0000 hi........ +``` + +Full breakdown: +```typescript +const block = new Uint8Array([ + // ═══════════════════════════════════════════════════════════════════ + // BLOCK HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x01, // NumColumns = 1 (VarUInt) + 0x01, // NumRows = 1 (VarUInt) + + // ═══════════════════════════════════════════════════════════════════ + // COLUMN HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (len=3) + 0x0e, // Type name length = 14 + 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x61, // "JSON(a UInt32)" + 0x20, 0x55, 0x49, 0x6e, 0x74, 0x33, // - typed path "a" with type UInt32 + 0x32, 0x29, // - typed paths are part of the type name + + // ═══════════════════════════════════════════════════════════════════ + // OBJECT STRUCTURE STREAM (prefix for JSON column) + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1 serialization) + 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) + + // V1 structure metadata: + 0x01, // max_dynamic_paths = 1 (VarUInt) + 0x01, // num_dynamic_paths = 1 (VarUInt) + + // Sorted list of dynamic path names: + 0x01, 0x62, // Path 0: "b" (len=1, then "b") + // - typed path "a" is NOT listed here + // - only runtime-discovered paths appear + + // NOTE: In Native format, statistics mode defaults to NONE, so no + // statistics are written here. The next bytes are the Dynamic prefix. + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC STRUCTURE STREAM (prefix for dynamic path "b") + // ═══════════════════════════════════════════════════════════════════ + 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) + 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) + // NOTE: Dynamic V1=1, V2=2, FLATTENED=3, V3=4 + + // V1 Dynamic structure: + 0x01, // max_dynamic_types = 1 (VarUInt) + 0x01, // num_dynamic_types = 1 (VarUInt) + + // Type names (not SharedVariant, which is implicit): + 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (len=6) + 0x6e, 0x67, + + // NOTE: Dynamic statistics also use NONE mode in Native format, + // so no statistics written here either. + + // ═══════════════════════════════════════════════════════════════════ + // VARIANT STRUCTURE STREAM (prefix for Dynamic's Variant) + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) + 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) + + // ═══════════════════════════════════════════════════════════════════ + // OBJECT DATA STREAM + // ═══════════════════════════════════════════════════════════════════ + + // --- TYPED PATH "a" (UInt32) --- + // Serialized directly using UInt32's native format (no Dynamic wrapper!) + 0x2a, 0x00, 0x00, 0x00, // value = 42 (UInt32 little-endian) + + // --- DYNAMIC PATH "b" (as Dynamic -> Variant) --- + // Variant discriminator column (1 row): + 0x01, // Row 0: discriminator = 1 (String variant) + // Variants sorted alphabetically: SharedVariant=0, String=1 + // (255 = NULL) + + // String variant data: + 0x02, 0x68, 0x69, // value = "hi" (len=2, then "hi") + + // ═══════════════════════════════════════════════════════════════════ + // SHARED DATA STREAM (overflow paths, empty in this case) + // ═══════════════════════════════════════════════════════════════════ + // Shared data is Array(Tuple(path: String, value: String)) + // For 1 row with 0 shared paths: + 0x00, 0x00, 0x00, 0x00, // Array offsets: [0] (UInt64) + 0x00, 0x00, 0x00, 0x00, // - row 0 has 0 elements +]); +``` + +**Key observations:** + +1. **Typed path "a"** (UInt32): + - Declared in type definition: `JSON(a UInt32)` + - Type name includes the typed path specification + - NOT listed in the dynamic paths list + - Serialized directly as UInt32 (4 bytes, little-endian): `2a 00 00 00` = 42 + - No `Dynamic` wrapper overhead + +2. **Dynamic path "b"** (inferred as String): + - Discovered at runtime from the JSON data + - Listed in the dynamic paths list in ObjectStructure + - Serialized as a full `Dynamic` column: + - DynamicStructure: version, types list, statistics + - VariantStructure: mode + - VariantData: discriminators + type-specific data + +3. **Serialization order in ObjectData**: + - Typed paths first (sorted alphabetically by path name) + - Dynamic paths second (sorted alphabetically by path name) + - Shared data last + +4. **Space efficiency**: Typed paths save significant space by avoiding the `Dynamic` overhead. For path "a", we use 4 bytes (UInt32) instead of ~20+ bytes (Dynamic structure + Variant + value). + +**Example: JSON with typed and dynamic paths (multiple rows)** + +This example shows the same structure with 2 rows, demonstrating how column data is laid out contiguously for each path. + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ + --data-binary "SELECT ('{\"a\": ' || toString(number * 10) || ', \"b\": \"row' || toString(number) || '\"}')::JSON(a UInt32) AS col FROM system.numbers LIMIT 2" | xxd +``` + +This produces rows: `{"a": 0, "b": "row0"}` and `{"a": 10, "b": "row1"}`. + +``` +00000000: 0102 0363 6f6c 0e4a 534f 4e28 6120 5549 ...col.JSON(a UI +00000010: 6e74 3332 2900 0000 0000 0000 0001 0101 nt32)........... +00000020: 6201 0000 0000 0000 0001 0106 5374 7269 b...........Stri +00000030: 6e67 0000 0000 0000 0000 0000 0000 0a00 ng.............. +00000040: 0000 0101 0472 6f77 3004 726f 7731 0000 .....row0.row1.. +00000050: 0000 0000 0000 0000 0000 0000 0000 .............. +``` + +Full breakdown: +```typescript +const block = new Uint8Array([ + // ═══════════════════════════════════════════════════════════════════ + // BLOCK HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x01, // NumColumns = 1 (VarUInt) + 0x02, // NumRows = 2 (VarUInt) <-- 2 rows! + + // ═══════════════════════════════════════════════════════════════════ + // COLUMN HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (len=3) + 0x0e, // Type name length = 14 + 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x61, // "JSON(a UInt32)" + 0x20, 0x55, 0x49, 0x6e, 0x74, 0x33, + 0x32, 0x29, + + // ═══════════════════════════════════════════════════════════════════ + // OBJECT STRUCTURE STREAM + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1 serialization) + 0x00, 0x00, 0x00, 0x00, + + 0x01, // max_dynamic_paths = 1 (VarUInt) + 0x01, // num_dynamic_paths = 1 (VarUInt) + 0x01, 0x62, // Path 0: "b" (len=1) + + // NOTE: No statistics in Native format (NONE mode is default) + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC STRUCTURE STREAM (for path "b") + // ═══════════════════════════════════════════════════════════════════ + 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) + 0x00, 0x00, 0x00, 0x00, // (Dynamic V1=1, V2=2, V3=4) + + 0x01, // max_dynamic_types = 1 (VarUInt) + 0x01, // num_dynamic_types = 1 (VarUInt) + 0x06, 0x53, 0x74, 0x72, 0x69, // Type 0: "String" (len=6) + 0x6e, 0x67, + + // NOTE: No Dynamic statistics in Native format (NONE mode) + + // ═══════════════════════════════════════════════════════════════════ + // VARIANT STRUCTURE STREAM + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) + 0x00, 0x00, 0x00, 0x00, + + // ═══════════════════════════════════════════════════════════════════ + // OBJECT DATA STREAM + // ═══════════════════════════════════════════════════════════════════ + + // --- TYPED PATH "a" (UInt32) - ALL ROWS CONTIGUOUS --- + 0x00, 0x00, 0x00, 0x00, // Row 0: a = 0 + 0x0a, 0x00, 0x00, 0x00, // Row 1: a = 10 + + // --- DYNAMIC PATH "b" - Variant discriminators (ALL ROWS) --- + // Variants sorted alphabetically: SharedVariant=0, String=1 (255=NULL) + 0x01, // Row 0: discriminator = 1 (String) + 0x01, // Row 1: discriminator = 1 (String) + + // --- DYNAMIC PATH "b" - String variant data (ALL ROWS) --- + 0x04, 0x72, 0x6f, 0x77, 0x30, // Row 0: "row0" (len=4) + 0x04, 0x72, 0x6f, 0x77, 0x31, // Row 1: "row1" (len=4) + + // ═══════════════════════════════════════════════════════════════════ + // SHARED DATA STREAM (Array offsets for each row) + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Row 0 offset: 0 elements + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, // Row 1 offset: 0 elements + 0x00, 0x00, 0x00, 0x00, +]); +``` + +**Key observations for multi-row data:** + +1. **Columnar layout**: All values for each path are stored contiguously: + - Typed path "a": `[0, 10]` as two consecutive UInt32 values + - Dynamic path "b" discriminators: `[1, 1]` (both rows are String) + - Dynamic path "b" String data: `["row0", "row1"]` stored sequentially + +2. **No per-row overhead**: The structure metadata (version, paths, types) is written once in the prefix, not repeated for each row. Only the actual data values scale with row count. + +3. **Variant discriminators**: For the Dynamic path, discriminators for ALL rows come first, then the actual variant data for all rows. This enables efficient columnar processing. + +4. **Shared data offsets**: One UInt64 offset per row indicating how many shared path entries that row has. With 0 shared paths, all offsets are 0. + +**Example: JSON with exceeded max_dynamic_paths (shared data)** + +This example shows what happens when the number of paths exceeds `max_dynamic_paths`. Overflow paths are stored in the **shared data** section. + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_experimental_json_type=1" \ + --data-binary "SELECT '{\"a\": 1, \"b\": 2, \"c\": 3}'::JSON(max_dynamic_paths=2) AS col" | xxd +``` + +With `max_dynamic_paths=2` and 3 paths in the JSON: +- Paths "a" and "b" become **dynamic paths** (stored as `Dynamic` columns) +- Path "c" overflows to **shared data** + +``` +00000000: 0101 0363 6f6c 194a 534f 4e28 6d61 785f ...col.JSON(max_ +00000010: 6479 6e61 6d69 635f 7061 7468 733d 3229 dynamic_paths=2) +00000020: 0000 0000 0000 0000 0202 0161 0162 0100 ...........a.b.. +00000030: 0000 0000 0000 0101 0549 6e74 3634 0000 .........Int64.. +00000040: 0000 0000 0000 0100 0000 0000 0000 0101 ................ +00000050: 0549 6e74 3634 0000 0000 0000 0000 0001 .Int64.......... +00000060: 0000 0000 0000 0000 0200 0000 0000 0000 ................ +00000070: 0100 0000 0000 0000 0163 090a 0300 0000 .........c...... +00000080: 0000 0000 .... +``` + +Full breakdown: +```typescript +const block = new Uint8Array([ + // ═══════════════════════════════════════════════════════════════════ + // BLOCK HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // ═══════════════════════════════════════════════════════════════════ + // COLUMN HEADER + // ═══════════════════════════════════════════════════════════════════ + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x19, // Type name length = 25 + // "JSON(max_dynamic_paths=2)" + 0x4a, 0x53, 0x4f, 0x4e, 0x28, 0x6d, 0x61, 0x78, 0x5f, + 0x64, 0x79, 0x6e, 0x61, 0x6d, 0x69, 0x63, 0x5f, + 0x70, 0x61, 0x74, 0x68, 0x73, 0x3d, 0x32, 0x29, + + // ═══════════════════════════════════════════════════════════════════ + // OBJECT STRUCTURE STREAM + // ═══════════════════════════════════════════════════════════════════ + 0x00, 0x00, 0x00, 0x00, // Version = 0 (V1) + 0x00, 0x00, 0x00, 0x00, + + 0x02, // max_dynamic_paths = 2 + 0x02, // num_dynamic_paths = 2 + 0x01, 0x61, // Path 0: "a" + 0x01, 0x62, // Path 1: "b" + // Note: "c" is NOT here - it's in shared data + + // NOTE: No statistics in Native format (NONE mode is default) + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC PATH "a" STRUCTURE + // ═══════════════════════════════════════════════════════════════════ + 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) + 0x00, 0x00, 0x00, 0x00, // (Dynamic V1=1, V2=2, V3=4) + 0x01, // max_dynamic_types = 1 + 0x01, // num_dynamic_types = 1 + 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // Type: "Int64" + // NOTE: No Dynamic stats (NONE mode) + 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) + 0x00, 0x00, 0x00, 0x00, + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC PATH "b" STRUCTURE (similar to "a") + // ═══════════════════════════════════════════════════════════════════ + 0x01, 0x00, 0x00, 0x00, // Dynamic version = 1 (V1) + 0x00, 0x00, 0x00, 0x00, + 0x01, // max_dynamic_types = 1 + 0x01, // num_dynamic_types = 1 + 0x05, 0x49, 0x6e, 0x74, 0x36, 0x34, // Type: "Int64" + 0x00, 0x00, 0x00, 0x00, // Variant mode = 0 (COMPACT) + 0x00, 0x00, 0x00, 0x00, + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC PATH "a" DATA (Int64 value = 1) + // ═══════════════════════════════════════════════════════════════════ + // Variants sorted alphabetically: Int64=0, SharedVariant=1 (255=NULL) + 0x00, // Discriminator = 0 (Int64 variant) + 0x01, 0x00, 0x00, 0x00, // value = 1 (Int64 little-endian) + 0x00, 0x00, 0x00, 0x00, + + // ═══════════════════════════════════════════════════════════════════ + // DYNAMIC PATH "b" DATA (Int64 value = 2) + // ═══════════════════════════════════════════════════════════════════ + 0x00, // Discriminator = 0 (Int64 variant) + 0x02, 0x00, 0x00, 0x00, // value = 2 (Int64 little-endian) + 0x00, 0x00, 0x00, 0x00, + + // ═══════════════════════════════════════════════════════════════════ + // SHARED DATA STREAM + // This is where path "c" lives (overflow from max_dynamic_paths) + // Format: Array(Tuple(path: String, value: String)) + // ═══════════════════════════════════════════════════════════════════ + + // Array offsets (one UInt64 per row): + 0x01, 0x00, 0x00, 0x00, // Row 0: offset = 1 (has 1 element) + 0x00, 0x00, 0x00, 0x00, + + // Shared data element 0 (path "c"): + 0x01, 0x63, // Path name: "c" (len=1) + + // Binary-encoded Dynamic value for "c": + 0x09, // Value string length = 9 bytes + 0x0a, // BinaryTypeIndex = 0x0a (Int64) + 0x03, 0x00, 0x00, 0x00, // value = 3 (Int64 little-endian) + 0x00, 0x00, 0x00, 0x00, +]); +``` + +**Key observations for shared data:** + +1. **Overflow mechanism**: When paths exceed `max_dynamic_paths`, extra paths go to shared data. Here "a" and "b" are dynamic paths, "c" overflows. + +2. **Shared data format**: `Array(Tuple(path: String, value: String))` + - Array offsets indicate how many shared paths each row has + - Each element is a (path_name, binary_encoded_value) tuple + +3. **Binary-encoded values in shared data**: The value is stored as a length-prefixed string containing: + - `BinaryTypeIndex` (1 byte): Type identifier (0x0a = Int64) + - Native value: The value in its type's binary format + +4. **BinaryTypeIndex values** (common types): + | Type | Index | + |------|-------| + | Nothing | 0x00 | + | UInt8 | 0x01 | + | UInt32 | 0x03 | + | UInt64 | 0x04 | + | Int64 | 0x0a | + | String | 0x15 | + | Array | 0x1e | + | JSON | 0x30 | + +5. **Space tradeoff**: Shared data is less efficient than dynamic paths because each value includes its path name and type encoding. Use `max_dynamic_paths` wisely for your data. + +**Example: Variant discriminator ordering (multiple types)** + +Variant types are sorted **alphabetically by type name** to determine discriminator values. SharedVariant is included in this sorting. Here's an example showing how discriminators are assigned: + +```sql +-- Insert multiple rows with different JSON value types +INSERT INTO test_json VALUES +('{"x": true}'), -- Bool +('{"x": 3.14}'), -- Float64 +('{"x": "hello"}'), -- String +('{"x": [1,2,3]}'), -- Array(Nullable(Int64)) +('{"x": null}') -- NULL +``` + +The Dynamic column for path "x" will have variant types sorted alphabetically: + +| Index | Type | Notes | +|-------|------|-------| +| 0 | Array(Nullable(Int64)) | "A" < "B" < ... | +| 1 | Bool | | +| 2 | Float64 | | +| 3 | SharedVariant | Implicit, always present | +| 4 | String | "Sh..." < "St..." | +| 255 | (NULL) | Special NULL_DISCRIMINATOR | + +The discriminator bytes in the serialized data will be: +``` +0x01, // Bool (index 1) +0x02, // Float64 (index 2) +0x04, // String (index 4, after SharedVariant=3) +0x00, // Array (index 0) +0xff, // NULL (255) +``` + +**Key rule**: For any type T in a Dynamic column, its discriminator = index in the alphabetically sorted list of [all_types + SharedVariant]. + +**Path Naming:** + +JSON paths use dot notation: +- `a` - top-level key "a" +- `a.b` - nested key "b" inside "a" +- `a.b.c` - deeply nested + +ClickHouse can escape dots in JSON keys when `json_type_escape_dots_in_keys=1` is set. + +**Typed Paths:** + +When a `JSON` type has typed paths declared: +```sql +JSON(user_id UInt64, name String) +``` + +These paths: +1. Always have the specified type (no inference needed) +2. Are serialized directly in their native type format (not as `Dynamic`) +3. Appear in `ObjectData/ObjectTypedPath` substreams + +**Dynamic Paths:** + +Dynamic paths are: +1. Discovered at insertion time +2. Limited by `max_dynamic_paths` (default 1024) +3. Stored as `Dynamic` columns with inferred types +4. Appear in `ObjectData/ObjectDynamicPath` substreams + +**Shared Data:** + +When more than `max_dynamic_paths` unique paths exist: +- Overflow paths are stored in a shared `Dynamic` column +- Each value in shared data includes its full path + value +- Uses binary serialization similar to `Dynamic`'s shared variant + +**Format Setting: `output_format_native_write_json_as_string`:** + +When set to `1`: +- Uses STRING serialization (version = 1) +- Entire JSON object serialized as text String +- Easier for clients that don't support complex `JSON` deserialization +- Default: `0` (use structured serialization) + +**Format Setting: `output_format_native_use_flattened_dynamic_and_json_serialization`:** + +When set to `1`: +- Uses FLATTENED serialization (version = 3) +- All paths serialized as individual columns +- No `Dynamic` wrapper, just direct column data +- Easier for some clients to parse +- Default: `0` + +**Reading JSON Columns:** + +To deserialize a `JSON` column in standard mode: + +1. Read `ObjectStructure` to get version and path list +2. For each typed path: deserialize using its known type +3. For each dynamic path: deserialize as `Dynamic` column +4. If shared data exists: deserialize the shared `Dynamic` column +5. Reconstruct JSON objects by merging values from all paths + +In STRING mode: just read the String value containing JSON text. + +In FLATTENED mode: read each path as a separate column, with a NULL indicator for missing paths. + +--- + +## 7. Geo Types + +Geographic data types in ClickHouse are type aliases that compose existing types (`Tuple`, `Array`) to represent geometric shapes. While their underlying representation uses standard composite types, these geo types have distinct names in the Native format's type string. + +**Important:** The type NAME in the Native format wire protocol is the geo type alias (`Point`, `Ring`, etc.), not the underlying composite type (`Tuple(Float64, Float64)`, `Array(Point)`, etc.). This allows parsers to recognize geographic types explicitly. + +### 7.1 Point + +`Point` represents a single 2D coordinate as a tuple of two `Float64` values (X, Y). + +**Type Alias:** +```sql +Point = Tuple(Float64, Float64) +``` + +**Wire Format:** + +The `Point` type serializes as a `Tuple` with two `Float64` elements: +1. Type string: `"Point"` (not `"Tuple(Float64, Float64)"`) +2. Stream `{TupleElement0}`: X coordinate (`Float64`, 8 bytes little-endian) +3. Stream `{TupleElement1}`: Y coordinate (`Float64`, 8 bytes little-endian) + +**Example: Single Point** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT (1.5, 2.5)::Point AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0550 6f69 6e74 0000 0000 ...col.Point.... +00000010: 0000 f83f 0000 0000 0000 0440 ...?.......@ +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column metadata: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" (String: length=3) + 0x05, 0x50, 0x6f, 0x69, // Column type = "Point" (String: length=5) + 0x6e, 0x74, // (continuation) + + // TupleElement0 stream (X coordinate): + 0x00, 0x00, 0x00, 0x00, // X = 1.5 (Float64, little-endian) + 0x00, 0x00, 0xf8, 0x3f, // IEEE 754: 0x3ff8000000000000 + + // TupleElement1 stream (Y coordinate): + 0x00, 0x00, 0x00, 0x00, // Y = 2.5 (Float64, little-endian) + 0x00, 0x00, 0x04, 0x40, // IEEE 754: 0x4004000000000000 +]); +``` + +### 7.2 Ring + +`Ring` represents a closed loop of points, typically used to define polygon boundaries. It is an array of `Point` values. + +**Type Alias:** +```sql +Ring = Array(Point) +``` + +**Wire Format:** + +The `Ring` type serializes as an `Array(Point)`: +1. Type string: `"Ring"` (not `"Array(Point)"`) +2. Stream `{ArraySizes}`: Cumulative offsets (`UInt64` per row) +3. Stream `{ArrayElements, TupleElement0}`: X coordinates of all points (flattened `Float64` values) +4. Stream `{ArrayElements, TupleElement1}`: Y coordinates of all points (flattened `Float64` values) + +**Example: Ring with 4 points** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 0.0)]::Ring AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0452 696e 6704 0000 0000 ...col.Ring..... +00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................ +00000020: 00f0 3f00 0000 0000 00f0 3f00 0000 0000 ..?.......?..... +00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ................ +00000040: 0000 0000 0000 0000 00f0 3f00 0000 0000 ..........?..... +00000050: 0000 00 ... +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column metadata: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x04, 0x52, 0x69, 0x6e, // Column type = "Ring" (length=4) + 0x67, + + // ArraySizes stream: + 0x04, 0x00, 0x00, 0x00, // Array size = 4 points (UInt64) + 0x00, 0x00, 0x00, 0x00, + + // ArrayElements/TupleElement0 stream (X coordinates, 4 Float64 values): + 0x00, 0x00, 0x00, 0x00, // Point[0].X = 0.0 + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, // Point[1].X = 1.0 + 0x00, 0x00, 0xf0, 0x3f, + 0x00, 0x00, 0x00, 0x00, // Point[2].X = 1.0 + 0x00, 0x00, 0xf0, 0x3f, + 0x00, 0x00, 0x00, 0x00, // Point[3].X = 0.0 + 0x00, 0x00, 0x00, 0x00, + + // ArrayElements/TupleElement1 stream (Y coordinates, 4 Float64 values): + 0x00, 0x00, 0x00, 0x00, // Point[0].Y = 0.0 + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, // Point[1].Y = 0.0 + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, // Point[2].Y = 1.0 + 0x00, 0x00, 0xf0, 0x3f, + 0x00, 0x00, 0x00, 0x00, // Point[3].Y = 0.0 + 0x00, 0x00, 0x00, 0x00, +]); +``` + +**Note:** A valid ring should have its first and last points identical (closed loop), though this is not enforced at the wire format level. + +### 7.3 Polygon + +`Polygon` represents a polygon consisting of an outer boundary ring and optional holes (inner rings). It is an array of `Ring` values. + +**Type Alias:** +```sql +Polygon = Array(Ring) +``` + +**Wire Format:** + +The `Polygon` type serializes as `Array(Ring)`, which nests as `Array(Array(Point))`: +1. Type string: `"Polygon"` (not `"Array(Ring)"`) +2. Stream `{ArraySizes}`: Offsets for outer array (number of rings) +3. Stream `{ArrayElements, ArraySizes}`: Offsets for inner arrays (points per ring) +4. Stream `{ArrayElements, ArrayElements, TupleElement0}`: All X coordinates +5. Stream `{ArrayElements, ArrayElements, TupleElement1}`: All Y coordinates + +**Example: Simple Polygon with one ring** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT [[(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0), (0.0, 0.0)]]::Polygon AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0750 6f6c 7967 6f6e 0100 ...col.Polygon.. +00000010: 0000 0000 0000 0500 0000 0000 0000 0000 ................ +00000020: 0000 0000 0000 0000 0000 0000 2440 0000 ............$@.. +00000030: 0000 0000 2440 0000 0000 0000 0000 0000 ....$@.......... +00000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................ +00000050: 0000 0000 0000 0000 0000 0000 2440 0000 ............$@.. +00000060: 0000 0000 2440 0000 0000 0000 0000 ....$@........ +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column metadata: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x07, 0x50, 0x6f, 0x6c, // Column type = "Polygon" (length=7) + 0x79, 0x67, 0x6f, 0x6e, + + // Outer ArraySizes (number of rings): + 0x01, 0x00, 0x00, 0x00, // 1 ring (UInt64) + 0x00, 0x00, 0x00, 0x00, + + // Inner ArraySizes (points per ring): + 0x05, 0x00, 0x00, 0x00, // Ring[0] has 5 points (UInt64) + 0x00, 0x00, 0x00, 0x00, + + // All X coordinates (5 Float64 values): + 0x00, 0x00, 0x00, 0x00, // Point[0].X = 0.0 + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, // Point[1].X = 10.0 + 0x00, 0x00, 0x24, 0x40, // (0x4024000000000000) + 0x00, 0x00, 0x00, 0x00, // Point[2].X = 10.0 + 0x00, 0x00, 0x24, 0x40, + 0x00, 0x00, 0x00, 0x00, // Point[3].X = 0.0 + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, // Point[4].X = 0.0 + 0x00, 0x00, 0x00, 0x00, + + // All Y coordinates (5 Float64 values): + 0x00, 0x00, 0x00, 0x00, // Point[0].Y = 0.0 + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, // Point[1].Y = 0.0 + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, // Point[2].Y = 10.0 + 0x00, 0x00, 0x24, 0x40, + 0x00, 0x00, 0x00, 0x00, // Point[3].Y = 10.0 + 0x00, 0x00, 0x24, 0x40, + 0x00, 0x00, 0x00, 0x00, // Point[4].Y = 0.0 + 0x00, 0x00, 0x00, 0x00, +]); +``` + +**Convention:** The first ring in a polygon is the outer boundary, and subsequent rings (if any) represent holes. + +### 7.4 MultiPolygon + +`MultiPolygon` represents a collection of polygons. It is an array of `Polygon` values. + +**Type Alias:** +```sql +MultiPolygon = Array(Polygon) +``` + +**Wire Format:** + +The `MultiPolygon` type serializes as `Array(Polygon)`, nesting three levels deep as `Array(Array(Array(Point)))`: +1. Type string: `"MultiPolygon"` (not `"Array(Polygon)"`) +2. Stream `{ArraySizes}`: Offsets for polygons +3. Stream `{ArrayElements, ArraySizes}`: Offsets for rings within each polygon +4. Stream `{ArrayElements, ArrayElements, ArraySizes}`: Offsets for points within each ring +5. Stream `{ArrayElements, ArrayElements, ArrayElements, TupleElement0}`: All X coordinates +6. Stream `{ArrayElements, ArrayElements, ArrayElements, TupleElement1}`: All Y coordinates + +**Example: MultiPolygon with two simple triangular polygons** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT [[[(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 0.0)]], [[(20.0, 20.0), (30.0, 20.0), (30.0, 30.0), (20.0, 20.0)]]]::MultiPolygon AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0c4d 756c 7469 506f 6c79 ...col.MultiPoly +00000010: 676f 6e02 0000 0000 0000 0001 0000 0000 gon............. +00000020: 0000 0002 0000 0000 0000 0004 0000 0000 ................ +00000030: 0000 0008 0000 0000 0000 0000 0000 0000 ................ +00000040: 0000 0000 0000 0000 0024 4000 0000 0000 .........$@..... +00000050: 0024 4000 0000 0000 0000 0000 0000 0000 .$@............. +00000060: 0034 4000 0000 0000 003e 4000 0000 0000 .4@......>@..... +00000070: 003e 4000 0000 0000 0034 4000 0000 0000 .>@......4@..... +00000080: 0000 0000 0000 0000 0000 0000 0000 0000 ................ +00000090: 0024 4000 0000 0000 0000 0000 0000 0000 .$@............. +000000a0: 0034 4000 0000 0000 0034 4000 0000 0000 .4@......4@..... +000000b0: 003e 4000 0000 0000 0034 40 .>@......4@ +``` + +Breakdown omitted for brevity (follows nested array pattern with 3 levels of offsets). + +### 7.5 LineString + +`LineString` represents a sequence of connected line segments as an array of points. + +**Type Alias:** +```sql +LineString = Array(Point) +``` + +**Wire Format:** + +The `LineString` type serializes identically to `Ring` (both are `Array(Point)`): +1. Type string: `"LineString"` (not `"Array(Point)"`) +2. Stream `{ArraySizes}`: Cumulative offsets +3. Stream `{ArrayElements, TupleElement0}`: X coordinates +4. Stream `{ArrayElements, TupleElement1}`: Y coordinates + +**Example: LineString with 3 points** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]::LineString AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0a4c 696e 6553 7472 696e ...col.LineStrin +00000010: 6703 0000 0000 0000 0000 0000 0000 00f0 g............... +00000020: 3f00 0000 0000 0008 4000 0000 0000 0014 ?.......@....... +00000030: 4000 0000 0000 0000 4000 0000 0000 0010 @.......@....... +00000040: 4000 0000 0000 0018 40 @.......@ +``` + +Breakdown omitted (follows same pattern as `Ring`). + +**Note:** Unlike `Ring`, a `LineString` is not required to form a closed loop (first and last points need not match). + +### 7.6 MultiLineString + +`MultiLineString` represents a collection of line strings. It is an array of `LineString` values. + +**Type Alias:** +```sql +MultiLineString = Array(LineString) +``` + +**Wire Format:** + +The `MultiLineString` type serializes as `Array(LineString)`, nesting as `Array(Array(Point))`: +1. Type string: `"MultiLineString"` (not `"Array(LineString)"`) +2. Stream `{ArraySizes}`: Offsets for line strings +3. Stream `{ArrayElements, ArraySizes}`: Offsets for points within each line string +4. Stream `{ArrayElements, ArrayElements, TupleElement0}`: All X coordinates +5. Stream `{ArrayElements, ArrayElements, TupleElement1}`: All Y coordinates + +**Example: MultiLineString with two line strings** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT [[(1.0, 2.0), (3.0, 4.0)], [(5.0, 6.0), (7.0, 8.0)]]::MultiLineString AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0f4d 756c 7469 4c69 6e65 ...col.MultiLine +00000010: 5374 7269 6e67 0200 0000 0000 0000 0200 String.......... +00000020: 0000 0000 0000 0400 0000 0000 0000 0000 ................ +00000030: 0000 0000 f03f 0000 0000 0000 0840 0000 .....?.......@.. +00000040: 0000 0000 1440 0000 0000 0000 1c40 0000 .....@.......@.. +00000050: 0000 0000 0040 0000 0000 0000 1040 0000 .....@.......@.. +00000060: 0000 0000 1840 0000 0000 0000 2040 .....@...... @ +``` + +Breakdown omitted (follows nested array pattern). + +### 7.7 Geometry + +`Geometry` is a variant type that can hold any of the geographic types. It uses the `Variant` type structure with specific discriminant ordering for geo types. + +**Type Definition:** +```sql +Geometry = Variant(LineString, MultiLineString, MultiPolygon, Point, Polygon, Ring) +``` + +**Discriminant Mapping:** + +| Discriminant Index | Geo Type | +|--------------------|----------| +| 0 | LineString | +| 1 | MultiLineString | +| 2 | MultiPolygon | +| 3 | Point | +| 4 | Polygon | +| 5 | Ring | + +**Wire Format:** + +The `Geometry` type follows standard `Variant` serialization (see section 6.7): +1. Type string: `"Geometry"` (not `"Variant(LineString, ...)"`) +2. Stream `{VariantDiscriminatorsPrefix}`: Serialization mode (`UInt64`) +3. Stream `{VariantDiscriminators}`: Discriminator indices (mode-dependent) +4. Streams `{VariantElements, VariantElement[i], ...}`: Sparse data for each variant type + +**Important Configuration:** + +The `Geometry` type contains structurally similar variants (`LineString` and `Ring` are both `Array(Point)`), which ClickHouse considers "suspicious" for potential ambiguity. To use `Geometry`, you must enable the setting: + +```sql +SET allow_suspicious_variant_types = 1; +``` + +**Example: Geometry containing a Point** + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native&allow_suspicious_variant_types=1" \ + --data-binary "SELECT (1.0, 2.0)::Point::Geometry AS col" | xxd +``` + +``` +00000000: 0101 0363 6f6c 0847 656f 6d65 7472 7900 ...col.Geometry. +00000010: 0000 0000 0000 0003 0000 0000 0000 f03f ...............? +00000020: 0000 0000 0000 0040 .......@ +``` + +Breakdown: +```typescript +const block = new Uint8Array([ + 0x01, // NumColumns = 1 + 0x01, // NumRows = 1 + + // Column metadata: + 0x03, 0x63, 0x6f, 0x6c, // Column name = "col" + 0x08, 0x47, 0x65, 0x6f, // Column type = "Geometry" (length=8) + 0x6d, 0x65, 0x74, 0x72, + 0x79, + + // VariantDiscriminatorsPrefix stream: + 0x00, 0x00, 0x00, 0x00, // Serialization mode = 0 (BASIC) + 0x00, 0x00, 0x00, 0x00, // (UInt64 little-endian) + + // VariantDiscriminators stream (BASIC mode): + 0x03, // Discriminator = 3 (Point type) + + // VariantElements/VariantElement[3] (Point data): + // TupleElement0 (X coordinate): + 0x00, 0x00, 0x00, 0x00, // X = 1.0 (Float64) + 0x00, 0x00, 0xf0, 0x3f, + // TupleElement1 (Y coordinate): + 0x00, 0x00, 0x00, 0x00, // Y = 2.0 (Float64) + 0x00, 0x00, 0x00, 0x40, +]); +``` + +**Key Observations:** + +1. **Type Name vs. Underlying Type**: The wire format uses the geo type alias name (e.g., `"Point"`, `"Geometry"`) in the type string, not the underlying composite type name. This allows parsers to distinguish geographic types from generic tuples and arrays. + +2. **Structural Composition**: Geo types compose existing Native format features: + - `Point` uses `Tuple` streams + - `Ring`, `LineString` use `Array` streams + - `Polygon`, `MultiPolygon`, `MultiLineString` use nested `Array` streams + - `Geometry` uses `Variant` streams + +3. **Sparse Variant Storage**: In `Geometry`, only the variant element corresponding to the discriminator contains data. Other variant elements are empty. + +4. **Ambiguity Warning**: `LineString` and `Ring` have identical underlying types (`Array(Point)`), which creates potential ambiguity in `Geometry`. This is why `allow_suspicious_variant_types=1` is required. + +--- + +## 8. Vector Types (QBit) + +`QBit` is a specialized vector data type designed for efficient similarity search operations. It stores vectors of floating-point numbers in a bit-transposed format for optimized SIMD operations during distance calculations. + +#### Type Definition + +```sql +QBit(element_type, dimension) +``` + +Where: +- `element_type`: The numeric type of vector elements (`Float32`, `Float64`, or `BFloat16`) +- `dimension`: The number of elements in the vector (fixed size) + +#### Wire Format + +**Important:** The wire format differs significantly between Native and RowBinary formats. + +##### RowBinary Format + +In RowBinary format, `QBit` is serialized exactly like an `Array` of the element type: + +1. **Size** (VarUInt): Number of elements in the vector (must match the declared dimension) +2. **Elements**: Sequential float values in little-endian format + +**Example: QBit(Float32, 3) with values [1.0, 2.0, 3.0] in RowBinaryWithNamesAndTypes** + +```bash +curl -s "http://localhost:8123/?allow_experimental_qbit_type=1" \ + --data-binary "SELECT [1.0, 2.0, 3.0]::QBit(Float32, 3) AS vec FORMAT RowBinaryWithNamesAndTypes" | xxd +``` + +``` +00000000: 0103 7665 6310 5142 6974 2846 6c6f 6174 ..vec.QBit(Float +00000010: 3332 2c20 3329 0300 0080 3f00 0000 4000 32, 3)....?...@. +00000020: 0040 40 .@@ +``` + +The data portion (after type metadata) is: +``` +0x03 # Size = 3 (VarUInt) +0x00 0x00 0x80 0x3F # 1.0 as little-endian Float32 +0x00 0x00 0x00 0x40 # 2.0 as little-endian Float32 +0x00 0x00 0x40 0x40 # 3.0 as little-endian Float32 +``` + +##### Native Format (Bit-Transposed) + +In Native format, `QBit` uses a **bit-transposed encoding** that is NOT the same as Array. This format organizes data across multiple `FixedString` columns as a `Tuple` for SIMD-optimized distance calculations. + +**Example: QBit(Float32, 3) in Native format** + +```bash +curl -s "http://localhost:8123/?allow_experimental_qbit_type=1" \ + --data-binary "SELECT [1.0, 2.0, 3.0]::QBit(Float32, 3) AS vec FORMAT Native" | xxd +``` + +``` +00000000: 0101 0376 6563 1051 4269 7428 466c 6f61 ...vec.QBit(Floa +00000010: 7433 322c 2033 2900 0601 0101 0101 0101 t32, 3)......... +00000020: 0400 0000 0000 0000 0000 0000 0000 0000 ................ +00000030: 0000 0000 0000 00 ....... +``` + +The bit transposition maps: +- Bit j of element i → Bit i of bit-plane j + +This internal format is significantly more complex than simple array serialization and is designed for optimal SIMD performance during vector similarity computations. + +#### Type Encoding + +When using binary type encoding (with `output_format_native_encode_types_in_binary_format=1`): + +``` +0x36 # BinaryTypeIndex::QBit + # Encoded element type (Float32=0x0D, Float64=0x0E, BFloat16=0x31) + # Vector dimension +``` + +--- + +## 9. AggregateFunction Types + +The `AggregateFunction` type stores the intermediate state of an aggregate function. This enables incremental aggregation, state merging across distributed systems, and materialized aggregate views. + +### Type Definition + +```sql +AggregateFunction(function_name[(param1, param2, ...)], arg_type1, arg_type2, ...) +``` + +Examples: +- `AggregateFunction(sum, UInt64)` - State for `sum` over `UInt64` values +- `AggregateFunction(quantile(0.5), Float64)` - State for median calculation +- `AggregateFunction(uniqExact, String)` - State for exact unique count + +### Version Management + +Starting from revision **54452** (`DBMS_MIN_REVISION_WITH_AGGREGATE_FUNCTIONS_VERSIONING`), aggregate function states include version information. The version number corresponds to the `client_revision` and allows backward-compatible changes to serialization formats. + +**Before revision 54452:** +- States were serialized without version metadata +- Format changes required careful coordination + +**From revision 54452 onwards:** +- Each state includes a version number +- Version is used during both serialization and deserialization +- Allows gradual rollout of format improvements + +### Wire Format + +When serializing `AggregateFunction` values in Native format: + +**Type Name Format:** +``` +"AggregateFunction(function_name(param1, param2, ...), Type1, Type2, ...)" +``` + +**Binary Type Encoding (when `output_format_native_encode_types_in_binary_format=1`):** +``` +0x25 # BinaryTypeIndex::AggregateFunction + # Version number (e.g., client revision) + # Length of function name + # Function name as UTF-8 bytes + # Number of parameters + # Parameter 1 (Field binary encoding) +... + # Parameter N + # Number of argument types + # Argument type 1 (recursive type encoding) +... + # Argument type N +``` + +**State Data Format:** +Each aggregate function state is serialized as an opaque binary blob. The exact format is function-specific and controlled by the `IAggregateFunction::serialize()` and `IAggregateFunction::deserialize()` methods. + +When writing a single value: +``` + # Function-specific binary data +``` + +The state data includes: +1. Internal counters, accumulators, or intermediate values +2. May include embedded data structures (e.g., hash tables for `uniq`, sketches for `quantile`) +3. Format is optimized for the specific aggregate function + +### Example: sumState + +```bash +curl -s -XPOST "http://localhost:8123?default_format=Native" \ + --data-binary "SELECT sumState(number) AS col FROM numbers(5)" | xxd +``` + +The result contains: +``` +01 # NumColumns = 1 +01 # NumRows = 1 +03 63 6f 6c # Column name = "col" (length=3) + # Type name = "AggregateFunction(sum, UInt64)" + + # For sum(0,1,2,3,4) = 10, stored as UInt64 +``` + +For `sum` over integers, the state is simply the accumulated value. For complex functions like `quantile` or `uniq`, the state contains sophisticated data structures. + +### Bulk Serialization + +When serializing multiple aggregate function states (e.g., in a column), the format uses `serializeBatch`: + +``` + + +... + +``` + +Each state is serialized sequentially, with the aggregate function controlling the exact format through its versioned serialization methods. + +### Function-Specific State Examples + +**sum/avg**: Simple accumulator values +**uniq**: Hash table or HyperLogLog sketch +**quantile**: T-Digest or other quantile estimation structure +**groupArray**: Array of accumulated values +**topK**: Heap of top K elements with counts + +--- + +## 10. Format Settings + +The Native format behavior can be customized using several format-specific settings. These settings affect both input parsing and output serialization. + +### input_format_native_allow_types_conversion + +**Type:** Bool +**Default:** `true` +**Applies to:** Input only + +Allows automatic type conversion when reading Native format data. When enabled, ClickHouse will attempt to convert column types from the input data to match the target table schema. + +**Example Use Cases:** +- Reading older data with slightly different type definitions +- Importing data where numeric types don't exactly match (e.g., `UInt32` to `UInt64`) +- Upgrading tables with changed column types + +**When disabled (`false`):** +- Type mismatches cause errors +- Strict type checking enforced +- Better for ensuring data consistency + +```sql +SET input_format_native_allow_types_conversion = 1; +INSERT INTO table SELECT * FROM input('Native', 'col UInt64'); +``` + +### input_format_native_decode_types_in_binary_format + +**Type:** Bool +**Default:** `false` +**Applies to:** Input only + +When enabled, expects type names to be encoded in binary format (as described in `DataTypesBinaryEncoding.h`) rather than as human-readable strings. + +**Binary format benefits:** +- More compact representation (single byte for simple types) +- Faster parsing (no string parsing required) +- Unambiguous encoding + +**String format (default):** +- Human-readable type names (e.g., `"UInt32"`, `"String"`) +- More compatible with external tools +- Easier debugging + +**Example:** +```sql +-- Reading data with binary-encoded types +SET input_format_native_decode_types_in_binary_format = 1; +SELECT * FROM file('data.native', 'Native'); +``` + +### output_format_native_encode_types_in_binary_format + +**Type:** Bool +**Default:** `false` +**Applies to:** Output only + +When enabled, writes type names in binary format rather than as strings. This is the output counterpart to `input_format_native_decode_types_in_binary_format`. + +**Benefits of binary encoding:** +- Reduced output size (especially for complex nested types) +- Faster serialization +- Precise encoding of all type parameters + +**Type encoding table:** See `DataTypesBinaryEncoding.h` for the complete mapping (excerpted in section 8.1). + +**Example:** +```sql +SET output_format_native_encode_types_in_binary_format = 1; +SELECT * FROM table FORMAT Native; +``` + +**Binary vs String comparison:** +- `UInt32`: Binary = 1 byte (`0x03`), String = 11 bytes (`"UInt32"` with length prefix) +- `Array(UInt8)`: Binary = 2 bytes (`0x1E 0x01`), String = 16 bytes +- Complex nested types: Binary can be 10x smaller + +### output_format_native_write_json_as_string + +**Type:** Bool +**Default:** `false` +**Applies to:** Output only, `JSON` data type + +Controls how the new `JSON` data type is serialized in Native format. + +**When disabled (default `false`):** +- `JSON` columns use their native binary serialization format +- Preserves full type structure with typed paths +- Supports efficient querying of JSON subcolumns +- Uses internal representation with dynamic types + +**When enabled (`true`):** +- `JSON` columns are serialized as `String` columns +- Each value is a JSON string +- Simpler format, easier interoperability +- Loses type information for paths + +**Example:** +```sql +CREATE TABLE json_test (data JSON) ENGINE = Memory; +INSERT INTO json_test VALUES ('{"id": 1, "name": "test"}'); + +-- Native JSON serialization (structured) +SET output_format_native_write_json_as_string = 0; +SELECT * FROM json_test FORMAT Native; + +-- As string (JSON text) +SET output_format_native_write_json_as_string = 1; +SELECT * FROM json_test FORMAT Native; +``` + +**Use when:** +- Exporting to systems that don't understand ClickHouse's `JSON` type +- Preserving exact JSON text representation +- Simplifying data pipeline integration + +### output_format_native_use_flattened_dynamic_and_json_serialization + +**Type:** Bool +**Default:** `false` +**Applies to:** Output only, `JSON` and `Dynamic` data types + +Controls whether `JSON` and `Dynamic` columns are serialized in a flattened format where all types/paths are written as separate subcolumns. + +**When disabled (default `false`):** +- Uses compact serialization optimized for the Native format +- More efficient for storage and transmission +- Uses version 2 serialization (revision >= 54473) + +**When enabled (`true`):** +- Each typed path in `JSON` becomes a separate subcolumn +- Each possible type in `Dynamic` is stored separately +- More verbose but easier to introspect +- Better compatibility with older versions or external tools + +**Flattened format structure:** +For `JSON`: +- Each discovered path becomes a subcolumn (e.g., `data.id`, `data.name`) +- Shared offsets for nested structures +- Null maps for optional paths + +For `Dynamic`: +- Discriminator column for type selection +- Separate column for each possible type variant +- Null maps for unused variants + +**Example:** +```sql +CREATE TABLE dynamic_test (value Dynamic) ENGINE = Memory; +INSERT INTO dynamic_test VALUES (42), ('text'), ([1,2,3]); + +-- Compact serialization +SET output_format_native_use_flattened_dynamic_and_json_serialization = 0; +SELECT * FROM dynamic_test FORMAT Native; + +-- Flattened serialization +SET output_format_native_use_flattened_dynamic_and_json_serialization = 1; +SELECT * FROM dynamic_test FORMAT Native; +``` + +**Trade-offs:** +- Flattened: Easier to understand, more compatible, larger size +- Compact: More efficient, requires newer client support + +--- + +## 11. Protocol Revision Reference + +The Native format behavior varies based on the protocol revision (also called `client_revision` in the code). This enables backward compatibility while allowing the format to evolve with new features. + +### Revision Thresholds + +| Revision | Constant Name | Feature | +|----------|---------------|---------| +| 54405 | DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE | LowCardinality support | +| 54452 | DBMS_MIN_REVISION_WITH_AGGREGATE_FUNCTIONS_VERSIONING | AggFunc versioning | +| 54454 | DBMS_MIN_REVISION_WITH_CUSTOM_SERIALIZATION | Custom serialization info | +| 54465 | DBMS_MIN_REVISION_WITH_SPARSE_SERIALIZATION | Sparse columns | +| 54473 | DBMS_MIN_REVISION_WITH_V2_DYNAMIC_AND_JSON_SERIALIZATION | Dynamic/JSON v2 | +| 54480 | DBMS_MIN_REVISION_WITH_OUT_OF_ORDER_BUCKETS_IN_AGGREGATION | Out of order bucket IDs | +| 54482 | DBMS_MIN_REVISION_WITH_REPLICATED_SERIALIZATION | Replicated columns | +| 54483 | DBMS_MIN_REVISION_WITH_NULLABLE_SPARSE_SERIALIZATION | Nullable sparse | + +### Detailed Revision Explanations + +#### 54405: LowCardinality Support + +**Feature:** Introduces the `LowCardinality` type wrapper for efficient storage of low-cardinality string columns. + +**Wire Format Impact:** +- Clients with `revision >= 54405`: Receive `LowCardinality(T)` types with dictionary-encoded values +- Clients with `revision < 54405`: Server automatically converts to the underlying type `T` + +**Backward Compatibility:** +The server checks `client_revision` and calls `recursiveRemoveLowCardinality()` before sending data to old clients, ensuring they receive materialized values instead of dictionary references. + +**Example:** +```sql +-- Server with LowCardinality column +CREATE TABLE test (country LowCardinality(String)) ENGINE = Memory; + +-- Old client (revision < 54405): Receives as String +-- New client (revision >= 54405): Receives with dictionary encoding +``` + +#### 54452: Aggregate Functions Versioning + +**Feature:** Adds version metadata to `AggregateFunction` state serialization. + +**Wire Format Impact:** +- Enables versioned serialization of aggregate function states +- Version number passed to `IAggregateFunction::serialize()` and `deserialize()` +- Allows incremental improvements to state formats + +**Before this revision:** +- States serialized without version information +- Format changes required careful coordination across clusters +- Upgrades had to be all-or-nothing + +**After this revision:** +- Each state includes version metadata +- Functions can support multiple format versions +- Gradual rollout of format improvements possible + +**Version Parameter:** +The version is typically set to `client_revision` when serializing, allowing functions to choose appropriate format based on client capabilities. + +See Section 9 for detailed `AggregateFunction` serialization format. + +#### 54454: Custom Serialization Info + +**Feature:** Enables custom serialization strategies based on column statistics. + +**Wire Format Impact:** +- Server analyzes column data and derives `SerializationInfo` +- Serialization chosen based on actual data characteristics +- Examples: sparse serialization for columns with many default values + +**Process:** +1. Server calls `DataType::getSerializationInfo(column)` to analyze data +2. Server calls `DataType::getSerialization(info)` to get optimized serialization +3. Server writes serialization metadata if needed +4. Client reads metadata and uses appropriate deserialization + +**Before this revision:** +- All columns used default serialization +- No optimization based on data characteristics + +**After this revision:** +- Sparse serialization for columns with mostly defaults +- Custom strategies for specific data patterns +- Significant storage and network savings for sparse data + +**Client Impact:** +Old clients (`revision < 54454`) receive data with `getDefaultSerialization()` and all sparse columns are materialized. + +#### 54465: Sparse Serialization + +**Feature:** Introduces sparse column format for efficient storage of columns with many default/zero values. + +**Wire Format Impact:** +Sparse columns are encoded as: +1. **Offsets array**: Positions of non-default values +2. **Values array**: Only non-default values + +This dramatically reduces size for sparse data. + +**Example:** +``` +Regular column: [0, 0, 0, 0, 42, 0, 0, 0, 99, 0] +Sparse encoding: + offsets = [4, 8] + values = [42, 99] +``` + +**Backward Compatibility:** +- Clients with `revision < 54465`: Server calls `recursiveRemoveSparse()` to materialize all values +- Clients with `revision >= 54465`: Receive sparse format when beneficial + +**Applies to:** +- Numeric columns with many zeros +- Nullable columns with many NULLs +- Any column where most values equal the default + +#### 54473: Dynamic/JSON Serialization v2 + +**Feature:** Improved serialization format for `Dynamic` and `JSON` types. + +**Wire Format Impact:** +- **V1 (before 54473):** Less efficient, simpler format +- **V2 (from 54473):** Optimized binary encoding, better compression + +**Settings Affected:** +When `client_revision < 54473`, the server sets: +```cpp +settings.dynamic_serialization_version = MergeTreeDynamicSerializationVersion::V1; +settings.object_serialization_version = MergeTreeObjectSerializationVersion::V1; +``` + +**V2 Improvements:** +- More compact type discriminators +- Better handling of mixed-type data +- Optimized path encoding for JSON +- Reduced metadata overhead + +**Backward Compatibility:** +Automatic downgrade to V1 for old clients, ensuring seamless upgrades. + +#### 54480: Out of Order Buckets in Aggregation + +**Feature:** Allows bucket IDs to arrive in any order during distributed aggregation. + +**Wire Format Impact:** +This primarily affects distributed aggregation protocol rather than basic Native format. When aggregating across multiple nodes: + +**Before 54480:** +- Bucket IDs expected in sorted order +- Required sorting/buffering + +**After 54480:** +- Buckets can arrive in any order +- Reduced latency in distributed queries +- Better parallelism + +**Native Format Impact:** +Minimal direct impact on basic Native format structure, but affects how aggregate states are transmitted during distributed two-level aggregation. + +#### 54482: Replicated Serialization + +**Feature:** Support for replicated column optimization in distributed queries. + +**Wire Format Impact:** +`Replicated` columns share data across multiple identical copies to save memory. + +**Before this revision:** +- All replicated columns converted to full columns: `convertToFullColumnIfReplicated()` +- Increased memory usage + +**From this revision:** +- Clients with `revision >= 54482`: Receive replicated columns as-is +- Clients with `revision < 54482`: Server materializes replicated columns + +**Use Case:** +Common in `JOIN` operations where the same small dimension table is used repeatedly. + +#### 54483: Nullable Sparse Serialization + +**Feature:** Extends sparse serialization to `Nullable` types. + +**Wire Format Impact:** +Enables sparse encoding for nullable columns, combining NULL map sparsity with value sparsity. + +**Efficiency Gains:** +``` +Example: Nullable(UInt64) with mostly NULLs and zeros +Regular: [NULL, NULL, 0, NULL, 42, NULL, 0, 0, NULL] +Sparse: + null_offsets = [2, 4, 6, 7] # non-NULL positions + null_values = [0, 42, 0, 0] + value_offsets = [1] # non-default positions within non-NULLs + value_values = [42] +``` + +**Backward Compatibility:** +- `revision < 54483`: Server materializes sparse nullable columns +- `revision >= 54483`: Can receive optimized sparse format + +### Revision Check Logic + +The server determines serialization format based on `client_revision`: + +```cpp +if (client_revision >= DBMS_MIN_REVISION_WITH_CUSTOM_SERIALIZATION) { + // Use optimized serialization + if (client_revision < DBMS_MIN_REVISION_WITH_REPLICATED_SERIALIZATION) + column = column->convertToFullColumnIfReplicated(); + if (client_revision < DBMS_MIN_REVISION_WITH_SPARSE_SERIALIZATION) + column = recursiveRemoveSparse(column); + // etc. +} +``` + +This cascading check ensures maximum compatibility while enabling new optimizations for capable clients. + +### Current Protocol Version + +As of the source code, the current TCP protocol version is **54483** (`DBMS_TCP_PROTOCOL_VERSION`), supporting all features listed above. diff --git a/rowbinaryspec.md b/docs/rowbinaryspec.md similarity index 97% rename from rowbinaryspec.md rename to docs/rowbinaryspec.md index d8cd782..aa413cb 100644 --- a/rowbinaryspec.md +++ b/docs/rowbinaryspec.md @@ -1,1109 +1,1109 @@ -# Binary formats overview - -## Introduction - -This document covers basic concepts of three main binary formats used in ClickHouse: - -- [RowBinary](about:blank#rowbinary): row-based format. Rows and their values are listed consecutively, without separators. -- [RowBinaryWithNamesAndTypes](about:blank#rowbinarywithnamesandtypes): similar to `RowBinary`, but with [additional metadata](about:blank#header-structure) about column names and types. -- [Native](about:blank#native): columnar format streamed as [blocks](about:blank#block-structure). Each block contains a header with metadata about the columns, followed by the column data itself. - -> Do not confuse Native format with the TCP (Native) interface. -> - -## Recordings - -- [The first recording of the document review](https://drive.google.com/file/d/1TY35nih3ft_ebOdWDybmlSdKV-52mSaw/view). -- *The second and the following recordings (if any) will be added here* - -## RowBinary - -> See also: RowBinary ClickHouse docs. -> - -### Data types wire format - -> Most of the queries provided in the examples can be executed with curl with a file output. -> -> -> ```bash -> curl -XPOST "http://localhost:8123?default_format=RowBinary" \ -> --data-binary "SELECT 42 :: UInt32" \ > out.bin -> ``` -> -> Then, the data can be examined with a hex editor like [ImHex](https://github.com/WerWolv/ImHex). -> - -### Unsigned LEB128 (Little Endian Base 128) - -An **unsigned little-endian** variable-width integer is used to hint of a length for some data types, such as `String`, `Array`, `Map`, and `Tuple`. A sample implementation can be found on the [LEB128 wiki page](https://en.wikipedia.org/wiki/LEB128#Decode_unsigned_integer). - -### (U)Int8, (U)Int16, (U)Int32, (U)Int64, (U)Int128, (U)Int256 - -All integer types are encoded with an appropriate number of bytes as **little-endian**. Most languages support extracting such integers from byte arrays, using either built-in tools, or well-known libraries. - -### Bool - -Boolean values are encoded as a single byte, and can be deserialized similarly to `UInt8`. - -- `0` is `false` -- `1` is `true` - -### Float32, Float64 - -**Little-endian** floating-point numbers encoded as 4 bytes for `Float32` and 8 bytes for `Float64`. Similarly to integers, most languages provide proper tools to deserialize these values. - -### BFloat16 - -[BFloat16](https://clickhouse.com/docs/sql-reference/data-types/float#bfloat16) (Brain Floating Point) is a 16-bit floating point format with the range of Float32 and reduced precision, making it useful for machine learning workloads. The wire format is essentially the top 16 bits of a Float32 value. If your language doesn't support it natively, the easiest way to handle it is to read and write as UInt16, converting to and from Float32: - -To convert BFloat16 to Float32: - -```tsx -// Read 2 bytes as little-endian UInt16 -// Left-shift by 16 bits to get Float32 bits -const bfloat16Bits = readUInt16() -const float32Bits = bfloat16Bits << 16 -const floatValue = reinterpretAsFloat32(float32Bits) -``` - -To convert Float32 to BFloat16: - -```tsx -// Right-shift Float32 bits by 16 to truncate to BFloat16 -const float32Bits = reinterpretAsUInt32(floatValue) -const bfloat16Bits = float32Bits >> 16 -writeUInt16(bfloat16Bits) -``` - -Sample underlying values for `BFloat16`: - -```sql -SELECT CAST(1.25, 'BFloat16') -``` - -```tsx -const data = new Uint8Array([ - 0xA0, 0x3F, // 1.25 as BFloat16 -]) -``` - -### Decimal32, Decimal64, Decimal128, Decimal256 - -Decimal types are represented as **little-endian** integers with respective bit width. - -- `Decimal32` - 4 bytes, or `Int32`. -- `Decimal64` - 8 bytes, or `Int64`. -- `Decimal128` - 16 bytes, or `Int128`. -- `Decimal256` - 32 bytes, or `Int256`. - -When deserializing a Decimal value, the whole and fractional parts can be derived using the following pseudocode: - -```tsx -let scale_multipler = 10 ** scale -let whole_part = ~~(value / scale_multiplier) -let fractional_part = value % scale_multiplier -let result = Decimal(whole_part, fractional_part) -``` - -Where `~~` is the “fast” floor division operator, and `scale` is the number of digits after the decimal point. For example, for `Decimal(10, 2)` (an equivalent to `Decimal32(2)`), the scale is `2`, and the value `12345` will be represented as `(123, 45)`. - -Serialization requires the reverse operation: - -```tsx -let scale_multiplier = 10 ** scale -let result = whole_part * scale_multiplier + fractional_part -``` - -See more details in the [Decimal types ClickHouse docs](https://clickhouse.com/docs/sql-reference/data-types/decimal). - -### String - -Encoded in two parts: - -1. A variable-length integer (LEB128) that indicates the length of the string in bytes. -2. The string itself, encoded as a sequence of bytes. - -For example, a string `foobar` will be encoded using *seven* bytes as follows: - -```tsx -const data = new Uint8Array([ - 0x06, // LEB128 length of the string (6) - 0x66, // 'f' - 0x6f, // 'o' - 0x6f, // 'o' - 0x62, // 'b' - 0x61, // 'a' - 0x72, // 'r' -]) -``` - -### FixedString - -Unlike `String`, `FixedString` has a fixed length, which is defined in the schema. It is encoded as a sequence of bytes, padded with zeroes if necessary. - -An empty `FixedString(3)` contains only padding zeroes: - -```tsx -const data = new Uint8Array([0x00, 0x00, 0x00]) -``` - -Non-empty `FixedString(3)` containing the string `hi`: - -```tsx -const data = new Uint8Array([ - 0x68, // 'h' - 0x69, // 'i' - 0x00, // padding zero -]) -``` - -Non-empty `FixedString(3)` containing the string `bar`: - -```tsx -const data = new Uint8Array([ - 0x62, // 'b' - 0x61, // 'a' - 0x72, // 'r' -]) -``` - -No padding is required in the last example, since all *three* bytes are used. - -### Date - -Stored as `UInt16` (two bytes) representing the number of days ***since*** `1970-01-01`. - -Supported range of values: `[1970-01-01, 2149-06-06]`. - -Sample underlying values for `Date`: - -- TODO - -### Date32 - -Stored as `Int32` (four bytes) representing the number of days ***before or after*** `1970-01-01`. - -Supported range of values: `[1900-01-01, 2299-12-31]` - -Sample underlying values for `Date32`: - -- TODO - -### DateTime - -Stored as `UInt32` (four bytes) representing the number of seconds ***since*** `1970-01-01 00:00:00 UTC`. - -Syntax: - -``` -DateTime([timezone]) -``` - -For example, `DateTime` or `DateTime('UTC')`. - -Supported range of values: `[1970-01-01 00:00:00, 2106-02-07 06:28:15]`. - -Sample underlying values for `DateTime`: - -- TODO - -### DateTime64 - -Stored as `Int64` (eight bytes) representing the number of **ticks** ***before or after*** `1970-01-01 00:00:00 UTC`. Tick resolution is defined by the `precision` parameter, see the syntax below: - -``` -DateTime64(precision, [timezone]) -``` - -Where `precision` is an integer from `0` to `9`. Typically, only the following are used: `3` (milliseconds), `6` (microseconds), -`9` (nanoseconds). - -Examples of valid DateTime64 definitions: `DateTime64(0)`, `DateTime64(3)`, `DateTime64(6, 'UTC')`, or `DateTime64(9, 'Europe/Amsterdam')`. - -> [!NOTE] -Variants like DateTime64(2) or DateTime64(5, 'UTC') are also valid, while not commonly used. -> - -The underlying `Int64` value of the `DateTime64` type could be used as number of the following units before or after the UNIX epoch: - -- `DateTime64(0)` - seconds. -- `DateTime64(3)` - milliseconds. -- `DateTime64(6)` - microseconds. -- `DateTime64(9)` - nanoseconds. - -Supported range of values: `[1900-01-01 00:00:00, 2299-12-31 23:59:59.99999999]`. - -Sample underlying values for `DateTime64`: - -- `DateTime64(3)`: value `1546300800000` represents `2019-01-01 00:00:00 UTC`. -- TODO - -> [!NOTE] -The precision of the maximum value is 8. If the maximum precision of 9 digits (nanoseconds) is used, the maximum supported value is 2262-04-11 23:47:16 in UTC. -> - -### Time - -Stored as `Int32` representing a time value in seconds. Negative values are valid. - -Supported range of values: `[-999:59:59, 999:59:59]` (i.e., `[-3599999, 3599999]` seconds). - -> [!NOTE] -At the moment, the setting enable_time_time64_type must be set to 1 to use Time or Time64. -> - -Sample underlying values for `Time`: - -```sql -SET enable_time_time64_type = 1; -SELECT CAST('15:32:16', 'Time') AS t -``` - -```tsx -const data = new Uint8Array([ - 0x80, 0xDA, 0x00, 0x00, // 55936 seconds = 15:32:16 -]) -``` - -### Time64 - -Internally stored as a `Decimal64` (which is stored as `Int64`) representing a time value with fractional seconds, with configurable precision. Negative values are valid. - -Syntax: - -``` -Time64(precision) -``` - -Where `precision` is an integer from `0` to `9`. Common values: `3` (milliseconds), `6` (microseconds), `9` (nanoseconds). - -Supported range of values: `[-999:59:59.xxxxxxxxx, 999:59:59.xxxxxxxxx]`. - -> [!NOTE] -At the moment, the setting enable_time_time64_type must be set to 1 to use Time or Time64. -> - -The underlying `Int64` value represents fractional seconds scaled by `10^precision`. - -Sample underlying values for `Time64`: - -```sql -SET enable_time_time64_type = 1; -SELECT CAST('15:32:16.123456', 'Time64(6)') AS t -``` - -```tsx -const data = new Uint8Array([ - 0x40, 0x82, 0x0D, 0x06, - 0x0D, 0x00, 0x00, 0x00, // 55936123456 as Int64 -]) -// 55936123456 / 10^6 = 55936.123456 seconds = 15:32:16.123456 -``` - -### Enum8, Enum16 - -Stored as a single byte (`Enum8` == `UInt8`) or two bytes (`Enum16` == `UInt16`) representing the index of the enum value in the enum definition. - -Note that an Enum can be defined in a fairly simple way, like this: - -```sql -SELECT 1 :: Enum8('hello' = 1, 'world' = 2) AS e; -``` - -``` - ┌─e─────┐ -1. │ hello │ - └───────┘ -``` - -The Enum8 defined above will have the following values map on the client: - -``` -Map { - 1: 'hello', - 2: 'world' -} -``` - -Or in a more complex way, like this: - -```sql -SELECT 42 :: Enum16('f\'' = 1, 'x =' = 2, 'b\'\'' = 3, '\'c=4=' = 42, '4' = 1234) AS e; -``` - -``` - ┌─e─────┐ -1. │ 'c=4= │ - └───────┘ -``` - -The Enum16 defined above will have the following values map on the client: - -``` -Map { - 1: 'f\'', - 2: 'x =', - 3: 'b\'', - 42: '\'c=4=', - 1234: '4' -} -``` - -As you can see, for the data type parser, the main challenge is to track the escaped symbols in the enum definition, such as `\'`, and special symbols like `=` that may appear in the quotes. - -### UUID - -Represented as a sequence of 16 bytes. - -Sample underlying values for `UUID`: - -- `61f0c404-5cb3-11e7-907b-a6006ad3dba0` is represented as: - -```tsx -const data = new Uint8Array([ - 0xE7, 0x11, 0xB3, 0x5C, 0x04, 0xC4, 0xF0, 0x61, - 0xA0, 0xDB, 0xD3, 0x6A, 0x00, 0xA6, 0x7B, 0x90, -]) -``` - -- The default UUID `00000000-0000-0000-0000-000000000000` is represented as 16 zero bytes: - -```tsx -const data = new Uint8Array([ - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, - ]) -``` - -It can be used when a new record was inserted, but the UUID value was not specified. - -### IPv4 - -Stored in four bytes as `UInt32`. Sample underlying values for `IPv4`: - -```sql -SELECT - CAST('0.0.0.0', 'IPv4') AS a, - CAST('127.0.0.1', 'IPv4') AS b, - CAST('192.168.0.1', 'IPv4') AS c, - CAST('255.255.255.255', 'IPv4') AS d, - CAST('168.212.226.204', 'IPv4') AS e -``` - -```tsx -const data = new Uint8Array([ - 0x00, 0x00, 0x00, 0x00, // 0.0.0.0 - 0x01, 0x00, 0x00, 0x7f, // 127.0.0.1 - 0x01, 0x00, 0xa8, 0xc0, // 192.168.0.1 - 0xff, 0xff, 0xff, 0xff, // 255.255.255.255 - 0xcc, 0xe2, 0xd4, 0xa8, // 168.212.226.204 -]) -``` - -### IPv6 - -Stored in 16 bytes, *but it does not correspond to `(U)Int128`*. Sample underlying values for `IPv6`: - -```sql -SELECT - CAST('2a02:aa08:e000:3100::2', 'IPv6') AS a, - CAST('2001:44c8:129:2632:33:0:252:2', 'IPv6') AS b, - CAST('2a02:e980:1e::1', 'IPv6') AS c -``` - -```tsx -const data = new Uint8Array([ - // 2a02:aa08:e000:3100::2 - 0x2A, 0x02, 0xAA, 0x08, 0xE0, 0x00, 0x31, 0x00, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, - // 2001:44c8:129:2632:33:0:252:2 - 0x20, 0x01, 0x44, 0xC8, 0x01, 0x29, 0x26, 0x32, - 0x00, 0x33, 0x00, 0x00, 0x02, 0x52, 0x00, 0x02, - // 2a02:e980:1e::1 - 0x2A, 0x02, 0xE9, 0x80, 0x00, 0x1E, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, -]) -``` - -### Nullable - -A nullable data type is encoded as follows: - -1. A single byte that indicates whether the value is `NULL` or not: - - `0x00` means the value is not `NULL`. - - `0x01` means the value is `NULL`. -2. If the value is not `NULL`, the underlying data type is encoded as usual. - -For example, a `Nullable(UInt32)` value: - -```sql -SELECT CAST(42, 'Nullable(UInt32)') AS a, - CAST(NULL, 'Nullable(UInt32)') AS b -``` - -```tsx -const data = new Uint8Array([ - 0x00, // Not NULL - the value follows - 0x2A, 0x00, 0x00, 0x00, // UInt32(42) - 0x01, // NULL - nothing follows -]) -``` - -### LowCardinality - -Low-cardinality marker does not affect the wire format. For example, a `LowCardinality(String)` is encoded the same way as a regular `String`. - -> [!NOTE] -A column can be defined as LowCardinality(Nullable(T)), but it is not possible to define it as Nullable(LowCardinality(T)) - it will always result in an error from the server. -> - -While testing, [allow_suspicious_low_cardinality_types](https://clickhouse.com/docs/operations/settings/settings#allow_suspicious_low_cardinality_types) can be set to `1` to allow most of the data types inside `LowCardinality` for better coverage. - -### Array - -An array is encoded as follows: - -1. A [variable-length integer (LEB128)](about:blank#unsigned-leb128-little-endian-base-128) that indicates the number of elements in the array. -2. The elements of the array, encoded in the same way as the underlying data type. - -For example, an array with `UInt32` values: - -```sql -SELECT CAST(array(1, 2, 3), 'Array(UInt32)') AS arr -``` - -```tsx -const data = new Uint8Array([ - 0x03, // LEB128 - the array has 3 elements - 0x01, 0x00, 0x00, 0x00, // UInt32(1) - 0x02, 0x00, 0x00, 0x00, // UInt32(2) - 0x03, 0x00, 0x00, 0x00, // UInt32(3) -]) -``` - -A slightly more complex example: - -```sql -SELECT array('foobar', 'qaz') AS arr -``` - -```tsx -const data = new Uint8Array([ - 0x02, // LEB128 - the array has 2 elements - 0x06, // LEB128 - the first string has 6 bytes - 0x66, 0x6f, 0x6f, - 0x62, 0x61, 0x72, // 'foobar' - 0x03, // LEB128 - the second string has 3 bytes - 0x71, 0x61, 0x7a, // 'qaz' -]) -``` - -> [!NOTE] -An array can contain nullable values, but the array itself cannot be nullable. -> - -The following is valid: - -```sql -SELECT CAST([NULL, 'foo'], 'Array(Nullable(String))') AS arr; -``` - -``` - ┌─arr──────────┐ -1. │ [NULL,'foo'] │ - └──────────────┘ -``` - -And it will be encoded as follows: - -```tsx -const data = new Uint8Array([ - 0x02, // LEB128 - the array has 2 elements - 0x01, // Is NULL - nothing follows for this element - 0x00, // Is NOT NULL - the data follows - 0x03, // LEB128 - the string has 3 bytes - 0x66, 0x6f, 0x6f, // 'foo' -]) -``` - -An example of dealing with multidimensional arrays can be found in the [Geo section](about:blank#geo). - -### Tuple - -A tuple is encoded as all elements of the tuple following each other in their corresponding wire format without any additional meta-information or delimiters. - -```sql -CREATE OR REPLACE TABLE foo -( - `t` Tuple( - UInt32, - String, - Array(UInt8) - ) -) -ENGINE = Memory; -INSERT INTO foo VALUES ((42, 'foo', array(99, 144))); -``` - -```tsx -const data = new Uint8Array([ - 0x2a, 0x00, 0x00, 0x00, // 42 as UInt32 - 0x03, // LEB128 - the string has 3 bytes - 0x66, 0x6f, 0x6f, // 'foo' - 0x02, // LEB128 - the array has 2 elements - 0x63, // 99 as UInt8 - 0x90, // 144 as UInt8 -]) -``` - -The string encoding of the tuple data type presents the similar challenges as with the [Enum type](about:blank#enum8-enum16), such as tracking the escaped symbols and special characters; now, with Tuple it is also required to track open and closing parenthesis. Additionally, note that the most complex Tuples can contain other nested Tuples, Arrays, Maps, and even enums. - -For example, in the following table, the tuple contains an enum with a tick and parenthesis in the name, which can cause parsing issues if not handled properly: - -```sql -CREATE OR REPLACE TABLE foo -( - `t` Tuple( - Enum8('f\'()' = 0), Array(Nullable(Tuple(UInt32, String))) ))ENGINE = Memory; -``` - -### Map - -A map can be viewed as an `Array(Tuple(K, V))`, where `K` is the key type and `V` is the value type. The map is encoded as follows: - -1. A [variable-length integer (LEB128)](about:blank#unsigned-leb128-little-endian-base-128) that indicates the number of elements in the map. -2. The elements of the map as key-value pairs, encoded as their corresponding types. - -For example, a map with `String` keys and `UInt32` values: - -```sql -SELECT CAST(map('foo', 1, 'bar', 2), 'Map(String, UInt32)') AS m -``` - -```tsx -const data = new Uint8Array([ - 0x02, // LEB128 - the map has 2 elements - 0x03, // LEB128 - the first key has 3 bytes - 0x66, 0x6f, 0x6f, // 'foo' - 0x01, 0x00, 0x00, 0x00, // UInt32(1) - 0x03, // LEB128 - the second key has 3 bytes - 0x62, 0x61, 0x72, // 'bar' - 0x02, 0x00, 0x00, 0x00, // UInt32(2) -]) -``` - -> It is possible to have maps with deeply nested structures, such as Map(String, Map(Int32, Array(Nullable(String)))), which will be encoded similarly to what is described above. -> - -### Variant - -This type represents a union of other data types. Type `Variant(T1, T2, ..., TN)` means that each row of this type has a value of either type `T1` or `T2` or … or `TN` or none of them (`NULL` value). - -> [!WARNING] -While for the end user Variant(T1, T2) means exactly the same as Variant(T2, T1), the order of types in the definition matters for the wire format: the types in the definition are always sorted alphabetically, and this is important, since the exact variant is encoded by a “discriminant” - the data type index in the definition. -> - -Consider the following example: - -```sql -SET allow_experimental_variant_type = 1, - allow_suspicious_variant_types = 1; -CREATE OR REPLACE TABLE foo -( - -- it does not matter what is the order of types in the user input -- the types are always sorted alphabetically in the wire format `var` Variant( - Array(Int16), - Bool, - Date, - FixedString(6), - Float32, Float64, - Int128, Int16, Int32, Int64, Int8, - String, - UInt128, UInt16, UInt32, UInt64, UInt8 - ) -) -ENGINE = MergeTree -ORDER BY (); -INSERT INTO foo VALUES(true), ('foo'), (CAST(100, 'Int128')), (array(1,2,3)); -SELECT * FROM foo; -``` - -```tsx -const data = new Uint8Array([ - 0x01, // type index -> Bool - 0x01, // true - 0x03, // type index -> FixedString(6) - 0x66, 0x6F, 0x6F, 0x62, 0x61, 0x72, // 'foobar' - 0x05, // type index -> Float64 - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x20, 0x59, 0x40, // 100.5 as Float64 - 0x06, // type index -> Int128 - 0x64, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, - 0x00, 0x00, 0x00, 0x00, // 100 as Int128 - 0x00, // type index -> Array(Int16) - 0x03, // LEB128 - the array has 3 elements - 0x01, 0x00, // 1 as Int16 - 0x02, 0x00, // 2 as Int16 - 0x03, 0x00, // 3 as Int16 -]) -``` - -[allow_suspicious_variant_types](https://clickhouse.com/docs/operations/settings/settings#allow_suspicious_variant_types) setting can be used to allow more exhaustive testing of the `Variant` type. - -### Dynamic - -The `Dynamic` type can hold values of any type, determined at runtime. In RowBinary format, each value is self-describing: the first part is the type specification in [**this format](https://clickhouse.com/docs/sql-reference/data-types/data-types-binary-encoding).** The contents then follow, with the value encoding as described in this document. So to parse you you just need to use the type index to determine the right parser and then re-use the RowBinary parsing you already have elsewhere. - -``` -[BinaryTypeIndex][type-specific parameters...][value] -``` - -Where `BinaryTypeIndex` is a single byte identifying the type. See the reference [here](https://clickhouse.com/docs/sql-reference/data-types/data-types-binary-encoding) for the type indices and parameters. - -**Examples:** - -```sql -SELECT 42::Dynamic -``` - -``` -0a # BinaryTypeIndex: Int64 (0x0A) -2a 00 00 00 00 00 00 00 # Int64 value: 42 -``` - -```sql -SELECT toDateTime64('2024-01-15 10:30:00', 3, 'America/New_York')::Dynamic -``` - -``` -14 # BinaryTypeIndex: DateTime64WithTimezone (0x14) -03 # UInt8: precision -10 # VarUInt: timezone name length -41 6d 65 72 69 63 61 2f # "America/" -4e 65 77 5f 59 6f 72 6b # "New_York" -c0 6c be 0d 8d 01 00 00 # Int64: timestamps -``` - -### JSON - -The JSON type encodes data in two distinct categories: - -1. **Typed Paths** - Paths declared with explicit types in the schema (e.g., `JSON(user_id UInt32, name String)`) -2. **Dynamic Paths/Overflow paths when dynamic path limit is exceeded** - Runtime-discovered paths stored as `Dynamic` type. The value encoding is preceded by the type definition. - -The wire format and rules are different for these two categories. - -| Path Category | Included in Serialization | Value Encoding | Variant/Nullable allowed | -| --- | --- | --- | --- | -| **Typed paths** | Always (even if NULL) | Type-specific binary format | Yes | -| **Dynamic paths** | Only if non-null | Dynamic | No | - -Each JSON row in RowBinary format is serialized as: - -``` -[VarUInt: number_of_paths] -[String: path_1][value_1] -[String: path_2][value_2] -... -``` - -**Examples:** - -1. Simple JSON with typed paths only: - -Schema: `JSON(user_id UInt32, active Bool)` - -Row: `{"user_id": 42, "active": true}` - -Binary encoding (hex with annotations): - -``` -02 # VarUInt: 2 paths total - -# Typed path "active" -06 61 63 74 69 76 65 # String: "active" (length 6 + bytes) -01 # Bool/UInt8 value: true (1) - -# Typed path "user_id" -07 75 73 65 72 5F 69 64 # String: "user_id" (length 7 + bytes) -2A 00 00 00 # UInt32 value: 42 (little-endian) -``` - -1. Simple JSON with typed and dynamic paths: - -Schema: `JSON(user_id UInt32, active Bool)` - -Row: `{"user_id": 42, "active": true, "name": "Alice"}` - -Binary encoding (hex with annotations): - -``` -03 # VarUInt: 3 paths total - -# Typed path "active" -06 61 63 74 69 76 65 # String: "active" (length 6 + bytes) -01 # Bool/UInt8 value: true (1) - -# Dynamic path "name" -04 6E 61 6D 65 # String: "name" (length 4 + bytes) -15 # BinaryTypeIndex: String (0x15) -05 41 6C 69 63 65 # String value: "Alice" (length 5 + bytes) - -# Typed path "user_id" -07 75 73 65 72 5F 69 64 # String: "user_id" (length 7 + bytes) -2A 00 00 00 # UInt32 value: 42 (little-endian) - -``` - -1. Null handling: - -With typed nullable column you get null: - -Schema: `JSON(score Nullable(Int32))` - -Row: `{"score": null }` - -Binary encoding (hex with annotations): - -``` -02 # VarUInt: 2 paths total - -# Typed path "score" (Nullable) -05 73 63 6f 72 65 # String: "score" (length 5 + bytes) -01 # Nullable flag: 1 (is NULL, no value follows) -``` - -With typed non-nullable column, you get the default value: - -JSON(name String) - -Schema: `JSON(name String)` - -Row: `{"name": null}` - -Binary encoding: - -``` -01 # VarUInt: 1 path (dynamic NULL paths are skipped!) - -04 6e 61 6d 65 # "name" -00 # String length 0 (empty string) -``` - -With dynamic path, it is ignored: - -Schema: `JSON(id UInt64)` - -Row: `{"id": 100, "metadata": null}` - -Binary encoding: - -``` -01 # VarUInt: 1 path (dynamic NULL paths are skipped!) - -# Typed path "id" -02 69 64 # String: "id" (length 2 + bytes) -64 00 00 00 00 00 00 00 # UInt64 value: 100 (little-endian) - -``` - -Note: The `metadata` path with NULL value is **not included** because dynamic paths are only serialized when non-null. This is a key difference from typed paths. - -1. Nested JSON objects: - -Schema: `JSON()` - -Row: `{"user": {"name": "Bob", "age": 30}}` - -Binary encoding (hex with annotations): - -``` -02 # VarUInt: 2 paths (nested objects are flattened) - -# Dynamic path "user.age" -08 75 73 65 72 2E 61 67 65 # String: "user.age" (length 8 + bytes) -0A # BinaryTypeIndex: Int64 (0x0A) -1E 00 00 00 00 00 00 00 # Int64 value: 30 (little-endian) - -# Dynamic path "user.name" -09 75 73 65 72 2E 6E 61 6D 65 # String: "user.name" (length 9 + bytes) -15 # BinaryTypeIndex: String (0x15) -03 42 6F 62 # String value: "Bob" (length 3 + bytes) - -``` - -Note: Nested objects are flattened into dot-separated paths (e.g., `user.name` instead of a nested structure). - -**Alternative: JSON as String Mode** - -With the setting `output_format_binary_write_json_as_string=1`, JSON columns are serialized as a single JSON text string instead of the structured binary format. There is a corresponding setting for writing to JSON columns, `input_format_binary_read_json_as_string`. The choice of setting here comes down to whether you want to parse the json in the client or the server. - -### Geo types - -Geo is a category of data types that represent geographical data. It includes: - -- `Point` - as `Tuple(Float64, Float64)`. -- `Ring` - as `Array(Point)`, or `Array(Tuple(Float64, Float64))`. -- `Polygon` - as `Array(Ring)`, or `Array(Array(Tuple(Float64, Float64)))`. -- `MultiPolygon` - as `Array(Polygon)`, or `Array(Array(Array(Tuple(Float64, Float64))))`. -- `LineString` - as `Array(Point)`, or `Array(Tuple(Float64, Float64))`. -- `MultiLineString` - as `Array(LineString)`, or `Array(Array(Tuple(Float64, Float64)))`. - -The wire format of the Geo values is exactly the same as with Tuple and Array. [RowBinaryWithNamesAndTypes](about:blank#rowbinarywithnamesandtypes) -and [Native](about:blank#native) formats headers will contain the aliases for these types, e.g., `Point`, `Ring`, `Polygon`, `MultiPolygon`, `LineString`, and `MultiLineString`. - -```sql -SELECT (1.0, 2.0) :: Point AS point, - [(3.0, 4.0), (5.0, 6.0)] :: Ring AS ring, - [[(7.0, 8.0), (9.0, 10.0)], [(11.0, 12.0)]] :: Polygon AS polygon, - [[[(13.0, 14.0), (15.0, 16.0)], [(17.0, 18.0)]]] :: MultiPolygon AS multi_polygon, - [(19.0, 20.0), (21.0, 22.0)] :: LineString AS line_string, - [[(23.0, 24.0), (25.0, 26.0)], [(27.0, 28.0)]] :: MultiLineString AS multi_line_string -``` - -```tsx -const data = new Uint8Array([ - // Point - or Tuple(Float64, Float64) - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xF0, 0x3F, // Point.X - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, // Point.Y - // Ring - or Array(Tuple(Float64, Float64)) - 0x02, // LEB128 - the "ring" array has 2 points - // Ring - Point #1 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x40, - // Ring - Point #2 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x14, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x18, 0x40, - // Polygon - or Array(Array(Tuple(Float64, Float64))) - 0x02, // LEB128 - the "polygon" array has 2 rings - 0x02, // LEB128 - the first ring has 2 points - // Polygon - Ring #1 - Point #1 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x1C, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x40, - // Polygon - Ring #1 - Point #2 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x22, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x24, 0x40, - 0x01, // LEB128 - the second ring has 1 point - // Polygon - Ring #2 - Point #1 (the only one) - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x26, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x28, 0x40, - // MultiPolygon - or Array(Array(Array(Tuple(Float64, Float64)))) - 0x01, // LEB128 - the "multi_polygon" array has 1 polygon - 0x02, // LEB128 - the first polygon has 2 rings - 0x02, // LEB128 - the first ring has 2 points - // MultiPolygon - Polygon #1 - Ring #1 - Point #1 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2A, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2C, 0x40, - // MultiPolygon - Polygon #1 - Ring #1 - Point #2 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2E, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x30, 0x40, - 0x01, // LEB128 - the second ring has 1 point - // MultiPolygon - Polygon #1 - Ring #2 - Point #1 (the only one) - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x31, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x32, 0x40, - // LineString - or Array(Tuple(Float64, Float64)) - 0x02, // LEB128 - the line string has 2 points - // LineString - Point #1 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x33, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x34, 0x40, - // LineString - Point #2 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x35, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x36, 0x40, - // MultiLineString - or Array(Array(Tuple(Float64, Float64))) - 0x02, // LEB128 - the multi line string has 2 line strings - 0x02, // LEB128 - the first line string has 2 points - // MultiLineString - LineString #1 - Point #1 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x37, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x38, 0x40, - // MultiLineString - LineString #1 - Point #2 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x39, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x3A, 0x40, - 0x01, // LEB128 - the second line string has 1 point - // MultiLineString - LineString #2 - Point #1 (the only one) - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x3B, 0x40, - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x3C, 0x40, -]) -``` - -### Geometry - -`Geometry` is a `Variant` type that can hold any of the Geo types listed above. On the wire, it is encoded exactly like a `Variant`, with a discriminant byte indicating which geo type follows. - -The discriminant indices for Geometry are: - -| Index | Type | -| --- | --- | -| 0 | LineString | -| 1 | MultiLineString | -| 2 | MultiPolygon | -| 3 | Point | -| 4 | Polygon | -| 5 | Ring | - -Wire format structure: - -```tsx -// 1 byte discriminant (0-5) -// followed by the corresponding geo type data -``` - -Sample encoding of a `Point` as `Geometry`: - -```sql -SELECT ((1.0, 2.0)::Point)::Geometry -``` - -```tsx -const data = new Uint8Array([ - 0x03, // discriminant = 3 (Point) - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xF0, 0x3F, // Point.X = 1.0 as Float64 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, // Point.Y = 2.0 as Float64 -]) -``` - -Sample encoding of a `Ring` as `Geometry`: - -```tsx -const data = new Uint8Array([ - 0x05, // discriminant = 5 (Ring) - 0x02, // LEB128 - array has 2 points - // Point #1 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x40, // X = 3.0 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x40, // Y = 4.0 - // Point #2 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x14, 0x40, // X = 5.0 - 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x18, 0x40, // Y = 6.0 -]) -``` - -### Nested - -In the wire format, `Nested` is represented as a sequence of arrays, where each array corresponds to a column in the nested structure. Each array is encoded as described in the [Array](about:blank#array) section. - -```sql -CREATE OR REPLACE TABLE foo -( - n Nested ( - a String, - b Int32 - ) -) ENGINE = MergeTree ORDER BY (); -INSERT INTO foo VALUES (array('foo', 'bar'), array(42, 144)) -``` - -Note that even the `DESCRIBE TABLE foo` will show the `Nested` structure as a sequence of arrays: - -``` -DESCRIBE TABLE foo - - ┌─name─┬─type──────────┐ -1. │ n.a │ Array(String) │ -2. │ n.b │ Array(Int32) │ - └──────┴───────────────┘ -``` - -Sample wire format for the `Nested` structure above: - -```tsx -const data = new Uint8Array([ - 0x02, // LEB128 - 2 String elements in the first array - 0x03, // LEB128 - the first string has 3 bytes - 0x66, 0x6F, 0x6F, // 'foo' - 0x03, // LEB128 - the second string has 3 bytes - 0x62, 0x61, 0x72, // 'bar' - 0x02, // LEB128 - 2 Int32 elements in the second array - 0x2A, 0x00, 0x00, 0x00, // 42 as Int32 - 0x90, 0x00, 0x00, 0x00, // 144 as Int32 -]) -``` - -### SimpleAggregateFunction - -TODO - -### QBit - -`QBit` vector type for efficient lookup with different levels of precision. Internally it’s stored in a transposed format. On the wire, QBit is simply an `Array` of the underlying element type (`Float32`, `Float64`, or `BFloat16`). The bit-transpose optimization for storage happens server-side, not in the RowBinary protocol. - -Syntax: - -``` -QBit(element_type, dimension) -``` - -Where `element_type` is `Float32`, `Float64`, or `BFloat16`, and `dimension` is the fixed vector dimension. - -Wire format: identical to `Array(element_type)`: - -```tsx -// LEB128 length -// followed by `length` elements of `element_type` -``` - -Sample encoding of `QBit(Float32, 4)` containing `[1.0, 2.0, 3.0, 4.0]`: - -```sql -SELECT [1.0, 2.0, 3.0, 4.0]::QBit(Float32, 4) -``` - -```tsx -const data = new Uint8Array([ - 0x04, // LEB128 - array has 4 elements - 0x00, 0x00, 0x80, 0x3F, // 1.0 as Float32 - 0x00, 0x00, 0x00, 0x40, // 2.0 as Float32 - 0x00, 0x00, 0x40, 0x40, // 3.0 as Float32 - 0x00, 0x00, 0x80, 0x40, // 4.0 as Float32 -]) -``` - ---- - -## RowBinaryWithNamesAndTypes - -> [!TIP] -See also: RowBinaryWithNamesAndTypes ClickHouse docs. -> - -Often abbreviated as **RBWNAT** for convenience. Used as the main format for [clickhouse-java](https://github.com/ClickHouse/clickhouse-java) and [clickhouse-rs](https://github.com/ClickHouse/clickhouse-rs) (WIP) clients. - -The wire format is exactly the same as `RowBinary`, but it adds a header with the following information that enables the client to ensure proper (de)serialization of the data. - -### Header structure - -- [LEB128](about:blank#unsigned-leb128-little-endian-base-128) as `N`: number of columns in the resulting rows. -- `N` times: [String](about:blank#string) as the name of the column. -- `N` times: [String](about:blank#string) as the type of the column. - -An example of the **RBWNAT** header for a table with three columns, `id UInt32`, `name String` and `sku Array(UInt64)`: - -```bash -curl -XPOST "http://localhost:8123?default_format=RowBinaryWithNamesAndTypes" \ --data-binary "SELECT 42 :: UInt32 AS id, - 'foobar' :: String AS name, - array(23) :: Array(UInt64) AS sku - LIMIT 0" \ > out.bin -``` - -```tsx -const data = new Uint8Array([ - 0x03, // LEB128 - rows have 3 columns - 0x02, // LEB128 - str has 2 bytes - 0x69, 0x64, // id - 0x04, // LEB128 - str has 4 bytes - 0x6E, 0x61, 0x6D, 0x65, // name - 0x03, // LEB128 - str has 3 bytes - 0x73, 0x6B, 0x75, // sku - 0x06, // LEB128 - str has 6 bytes - 0x55, 0x49, 0x6E, 0x74, 0x33, 0x32, // UInt32 - 0x06, // LEB128 - str has 6 bytes - 0x53, 0x74, 0x72, 0x69, 0x6E, 0x67, // String - 0x0D, // LEB128 - str has 13 bytes - 0x41, 0x72, 0x72, 0x61, 0x79, 0x28, 0x55, - 0x49, 0x6E, 0x74, 0x36, 0x34, 0x29, // Array(UInt64) -]) -``` - -Currently, the client has to parse the data types AST from the strings provided in the header. - +# Binary formats overview + +## Introduction + +This document covers basic concepts of three main binary formats used in ClickHouse: + +- [RowBinary](about:blank#rowbinary): row-based format. Rows and their values are listed consecutively, without separators. +- [RowBinaryWithNamesAndTypes](about:blank#rowbinarywithnamesandtypes): similar to `RowBinary`, but with [additional metadata](about:blank#header-structure) about column names and types. +- [Native](about:blank#native): columnar format streamed as [blocks](about:blank#block-structure). Each block contains a header with metadata about the columns, followed by the column data itself. + +> Do not confuse Native format with the TCP (Native) interface. +> + +## Recordings + +- [The first recording of the document review](https://drive.google.com/file/d/1TY35nih3ft_ebOdWDybmlSdKV-52mSaw/view). +- *The second and the following recordings (if any) will be added here* + +## RowBinary + +> See also: RowBinary ClickHouse docs. +> + +### Data types wire format + +> Most of the queries provided in the examples can be executed with curl with a file output. +> +> +> ```bash +> curl -XPOST "http://localhost:8123?default_format=RowBinary" \ +> --data-binary "SELECT 42 :: UInt32" \ > out.bin +> ``` +> +> Then, the data can be examined with a hex editor like [ImHex](https://github.com/WerWolv/ImHex). +> + +### Unsigned LEB128 (Little Endian Base 128) + +An **unsigned little-endian** variable-width integer is used to hint of a length for some data types, such as `String`, `Array`, `Map`, and `Tuple`. A sample implementation can be found on the [LEB128 wiki page](https://en.wikipedia.org/wiki/LEB128#Decode_unsigned_integer). + +### (U)Int8, (U)Int16, (U)Int32, (U)Int64, (U)Int128, (U)Int256 + +All integer types are encoded with an appropriate number of bytes as **little-endian**. Most languages support extracting such integers from byte arrays, using either built-in tools, or well-known libraries. + +### Bool + +Boolean values are encoded as a single byte, and can be deserialized similarly to `UInt8`. + +- `0` is `false` +- `1` is `true` + +### Float32, Float64 + +**Little-endian** floating-point numbers encoded as 4 bytes for `Float32` and 8 bytes for `Float64`. Similarly to integers, most languages provide proper tools to deserialize these values. + +### BFloat16 + +[BFloat16](https://clickhouse.com/docs/sql-reference/data-types/float#bfloat16) (Brain Floating Point) is a 16-bit floating point format with the range of Float32 and reduced precision, making it useful for machine learning workloads. The wire format is essentially the top 16 bits of a Float32 value. If your language doesn't support it natively, the easiest way to handle it is to read and write as UInt16, converting to and from Float32: + +To convert BFloat16 to Float32: + +```tsx +// Read 2 bytes as little-endian UInt16 +// Left-shift by 16 bits to get Float32 bits +const bfloat16Bits = readUInt16() +const float32Bits = bfloat16Bits << 16 +const floatValue = reinterpretAsFloat32(float32Bits) +``` + +To convert Float32 to BFloat16: + +```tsx +// Right-shift Float32 bits by 16 to truncate to BFloat16 +const float32Bits = reinterpretAsUInt32(floatValue) +const bfloat16Bits = float32Bits >> 16 +writeUInt16(bfloat16Bits) +``` + +Sample underlying values for `BFloat16`: + +```sql +SELECT CAST(1.25, 'BFloat16') +``` + +```tsx +const data = new Uint8Array([ + 0xA0, 0x3F, // 1.25 as BFloat16 +]) +``` + +### Decimal32, Decimal64, Decimal128, Decimal256 + +Decimal types are represented as **little-endian** integers with respective bit width. + +- `Decimal32` - 4 bytes, or `Int32`. +- `Decimal64` - 8 bytes, or `Int64`. +- `Decimal128` - 16 bytes, or `Int128`. +- `Decimal256` - 32 bytes, or `Int256`. + +When deserializing a Decimal value, the whole and fractional parts can be derived using the following pseudocode: + +```tsx +let scale_multipler = 10 ** scale +let whole_part = ~~(value / scale_multiplier) +let fractional_part = value % scale_multiplier +let result = Decimal(whole_part, fractional_part) +``` + +Where `~~` is the “fast” floor division operator, and `scale` is the number of digits after the decimal point. For example, for `Decimal(10, 2)` (an equivalent to `Decimal32(2)`), the scale is `2`, and the value `12345` will be represented as `(123, 45)`. + +Serialization requires the reverse operation: + +```tsx +let scale_multiplier = 10 ** scale +let result = whole_part * scale_multiplier + fractional_part +``` + +See more details in the [Decimal types ClickHouse docs](https://clickhouse.com/docs/sql-reference/data-types/decimal). + +### String + +Encoded in two parts: + +1. A variable-length integer (LEB128) that indicates the length of the string in bytes. +2. The string itself, encoded as a sequence of bytes. + +For example, a string `foobar` will be encoded using *seven* bytes as follows: + +```tsx +const data = new Uint8Array([ + 0x06, // LEB128 length of the string (6) + 0x66, // 'f' + 0x6f, // 'o' + 0x6f, // 'o' + 0x62, // 'b' + 0x61, // 'a' + 0x72, // 'r' +]) +``` + +### FixedString + +Unlike `String`, `FixedString` has a fixed length, which is defined in the schema. It is encoded as a sequence of bytes, padded with zeroes if necessary. + +An empty `FixedString(3)` contains only padding zeroes: + +```tsx +const data = new Uint8Array([0x00, 0x00, 0x00]) +``` + +Non-empty `FixedString(3)` containing the string `hi`: + +```tsx +const data = new Uint8Array([ + 0x68, // 'h' + 0x69, // 'i' + 0x00, // padding zero +]) +``` + +Non-empty `FixedString(3)` containing the string `bar`: + +```tsx +const data = new Uint8Array([ + 0x62, // 'b' + 0x61, // 'a' + 0x72, // 'r' +]) +``` + +No padding is required in the last example, since all *three* bytes are used. + +### Date + +Stored as `UInt16` (two bytes) representing the number of days ***since*** `1970-01-01`. + +Supported range of values: `[1970-01-01, 2149-06-06]`. + +Sample underlying values for `Date`: + +- TODO + +### Date32 + +Stored as `Int32` (four bytes) representing the number of days ***before or after*** `1970-01-01`. + +Supported range of values: `[1900-01-01, 2299-12-31]` + +Sample underlying values for `Date32`: + +- TODO + +### DateTime + +Stored as `UInt32` (four bytes) representing the number of seconds ***since*** `1970-01-01 00:00:00 UTC`. + +Syntax: + +``` +DateTime([timezone]) +``` + +For example, `DateTime` or `DateTime('UTC')`. + +Supported range of values: `[1970-01-01 00:00:00, 2106-02-07 06:28:15]`. + +Sample underlying values for `DateTime`: + +- TODO + +### DateTime64 + +Stored as `Int64` (eight bytes) representing the number of **ticks** ***before or after*** `1970-01-01 00:00:00 UTC`. Tick resolution is defined by the `precision` parameter, see the syntax below: + +``` +DateTime64(precision, [timezone]) +``` + +Where `precision` is an integer from `0` to `9`. Typically, only the following are used: `3` (milliseconds), `6` (microseconds), +`9` (nanoseconds). + +Examples of valid DateTime64 definitions: `DateTime64(0)`, `DateTime64(3)`, `DateTime64(6, 'UTC')`, or `DateTime64(9, 'Europe/Amsterdam')`. + +> [!NOTE] +Variants like DateTime64(2) or DateTime64(5, 'UTC') are also valid, while not commonly used. +> + +The underlying `Int64` value of the `DateTime64` type could be used as number of the following units before or after the UNIX epoch: + +- `DateTime64(0)` - seconds. +- `DateTime64(3)` - milliseconds. +- `DateTime64(6)` - microseconds. +- `DateTime64(9)` - nanoseconds. + +Supported range of values: `[1900-01-01 00:00:00, 2299-12-31 23:59:59.99999999]`. + +Sample underlying values for `DateTime64`: + +- `DateTime64(3)`: value `1546300800000` represents `2019-01-01 00:00:00 UTC`. +- TODO + +> [!NOTE] +The precision of the maximum value is 8. If the maximum precision of 9 digits (nanoseconds) is used, the maximum supported value is 2262-04-11 23:47:16 in UTC. +> + +### Time + +Stored as `Int32` representing a time value in seconds. Negative values are valid. + +Supported range of values: `[-999:59:59, 999:59:59]` (i.e., `[-3599999, 3599999]` seconds). + +> [!NOTE] +At the moment, the setting enable_time_time64_type must be set to 1 to use Time or Time64. +> + +Sample underlying values for `Time`: + +```sql +SET enable_time_time64_type = 1; +SELECT CAST('15:32:16', 'Time') AS t +``` + +```tsx +const data = new Uint8Array([ + 0x80, 0xDA, 0x00, 0x00, // 55936 seconds = 15:32:16 +]) +``` + +### Time64 + +Internally stored as a `Decimal64` (which is stored as `Int64`) representing a time value with fractional seconds, with configurable precision. Negative values are valid. + +Syntax: + +``` +Time64(precision) +``` + +Where `precision` is an integer from `0` to `9`. Common values: `3` (milliseconds), `6` (microseconds), `9` (nanoseconds). + +Supported range of values: `[-999:59:59.xxxxxxxxx, 999:59:59.xxxxxxxxx]`. + +> [!NOTE] +At the moment, the setting enable_time_time64_type must be set to 1 to use Time or Time64. +> + +The underlying `Int64` value represents fractional seconds scaled by `10^precision`. + +Sample underlying values for `Time64`: + +```sql +SET enable_time_time64_type = 1; +SELECT CAST('15:32:16.123456', 'Time64(6)') AS t +``` + +```tsx +const data = new Uint8Array([ + 0x40, 0x82, 0x0D, 0x06, + 0x0D, 0x00, 0x00, 0x00, // 55936123456 as Int64 +]) +// 55936123456 / 10^6 = 55936.123456 seconds = 15:32:16.123456 +``` + +### Enum8, Enum16 + +Stored as a single byte (`Enum8` == `UInt8`) or two bytes (`Enum16` == `UInt16`) representing the index of the enum value in the enum definition. + +Note that an Enum can be defined in a fairly simple way, like this: + +```sql +SELECT 1 :: Enum8('hello' = 1, 'world' = 2) AS e; +``` + +``` + ┌─e─────┐ +1. │ hello │ + └───────┘ +``` + +The Enum8 defined above will have the following values map on the client: + +``` +Map { + 1: 'hello', + 2: 'world' +} +``` + +Or in a more complex way, like this: + +```sql +SELECT 42 :: Enum16('f\'' = 1, 'x =' = 2, 'b\'\'' = 3, '\'c=4=' = 42, '4' = 1234) AS e; +``` + +``` + ┌─e─────┐ +1. │ 'c=4= │ + └───────┘ +``` + +The Enum16 defined above will have the following values map on the client: + +``` +Map { + 1: 'f\'', + 2: 'x =', + 3: 'b\'', + 42: '\'c=4=', + 1234: '4' +} +``` + +As you can see, for the data type parser, the main challenge is to track the escaped symbols in the enum definition, such as `\'`, and special symbols like `=` that may appear in the quotes. + +### UUID + +Represented as a sequence of 16 bytes. + +Sample underlying values for `UUID`: + +- `61f0c404-5cb3-11e7-907b-a6006ad3dba0` is represented as: + +```tsx +const data = new Uint8Array([ + 0xE7, 0x11, 0xB3, 0x5C, 0x04, 0xC4, 0xF0, 0x61, + 0xA0, 0xDB, 0xD3, 0x6A, 0x00, 0xA6, 0x7B, 0x90, +]) +``` + +- The default UUID `00000000-0000-0000-0000-000000000000` is represented as 16 zero bytes: + +```tsx +const data = new Uint8Array([ + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + ]) +``` + +It can be used when a new record was inserted, but the UUID value was not specified. + +### IPv4 + +Stored in four bytes as `UInt32`. Sample underlying values for `IPv4`: + +```sql +SELECT + CAST('0.0.0.0', 'IPv4') AS a, + CAST('127.0.0.1', 'IPv4') AS b, + CAST('192.168.0.1', 'IPv4') AS c, + CAST('255.255.255.255', 'IPv4') AS d, + CAST('168.212.226.204', 'IPv4') AS e +``` + +```tsx +const data = new Uint8Array([ + 0x00, 0x00, 0x00, 0x00, // 0.0.0.0 + 0x01, 0x00, 0x00, 0x7f, // 127.0.0.1 + 0x01, 0x00, 0xa8, 0xc0, // 192.168.0.1 + 0xff, 0xff, 0xff, 0xff, // 255.255.255.255 + 0xcc, 0xe2, 0xd4, 0xa8, // 168.212.226.204 +]) +``` + +### IPv6 + +Stored in 16 bytes, *but it does not correspond to `(U)Int128`*. Sample underlying values for `IPv6`: + +```sql +SELECT + CAST('2a02:aa08:e000:3100::2', 'IPv6') AS a, + CAST('2001:44c8:129:2632:33:0:252:2', 'IPv6') AS b, + CAST('2a02:e980:1e::1', 'IPv6') AS c +``` + +```tsx +const data = new Uint8Array([ + // 2a02:aa08:e000:3100::2 + 0x2A, 0x02, 0xAA, 0x08, 0xE0, 0x00, 0x31, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, + // 2001:44c8:129:2632:33:0:252:2 + 0x20, 0x01, 0x44, 0xC8, 0x01, 0x29, 0x26, 0x32, + 0x00, 0x33, 0x00, 0x00, 0x02, 0x52, 0x00, 0x02, + // 2a02:e980:1e::1 + 0x2A, 0x02, 0xE9, 0x80, 0x00, 0x1E, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, +]) +``` + +### Nullable + +A nullable data type is encoded as follows: + +1. A single byte that indicates whether the value is `NULL` or not: + - `0x00` means the value is not `NULL`. + - `0x01` means the value is `NULL`. +2. If the value is not `NULL`, the underlying data type is encoded as usual. + +For example, a `Nullable(UInt32)` value: + +```sql +SELECT CAST(42, 'Nullable(UInt32)') AS a, + CAST(NULL, 'Nullable(UInt32)') AS b +``` + +```tsx +const data = new Uint8Array([ + 0x00, // Not NULL - the value follows + 0x2A, 0x00, 0x00, 0x00, // UInt32(42) + 0x01, // NULL - nothing follows +]) +``` + +### LowCardinality + +Low-cardinality marker does not affect the wire format. For example, a `LowCardinality(String)` is encoded the same way as a regular `String`. + +> [!NOTE] +A column can be defined as LowCardinality(Nullable(T)), but it is not possible to define it as Nullable(LowCardinality(T)) - it will always result in an error from the server. +> + +While testing, [allow_suspicious_low_cardinality_types](https://clickhouse.com/docs/operations/settings/settings#allow_suspicious_low_cardinality_types) can be set to `1` to allow most of the data types inside `LowCardinality` for better coverage. + +### Array + +An array is encoded as follows: + +1. A [variable-length integer (LEB128)](about:blank#unsigned-leb128-little-endian-base-128) that indicates the number of elements in the array. +2. The elements of the array, encoded in the same way as the underlying data type. + +For example, an array with `UInt32` values: + +```sql +SELECT CAST(array(1, 2, 3), 'Array(UInt32)') AS arr +``` + +```tsx +const data = new Uint8Array([ + 0x03, // LEB128 - the array has 3 elements + 0x01, 0x00, 0x00, 0x00, // UInt32(1) + 0x02, 0x00, 0x00, 0x00, // UInt32(2) + 0x03, 0x00, 0x00, 0x00, // UInt32(3) +]) +``` + +A slightly more complex example: + +```sql +SELECT array('foobar', 'qaz') AS arr +``` + +```tsx +const data = new Uint8Array([ + 0x02, // LEB128 - the array has 2 elements + 0x06, // LEB128 - the first string has 6 bytes + 0x66, 0x6f, 0x6f, + 0x62, 0x61, 0x72, // 'foobar' + 0x03, // LEB128 - the second string has 3 bytes + 0x71, 0x61, 0x7a, // 'qaz' +]) +``` + +> [!NOTE] +An array can contain nullable values, but the array itself cannot be nullable. +> + +The following is valid: + +```sql +SELECT CAST([NULL, 'foo'], 'Array(Nullable(String))') AS arr; +``` + +``` + ┌─arr──────────┐ +1. │ [NULL,'foo'] │ + └──────────────┘ +``` + +And it will be encoded as follows: + +```tsx +const data = new Uint8Array([ + 0x02, // LEB128 - the array has 2 elements + 0x01, // Is NULL - nothing follows for this element + 0x00, // Is NOT NULL - the data follows + 0x03, // LEB128 - the string has 3 bytes + 0x66, 0x6f, 0x6f, // 'foo' +]) +``` + +An example of dealing with multidimensional arrays can be found in the [Geo section](about:blank#geo). + +### Tuple + +A tuple is encoded as all elements of the tuple following each other in their corresponding wire format without any additional meta-information or delimiters. + +```sql +CREATE OR REPLACE TABLE foo +( + `t` Tuple( + UInt32, + String, + Array(UInt8) + ) +) +ENGINE = Memory; +INSERT INTO foo VALUES ((42, 'foo', array(99, 144))); +``` + +```tsx +const data = new Uint8Array([ + 0x2a, 0x00, 0x00, 0x00, // 42 as UInt32 + 0x03, // LEB128 - the string has 3 bytes + 0x66, 0x6f, 0x6f, // 'foo' + 0x02, // LEB128 - the array has 2 elements + 0x63, // 99 as UInt8 + 0x90, // 144 as UInt8 +]) +``` + +The string encoding of the tuple data type presents the similar challenges as with the [Enum type](about:blank#enum8-enum16), such as tracking the escaped symbols and special characters; now, with Tuple it is also required to track open and closing parenthesis. Additionally, note that the most complex Tuples can contain other nested Tuples, Arrays, Maps, and even enums. + +For example, in the following table, the tuple contains an enum with a tick and parenthesis in the name, which can cause parsing issues if not handled properly: + +```sql +CREATE OR REPLACE TABLE foo +( + `t` Tuple( + Enum8('f\'()' = 0), Array(Nullable(Tuple(UInt32, String))) ))ENGINE = Memory; +``` + +### Map + +A map can be viewed as an `Array(Tuple(K, V))`, where `K` is the key type and `V` is the value type. The map is encoded as follows: + +1. A [variable-length integer (LEB128)](about:blank#unsigned-leb128-little-endian-base-128) that indicates the number of elements in the map. +2. The elements of the map as key-value pairs, encoded as their corresponding types. + +For example, a map with `String` keys and `UInt32` values: + +```sql +SELECT CAST(map('foo', 1, 'bar', 2), 'Map(String, UInt32)') AS m +``` + +```tsx +const data = new Uint8Array([ + 0x02, // LEB128 - the map has 2 elements + 0x03, // LEB128 - the first key has 3 bytes + 0x66, 0x6f, 0x6f, // 'foo' + 0x01, 0x00, 0x00, 0x00, // UInt32(1) + 0x03, // LEB128 - the second key has 3 bytes + 0x62, 0x61, 0x72, // 'bar' + 0x02, 0x00, 0x00, 0x00, // UInt32(2) +]) +``` + +> It is possible to have maps with deeply nested structures, such as Map(String, Map(Int32, Array(Nullable(String)))), which will be encoded similarly to what is described above. +> + +### Variant + +This type represents a union of other data types. Type `Variant(T1, T2, ..., TN)` means that each row of this type has a value of either type `T1` or `T2` or … or `TN` or none of them (`NULL` value). + +> [!WARNING] +While for the end user Variant(T1, T2) means exactly the same as Variant(T2, T1), the order of types in the definition matters for the wire format: the types in the definition are always sorted alphabetically, and this is important, since the exact variant is encoded by a “discriminant” - the data type index in the definition. +> + +Consider the following example: + +```sql +SET allow_experimental_variant_type = 1, + allow_suspicious_variant_types = 1; +CREATE OR REPLACE TABLE foo +( + -- it does not matter what is the order of types in the user input -- the types are always sorted alphabetically in the wire format `var` Variant( + Array(Int16), + Bool, + Date, + FixedString(6), + Float32, Float64, + Int128, Int16, Int32, Int64, Int8, + String, + UInt128, UInt16, UInt32, UInt64, UInt8 + ) +) +ENGINE = MergeTree +ORDER BY (); +INSERT INTO foo VALUES(true), ('foo'), (CAST(100, 'Int128')), (array(1,2,3)); +SELECT * FROM foo; +``` + +```tsx +const data = new Uint8Array([ + 0x01, // type index -> Bool + 0x01, // true + 0x03, // type index -> FixedString(6) + 0x66, 0x6F, 0x6F, 0x62, 0x61, 0x72, // 'foobar' + 0x05, // type index -> Float64 + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x20, 0x59, 0x40, // 100.5 as Float64 + 0x06, // type index -> Int128 + 0x64, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, // 100 as Int128 + 0x00, // type index -> Array(Int16) + 0x03, // LEB128 - the array has 3 elements + 0x01, 0x00, // 1 as Int16 + 0x02, 0x00, // 2 as Int16 + 0x03, 0x00, // 3 as Int16 +]) +``` + +[allow_suspicious_variant_types](https://clickhouse.com/docs/operations/settings/settings#allow_suspicious_variant_types) setting can be used to allow more exhaustive testing of the `Variant` type. + +### Dynamic + +The `Dynamic` type can hold values of any type, determined at runtime. In RowBinary format, each value is self-describing: the first part is the type specification in [**this format](https://clickhouse.com/docs/sql-reference/data-types/data-types-binary-encoding).** The contents then follow, with the value encoding as described in this document. So to parse you you just need to use the type index to determine the right parser and then re-use the RowBinary parsing you already have elsewhere. + +``` +[BinaryTypeIndex][type-specific parameters...][value] +``` + +Where `BinaryTypeIndex` is a single byte identifying the type. See the reference [here](https://clickhouse.com/docs/sql-reference/data-types/data-types-binary-encoding) for the type indices and parameters. + +**Examples:** + +```sql +SELECT 42::Dynamic +``` + +``` +0a # BinaryTypeIndex: Int64 (0x0A) +2a 00 00 00 00 00 00 00 # Int64 value: 42 +``` + +```sql +SELECT toDateTime64('2024-01-15 10:30:00', 3, 'America/New_York')::Dynamic +``` + +``` +14 # BinaryTypeIndex: DateTime64WithTimezone (0x14) +03 # UInt8: precision +10 # VarUInt: timezone name length +41 6d 65 72 69 63 61 2f # "America/" +4e 65 77 5f 59 6f 72 6b # "New_York" +c0 6c be 0d 8d 01 00 00 # Int64: timestamps +``` + +### JSON + +The JSON type encodes data in two distinct categories: + +1. **Typed Paths** - Paths declared with explicit types in the schema (e.g., `JSON(user_id UInt32, name String)`) +2. **Dynamic Paths/Overflow paths when dynamic path limit is exceeded** - Runtime-discovered paths stored as `Dynamic` type. The value encoding is preceded by the type definition. + +The wire format and rules are different for these two categories. + +| Path Category | Included in Serialization | Value Encoding | Variant/Nullable allowed | +| --- | --- | --- | --- | +| **Typed paths** | Always (even if NULL) | Type-specific binary format | Yes | +| **Dynamic paths** | Only if non-null | Dynamic | No | + +Each JSON row in RowBinary format is serialized as: + +``` +[VarUInt: number_of_paths] +[String: path_1][value_1] +[String: path_2][value_2] +... +``` + +**Examples:** + +1. Simple JSON with typed paths only: + +Schema: `JSON(user_id UInt32, active Bool)` + +Row: `{"user_id": 42, "active": true}` + +Binary encoding (hex with annotations): + +``` +02 # VarUInt: 2 paths total + +# Typed path "active" +06 61 63 74 69 76 65 # String: "active" (length 6 + bytes) +01 # Bool/UInt8 value: true (1) + +# Typed path "user_id" +07 75 73 65 72 5F 69 64 # String: "user_id" (length 7 + bytes) +2A 00 00 00 # UInt32 value: 42 (little-endian) +``` + +1. Simple JSON with typed and dynamic paths: + +Schema: `JSON(user_id UInt32, active Bool)` + +Row: `{"user_id": 42, "active": true, "name": "Alice"}` + +Binary encoding (hex with annotations): + +``` +03 # VarUInt: 3 paths total + +# Typed path "active" +06 61 63 74 69 76 65 # String: "active" (length 6 + bytes) +01 # Bool/UInt8 value: true (1) + +# Dynamic path "name" +04 6E 61 6D 65 # String: "name" (length 4 + bytes) +15 # BinaryTypeIndex: String (0x15) +05 41 6C 69 63 65 # String value: "Alice" (length 5 + bytes) + +# Typed path "user_id" +07 75 73 65 72 5F 69 64 # String: "user_id" (length 7 + bytes) +2A 00 00 00 # UInt32 value: 42 (little-endian) + +``` + +1. Null handling: + +With typed nullable column you get null: + +Schema: `JSON(score Nullable(Int32))` + +Row: `{"score": null }` + +Binary encoding (hex with annotations): + +``` +02 # VarUInt: 2 paths total + +# Typed path "score" (Nullable) +05 73 63 6f 72 65 # String: "score" (length 5 + bytes) +01 # Nullable flag: 1 (is NULL, no value follows) +``` + +With typed non-nullable column, you get the default value: + +JSON(name String) + +Schema: `JSON(name String)` + +Row: `{"name": null}` + +Binary encoding: + +``` +01 # VarUInt: 1 path (dynamic NULL paths are skipped!) + +04 6e 61 6d 65 # "name" +00 # String length 0 (empty string) +``` + +With dynamic path, it is ignored: + +Schema: `JSON(id UInt64)` + +Row: `{"id": 100, "metadata": null}` + +Binary encoding: + +``` +01 # VarUInt: 1 path (dynamic NULL paths are skipped!) + +# Typed path "id" +02 69 64 # String: "id" (length 2 + bytes) +64 00 00 00 00 00 00 00 # UInt64 value: 100 (little-endian) + +``` + +Note: The `metadata` path with NULL value is **not included** because dynamic paths are only serialized when non-null. This is a key difference from typed paths. + +1. Nested JSON objects: + +Schema: `JSON()` + +Row: `{"user": {"name": "Bob", "age": 30}}` + +Binary encoding (hex with annotations): + +``` +02 # VarUInt: 2 paths (nested objects are flattened) + +# Dynamic path "user.age" +08 75 73 65 72 2E 61 67 65 # String: "user.age" (length 8 + bytes) +0A # BinaryTypeIndex: Int64 (0x0A) +1E 00 00 00 00 00 00 00 # Int64 value: 30 (little-endian) + +# Dynamic path "user.name" +09 75 73 65 72 2E 6E 61 6D 65 # String: "user.name" (length 9 + bytes) +15 # BinaryTypeIndex: String (0x15) +03 42 6F 62 # String value: "Bob" (length 3 + bytes) + +``` + +Note: Nested objects are flattened into dot-separated paths (e.g., `user.name` instead of a nested structure). + +**Alternative: JSON as String Mode** + +With the setting `output_format_binary_write_json_as_string=1`, JSON columns are serialized as a single JSON text string instead of the structured binary format. There is a corresponding setting for writing to JSON columns, `input_format_binary_read_json_as_string`. The choice of setting here comes down to whether you want to parse the json in the client or the server. + +### Geo types + +Geo is a category of data types that represent geographical data. It includes: + +- `Point` - as `Tuple(Float64, Float64)`. +- `Ring` - as `Array(Point)`, or `Array(Tuple(Float64, Float64))`. +- `Polygon` - as `Array(Ring)`, or `Array(Array(Tuple(Float64, Float64)))`. +- `MultiPolygon` - as `Array(Polygon)`, or `Array(Array(Array(Tuple(Float64, Float64))))`. +- `LineString` - as `Array(Point)`, or `Array(Tuple(Float64, Float64))`. +- `MultiLineString` - as `Array(LineString)`, or `Array(Array(Tuple(Float64, Float64)))`. + +The wire format of the Geo values is exactly the same as with Tuple and Array. [RowBinaryWithNamesAndTypes](about:blank#rowbinarywithnamesandtypes) +and [Native](about:blank#native) formats headers will contain the aliases for these types, e.g., `Point`, `Ring`, `Polygon`, `MultiPolygon`, `LineString`, and `MultiLineString`. + +```sql +SELECT (1.0, 2.0) :: Point AS point, + [(3.0, 4.0), (5.0, 6.0)] :: Ring AS ring, + [[(7.0, 8.0), (9.0, 10.0)], [(11.0, 12.0)]] :: Polygon AS polygon, + [[[(13.0, 14.0), (15.0, 16.0)], [(17.0, 18.0)]]] :: MultiPolygon AS multi_polygon, + [(19.0, 20.0), (21.0, 22.0)] :: LineString AS line_string, + [[(23.0, 24.0), (25.0, 26.0)], [(27.0, 28.0)]] :: MultiLineString AS multi_line_string +``` + +```tsx +const data = new Uint8Array([ + // Point - or Tuple(Float64, Float64) + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xF0, 0x3F, // Point.X + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, // Point.Y + // Ring - or Array(Tuple(Float64, Float64)) + 0x02, // LEB128 - the "ring" array has 2 points + // Ring - Point #1 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x40, + // Ring - Point #2 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x14, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x18, 0x40, + // Polygon - or Array(Array(Tuple(Float64, Float64))) + 0x02, // LEB128 - the "polygon" array has 2 rings + 0x02, // LEB128 - the first ring has 2 points + // Polygon - Ring #1 - Point #1 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x1C, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x40, + // Polygon - Ring #1 - Point #2 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x22, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x24, 0x40, + 0x01, // LEB128 - the second ring has 1 point + // Polygon - Ring #2 - Point #1 (the only one) + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x26, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x28, 0x40, + // MultiPolygon - or Array(Array(Array(Tuple(Float64, Float64)))) + 0x01, // LEB128 - the "multi_polygon" array has 1 polygon + 0x02, // LEB128 - the first polygon has 2 rings + 0x02, // LEB128 - the first ring has 2 points + // MultiPolygon - Polygon #1 - Ring #1 - Point #1 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2A, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2C, 0x40, + // MultiPolygon - Polygon #1 - Ring #1 - Point #2 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2E, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x30, 0x40, + 0x01, // LEB128 - the second ring has 1 point + // MultiPolygon - Polygon #1 - Ring #2 - Point #1 (the only one) + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x31, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x32, 0x40, + // LineString - or Array(Tuple(Float64, Float64)) + 0x02, // LEB128 - the line string has 2 points + // LineString - Point #1 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x33, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x34, 0x40, + // LineString - Point #2 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x35, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x36, 0x40, + // MultiLineString - or Array(Array(Tuple(Float64, Float64))) + 0x02, // LEB128 - the multi line string has 2 line strings + 0x02, // LEB128 - the first line string has 2 points + // MultiLineString - LineString #1 - Point #1 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x37, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x38, 0x40, + // MultiLineString - LineString #1 - Point #2 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x39, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x3A, 0x40, + 0x01, // LEB128 - the second line string has 1 point + // MultiLineString - LineString #2 - Point #1 (the only one) + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x3B, 0x40, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x3C, 0x40, +]) +``` + +### Geometry + +`Geometry` is a `Variant` type that can hold any of the Geo types listed above. On the wire, it is encoded exactly like a `Variant`, with a discriminant byte indicating which geo type follows. + +The discriminant indices for Geometry are: + +| Index | Type | +| --- | --- | +| 0 | LineString | +| 1 | MultiLineString | +| 2 | MultiPolygon | +| 3 | Point | +| 4 | Polygon | +| 5 | Ring | + +Wire format structure: + +```tsx +// 1 byte discriminant (0-5) +// followed by the corresponding geo type data +``` + +Sample encoding of a `Point` as `Geometry`: + +```sql +SELECT ((1.0, 2.0)::Point)::Geometry +``` + +```tsx +const data = new Uint8Array([ + 0x03, // discriminant = 3 (Point) + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xF0, 0x3F, // Point.X = 1.0 as Float64 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, // Point.Y = 2.0 as Float64 +]) +``` + +Sample encoding of a `Ring` as `Geometry`: + +```tsx +const data = new Uint8Array([ + 0x05, // discriminant = 5 (Ring) + 0x02, // LEB128 - array has 2 points + // Point #1 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x40, // X = 3.0 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x40, // Y = 4.0 + // Point #2 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x14, 0x40, // X = 5.0 + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x18, 0x40, // Y = 6.0 +]) +``` + +### Nested + +In the wire format, `Nested` is represented as a sequence of arrays, where each array corresponds to a column in the nested structure. Each array is encoded as described in the [Array](about:blank#array) section. + +```sql +CREATE OR REPLACE TABLE foo +( + n Nested ( + a String, + b Int32 + ) +) ENGINE = MergeTree ORDER BY (); +INSERT INTO foo VALUES (array('foo', 'bar'), array(42, 144)) +``` + +Note that even the `DESCRIBE TABLE foo` will show the `Nested` structure as a sequence of arrays: + +``` +DESCRIBE TABLE foo + + ┌─name─┬─type──────────┐ +1. │ n.a │ Array(String) │ +2. │ n.b │ Array(Int32) │ + └──────┴───────────────┘ +``` + +Sample wire format for the `Nested` structure above: + +```tsx +const data = new Uint8Array([ + 0x02, // LEB128 - 2 String elements in the first array + 0x03, // LEB128 - the first string has 3 bytes + 0x66, 0x6F, 0x6F, // 'foo' + 0x03, // LEB128 - the second string has 3 bytes + 0x62, 0x61, 0x72, // 'bar' + 0x02, // LEB128 - 2 Int32 elements in the second array + 0x2A, 0x00, 0x00, 0x00, // 42 as Int32 + 0x90, 0x00, 0x00, 0x00, // 144 as Int32 +]) +``` + +### SimpleAggregateFunction + +TODO + +### QBit + +`QBit` vector type for efficient lookup with different levels of precision. Internally it’s stored in a transposed format. On the wire, QBit is simply an `Array` of the underlying element type (`Float32`, `Float64`, or `BFloat16`). The bit-transpose optimization for storage happens server-side, not in the RowBinary protocol. + +Syntax: + +``` +QBit(element_type, dimension) +``` + +Where `element_type` is `Float32`, `Float64`, or `BFloat16`, and `dimension` is the fixed vector dimension. + +Wire format: identical to `Array(element_type)`: + +```tsx +// LEB128 length +// followed by `length` elements of `element_type` +``` + +Sample encoding of `QBit(Float32, 4)` containing `[1.0, 2.0, 3.0, 4.0]`: + +```sql +SELECT [1.0, 2.0, 3.0, 4.0]::QBit(Float32, 4) +``` + +```tsx +const data = new Uint8Array([ + 0x04, // LEB128 - array has 4 elements + 0x00, 0x00, 0x80, 0x3F, // 1.0 as Float32 + 0x00, 0x00, 0x00, 0x40, // 2.0 as Float32 + 0x00, 0x00, 0x40, 0x40, // 3.0 as Float32 + 0x00, 0x00, 0x80, 0x40, // 4.0 as Float32 +]) +``` + +--- + +## RowBinaryWithNamesAndTypes + +> [!TIP] +See also: RowBinaryWithNamesAndTypes ClickHouse docs. +> + +Often abbreviated as **RBWNAT** for convenience. Used as the main format for [clickhouse-java](https://github.com/ClickHouse/clickhouse-java) and [clickhouse-rs](https://github.com/ClickHouse/clickhouse-rs) (WIP) clients. + +The wire format is exactly the same as `RowBinary`, but it adds a header with the following information that enables the client to ensure proper (de)serialization of the data. + +### Header structure + +- [LEB128](about:blank#unsigned-leb128-little-endian-base-128) as `N`: number of columns in the resulting rows. +- `N` times: [String](about:blank#string) as the name of the column. +- `N` times: [String](about:blank#string) as the type of the column. + +An example of the **RBWNAT** header for a table with three columns, `id UInt32`, `name String` and `sku Array(UInt64)`: + +```bash +curl -XPOST "http://localhost:8123?default_format=RowBinaryWithNamesAndTypes" \ --data-binary "SELECT 42 :: UInt32 AS id, + 'foobar' :: String AS name, + array(23) :: Array(UInt64) AS sku + LIMIT 0" \ > out.bin +``` + +```tsx +const data = new Uint8Array([ + 0x03, // LEB128 - rows have 3 columns + 0x02, // LEB128 - str has 2 bytes + 0x69, 0x64, // id + 0x04, // LEB128 - str has 4 bytes + 0x6E, 0x61, 0x6D, 0x65, // name + 0x03, // LEB128 - str has 3 bytes + 0x73, 0x6B, 0x75, // sku + 0x06, // LEB128 - str has 6 bytes + 0x55, 0x49, 0x6E, 0x74, 0x33, 0x32, // UInt32 + 0x06, // LEB128 - str has 6 bytes + 0x53, 0x74, 0x72, 0x69, 0x6E, 0x67, // String + 0x0D, // LEB128 - str has 13 bytes + 0x41, 0x72, 0x72, 0x61, 0x79, 0x28, 0x55, + 0x49, 0x6E, 0x74, 0x36, 0x34, 0x29, // Array(UInt64) +]) +``` + +Currently, the client has to parse the data types AST from the strings provided in the header. + --- \ No newline at end of file