From c785bc93f24e3b984fd759ebc74cbfa4bbe22a6a Mon Sep 17 00:00:00 2001
From: Prateek Gaur <prateek.gaur@snowflake.com>
Date: Mon, 9 Mar 2026 21:47:39 +0000
Subject: [PATCH 1/7] Add ALP (Adaptive Lossless floating-Point) encoding
 specification

Add the encoding specification for ALP (encoding value 10) to Encodings.md.
ALP compresses FLOAT and DOUBLE columns by converting values to integers via
decimal scaling, then applying Frame of Reference encoding and bit-packing.
Values that cannot be losslessly round-tripped are stored as exceptions.

The spec covers:
- Page layout: 7-byte header, offset array, compressed vectors
- Vector format: AlpInfo, ForInfo, packed values, exception data
- Encoding math: two-step multiplication for cross-language consistency
- Parameter selection, exception detection, and decoding steps

Based on the paper "ALP: Adaptive Lossless floating-Point Compression"
(Afroozeh and Boncz, SIGMOD 2024). Wire format matches the C++ Arrow
and Java parquet-java implementations.
---
 Encodings.md | 519 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 519 insertions(+)
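
Note for readers of this patch: the vector layout it specifies (AlpInfo, ForInfo,
PackedValues, ExceptionPositions, ExceptionValues) can be sketched in a few lines
of Python. This is a minimal sketch for FLOAT data only, assuming the
(exponent, factor) pair, the encoded integers, and the exceptions have already
been determined. The names `for_bitpack` and `serialize_float_vector` are
hypothetical helpers for illustration, not functions from the Arrow or
parquet-java implementations.

```python
import struct

def for_bitpack(encoded):
    """Frame-of-reference encode, then bit-pack the deltas LSB-first."""
    frame_of_reference = min(encoded)
    deltas = [v - frame_of_reference for v in encoded]
    # int.bit_length() equals ceil(log2(max_delta + 1)) for max_delta >= 0.
    bit_width = max(deltas).bit_length()
    acc = 0
    for i, delta in enumerate(deltas):
        acc |= delta << (i * bit_width)  # value i occupies bits [i*w, (i+1)*w)
    packed = acc.to_bytes((len(deltas) * bit_width + 7) // 8, "little")
    return frame_of_reference, bit_width, packed

def serialize_float_vector(exponent, factor, encoded, exceptions=()):
    """Serialize one FLOAT vector; `exceptions` holds (position, value) pairs."""
    frame_of_reference, bit_width, packed = for_bitpack(encoded)
    alp_info = struct.pack("<BBH", exponent, factor, len(exceptions))
    for_info = struct.pack("<iB", frame_of_reference, bit_width)
    positions = b"".join(struct.pack("<H", pos) for pos, _ in exceptions)
    values = b"".join(struct.pack("<f", val) for _, val in exceptions)
    return alp_info + for_info + packed + positions + values

# Example 1 and Example 2 from the spec text (placeholders already substituted).
example1 = serialize_float_vector(2, 0, [123, 456, 789, 12])
example2 = serialize_float_vector(
    1, 0, [15, 15, 25, 15], [(1, float("nan")), (3, 1.0 / 3.0)])
print(len(example1), len(example2))  # 14 23
```

The two sizes match the "Total" rows of Example 1 and Example 2 in the added
section. Accumulating into one big integer keeps the sketch short; a real
encoder would pack in fixed-size groups instead.
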
diff --git a/Encodings.md b/Encodings.md
index 1c766fb5..6f0758e6 100644
--- a/Encodings.md
+++ b/Encodings.md
@@ -391,3 +391,522 @@ After applying the transformation, the data has the following representation:
 ```
 Bytes  AA 00 A3 BB 11 B4 CC 22 C5 DD 33 D6
 ```
+
+<a name="ALP"></a>
+### Adaptive Lossless floating-Point: (ALP = 10)
+
+Supported Types: FLOAT, DOUBLE
+
+This encoding is adapted from the paper
+["ALP: Adaptive Lossless floating-Point Compression"](https://dl.acm.org/doi/10.1145/3626717)
+by Afroozeh and Boncz (SIGMOD 2024).
+
+ALP works by converting floating-point values to integers using decimal scaling,
+then applying Frame of Reference (FOR) encoding and bit-packing. Values that
+cannot be losslessly converted are stored as exceptions. The encoding achieves
+high compression for decimal-like floating-point data (e.g., monetary values,
+sensor readings) while remaining fully lossless.
+
+#### Overview
+
+ALP encoding consists of a page-level header followed by an offset array and one
+or more encoded vectors (batches of values). Each vector contains up to
+`vector_size` elements (default 1024).
+
+```
++-------------+-----------------------------+--------------------------------------+
+|   Header    |        Offset Array         |            Vector Data               |
+|  (7 bytes)  |   (num_vectors * 4 bytes)   |            (variable)                |
++-------------+------+------+-----+---------+----------+----------+-----+----------+
+| Page Header | off0 | off1 | ... | off N-1 | Vector 0 | Vector 1 | ... | Vec N-1  |
+|  (7 bytes)  | (4B) | (4B) |     |  (4B)   |(variable)|(variable)|     |(variable)|
++-------------+------+------+-----+---------+----------+----------+-----+----------+
+```
+
+The compression pipeline for each vector is:
+
+```
+                    Input: float/double array
+                              |
+                              v
+    +----------------------------------------------------------+
+    |  1. SAMPLING & PRESET GENERATION                         |
+    |     Sample vectors from column chunk                     |
+    |     Try all (exponent, factor) combinations              |
+    |     Select best k combinations for preset                |
+    +----------------------------------------------------------+
+                              |
+                              v
+    +----------------------------------------------------------+
+    |  2. DECIMAL ENCODING                                     |
+    |     encoded[i] = round(value[i] * 10^e * 10^(-f))       |
+    |     Detect exceptions where decode(encode(v)) != v       |
+    +----------------------------------------------------------+
+                              |
+                              v
+    +----------------------------------------------------------+
+    |  3. FRAME OF REFERENCE (FOR)                             |
+    |     min_val = min(encoded[])                             |
+    |     delta[i] = encoded[i] - min_val                      |
+    +----------------------------------------------------------+
+                              |
+                              v
+    +----------------------------------------------------------+
+    |  4. BIT PACKING                                          |
+    |     bit_width = ceil(log2(max_delta + 1))                |
+    |     Pack each delta into bit_width bits                  |
+    +----------------------------------------------------------+
+                              |
+                              v
+                   Output: Serialized vector bytes
+```
+
+#### Page Layout
+
+##### Header (7 bytes)
+
+All multi-byte values are little-endian.
+
+```
+ Byte:    0              1               2              3    4    5    6
+       +----------------+---------------+--------------+----+----+----+----+
+       | compression    | integer       | log_vector   |     num_elements  |
+       | _mode          | _encoding     | _size        |     (int32 LE)    |
+       +----------------+---------------+--------------+----+----+----+----+
+```
+
+| Offset | Field | Size | Type | Description |
+|--------|-------|------|------|-------------|
+| 0 | compression_mode | 1 byte | uint8 | Compression mode (must be 0 = ALP) |
+| 1 | integer_encoding | 1 byte | uint8 | Integer encoding (must be 0 = FOR + bit-packing) |
+| 2 | log_vector_size | 1 byte | uint8 | log2(vector\_size). Must be in \[3, 15\]. Default: 10 (vector size 1024) |
+| 3 | num_elements | 4 bytes | int32 | Total number of floating-point values in the page |
+
+The number of vectors is `ceil(num_elements / vector_size)`. The last vector may
+contain fewer than `vector_size` elements.
+
+**Note:** The number of elements per vector and the packed data size are NOT stored
+in the header. They are derived:
+* Elements per vector: `vector_size` for all vectors except the last, which may be smaller.
+* Packed data size: `ceil(num_elements_in_vector * bit_width / 8)`.
+
+##### Offset Array
+
+Immediately following the header is an array of `num_vectors` little-endian uint32
+values. Each offset gives the byte position of the corresponding vector's data,
+measured from the start of the offset array itself.
+
+The first offset equals `num_vectors * 4` (pointing just past the offset array).
+Each subsequent offset equals the previous offset plus the stored size of the
+previous vector.
+
+##### Vector Format
+
+Each vector is self-describing and contains the encoding parameters, FOR metadata,
+bit-packed encoded values, and exception data.
+
+```
++-------------------+-----------------+-------------------+---------------------+-------------------+
+|      AlpInfo      |     ForInfo     |   PackedValues    | ExceptionPositions  | ExceptionValues   |
+|     (4 bytes)     | (5B or 9B)      |    (variable)     |     (variable)      |    (variable)     |
++-------------------+-----------------+-------------------+---------------------+-------------------+
+```
+
+Vector header sizes:
+| Type   | AlpInfo | ForInfo | Total Header |
+|--------|---------|---------|--------------|
+| FLOAT  | 4 bytes | 5 bytes | 9 bytes      |
+| DOUBLE | 4 bytes | 9 bytes | 13 bytes     |
+
+Data section sizes:
+| Section             | Size Formula                | Description                  |
+|---------------------|-----------------------------|------------------------------|
+| PackedValues        | ceil(N * bit\_width / 8)    | Bit-packed delta values      |
+| ExceptionPositions  | num\_exceptions * 2 bytes   | uint16 indices of exceptions |
+| ExceptionValues     | num\_exceptions * sizeof(T) | Original float/double values |
+
+###### AlpInfo (4 bytes, both types)
+
+```
+ Byte:    0           1          2       3
+       +----------+----------+---------+---------+
+       | exponent |  factor  |  num_exceptions   |
+       |  (uint8) | (uint8)  |   (uint16 LE)     |
+       +----------+----------+---------+---------+
+```
+
+| Offset | Field | Size | Type | Description |
+|--------|-------|------|------|-------------|
+| 0 | exponent | 1 byte | uint8 | Power-of-10 exponent *e*. Range: \[0, 10\] for FLOAT, \[0, 18\] for DOUBLE. |
+| 1 | factor | 1 byte | uint8 | Power-of-10 factor *f*. Range: \[0, *e*\]. |
+| 2 | num_exceptions | 2 bytes | uint16 | Number of exception values in this vector. |
+
+###### ForInfo for FLOAT (5 bytes)
+
+```
+ Byte:    0    1    2    3       4
+       +----+----+----+----+-----------+
+       | frame_of_reference | bit_width |
+       |    (int32 LE)      |  (uint8)  |
+       +----+----+----+----+-----------+
+```
+
+| Offset | Field | Size | Type | Description |
+|--------|-------|------|------|-------------|
+| 0 | frame_of_reference | 4 bytes | int32 | Minimum encoded integer in the vector |
+| 4 | bit_width | 1 byte | uint8 | Bits per packed value. Range: \[0, 32\]. |
+
+###### ForInfo for DOUBLE (9 bytes)
+
+```
+ Byte:    0    1    2    3    4    5    6    7       8
+       +----+----+----+----+----+----+----+----+-----------+
+       |          frame_of_reference           | bit_width |
+       |              (int64 LE)               |  (uint8)  |
+       +----+----+----+----+----+----+----+----+-----------+
+```
+
+| Offset | Field | Size | Type | Description |
+|--------|-------|------|------|-------------|
+| 0 | frame_of_reference | 8 bytes | int64 | Minimum encoded long in the vector |
+| 8 | bit_width | 1 byte | uint8 | Bits per packed value. Range: \[0, 64\]. |
+
+###### PackedValues
+
+The FOR-encoded deltas, bit-packed into `ceil(num_elements_in_vector * bit_width / 8)` bytes.
+Values are packed from the least significant bit of each byte to the most significant bit,
+in groups of 8 values, using the same bit-packing order as the
+[RLE/Bit-Packing Hybrid](#RLE) encoding.
+
+If `bit_width` is 0, no bytes are stored (all deltas are zero, meaning all encoded
+integers are equal to `frame_of_reference`).
+
+###### ExceptionPositions
+
+An array of `num_exceptions` little-endian uint16 values, each giving
+the 0-based index within the vector of an exception value.
+
+###### ExceptionValues
+
+An array of `num_exceptions` values in the original floating-point type
+(4 bytes little-endian IEEE 754 for FLOAT, 8 bytes for DOUBLE), stored in
+the same order as the corresponding positions.
+
+#### Encoding
+
+##### Encoding Formula
+
+```
++-------------------------------------------------------------------+
+|                                                                   |
+|   encoded = round( value  *  10^e  *  10^(-f) )                  |
+|                                                                   |
+|   decoded = encoded  *  10^f  *  10^(-e)                          |
+|                                                                   |
++-------------------------------------------------------------------+
+```
+
+The encoding uses two separate multiplications (not a single multiplication by
+`10^(e-f)`, and not division) to ensure that implementations produce identical
+floating-point rounding across languages. The powers of 10 MUST be stored as
+precomputed floating-point constants (i.e., literal values like `1e-3f`), not
+computed at runtime.
+
+##### Fast Rounding
+
+The rounding function uses a "magic number" technique for branchless rounding:
+
+| Type   | Magic Number                      | Formula                          |
+|--------|-----------------------------------|----------------------------------|
+| FLOAT  | 2^22 + 2^23 = 12,582,912         | `(int)((value + magic) - magic)` |
+| DOUBLE | 2^51 + 2^52 = 6,755,399,441,055,744 | `(long)((value + magic) - magic)` |
+
+For negative values, the signs are reversed: `(int)((value - magic) + magic)`.
+
+##### Parameter Selection
+
+The encoder selects the (exponent, factor) pair that minimizes exceptions.
+Valid combinations satisfy 0 &le; factor &le; exponent:
+
+| Type   | Max Exponent | Total Combinations |
+|--------|--------------|--------------------|
+| FLOAT  | 10           | 66                 |
+| DOUBLE | 18           | 190                |
+
+To avoid the cost of exhaustive search on every vector, implementations
+SHOULD use sampling to select up to 5 candidate (exponent, factor)
+combinations (the "encoding preset") at the start of each column chunk.
+Each vector then searches only those 5 candidates.
+
+Sampling parameters:
+
+| Parameter            | Value | Description                         |
+|----------------------|-------|-------------------------------------|
+| Vector Size          | 1024  | Elements compressed as a unit       |
+| Sample Size          | 256   | Values sampled per vector           |
+| Max Combinations     | 5     | Best (e,f) pairs kept in preset     |
+| Sample Vectors       | 8     | Vectors sampled per row group       |
+
+##### Exception Detection
+
+A value becomes an exception if any of the following is true:
+
+| Condition          | Example                    | Reason                           |
+|--------------------|----------------------------|----------------------------------|
+| NaN                | `NaN`                      | Cannot convert to integer        |
+| Infinity           | `+Inf`, `-Inf`             | Cannot convert to integer        |
+| Negative zero      | `-0.0`                     | Would become `+0.0` after encoding |
+| Out of range       | value * 10^e > INT32\_MAX  | Exceeds target integer limits    |
+| Round-trip failure  | `0.333...` with e=1, f=0  | `decode(encode(v)) != v`         |
+
+Before FOR encoding, the encoded slot at each exception position is filled with
+a placeholder (the encoded integer of the first non-exception value, or 0 if
+all values are exceptions). This keeps the FOR range tight.
+
+##### Frame of Reference and Bit-Packing
+
+After decimal encoding and exception substitution:
+
+```
++---------------------------------------------------------------------+
+|  Encoded:   [ 123,  456,  789,   12 ]                               |
+|                                                                     |
+|  min_val = 12  (stored as frame_of_reference)                       |
+|                                                                     |
+|  Deltas:    [ 111,  444,  777,    0 ]   <-- all non-negative        |
++---------------------------------------------------------------------+
+```
+
+| Step                   | Formula                               | Example                     |
+|------------------------|---------------------------------------|-----------------------------|
+| 1. Find min            | min\_val = min(encoded\[\])           | 12                          |
+| 2. Compute deltas      | delta\[i\] = encoded\[i\] - min\_val | \[111, 444, 777, 0\]       |
+| 3. Calculate bit width | bit\_width = ceil(log2(max\_delta+1)) | ceil(log2(778)) = 10       |
+| 4. Pack values         | Each value uses bit\_width bits       | 4 * 10 = 40 bits = 5 bytes |
+
+Special case: If all values are identical, bit\_width = 0 and no packed data is stored.
+
+#### Decoding
+
+```
+                    Input: Serialized vector bytes
+                              |
+                              v
+    +----------------------------------------------------------+
+    |  1. BIT UNPACKING                                        |
+    |     Unpack the vector's values at bit_width bits each    |
+    +----------------------------------------------------------+
+                              |
+                              v
+    +----------------------------------------------------------+
+    |  2. REVERSE FOR                                          |
+    |     encoded[i] = delta[i] + frame_of_reference           |
+    +----------------------------------------------------------+
+                              |
+                              v
+    +----------------------------------------------------------+
+    |  3. DECIMAL DECODING                                     |
+    |     value[i] = encoded[i] * 10^factor * 10^(-exponent)   |
+    +----------------------------------------------------------+
+                              |
+                              v
+    +----------------------------------------------------------+
+    |  4. PATCH EXCEPTIONS                                     |
+    |     value[pos[j]] = exception_values[j]                  |
+    +----------------------------------------------------------+
+                              |
+                              v
+                  Output: Original float/double array
+```
+
+For each vector:
+
+1. Read AlpInfo and ForInfo from the vector header.
+2. Unpack `bit_width`-bit integers from PackedValues.
+3. Add `frame_of_reference` to each unpacked integer.
+4. Decode: multiply each integer by `10^factor` then by `10^(-exponent)`.
+5. Patch exceptions: for each (position, value) in the exception arrays,
+   overwrite the decoded output at that position with the stored value.
+
+#### Example 1: Simple Decimal Values
+
+**Input:** `float values[4] = { 1.23, 4.56, 7.89, 0.12 }`
+
+**Step 1: Find Best Exponent/Factor**
+
+Testing (exponent=2, factor=0) means multiply by 10^2 = 100:
+
+| Value | value * 100 | Rounded | Verify: rounded * 1.0 * 0.01 | Match? |
+|-------|-------------|---------|-------------------------------|--------|
+| 1.23  | 123.0       | 123     | 1.23                          | Yes    |
+| 4.56  | 456.0       | 456     | 4.56                          | Yes    |
+| 7.89  | 789.0       | 789     | 7.89                          | Yes    |
+| 0.12  | 12.0        | 12      | 0.12                          | Yes    |
+
+All values round-trip correctly -- no exceptions.
+
+**Step 2: Frame of Reference**
+
+| Encoded | min = 12 | Delta (encoded - min) |
+|---------|----------|-----------------------|
+| 123     | -        | 111                   |
+| 456     | -        | 444                   |
+| 789     | -        | 777                   |
+| 12      | -        | 0                     |
+
+**Step 3: Bit Packing**
+
+max\_delta = 777, bit\_width = ceil(log2(778)) = 10 bits,
+packed\_size = ceil(4 * 10 / 8) = 5 bytes
+
+**Serialized Vector:**
+
+| Section             | Content                                | Size     |
+|---------------------|----------------------------------------|----------|
+| AlpInfo             | e=2, f=0, num\_exceptions=0            | 4 bytes  |
+| ForInfo             | frame\_of\_reference=12, bit\_width=10 | 5 bytes  |
+| PackedValues        | \[111, 444, 777, 0\] at 10 bits each  | 5 bytes  |
+| ExceptionPositions  | (none)                                 | 0 bytes  |
+| ExceptionValues     | (none)                                 | 0 bytes  |
+| **Total**           |                                        | **14 bytes** |
+
+PLAIN encoding would take 4 * 4 = 16 bytes. With 1024 values, the 9-byte
+vector header becomes negligible and compression ratios of 2-8x are typical.
+
+#### Example 2: Values with Exceptions
+
+**Input:** `float values[4] = { 1.5, NaN, 2.5, 0.333... }`
+
+**Step 1: Decimal Encoding with (e=1, f=0)**
+
+Multiply by 10^1 = 10:
+
+| Index | Value    | value * 10 | Rounded | Verify         | Exception? |
+|-------|----------|------------|---------|----------------|------------|
+| 0     | 1.5      | 15.0       | 15      | 1.5 = 1.5      | No         |
+| 1     | NaN      | -          | -       | -              | Yes (NaN)  |
+| 2     | 2.5      | 25.0       | 25      | 2.5 = 2.5      | No         |
+| 3     | 0.333... | 3.333...   | 3       | 0.3 != 0.333...| Yes (round-trip) |
+
+**Step 2: Handle Exceptions**
+
+Exception positions: \[1, 3\]
+Exception values: \[NaN, 0.333...\]
+Placeholder: 15 (first non-exception encoded value)
+Encoded with placeholders: \[15, 15, 25, 15\]
+
+**Step 3: Frame of Reference**
+
+| Encoded          | min = 15 | Delta |
+|------------------|----------|-------|
+| 15               | -        | 0     |
+| 15 (placeholder) | -        | 0     |
+| 25               | -        | 10    |
+| 15 (placeholder) | -        | 0     |
+
+**Step 4: Bit Packing**
+
+max\_delta = 10, bit\_width = ceil(log2(11)) = 4 bits,
+packed\_size = ceil(4 * 4 / 8) = 2 bytes
+
+**Serialized Vector:**
+
+| Section             | Content                                | Size     |
+|---------------------|----------------------------------------|----------|
+| AlpInfo             | e=1, f=0, num\_exceptions=2            | 4 bytes  |
+| ForInfo             | frame\_of\_reference=15, bit\_width=4  | 5 bytes  |
+| PackedValues        | \[0, 0, 10, 0\] at 4 bits each        | 2 bytes  |
+| ExceptionPositions  | \[1, 3\]                               | 4 bytes  |
+| ExceptionValues     | \[NaN, 0.333...\]                      | 8 bytes  |
+| **Total**           |                                        | **23 bytes** |
+
+#### Example 3: Monetary Data (1024 values)
+
+1024 price values ranging from $0.01 to $999.99 (e.g., product prices).
+
+Optimal encoding: (exponent=2, factor=0)
+
+| Metric        | Value       | Calculation                          |
+|---------------|-------------|--------------------------------------|
+| Exponent      | 2           | Multiply by 100 for 2 decimal places |
+| Factor        | 0           | No additional scaling needed         |
+| Encoded range | 1 to 99,999 | $0.01 -> 1, $999.99 -> 99999         |
+| FOR min       | 1           | Assuming $0.01 is present            |
+| Delta range   | 0 to 99,998 | After FOR subtraction                |
+| Bit width     | 17          | ceil(log2(99999)) = 17 bits          |
+| Packed size   | 2,176 bytes | ceil(1024 * 17 / 8)                  |
+
+**Size Comparison:**
+
+| Encoding      | Size         | Ratio               |
+|---------------|--------------|----------------------|
+| PLAIN (float) | 4,096 bytes  | 1.0x                 |
+| ALP           | ~2,185 bytes | 0.53x (47% smaller)  |
+
+#### Characteristics
+
+| Property       | Description                                                                            |
+|----------------|----------------------------------------------------------------------------------------|
+| Lossless       | All original floating-point values are perfectly recoverable, including NaN, Inf, -0.0 |
+| Adaptive       | Exponent/factor selection adapts per vector based on data characteristics               |
+| Vectorized     | Fixed-size vectors enable SIMD-optimized bit packing/unpacking                         |
+| Exception-safe | Values that don't fit decimal model are stored separately                              |
+
+**Best use cases:**
+
+* Monetary/financial data (prices, transactions)
+* Sensor readings with fixed precision
+* Scientific measurements with limited decimal places
+* GPS coordinates and geographic data
+* Normalized scores and percentages
+
+**Worst case scenarios:**
+
+* Random floating-point values (high exception rate)
+* High-precision scientific data (many decimal places)
+* Data with many special values (NaN, Inf)
+* Very small datasets (header overhead dominates)
+
+**Comparison with other encodings:**
+
+| Encoding            | Type Support | Compression | Best For            |
+|---------------------|--------------|-------------|---------------------|
+| PLAIN               | All          | None        | General purpose     |
+| BYTE\_STREAM\_SPLIT | Float/Double | Moderate    | Random floats       |
+| ALP                 | Float/Double | High        | Decimal-like floats |
+| DELTA\_BINARY\_PACKED | Int32/Int64 | High       | Sequential integers |
+
+Unlike [Byte Stream Split](#BYTESTREAMSPLIT), ALP does not require a subsequent
+compression step to achieve size reduction -- the bit-packing directly reduces the
+encoded size. However, ALP and Byte Stream Split can be complementary: ALP
+exploits decimal structure while Byte Stream Split exploits byte-level correlation.
+
+#### Size Calculations
+
+##### Vector Size Formula
+
+```
+vector_bytes = vector_header_size                            // FLOAT: 9, DOUBLE: 13
+             + ceil(num_elements_in_vector * bit_width / 8)  // packed values
+             + num_exceptions * 2                            // exception positions (uint16)
+             + num_exceptions * sizeof(T)                    // exception values (4 or 8)
+```
+
+##### Page Size Formula
+
+```
+page_bytes = 7                                   // page header
+           + num_vectors * 4                     // offset array
+           + sum(vector_bytes for each vector)   // all vectors
+```
+
+#### Constants Reference
+
+| Constant          | Value   | Description                             |
+|-------------------|---------|-----------------------------------------|
+| Vector size       | 1024    | Default elements per compressed vector  |
+| Max combinations  | 5       | Max (e,f) pairs in preset               |
+| Samples per vector| 256     | Values sampled per vector               |
+| Sample vectors    | 8       | Vectors sampled per row group           |
+| FLOAT max exponent| 10      | 10^10 ~ 10 billion                      |
+| DOUBLE max exponent| 18     | 10^18 ~ 1 quintillion                   |

From 69aaf624a9edfe56134cbd86269aa5b9d98a226d Mon Sep 17 00:00:00 2001
From: Prateek Gaur <prateek.gaur@snowflake.com>
Date: Wed, 29 Apr 2026 20:01:08 +0000
Subject: [PATCH 2/7] Address review feedback on ALP encoding specification

Incorporate review comments from emkornfield and alamb on PR #557:
---
 Encodings.md | 56 ++++++++++++++++++++++++----------------------------
 1 file changed, 26 insertions(+), 30 deletions(-)
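
Note for readers of this patch: the `fast_round` step it names, together with the
exception rules from the spec, can be sketched for DOUBLE values as follows.
This is a minimal sketch, not the reference implementation. The names
`fast_round`, `try_encode`, and `decode`, the power-of-10 tables, and the
`2^51` range guard are assumptions made for illustration. The guard reflects
the range in which the magic-number trick is valid, which is narrower than the
int64 limit mentioned in the spec.

```python
import math

MAGIC = float(2**51 + 2**52)  # DOUBLE magic number from the spec
# Assumed tables: the encoder and decoder share the same power-of-10 values.
POW10 = [float(f"1e{i}") for i in range(19)]
NEG_POW10 = [float(f"1e-{i}") for i in range(19)]

def fast_round(x):
    """Round to the nearest integer via the magic-number trick."""
    if x >= 0:
        return int((x + MAGIC) - MAGIC)
    return int((x - MAGIC) + MAGIC)  # signs reversed for negative values

def decode(encoded, e, f):
    """decoded = encoded * 10^f * 10^(-e), as two separate multiplications."""
    return encoded * POW10[f] * NEG_POW10[e]

def try_encode(value, e, f):
    """Return the encoded integer, or None if `value` must be an exception."""
    if math.isnan(value) or math.isinf(value):
        return None  # cannot convert to integer
    if value == 0.0 and math.copysign(1.0, value) < 0:
        return None  # -0.0 would decode as +0.0
    scaled = value * POW10[e] * NEG_POW10[f]
    if abs(scaled) >= 2.0**51:
        return None  # assumed guard: outside the range fast_round handles
    encoded = fast_round(scaled)
    if decode(encoded, e, f) != value:
        return None  # round-trip failure
    return encoded

print([try_encode(v, 1, 0) for v in (1.5, float("nan"), 2.5, 1.0 / 3.0)])
# [15, None, 25, None]
```

The printed list mirrors Example 2 in the spec: indices 1 and 3 become
exceptions, while 1.5 and 2.5 encode to 15 and 25.
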

diff --git a/Encodings.md b/Encodings.md
index 6f0758e6..5db97777 100644
--- a/Encodings.md
+++ b/Encodings.md
@@ -38,6 +38,7 @@ For details on current implementation status, see the [Implementation Status](ht
 | [Delta-length byte array](#DELTALENGTH)          | DELTA_LENGTH_BYTE_ARRAY = 6                               | BYTE_ARRAY                                        |
 | [Delta Strings](#DELTASTRING)                    | DELTA_BYTE_ARRAY = 7                                      | BYTE_ARRAY, FIXED_LEN_BYTE_ARRAY                  |
 | [Byte Stream Split](#BYTESTREAMSPLIT)            | BYTE_STREAM_SPLIT = 9                                     | INT32, INT64, FLOAT, DOUBLE, FIXED_LEN_BYTE_ARRAY |
+| [ALP](#ALP)                                      | ALP = 10                                                  | FLOAT, DOUBLE                                     |
 
 ### Deprecated Encodings
 
@@ -45,7 +46,6 @@ For details on current implementation status, see the [Implementation Status](ht
 | ------------------------------------- | -------------- |
 | [Bit-packed (Deprecated)](#BITPACKED) | BIT_PACKED = 4 |
 
-
 <a name="PLAIN"></a>
 ### Plain: (PLAIN = 0)
 
@@ -401,11 +401,13 @@ This encoding is adapted from the paper
 ["ALP: Adaptive Lossless floating-Point Compression"](https://dl.acm.org/doi/10.1145/3626717)
 by Afroozeh and Boncz (SIGMOD 2024).
 
-ALP works by converting floating-point values to integers using decimal scaling,
-then applying Frame of Reference (FOR) encoding and bit-packing. Values that
-cannot be losslessly converted are stored as exceptions. The encoding achieves
-high compression for decimal-like floating-point data (e.g., monetary values,
-sensor readings) while remaining fully lossless.
+ALP works by converting floating-point values to integers using decimal scaling
+(controlled by an *exponent* `e` and *factor* `f`), then applying Frame of
+Reference (FOR) encoding and bit-packing. Values that cannot be losslessly
+converted are stored separately as *exceptions*. The encoding achieves high
+compression for decimal-like floating-point data (e.g., monetary values, sensor
+readings) while remaining fully lossless. Each vector is encoded independently,
+enabling random access to individual vectors and parallel encode/decode.
 
 #### Overview
 
@@ -430,16 +432,14 @@ The compression pipeline for each vector is:
                               |
                               v
     +----------------------------------------------------------+
-    |  1. SAMPLING & PRESET GENERATION                         |
-    |     Sample vectors from column chunk                     |
-    |     Try all (exponent, factor) combinations              |
-    |     Select best k combinations for preset                |
+    |  1. CHOOSE PARAMETERS                                    |
+    |     Select (exponent, factor) pair for this vector       |
     +----------------------------------------------------------+
                               |
                               v
     +----------------------------------------------------------+
     |  2. DECIMAL ENCODING                                     |
-    |     encoded[i] = round(value[i] * 10^e * 10^(-f))       |
+    |     encoded[i] = fast_round(value[i] * 10^e * 10^(-f))   |
     |     Detect exceptions where decode(encode(v)) != v       |
     +----------------------------------------------------------+
                               |
@@ -465,7 +465,7 @@ The compression pipeline for each vector is:
 
 ##### Header (7 bytes)
 
-All multi-byte values are little-endian.
+All multi-byte values are stored in little-endian order.
 
 ```
  Byte:    0              1               2              3    4    5    6
@@ -477,18 +477,16 @@ All multi-byte values are little-endian.
 
 | Offset | Field | Size | Type | Description |
 |--------|-------|------|------|-------------|
-| 0 | compression_mode | 1 byte | uint8 | Compression mode (must be 0 = ALP) |
+| 0 | compression_mode | 1 byte | uint8 | Compression mode (0 = ALP). Reserved for future variants (e.g., ALP-RD). |
 | 1 | integer_encoding | 1 byte | uint8 | Integer encoding (must be 0 = FOR + bit-packing) |
-| 2 | log_vector_size | 1 byte | uint8 | log2(vector\_size). Must be in \[3, 15\]. Default: 10 (vector size 1024) |
+| 2 | log_vector_size | 1 byte | uint8 | log2(vector\_size). Must be in the inclusive range \[3, 15\]. Recommended default: 10 (vector size 1024) |
 | 3 | num_elements | 4 bytes | int32 | Total number of floating-point values in the page |
 
 The number of vectors is `ceil(num_elements / vector_size)`. The last vector may
 contain fewer than `vector_size` elements.
 
-**Note:** The number of elements per vector and the packed data size are NOT stored
-in the header. They are derived:
-* Elements per vector: `vector_size` for all vectors except the last, which may be smaller.
-* Packed data size: `ceil(num_elements_in_vector * bit_width / 8)`.
+**Note:** The number of elements per vector is NOT stored in the header — it is
+derived: `vector_size` for all vectors except the last, which may be smaller.
 
 ##### Offset Array
 
@@ -496,7 +494,7 @@ Immediately following the header is an array of `num_vectors` little-endian uint
 values. Each offset gives the byte position of the corresponding vector's data,
 measured from the start of the offset array itself.
 
-The first offset equals `num_vectors * 4` (pointing just past the offset array).
+The first offset always equals `num_vectors * 4` (pointing just past the offset array).
 Each subsequent offset equals the previous offset plus the stored size of the
 previous vector.
 
@@ -521,9 +519,9 @@ Vector header sizes:
 Data section sizes:
 | Section             | Size Formula                | Description                  |
 |---------------------|-----------------------------|------------------------------|
-| PackedValues        | ceil(N * bit\_width / 8)    | Bit-packed delta values      |
+| PackedValues        | ceil(num\_elements\_in\_vector * bit\_width / 8) | Bit-packed delta values      |
 | ExceptionPositions  | num\_exceptions * 2 bytes   | uint16 indices of exceptions |
-| ExceptionValues     | num\_exceptions * sizeof(T) | Original float/double values |
+| ExceptionValues     | num\_exceptions * sizeof(type) (4 for FLOAT, 8 for DOUBLE) | Original float/double values |
 
 ###### AlpInfo (4 bytes, both types)
 
@@ -574,8 +572,7 @@ Data section sizes:
 ###### PackedValues
 
 The FOR-encoded deltas, bit-packed into `ceil(num_elements_in_vector * bit_width / 8)` bytes.
-Values are packed from the least significant bit of each byte to the most significant bit,
-in groups of 8 values, using the same bit-packing order as the
+Values are bit-packed using the same LSB-first packing order as the
 [RLE/Bit-Packing Hybrid](#RLE) encoding.
 
 If `bit_width` is 0, no bytes are stored (all deltas are zero, meaning all encoded
@@ -599,7 +596,7 @@ the same order as the corresponding positions.
 ```
 +-------------------------------------------------------------------+
 |                                                                   |
-|   encoded = round( value  *  10^e  *  10^(-f) )                  |
+|   encoded = fast_round( value  *  10^e  *  10^(-f) )             |
 |                                                                   |
 |   decoded = encoded  *  10^f  *  10^(-e)                          |
 |                                                                   |
@@ -608,9 +605,8 @@ the same order as the corresponding positions.
 
 The encoding uses two separate multiplications (not a single multiplication by
 `10^(e-f)`, and not division) to ensure that implementations produce identical
-floating-point rounding across languages. The powers of 10 MUST be stored as
-precomputed floating-point constants (i.e., literal values like `1e-3f`), not
-computed at runtime.
+floating-point rounding across languages. Implementations must ensure that the
+encoder and decoder use identical power-of-10 values for a given exponent.
 
 ##### Fast Rounding
 
@@ -618,10 +614,10 @@ The rounding function uses a "magic number" technique for branchless rounding:
 
 | Type   | Magic Number                      | Formula                          |
 |--------|-----------------------------------|----------------------------------|
-| FLOAT  | 2^22 + 2^23 = 12,582,912         | `(int)((value + magic) - magic)` |
-| DOUBLE | 2^51 + 2^52 = 6,755,399,441,055,744 | `(long)((value + magic) - magic)` |
+| FLOAT  | 2^22 + 2^23 = 12,582,912         | `(int32_t)((value + magic) - magic)` |
+| DOUBLE | 2^51 + 2^52 = 6,755,399,441,055,744 | `(int64_t)((value + magic) - magic)` |
 
-For negative values, the signs are reversed: `(int)((value - magic) + magic)`.
+For negative values, the signs are reversed: `(int32_t)((value - magic) + magic)` for FLOAT, `(int64_t)((value - magic) + magic)` for DOUBLE.
 
 ##### Parameter Selection
 

From 095a0e50731c29fa62257b85d1240a65d7731f9a Mon Sep 17 00:00:00 2001
From: Prateek Gaur <prateek.gaur@snowflake.com>
Date: Thu, 7 May 2026 03:32:18 +0000
Subject: [PATCH 3/7] Address remaining review feedback on ALP spec

- Clarify no padding between vectors in offset array description
- Use 'sizeof(encoded type) (float=4 and double=8)' per reviewer suggestion
---
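Note (not part of the commit): a minimal Python sketch of the layout rules touched here, assuming the stored size of every vector is already known. It shows how the per-vector element counts and the offset array could be derived. The helper names `vector_lengths` and `build_offsets` are hypothetical and are not taken from the Arrow or parquet-java implementations.

```python
def vector_lengths(num_elements, log_vector_size):
    """Element count of each vector; only the last one may be short."""
    vector_size = 1 << log_vector_size
    num_vectors = (num_elements + vector_size - 1) // vector_size  # ceil division
    lengths = [vector_size] * num_vectors
    if num_vectors and num_elements % vector_size:
        lengths[-1] = num_elements % vector_size
    return lengths


def build_offsets(stored_sizes):
    """Byte offsets measured from the start of the offset array.

    stored_sizes[i] is the serialized size in bytes of vector i.
    """
    position = 4 * len(stored_sizes)  # first vector starts just past the uint32 offsets
    offsets = []
    for size in stored_sizes:
        offsets.append(position)
        position += size  # vectors are contiguous: no padding in between
    return offsets
```

For example, three vectors with stored sizes of 31, 14 and 20 bytes give offsets `[12, 43, 57]`, and 2500 values with `log_vector_size = 10` split into vectors of 1024, 1024 and 452 elements.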
 Encodings.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Encodings.md b/Encodings.md
index 5db97777..b7cea34a 100644
--- a/Encodings.md
+++ b/Encodings.md
@@ -496,7 +496,7 @@ measured from the start of the offset array itself.
 
 The first offset always equals `num_vectors * 4` (pointing just past the offset array).
 Each subsequent offset equals the previous offset plus the stored size of the
-previous vector.
+previous vector. No padding is inserted between vectors.
 
 ##### Vector Format
 
@@ -521,7 +521,7 @@ Data section sizes:
 |---------------------|-----------------------------|------------------------------|
 | PackedValues        | ceil(num\_elements\_in\_vector * bit\_width / 8) | Bit-packed delta values      |
 | ExceptionPositions  | num\_exceptions * 2 bytes   | uint16 indices of exceptions |
-| ExceptionValues     | num\_exceptions * sizeof(type) (4 for FLOAT, 8 for DOUBLE) | Original float/double values |
+| ExceptionValues     | num\_exceptions * sizeof(encoded type) (float=4 and double=8) | Original float/double values |
 
 ###### AlpInfo (4 bytes, both types)
 

From ccb6674758c19dd7cc7c36a5ed5ba3e234013890 Mon Sep 17 00:00:00 2001
From: Prateek Gaur <prateek.gaur@snowflake.com>
Date: Thu, 14 May 2026 00:28:24 +0000
Subject: [PATCH 4/7] Address alamb's second review: trim spec, fix wording,
 rework example
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Remove Characteristics, Size Calculations, Constants Reference sections
- Consolidate three examples into one worked example with f!=0 and exceptions
- Remove incorrect sign-reversal claim for fast_round on negative values
- Soften sampling recommendation from SHOULD to suggestion
- Fix "individual vectors" → "individual values" for random access
- Clarify power-of-10 interop as MUST requirement
- Use consistent fast_round terminology throughout
---
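Note (not part of the commit): the consolidated worked example can be checked with a short Python sketch. It assumes Python floats are IEEE 754 doubles, uses only the non-negative form of the magic-number rounding (all finite inputs here are positive), and skips the negative-zero and out-of-range checks because the example does not need them. The helper names `encode` and `decode` are illustrative only.

```python
import math

MAGIC = float(2**51 + 2**52)  # DOUBLE magic number from the Fast Rounding table
E, F = 4, 3                   # exponent and factor used in the worked example
POW10 = {3: 1e3, 4: 1e4}      # literal constants, only the ones needed here
INV_POW10 = {3: 1e-3, 4: 1e-4}


def encode(value):
    # Two separate multiplications, then magic-number rounding (non-negative input only).
    scaled = value * POW10[E] * INV_POW10[F]
    return int((scaled + MAGIC) - MAGIC)


def decode(encoded):
    return encoded * POW10[F] * INV_POW10[E]


values = [1500.0, math.nan, 2500.0, 333.3]
encoded, exception_positions = [], []
for i, v in enumerate(values):
    if math.isnan(v) or math.isinf(v):
        exception_positions.append(i)
        encoded.append(None)
        continue
    n = encode(v)
    if decode(n) != v:  # round-trip failure
        exception_positions.append(i)
        encoded.append(None)
    else:
        encoded.append(n)

# Replace exceptions with the first non-exception encoded value (0 if there is none).
placeholder = next((n for n in encoded if n is not None), 0)
encoded = [placeholder if n is None else n for n in encoded]

frame_of_reference = min(encoded)
deltas = [n - frame_of_reference for n in encoded]
bit_width = max(deltas).bit_length()
packed_size = (len(values) * bit_width + 7) // 8
# AlpInfo (4) + ForInfo for DOUBLE (9) + packed values + (uint16 position + double value) per exception
total = 4 + 9 + packed_size + len(exception_positions) * (2 + 8)
print(encoded, deltas, bit_width, packed_size, total)
```

The values it computes should match the tables in the example: one exception at position 1, a frame of reference of 3333, a bit width of 15, and 31 bytes in total.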
 Encodings.md | 223 +++++++++++----------------------------------------
 1 file changed, 46 insertions(+), 177 deletions(-)

diff --git a/Encodings.md b/Encodings.md
index b7cea34a..ddb8e702 100644
--- a/Encodings.md
+++ b/Encodings.md
@@ -407,7 +407,7 @@ Reference (FOR) encoding and bit-packing. Values that cannot be losslessly
 converted are stored separately as *exceptions*. The encoding achieves high
 compression for decimal-like floating-point data (e.g., monetary values, sensor
 readings) while remaining fully lossless. Each value is encoded independently,
-enabling random access to individual vectors and parallel encode/decode.
+enabling random access to individual values and parallel encode/decode.
 
 #### Overview
 
@@ -605,20 +605,20 @@ the same order as the corresponding positions.
 
 The encoding uses two separate multiplications (not a single multiplication by
 `10^(e-f)`, and not division) to ensure that implementations produce identical
-floating-point rounding across languages. Implementations must ensure that the
-encoder and decoder use identical power-of-10 values for a given exponent.
+floating-point results. All implementations MUST use the exact same floating-point
+arithmetic and power-of-10 constants to guarantee cross-language interoperability.
 
 ##### Fast Rounding
 
-The rounding function uses a "magic number" technique for branchless rounding:
+The `fast_round` function rounds to the nearest integer using a "magic number" technique.
+
+`fast_round(value)` is defined as follows:
 
 | Type   | Magic Number                      | Formula                          |
 |--------|-----------------------------------|----------------------------------|
 | FLOAT  | 2^22 + 2^23 = 12,582,912         | `(int32_t)((value + magic) - magic)` |
 | DOUBLE | 2^51 + 2^52 = 6,755,399,441,055,744 | `(int64_t)((value + magic) - magic)` |
 
-For negative values, the signs are reversed: `(int32_t)((value - magic) + magic)` for FLOAT, `(int64_t)((value - magic) + magic)` for DOUBLE.
-
 ##### Parameter Selection
 
 The encoder selects the (exponent, factor) pair that minimizes exceptions.
@@ -630,15 +630,15 @@ Valid combinations satisfy 0 &le; factor &le; exponent:
 | DOUBLE | 18           | 190                |
 
 To avoid the cost of exhaustive search on every vector, implementations
-SHOULD use sampling to select up to 5 candidate (exponent, factor)
-combinations (the "encoding preset") at the start of each column chunk.
-Each vector then searches only those 5 candidates.
+can use a sampling approach. One such approach, described in the paper, is to
+select up to 5 candidate (exponent, factor) combinations (the "encoding preset")
+at the start of each column chunk, and when encoding each vector,
+test each of the 5 candidates for the fewest exceptions.
 
-Sampling parameters:
+Suggested sampling parameters (from the paper):
 
 | Parameter            | Value | Description                         |
 |----------------------|-------|-------------------------------------|
-| Vector Size          | 1024  | Elements compressed as a unit       |
 | Sample Size          | 256   | Values sampled per vector           |
 | Max Combinations     | 5     | Best (e,f) pairs kept in preset     |
 | Sample Vectors       | 8     | Vectors sampled per row group       |
@@ -659,9 +659,9 @@ Exception values at positions in the vector are replaced with a placeholder
 (the encoded integer of the first non-exception value, or 0 if all values
 are exceptions) before FOR encoding. This keeps the FOR range tight.
 
-##### Frame of Reference and Bit-Packing
+##### Example: Frame of Reference and Bit-Packing
 
-After decimal encoding and exception substitution:
+Given the following data after decimal encoding and exception substitution:
 
 ```
 +---------------------------------------------------------------------+
@@ -724,185 +724,54 @@ For each vector:
 5. Patch exceptions: for each (position, value) in the exception arrays,
    overwrite the decoded output at that position with the stored value.
 
-#### Example 1: Simple Decimal Values
-
-**Input:** `float values[4] = { 1.23, 4.56, 7.89, 0.12 }`
-
-**Step 1: Find Best Exponent/Factor**
-
-Testing (exponent=2, factor=0) means multiply by 10^2 = 100:
-
-| Value | value * 100 | Rounded | Verify: rounded * 1.0 * 0.01 | Match? |
-|-------|-------------|---------|-------------------------------|--------|
-| 1.23  | 123.0       | 123     | 1.23                          | Yes    |
-| 4.56  | 456.0       | 456     | 4.56                          | Yes    |
-| 7.89  | 789.0       | 789     | 7.89                          | Yes    |
-| 0.12  | 12.0        | 12      | 0.12                          | Yes    |
-
-All values round-trip correctly -- no exceptions.
-
-**Step 2: Frame of Reference**
-
-| Encoded | min = 12 | Delta (encoded - min) |
-|---------|----------|-----------------------|
-| 123     | -        | 111                   |
-| 456     | -        | 444                   |
-| 789     | -        | 777                   |
-| 12      | -        | 0                     |
-
-**Step 3: Bit Packing**
-
-max\_delta = 777, bit\_width = ceil(log2(778)) = 10 bits,
-packed\_size = ceil(4 * 10 / 8) = 5 bytes
+#### Worked Example: Exceptions and Non-Zero Factor
 
-**Serialized Vector:**
-
-| Section             | Content                                | Size     |
-|---------------------|----------------------------------------|----------|
-| AlpInfo             | e=2, f=0, num\_exceptions=0            | 4 bytes  |
-| ForInfo             | frame\_of\_reference=12, bit\_width=10 | 5 bytes  |
-| PackedValues        | \[111, 444, 777, 0\] at 10 bits each  | 5 bytes  |
-| ExceptionPositions  | (none)                                 | 0 bytes  |
-| ExceptionValues     | (none)                                 | 0 bytes  |
-| **Total**           |                                        | **14 bytes** |
-
-Compared to PLAIN encoding (4 * 4 = 16 bytes). With 1024 values, the 9-byte
-vector header becomes negligible and compression ratios of 2-8x are typical.
-
-#### Example 2: Values with Exceptions
-
-**Input:** `float values[4] = { 1.5, NaN, 2.5, 0.333... }`
+**Input:** `double values[4] = { 1500.0, NaN, 2500.0, 333.3 }`
 
-**Step 1: Decimal Encoding with (e=1, f=0)**
+Encoding selected for this example: (exponent=4, factor=3). This means:
+`encoded = fast_round(value * 10^4 * 10^(-3)) = fast_round(value * 10)`
 
-Multiply by 10^1 = 10:
+**Step 1: Decimal Encoding**
 
-| Index | Value    | value * 10 | Rounded | Verify         | Exception? |
-|-------|----------|------------|---------|----------------|------------|
-| 0     | 1.5      | 15.0       | 15      | 1.5 = 1.5      | No         |
-| 1     | NaN      | -          | -       | -              | Yes (NaN)  |
-| 2     | 2.5      | 25.0       | 25      | 2.5 = 2.5      | No         |
-| 3     | 0.333... | 3.333...   | 3       | 0.3 != 0.333...| Yes (round-trip) |
+| Index | Value   | value * 10^4 * 10^(-3) | Rounded | Decoded: rounded * 10^3 * 10^(-4) | Exception? |
+|-------|---------|------------------------|---------|------------------------------------|------------|
+| 0     | 1500.0  | 15000.0                | 15000   | 1500.0                             | No         |
+| 1     | NaN     | -                      | -       | -                                  | Yes (NaN)  |
+| 2     | 2500.0  | 25000.0                | 25000   | 2500.0                             | No         |
+| 3     | 333.3   | 3333.0                 | 3333    | 333.3                              | No         |
 
 **Step 2: Handle Exceptions**
 
-Exception positions: \[1, 3\]
-Exception values: \[NaN, 0.333...\]
-Placeholder: 15 (first non-exception encoded value)
-Encoded with placeholders: \[15, 15, 25, 15\]
+Exception positions: \[1\]
+Exception values: \[NaN\]
+Placeholder: 15000 (first non-exception encoded value)
+Encoded with placeholders: \[15000, 15000, 25000, 3333\]
 
 **Step 3: Frame of Reference**
 
-| Encoded          | min = 15 | Delta |
-|------------------|----------|-------|
-| 15               | -        | 0     |
-| 15 (placeholder) | -        | 0     |
-| 25               | -        | 10    |
-| 15 (placeholder) | -        | 0     |
+| Encoded            | min = 3333 | Delta |
+|--------------------|------------|-------|
+| 15000              | -          | 11667 |
+| 15000 (placeholder)| -          | 11667 |
+| 25000              | -          | 21667 |
+| 3333               | -          | 0     |
 
 **Step 4: Bit Packing**
 
-max\_delta = 10, bit\_width = ceil(log2(11)) = 4 bits,
-packed\_size = ceil(4 * 4 / 8) = 2 bytes
+max\_delta = 21667, bit\_width = ceil(log2(21668)) = 15 bits,
+packed\_size = ceil(4 * 15 / 8) = 8 bytes
 
 **Serialized Vector:**
 
-| Section             | Content                                | Size     |
-|---------------------|----------------------------------------|----------|
-| AlpInfo             | e=1, f=0, num\_exceptions=2            | 4 bytes  |
-| ForInfo             | frame\_of\_reference=15, bit\_width=4  | 5 bytes  |
-| PackedValues        | \[0, 0, 10, 0\] at 4 bits each        | 2 bytes  |
-| ExceptionPositions  | \[1, 3\]                               | 4 bytes  |
-| ExceptionValues     | \[NaN, 0.333...\]                      | 8 bytes  |
-| **Total**           |                                        | **23 bytes** |
-
-#### Example 3: Monetary Data (1024 values)
-
-1024 price values ranging from $0.01 to $999.99 (e.g., product prices).
-
-Optimal encoding: (exponent=2, factor=0)
-
-| Metric        | Value       | Calculation                          |
-|---------------|-------------|--------------------------------------|
-| Exponent      | 2           | Multiply by 100 for 2 decimal places |
-| Factor        | 0           | No additional scaling needed         |
-| Encoded range | 1 to 99,999 | $0.01 -> 1, $999.99 -> 99999         |
-| FOR min       | 1           | Assuming $0.01 is present            |
-| Delta range   | 0 to 99,998 | After FOR subtraction                |
-| Bit width     | 17          | ceil(log2(99999)) = 17 bits          |
-| Packed size   | 2,176 bytes | ceil(1024 * 17 / 8)                  |
-
-**Size Comparison:**
+| Section             | Content                                          | Size     |
+|---------------------|--------------------------------------------------|----------|
+| AlpInfo             | e=4, f=3, num\_exceptions=1                      | 4 bytes  |
+| ForInfo             | frame\_of\_reference=3333, bit\_width=15          | 9 bytes  |
+| PackedValues        | \[11667, 11667, 21667, 0\] at 15 bits each       | 8 bytes  |
+| ExceptionPositions  | \[1\]                                             | 2 bytes  |
+| ExceptionValues     | \[NaN\]                                           | 8 bytes  |
+| **Total**           |                                                   | **31 bytes** |
 
-| Encoding      | Size         | Ratio               |
-|---------------|--------------|----------------------|
-| PLAIN (float) | 4,096 bytes  | 1.0x                 |
-| ALP           | ~2,185 bytes | 0.53x (47% smaller)  |
-
-#### Characteristics
-
-| Property       | Description                                                                            |
-|----------------|----------------------------------------------------------------------------------------|
-| Lossless       | All original floating-point values are perfectly recoverable, including NaN, Inf, -0.0 |
-| Adaptive       | Exponent/factor selection adapts per vector based on data characteristics               |
-| Vectorized     | Fixed-size vectors enable SIMD-optimized bit packing/unpacking                         |
-| Exception-safe | Values that don't fit decimal model are stored separately                              |
-
-**Best use cases:**
-
-* Monetary/financial data (prices, transactions)
-* Sensor readings with fixed precision
-* Scientific measurements with limited decimal places
-* GPS coordinates and geographic data
-* Normalized scores and percentages
-
-**Worst case scenarios:**
-
-* Random floating-point values (high exception rate)
-* High-precision scientific data (many decimal places)
-* Data with many special values (NaN, Inf)
-* Very small datasets (header overhead dominates)
-
-**Comparison with other encodings:**
-
-| Encoding            | Type Support | Compression | Best For            |
-|---------------------|--------------|-------------|---------------------|
-| PLAIN               | All          | None        | General purpose     |
-| BYTE\_STREAM\_SPLIT | Float/Double | Moderate    | Random floats       |
-| ALP                 | Float/Double | High        | Decimal-like floats |
-| DELTA\_BINARY\_PACKED | Int32/Int64 | High       | Sequential integers |
-
-Unlike [Byte Stream Split](#BYTE_STREAM_SPLIT), ALP does not require a subsequent
-compression step to achieve size reduction -- the bit-packing directly reduces the
-encoded size. However, ALP and Byte Stream Split can be complementary: ALP
-exploits decimal structure while Byte Stream Split exploits byte-level correlation.
-
-#### Size Calculations
-
-##### Vector Size Formula
-
-```
-vector_bytes = vector_header_size                  // FLOAT: 9, DOUBLE: 13
-             + ceil(num_elements * bit_width / 8)  // packed values
-             + num_exceptions * 2                  // exception positions (uint16)
-             + num_exceptions * sizeof(T)          // exception values (4 or 8)
-```
-
-##### Page Size Formula
-
-```
-page_bytes = 7                                   // page header
-           + num_vectors * 4                     // offset array
-           + sum(vector_bytes for each vector)   // all vectors
-```
-
-#### Constants Reference
+For comparison, PLAIN encoding needs 4 * 8 = 32 bytes for these values. With 1024
+values, the 13-byte vector header becomes negligible and compression ratios of 2-8x are typical.
 
-| Constant          | Value   | Description                             |
-|-------------------|---------|-----------------------------------------|
-| Vector size       | 1024    | Default elements per compressed vector  |
-| Max combinations  | 5       | Max (e,f) pairs in preset               |
-| Samples per vector| 256     | Values sampled per vector               |
-| Sample vectors    | 8       | Vectors sampled per row group           |
-| FLOAT max exponent| 10      | 10^10 ~ 10 billion                      |
-| DOUBLE max exponent| 18     | 10^18 ~ 1 quintillion                   |

From 270d455eb7b32e6e1a851e66337f7e42436aa18b Mon Sep 17 00:00:00 2001
From: Prateek Gaur <prateek.gaur@snowflake.com>
Date: Wed, 3 Jun 2026 14:49:57 +0000
Subject: [PATCH 5/7] Address review: clarify power-of-10 constants and add
 fast_round sign branching
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Specify that power-of-10 constants must match IEEE 754 correctly-rounded
  decimal-to-binary conversion of literals (§5.12.2), not runtime pow()
- Add negative-value branch to fast_round formula table to avoid landing
  in a binade where ULP < 1.0, which causes unnecessary exceptions
---
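Note (not part of the commit): a minimal Python sketch of the sign-branched `fast_round` for DOUBLE, assuming Python floats are IEEE 754 doubles with the default round-to-nearest-even mode. The function name `fast_round_double` is illustrative. The FLOAT variant would use the magic number 2^22 + 2^23 and an int32 result, which plain Python floats cannot reproduce directly.

```python
MAGIC_DOUBLE = float(2**51 + 2**52)  # 6,755,399,441,055,744


def fast_round_double(value):
    """Magic-number rounding for DOUBLE; assumes |value| < 2**51."""
    if value >= 0:
        # value + magic lands in [2^52, 2^53), where adjacent doubles are 1.0 apart
        return int((value + MAGIC_DOUBLE) - MAGIC_DOUBLE)
    # mirror image for negative inputs
    return int((value - MAGIC_DOUBLE) + MAGIC_DOUBLE)
```

Because the intermediate sum is rounded by the hardware, ties go to the even integer: within the assumed range this sketch agrees with Python's built-in `round()`, so `2.5` maps to `2` and `3.5` maps to `4`.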
 Encodings.md | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/Encodings.md b/Encodings.md
index ddb8e702..f554bbc9 100644
--- a/Encodings.md
+++ b/Encodings.md
@@ -607,6 +607,11 @@ The encoding uses two separate multiplications (not a single multiplication by
 `10^(e-f)`, and not division) to ensure that implementations produce identical
 floating-point results. All implementations MUST use the exact same floating-point
 arithmetic and power-of-10 constants to guarantee cross-language interoperability.
+The power-of-10 constants MUST be the correctly-rounded IEEE 754 values of the
+decimal literals `1e0`, `1e1`, ..., `1e18` and `1e-1`, `1e-2`, ..., `1e-18` as
+defined by the decimal-to-binary conversion in IEEE 754-2008 §5.12.2.
+Implementations MUST NOT compute these constants at runtime via `pow()` or
+equivalent functions, which are not guaranteed to be correctly rounded.
 
 ##### Fast Rounding
 
@@ -614,10 +619,19 @@ The `fast_round` function uses a "magic number" technique for branchless roundin
 
 `fast_round(value)` is defined as follows:
 
-| Type   | Magic Number                      | Formula                          |
-|--------|-----------------------------------|----------------------------------|
-| FLOAT  | 2^22 + 2^23 = 12,582,912         | `(int32_t)((value + magic) - magic)` |
-| DOUBLE | 2^51 + 2^52 = 6,755,399,441,055,744 | `(int64_t)((value + magic) - magic)` |
+| Type   | Magic Number                      | Formula (value &ge; 0)           | Formula (value &lt; 0)           |
+|--------|-----------------------------------|----------------------------------|----------------------------------|
+| FLOAT  | 2^22 + 2^23 = 12,582,912         | `(int32_t)((value + magic) - magic)` | `(int32_t)((value - magic) + magic)` |
+| DOUBLE | 2^51 + 2^52 = 6,755,399,441,055,744 | `(int64_t)((value + magic) - magic)` | `(int64_t)((value - magic) + magic)` |
+
+The sign branching is necessary because the technique relies on `value ± magic`
+landing in a binade where the unit in the last place (ULP) equals 1.0. For
+non-negative values, `value + magic` lands in [2^23, 2^24) for floats or
+[2^52, 2^53) for doubles. For negative values, `value - magic` lands in
+(-2^24, -2^23] or (-2^53, -2^52] respectively, where ULP is also 1.0. Without
+sign branching, negative values with magnitude beyond 2^22 (float) or 2^51
+(double) fall into a lower binade where ULP &lt; 1.0, producing incorrect
+rounding and a higher exception rate.
 
 ##### Parameter Selection
 

From 87f1630c01ef44723297e7a349d58baae4aafea3 Mon Sep 17 00:00:00 2001
From: Prateek Gaur <prateek.gaur@snowflake.com>
Date: Wed, 3 Jun 2026 14:55:51 +0000
Subject: [PATCH 6/7] Address review: clarify parameter selection is
 encoder-only optimization
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Any valid (e,f) pair produces correct output — the decoder is agnostic.
Reframe "minimize exceptions" as one heuristic; the actual target is
smallest encoded size (bit-width + exception overhead).
---
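Note (not part of the commit): a rough Python sketch of the "bit-width plus exception overhead" cost mentioned above, useful for comparing candidate (exponent, factor) pairs. The function name `estimated_size` and its parameters are hypothetical, and the fixed vector header is left out because it is the same for every candidate.

```python
def estimated_size(encoded, num_elements, num_exceptions, value_size):
    """Approximate data-section bytes for one candidate (exponent, factor).

    encoded:        integers produced for the non-exception values
    num_elements:   number of values in the vector
    num_exceptions: values that failed the round-trip check
    value_size:     4 for FLOAT, 8 for DOUBLE
    """
    bit_width = (max(encoded) - min(encoded)).bit_length() if encoded else 0
    packed = (num_elements * bit_width + 7) // 8
    # each exception costs a uint16 position plus the original value
    return packed + num_exceptions * (2 + value_size)
```

Under this estimate, a candidate with no exceptions but a 20-bit range costs 2560 bytes for 1024 doubles, while one with 5 exceptions and a 10-bit range costs 1330 bytes, so the pair with fewer exceptions is not always the smaller one.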
 Encodings.md | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/Encodings.md b/Encodings.md
index f554bbc9..1e557005 100644
--- a/Encodings.md
+++ b/Encodings.md
@@ -635,7 +635,15 @@ rounding and a higher exception rate.
 
 ##### Parameter Selection
 
-The encoder selects the (exponent, factor) pair that minimizes exceptions.
+Any valid (exponent, factor) pair produces a correct encoding — the decoder is
+agnostic to the selection strategy, and the exception mechanism guarantees
+round-trip fidelity regardless of which pair is chosen. The choice only affects
+compression ratio.
+
+The encoder SHOULD select the (exponent, factor) pair that produces the smallest
+encoded output. A simple heuristic is to minimize exception count; a more precise
+approach accounts for both bit-width and exception overhead.
+
 Valid combinations satisfy 0 &le; factor &le; exponent:
 
 | Type   | Max Exponent | Total Combinations |
@@ -647,7 +655,7 @@ To avoid the cost of exhaustive search on every vector, implementations
 can use a sampling approach. One such approach, described in the paper, is to
 select up to 5 candidate (exponent, factor) combinations (the "encoding preset")
 at the start of each column chunk, and when encoding each vector,
-test each of the 5 candidates for the fewest exceptions.
+evaluate each candidate for the best compression.
 
 Suggested sampling parameters (from the paper):
 

From 2169e26d9d8ba9993592749fe3ccb3f385d5bd8a Mon Sep 17 00:00:00 2001
From: Prateek Gaur <prateek.gaur@snowflake.com>
Date: Wed, 3 Jun 2026 14:57:50 +0000
Subject: [PATCH 7/7] Address review: fix out-of-range condition to reference
 correct integer types

The example incorrectly referenced only INT32_MAX for all types. Reworded
to specify int32 for FLOAT and int64 for DOUBLE, avoiding exact numeric
limits that differ from INT_MAX due to float representability.
---
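Note (not part of the commit): a small Python sketch of why the wording avoids exact integer limits, assuming Python floats are IEEE 754 doubles. `INT64_MAX` (2^63 - 1) is not representable as a double and converts to 2^63, so a range check is easier to state against the power of two. The name `fits_int64` is hypothetical; the FLOAT case is analogous with 2^31 and int32.

```python
INT64_BOUND = 2.0 ** 63  # exactly representable as a double; 2**63 - 1 is not


def fits_int64(scaled):
    """True if a scaled DOUBLE value can be converted to int64 without overflow."""
    # NaN fails both comparisons, and +/-Inf fails one of them, so they are rejected too.
    return -INT64_BOUND <= scaled < INT64_BOUND
```

Here `float(2**63 - 1) == 2.0 ** 63` evaluates to `True`, which is why comparing against a float-converted `INT64_MAX` with `<=` would wrongly accept 2^63 itself.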
 Encodings.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Encodings.md b/Encodings.md
index 1e557005..05d9a5e0 100644
--- a/Encodings.md
+++ b/Encodings.md
@@ -674,7 +674,7 @@ A value becomes an exception if any of the following is true:
 | NaN                | `NaN`                      | Cannot convert to integer        |
 | Infinity           | `+Inf`, `-Inf`             | Cannot convert to integer        |
 | Negative zero      | `-0.0`                     | Would become `+0.0` after encoding |
-| Out of range       | value * 10^e > INT32\_MAX  | Exceeds target integer limits    |
+| Out of range       | scaled value outside int32 (FLOAT) or int64 (DOUBLE) | Exceeds target integer type range |
 | Round-trip failure  | `0.333...` with e=1, f=0  | `decode(encode(v)) != v`         |
 
 Exception values at positions in the vector are replaced with a placeholder