Skip to content

Add fluentbit -compact compressions that preserve CRI time nanoseconds#2

Merged
solsson merged 10 commits into
mainfrom
cri-columns-duckdb-parquet-optimization
Feb 24, 2026
Merged

Add fluentbit -compact compressions that preserve CRI time nanoseconds#2
solsson merged 10 commits into
mainfrom
cri-columns-duckdb-parquet-optimization

Conversation

@solsson
Copy link
Copy Markdown
Contributor

@solsson solsson commented Feb 24, 2026

... without the storage space and parsing cost of iso 8601 varchars. Also use dictionaries for "stream" (stdout/stderr) and "logtag" (logtag) where possible (arrow, not parquet, AFAIK), again to reduce storage space. Use ZSTD for arrow output because that's what the nanoarrow extension (see https://duckdb.org/2025/05/23/arrow-ipc-support-in-duckdb) supports at the time of writing.

solsson and others added 10 commits February 23, 2026 22:17
Write each log record to S3 twice via two fluent-bit s3 outputs:
- .arrow files: CRI time preserved as VARCHAR (nanosecond ISO 8601)
  with json_date_key=false suppressing the internal timestamp
- .parquet files: time_ms as BIGINT (epoch_ms) alongside CRI time

y-logcli gains -f flag (arrow/parquet/both) to query either or both
formats. UNION ALL combines them with matching TIMESTAMPTZ columns.

test.sh polls each format independently and asserts:
- Schema: arrow time is VARCHAR, parquet time_ms is BIGINT
- Arrow time values match CRI timestamp format (nanosecond precision)
- Both formats contain equal row counts
- SIGTERM flush works with dual outputs

Path layout adds NODE_NAME between date and pod/container components.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Arrow-compact stores Timestamp(ns) without timezone metadata.
The previous `time::TIMESTAMPTZ` cast made DuckDB interpret the
value as local time, causing timestamps to display 1 hour off
(in CET) from the parquet-compact output. Using `AT TIME ZONE 'UTC'`
correctly tells DuckDB the timezone-naive nanos are UTC.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The arrow-compact compression was incorrectly calling
table_to_parquet_buffer(), producing Parquet files with a .arrow
extension. Now uses a new table_to_arrow_ipc_buffer() that writes
uncompressed Feather v2 (Arrow IPC) via garrow_table_write_as_feather
with GARROW_COMPRESSION_TYPE_UNCOMPRESSED (nanoarrow/DuckDB cannot
decode LZ4-compressed Arrow IPC bodies).

Also:
- Skip dictionary encoding for arrow-compact (breaks nanoarrow reader)
- y-logcli: use read_arrow() via nanoarrow extension for .arrow files
- test.sh: use read_arrow for schema checks, show distinct metadata
  for each format (DESCRIBE for Arrow IPC, parquet_schema/metadata
  for Parquet)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Write stream and logtag as Arrow dictionary type in both the Arrow IPC
and Parquet output paths. pyarrow confirms dictionary<values=string,
indices=int32> in the written files.

DuckDB/nanoarrow cannot read dictionary-encoded Arrow IPC
(paleolimbot/duckdb-nanoarrow#25), so test.sh wraps nanoarrow-based
arrow assertions with ON_DUCKDB_ARROW_FAILURE (default: continue).
The pyarrow probe via arrow-tools image is now the authoritative
Arrow IPC format validator.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- inspect_arrow.py: format timestamps as ISO 8601 with ns precision,
  show first 5 rows (burst shows distinct ns per row), pick earliest
  file from log-generator container
- test.sh: print progress before DuckDB arrow probe and pyarrow
  validation steps so the script doesn't appear stalled

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
stream (stdout/stderr) and logtag (F/P) have very low cardinality,
so int8 indices save 3 bytes per row vs the default int32. After
dictionary encoding, cast indices from int32 to int8 and rebuild
the dictionary array with the compact index type.

Unit tests verify int8 via assert_dict_int8() (32 passed, 0 failed).
test.sh pyarrow assertion confirms dictionary<values=string, indices=int8>.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Gives us both stream values (stdout/stderr) in the test data,
exercising the dictionary-encoded stream column.

y-logcli: print column types above table output

DuckDB's box mode truncates long type names like
TIMESTAMP WITH TIME ZONE. Print a -- name: TYPE header
before the table so full types are always visible.

y-logcli: add -o columns mode with ISO 8601 nanosecond timestamps

Compact output: space-separated time, pod, container, stream, message
(truncated to 60 chars). Timestamps use AT TIME ZONE 'UTC' to extract
UTC from TIMESTAMPTZ before formatting with epoch_ns nanoseconds.

test.sh: print DuckDB default interpretation of persisted formats

Shows DESCRIBE read_parquet and read_arrow output so we can see how
DuckDB maps the on-disk types without any explicit type handling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ecision

DuckDB maps Parquet isAdjustedToUTC=true to TIMESTAMP WITH TIME ZONE
which is microsecond precision, silently truncating nanoseconds. Both
formats now write Timestamp(ns) without timezone annotation — DuckDB
reads as TIMESTAMP_NS preserving full nanosecond precision. Timestamps
are UTC by convention.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@solsson solsson merged commit a1a5972 into main Feb 24, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant