MAX_COLUMN_BYTES is applied only by default only if no annotation is … by jakubZielAdform · Pull Request #59 · adform/stream-loader

jakubZielAdform · 2026-07-02T07:51:17Z

…applied on a field

Srb1996 · 2026-07-03T05:34:42Z

+                Column(q"false", q"${fl.length}", q"pw.writeFixedString(r, ${fl.length}, ${fl.truncate})")
+              case ml: MaxLength =>
+                Column(q"false", q"-1", q"pw.writeVarString(r, ${ml.length}, ${ml.truncate})")
+            }.get


If a field carries an annotation that matches none of these cases, this get will throw an exception. Can we add a getOrElse with an abort, so that we get a clear message instead of a bare None.get failure?
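A minimal sketch of the suggested pattern, written outside the macro so it can run on its own. It assumes the lookup in the diff is an Option-returning collectFirst over the field's annotations. The FieldAnnotation trait, the Unrelated case, the simplified Column class and the stringColumn helper are hypothetical stand-ins, and sys.error takes the place of the macro context's abort.

```scala
// Hypothetical stand-ins for the annotation types matched in the diff.
sealed trait FieldAnnotation
final case class FixedLength(length: Int, truncate: Boolean) extends FieldAnnotation
final case class MaxLength(length: Int, truncate: Boolean) extends FieldAnnotation
final case class Unrelated(name: String) extends FieldAnnotation

object AnnotationLookup {
  // Simplified column description; plain strings replace the quasiquotes used in the macro.
  final case class Column(fixedSize: Int, writer: String)

  // Outside a macro, sys.error stands in for aborting compilation with a message.
  def abort(msg: String): Nothing = sys.error(msg)

  def stringColumn(field: String, annotations: List[FieldAnnotation]): Column =
    annotations
      .collectFirst {
        case fl: FixedLength =>
          Column(fl.length, s"writeFixedString(${fl.length}, ${fl.truncate})")
        case ml: MaxLength =>
          Column(-1, s"writeVarString(${ml.length}, ${ml.truncate})")
      }
      // Fail with a descriptive message instead of an anonymous None.get.
      .getOrElse(abort(s"Field '$field': expected FixedLength or MaxLength, got $annotations"))
}
```

With this shape, an unexpected annotation produces an error that names the field and the annotations found, which is easier to act on than a NoSuchElementException.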

Srb1996 · 2026-07-03T05:44:45Z

+              case fl: FixedLength =>
+                Column(q"false", q"${fl.length}", q"pw.writeFixedString(r, ${fl.length}, ${fl.truncate})")
+              case ml: MaxLength =>
+                Column(q"false", q"-1", q"pw.writeVarString(r, ${ml.length}, ${ml.truncate})")


What would be the maximum allowed length of the field?

IMO there should not be any max length when the user adds an annotation.
The case class processed here is supposed to map the DB table and be the ultimate source of truth.

Tbh I think silent truncation is also a bad option. I suppose it is done because one 'too big' record would otherwise fail the loading of the entire file.

I think it would be nice to consider reading the schema at the beginning, treating it as the source of truth, and removing 'too big' records from a batch based on it. Of course that would have to be done at runtime, not in preprocessing.

wdyt?
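A rough sketch of what such runtime filtering could look like, assuming the schema can be reduced to a map from column name to maximum byte length. The Schema and Record aliases and the OversizedRecordFilter object are hypothetical and not part of stream-loader. Lengths are measured in UTF-8 bytes on the assumption that the column limits are byte limits.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object OversizedRecordFilter {
  // Hypothetical schema: column name -> maximum byte length read from the table definition.
  type Schema = Map[String, Int]
  // Hypothetical record shape: column name -> string value.
  type Record = Map[String, String]

  // True when every value that has a declared limit fits within it (UTF-8 bytes).
  def fits(schema: Schema, record: Record): Boolean =
    record.forall { case (column, value) =>
      schema.get(column).forall(max => value.getBytes(UTF_8).length <= max)
    }

  // Splits a batch into (accepted, rejected) without truncating any value.
  def split(schema: Schema, batch: Seq[Record]): (Seq[Record], Seq[Record]) =
    batch.partition(fits(schema, _))
}
```

Returning the rejected records alongside the accepted ones keeps the drop visible, so the caller can log or reroute them rather than losing data silently.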

jakubZielAdform · 2026-07-04T23:02:44Z

        s"KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://${dockerNetwork.ip}:$kafkaPort",
        s"KAFKA_CONTROLLER_QUORUM_VOTERS=1@127.0.0.1:$kafkaControllerPort",
        s"KAFKA_LOG_RETENTION_HOURS=${Int.MaxValue}",
+        s"KAFKA_MESSAGE_MAX_BYTES=${32_000_000}",


Allow bigger message sizes on the Kafka broker (will be usable once the test classes are exposed via #60).

jakubZielAdform force-pushed the get-rid-of-max-column-bytes-default-limit-for-string-fields-in-generated-encoders branch 2 times, most recently from 5742e55 to d941d85 Compare July 2, 2026 10:06

MAX_COLUMN_BYTES is applied only by default only if no annotation is …

91a07a1

…applied on a field

jakubZielAdform force-pushed the get-rid-of-max-column-bytes-default-limit-for-string-fields-in-generated-encoders branch from d941d85 to 91a07a1 Compare July 2, 2026 10:07

Srb1996 suggested changes Jul 3, 2026

jakubZielAdform force-pushed the get-rid-of-max-column-bytes-default-limit-for-string-fields-in-generated-encoders branch 2 times, most recently from f4513fb to 76bd37a Compare July 3, 2026 13:51

native vertica encoder - handle an unexpected annotation

58108ec

jakubZielAdform force-pushed the get-rid-of-max-column-bytes-default-limit-for-string-fields-in-generated-encoders branch 2 times, most recently from 813c9bd to f53b5ac Compare July 4, 2026 22:59

jakubZielAdform commented Jul 4, 2026

jakubZielAdform force-pushed the get-rid-of-max-column-bytes-default-limit-for-string-fields-in-generated-encoders branch from f53b5ac to 58108ec Compare July 4, 2026 23:26

allow for bigger kafka msg size (32MB)

55480be

jakubZielAdform wants to merge 3 commits into master from get-rid-of-max-column-bytes-default-limit-for-string-fields-in-generated-encoders

2 participants
