bugfix #4 - validate_records drops col if first row cell null with list by sbatchelder · Pull Request #5 · WHOIGit/amplify-db-utils

sbatchelder · 2026-06-30T17:40:45Z

This PR adds two tests and fixes #4.

Problem

When validate_records received list[dict] input, it built the Arrow table with pa.Table.from_pylist(processed), without passing the target schema. PyArrow then inferred the columns from the records themselves. If the first record omitted a nullable list column, the column was inferred as all-null and every later record's list value was silently dropped. The result was order-dependent: the same records in a different order produced different output.

Fix

Build the table against the declared schema instead of letting PyArrow infer it:

present_names = {key for record in records for key in record}
build_schema = pa.schema([f for f in schema if f.name in present_names])
table = pa.Table.from_pylist(processed, schema=build_schema)

Passing schema=build_schema is the key change: column types now come from the declared schema, not from whichever record happens to come first.

build_schema is restricted to the schema fields actually present across the input rather than the full schema. This preserves the existing downstream behavior: missing required columns still raise ValueError, and missing nullable columns are still null-filled.

Tests

Adds two regression tests covering:

- row orderings (first record missing the list column, and first record containing it), to guard against the order-dependent behavior
- input type

johnwaalsh

Looks good to me!

bugfix #4 - validte_records drops col if first row cell null with list

03db376

input type

sbatchelder changed the title ~~bugfix #4 - validte_records drops col if first row cell null with list~~ bugfix #4 - validate_records drops col if first row cell null with list Jun 30, 2026

sbatchelder requested a review from joefutrelle June 30, 2026 19:18

sbatchelder self-assigned this Jun 30, 2026

sbatchelder requested a review from johnwaalsh June 30, 2026 19:18

johnwaalsh approved these changes Jul 1, 2026

sbatchelder wants to merge 1 commit into main from fix/null-first-row
