Improve dataclass type guessing to handle nested types #3375

wild-endeavor · 2026-01-29T02:37:37Z

Why are the changes needed?

When using FlyteRemote.execute() with a Pydantic model that contains a list of another Pydantic model (e.g., jobs: List[JobConfig]), the execution fails with a KeyError: 'type' error.

This occurs because Pydantic generates JSON schemas that use $ref to reference nested model definitions in $defs. For example:

  from typing import List
  from pydantic import BaseModel

  class JobConfig(BaseModel):
      job_config_id: str

  class SchedulerConfig(BaseModel):
      jobs: List[JobConfig]  # This generates: {'items': {'$ref': '#/$defs/JobConfig'}, 'type': 'array'}

The _get_element_type() function in type_engine.py did not handle $ref references, expecting a direct type key instead, which caused the KeyError.
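The failure mode can be reproduced without flytekit or Pydantic. The snippet below is a minimal sketch: the schema is hand-written to resemble the fragment quoted above (it is not captured from a real run), and naive_element_type is a hypothetical stand-in for a lookup that assumes every array item carries a direct type key. It is not the actual body of _get_element_type().

```python
# Hand-written schema resembling the one described above for
# SchedulerConfig.jobs: List[JobConfig]. Assumption: real output may carry
# extra keys such as titles, which do not matter for this illustration.
schema = {
    "$defs": {
        "JobConfig": {
            "type": "object",
            "properties": {"job_config_id": {"type": "string"}},
        }
    },
    "type": "object",
    "properties": {
        "jobs": {"type": "array", "items": {"$ref": "#/$defs/JobConfig"}},
    },
}


def naive_element_type(prop: dict) -> str:
    # Hypothetical pre-fix style lookup: assumes "items" always has "type".
    return prop["items"]["type"]


error = None
try:
    naive_element_type(schema["properties"]["jobs"])
except KeyError as exc:
    # The items dict only has "$ref", so the "type" lookup fails.
    error = exc

print(repr(error))  # KeyError('type')
```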

What changes were proposed in this pull request?

This PR adds support for $ref resolution in JSON schema processing:

generate_attribute_list_from_dataclass_json_mixin: Added handling for top-level $ref properties (e.g., field: NestedModel)
_handle_json_schema_property: Added a schema parameter to pass the full JSON schema context, enabling $ref resolution in nested calls
_get_element_type: Added $ref handling for array items (e.g., List[NestedModel]) and dict values (e.g., Dict[str, NestedModel]). When a $ref is encountered, it looks up the referenced definition in $defs (or definitions for older schemas) and recursively converts it to a Python class.
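The resolution strategy described above might look roughly like the following. This is a standalone sketch, not the code in type_engine.py: resolve_ref, to_python_type, and build_class are hypothetical names, the primitive mapping is deliberately small, and dataclasses.make_dataclass stands in for whatever class construction flytekit actually performs.

```python
from dataclasses import fields, make_dataclass
from typing import Any, Dict, List

# Deliberately small mapping; anything else falls back to Any in this sketch.
_PRIMITIVES = {"string": str, "integer": int, "number": float, "boolean": bool}


def resolve_ref(ref: str, schema: dict) -> dict:
    # "#/$defs/JobConfig" -> look up "JobConfig" in $defs, or in
    # "definitions" for older schemas.
    name = ref.split("/")[-1]
    defs = schema.get("$defs") or schema.get("definitions") or {}
    return defs[name]


def to_python_type(prop: dict, schema: dict) -> Any:
    if "$ref" in prop:
        name = prop["$ref"].split("/")[-1]
        return build_class(name, resolve_ref(prop["$ref"], schema), schema)
    kind = prop.get("type")
    if kind == "array":
        return List[to_python_type(prop["items"], schema)]
    if kind == "object" and isinstance(prop.get("additionalProperties"), dict):
        return Dict[str, to_python_type(prop["additionalProperties"], schema)]
    return _PRIMITIVES.get(kind, Any)


def build_class(name: str, definition: dict, schema: dict) -> type:
    # The full schema is threaded through so nested calls can resolve $ref.
    attrs = [
        (field_name, to_python_type(prop, schema))
        for field_name, prop in definition.get("properties", {}).items()
    ]
    return make_dataclass(name, attrs)


schema = {
    "$defs": {
        "JobConfig": {
            "type": "object",
            "properties": {"job_config_id": {"type": "string"}},
        }
    },
    "type": "object",
    "properties": {
        "jobs": {"type": "array", "items": {"$ref": "#/$defs/JobConfig"}},
        "by_name": {
            "type": "object",
            "additionalProperties": {"$ref": "#/$defs/JobConfig"},
        },
    },
}

SchedulerConfig = build_class("SchedulerConfig", schema, schema)
hints = {f.name: f.type for f in fields(SchedulerConfig)}
```

Passing the whole schema down each call mirrors the schema parameter added to _handle_json_schema_property. The sketch does not guard against self-referential definitions, which would recurse without end.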

The fix follows a similar pattern to the one implemented in flyte-sdk PR #426.

How was this patch tested?

Added unit tests in tests/flytekit/unit/core/test_dataclass_guessing.py:

test_guessing_of_nested_pydantic: Tests List[NestedModel] with round-trip JSON serialization
test_guessing_of_nested_pydantic_mapped: Tests Dict[str, NestedModel] with round-trip JSON serialization
test_strict_type_hint_matching_with_nested_pydantic: Verifies that strict_type_hint_matching correctly identifies Pydantic types with nested models

Setup process

Screenshots

Check all the applicable boxes

I updated the documentation accordingly.
All new and existing tests passed.
All commits are signed-off.

Related PRs

https://github.com/flyteorg/flyte-sdk/pull/426/changes

Docs link

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

codecov · 2026-01-29T02:40:36Z

Codecov Report

❌ Patch coverage is 22.58065% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.10%. Comparing base (330a1e8) to head (a609b4f).

Files with missing lines        Patch %   Lines
flytekit/core/type_engine.py    22.58%    22 Missing and 2 partials ⚠️

❗ There is a different number of reports uploaded between BASE (330a1e8) and HEAD (a609b4f). Click for more details.

HEAD has 58 uploads less than BASE

Flag    BASE (330a1e8)    HEAD (a609b4f)
        64                6

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3375      +/-   ##
==========================================
- Coverage   81.53%   76.10%   -5.43%     
==========================================
  Files         324      216     -108     
  Lines       28004    22742    -5262     
  Branches     2981     2988       +7     
==========================================
- Hits        22832    17308    -5524     
- Misses       4335     4570     +235     
- Partials      837      864      +27


Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

wild-endeavor added 2 commits January 27, 2026 11:04

pausing to investigate how we ended up in dataclass territory

7629931

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

remove test for now

a609b4f

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

wild-endeavor added 2 commits January 28, 2026 19:00

add a test

54263a1

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

add test for map

fb0bd50

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

wild-endeavor marked this pull request as ready for review January 29, 2026 03:02

wild-endeavor requested review from cosmicBboy, davidmirror-ops, kumare3, machichima, pingsutw and samhita-alla as code owners January 29, 2026 03:02

kumare3 approved these changes Jan 29, 2026


3 participants
