Skip to content

feat: Add UUID and TIME_UUID as feature types (#5885)#5951

Open
soooojinlee wants to merge 12 commits intofeast-dev:masterfrom
soooojinlee:feat/add-uuid-feature-types
Open

feat: Add UUID and TIME_UUID as feature types (#5885)#5951
soooojinlee wants to merge 12 commits intofeast-dev:masterfrom
soooojinlee:feat/add-uuid-feature-types

Conversation

@soooojinlee
Copy link

@soooojinlee soooojinlee commented Feb 8, 2026

What this PR does / why we need it:

Adds UUID and TIME_UUID as native Feast feature types, resolving #5885. Currently UUID values must be stored as STRING, which loses type semantics, prevents backend-specific features (e.g. Cassandra timeuuid range queries), and makes PostgreSQL uuid columns infer as STRING. This PR enables users to declare UUID features with Field(name="user_id", dtype=Uuid) and receive uuid.UUID objects from get_online_features().to_dict().

Design Decisions

Why two types (UUID vs TIME_UUID)?
The issue author explicitly requested distinguishing time-based UUID (uuid1) and random UUID (uuid4). Both serialize
identically to string in proto, but separate types allow expressing intent in feature definitions and enable future backend-specific optimizations.

Why dedicated proto fields (uuid_val, time_uuid_val)?
Following the pattern established by SET types (PR #5888) and UNIX_TIMESTAMP (which reuses int64/Int64List), we add dedicated oneof fields that reuse existing proto scalar types (string and StringList). This allows WhichOneof("val") to identify UUID types directly from the proto message, without requiring a side-channel.

Backward compatibility for data stored before this change:
OnlineResponse accepts an optional feature_types dict. When data was previously stored as string_val, this metadata enables feast_value_type_to_python_type() to convert it to uuid.UUID. New materializations use uuid_val/time_uuid_val and are identified automatically.

Changes

Layer Files Description
Proto Value.proto, generated *_pb2.py/*_pb2.pyi Add UUID=30, TIME_UUID=31, UUID_LIST=32, TIME_UUID_LIST=33 to ValueType.Enum; add uuid_val, time_uuid_val, uuid_list_val, time_uuid_list_val to Value.oneof
Type system value_type.py, types.py Add UUID, TIME_UUID, UUID_LIST, TIME_UUID_LIST enums and Uuid/TimeUuid aliases
Type conversion type_map.py Add mappings to ~11 conversion dicts (proto, PyArrow, pandas, PostgreSQL, Couchbase, Snowflake); switch from string_val to uuid_val; add PROTO_VALUE_TO_VALUE_TYPE_MAP entries for UUID fields
Online response online_response.py, online_store.py, feature_store.py, utils.py Pass feature_types metadata for backward-compatible deserialization
ODFV on_demand_feature_view.py Add UUID/TIME_UUID sample values for schema inference

Backward Compatibility

  • Data previously stored as string_val still deserializes correctly via the feature_types side-channel
  • New materializations use dedicated uuid_val/time_uuid_val proto fields
  • feast_value_type_to_python_type(v) without feature_type now returns uuid.UUID for uuid_val fields (previously returned plain string for string_val)
  • PostgreSQL uuid columns now infer as ValueType.UUID (previously ValueType.STRING)
  • Go SDK: proto changes compile without errors; UUID handling logic is not implemented (out of scope)

Tests

  • test_types.py: Uuid/TimeUuid ↔ ValueType bidirectional conversion, Array types
  • test_type_map.py: Proto roundtrip with uuid_val, uuid.UUID object return, backward compatibility for string_val, UUID list roundtrip, PostgreSQL mapping
  • All 78 unit tests passing
  • ruff lint and format checks passing

@soooojinlee soooojinlee requested review from a team as code owners February 8, 2026 14:55
@soooojinlee soooojinlee requested review from ejscribner, robhowley and tokoko and removed request for a team February 8, 2026 14:56
devin-ai-integration[bot]

This comment was marked as resolved.

@soooojinlee soooojinlee force-pushed the feat/add-uuid-feature-types branch from 1d4cd01 to 4a5c932 Compare February 8, 2026 15:00
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@nquinn408
Copy link
Contributor

@soooojinlee , thanks so much for putting this together! Can you rebase to bring this PR up to date?

@ntkathole ntkathole force-pushed the feat/add-uuid-feature-types branch from 2c56521 to 1fd106a Compare February 11, 2026 04:44
@ntkathole ntkathole force-pushed the feat/add-uuid-feature-types branch 3 times, most recently from 26198a6 to 525ac72 Compare February 11, 2026 14:14
@soooojinlee soooojinlee force-pushed the feat/add-uuid-feature-types branch 3 times, most recently from cb6dd44 to 54c2eae Compare February 12, 2026 05:55
@ntkathole ntkathole force-pushed the feat/add-uuid-feature-types branch from 2dd648a to b280364 Compare February 16, 2026 06:44
@nquinn408
Copy link
Contributor

@soooojinlee , can you also add support for UUID_SET and TIME_UUID_SET?

@nquinn408
Copy link
Contributor

@soooojinlee , can you also update the docs with the newly supported types?

soooojinlee and others added 4 commits February 17, 2026 13:14
Signed-off-by: soojin <soojin@dable.io>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>
Signed-off-by: soojin <soojin@dable.io>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>
Add uuid_val, time_uuid_val, uuid_list_val, time_uuid_list_val as
dedicated oneof fields in the Value proto message, replacing the
previous reuse of string_val/string_list_val. This allows UUID types
to be identified from the proto field alone without requiring a
feature_types side-channel. Backward compatibility is maintained for
data previously stored as string_val.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>
Signed-off-by: soojin <soojin@dable.io>
Signed-off-by: soojin <soojin@dable.io>
Signed-off-by: soojin <soojin@dable.io>
@soooojinlee soooojinlee force-pushed the feat/add-uuid-feature-types branch from b280364 to 180434b Compare February 17, 2026 04:20
Add Set(Uuid) and Set(TimeUuid) as feature types with full roundtrip
support, backward compatibility, and documentation for all UUID types.

Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@soooojinlee soooojinlee force-pushed the feat/add-uuid-feature-types branch from 180434b to 41cfe7e Compare February 17, 2026 04:26
@soooojinlee
Copy link
Author

@soooojinlee , can you also add support for UUID_SET and TIME_UUID_SET?

@soooojinlee , can you also update the docs with the newly supported types?

I added UUID_SET / TIME_UUID_SET support and updated the type system documentation as requested in here 41cfe7e

@soooojinlee soooojinlee force-pushed the feat/add-uuid-feature-types branch from 0d8115a to 3bf640a Compare February 17, 2026 13:07
…ype mappings

Keep PDF_BYTES=30 and IMAGE_BYTES=31 at their upstream values instead of
renumbering them. Shift UUID types to 32-37 in both proto and Python enum.

Also add missing SET type entries in _convert_value_type_str_to_value_type(),
convert_array_column(), and _get_sample_values_by_type() for completeness.

Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@soooojinlee soooojinlee force-pushed the feat/add-uuid-feature-types branch from 3bf640a to dceaa46 Compare February 17, 2026 13:12
devin-ai-integration[bot]

This comment was marked as resolved.

soooojinlee and others added 2 commits February 17, 2026 22:37
The comment claimed Sets do not support UUID/TimeUuid but the code
intentionally allows them. Updated to reflect actual behavior.

Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants