Skip to content

fix(preprocess): rename dotted $defs to keep generated paths flat#29

Open
sakinaroufid wants to merge 1 commit intoUniversal-Commerce-Protocol:mainfrom
sakinaroufid:fix/flatten-dotted-defs
Open

fix(preprocess): rename dotted $defs to keep generated paths flat#29
sakinaroufid wants to merge 1 commit intoUniversal-Commerce-Protocol:mainfrom
sakinaroufid:fix/flatten-dotted-defs

Conversation

@sakinaroufid
Copy link
Copy Markdown

Description

Schema names with dots (e.g. dev.ucp.shopping.checkout) trip up datamodel-codegen. It reads the dots as path separators and emits stuff like shopping/fulfillment/dev/ucp/shopping.py. See #28 for live examples: ap2_mandate, buyer_consent, discount, and fulfillment all hit it.

This adds a 64-line pass to preprocess_schemas.py that walks each schema, renames dotted $defs keys to their last segment, and patches the $refs. Runs before the existing allOf flattening. Generated paths come out clean.

Heads up: a few extension classes change import paths. They only landed at the nested paths because of the bug, and nothing released uses them, but worth flagging.

Category

  • Infrastructure: CI/CD, Linters, or build scripts.

Checklist

  • I have followed the Contributing Guide.
  • I have updated the documentation (if applicable). N/A, build script change.
  • My changes pass all local linting and formatting checks.
  • (For Core/Capability) I have included/updated the relevant JSON schemas. N/A.
  • I have regenerated Python Pydantic models by running generate_models.sh under python_sdk. Intentionally skipped; feat: update models to UCP release 2026-04-08 #28 owns the regen.

Logs

  • pre-commit run --all-files passes.
  • ./generate_models.sh 2026-04-08 with this fix produces clean paths (no dev/ucp/shopping.py).
  • Checkout payload round-trips through model_validate / model_dump.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a mechanism to flatten dotted names within JSON schema $defs to prevent datamodel-codegen from creating nested directory structures. It adds flatten_dotted_defs to rename keys and _rewrite_local_defs_refs to update internal references. Feedback highlights that the reference rewriting logic is limited to local paths and may miss cross-file references. Additionally, a code suggestion was provided to handle edge cases where a dotted name might end in a period, which could result in an invalid empty string key.

Comment thread preprocess_schemas.py
Comment on lines +273 to +285
def _rewrite_local_defs_refs(node, rename_map):
"""Walks a schema tree and rewrites local $defs refs whose target was renamed."""
prefix = "#/$defs/"
for n in iter_nodes(node):
if not isinstance(n, dict):
continue
ref = n.get("$ref")
if not isinstance(ref, str) or not ref.startswith(prefix):
continue
rest = ref[len(prefix) :]
name, sep, tail = rest.partition("/")
if name in rename_map:
n["$ref"] = prefix + rename_map[name] + (sep + tail if sep else "")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The _rewrite_local_defs_refs function only handles local references starting with #/$defs/. If the schema contains cross-file references to specific definitions (e.g., other.json#/$defs/dotted.name), these will not be updated when the target definition is renamed in other.json. This could lead to dangling references and broken schemas if such cross-file fragment references are used in the repository.

Comment thread preprocess_schemas.py
Comment on lines +313 to +320
tail = old.rsplit(".", 1)[-1]
if tail not in existing:
new = tail
else:
new = old.replace(".", "_")
if new in existing:
# Both candidates collide; leave as-is rather than risk corruption
continue
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current renaming strategy might produce an empty string as a key if a definition name ends with a dot (e.g., com.example.). While unlikely for reverse-DNS names, an empty string key is technically valid in JSON but may cause issues in downstream code generation or validation. Consider adding a check to ensure tail is not empty before using it as a new key.

Suggested change
tail = old.rsplit(".", 1)[-1]
if tail not in existing:
new = tail
else:
new = old.replace(".", "_")
if new in existing:
# Both candidates collide; leave as-is rather than risk corruption
continue
tail = old.rsplit(".", 1)[-1]
if tail and tail not in existing:
new = tail
else:
new = old.replace(".", "_")
if new in existing:
# Both candidates collide; leave as-is rather than risk corruption
continue

Upstream UCP uses reverse-DNS extension mount points such as
"dev.ucp.shopping.checkout" in $defs. datamodel-codegen treats dots in
def names as path separators, so those defs were being emitted as nested
directory trees (e.g. shopping/fulfillment/dev/ucp/shopping.py) instead
of inline classes in the parent schema's module.

This added a normalization pass that renames such keys before codegen.
Strategy: prefer the last dotted component (giving a clean class name
like "Checkout"); fall back to dot-replaced-with-underscore if the bare
tail would collide with an existing def in the same file (the
fulfillment.json case, where the dotted key shares its tail with a
sibling "fulfillment" def).

Local $ref pointers to renamed defs are rewritten in the same pass so
no internal references break. No upstream schemas currently reference
the dotted defs by $ref; the rewrite is a safety net.

Effect on regen: extension classes return to their natural import paths.
For example, ucp_sdk.models.schemas.shopping.fulfillment.Checkout was
being relocated to ucp_sdk.models.schemas.shopping.fulfillment.dev.ucp.
shopping.Checkout under current upstream; this restores the flat path.
@sakinaroufid sakinaroufid force-pushed the fix/flatten-dotted-defs branch from deea90b to bcda715 Compare April 26, 2026 02:05
@sakinaroufid sakinaroufid marked this pull request as ready for review April 26, 2026 02:06
@sakinaroufid sakinaroufid requested review from a team as code owners April 26, 2026 02:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant