fix(preprocess): rename dotted $defs to keep generated paths flat#29
fix(preprocess): rename dotted $defs to keep generated paths flat#29sakinaroufid wants to merge 1 commit intoUniversal-Commerce-Protocol:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to flatten dotted names within JSON schema $defs to prevent datamodel-codegen from creating nested directory structures. It adds flatten_dotted_defs to rename keys and _rewrite_local_defs_refs to update internal references. Feedback highlights that the reference rewriting logic is limited to local paths and may miss cross-file references. Additionally, a code suggestion was provided to handle edge cases where a dotted name might end in a period, which could result in an invalid empty string key.
| def _rewrite_local_defs_refs(node, rename_map): | ||
| """Walks a schema tree and rewrites local $defs refs whose target was renamed.""" | ||
| prefix = "#/$defs/" | ||
| for n in iter_nodes(node): | ||
| if not isinstance(n, dict): | ||
| continue | ||
| ref = n.get("$ref") | ||
| if not isinstance(ref, str) or not ref.startswith(prefix): | ||
| continue | ||
| rest = ref[len(prefix) :] | ||
| name, sep, tail = rest.partition("/") | ||
| if name in rename_map: | ||
| n["$ref"] = prefix + rename_map[name] + (sep + tail if sep else "") |
There was a problem hiding this comment.
The _rewrite_local_defs_refs function only handles local references starting with #/$defs/. If the schema contains cross-file references to specific definitions (e.g., other.json#/$defs/dotted.name), these will not be updated when the target definition is renamed in other.json. This could lead to dangling references and broken schemas if such cross-file fragment references are used in the repository.
| tail = old.rsplit(".", 1)[-1] | ||
| if tail not in existing: | ||
| new = tail | ||
| else: | ||
| new = old.replace(".", "_") | ||
| if new in existing: | ||
| # Both candidates collide; leave as-is rather than risk corruption | ||
| continue |
There was a problem hiding this comment.
The current renaming strategy might produce an empty string as a key if a definition name ends with a dot (e.g., com.example.). While unlikely for reverse-DNS names, an empty string key is technically valid in JSON but may cause issues in downstream code generation or validation. Consider adding a check to ensure tail is not empty before using it as a new key.
| tail = old.rsplit(".", 1)[-1] | |
| if tail not in existing: | |
| new = tail | |
| else: | |
| new = old.replace(".", "_") | |
| if new in existing: | |
| # Both candidates collide; leave as-is rather than risk corruption | |
| continue | |
| tail = old.rsplit(".", 1)[-1] | |
| if tail and tail not in existing: | |
| new = tail | |
| else: | |
| new = old.replace(".", "_") | |
| if new in existing: | |
| # Both candidates collide; leave as-is rather than risk corruption | |
| continue |
Upstream UCP uses reverse-DNS extension mount points such as "dev.ucp.shopping.checkout" in $defs. datamodel-codegen treats dots in def names as path separators, so those defs were being emitted as nested directory trees (e.g. shopping/fulfillment/dev/ucp/shopping.py) instead of inline classes in the parent schema's module. This added a normalization pass that renames such keys before codegen. Strategy: prefer the last dotted component (giving a clean class name like "Checkout"); fall back to dot-replaced-with-underscore if the bare tail would collide with an existing def in the same file (the fulfillment.json case, where the dotted key shares its tail with a sibling "fulfillment" def). Local $ref pointers to renamed defs are rewritten in the same pass so no internal references break. No upstream schemas currently reference the dotted defs by $ref; the rewrite is a safety net. Effect on regen: extension classes return to their natural import paths. For example, ucp_sdk.models.schemas.shopping.fulfillment.Checkout was being relocated to ucp_sdk.models.schemas.shopping.fulfillment.dev.ucp. shopping.Checkout under current upstream; this restores the flat path.
deea90b to
bcda715
Compare
Description
Schema names with dots (e.g.
dev.ucp.shopping.checkout) trip updatamodel-codegen. It reads the dots as path separators and emits stuff likeshopping/fulfillment/dev/ucp/shopping.py. See #28 for live examples:ap2_mandate,buyer_consent,discount, andfulfillmentall hit it.This adds a 64-line pass to
preprocess_schemas.pythat walks each schema, renames dotted$defskeys to their last segment, and patches the$refs. Runs before the existingallOfflattening. Generated paths come out clean.Heads up: a few extension classes change import paths. They only landed at the nested paths because of the bug, and nothing released uses them, but worth flagging.
Category
Checklist
Logs
pre-commit run --all-filespasses../generate_models.sh 2026-04-08with this fix produces clean paths (nodev/ucp/shopping.py).Checkoutpayload round-trips throughmodel_validate/model_dump.