5983: DCAT-US v1.1 to v3.0 Translation Script #149

Merged: akuny merged 32 commits into main from dcat-converter on Jun 2, 2026

Conversation

@akuny self-assigned this Jun 2, 2026
@akuny requested a review from a team June 2, 2026 15:27

@jbrown-xentity left a comment

Collaborator

LGTM as a first pass. I tested with the GSA JSON and https://opendata.hawaii.gov/data.json and got the expected success and error output.

Having all the translation functions is helpful, thank you!

Comment thread jsonschema/transforms.py Outdated
Comment on lines +204 to +219
def transform_replaces(dataset: dict) -> dict:
    """Normalize the 'replaces' field to conform to the DCAT-US 3.0 schema."""
    if "replaces" not in dataset:
        return dataset

    value = dataset["replaces"]
    if isinstance(value, list):
        return dataset

    new_dataset = copy.deepcopy(dataset)
    del new_dataset["replaces"]

    if isinstance(value, str) and _is_iri(value):
        new_dataset["relation"] = [value]

    return new_dataset

Collaborator

I don't recognize the field `replaces` in the schema: https://resources.data.gov/resources/dcat-us/. Is this necessary?
I'm wondering if this is an "open schema" problem, where people added fields in 1.1 that are now defined and reserved in 3.0?
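
For reference, the behavior of the `transform_replaces` function quoted above can be exercised with a small sketch. The function body is copied from the diff; the `_is_iri` helper here is a hypothetical stand-in (the real helper in `jsonschema/transforms.py` is not shown in this thread and may be stricter), and the sample datasets are made up for illustration.

```python
import copy
from urllib.parse import urlparse


def _is_iri(value: str) -> bool:
    """Hypothetical stand-in: accept any string with a scheme and a host."""
    parsed = urlparse(value)
    return bool(parsed.scheme and parsed.netloc)


def transform_replaces(dataset: dict) -> dict:
    """Same logic as the function quoted in the diff above."""
    if "replaces" not in dataset:
        return dataset

    value = dataset["replaces"]
    if isinstance(value, list):
        # Lists are assumed to be valid already and are passed through.
        return dataset

    new_dataset = copy.deepcopy(dataset)
    del new_dataset["replaces"]

    if isinstance(value, str) and _is_iri(value):
        # A single IRI string is moved into a one-element 'relation' list.
        new_dataset["relation"] = [value]

    return new_dataset


# Made-up inputs covering the three branches.
iri_case = transform_replaces({"title": "A", "replaces": "https://example.gov/old"})
list_case = transform_replaces({"title": "B", "replaces": ["https://example.gov/old"]})
text_case = transform_replaces({"title": "C", "replaces": "an older dataset"})

print(iri_case)
print(list_case)
print(text_case)
```

Under these assumptions, a string IRI ends up under `relation`, a list value is left untouched, and a free-text string is dropped entirely, which is the case the "open schema" question is most relevant to.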

Comment thread jsonschema/transforms.py
Comment on lines +420 to +430
def _parse_bbox(value: str) -> tuple[float, float, float, float] | None:
    """Return (minLon, minLat, maxLon, maxLat) if `value` is a comma-
    separated bbox string, otherwise None."""
    parts = [p.strip() for p in value.split(",")]
    if len(parts) != 4:
        return None
    try:
        nums = tuple(float(p) for p in parts)
    except ValueError:
        return None
    return nums  # type: ignore[return-value]

Collaborator

Since this is working, we don't have to replace it; however, a more complete version is here: https://github.com/GSA/datagov-harvester/blob/main/harvester/utils/general_utils.py#L885-L955
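
As an illustration of what a stricter parser might add, here is a minimal sketch. It is not the code at the link above; the function name `parse_bbox_checked` and the specific checks (finite values, longitude and latitude ranges, min no greater than max) are assumptions chosen for the example. It also rejects boxes that cross the antimeridian, which a real harvester may need to accept.

```python
import math
from typing import Optional


def parse_bbox_checked(value: str) -> Optional[tuple[float, float, float, float]]:
    """Hypothetical stricter variant of _parse_bbox.

    Returns (minLon, minLat, maxLon, maxLat), or None if the string is not
    four finite numbers within plausible geographic bounds.
    """
    parts = [p.strip() for p in value.split(",")]
    if len(parts) != 4:
        return None
    try:
        min_lon, min_lat, max_lon, max_lat = (float(p) for p in parts)
    except ValueError:
        return None

    # float() accepts "nan" and "inf", so reject those explicitly.
    if not all(math.isfinite(n) for n in (min_lon, min_lat, max_lon, max_lat)):
        return None
    # Longitude must lie in [-180, 180] with min <= max (no antimeridian boxes).
    if not (-180.0 <= min_lon <= max_lon <= 180.0):
        return None
    # Latitude must lie in [-90, 90] with min <= max.
    if not (-90.0 <= min_lat <= max_lat <= 90.0):
        return None
    return (min_lon, min_lat, max_lon, max_lat)


print(parse_bbox_checked("-160.3, 18.9, -154.8, 22.2"))
print(parse_bbox_checked("-200, 0, 10, 10"))
```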

@jbrown-xentity

Collaborator

Oh @akuny, the fix for Snyk is here: c02bb9a. Feel free to pull it in using the same process.

@akuny merged commit c7233af into main Jun 2, 2026
4 checks passed