feat: add templates_to_graph exposing templates as RDF #53

Merged: magbak merged 6 commits into DataTreehouse:main from thenonameguy:templates-to-graph on Jun 14, 2026

thenonameguy commented May 31, 2026

Contributor

Opening PR as a form of technical discussion starter:

  • Custom flattened vocabulary instead of standard rOTTR. Chose a denormalized, one-hop-queryable vocab (prefix mtpl, base https://datatreehouse.github.io/maplib/vocab#) rather than the standard OTTR RDF list encoding — optimized for SPARQL analysis and SHACL-shape derivation, not for round-tripping back to OTTR. IMO a custom vocabulary can grow faster over time to support code-generation and validation efforts, especially as different syntaxes and maplib templating options (Facade-X) pop up.
  • New namespace https://datatreehouse.github.io/maplib/vocab#. Is this the canonical/permanent base IRI we want to publish under?
  • Graph argument defaults to the default graph when omitted, with no implicit dedicated graph name. Should we require users to provide a named graph to keep template metadata separate from instance data? My thought was to enable analysis-only/dataless use cases.
  • Blank nodes are used for parameters/instances/arguments. These get fresh per-call identifiers, so repeated templates_to_graph calls into the same graph will accumulate duplicate blank-node subgraphs. Stable identities would help here, and would also make it easier to trace lineage in any derivation code.
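
The accumulation problem in the last bullet can be illustrated without maplib. This is a minimal sketch, assuming a graph is modeled as a plain Python set of triples. The helper `emit_parameter`, the predicate names `mtpl:hasParameter` and `mtpl:index`, and the example template IRI are hypothetical and are not taken from the vocabulary added in this PR.

```python
import itertools

# Hypothetical model: a graph is a set of (subject, predicate, object) triples.
_bnode_counter = itertools.count()

def fresh_bnode() -> str:
    """Return a new blank-node label on every call (a per-call identifier)."""
    return f"_:b{next(_bnode_counter)}"

def emit_parameter(graph: set, template_iri: str, index: int, node: str) -> None:
    """Add a tiny parameter description to the graph (illustrative predicates)."""
    graph.add((template_iri, "mtpl:hasParameter", node))
    graph.add((node, "mtpl:index", index))

template = "http://example.org/tpl#Person"  # hypothetical template IRI

# Fresh blank nodes: every export adds another copy of the same description.
bnode_graph = set()
for _ in range(3):
    emit_parameter(bnode_graph, template, 0, fresh_bnode())

# Stable identifier: repeated exports yield identical triples, so the set does not grow.
stable_graph = set()
for _ in range(3):
    emit_parameter(stable_graph, template, 0, template + "/param/0")

print(len(bnode_graph), len(stable_graph))
```

With fresh labels, three exports leave three parameter subgraphs in the set. With a stable identifier, the export is idempotent.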

Main motivating test/high-level entrypoint for review:
https://github.com/DataTreehouse/maplib/pull/53/changes#diff-f1ef54932586718fa9703beb8e469f2cd0c5d054e30790ea3d6cd6deaf8e960aR128

Note, the alternative approach to this template-based codegen is using tools like:
https://pypi.org/project/shexer/

magbak commented Jun 14, 2026

Member

Thanks for putting in the effort here!

> Opening PR as a form of technical discussion starter:
>
>   • Custom flattened vocabulary instead of standard rOTTR. Chose a denormalized, one-hop-queryable vocab (prefix mtpl, base https://datatreehouse.github.io/maplib/vocab#) rather than the standard OTTR RDF list encoding; it is optimized for SPARQL analysis and SHACL-shape derivation, not for round-tripping back to OTTR. IMO a custom vocabulary can grow faster over time to support code-generation and validation efforts, especially as different syntaxes and maplib templating options (Facade-X) pop up.

Strongly support one-hop list encoding.
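
To make the trade-off concrete, here is a rough sketch that encodes the same ordered parameters twice: once as a standard RDF collection (`rdf:first` / `rdf:rest` / `rdf:nil`) and once in a flattened, one-hop form where each parameter node carries its own index. Triples are plain Python tuples. The `ex:` names and the `mtpl:parameter`, `mtpl:index`, and `mtpl:variable` predicates are hypothetical placeholders, not the terms defined in this PR.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

def encode_as_rdf_list(subject, predicate, items):
    """Encode items as an RDF collection: a chain of rdf:first / rdf:rest cells."""
    if not items:
        return [(subject, predicate, RDF_NIL)]
    cells = [f"_:cell{i}" for i in range(len(items))]
    triples = [(subject, predicate, cells[0])]
    for i, item in enumerate(items):
        triples.append((cells[i], RDF_FIRST, item))
        nxt = cells[i + 1] if i + 1 < len(items) else RDF_NIL
        triples.append((cells[i], RDF_REST, nxt))
    return triples

def encode_flat(subject, items):
    """Encode items one hop from the subject, each node carrying an explicit index."""
    triples = []
    for i, item in enumerate(items):
        node = f"_:param{i}"
        triples.append((subject, "mtpl:parameter", node))
        triples.append((node, "mtpl:index", i))
        triples.append((node, "mtpl:variable", item))
    return triples

def nth_from_list(triples, subject, predicate, n):
    """Walk n rdf:rest hops from the list head, then read rdf:first."""
    lookup = {(s, p): o for s, p, o in triples}
    cell = lookup[(subject, predicate)]
    for _ in range(n):
        cell = lookup[(cell, RDF_REST)]
    return lookup[(cell, RDF_FIRST)]

def nth_from_flat(triples, subject, n):
    """Select the subject's parameter node with the wanted index, then read its variable."""
    nodes = {o for s, p, o in triples if s == subject and p == "mtpl:parameter"}
    node = next(s for s, p, o in triples
                if s in nodes and p == "mtpl:index" and o == n)
    return next(o for s, p, o in triples if s == node and p == "mtpl:variable")

params = ["?name", "?age", "?email"]
list_triples = encode_as_rdf_list("ex:Person", "ex:parameters", params)
flat_triples = encode_flat("ex:Person", params)

third_via_list = nth_from_list(list_triples, "ex:Person", "ex:parameters", 2)
third_via_flat = nth_from_flat(flat_triples, "ex:Person", 2)
```

In the list encoding, a parameter's position is implicit in the chain and has to be recovered by traversal. In the flattened form, the position is an ordinary value that a fixed-shape pattern can match directly.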

>   • New namespace https://datatreehouse.github.io/maplib/vocab#. Is this the canonical/permanent base IRI we want to publish under?

We have since introduced a new internal prefix, maplib, but I think I favor this one because its IRIs can be dereferenced. We still have to set that up, though :-) I will merge this and then sort this out.

>   • Graph argument defaults to the default graph when omitted, with no implicit dedicated graph name. Should we demand users to provide a named graph to keep template metadata separate from instance data? My thought was to enable analysis-only/dataless use-cases.

Across maplib we have graph=None meaning the default graph, so that would be consistent here also.
(There is the report_graph in validate() that does not yet work this way, but that is being sorted out).
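
As a sketch of the convention described here, the toy store below treats `graph=None` as the default graph. `ToyStore` and the `mtpl:Template` class name are hypothetical and are not maplib's API. The point is only to show what omitting the argument implies for keeping template metadata apart from instance data.

```python
from typing import Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]

class ToyStore:
    """Hypothetical in-memory store; the key None stands for the default graph."""

    def __init__(self) -> None:
        self.graphs: Dict[Optional[str], Set[Triple]] = {}

    def add(self, triple: Triple, graph: Optional[str] = None) -> None:
        # graph=None routes the triple to the default graph.
        self.graphs.setdefault(graph, set()).add(triple)

    def triples(self, graph: Optional[str] = None) -> Set[Triple]:
        return self.graphs.get(graph, set())

store = ToyStore()
instance = ("ex:alice", "rdf:type", "ex:Person")
metadata = ("ex:PersonTemplate", "rdf:type", "mtpl:Template")

store.add(instance)                               # instance data in the default graph
store.add(metadata)                               # graph omitted: metadata lands next to it
store.add(metadata, graph="urn:graph:templates")  # explicit named graph keeps it separate
```

In a store that holds no instance data, the default graph contains only template metadata, which matches the analysis-only use case raised in the question.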

>   • Blank nodes are used for parameters/instances/arguments. These get fresh per-call identifiers, so repeated templates_to_graph calls into the same graph will accumulate duplicate blank-node subgraphs. Stable identities would help here, also for tracing lineage for any derivation code.

I suggest stable IRIs using UUIDv5; I guess the template IRI plus some kind of unique path inside the template will be sufficient as input. I think we can merge this for now and put this on the to-do list, as it is not crucial.
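
One way this suggestion could look, as a minimal sketch using Python's standard `uuid` module. The function name `stable_node_iri`, the `urn:uuid:` output form, the path syntax (`parameter/0`, `instance/1/argument/2`), and the choice of `uuid.NAMESPACE_URL` as the UUIDv5 namespace are all assumptions made for illustration; none of them are decided in this PR.

```python
import uuid

def stable_node_iri(template_iri: str, path: str) -> str:
    """Derive a deterministic IRI from a template IRI and a path inside the template.

    A space separates the two parts because IRIs cannot contain spaces,
    so distinct (template_iri, path) pairs cannot produce the same name.
    """
    name = f"{template_iri} {path}"
    return "urn:uuid:" + str(uuid.uuid5(uuid.NAMESPACE_URL, name))

template = "http://example.org/tpl#Person"  # hypothetical template IRI

first = stable_node_iri(template, "parameter/0")
again = stable_node_iri(template, "parameter/0")
other = stable_node_iri(template, "instance/1/argument/2")
```

Because UUIDv5 is a hash of the namespace and the name, repeated exports of the same template produce the same node IRIs, and derived artifacts can refer back to them.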

> Main motivating test/high-level entrypoint for review: https://github.com/DataTreehouse/maplib/pull/53/changes#diff-f1ef54932586718fa9703beb8e469f2cd0c5d054e30790ea3d6cd6deaf8e960aR128
>
> Note, the alternative approach to this template-based codegen is using tools like: https://pypi.org/project/shexer/

magbak merged commit 4acf515 into DataTreehouse:main on Jun 14, 2026