Schema Conversion Orchestrator is a Flask-based schema conversion service. It builds a conversion graph across multiple schema languages and can route conversions through built-in Python converters, external Java and Node.js converter packages, and standalone executable tools such as ROBOT.
- Dynamic conversion graph and multi-hop path discovery
- Python, Java, Node.js, and standalone executable converter integrations
- HTTP API for schema conversion
- Evaluation plots for conversion coverage, robustness, and graph structure
- Unit tests that run without heavy external converter dependencies
src/schema_conversion_orchestrator/ Python package
external_converters/ Java, Node.js, and standalone executable assets
deploy/docker/ Dockerfile and compose files
requirements/ Python runtime and test requirements
scripts/ Build, run, test, and utility scripts
tests/ Pytest suite and fixtures
eval/ Evaluation schemas and generated eval outputs
Install Python runtime dependencies:
pip install -r requirements/runtime.txt
Install lightweight test dependencies:
pip install -r requirements/dev.txt
Build all external converter subpackages:
scripts/build_subpackages.sh
Expected build outputs:
external_converters/java/converter.jar
external_converters/node/dist/index.js
scripts/run.sh
The service listens on http://localhost:5002.
Health check:
curl http://localhost:5002/health
The service exposes a single conversion endpoint.
POST /convert
Request body:
{
  "sourceLanguage": "SHACL_TTL",
  "targetLanguage": "JsonSchema",
  "schema": "<schema as a string>",
  "useCache": true
}
- sourceLanguage / targetLanguage: schema language enum values (see Supported Formats).
- schema: the source schema, passed as a string (a JSON object is also accepted and serialized automatically).
- useCache (optional, default true): reuse cached results of shared intermediate sub-paths across the attempted paths.
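The request body described above can be assembled in Python with only the standard library. This is a minimal sketch, not the project's own client (that is scripts/send_test_request.py): the helper name build_convert_request and the sample SHACL snippet are illustrative, and the URL assumes the default local address.

```python
import json
import urllib.request

# Assumed address of a locally running service (see the run instructions).
CONVERT_URL = "http://localhost:5002/convert"

# Tiny illustrative SHACL shape; any Turtle string could be used here.
SAMPLE_SHACL = (
    "@prefix sh: <http://www.w3.org/ns/shacl#> .\n"
    "@prefix ex: <http://example.org/> .\n"
    "ex:PersonShape a sh:NodeShape .\n"
)


def build_convert_request(source_language, target_language, schema, use_cache=True):
    """Build (but do not send) a POST /convert request. Hypothetical helper."""
    payload = {
        "sourceLanguage": source_language,
        "targetLanguage": target_language,
        "schema": schema,
        "useCache": use_cache,
    }
    return urllib.request.Request(
        CONVERT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_convert_request("SHACL_TTL", "JsonSchema", SAMPLE_SHACL)

# Sending the request needs a running service:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.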
The orchestrator discovers every feasible conversion path from the source to the target language, executes them, ranks the results, and returns all attempts (successful and failed):
{
  "results": [
    {
      "success": true,
      "result": "<converted schema or error message>",
      "failedStepIndex": null,
      "conversionPath": [
        {
          "sourceLanguage": "SHACL_TTL",
          "targetLanguage": "JsonSchema",
          "serviceName": "node",
          "converterName": "shacl-bridge SHACL->JSON Schema",
          "library": "shacl-bridge",
          "libraryVersion": "x.y.z",
          "libraryUrl": "https://www.npmjs.com/package/shacl-bridge"
        }
      ]
    }
  ]
}
For a failed attempt, success is false, result carries the error message, and failedStepIndex identifies the converter step that failed. Each step reports the underlying library, its version, and a URL, so the exact provenance of every result is traceable.
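A client might read such a response as sketched below. The function names and the sample data are hypothetical. The sketch also assumes that failedStepIndex is a zero-based index into conversionPath, which the description above does not state explicitly.

```python
def summarize_attempts(response_body):
    """Split a /convert response into the top-ranked success and the failures."""
    attempts = response_body.get("results", [])
    # Attempts arrive ranked, so the first successful one is the preferred result.
    best = next((a for a in attempts if a["success"]), None)
    failures = []
    for attempt in attempts:
        if attempt["success"]:
            continue
        path = attempt["conversionPath"]
        index = attempt["failedStepIndex"]
        # Assumption: failedStepIndex is a zero-based index into conversionPath.
        in_range = index is not None and 0 <= index < len(path)
        failed_converter = path[index]["converterName"] if in_range else None
        failures.append((failed_converter, attempt["result"]))
    return best, failures


def describe_provenance(attempt):
    """Render the libraries used along one conversion path."""
    return " -> ".join(
        f"{step['library']} {step['libraryVersion']}"
        for step in attempt["conversionPath"]
    )


# Illustrative response; the failing two-step path is made up for this sketch.
sample_response = {
    "results": [
        {
            "success": True,
            "result": "<converted schema>",
            "failedStepIndex": None,
            "conversionPath": [
                {"converterName": "shacl-bridge SHACL->JSON Schema",
                 "library": "shacl-bridge", "libraryVersion": "x.y.z"},
            ],
        },
        {
            "success": False,
            "result": "hypothetical error message",
            "failedStepIndex": 1,
            "conversionPath": [
                {"converterName": "hypothetical-step-a",
                 "library": "lib-a", "libraryVersion": "x.y.z"},
                {"converterName": "hypothetical-step-b",
                 "library": "lib-b", "libraryVersion": "x.y.z"},
            ],
        },
    ]
}

best, failures = summarize_attempts(sample_response)
```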
A ready-made example request is in scripts/send_test_request.py.
When several paths connect the source and target language, the returned attempts are ranked so the most faithful result surfaces first. Failed attempts always sort below successful ones. Successful attempts are ordered by a single fallback chain, where each criterion only breaks the ties left by the previous one:
- Benchmark accuracy, where offline accuracy scores exist for the requested source/target pair (currently the SHACL <-> JSON Schema conversions).
- Empirical edge quality: the product of the path's per-edge quality scores. Unevaluated edges default to 0.5, so this is always defined and favors shorter paths when nothing is measured.
- Shorter path: fewer converter edges.
- Larger output: assumes dropped constraints tend to shorten a schema.
The accuracy and edge-quality scores come from the offline evaluations described in eval/README.md.
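The fallback chain above maps naturally onto a tuple sort key, where each element only matters when all earlier elements tie. The following is a sketch of that idea, not the service's actual ranking code. The Attempt class and its fields are hypothetical, a missing benchmark accuracy is treated as 0.0, and failed attempts are ordered among themselves by the same criteria, both of which are assumptions.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Optional

DEFAULT_EDGE_QUALITY = 0.5  # unevaluated edges default to 0.5


@dataclass
class Attempt:  # hypothetical container for one attempted path
    name: str
    success: bool
    output: str
    edge_qualities: List[Optional[float]]  # one entry per converter edge
    benchmark_accuracy: Optional[float] = None


def ranking_key(attempt):
    """Smaller keys rank first; later elements only break earlier ties."""
    quality = prod(
        DEFAULT_EDGE_QUALITY if q is None else q for q in attempt.edge_qualities
    )
    # Assumption: an attempt without a benchmark score counts as 0.0.
    accuracy = attempt.benchmark_accuracy or 0.0
    return (
        not attempt.success,          # successes before failures
        -accuracy,                    # higher benchmark accuracy first
        -quality,                     # higher edge-quality product first
        len(attempt.edge_qualities),  # fewer converter edges first
        -len(attempt.output),         # larger output first
    )


attempts = [
    Attempt("two-hop", True, "aaaa", [None, None]),
    Attempt("short-output", True, "aa", [None]),
    Attempt("failed", False, "error", [0.9]),
    Attempt("long-output", True, "aaaaaa", [None]),
    Attempt("benchmarked", True, "a", [None, None], benchmark_accuracy=0.8),
]
ranked = [a.name for a in sorted(attempts, key=ranking_key)]
```

With every edge unevaluated, the two-hop path scores 0.5 * 0.5 = 0.25 against 0.5 for a single hop, which is how the default favors shorter paths when nothing is measured.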
Run unit tests:
python3 -m pytest
Run a manual conversion request against a running local service:
python3 scripts/send_test_request.py
Run the local Docker integration check:
scripts/run_local_docker_test.sh --down
Build and run with compose:
docker compose -f deploy/docker/docker-compose.yml up -d --build
Standalone HTTPS deployment is defined in:
deploy/docker/docker-compose.https.yml
The Docker image uses the repo root as build context and
deploy/docker/Dockerfile as the Dockerfile.
PYTHONPATH=src venv/bin/python eval/plot_orchestrator_evaluation.py
Generated files are written to:
eval/results/orchestrator_outputs/plots/
Currently modeled schema language enum values include:
- JsonSchema
- LinkMl
- MdModels
- Dtd
- Xsd
- SHACL_TTL
- SHACL_JSON_LD
- Owl_TTL
- Owl_XML
- Owl_OFN
- OWL_OBO
- OntologyRdf
- GraphQL
- Protobuf
- Shex
- Mermaid
- SqlAlchemy
The conversion graph is built from converter registrations at service startup, so a new converter participates in path finding, execution, and ranking as soon as it is registered. No other part of the service needs to change. There are two integration routes: internal Python converters and external sub-process converters (Node.js, Java, or any standalone executable).
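To illustrate why registering a converter is enough for it to take part in path finding, here is a minimal sketch of a graph built from registrations plus a depth-first enumeration of simple paths. It is not the service's implementation: the function names, the max_hops limit, and every converter name marked hypothetical are assumptions made for the example.

```python
from collections import defaultdict


def build_graph(registrations):
    """Map each source language to its outgoing (target, converter) edges."""
    graph = defaultdict(list)
    for source, target, converter_name in registrations:
        graph[source].append((target, converter_name))
    return graph


def find_paths(graph, source, target, max_hops=4):
    """Enumerate cycle-free paths from source to target (max_hops is assumed)."""
    paths = []

    def walk(node, visited, path):
        if node == target and path:
            paths.append(list(path))
            return
        if len(path) >= max_hops:
            return
        for next_node, converter_name in graph.get(node, []):
            if next_node in visited:
                continue  # never revisit a language within one path
            path.append((node, next_node, converter_name))
            walk(next_node, visited | {next_node}, path)
            path.pop()

    walk(source, {source}, [])
    return paths


registrations = [
    ("SHACL_TTL", "JsonSchema", "shacl-bridge SHACL->JSON Schema"),
    ("SHACL_TTL", "LinkMl", "hypothetical-a"),
    ("LinkMl", "JsonSchema", "hypothetical-b"),
    ("JsonSchema", "LinkMl", "hypothetical-c"),
]
paths_before = find_paths(build_graph(registrations), "SHACL_TTL", "JsonSchema")

# Registering two more (hypothetical) converters adds a route with no other change.
registrations += [
    ("SHACL_TTL", "Xsd", "hypothetical-d"),
    ("Xsd", "JsonSchema", "hypothetical-e"),
]
paths_after = find_paths(build_graph(registrations), "SHACL_TTL", "JsonSchema")
```

Because the graph is rebuilt from the registration list alone, the extra route shows up in paths_after without touching find_paths.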
- Add the dependency. Add the underlying library to requirements/runtime.txt.
- Implement the converter. Create a module under src/schema_conversion_orchestrator/converters/python/ with a subclass of ConverterInternal (from converters/base.py). Declare the source and target language and the library metadata in __init__, and implement converter_logic (plus validate_input / validate_output, which may simply return True):

      from schema_conversion_orchestrator.converters.base import ConverterInternal, get_package_version
      from schema_conversion_orchestrator.domain.schema_types import SchemaLanguage


      class ConverterMyLibrary(ConverterInternal):
          def __init__(self) -> None:
              super().__init__(
                  name="my-library",
                  service_address="internal",
                  service_name="FlaskApp",
                  source_language=SchemaLanguage.JsonSchema,
                  target_language=SchemaLanguage.SHACL_TTL,
                  library="my-library",
                  library_version=get_package_version("my-library"),
                  library_url="https://github.com/example/my-library",
              )

          def converter_logic(self, schema: str) -> str:
              ...  # call the library, return the converted schema as a string

          def validate_input(self, schema: str) -> bool:
              return True

          def validate_output(self, schema: str) -> bool:
              return True

  The library, library_version, and library_url values are what the API reports as per-step provenance, so fill them in accurately. get_package_version reads the installed package version, keeping the reported version in sync with the environment.
- Register it. Add an instance to the list returned by register_python_converters() in src/schema_conversion_orchestrator/converters/python_registry.py.
Node converters are auto-discovered: at startup the Python service runs node external_converters/node/dist/index.js list and registers every converter the bundle reports, so no Python code changes are needed.
- Add the dependency. Add the npm package to external_converters/node/package.json and run npm install in that directory.
- Implement the converter. Create external_converters/node/src/converters/<name>.ts exporting a Converter object (see dataStructures.ts for the interface):

      import {Converter, SchemaLanguage} from "../dataStructures.js";

      export const converter: Converter = {
        name: "my-converter",
        sourceLanguage: SchemaLanguage.Xsd,
        targetLanguage: SchemaLanguage.JsonSchema,
        library: "my-npm-package", // resolvable package name; its version is read from package.json
        libraryUrl: "https://www.npmjs.com/package/my-npm-package",
        async convert(schema: string): Promise<string> {
          ... // call the library, return the converted schema as a string
        }
      };

      export default converter;

  Every file in src/converters/ is loaded automatically; there is no separate Node-side registry.
- Build. Run npm run build in external_converters/node/.
Java converters live in external_converters/java/ and are discovered the same way (the service runs java -jar converter.jar list at startup). Standalone executables that cannot report their own converters, such as ROBOT, are instead registered explicitly as ConverterExternalGeneric instances in src/schema_conversion_orchestrator/converters/external_registry.py, which specifies the command to run, the source and target languages, file suffixes, and the library metadata.
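The file-based pattern that such explicitly registered executables follow can be sketched as below: write the schema to a temporary file with the input suffix, run the command, and read the output file back. This does not reproduce ConverterExternalGeneric. The function name, the {input} / {output} placeholder convention, and the timeout are assumptions, and a Python one-liner that copies a file stands in for a real tool.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_file_based_converter(command, schema, input_suffix, output_suffix, timeout=60):
    """Run an executable that reads an input file and writes an output file.

    Hypothetical helper: `command` is an argument list in which "{input}" and
    "{output}" are replaced by temporary file paths.
    """
    with tempfile.TemporaryDirectory() as workdir:
        input_path = Path(workdir) / f"input{input_suffix}"
        output_path = Path(workdir) / f"output{output_suffix}"
        input_path.write_text(schema, encoding="utf-8")
        argv = [part.format(input=input_path, output=output_path) for part in command]
        completed = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        if completed.returncode != 0:
            # Surface the tool's own error text to the caller.
            raise RuntimeError(completed.stderr.strip() or "converter failed")
        return output_path.read_text(encoding="utf-8")


# Stand-in "tool": copies the input file to the output file unchanged.
copy_command = [
    sys.executable,
    "-c",
    "import shutil, sys; shutil.copyfile(sys.argv[1], sys.argv[2])",
    "{input}",
    "{output}",
]
converted = run_file_based_converter(copy_command, "example schema text", ".owl", ".ofn")
```

Passing the command as an argument list rather than a shell string avoids quoting problems with temporary paths.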
Add a value to the SchemaLanguage enum in src/schema_conversion_orchestrator/domain/schema_types.py (and, if Node converters use it, to the SchemaLanguage enum in external_converters/node/src/dataStructures.ts). The language appears as a node in the conversion graph as soon as a registered converter consumes or produces it.
Start the service (scripts/run.sh) and check the startup log: every registered converter is printed there, and a POST /convert request for the new language pair exercises the new edge. Add a test in tests/ (see tests/test_logic.py for converter-level examples). Optionally, add benchmark inputs and ground truths under eval/benchmarks/ so the new conversion participates in accuracy-based ranking (see eval/README.md).
This project is licensed under the terms in the LICENSE file.
Third-party bundled artifact notices are documented in THIRD_PARTY_NOTICES.md.