feat(ontology): add PGE Evaluator stage to owl-generator #83
Open
FiifiB wants to merge 1 commit into
Turn owl-generation into a real Planner→Generator→Evaluator loop. After the pitfall-tool fix loop settles, a deterministic Stage-1 evaluator scores the ontology against source metadata and feeds concrete retry-hints back to the generator on Tier-1 structural defects (orphan classes, dangling domain/range, naming violations, duplicate classes), bounded by MAX_OWL_EVAL_ROUNDS.

- engine.py: _evaluate_ontology_stage() + loop wiring (fails open; never discards a usable ontology), MAX_OUTPUT_TOKENS=16000, exhaustive ATTRIBUTE COVERAGE prompt + get_table_detail workflow step.
- New agents/pge_eval slice: normalize.py + ontology_metrics.evaluate_ontology (gold-free, intrinsic; minimal package root to avoid coupling).
- Tests: ontology_metrics + owl_evaluator_stage (39 targeted, 565 unit green).

Co-authored-by: Isaac
Summary
Turns OWL ontology generation into a real Planner→Generator→Evaluator (PGE)
loop. Today `agent_owl_generator` does single-shot generation plus a pitfall-tool
fix loop, but has no deterministic Evaluator, so hard structural defects can
survive into the delivered ontology. This PR adds a Stage-1 evaluator that scores
the generated ontology against the source metadata and feeds concrete retry-hints
back to the generator on Tier-1 defects, bounded by a hard cap.
This is the first of two independent PGE PRs (ontology generation; entity/
relationship mapping). It is additive and self-contained.
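The control flow of the evaluator loop can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `engine.py`: `run_owl_loop`, `generate`, and `evaluate` are hypothetical names, and the numeric values given to `MAX_ITERATIONS` and `MAX_OWL_EVAL_ROUNDS` are placeholders.

```python
from typing import Callable, Optional

MAX_ITERATIONS = 5        # placeholder value, not taken from the PR
MAX_OWL_EVAL_ROUNDS = 2   # placeholder value, not taken from the PR


def run_owl_loop(
    generate: Callable[[Optional[str]], str],
    evaluate: Callable[[str], Optional[str]],
) -> str:
    """Generate an ontology, retrying on evaluator hints within both caps.

    `generate` (hypothetical) takes the previous retry-hint, or None.
    `evaluate` (hypothetical) returns a retry-hint, or None if no hard defect.
    """
    hint: Optional[str] = None
    eval_rounds = 0
    ontology = ""
    for iteration in range(MAX_ITERATIONS):
        ontology = generate(hint)
        is_last_iteration = iteration == MAX_ITERATIONS - 1
        # Only evaluate if a retry is still possible; otherwise deliver
        # the ontology we already have instead of discarding it.
        if eval_rounds >= MAX_OWL_EVAL_ROUNDS or is_last_iteration:
            break
        try:
            hint = evaluate(ontology)
        except Exception:
            hint = None  # fail open: an evaluator error never blocks delivery
        eval_rounds += 1
        if hint is None:
            break  # no Tier-1 defect reported, so deliver
    return ontology
```

With an evaluator that always reports a defect, this sketch regenerates at most `MAX_OWL_EVAL_ROUNDS` times and then returns the last ontology produced.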
What changed
- `agent_owl_generator/engine.py`
  - `_evaluate_ontology_stage()`: parses the Turtle, runs deterministic Tier-1 checks (orphan classes, dangling `rdfs:domain`/`rdfs:range`, naming violations, duplicate classes) and returns a retry-hint on hard defects. Fails open: any parse/dependency error returns `None`, so the evaluator never blocks delivery.
  - Loop wiring: bounded by `MAX_OWL_EVAL_ROUNDS`, and only retries while an iteration remains, so a usable ontology is never discarded by exhausting `MAX_ITERATIONS`.
  - `MAX_OUTPUT_TOKENS = 16000`, so exhaustive attribute coverage isn't silently truncated past the old 4096 ceiling.
  - Prompt: `# ATTRIBUTE COVERAGE` section + `get_table_detail`-per-table workflow step → exhaustive (not curated) datatype-property coverage.
- New `agents/pge_eval/` slice (gold-free, intrinsic, computed from the ontology + source schema only): `normalize.py` + `ontology_metrics.evaluate_ontology`. The package root is intentionally minimal; the full scorecard/CLI lands in a separate change, so importers depend on the concrete submodule.
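To make the four Tier-1 checks concrete, here is a small standard-library sketch of what a fail-open check of this kind might look like. It is not the implementation in this PR: `tier1_retry_hint` is a hypothetical name, the input is assumed to be already-parsed `(subject, predicate, object)` tuples with prefixed names rather than raw Turtle, and the exact definitions used here (PascalCase class names, case-insensitive duplicates, "orphan" meaning a class never used in a domain, range, or `rdfs:subClassOf`) are assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]
PASCAL_CASE = re.compile(r"^[A-Z][A-Za-z0-9]*$")  # assumed naming rule


def tier1_retry_hint(triples: Iterable[Triple]) -> Optional[str]:
    """Return a retry-hint listing hard defects, or None if there are none.

    Fails open: any error while inspecting the input also returns None.
    """
    try:
        facts = list(triples)
        declared = [s for s, p, o in facts if p == "rdf:type" and o == "owl:Class"]
        classes = set(declared)
        defects: List[str] = []

        # Duplicate classes: names that collide when compared case-insensitively.
        counts = Counter(name.lower() for name in declared)
        for key in sorted(k for k, n in counts.items() if n > 1):
            defects.append(f"duplicate class: {key}")

        # Dangling domain/range: target is neither a declared class nor an XSD type.
        referenced = set()
        for s, p, o in facts:
            if p in ("rdfs:domain", "rdfs:range"):
                referenced.add(o)
                if o not in classes and not o.startswith("xsd:"):
                    defects.append(f"dangling {p}: {s} -> {o}")
            elif p == "rdfs:subClassOf":
                referenced.update((s, o))

        # Orphan classes: declared but never referenced anywhere else.
        for name in sorted(classes - referenced):
            defects.append(f"orphan class: {name}")

        # Naming violations: local name is not PascalCase.
        for name in sorted(classes):
            if not PASCAL_CASE.match(name.split(":")[-1]):
                defects.append(f"naming violation: {name}")

        return "; ".join(defects) if defects else None
    except Exception:
        return None  # fail open: never block delivery on an evaluator error
```

Returning `None` both for "no defects" and for "could not evaluate" keeps the caller simple: it only retries when it receives an actual hint string.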
Testing
- `uv run pytest tests/units/pge_eval/test_ontology_metrics.py tests/units/pge_eval/test_owl_evaluator_stage.py tests/units/ontology/test_owl_generator.py -q` → 39 passed.
- Broader sanity: `tests/units/{agents,ontology,pge_eval}` → 565 passed, 11 skipped.

This pull request and its description were written by Isaac.