feat(agents): Agent Bricks Supervisor for PGE/simple engine selection #85

Open
FiifiB wants to merge 5 commits into databrickslabs:master from FiifiB:feat/agents-supervisor

@FiifiB FiifiB commented Jun 25, 2026


Summary

Adds agent_supervisor, a Databricks Agent Bricks Multi-Agent Supervisor (MAS)
that orchestrates OntoBricks mapping. Per domain, it deterministically
assesses complexity and routes entity/relationship mapping to either the
heavyweight PGE engine (agent_mapping_pge) or the original simple single-agent
engine (agent_auto_assignment).

Stacked PR. This is stacked on:

Until those merge, this branch's diff includes their commits. The net-new
code in this PR is only src/agents/agent_supervisor/**,
scripts/provision_supervisor.py, the SPEC + eval dataset, and its tests.

Please review/merge after #83 and #84.

Routing design (hybrid: deterministic + NL)

The MAS routes semantically, but "is this domain complex enough for the expensive
engine" should not be a vibe. So:

  1. A deterministic UC function assess_domain_complexity (uc_function.sql)
    scores the domain (tables, columns, classes, relationships, cross-source
    key-sharing, schema-naming heterogeneity) and returns recommended_engine.
  2. The supervisor's natural-language instructions tell it to call the
    assessor first and route to the matching engine endpoint — never overriding it.

This keeps the hard decision auditable while staying within Agent Bricks MAS.
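
The scoring step described above might look roughly like the following in Python. This is a minimal sketch, not the code in complexity.py: the weights, the PGE_THRESHOLD value, the DomainStats fields, and the "pge"/"simple" labels are assumptions made for illustration. Only the name assess_domain_complexity, the recommended_engine field, and the list of signals come from this description.

```python
from dataclasses import dataclass

# Hypothetical weights and threshold, chosen for illustration only.
WEIGHTS = {
    "tables": 1.0,
    "columns": 0.05,
    "classes": 1.0,
    "relationships": 1.5,
    "shared_keys": 2.0,
    "naming_styles": 3.0,
}
PGE_THRESHOLD = 20.0


@dataclass(frozen=True)
class DomainStats:
    """Counts describing one domain (field names are assumptions)."""
    tables: int
    columns: int
    classes: int
    relationships: int
    shared_keys: int    # keys shared across sources
    naming_styles: int  # distinct schema-naming conventions


def assess_domain_complexity(stats: DomainStats) -> dict:
    """Score a domain with a fixed weighted sum and recommend an engine."""
    score = sum(weight * getattr(stats, name) for name, weight in WEIGHTS.items())
    engine = "pge" if score >= PGE_THRESHOLD else "simple"
    return {"score": round(score, 2), "recommended_engine": engine}
```

Because the function has no model call and no hidden state, the same domain statistics always yield the same recommendation, which is what makes the routing decision auditable.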

What changed (net-new)

  • agents/agent_supervisor/complexity.py: ComplexityAssessor, a weighted
    deterministic scorer (reuses pge_eval.normalize for parsing).
  • engine.py: SupervisorEngine runs assess → select → dispatch via
    AgentClient. Mapping is the genuine two-engine choice; ontology uses the
    single owl-generator.
  • responses_agent.py: MappingEngineResponsesAgent, one Model Serving
    endpoint per engine (assess/run modes).
  • mas.py: SupervisorProvisioner.build_config (pure) + provision; wires
    the UC function + the two engine endpoints with NL routing instructions.
  • uc_function.sql — self-contained mirror of complexity.py; constants
    guarded by test_uc_function_parity.
  • log_model.py + scripts/provision_supervisor.py — log/deploy + MAS
    provisioning.
  • .planning/agents/agent_supervisor/SPEC.md + 20-example eval dataset.
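
As an illustration of the assess → select → dispatch flow in engine.py, here is a minimal sketch. The class name SupervisorSketch, the callable-based runners, and the shape of the returned dict are hypothetical. The real SupervisorEngine dispatches through AgentClient, whose signatures are not shown in this description.

```python
from typing import Callable, Dict

# Stand-in type for the two gateways named in this PR
# (AgentClient.run_mapping_pge and AgentClient.run_auto_assignment).
Runner = Callable[[str], dict]


class SupervisorSketch:
    """Hypothetical supervisor: assess a domain, pick an engine, run it."""

    def __init__(self, assess: Callable[[str], dict], runners: Dict[str, Runner]):
        self._assess = assess
        self._runners = runners

    def run_mapping(self, domain: str) -> dict:
        assessment = self._assess(domain)           # assess
        engine = assessment["recommended_engine"]   # select
        if engine not in self._runners:
            raise ValueError(f"unknown engine: {engine!r}")
        result = self._runners[engine](domain)      # dispatch
        return {"engine": engine, "assessment": assessment, "result": result}
```

The supervisor never second-guesses the assessment: an unknown engine name raises instead of silently falling back, which mirrors the "never overriding it" rule in the routing design.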

Testing

  • uv run pytest tests/agents/agent_supervisor -q: 35 passed (baseline
    routing accuracy 20/20, Python↔SQL constant parity).
  • Stacked-branch regression (tests/agents tests/units/{agents,mapping,pge_eval,ontology}):
    759 passed, 11 skipped.
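
A Python↔SQL constant parity check could be structured along these lines. This is a sketch under stated assumptions, not the actual test_uc_function_parity: the constant names, their values, and the convention of declaring constants as `-- NAME = value` comments in the SQL file are all hypothetical, since uc_function.sql is not shown in this description.

```python
import re

# Hypothetical Python-side constants; the real ones live in complexity.py.
PY_CONSTANTS = {"W_TABLES": 1.0, "W_RELATIONSHIPS": 1.5, "PGE_THRESHOLD": 20.0}

# Hypothetical excerpt standing in for uc_function.sql.
SQL_TEXT = """
-- W_TABLES = 1.0
-- W_RELATIONSHIPS = 1.5
-- PGE_THRESHOLD = 20.0
"""

# Matches lines of the assumed form "-- NAME = 1.5".
_CONST_RE = re.compile(r"^--\s*([A-Z_]+)\s*=\s*([0-9.]+)\s*$", re.MULTILINE)


def sql_constants(sql: str) -> dict:
    """Extract NAME -> float pairs from the SQL text."""
    return {name: float(value) for name, value in _CONST_RE.findall(sql)}


def constant_drift(py: dict, sql: dict) -> dict:
    """Return {name: (python_value, sql_value)} for each mismatch or missing name."""
    names = set(py) | set(sql)
    return {n: (py.get(n), sql.get(n)) for n in names if py.get(n) != sql.get(n)}
```

A test would then assert that `constant_drift(PY_CONSTANTS, sql_constants(SQL_TEXT))` is empty, so any edit to one side without the other fails the test run.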

This pull request and its description were written by Isaac.

Fiifi Botchway added 4 commits June 25, 2026 12:20
Turn owl-generation into a real Planner→Generator→Evaluator loop. After the
pitfall-tool fix loop settles, a deterministic Stage-1 evaluator scores the
ontology against source metadata and feeds concrete retry-hints back to the
generator on Tier-1 structural defects (orphan classes, dangling domain/range,
naming violations, duplicate classes), bounded by MAX_OWL_EVAL_ROUNDS.

- engine.py: _evaluate_ontology_stage() + loop wiring (fails open; never
  discards a usable ontology), MAX_OUTPUT_TOKENS=16000, exhaustive
  ATTRIBUTE COVERAGE prompt + get_table_detail workflow step.
- New agents/pge_eval slice: normalize.py + ontology_metrics.evaluate_ontology
  (gold-free, intrinsic; minimal package root to avoid coupling).
- Tests: ontology_metrics + owl_evaluator_stage (39 targeted, 565 unit green).

Co-authored-by: Isaac

Introduce agent_mapping_pge, a Planner→Generator→Evaluator mapping engine,
additively. Plans a source-model, generates entity + relationship SQL per
ontology item, and gates each with a deterministic evaluator + a semantic
critic. Coverage is engine-enforced from the ontology (abstract-superclass
UNION derivation + synthetic-endpoint fallback so one failed hub can't drop
all relationships).

Additive: agent_auto_assignment is retained and still reachable via
AgentClient.run_auto_assignment; a new AgentClient.run_mapping_pge gateway
exposes the PGE engine, so an orchestrator can choose between them. Upstream
features preserved (ToolContext.warehouse_id, Mapping._canonicalize_imported_uris).

- NEW agent_mapping_pge package + tools/{planner,evaluation}.py
- context.py: +source_model/+semantic_eval_report fields (warehouse_id kept)
- Mapping.py: run PGE + accumulate source_model/evaluations/run_log;
  save_mappings_to_session gains 3 optional params (legacy path unaffected)
- Tests: 90 in tests/agents/agent_mapping_pge; 208 across units/{agents,mapping}

Co-authored-by: Isaac

# Conflicts:
#	changelogs/v0.5.2/FiifiB_2026-06-25.log

Add agent_supervisor, a Databricks Agent Bricks Multi-Agent Supervisor that
deterministically assesses a domain's complexity and routes entity/relationship
mapping to the PGE engine (agent_mapping_pge) or the original simple engine
(agent_auto_assignment). Hybrid routing: a deterministic UC function
(assess_domain_complexity) yields the hard recommendation; the supervisor's NL
instructions act on it.

- complexity.py: weighted deterministic scorer (tables/columns/classes/rels +
  cross-source key-sharing + schema heterogeneity); reuses pge_eval.normalize.
- engine.py: SupervisorEngine assess -> select -> dispatch via AgentClient.
- responses_agent.py: per-engine MLflow ResponsesAgent (assess/run modes).
- mas.py: SupervisorProvisioner.build_config (pure) + provision; wires the UC
  function + two engine endpoints with NL routing instructions.
- uc_function.sql: self-contained mirror of complexity.py (parity-tested).
- SPEC.md + 20-example eval dataset + scripts/provision_supervisor.py.
- Tests: 35 (baseline routing 20/20, Python<->SQL parity); 759 stacked green.

Stacked on the ontology-PGE and mapping-PGE PRs.

Co-authored-by: Isaac
@FiifiB FiifiB requested a review from a team as a code owner June 25, 2026 11:46
@CLAassistant


CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Fiifi Botchway seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Behavior-preserving post-review pass on the new supervisor code:
- engine.py: Remove Middle Man — drop _run_mapping(**kw) pack/unpack
  indirection; select run_mapping_pge vs run_auto_assignment inline.
- responses_agent.py: compute assess() only in the branch that uses it;
  Optional[dict] type hint.

Tests: tests/agents/agent_supervisor 35 passed (unchanged).

Co-authored-by: Isaac

2 participants