databrickslabs · FiifiB · Jun 25, 2026 · Jun 25, 2026
@@ -0,0 +1,84 @@
+# SPEC: agent_mapping_pge
+
+> Required by `.cursor/12-ai-feature-lifecycle.mdc`.
+
+## 1. Purpose
+
+`agent_mapping_pge` generates entity and relationship SQL mappings for a domain
+via a Planner→Generator→Evaluator (PGE) loop. Given source metadata and an ontology,
+it plans a source model, generates SQL per ontology item, and gates each mapping
+with a deterministic evaluator plus an independent semantic critic. It replaces
+the single-agent `agent_auto_assignment` mapping flow with separation of creator
+and critic, and enforces coverage from the ontology rather than LLM discretion.
+
+## 2. Identity
+
+| Field | Value |
+|---|---|
+| `agent_name` | `agent_mapping_pge` |
+| `module_path` | `src/agents/agent_mapping_pge/` |
+| `model_endpoint` | _configured per workspace_ |
+| `temperature` | `0.0`–`0.2` |
+| `mlflow_experiment` | `/Shared/ontobricks/agents/mapping_pge` |
+
+## 3. Tool surface
+
+| Tool name | Input | Output | Purpose |
+|---|---|---|---|
+| `submit_source_model` | planner source-model | `SourceModel` | Terminal planner tool |
+| `submit_entity_mapping` | entity SQL + id expr | mapping dict | Record an entity mapping |
+| `submit_relationship_mapping` | rel SQL + endpoints | mapping dict | Record a relationship mapping |
+| `normalized_value_overlap` | two columns | overlap ratio | Verify join-key overlap |
+| `submit_evaluation` | critic verdict | `EvalReport` | Terminal critic tool |
+
+## 4. Success criteria
+
+1. Every mappable ontology class/relationship is covered (engine-enforced, not
+   LLM-discretionary).
+2. Relationship endpoints reproduce the entity's canonical id expression →
+   0% dangling on a valid domain.
+3. A failed hub entity does not cascade to drop all its relationships (synthetic
+   endpoint fallback).
+
+## 5. Eval dimensions
+
+| Dimension | Metric | Threshold | Weight | Judge |
+|---|---|---|---|---|
+| `entity_coverage` | mapped entities / mappable classes | `1.00` | `0.25` | rule-based (`coverage.py`) |
+| `relationship_coverage` | mapped rels / ontology object-properties | `1.00` | `0.20` | rule-based |
+| `dangling_rate` | proportion of relationship edges whose endpoints resolve, i.e. are not dangling (`1.00` = 0% dangling) | `1.00` | `0.25` | rule-based (deterministic evaluator) |
+| `sql_executes` | proportion of generated SQL statements that parse and run | `0.98` | `0.15` | rule-based |
+| `semantic_correctness` | critic agreement that the mapping matches intent | `0.85` | `0.15` | LLM critic (`evaluator/critic.py`) |
+
+**Aggregate threshold:** ≥ `0.90`.
+
+## 6. Failure modes
+
+| Symptom | Detection | Mitigation |
+|---|---|---|
+| Class silently skipped | `entity_coverage` < 1.0 | coverage is computed from the ontology; `skip[]` is advisory and never removes an item |
+| Relationship dangles | `dangling_rate` < 1.0 | relationship generator reproduces the endpoint's canonical id expression |
+| One failed hub drops all rels | rel coverage collapse | synthetic-endpoint fallback from `canonical_ids` |
+| Abstract superclass unmapped | missing union | abstract classes derived as UNION-ALL of concrete subclass SQL |
+
+## 7. Eval dataset
+
+- **Baseline:** `tests/eval/datasets/agent_mapping_pge/baseline.jsonl` — ≥ 20 examples
+  spanning single-source, multi-source cross-trust, and degenerate inputs.
+- **Regression:** added on first production mis-mapping.
+
+## 8. MLflow tracing
+
+The engine traces planner / generator / evaluator / critic stages; per-item
+`mapping_evaluations` + `mapping_run_log` are surfaced on the result.
+
+## 9. Plan reference
+
+PGE design notes tracked in session memory; loop pattern per Anthropic's
+harness-design (planner/generator/evaluator separation).
+
+## 10. Sign-off
+
+- [x] Sections 4, 5, 6, 7 filled.
+- [ ] Baseline eval run URI pasted into PR body.
+- [x] Aggregate threshold declared in §5.
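
The §5 table reads as a weighted scorecard. The sketch below is one possible reading, under two assumptions the spec does not confirm: that the aggregate is the weighted sum of per-dimension scores in [0, 1], and that each dimension must also clear its own threshold. The names `DIMENSIONS` and `gate` are hypothetical and are not part of `agent_mapping_pge`.

```python
from math import isclose

# (threshold, weight) per dimension, copied from the §5 table.
DIMENSIONS = {
    "entity_coverage": (1.00, 0.25),
    "relationship_coverage": (1.00, 0.20),
    "dangling_rate": (1.00, 0.25),
    "sql_executes": (0.98, 0.15),
    "semantic_correctness": (0.85, 0.15),
}
AGGREGATE_THRESHOLD = 0.90

# The weights in the table sum to 1, so the aggregate stays in [0, 1].
assert isclose(sum(weight for _, weight in DIMENSIONS.values()), 1.0)


def gate(scores):
    """Return (aggregate, failed_dimensions, passed) for a dict of scores.

    Assumption: a run passes only if the weighted aggregate meets the
    aggregate threshold AND no dimension falls below its own threshold.
    """
    aggregate = sum(DIMENSIONS[name][1] * scores[name] for name in DIMENSIONS)
    failed = sorted(
        name
        for name, (threshold, _) in DIMENSIONS.items()
        if scores[name] < threshold
    )
    passed = aggregate >= AGGREGATE_THRESHOLD and not failed
    return aggregate, failed, passed
```

Under this reading, a run with `entity_coverage` at 0.9 and every other dimension at 1.0 scores 0.975 in aggregate but still fails, because the per-dimension threshold of `1.00` is not met.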
@@ -0,0 +1,57 @@
+# 2026-06-25 — feat(mapping): PGE loop for entity/relationship mapping
+
+## Context
+
+Entity/relationship mapping previously ran through `agent_auto_assignment` —
+a single-agent "implementer marks its own homework" loop with no planning or
+independent evaluation. This change introduces `agent_mapping_pge`, a
+Planner→Generator→Evaluator (PGE) mapping engine, **additively**: the original
+`agent_auto_assignment` engine is retained and still reachable via
+`AgentClient.run_auto_assignment`, so a downstream orchestrator can choose which
+engine to run.
+
+The PGE engine plans a source-model, generates entity and relationship SQL per
+ontology item, and gates each with a deterministic evaluator + a semantic
+critic. Coverage is engine-enforced (computed from the ontology, not left to LLM
+discretion), with abstract-superclass UNION derivation and a synthetic-endpoint
+fallback so a single failed hub never cascades to drop all relationships.
+
+## Changes
+
+1. NEW package `src/agents/agent_mapping_pge/` — Planner (`planner.py`),
+   generators (`generators/{entity,relationship}.py`), evaluator
+   (`evaluator/{deterministic,critic,report}.py`), engine orchestrator
+   (`engine.py`, bounded ThreadPool walk + monotonic progress), `contracts.py`
+   (SourceModel/EvalReport), and `coverage.py` (deterministic ontology-derived
+   coverage; `skip[]` is advisory and never removes an item).
+2. NEW `src/agents/tools/planner.py` + `src/agents/tools/evaluation.py` —
+   tools used by the PGE agents: the terminal `submit_source_model` and
+   `submit_evaluation`, plus the `normalized_value_overlap` join-key check.
+3. `src/agents/tools/context.py` — ADD `source_model` + `semantic_eval_report`
+   fields (forward-ref typed to avoid a circular import). `warehouse_id` and all
+   existing fields are preserved.
+4. `src/agents/tools/mapping.py` — additive PGE tool-schema plumbing
+   (`unmapped_attributes`, `MAPPING_TOOL_DEFINITIONS_BY_NAME`).
+5. `src/back/core/agents/AgentClient.py` — ADD `run_mapping_pge()` gateway
+   (→ `agent_mapping_pge`). `run_auto_assignment()` is unchanged and still
+   points at `agent_auto_assignment` (the simple engine is retained).
+6. `src/back/objects/mapping/Mapping.py` — run the PGE engine in the auto-assign
+   flow and accumulate the PGE extras (`source_model`, `mapping_evaluations`,
+   `mapping_run_log`) across chunks and single-item runs;
+   `save_mappings_to_session` gains three OPTIONAL params (default `None`, so the
+   legacy path is unaffected). The upstream `_canonicalize_imported_uris` helper
+   is preserved.
+7. Tests: `tests/agents/agent_mapping_pge/` — contracts, coverage, planner,
+   entity/relationship generators, deterministic evaluator, critic, engine.
+
+## Modified / added files
+
+27 files changed, 12047 insertions(+), 8 deletions(-). New `agent_mapping_pge`
+package (12 modules) + 2 new tools + 9 test modules; 4 additive modifications
+(`context.py`, `mapping.py`, `AgentClient.py`, `Mapping.py`).
+
+## Tests
+
+- `uv run pytest tests/agents/agent_mapping_pge -q` → **90 passed**.
+- `uv run pytest tests/units/agents tests/units/mapping -q` → **208 passed**.
+- Imports resolve on the upstream base (origin/master, v0.5.2).
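
The coverage rule described under Changes (computed from the ontology, with `skip[]` advisory) and the abstract-superclass UNION derivation can be illustrated with a short sketch. This is not the contents of `coverage.py`: the function names, argument shapes, and SQL strings are hypothetical, chosen only to show the idea.

```python
def coverage(required, mapped, skip=()):
    """Fraction of ontology-required items that have a mapping.

    `skip` is accepted but deliberately not subtracted from `required`:
    a skipped item still counts as uncovered, so skipping is advisory only.
    Assumption: an empty ontology is treated as fully covered.
    """
    required = set(required)
    if not required:
        return 1.0
    return len(required & set(mapped)) / len(required)


def derive_abstract_sql(abstract_class, subclasses, entity_sql):
    """Build SQL for an abstract class as UNION ALL of its concrete subclasses.

    Only subclasses that already have SQL contribute. Returns None when no
    subclass is mapped. The subclass queries are assumed to project the same
    columns, which UNION ALL requires.
    """
    parts = [
        entity_sql[name]
        for name in subclasses.get(abstract_class, [])
        if name in entity_sql
    ]
    if not parts:
        return None
    return "\nUNION ALL\n".join(parts)
```

For example, `coverage({"A", "B"}, {"A"}, skip=["B"])` stays at 0.5 rather than rising to 1.0, which is the behaviour the "Class silently skipped" failure mode relies on.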
@@ -0,0 +1,50 @@
+"""Planner -> Generator -> Evaluator (PGE) mapping agent.
+
+Three-stage mapping pipeline that replaces the prior single-loop ReAct
+mapping agent:
+
+* **Planner** — proposes a :class:`SourceModel` (table roles, canonical IDs,
+  join keys, ordered mapping plan).
+* **Generator** — produces individual entity/relationship mappings given the
+  plan.
+* **Evaluator** — checks each submitted mapping; stage 1 is deterministic
+  (pure SQL counts), stage 2 is semantic.
+
+The package re-exports the typed contracts and the engine entry point
+(``run_agent``); the Planner, generators, evaluator, and critic live in their
+own submodules.
+"""
+
+from agents.agent_mapping_pge.contracts import (
+    CanonicalId,
+    EvalFailure,
+    EvalReport,
+    JoinKey,
+    MappingPlan,
+    RetryState,
+    SkipItem,
+    SourceModel,
+    TableRole,
+    TableRoleCandidate,
+)
+from agents.agent_mapping_pge.engine import (
+    AgentResult,
+    AgentStep,
+    run_agent,
+)
+
+__all__ = [
+    "AgentResult",
+    "AgentStep",
+    "CanonicalId",
+    "EvalFailure",
+    "EvalReport",
+    "JoinKey",
+    "MappingPlan",
+    "RetryState",
+    "SkipItem",
+    "SourceModel",
+    "TableRole",
+    "TableRoleCandidate",
+    "run_agent",
+]