🧩 object-aligner

A configurable, deterministic similarity score for structured (JSON-like) data — and a drop-in, model-free reward for LLM prompt optimization.

LLMs are increasingly asked to emit JSON conforming to a fixed schema — for information extraction, tool calling, agentic planning, and knowledge-graph construction. Measuring how close such an output is to a gold reference is awkward: exact match is brittle, text similarity ignores structure, and an LLM judge — powerful and flexible, but costlier to run and harder to reproduce — is not always the right fit when you need a fast, deterministic, auditable score. Object Aligner offers a complementary alternative for that case.

Object Aligner (OA) scores two JSON objects by recursively aligning their trees — the Hungarian algorithm for unordered collections, sequence alignment for ordered ones — and awarding partial credit at the granularity the schema declares. It is configured entirely through a compact set of JSON Schema extensions, so adapting it to a new task means annotating a schema, not writing code.

A primary use is prompt optimization: OA's deterministic, decomposable score makes a ready reward signal for optimizers such as GEPA or DSPy — and because the same alignment localizes every mismatch, it also emits ranked, natural-language feedback for their reflection slots, with no extra model call.

✨ Highlights

🌳 Schema-driven recursive alignment — one deterministic score in [0, 1], with partial credit at every node, for arbitrarily nested objects, lists, and primitives.
🕸️ Referential alignment for (hyper)graphs — score cross-referenced records up to identifier renumbering. OA infers a bijection between gold and candidate ids and scores every reference through it.
🔢 Per-list sequence semantics — choose, per list, between order-agnostic matching, an order-sensitive monotone regime (insertions/deletions) for ranking & planning, and positional tuples / prefixes whose slots carry position-specific meaning.
🧭 Deterministic ranked feedback — the same alignment that produces the score also pinpoints where the candidate departs from gold and emits ranked repair operations, scored by the exact amount of score each recovers — no LLM call.
🔌 Drop-in optimizer reward — plug OA into prompt optimizers like DSPy or GEPA as a reproducible, auditable, model-free reward (and reflection signal).
🧬 Semantic string similarity (optional extension) — score text fields by meaning rather than character overlap, using OpenAI (or any OpenAI-compatible) embeddings, with built-in caching and batching.
🧮 Deterministic & decomposable — same inputs → same number; the top-level score is an explicit weighted aggregate of child scores, which is what makes attribution and feedback exact.

📦 Installation

Not on PyPI yet — install straight from GitHub (like other AIC tools, e.g. aic-nlp-utils):

pip install git+https://github.com/aic-factcheck/object_aligner.git

Or with uv:

uv add git+https://github.com/aic-factcheck/object_aligner.git

Optional extras (embedding-based semantic string similarity via an OpenAI-compatible API):

pip install "object-aligner[semantic-openai] @ git+https://github.com/aic-factcheck/object_aligner.git"

Requires Python 3.13+.

🚀 Quick start

from object_aligner import ObjectAligner

schema = {"type": "string", "score": "jaro"}
aligner = ObjectAligner(schema)

print(aligner.metric("hello", "hallo"))          # {'score': 0.8667}
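
The 0.8667 above agrees with the textbook Jaro similarity, which can be checked by hand. The sketch below is a plain-Python reimplementation for illustration only; that OA's built-in jaro comparator follows exactly this variant is an assumption, and the function name jaro is ours, not part of the OA API.

```python
def jaro(a: str, b: str) -> float:
    """Textbook Jaro similarity in [0, 1] (illustrative, not OA's code)."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters match only if they are equal and at most `window` apart.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit = [False] * len(a)
    b_hit = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        lo, hi = max(0, i - window), min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    a_seq = [ch for ch, hit in zip(a, a_hit) if hit]
    b_seq = [ch for ch, hit in zip(b, b_hit) if hit]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


print(round(jaro("hello", "hallo"), 4))  # 0.8667
```

Four of the five characters match with no transpositions, so the value is (4/5 + 4/5 + 4/4) / 3.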

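Score a nested object and ask for human-readable feedback in one call. Here aligner is assumed to have been built from the object's schema, and gold and pred are the reference and candidate objects: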

result = aligner.metric(gold, pred, generate_feedback=True)
print(result["score"])
print(result["feedback"])   # ranked, prescriptive fix list — deterministic, no LLM

🧠 How it works

OA takes a gold object g, a candidate object p, and a schema S, and returns score(g, p | S) ∈ [0, 1]. Scoring at every internal node runs in two phases:

Alignment — fix a correspondence between the children of g and p (Hungarian assignment for unordered collections / maps; a sequence-alignment dynamic program for ordered lists).
Scoring — aggregate the per-pair child scores over that correspondence into a single number, weighted as the schema declares.

Both alignment branches (unordered and ordered) recurse into their children, so the same two phases apply at any nesting depth. Primitives are scored directly by a configurable comparator; empty values (null/None) are handled explicitly.
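
The two phases can be sketched in a few lines of plain Python. This is a toy illustration, not OA's implementation: the unordered case uses a brute-force search over pairings where OA uses the Hungarian algorithm, the ordered case uses a simple LCS-style dynamic program, and dividing by the longer list's length is an assumed normalisation. All function names are hypothetical.

```python
from itertools import permutations


def leaf(g, p):
    """Hypothetical exact-match comparator for primitives."""
    return 1.0 if g == p else 0.0


def score_unordered(gold, pred, child=leaf):
    """Phase 1: find the best one-to-one pairing. Phase 2: aggregate."""
    if not gold and not pred:
        return 1.0
    n, m = len(gold), len(pred)
    sim = [[child(g, p) for p in pred] for g in gold]
    best = 0.0
    # Brute force over pairings; a Hungarian solver finds the same optimum
    # in polynomial time.
    if n <= m:
        for cols in permutations(range(m), n):
            best = max(best, sum(sim[i][j] for i, j in enumerate(cols)))
    else:
        for rows in permutations(range(n), m):
            best = max(best, sum(sim[i][j] for j, i in enumerate(rows)))
    return best / max(n, m)  # unmatched items contribute 0 (assumption)


def score_ordered(gold, pred, child=leaf):
    """Monotone alignment: matched pairs must keep their relative order."""
    if not gold and not pred:
        return 1.0
    n, m = len(gold), len(pred)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j],  # gold[i-1] left unmatched (deletion)
                dp[i][j - 1],  # pred[j-1] left unmatched (insertion)
                dp[i - 1][j - 1] + child(gold[i - 1], pred[j - 1]),
            )
    return dp[n][m] / max(n, m)


print(score_unordered(["a", "b", "c"], ["c", "a", "b"]))  # 1.0
print(score_ordered(["a", "b", "c"], ["c", "a", "b"]))    # 2/3: only "a", "b" stay in order
# Recursion: pass a list scorer as the child comparator for nested lists.
print(score_unordered([["a", "b"], ["c"]], [["c"], ["b", "a"]],
                      child=score_unordered))             # 1.0
```

Passing a scorer as the child argument is how the sketch mirrors the recursion: each level aligns its own children and delegates their scores one level down.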

🔑 Capabilities

Area	What you get
🔤 Primitives	Strings (`exact`, `jaro`, `jaro_winkler`, `levenshtein`, `damerau_levenshtein`, `osa`, `indel`, `lcsseq`), numbers (`exact`, `invdiff`, `relative`), per-field thresholds, and custom metric callables. See primitives.
📚 Lists & sequences	`order:"fixed"` (positional), `order:"align"` (order-agnostic Hungarian), monotone order-sensitive alignment, `prefixItems`/`prefixWeights` tuples, and `ignoreExcess`/`ignoreMissing`. See lists.
🗂️ Maps / objects	Keys matched by label only (Hungarian), then values graded recursively; tune with `keyImportance`, `valueImportance`, `valueWeight`. See dicts.
🕸️ Referential alignment	`idScope` / `ref` declare primary/foreign-key-style links; OA scores references invariant to id relabeling, with 1-WL tie-breaking for property-identical twins. See referential.
🧭 Feedback	`feedback()` → top-K ranked repair string for optimizer reflection slots (GEPA/DSPy/TextGrad). See feedback.
🩹 Attribution & repair	`attribute()` decomposes the deficit into ranked per-path contributions; `repair()` emits RFC-6902-style ops with exact score deltas and `apply_to()`. See attribution, repair.
🗣️ Describe	`describe()` → deterministic plain-English walk of the alignment tree. See describe.
🈳 Null handling	Per-field `nullScore` for asymmetric `null`/value mismatches. See null handling.
📈 Confidence	Opt-in per-pair stability scores harvested from each Hungarian matrix. See confidence.
🧬 Semantic similarity	Opt-in embedding-based string metric with caching, batching, and OpenAI-compatible transport. See semantic.
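
To make the attribution row concrete: if a node's score is a weighted mean of its child scores (an assumption here; the exact aggregate is defined in the docs), then each child's share of the deficit is its weight times its shortfall, divided by the total weight, and the shares sum to exactly 1 minus the node score. The helper below is a hypothetical illustration, not OA's attribute().

```python
def attribute_node(children):
    """children: {path: (weight, score)} for one aligned node.

    Returns (node_score, ranked_losses), where ranked_losses lists
    (path, lost_score) pairs, largest loss first.
    """
    total_w = sum(w for w, _ in children.values())
    node_score = sum(w * s for w, s in children.values()) / total_w
    losses = {path: w * (1.0 - s) / total_w
              for path, (w, s) in children.items()}
    ranked = sorted(((p, d) for p, d in losses.items() if d > 0),
                    key=lambda item: (-item[1], item[0]))
    return node_score, ranked


score, ranked = attribute_node({
    "/name": (2.0, 0.25),  # weight 2, mostly wrong
    "/role": (1.0, 0.0),   # weight 1, completely wrong
    "/age":  (1.0, 1.0),   # weight 1, correct
})
print(score)   # 0.375
print(ranked)  # [('/name', 0.375), ('/role', 0.25)]
```

Because the losses add up to the whole deficit, fixing the top-ranked path recovers exactly the amount listed next to it.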

🕸️ Referential alignment

Complex structured data is rarely a flat tree: cross-references between records make it a graph or hypergraph, which no prior similarity metric scores once identifiers are arbitrary. Mark one primitive as an identifier (idScope) and others as references (ref):

schema = {
    "type": "object",
    "properties": {
        "people": {
            "type": "array", "order": "align",
            "items": {"type": "object", "properties": {
                "id":   {"type": "integer", "idScope": "person"},
                "name": {"type": "string",  "score": "exact", "valueWeight": 2.0},
                "role": {"type": "string",  "score": "exact"},
            }},
        },
        "mentorships": {
            "type": "array", "order": "align", "ignoreExcess": True,
            "items": {"type": "object", "properties": {
                "mentor": {"type": "integer", "ref": "person"},
                "mentee": {"type": "integer", "ref": "person"},
            }},
        },
    },
}

OA infers the gold→candidate id bijection by matching records on everything except the masked id field, breaks remaining ties by graph structure using Weisfeiler–Leman color refinement, and scores every reference through the bijection. Two correct extractions that renumber and reorder their records therefore still match. Recovering the bijection exactly amounts to a graph-isomorphism problem, which OA approximates in near-linear time.
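
The idea of scoring references through an inferred bijection can be shown with a deliberately simplified sketch. It pairs records whose non-id fields are exactly equal, where OA uses graded matching plus Weisfeiler–Leman tie-breaking, so property-identical twins are paired arbitrarily here. The function names and sample data are hypothetical.

```python
def infer_bijection(gold_rows, pred_rows, id_key="id"):
    """Map gold ids to candidate ids by matching rows on all non-id fields."""
    def signature(row):
        return tuple(sorted((k, v) for k, v in row.items() if k != id_key))

    unused = {}
    for row in pred_rows:
        unused.setdefault(signature(row), []).append(row[id_key])
    mapping = {}
    for row in gold_rows:
        bucket = unused.get(signature(row))
        if bucket:
            mapping[row[id_key]] = bucket.pop(0)
    return mapping


def reference_score(gold_edges, pred_edges, mapping):
    """Fraction of gold edges found in the candidate after id translation."""
    if not gold_edges:
        return 1.0
    translated = {(mapping.get(a), mapping.get(b)) for a, b in gold_edges}
    return len(translated & set(pred_edges)) / len(gold_edges)


gold_people = [{"id": 1, "name": "Ada", "role": "lead"},
               {"id": 2, "name": "Bob", "role": "dev"}]
pred_people = [{"id": 10, "name": "Bob", "role": "dev"},    # reordered
               {"id": 20, "name": "Ada", "role": "lead"}]   # and renumbered
gold_mentorships = [(1, 2)]    # (mentor, mentee)
pred_mentorships = [(20, 10)]

mapping = infer_bijection(gold_people, pred_people)
print(mapping)                                                        # {1: 20, 2: 10}
print(reference_score(gold_mentorships, pred_mentorships, mapping))   # 1.0
```

The candidate uses different ids and a different record order, yet the mentorship edge scores as correct once it is read through the mapping.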

🔌 As a prompt-optimization reward

OA is a deterministic, decomposable structural reward — cheap to evaluate at scale, reproducible, and easy to audit. It complements LLM-as-judge rewards: use a judge for open-ended semantic grading, and OA when the answer has a known schema (the two can also be combined). Used as the reward inside GEPA across synthetic and real-world datasets, OA produced consistent gains and never a significant loss — and the same alignment supplies the natural-language reflection signal, so one call returns both how well a candidate did and what to change.
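
A thin adapter is usually all an optimizer needs. The sketch below relies only on the metric(gold, pred, generate_feedback=True) call and the "score" and "feedback" keys shown in the quick start; the make_reward helper and the StubAligner stand-in are hypothetical, and the exact metric signature each optimizer expects should be taken from its own documentation.

```python
from typing import Any, Callable, Tuple


def make_reward(aligner: Any) -> Callable[[Any, Any], Tuple[float, str]]:
    """Wrap an aligner exposing .metric(gold, pred, generate_feedback=True)."""
    def reward(gold: Any, pred: Any) -> Tuple[float, str]:
        result = aligner.metric(gold, pred, generate_feedback=True)
        return float(result["score"]), str(result.get("feedback", ""))
    return reward


class StubAligner:
    """Stand-in so the sketch runs without object_aligner installed."""

    def metric(self, gold, pred, generate_feedback=False):
        out = {"score": 1.0 if gold == pred else 0.0}
        if generate_feedback:
            out["feedback"] = ("" if gold == pred
                               else f"expected {gold!r}, got {pred!r}")
        return out


reward = make_reward(StubAligner())  # swap in ObjectAligner(schema) in practice
score, feedback = reward({"a": 1}, {"a": 2})
print(score)     # 0.0
print(feedback)  # expected {'a': 1}, got {'a': 2}
```

Returning the score and the feedback text together lets the same call serve both as the optimizer's reward and as its reflection input.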

🧾 Schema extensions (cheat sheet)

OA is configured with a small set of keywords layered on top of JSON Schema:

Keyword	Applies to	Purpose
`score`	string / number / integer	Leaf comparator (built-in name or custom metric)
`threshold`	string / number / integer	Floor below which a leaf scores 0
`order`	array	`"fixed"` (positional) or `"align"` (order-agnostic)
`ignoreExcess` / `ignoreMissing`	array	Drop unmatched candidate / gold items from the denominator
`prefixItems` / `prefixWeights`	array	Positional tuple head with per-slot weights
`keyImportance` / `valueImportance`	object	Weight of the key term vs. the value term
`valueWeight`	object property	Per-property weight in the value aggregate
`idScope`	primitive (in an array)	Declare an identifier scope (primary key)
`ref`	primitive	Reference into a named scope (foreign key)
`nullScore`	any node	Score for an asymmetric `null`/value mismatch

Full reference: docs/schema_reference.md.
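
As a reading aid for the threshold and ignoreExcess / ignoreMissing rows, here is a small sketch of the arithmetic they describe. It is an interpretation of the table, not OA's code: the helper names are hypothetical, and the default denominator (matched pairs plus all unmatched items) is an assumption.

```python
def apply_threshold(raw: float, threshold: float) -> float:
    """Assumed reading of `threshold`: scores below the floor become 0."""
    return raw if raw >= threshold else 0.0


def list_score(pair_scores, unmatched_gold, unmatched_pred,
               ignore_missing=False, ignore_excess=False):
    """Average matched-pair scores over an assumed denominator.

    Unmatched gold items (missing) and unmatched candidate items (excess)
    each add 1 to the denominator unless the matching flag drops them.
    """
    denom = len(pair_scores)
    if not ignore_missing:
        denom += unmatched_gold
    if not ignore_excess:
        denom += unmatched_pred
    return sum(pair_scores) / denom if denom else 1.0


print(apply_threshold(0.8667, 0.9))                      # 0.0
print(apply_threshold(0.95, 0.9))                        # 0.95
# Two perfect matches plus two extra candidate items:
print(list_score([1.0, 1.0], 0, 2))                      # 0.5
print(list_score([1.0, 1.0], 0, 2, ignore_excess=True))  # 1.0
```

With ignoreExcess set, surplus candidate items no longer dilute the score, which suits lists where over-generation is acceptable.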

📖 Documentation

Start at docs/index.md. Chapters:

Chapter
Concepts & Architecture	Primitives
Lists & Arrays	Dictionaries & Objects
Nesting	The Metric Function
Referential Alignment	Feedback
Attribution	Repair
Describe	Null Handling
Confidence	Semantic Similarity
Schema Reference	API Reference

🧪 Development

uv sync          # install dependencies
uv run pytest    # run the test suite

The repository ships a comprehensive pytest suite under tests/ covering primitives, lists, dicts, nesting, referential alignment, feedback, repair, attribution, and edge cases.

📜 Citation

If you use Object Aligner in academic work, please cite the paper (in preparation):

@misc{drchal2026objectaligner,
  title  = {Object Aligner: A Configurable JSON Schema Similarity Score for Graphs,
            Applied to LLM Prompt Optimization},
  author = {Drchal, Jan},
  year   = {2026},
  note   = {Reference implementation: https://github.com/aic-factcheck/object_aligner}
}

📝 License

MIT

🕰️ History

This is a cleaned-up, standalone version of the Object Aligner originally developed as part of the PromptOpt prompt-optimization framework. The original implementation can be found in the first commit of PromptOpt (Dec 20, 2024).
