Skip to content

fix: walk join branches in tag_handler.extract_metadata#247

Merged
hachej merged 1 commit intomainfrom
hussain/fix/tag-handler-walk-join-branches
May 4, 2026
Merged

fix: walk join branches in tag_handler.extract_metadata#247
hachej merged 1 commit intomainfrom
hussain/fix/tag-handler-walk-join-branches

Conversation

@hussainsultan
Copy link
Copy Markdown
Collaborator

Summary

  • tag_handler.extract_metadata walked only the source chain looking for the innermost SemanticTableOp. For joined models the tag tree branches via SemanticJoinOp.left / .right (no source), so the walk stopped at the join op and the catalog summary returned "0 dims, 0 measures" and empty dimensions/measures arrays — even though every leaf SemanticTableOp's metadata was intact further down the tree.
  • Rewrote the walker to recurse through source, fan out into both branches of any SemanticJoinOp, and union dim/measure/calc-measure names from every SemanticTableOp leaf. Inside a join subtree, names are prefixed with the leaf's table name so the catalog summary matches the field names a joined SemanticTable exposes (e.g. flights.flight_count); single-table models still emit flat names so the existing base-model tests continue to hold.
  • Calc measures are now also surfaced in the summary when present.

Scope

Audited the rest of the codebase for the same source-only blind spot — none found:

  • from_tag_node walks source only, but that's intentional: it strips off query wrappers down to the base op, then the _reconstruct_join reconstructor handles left/right itself.
  • reemit operates on tag_node.parent, doesn't traverse metadata.
  • chart/utils.py walkers operate on live op trees and use _get_merged_fields / _find_all_root_models, which already handle joins.
  • All linear ops (SemanticFilterOp, SemanticProjectOp, SemanticGroupByOp, SemanticAggregateOp, SemanticMutateOp, SemanticOrderByOp, SemanticLimitOp, SemanticUnnestOp, SemanticIndexOp) have a single source and pass through correctly.

Test plan

  • nix develop .#impure + uv run python -m pytest src/boring_semantic_layer/tests/test_xorq_tag_handler.py — 9 passed (8 existing + 1 new)
  • nix develop .#impure + uv run python -m pytest src/boring_semantic_layer/tests/test_xorq_tag_handler.py src/boring_semantic_layer/tests/test_xorq_rebuild.py src/boring_semantic_layer/tests/test_xorq_convert.py src/boring_semantic_layer/tests/test_xorq_integration.py — 31 passed, 14 skipped, 1 xfailed
  • New regression test test_extract_metadata_walks_join_branches covers a two-arm join_one chain wrapped in a query and asserts the union of prefixed dim/measure names ({t1.id, t1.name, t2.id, t3.id, t3.extra} and {t1.count, t2.total, t3.extra_count}).

🤖 Generated with Claude Code

The catalog metadata summarizer for BSL-tagged expressions descended
only the ``source`` chain to find the innermost ``SemanticTableOp``.
For joined models the tag tree branches via ``SemanticJoinOp.left`` /
``.right`` (no ``source``), so the walk stopped at the join op and
returned ``"0 dims, 0 measures"`` even though every leaf table's
metadata was intact further down.

Recurse through ``source``, fan out into both branches of any
``SemanticJoinOp``, and union dim/measure/calc-measure names from
every ``SemanticTableOp`` leaf. Inside a join subtree, names get
prefixed with the leaf's table name so the catalog summary matches
the field names a joined ``SemanticTable`` exposes (e.g.
``flights.flight_count``); single-table models still emit flat names.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hachej hachej merged commit 77eab0d into main May 4, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants