fix: walk join branches in tag_handler.extract_metadata #247
Merged
The catalog metadata summarizer for BSL-tagged expressions descended only the ``source`` chain to find the innermost ``SemanticTableOp``. For joined models the tag tree branches via ``SemanticJoinOp.left`` / ``.right`` (no ``source``), so the walk stopped at the join op and returned ``"0 dims, 0 measures"`` even though every leaf table's metadata was intact further down.

Recurse through ``source``, fan out into both branches of any ``SemanticJoinOp``, and union dim/measure/calc-measure names from every ``SemanticTableOp`` leaf. Inside a join subtree, names get prefixed with the leaf's table name so the catalog summary matches the field names a joined ``SemanticTable`` exposes (e.g. ``flights.flight_count``); single-table models still emit flat names.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
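As a rough illustration of the difference between the two walks, here is a minimal sketch. It does not use the library's real tag-node types: ``TableLeaf``, ``Join``, ``Wrapper``, ``source_only_walk`` and ``branch_aware_walk`` are hypothetical stand-ins, and the ``carriers`` table and all field names other than ``flights.flight_count`` are made up for the example. Calc-measures are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class TableLeaf:
    """Hypothetical stand-in for a SemanticTableOp tag node."""
    name: str
    dims: tuple = ()
    measures: tuple = ()


@dataclass
class Join:
    """Hypothetical stand-in for SemanticJoinOp: two branches, no `source`."""
    left: object
    right: object


@dataclass
class Wrapper:
    """Hypothetical stand-in for any single-source op (filter, limit, ...)."""
    source: object


def source_only_walk(node):
    """Approximates the old behaviour: follow `source` until it runs out."""
    while hasattr(node, "source"):
        node = node.source
    if isinstance(node, TableLeaf):
        return set(node.dims), set(node.measures)
    return set(), set()  # the walk stopped at a join, so nothing is found


def branch_aware_walk(node, in_join=False):
    """Approximates the fix: recurse through `source`, fan out at joins."""
    if isinstance(node, TableLeaf):
        # Prefix with the table name only when the leaf sits below a join.
        prefix = f"{node.name}." if in_join else ""
        return ({prefix + d for d in node.dims},
                {prefix + m for m in node.measures})
    if isinstance(node, Join):
        left_dims, left_measures = branch_aware_walk(node.left, True)
        right_dims, right_measures = branch_aware_walk(node.right, True)
        return left_dims | right_dims, left_measures | right_measures
    return branch_aware_walk(node.source, in_join)


flights = TableLeaf("flights", ("origin",), ("flight_count",))
carriers = TableLeaf("carriers", ("code",), ("carrier_count",))
joined = Wrapper(Join(flights, carriers))

old_result = source_only_walk(joined)            # empty sets: the reported bug
new_result = branch_aware_walk(joined)           # prefixed names from both leaves
flat_result = branch_aware_walk(Wrapper(flights))  # single table: flat names
```

In this sketch the ``in_join`` flag is what keeps single-table models on flat names while leaves below a join get the ``table.`` prefix.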
Summary
``tag_handler.extract_metadata`` walked only the ``source`` chain looking for the innermost ``SemanticTableOp``. For joined models the tag tree branches via ``SemanticJoinOp.left`` / ``.right`` (no ``source``), so the walk stopped at the join op and the catalog summary returned ``"0 dims, 0 measures"`` and empty ``dimensions`` / ``measures`` arrays, even though every leaf ``SemanticTableOp``'s metadata was intact further down the tree.

Recurse through ``source``, fan out into both branches of any ``SemanticJoinOp``, and union dim/measure/calc-measure names from every ``SemanticTableOp`` leaf. Inside a join subtree, names are prefixed with the leaf's table name so the catalog summary matches the field names a joined ``SemanticTable`` exposes (e.g. ``flights.flight_count``); single-table models still emit flat names so the existing base-model tests continue to hold.

Scope
Audited the rest of the codebase for the same ``source``-only blind spot; none found:
- ``from_tag_node`` walks ``source`` only, but that's intentional: it strips off query wrappers down to the base op, then the ``_reconstruct_join`` reconstructor handles ``left`` / ``right`` itself.
- ``reemit`` operates on ``tag_node.parent`` and doesn't traverse metadata.
- ``chart/utils.py`` walkers operate on live op trees and use ``_get_merged_fields`` / ``_find_all_root_models``, which already handle joins.
- The other op types (``SemanticFilterOp``, ``SemanticProjectOp``, ``SemanticGroupByOp``, ``SemanticAggregateOp``, ``SemanticMutateOp``, ``SemanticOrderByOp``, ``SemanticLimitOp``, ``SemanticUnnestOp``, ``SemanticIndexOp``) have a single ``source`` and pass through correctly.

Test plan
- ``nix develop .#impure`` + ``uv run python -m pytest src/boring_semantic_layer/tests/test_xorq_tag_handler.py``: 9 passed (8 existing + 1 new)
- ``nix develop .#impure`` + ``uv run python -m pytest src/boring_semantic_layer/tests/test_xorq_tag_handler.py src/boring_semantic_layer/tests/test_xorq_rebuild.py src/boring_semantic_layer/tests/test_xorq_convert.py src/boring_semantic_layer/tests/test_xorq_integration.py``: 31 passed, 14 skipped, 1 xfailed
- ``test_extract_metadata_walks_join_branches`` covers a two-arm ``join_one`` chain wrapped in a query and asserts the union of prefixed dim/measure names (``{t1.id, t1.name, t2.id, t3.id, t3.extra}`` and ``{t1.count, t2.total, t3.extra_count}``).

🤖 Generated with Claude Code
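
The shape of the new test case can be approximated without the library. The sketch below is not the actual test: it uses a hypothetical dict-based tree (the ``"op"``, ``"source"``, ``"left"``, ``"right"`` keys and the ``collect`` / ``table`` helpers are assumptions, not the real tag format), and it assumes the two-arm chain nests as ``(t1 join t2) join t3``. The per-table fields are read off the prefixed names listed above.

```python
def collect(node):
    """Iteratively gather (dims, measures) from a hypothetical dict-based tree.

    Assumed node shapes (illustrative only):
      {"op": "table", "name": ..., "dims": [...], "measures": [...]}
      {"op": "join", "left": ..., "right": ...}
      {"op": <anything else>, "source": ...}
    """
    dims, measures = set(), set()
    stack = [(node, False)]  # (node, whether the node sits below a join)
    while stack:
        current, in_join = stack.pop()
        if current["op"] == "table":
            prefix = current["name"] + "." if in_join else ""
            dims.update(prefix + d for d in current["dims"])
            measures.update(prefix + m for m in current["measures"])
        elif current["op"] == "join":
            # Fan out into both branches; everything below is "inside a join".
            stack.append((current["left"], True))
            stack.append((current["right"], True))
        else:
            # Single-source wrapper: keep descending.
            stack.append((current["source"], in_join))
    return dims, measures


def table(name, dims, measures):
    """Build a hypothetical leaf node."""
    return {"op": "table", "name": name, "dims": dims, "measures": measures}


# Three leaves joined as (t1 join t2) join t3, then wrapped in a query op.
tree = {
    "op": "query",
    "source": {
        "op": "join",
        "left": {
            "op": "join",
            "left": table("t1", ["id", "name"], ["count"]),
            "right": table("t2", ["id"], ["total"]),
        },
        "right": table("t3", ["id", "extra"], ["extra_count"]),
    },
}

dims, measures = collect(tree)
assert dims == {"t1.id", "t1.name", "t2.id", "t3.id", "t3.extra"}
assert measures == {"t1.count", "t2.total", "t3.extra_count"}
```

The prefix is what keeps the three ``id`` dimensions distinct in the union; without it they would collapse into a single name.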