Skip to content

fix: emit proper CTE column for ARRAY JOIN scalar aliases (#253)#257

Merged
genezhang merged 1 commit intomainfrom
fix/bi-8-array-join-scalar-cte
Mar 28, 2026
Merged

fix: emit proper CTE column for ARRAY JOIN scalar aliases (#253)#257
genezhang merged 1 commit intomainfrom
fix/bi-8-array-join-scalar-cte

Conversation

@genezhang
Copy link
Copy Markdown
Owner

Fixes bi-8 person.id by tracking UNWIND aliases through alias_to_id and emitting correct CTE column in Phase D.

Two fixes for bi-8 person.id bug:

1. Priority 3 in alias_to_id: find_unwind_aliases() traverses plan
   tree to detect UNWIND aliases, sets alias_to_id[alias] = alias
   (scalar IS the ID). Enables downstream Phase D detection.

2. Phase D: emit person_tag.person AS p6_person_id instead of
   skipping. Creates proper CTE column that downstream CTEs can
   resolve through standard property mapping.

Before: person.id AS p6_person_id (invalid — person is scalar)
After:  person_tag.person AS p6_person_id (correct)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Owner Author

@genezhang genezhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Emit proper CTE column for ARRAY JOIN scalar aliases (#253)

Follow-up to the ARRAY JOIN scalar detection in PR #254. Changes the Phase D behavior from "skip" to "emit correct column reference."

Changes

1. Phase D: emit instead of skip (plan_builder_utils.rs:10006-10032)

Previously, detecting an ARRAY JOIN scalar caused a continue (skip adding the ID column). Now it emits from_alias.scalar_col AS "pN_alias_id" — so person_tag.person AS "p6_person_id". This is correct: downstream CTEs need the p6_person_id column to resolve person.id through standard property mapping.

The fallback path (no FROM alias → log and skip) is reasonable.

2. Priority 3 in alias_to_id: find_unwind_aliases() (plan_builder_utils.rs:11253-11289)

Traverses the plan tree to find Unwind nodes and collects their aliases. Sets alias_to_id[alias] = alias for detected UNWIND scalars. This feeds into the Phase D detection (is_self_id_in_upstream_cte).

One observation: find_unwind_aliases handles a specific set of plan node variants (Filter, Projection, OrderBy, Limit, Skip, GroupBy, WithClause). If a new wrapping plan node is added in the future, it won't be traversed. This is the same pattern used elsewhere in the codebase though (per CLAUDE.md §5 — plan traversal consistency), so it's consistent. Just needs to be remembered when adding new LogicalPlan variants.

3. Missing continue after Priority 2 (plan_builder_utils.rs:11227)

The added continue after Priority 2's alias_to_id_column.insert() is important — without it, Priority 3 would also run and potentially overwrite a more specific mapping. Good catch.

4. Regression test (ldbc_regression_tests.rs:472-478)

Adds a positive assertion: sql.contains("person_tag.person AS \"p6_person_id\""). Combined with the existing negative assertion (!sql.contains("person.id AS ...")), this fully validates the fix. Good.

Verdict

Clean fix that evolves the "skip" approach from PR #254 into the correct "emit proper column" approach. The find_unwind_aliases traversal is consistent with existing patterns. LGTM.

@genezhang genezhang merged commit 7d630ed into main Mar 28, 2026
4 checks passed
@genezhang genezhang deleted the fix/bi-8-array-join-scalar-cte branch March 28, 2026 20:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant