Skip to content

expose SpjNormalForm::predicate() + Predicate Display impl#52

Merged
zhuqi-lucas merged 2 commits into
branch-53from
qizhu/expose-spj-predicate-branch53
Jun 24, 2026
Merged

expose SpjNormalForm::predicate() + Predicate Display impl#52
zhuqi-lucas merged 2 commits into
branch-53from
qizhu/expose-spj-predicate-branch53

Conversation

@zhuqi-lucas

@zhuqi-lucas zhuqi-lucas commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

Summary

Cherry-pick of upstream PR datafusion-contrib#120 so atlas can pick up the new predicate() accessor + Display for Predicate without waiting for the upstream review cycle.

Changes:

  • Promotes Predicate and ColumnEquivalenceClass from private to pub so they can be referenced from public APIs.
  • Adds pub fn predicate(&self) -> &Predicate on SpjNormalForm.
  • Adds a Display impl on Predicate that AND-joins pairwise eq-class equalities, narrowed range intervals (default unbounded ones are skipped), and sorted residuals.

Internal fields stay private; only the read-only view is published.

Motivation

Atlas-side meta.mv_normal_form / meta.mv_candidate_map catalog tables (X-2975, atlas PR #5343) currently walk the original LogicalPlan to populate the predicates column because SpjNormalForm::predicate is private. With this change, that workaround drops to normal_form.predicate().to_string().

Tests

22/22 lib tests pass on branch-53. Adds test_predicate_getter_and_display covering eq class + range + residual buckets.

`SpjNormalForm` already exposes public getters for `output_schema`,
`output_exprs`, and `referenced_tables`, but the `predicate` field is
private. Downstream consumers that want to surface the SPJ normal form
(e.g. atlas catalog tables exposing `mv_normal_form` for debugging why
view matching does or does not engage) currently have to walk the
original `LogicalPlan` and re-collect Filter / TableScan filters as a
string, which loses the normalization the analyzer applied.

This change:

* Promotes `Predicate` and `ColumnEquivalenceClass` from private to
  `pub` so they can be referenced from public APIs.
* Adds `pub fn predicate(&self) -> &Predicate` on `SpjNormalForm`.
* Adds a `Display` impl on `Predicate` that AND-joins:
  - pairwise equalities derived from each multi-column equivalence
    class (singletons emit nothing),
  - narrowed range intervals per equivalence class (classes whose
    interval is still the default unbounded interval are skipped, so
    only meaningful constraints surface),
  - residual filter expressions, sorted for deterministic output.

The internal fields stay private, so this does not lock in any
particular representation; only the read-only view is published.

A unit test exercises the new getter + Display on a predicate that
populates all three buckets (eq class, range, residual).
Copilot AI review requested due to automatic review settings June 23, 2026 06:31

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR exposes the normalized filter predicate from SpjNormalForm to external callers and adds a human-readable Display rendering for that predicate, enabling downstream systems (e.g., catalog surfacing) to stringify the normalized predicate without re-walking the original LogicalPlan.

Changes:

  • Added SpjNormalForm::predicate(&self) -> &Predicate accessor.
  • Made Predicate public and implemented fmt::Display to render equality classes, range intervals, and residuals as an AND-joined string.
  • Added a new unit test exercising the new getter and Display output.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/rewrite/normal_form.rs Outdated
Comment on lines +366 to +370
/// Built from the original `LogicalPlan` in [`SpjNormalForm::new`] and
/// exposed read-only via [`SpjNormalForm::predicate`]. Field accessors
/// describe the three filter buckets that view matching uses (equality
/// classes, per-class range intervals, and residuals); see the module
/// docs for how they participate in the subsumption tests.
Comment thread src/rewrite/normal_form.rs Outdated
Comment on lines +413 to +431
for (idx, range) in self.ranges_by_equivalence_class.iter().enumerate() {
let Some(interval) = range else { continue };
let Some(eq_class) = self.eq_classes.get(idx) else {
continue;
};
let Some(col) = eq_class.columns.iter().next() else {
continue;
};
let Ok(field) = self.schema.field_from_column(col) else {
continue;
};
let Ok(unbounded) = Interval::make_unbounded(field.data_type()) else {
continue;
};
if interval == &unbounded {
continue;
}
parts.push(format!("{col} in {interval}"));
}
Comment on lines +1757 to +1760
assert!(
!rendered.is_empty(),
"predicate Display should not be empty when filters exist"
);
- Reword the `Predicate` doc comment so it no longer implies field
  accessors that don't exist; describe the internal buckets instead.
- Make `Display for Predicate` render `FALSE` when any equivalence
  class has an empty range (`None` in `ranges_by_equivalence_class`
  after a prior `Interval::intersect` collapsed). Previously the
  unsatisfiable predicate looked identical to "no filter", which is
  misleading for catalog surfacing.
- Revert `pub struct ColumnEquivalenceClass` to `struct`; the type is
  not referenced from any public signature, so keeping it private avoids
  needlessly expanding the crate's public surface.
- Add `debug_assert!` calls on the in-loop `continue` paths in Display
  that fire only on broken internal invariants (length mismatches,
  empty classes, missing schema entries). The intentional unbounded
  skip stays a silent continue.
- Strengthen `test_predicate_getter_and_display` so it actually asserts
  the range bucket (the `a >= 5` narrowing) renders, not just the
  equality + residual buckets.
@zhuqi-lucas zhuqi-lucas force-pushed the qizhu/expose-spj-predicate-branch53 branch from 9bccd7b to 7016a0e Compare June 23, 2026 08:26
@zhuqi-lucas zhuqi-lucas merged commit c398143 into branch-53 Jun 24, 2026
8 checks passed
@zhuqi-lucas zhuqi-lucas deleted the qizhu/expose-spj-predicate-branch53 branch June 24, 2026 03:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants