
Fix categorical ALE crash with high-cardinality features (#86) #87

Merged
monte-flora merged 1 commit into master from fix/categorical-ale-high-cardinality
Apr 2, 2026

@monte-flora
Owner

When bootstrapping categorical ALE, different bootstrap samples may contain different subsets of categories, producing ragged arrays that NumPy cannot stack. Fix by:

  • Computing the full category set from the original data upfront
  • Reindexing each bootstrap's ALE values to the full category set (missing categories filled with NaN)

Together, these steps give every bootstrap iteration the same array shape.
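
A minimal sketch of the failure and the reindexing idea described above, not the code from this PR. The two hand-written series stand in for per-bootstrap ALE values, and the names `boot_1`, `boot_2`, and `all_categories` are hypothetical.

```python
import numpy as np
import pandas as pd

# Full category set, as it would be computed once from the original data.
all_categories = ["a", "b", "c", "d"]

# Hypothetical per-bootstrap results: the first resample never drew category "c".
boot_1 = pd.Series({"a": 0.2, "b": -0.1, "d": 0.4})
boot_2 = pd.Series({"a": 0.3, "b": 0.0, "c": -0.2, "d": 0.1})

# Stacking the raw values fails because the arrays have lengths 3 and 4.
try:
    np.stack([boot_1.to_numpy(), boot_2.to_numpy()])
    ragged_error = None
except ValueError as exc:
    ragged_error = exc

# Reindexing to the full category set pads missing categories with NaN,
# so every bootstrap row has the same length and stacking succeeds.
aligned = [b.reindex(all_categories).to_numpy() for b in (boot_1, boot_2)]
stacked = np.stack(aligned)
```

Because missing categories are stored as NaN, any later aggregation over bootstraps would presumably need NaN-aware reductions such as `np.nanmean`.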

Also add 3 unit tests for low/medium/high cardinality categorical ALE.
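
For illustration only, a shape check across three cardinalities might look like the sketch below. The helper `bootstrap_category_means` is a hypothetical stand-in for the library's bootstrapped categorical ALE, and the cardinalities (3, 30, 300) are assumptions rather than the values used in the PR's tests.

```python
import numpy as np
import pandas as pd


def bootstrap_category_means(data, n_bootstrap, seed=0):
    """Hypothetical stand-in: per-category mean of y for each bootstrap resample."""
    rng = np.random.default_rng(seed)
    # Full category set from the original data, computed before resampling.
    all_categories = sorted(data["cat"].unique())
    rows = []
    for _ in range(n_bootstrap):
        boot = data.iloc[rng.integers(0, len(data), size=len(data))]
        means = boot.groupby("cat")["y"].mean()
        # Categories absent from this resample become NaN.
        rows.append(means.reindex(all_categories).to_numpy())
    return np.stack(rows)


def check_shape(n_categories, n_rows=300, n_bootstrap=4):
    """Build toy data with up to n_categories levels and verify the output shape."""
    rng = np.random.default_rng(n_categories)
    data = pd.DataFrame({
        "cat": rng.integers(0, n_categories, size=n_rows).astype(str),
        "y": rng.normal(size=n_rows),
    })
    out = bootstrap_category_means(data, n_bootstrap)
    assert out.shape == (n_bootstrap, data["cat"].nunique())
    return out.shape


# Low, medium, and high cardinality relative to the 300 rows of toy data.
shapes = {n: check_shape(n) for n in (3, 30, 300)}
```

In the high-cardinality case many categories appear only once, so individual resamples almost always miss some of them; without the reindexing step the rows would have different lengths.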

Closes #86

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@monte-flora merged commit 64f4c75 into master on Apr 2, 2026
11 checks passed

Development

Successfully merging this pull request may close these issues.

Calculating ALE for high cardinality categorical feature throw a ValueError