Optimize ALE and PD computation: 1.6-3.2x speedup #90

Merged
monte-flora merged 1 commit into master from improve/performance-optimization on Apr 2, 2026

@monte-flora
Owner

ALE (compute_first_order_ale):

  • Replace DataFrame operations with numpy arrays throughout the bootstrap loop
  • Batch both bin-edge predictions into a single predict call (2 → 1 calls)
  • Replace pandas groupby with numpy bincount for mean effects
  • Eliminate 2 DataFrame copies per bootstrap iteration
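
A minimal sketch of the two ALE changes listed above (one batched predict call for both bin edges, and bincount in place of groupby). This is an illustration under stated assumptions, not the code in this PR: the function name `ale_bin_effects`, its signature, and the quantile-based binning are hypothetical, and `X` is assumed to be a float numpy array.

```python
import numpy as np

def ale_bin_effects(predict, X, feature_idx, n_bins=10):
    """Hypothetical helper: per-bin mean prediction differences for one feature."""
    x = X[:, feature_idx]
    # Quantile bin edges; np.unique drops repeated edges from tied values.
    edges = np.unique(np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)))
    n_eff = len(edges) - 1
    # Bin index in [0, n_eff - 1] for every sample (max value goes in the last bin).
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_eff - 1)

    # Set the feature to the lower and upper edge of each sample's bin,
    # then stack both copies so the model is called once instead of twice.
    X_lo = X.copy()
    X_hi = X.copy()
    X_lo[:, feature_idx] = edges[idx]
    X_hi[:, feature_idx] = edges[idx + 1]
    preds = predict(np.vstack([X_lo, X_hi]))
    n = X.shape[0]
    diffs = preds[n:] - preds[:n]

    # Group-wise mean via bincount: sum of differences per bin / samples per bin.
    counts = np.bincount(idx, minlength=n_eff)
    sums = np.bincount(idx, weights=diffs, minlength=n_eff)
    mean_effects = np.divide(sums, counts, out=np.zeros(n_eff), where=counts > 0)
    return edges, mean_effects, counts
```

Accumulating and centering the per-bin effects is left out here; the sketch only covers the prediction batching and the group-wise mean.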

PD (compute_partial_dependence):

  • Vectorize the grid-point loop: batch all n_bins points into a single predict call instead of calling predict once per grid point (20 → 1 calls per bootstrap)
  • Fix bug: predict was called inside the feature loop instead of once after all features were assigned
  • Use numpy arrays instead of DataFrames throughout
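
The grid-point batching described above can be sketched as follows. This is an assumption-laden illustration rather than the PR's implementation: `pd_batched` and its arguments are hypothetical names, and `X` is assumed to be a float numpy array.

```python
import numpy as np

def pd_batched(predict, X, feature_idx, grid):
    """Hypothetical helper: 1D partial dependence with a single predict call."""
    grid = np.asarray(grid, dtype=float)
    n, g = X.shape[0], len(grid)
    # Stack g copies of X; block k (rows k*n .. (k+1)*n - 1) gets grid[k].
    X_rep = np.tile(X, (g, 1))
    X_rep[:, feature_idx] = np.repeat(grid, n)
    # One call over g * n rows instead of g calls over n rows each.
    preds = predict(X_rep)
    # Average the predictions within each grid-point block.
    return preds.reshape(g, n).mean(axis=1)
```

The trade-off is memory: the stacked array holds `len(grid) * X.shape[0]` rows at once, so very large datasets or grids may need chunking.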

Benchmarks (2000 samples, 10 features, 50-tree RF):
PD 1D (3 feat, 10 boot): 1.96s → 0.60s (3.2× faster)
ALE 1D (all, 10 boot): 0.85s → 0.52s (1.6× faster)
PD 1D (3 feat, 1 boot): 0.20s → 0.06s (3.3× faster)
ALE 1D (all, 1 boot): 0.09s → 0.05s (1.7× faster)

Add benchmark_suite.py for reproducible performance measurement.
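
For readers who want to reproduce this kind of before/after comparison, here is a rough timing sketch. It is not the contents of benchmark_suite.py: the helper `time_best_of`, the fixed seed, and the linear stand-in model are all assumptions chosen to keep the example dependency-free.

```python
import time
import numpy as np

def time_best_of(fn, repeats=5):
    """Hypothetical helper: best wall-clock time of `fn` over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

# Fixed seed so the data is identical from run to run.
rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 10))
w = rng.normal(size=10)
grid = np.linspace(-2.0, 2.0, 20)

def predict(A):
    # Stand-in for a fitted model's predict method (assumption).
    return A @ w

def pd_looped():
    # One predict call per grid point.
    out = []
    for v in grid:
        Xv = X.copy()
        Xv[:, 0] = v
        out.append(predict(Xv).mean())
    return np.array(out)

def pd_single_call():
    # All grid points in one predict call.
    X_rep = np.tile(X, (len(grid), 1))
    X_rep[:, 0] = np.repeat(grid, X.shape[0])
    return predict(X_rep).reshape(len(grid), -1).mean(axis=1)

print(f"looped:      {time_best_of(pd_looped):.4f}s")
print(f"single call: {time_best_of(pd_single_call):.4f}s")
```

With a cheap stand-in model the two variants may time similarly; the gains reported above come from models where each predict call carries real overhead, such as a random forest.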

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@monte-flora monte-flora merged commit 2e1822c into master Apr 2, 2026
11 checks passed
@monte-flora monte-flora deleted the improve/performance-optimization branch April 2, 2026 15:56