Merged
31 commits
942a10a
feat: Add SHAP integration for scorecard construction (v0.2.8a1 alpha)
xRiskLab Dec 6, 2025
b1207a6
feat: Add score points for all boosters
xRiskLab Dec 6, 2025
5dade02
Refactor SHAP integration: make optional and on-demand
xRiskLab Dec 6, 2025
42a6c60
Add likelihood boosting docs, SHAP table tests, and test fixes
xRiskLab Dec 7, 2025
31d6391
perf: vectorize SHAP score computation
xRiskLab Dec 7, 2025
e4ff21e
update docs
xRiskLab Dec 7, 2025
20c2c06
update reference to notebook
xRiskLab Dec 7, 2025
bcadd54
remove py script reference
xRiskLab Dec 7, 2025
5585f8e
update notebook
xRiskLab Dec 7, 2025
219ab79
refactor: remove SHAP column from scorecard table, use XAddEvidence w…
xRiskLab Dec 8, 2025
8e74c5e
make it fast
RektPunk Dec 20, 2025
f83aadb
clarify variable name
RektPunk Dec 20, 2025
82fb47c
use vectorize instead of merge
RektPunk Dec 20, 2025
8087767
faster get leafs
RektPunk Dec 20, 2025
4428744
faster get leafs
RektPunk Dec 21, 2025
ce6cef2
construct scorecard optimize
RektPunk Dec 21, 2025
0e23ed5
_convert_tree_to_points optimize
RektPunk Dec 21, 2025
0faf07e
set count as integer in test
RektPunk Dec 21, 2025
7610a0f
raise value error when scorecard is None
RektPunk Dec 21, 2025
87e8093
use map instead of np vectorize
RektPunk Dec 21, 2025
71078fb
use map instead of np vectorize
RektPunk Dec 21, 2025
760cc4d
Merge branch 'feat/faster-construct-scorecard' into feature/shap-inte…
xRiskLab Dec 21, 2025
d2e5292
Merge branch 'feat/faster-create-tree-to-points' into feature/shap-in…
xRiskLab Dec 21, 2025
a21e743
Merge branch 'feat/faster-get-leafs' into feature/shap-integration-alpha
xRiskLab Dec 21, 2025
b6acc27
Merge branch 'refactor/fast-xgb-construct' into feature/shap-integrat…
xRiskLab Dec 21, 2025
1b66087
chore: prepare v0.2.8rc1 release candidate
xRiskLab Dec 21, 2025
5513b93
chore: exclude examples from sdist to reduce package size
xRiskLab Dec 21, 2025
b11caf6
prepare rc2
xRiskLab Jan 4, 2026
8565c9c
feat: add fine-tuning support, mypy type checking, and v0.2.8 release…
xRiskLab Apr 19, 2026
3ff4d70
fix: mark sourcery as darwin-only, remove uv.lock and requirements.tx…
xRiskLab Apr 19, 2026
e1f1b75
chore: fix extra spaces in notebook, add requirements.txt to gitignore
xRiskLab Apr 19, 2026
1 change: 1 addition & 0 deletions .gitignore
@@ -25,6 +25,7 @@ xbooster.egg-info
 tmp
 python-workspace.code-workspace
 uv.lock
+requirements.txt
 examples/*.sql
 examples/*.py
 # Keep typings directory but ignore pycache
9 changes: 8 additions & 1 deletion .pre-commit-config.yaml
@@ -3,7 +3,6 @@ repos:
     rev: 0.6.11
     hooks:
       - id: uv-lock
-      - id: uv-export

   - repo: https://github.com/pre-commit/pre-commit-hooks
     rev: v4.5.0
@@ -21,3 +20,11 @@ repos:
       - id: ruff-check
         args: [--fix]
       - id: ruff-format
+
+  - repo: https://github.com/pre-commit/mirrors-mypy
+    rev: v1.14.1
+    hooks:
+      - id: mypy
+        additional_dependencies: [pandas-stubs]
+        args: [--config-file=pyproject.toml]
+        files: ^xbooster/
82 changes: 82 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,87 @@
# Changelog

## [0.2.8] - 2026-04-19

### Added
- **Fine-Tuning Support**: New `finetuner` module for incremental model updates
  - `finetune_xgb()`, `finetune_lgb()`, `finetune_cb()` helper functions
  - `FineTuneResult` dataclass with tree counts and feature metadata
  - Supports both same-feature continued training and expanded-feature warm-start
  - `n_base_trees` parameter on all three constructors for fine-tuning awareness
  - `from_finetune_result()` classmethod on all constructors
  - `summarize_score_sources()` method for base vs. fine-tuned contribution analysis
  - `TreeSource` column in scorecard output (base/finetuned)
  - Example notebook: `examples/finetuning-getting-started.ipynb`
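
To illustrate what a base vs. fine-tuned split could look like, here is a minimal sketch in plain Python. It assumes that trees are indexed in boosting order, so the first `n_base_trees` trees belong to the base model. The row layout and the helper names `label_tree_source` and `summarize_by_source` are hypothetical and are not taken from xbooster.

```python
from collections import defaultdict


def label_tree_source(tree_index: int, n_base_trees: int) -> str:
    """Label a tree as 'base' or 'finetuned' by its position in boosting order."""
    return "base" if tree_index < n_base_trees else "finetuned"


def summarize_by_source(rows: list[dict], n_base_trees: int) -> dict[str, float]:
    """Sum the points contributed by base trees and by fine-tuned trees."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        totals[label_tree_source(row["Tree"], n_base_trees)] += row["Points"]
    return dict(totals)


# Hypothetical leaf hits for one applicant: one row per tree.
rows = [
    {"Tree": 0, "Points": 12.0},
    {"Tree": 1, "Points": -3.0},
    {"Tree": 2, "Points": 5.0},  # tree added during fine-tuning (assumed)
]
summary = summarize_by_source(rows, n_base_trees=2)
```

Keeping the split as a pure function of the tree index means that no extra state has to be stored per tree, only the count of base trees.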

- **SHAP Integration**: SHAP-based scoring for all three libraries
  - **XGBoost**: Native SHAP extraction using `pred_contribs=True`
  - **LightGBM**: Native SHAP extraction using `pred_contrib=True`
  - **CatBoost**: Native SHAP extraction using `get_feature_importance(type='ShapValues')`
  - New `method="shap"` option in `predict_score()` and `predict_scores()` methods
  - Feature-level score decomposition via `predict_scores(method="shap")`
  - No external dependencies required (uses native SHAP implementations)
  - Dedicated `shap_scorecard.py` module with centralized extraction functions

### Changed
- **Type Checking**: Migrated from `ty` to `mypy` with strict type safety
  - Created type stubs for xgboost, catboost in `typings/`
  - Updated lightgbm and scipy type stubs
  - All source files and tests pass mypy with zero `type: ignore` comments
  - Added mypy hook to `.pre-commit-config.yaml`

- **SHAP Module Refactoring**: Simplified SHAP API and module structure
  - Simplified `compute_shap_scores()` to only require `shap_values`, `base_value`, and `feature_names`
  - Module accessible via `from xbooster import shap` for cleaner imports
  - SHAP computation is optional and only performed when `method="shap"` is used
  - Removed SHAP column from scorecard binning tables (cleaner scorecard structure)
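
As a rough illustration of how per-feature SHAP values and a base value can be turned into score points, the sketch below applies the standard points-to-double-the-odds scaling. It is not the body of `compute_shap_scores()`. The function name `shap_to_points`, the defaults for `pdo`, `target_points`, and `target_odds`, and the sign convention (SHAP values are log-odds of the bad outcome, so higher risk lowers the score) are all assumptions made for this example.

```python
import math


def shap_to_points(shap_values, base_value, feature_names,
                   pdo=50.0, target_points=600.0, target_odds=19.0):
    """Map log-odds SHAP contributions to additive score points (illustrative)."""
    factor = pdo / math.log(2)  # points gained when good:bad odds double
    offset = target_points - factor * math.log(target_odds)
    # Each feature gets points proportional to its (negated) log-odds contribution.
    points = {name: -factor * phi for name, phi in zip(feature_names, shap_values)}
    # The base value carries the offset so that all entries sum to the final score.
    points["base"] = offset - factor * base_value
    return points


# Hypothetical SHAP values for one applicant.
points = shap_to_points([0.4, -0.1], base_value=-2.0, feature_names=["age", "income"])
total = sum(points.values())
```

Because the mapping is linear, the per-feature points add up exactly to the score implied by the model margin, which is what makes a feature-level decomposition possible.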

- **XGBoost Leaf Index Format**: `get_leafs()` now returns integer leaf indices
  - Leaf indices returned as integers (7) instead of floats (7.0)
  - Matches LightGBM behavior for consistency across constructors
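
One reason integer leaf indices are convenient is that any key or label derived from them has a single textual form. The sketch below uses a hypothetical helper, `normalize_leaf_index`, which is not part of xbooster, to show how a float index and an integer index produce different string keys.

```python
def normalize_leaf_index(value: float) -> int:
    """Convert a float leaf index such as 7.0 to the integer 7."""
    if value != int(value):
        raise ValueError(f"leaf index {value!r} is not integral")
    return int(value)


float_leafs = [7.0, 3.0, 12.0]
int_leafs = [normalize_leaf_index(v) for v in float_leafs]

# The same leaf yields different string keys depending on its type.
float_key = f"tree0-leaf{float_leafs[0]}"
int_key = f"tree0-leaf{int_leafs[0]}"
```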

### Performance Improvements
- **XGBoost Constructor Optimization** (PR #14, @RektPunk): Vectorized `construct_scorecard()`
- **LightGBM Constructor Optimizations** (PRs #10, #11, #13, @RektPunk):
  - Vectorized `construct_scorecard()`, `_convert_tree_to_points()`, and `get_leafs()`
- All optimizations maintain backward compatibility and numerical equivalence
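
The commit messages mention replacing a merge with a mapped lookup. A minimal sketch of that general idea, in plain Python rather than the pandas code in this PR, is to index the scorecard once by `(tree, leaf)` and then score each sample with dictionary lookups. The row layout and the names `build_lookup` and `score_sample` are assumptions for illustration.

```python
def build_lookup(scorecard_rows):
    """Index scorecard rows by (tree, leaf) so each hit is one dict lookup."""
    return {(row["Tree"], row["Node"]): row["Points"] for row in scorecard_rows}


def score_sample(leaf_indices, lookup):
    """Sum points across trees; leaf_indices[t] is the leaf reached in tree t."""
    # enumerate() yields (tree, leaf) tuples, which match the lookup keys.
    return sum(map(lookup.__getitem__, enumerate(leaf_indices)))


# Hypothetical two-tree scorecard.
scorecard_rows = [
    {"Tree": 0, "Node": 1, "Points": 10},
    {"Tree": 0, "Node": 2, "Points": -5},
    {"Tree": 1, "Node": 1, "Points": 3},
    {"Tree": 1, "Node": 2, "Points": 8},
]
lookup = build_lookup(scorecard_rows)
scores = [score_sample(leafs, lookup) for leafs in [[1, 2], [2, 1]]]
```

The lookup table is built once per scorecard, so the per-sample cost is one hash lookup per tree instead of a join against the full table.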

### Fixed
- **Package Distribution**: Excluded examples directory from sdist to reduce package size (~8.1MB reduction)

### Technical Details
- 144 tests passing (108 existing + 36 fine-tuning)
- All three constructors support SHAP and fine-tuning: `XGBScorecardConstructor`, `LGBScorecardConstructor`, `CatBoostScorecardConstructor`
- Backward compatible: all existing APIs unchanged
- Full mypy coverage with custom type stubs (no `type: ignore` suppression)

## [0.2.8a1] - 2025-12-04 (Alpha)

### Added
- **SHAP Integration (Alpha)**: Added SHAP-based scoring for all three libraries
  - **XGBoost**: Native SHAP extraction using `pred_contribs=True`
  - **LightGBM**: Native SHAP extraction using `pred_contrib=True`
  - **CatBoost**: Native SHAP extraction using `get_feature_importance(type='ShapValues')`
  - New `method="shap"` option in `predict_score()` and `predict_scores()` methods
  - SHAP values computed on-demand (not stored in scorecard binning table)
  - Feature-level score decomposition via `predict_scores(method="shap")`
  - Particularly useful for models with `max_depth > 1` where interpretability is challenging
  - No external dependencies required (uses native SHAP implementations)
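
The feature-level decomposition rests on the additivity of SHAP values. As a sketch, assuming a binary logistic objective, write $m(x)$ for the raw margin (log-odds), $\phi_j(x)$ for the contribution of feature $j$ out of $p$ features, and $\phi_0$ for the expected value, which the native calls listed above return as the last column of their output:

```latex
\[
  m(x) \;=\; \phi_0 + \sum_{j=1}^{p} \phi_j(x),
  \qquad
  P(y = 1 \mid x) \;=\; \frac{1}{1 + e^{-m(x)}}
\]
```

With `max_depth > 1` a single tree can split on several features, so its leaf value cannot be attributed to one feature directly, whereas each $\phi_j(x)$ isolates one feature's share of the margin.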

### Changed
- **SHAP Architecture Refactoring**: Moved all SHAP logic to dedicated `shap_scorecard.py` module
  - SHAP extraction functions centralized: `extract_shap_values_xgb()`, `extract_shap_values_lgb()`, `extract_shap_values_cb()`
  - SHAP computation is now optional and only performed when `method="shap"` is used
  - Removed SHAP column from scorecard binning tables (cleaner scorecard structure)
  - Simplified API: users don't need to import or call SHAP extraction functions directly

### Technical Details
- All three constructors now support SHAP: `XGBScorecardConstructor`, `LGBScorecardConstructor`, `CatBoostScorecardConstructor`
- SHAP values computed using native library methods (no shap package dependency)
- SHAP computation happens on-demand when `predict_score(method="shap")` or `predict_scores(method="shap")` is called
- Backward compatible: traditional scorecard methods unchanged
- Cleaner separation of concerns: scorecard construction vs. SHAP computation
- Alpha release for testing and feedback

## [0.2.7] - 2025-12-04

### Changed