Add scatter plot visualization for interaction values #516

Open: mmschlk wants to merge 7 commits into main from claude/plan-issue-418-dI0M3

mmschlk commented Apr 25, 2026

Motivation and Context

This PR adds a new scatter_plot function to visualize per-sample interaction values against feature values. This is inspired by SHAP's scatter plot but extended to support higher-order interactions. For first-order interactions, it matches SHAP's behavior; for higher-order interactions, the x-axis is restricted to a single feature from the interaction tuple.

The implementation includes:

  • Support for both first-order and higher-order interactions
  • Flexible feature selection via index, name, or tuple
  • Optional coloring by another feature with colorbar
  • Jitter support for categorical/integer features
  • Comprehensive input validation and error handling
  • Full test coverage with 379 lines of unit tests
  • Example notebook demonstrating usage
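
To illustrate the jitter item above, here is a minimal sketch of how jitter for integer or categorical x-values could work. It is not taken from the PR's code: the helper name `add_jitter`, the `[0, 1]` range for `jitter`, and the scaling by the smallest gap between distinct values are all assumptions made for illustration.

```python
import random


def add_jitter(values, jitter=0.1, seed=0):
    """Return a copy of `values` with small uniform noise added (hypothetical helper).

    Assumption: `jitter` is a fraction in [0, 1] of the smallest gap between
    distinct x-values, so neighbouring categories never overlap.
    """
    if not 0.0 <= jitter <= 1.0:
        raise ValueError("jitter must be in [0, 1]")
    distinct = sorted(set(values))
    # Nothing to spread out if jitter is disabled or all values are identical.
    if jitter == 0.0 or len(distinct) < 2:
        return list(values)
    min_gap = min(b - a for a, b in zip(distinct, distinct[1:]))
    half_width = 0.5 * jitter * min_gap
    rng = random.Random(seed)  # seeded so the plot is reproducible
    return [v + rng.uniform(-half_width, half_width) for v in values]


x = [0, 1, 1, 2, 2, 2]
x_jittered = add_jitter(x, jitter=0.5)
```

With `jitter=0.5` and a minimum gap of 1, each point moves by at most 0.25, so points that share an integer value are spread apart without crossing into the neighbouring value.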

Public API Changes

  • Yes, Public API changes (Details below)

New public function:

  • shapiq.scatter_plot() - Main plotting function, exported from the shapiq.plot module and from the top-level shapiq namespace

How Has This Been Tested?

Comprehensive unit tests added in tests/shapiq/tests_unit/tests_plots/test_scatter.py covering:

  • Basic functionality with numpy and pandas inputs
  • Shorthand interaction argument formats (int, str, tuple)
  • Auto-selection of most important interaction when interaction=None
  • Higher-order interactions with explicit x-axis selection
  • Color feature with colorbar rendering
  • NaN handling in color values
  • Edge cases (constant color values, jitter effects)
  • All error conditions with informative messages
  • Parameter validation (alpha, dot_size, jitter ranges)
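
As a rough illustration of the auto-selection case listed above (interaction=None), the sketch below picks the non-empty interaction with the largest summed absolute value across samples. The PR does not state how "most important" is measured, so that criterion is an assumption, as is the helper name `most_important_interaction`. Plain dicts keyed by interaction tuples stand in for InteractionValues objects.

```python
def most_important_interaction(samples):
    """Pick the non-empty interaction with the largest summed |value| (assumed criterion).

    `samples` is a list of dicts mapping interaction tuples to values; each
    dict is a stand-in for one sample's InteractionValues.
    """
    totals = {}
    for sample in samples:
        for interaction, value in sample.items():
            if len(interaction) == 0:
                continue  # the empty coalition is not plottable
            totals[interaction] = totals.get(interaction, 0.0) + abs(value)
    if not totals:
        raise ValueError("no non-empty interactions to select from")
    return max(totals, key=totals.get)


samples = [
    {(): 1.0, (0,): 0.2, (1,): -0.9},
    {(): 1.0, (0,): 0.3, (1,): 0.5},
]
best = most_important_interaction(samples)
```

Here `(1,)` accumulates 1.4 against 0.5 for `(0,)`, so it is selected. An input containing only the empty coalition raises an error instead of silently plotting nothing.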

Example notebook added demonstrating typical usage patterns.

Checklist

  • The changes have been tested locally (comprehensive unit test suite).
  • Documentation has been updated (docstring with full parameter descriptions, example notebook).
  • An entry has been added to CHANGELOG.md (not visible in diff, may be separate).
  • The code follows the project's style guidelines.
  • I have considered the impact of these changes on the public API (new export added to __init__.py).

Introduces shapiq.scatter_plot, a SHAP-style scatter (dependence) plot that
works for both first-order Shapley values and higher-order interactions. For
higher-order interactions the x-axis is restricted to a single feature in the
interaction tuple (defaulting to the first, overridable via x_feature),
matching the design discussed in the issue.
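
A minimal sketch of the x-axis rule described above, assuming a hypothetical helper `resolve_x_feature`; the error type and message are illustrative and not taken from the PR:

```python
def resolve_x_feature(interaction, x_feature=None):
    """Choose which feature of `interaction` goes on the x-axis (hypothetical helper)."""
    if len(interaction) == 0:
        raise ValueError("interaction must contain at least one feature")
    if x_feature is None:
        return interaction[0]  # default: first feature in the tuple
    if x_feature not in interaction:
        raise ValueError(
            f"x_feature {x_feature!r} is not part of interaction {interaction!r}"
        )
    return x_feature


default_axis = resolve_x_feature((2, 5))
override_axis = resolve_x_feature((2, 5), x_feature=5)
```

For the pair `(2, 5)` this yields feature 2 by default and feature 5 when overridden; a feature outside the tuple is rejected.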

https://claude.ai/code/session_01BebyCFsKiVH49mUQejbMCd

Apply ruff-format collapses for lines that fit, simplify a redundant
if/elif branch, and switch isinstance() tuples to PEP 604 union syntax
to satisfy UP038 (matches the rest of the codebase).

https://claude.ai/code/session_01BebyCFsKiVH49mUQejbMCd

codecov Bot commented Apr 25, 2026

Codecov Report

❌ Patch coverage is 98.44961% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/shapiq/plot/scatter.py | 98.43% | 2 Missing ⚠️ |

claude and others added 4 commits April 25, 2026 18:24
Adds tests for the four error paths flagged by codecov: TypeError on
non-tuple interaction (float, list), empty interaction tuple, tuple of
unsupported feature type (float), and the "no non-empty interactions"
fallback when interaction=None on an InteractionValues that contains only
the empty coalition.
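
The error paths named above can be sketched as follows. This is an illustration only: the helper name `normalize_interaction`, the exact exception raised for an empty tuple (ValueError here), and the messages are assumptions, since the commit message does not spell them out.

```python
def normalize_interaction(interaction, feature_names):
    """Turn an int, str, or tuple into a tuple of feature indices (hypothetical helper)."""
    # Shorthand: a single index or name means a first-order interaction.
    if isinstance(interaction, int) or isinstance(interaction, str):
        interaction = (interaction,)
    if not isinstance(interaction, tuple):
        raise TypeError(
            f"interaction must be int, str, or tuple, got {type(interaction).__name__}"
        )
    if len(interaction) == 0:
        raise ValueError("interaction tuple must not be empty")  # assumed error type
    resolved = []
    for feature in interaction:
        if isinstance(feature, str):
            if feature not in feature_names:
                raise ValueError(f"unknown feature name {feature!r}")
            resolved.append(feature_names.index(feature))
        elif isinstance(feature, int):
            resolved.append(feature)
        else:
            raise TypeError(
                f"features must be int or str, got {type(feature).__name__}"
            )
    return tuple(resolved)


names = ["MedInc", "Latitude"]
single = normalize_interaction("MedInc", names)
pair = normalize_interaction((1, "MedInc"), names)
```

Floats, lists, and tuples containing floats are rejected with a TypeError, which mirrors the cases the new tests cover.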

https://claude.ai/code/session_01BebyCFsKiVH49mUQejbMCd
…s, max_order=2

Review fixes for the scatter plot:

- y-values now go through iv[interaction_tuple] (canonical InteractionValues
  __getitem__ contract: missing interactions resolve to 0.0) instead of raw
  iv.dict_values[interaction_tuple] dict access, which would have raised
  KeyError on any sample whose lookup didn't contain the requested tuple.
- Pre-flight 'not found in InteractionValues lookup' guard now checks the
  union of lookups across all samples instead of only sample 0, so a
  heterogeneous list with the interaction missing only from index 0 no
  longer falsely rejects a valid request.
- Docstring updated to describe the broadened guard.
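
A minimal sketch of the two fixes above, using plain dicts as stand-ins for per-sample InteractionValues objects. The helper names are hypothetical, and `dict.get(..., 0.0)` only imitates the described `__getitem__` contract; it is not the library's implementation.

```python
def interaction_known(samples, interaction):
    """True if any sample's lookup contains the interaction (union across samples)."""
    return any(interaction in sample for sample in samples)


def collect_y_values(samples, interaction):
    """Gather one y-value per sample, treating a missing interaction as 0.0."""
    if not interaction_known(samples, interaction):
        raise ValueError(f"{interaction!r} not found in InteractionValues lookup")
    return [sample.get(interaction, 0.0) for sample in samples]


# Heterogeneous list: sample 0 does not contain the pair (0, 1).
samples = [
    {(0,): 0.4},
    {(0,): 0.1, (0, 1): -0.2},
    {(0, 1): 0.3},
]
y = collect_y_values(samples, (0, 1))
```

Checking only sample 0 would have rejected `(0, 1)` here; the union check accepts it, and sample 0 contributes 0.0 rather than raising KeyError.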

Example script (examples/visualization/plot_scatter.py) bumped from 20 to
200 explained instances and from max_order=3 to max_order=2 so the scatter
distribution is meaningful while staying around ~6s end-to-end in the docs
build.
The y-axis previously read 'SHAP value' for first-order interactions,
which is misleading whenever the user picked a non-SV index (FSII, k-SII,
STII, ...), and inconsistent with the higher-order branch's 'Interaction
value: A x B' format.

Replace both with a single format derived from the actual index attribute:
    {index}({feature_a}, {feature_b}, ...)

So a first-order FSII run on MedInc reads 'FSII(MedInc)' instead of
'SHAP value', and a pair reads 'FSII(MedInc, Latitude)'. Update the
existing test assertion accordingly.
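
The label format above can be sketched in a few lines. The function name `y_axis_label` is hypothetical; only the output format comes from the commit message.

```python
def y_axis_label(index, feature_names):
    """Build a y-axis label of the form '{index}({feature_a}, {feature_b}, ...)'."""
    return f"{index}({', '.join(feature_names)})"


first_order = y_axis_label("FSII", ["MedInc"])
pairwise = y_axis_label("FSII", ["MedInc", "Latitude"])
```

These produce 'FSII(MedInc)' and 'FSII(MedInc, Latitude)', the two examples given in the commit message.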
Signed-off-by: Maximilian <maximilian.muschalik@gmail.com>

mmschlk commented Apr 25, 2026

Add a one-liner to the changelog
