Skip to content

classify: return all-NaN instead of crashing on all-non-finite input (#3510)#3548

Merged
brendancol merged 3 commits into
mainfrom
issue-3510
Jun 27, 2026
Merged

classify: return all-NaN instead of crashing on all-non-finite input (#3510)#3548
brendancol merged 3 commits into
mainfrom
issue-3510

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Closes #3510

What

head_tail_breaks, percentiles, and box_plot raised an opaque reduction error when every pixel was non-finite (all NaN, or all +/-inf, which is mapped to NaN before binning). Each one took a max or percentile over the finite subset of the data, and on this input that subset is empty.

  • head_tail_breaks -> ValueError: zero-size array to reduction operation fmax
  • percentiles -> IndexError: index -1 is out of bounds for axis 0 with size 0
  • box_plot -> ValueError: zero-size array to reduction operation fmax

They now guard the empty-finite case and return an all-NaN array, matching equal_interval, std_mean, and maximum_breaks.

Backends

numpy, cupy, dask+numpy, dask+cupy. The guard sits in the shared bin/percentile helpers and the eager box_plot branch, so it covers all four.

Test plan

  • Flipped the xfail regression tests (added by the test-coverage sweep) to plain assertions
  • Added all-inf coverage for the three functions
  • Full test_classify.py passes (117)

…3510)

head_tail_breaks, percentiles, and box_plot reduced over the empty finite
subset of an all-NaN/all-inf raster and raised an opaque reduction error.
Guard the empty-finite case in each so they degrade to an all-NaN result,
matching equal_interval, std_mean, and maximum_breaks. Flip the xfail
regression tests to plain assertions and add all-inf coverage.

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: classify all-non-finite handling (#3510)

Read the full classify.py context, not just the diff. The fix is correct and the backend coverage holds up.

Blockers

None.

Suggestions

  • percentiles now reaches np.nanmax(clean) in the wrapper on all-NaN input (the crash that used to preempt it is gone), so it emits a benign RuntimeWarning: All-NaN slice encountered. std_mean has the same behaviour and its test suppresses it with warnings.catch_warnings(). For consistency, test_percentiles_all_nan / test_percentiles_all_inf could do the same (xrspatial/tests/test_classify.py). It does not fail CI -- RuntimeWarning is not promoted to error in setup.cfg -- so this is housekeeping, not a correctness issue. head_tail_breaks and box_plot short-circuit before any nan-reduction and stay warning-free.

Nits

None.

What looks good

  • The guard sits in the shared helpers (_compute_head_tail_bins, _run_percentiles, _run_dask_percentiles, _box_plot_bins_from_sample) plus the eager _run_box_plot branch, so a single guard per code path covers all four backends without duplicating logic. Verified the box_plot mapper routes numpy/cupy through _run_box_plot and both dask paths through _box_plot_bins_from_sample.
  • Returning a [nan] bin (or _bin(agg, [nan], [0])) is the same degradation equal_interval/std_mean/maximum_breaks already use, so behaviour is uniform across classifiers.
  • all-inf coverage uses both +inf and -inf, exercising the isinf -> nan mapping.
  • Full test_classify.py passes (117), including cupy and dask+cupy on a CUDA box.

Checklist

  • Degradation matches sibling classifiers
  • All four backends produce all-NaN consistently
  • NaN/inf handling correct (isinf catches both signs)
  • Edge case (all-non-finite) covered by tests
  • Dask sample paths guarded
  • No premature materialization introduced
  • No new public API (no README/docstring change needed)

… tests (#3510)

percentiles still takes nanmax over the cleaned array to set the top bin
edge, which warns on all-non-finite input like std_mean does. Suppress it
in the regression tests to match the test_std_mean_all_nan convention.

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review (#3510)

Re-reviewed after the follow-up commit and the clean merge of origin/main.

Disposition of prior findings

  • Suggestion (percentiles all-NaN RuntimeWarning): fixed in c94d36e. test_percentiles_all_nan and test_percentiles_all_inf now wrap the call in warnings.catch_warnings() the same way test_std_mean_all_nan does. Confirmed by running the all-NaN/all-inf subset with -W error::RuntimeWarning: 16 passed, no unsuppressed warnings.

New state

  • Merged origin/main (geotiff symbology sidecar commit, unrelated to classify). No conflicts. Full test_classify.py still passes (117).

No remaining findings.

@brendancol brendancol merged commit 22e2d1c into main Jun 27, 2026
10 checks passed
@brendancol brendancol deleted the issue-3510 branch July 1, 2026 14:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

classify: head_tail_breaks, percentiles, box_plot crash on all-NaN / all-inf input

1 participant