Skip to content

binary(): type the dask+cupy meta so the lazy dtype matches float32 (#3508)#3513

Merged
brendancol merged 1 commit into
xarray-contrib:mainfrom
brendancol:deep-sweep-metadata-classify-2026-06-25
Jun 26, 2026
Merged

binary(): type the dask+cupy meta so the lazy dtype matches float32 (#3508)#3513
brendancol merged 1 commit into
xarray-contrib:mainfrom
brendancol:deep-sweep-metadata-classify-2026-06-25

Conversation

@brendancol

@brendancol brendancol commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Part of #3508. Rebased onto current main after the duplicate accuracy-sweep PR #3514 merged.

The _cpu_binary float32 fix and the verify_dtype=True backend tests are already on main via #3514, so this PR is now scoped to the one piece that didn't land there: the dask+cupy lazy dtype.

What this fixes

_run_dask_cupy_binary passed meta=cupy.array(()), which defaults to float64. So the lazy dask+cupy array advertised float64 while _run_cupy_binary computes float32. The merged tests only check the computed dtype because general_output_checks computes the array before asserting, so nothing caught the lazy mismatch.

Probe before the fix:

dask+cupy : lazy dtype = float64 | computed dtype = float32

This is the same advertised-vs-computed mismatch class as aspect #2682 and focal #3217.

Change

  • _run_dask_cupy_binary now passes meta=cupy.array((), dtype='f4').
  • test_binary_dask_cupy asserts the lazy dtype is float32 so it stays pinned (the existing dtype checks all compute first).

After the fix, dask+cupy reports float32 both lazy and computed. Full test_classify.py suite passes on a CUDA host with the GPU backends running live.

Follow-up (out of scope)

The sibling classifiers' dask+cupy helpers (_run_dask_cupy_bin and friends) use the same untyped meta=cupy.array(()) and likely share the same latent lazy-dtype mismatch. Left for a separate change to keep this PR focused on binary.

@brendancol

Copy link
Copy Markdown
Contributor Author

Self-review (domain-aware)

I ran the review-pr checks over this branch and found one real gap, now fixed in 2db65f8.

Fixed during review

The eager dtype fix was correct, but it left the dask+cupy path half done. _run_dask_cupy_binary passed meta=cupy.array(()), which defaults to float64, so the lazy dask array advertised float64 while _run_cupy_binary computes float32. The four backend tests didn't catch it because general_output_checks computes the array before asserting the dtype, so it only ever checked the computed dtype.

Probe before the fix:

dask+numpy : lazy dtype = float32 | computed dtype = float32
dask+cupy  : lazy dtype = float64 | computed dtype = float32

This is the same advertised-vs-computed mismatch class as aspect #2682 and focal #3217. Fix: pass meta=cupy.array((), dtype='f4'), and assert the lazy dtype in test_binary_dask_cupy so it stays pinned. After the fix both paths report float32 lazy and computed. Full classify suite: 103 passed, GPU backends run live.

The dask+numpy path was already fine because map_blocks runs the chunk function to infer its meta, and that now returns float32.

What looks good

  • Root cause is right: _cpu_binary allocated dtype=data.dtype. float32 matches the docstring and every sibling classifier (reclassify, quantile, and the rest go through _cpu_bin, which is already float32).
  • verify_dtype=True added to all four binary backend tests.
  • The integer-input regression test covers the case that previously returned int32 and couldn't hold the NaN sentinel.

Out of scope (follow-up)

The sibling classifiers' dask+cupy helpers (_run_dask_cupy_bin and friends) use the same untyped meta=cupy.array(()) and likely share the same lazy-dtype mismatch. I left them alone to keep this PR focused on binary, matching how the aspect audit deferred slope/curvature.

Note on the duplicate

PR #3514 (accuracy sweep) fixes the same binary() dtype bug. I'm not touching either PR; flagging it here so a maintainer can reconcile the two. This PR also carries the dask+cupy lazy-meta fix and the lazy-dtype assertion, which #3514 may not.

…array-contrib#3508)

The _cpu_binary float32 fix and the verify_dtype=True backend tests landed
on main via the duplicate accuracy-sweep PR xarray-contrib#3514. This change is the
remaining piece: _run_dask_cupy_binary passed meta=cupy.array(()), which
defaults to float64, so the lazy dask+cupy array advertised float64 while
_run_cupy_binary computes float32. The merged tests only check the computed
dtype (general_output_checks computes before asserting), so the mismatch
went unnoticed.

Pass meta=cupy.array((), dtype='f4') and assert the lazy dtype in
test_binary_dask_cupy. Same advertised-vs-computed class as aspect xarray-contrib#2682 and
focal xarray-contrib#3217. Also records the classify metadata-propagation sweep state.
@brendancol brendancol force-pushed the deep-sweep-metadata-classify-2026-06-25 branch from 2db65f8 to 1faa5c6 Compare June 26, 2026 14:05
@brendancol brendancol changed the title Fix binary() output dtype to float32 on all backends (#3508) binary(): type the dask+cupy meta so the lazy dtype matches float32 (#3508) Jun 26, 2026
@brendancol

Copy link
Copy Markdown
Contributor Author

Rebased onto current main now that the duplicate accuracy-sweep PR #3514 has merged. Main already carries the _cpu_binary float32 fix and the verify_dtype=True backend tests, so I dropped those from this branch (they were identical to what merged) and dropped my integer-input test since #3514's test_binary_output_dtype_float32 already covers float64/float32/int32.

This PR is now scoped to the only piece that wasn't in #3514: typing the dask+cupy map_blocks meta as cupy.array((), dtype='f4') so the lazy array advertises float32 instead of float64, plus a lazy-dtype assertion in test_binary_dask_cupy. Net diff vs main is just those two spots (and the classify metadata-sweep state row).

New head: 1faa5c6

@brendancol brendancol merged commit b6e6051 into xarray-contrib:main Jun 26, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant