Skip to content

Match cKDTree max_distance check to the brute-force float32 convention (#3392)#3555

Merged
brendancol merged 2 commits into
mainfrom
issue-3392
Jun 27, 2026
Merged

Match cKDTree max_distance check to the brute-force float32 convention (#3392)#3555
brendancol merged 2 commits into
mainfrom
issue-3392

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Closes #3392

Bounded EUCLIDEAN/MANHATTAN allocation (and proximity/direction) on a dask
array could return NaN for a target whose distance sits one float32 ulp above
max_distance, while the eager numpy/cupy backends returned the target.

Root cause

The cKDTree backends compared the float64 distance against max_distance, but
the brute-force and CUDA kernels round the distance to float32 before the
dist <= max_distance check (the convention #3389 established). When
max_distance lands in the float32 ulp gap just below a target's distance, the
two conventions disagree.

The dask reproducer reaches the cKDTree path because max_possible_distance is
itself float32-rounded, so a sub-ulp bounded request gets routed to the
global-tree path rather than the halo'd map_overlap path. The same
float64/float32 mismatch affects every cKDTree consumer, including eager numpy
proximity, so the fix lives in the shared bound helper instead of the routing.

Fix

_inclusive_upper_bound now widens the query bound to the midpoint between the
largest float32 <= max_distance and the next float32 up. That midpoint is the
supremum of float64 distances that still round to a float32 value within range,
so the cKDTree decision matches the kernels and nothing the kernels exclude is
admitted. The absolute floor that keeps the p=2 boundary inclusive at small
max_distance (#3442) is preserved.

Backend coverage

  • numpy, dask+numpy: fixed (these use cKDTree).
  • cupy, dask+cupy: already used the float32 brute-force convention, unchanged.
  • Verified all four backends return the target at the boundary.

Test plan

  • New regression tests: dask+numpy proximity/allocation/direction match
    eager numpy at a float32-ulp max_distance, for EUCLIDEAN and MANHATTAN
    (they fail without the fix).
  • Mirror tests: a max_distance one float32 ulp below the target's
    distance still excludes it on both backends.
  • Full xrspatial/tests/test_proximity.py passes (552 tests).

#3392)

The cKDTree backends (eager numpy PROXIMITY and the dask paths for every
mode) compared the float64 distance against max_distance, while the
brute-force and CUDA kernels round the distance to float32 first. A
max_distance landing inside the float32 ulp gap just below a target's
distance dropped that target to NaN on the cKDTree paths but kept it on the
brute-force backends.

_inclusive_upper_bound now widens the bound to the midpoint between the
largest float32 <= max_distance and the next float32 up, matching the
kernels' float32 keep test, while preserving the absolute floor that keeps
the p=2 boundary inclusive at small max_distance.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label Jun 27, 2026

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Match cKDTree max_distance check to the brute-force float32 convention (#3392)

Reviewed the full _inclusive_upper_bound change and the new tests against the
surrounding proximity code. The root-cause analysis holds up: I traced the
reproducer and it does reach _process_dask_kdtree (not the map_overlap halo
path the issue text guessed) because max_possible_distance is computed through
_distance, which returns float32. Putting the fix in the shared bound helper is
the right call since it covers every cKDTree consumer at once.

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

  • The new tests pin numpy against dask+numpy only. The same boundary is worth
    asserting on cupy / dask+cupy so the four backends can't drift apart again
    (the older #3389 *_gpu_matches_cpu tests cover GPU allocation, and the #3442
    block parametrizes all four backends via create_test_raster). Not blocking
    because the fix is unreachable from the brute-force GPU paths and the helper
    is backend-independent, but a 4-backend parametrization would close the gap.

Nits (optional improvements)

  • xrspatial/tests/test_proximity.py:2670 uses
    np.linalg.norm([lon[j] - lon[0]], ord=p) to get the distance. For a
    one-element vector that is just abs(lon[j] - lon[0]) for both p=1 and p=2;
    a plain abs(...) reads more directly. The current form is correct, only
    slightly indirect.

Out of scope (mention only)

  • max_possible_distance is still float32 (via _distance at
    xrspatial/proximity.py:1388), so a sub-ulp bounded request is routed to the
    global-tree path instead of map_overlap. With this fix the result is correct
    either way, so it is a routing/perf nuance rather than a bug. Fine to leave;
    flagging so it is on record.
  • Pre-existing flake8 E127 at xrspatial/tests/test_proximity.py:1134 and
    :1160, unrelated to this PR.

What looks good

  • The midpoint bound is the exact supremum of float64 distances that round to a
    float32 value within range, so it matches the kernels without admitting
    anything they exclude. Verified the widen is half a float32 ulp across
    magnitudes and the absolute floor still keeps p=2 inclusive at small
    max_distance (md=0 -> 8.88e-16, squaring-safe).
  • Regression tests fail without the change (6 of them) and the mirror
    "just below" tests confirm a genuinely out-of-range target still NaNs out.
  • Manually confirmed all four backends return the target at the boundary.

Checklist

  • Algorithm matches the established float32 keep convention (#3389)
  • All implemented backends produce consistent results
  • NaN handling is correct (inclusion widened, exclusion preserved)
  • Edge cases covered (md=0 floor, inf, just-below boundary)
  • Dask paths handled (shared helper covers global/tiled and chunk fns)
  • No premature materialization or unnecessary copies
  • Benchmark not needed (scalar helper, no perf-critical change)
  • README feature matrix not applicable (no new function)
  • Docstring updated to explain the float32 reasoning

…3392)

Parametrize the new keep/exclude tests across numpy, dask+numpy, cupy, and
dask+cupy so the brute-force GPU backends are pinned to the cKDTree backends
at the boundary and cannot drift apart. Replace the one-element
np.linalg.norm with a plain longitude difference.

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review (after 917dc90)

Re-reviewed the diff after the follow-up commit.

Disposition of prior findings

  • Suggestion (4-backend coverage): fixed. The keep/exclude tests now
    parametrize over numpy, dask+numpy, cupy, and dask+cupy via the _ulp_raster
    helper, so the brute-force GPU backends are pinned to the cKDTree backends at
    the boundary. 36 boundary cases pass locally (CUDA backends included).
  • Nit (np.linalg.norm on a one-element vector): fixed. Replaced with the
    plain longitude difference, which is the distance for both p=1 and p=2 on the
    constant-latitude row.
  • max_possible_distance float32 routing and the pre-existing flake8 E127s:
    left as-is, both noted in the first review as out of scope (correct either
    way / not introduced here).

Status

No blockers, no remaining actionable findings. Full test_proximity.py passes
(579 tests across all four backends).

@brendancol brendancol merged commit 982a9ae into main Jun 27, 2026
10 checks passed
@brendancol brendancol deleted the issue-3392 branch July 1, 2026 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

proximity: bounded dask EUCLIDEAN allocation can NaN a target at a float32-ulp max_distance

1 participant