Fix EUCLIDEAN proximity/allocation/direction dropping targets at exact max_distance#3443
Merged
Merged
Conversation
…3442) cKDTree's distance_upper_bound is exclusive. The numpy/dask KDTree paths widened it by one ulp to emulate the inclusive dist <= max_distance check the brute-force/CUDA backends use, but for p=2 (EUCLIDEAN) cKDTree squares the bound internally and nextafter(0, inf)**2 underflows to 0, so a target exactly at the bound (most visibly a target pixel at max_distance=0) was dropped to NaN. _kdtree_query_lowest_index passed the raw exclusive bound with no widening at all. Route every cKDTree call through a shared _inclusive_upper_bound helper that widens by a relative epsilon with an absolute floor, robust for p=1 and p=2 including max_distance=0. numpy/dask now match cupy/dask+cupy.
brendancol
commented
Jun 22, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: Fix EUCLIDEAN proximity/allocation/direction dropping targets at exact max_distance
Blockers
None.
Suggestions
None blocking. One thing worth a sentence in the helper docstring: the widen is relative (4*eps*max(|md|, 1.0)), so at large max_distance it could in principle include a target whose true distance is within ~8*eps*md of the bound. That gap is below float64 resolution at that magnitude, so the two distances are not distinguishable anyway and the behavior is correct. Calling that out would save the next reader the same check I just did.
Nits
- The
np.isfinite ... else np.infguard in_tiled_chunk_query(proximity.py:1146) is now redundant since_inclusive_upper_boundhandles inf, but it is pre-existing and harmless. Leave it.
What looks good
- The root cause is correct: cKDTree squares
distance_upper_boundfor p=2, andnextafter(0, inf)**2underflows to 0, so the old widen collapsed atmax_distance=0. Verified directly against scipy. - One shared helper now feeds all three cKDTree call sites (
_process_numpy_kdtree, both queries in_kdtree_query_lowest_index), so the numpy and dask paths can't drift apart again. - Checked over-inclusion at md=0, md=1, and md=1000: includes exactly-at-bound, excludes just-below, never pulls in the next real distance.
- The tie-break in
_kdtree_query_lowest_indexis preserved: both k=2 neighbors use the same widened bound, so the equidistant test is unaffected. - Tests cover all four backends for proximity/allocation/direction at max_distance=0 plus an exact-distance boundary pixel, and pin numpy against dask+numpy.
Checklist
- Algorithm matches reference (cKDTree exclusive-bound semantics verified against scipy)
- All implemented backends produce consistent results (numpy/dask/cupy/dask+cupy)
- NaN handling is correct (inf/unbounded returns inf; non-target stays NaN)
- Edge cases covered (max_distance=0, exact-distance boundary)
- Dask chunk boundaries handled (bounded map_overlap + global/tiled KDTree paths all routed through the helper)
- No premature materialization or unnecessary copies
- Benchmark not needed (numerical bug fix, no perf-critical change)
- README not applicable (no new function, no backend change)
- Docstrings present and accurate
brendancol
commented
Jun 22, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
Follow-up review after addressing the prior pass.
Disposition of the earlier findings:
- Suggestion (document the relative-widen behavior in the helper docstring): fixed in 04c6c73. Added a sentence noting that a target within ~8epsmax_distance of the bound could qualify, but the two distances are indistinguishable in float64 at that magnitude, so the result is still correct.
- Nit (redundant
np.isfinite ... else np.infguard in_tiled_chunk_query): dismissed. It is pre-existing, harmless, and outside the scope of this fix; touching it would be an unrelated change.
No new findings. The follow-up was a docstring-only edit; the 25 regression cases still pass and flake8 is clean on the changed module.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #3442
What changed
distance_metric='EUCLIDEAN', the numpy and dask+numpy cKDTree paths dropped a target sitting exactly atmax_distanceto NaN, while cupy and dask+cupy (brute force) kept it. The clearest case ismax_distance=0: a target pixel is distance 0 from itself and should qualify, but numpy/dask returned NaN there.distance_upper_boundis exclusive, and for p=2 it compares squared distances. The old one-ulp widen of 0 underflows when squared, collapsing the bound back tomax_distance._kdtree_query_lowest_indexdid not widen at all._inclusive_upper_boundhelper that widens by a relative epsilon with an absolute floor, so the bound survives squaring for p=2 and stays inclusive for p=1.Backend coverage
numpy and dask+numpy were the affected backends and are fixed. cupy and dask+cupy already behaved correctly (brute force); the new tests pin all four together.
Test plan
max_distance=0keeps the target pixel (proximity 0, allocation target value, direction 0) on numpy, dask+numpy, cupy, dask+cupy, for EUCLIDEAN and MANHATTANmax_distance=0for all three functionstest_proximity.pysuite passes (543 tests)