Skip to content

cost_distance: route bounded large-radius dask path to iterative Dijkstra (#3342)#3345

Merged
brendancol merged 2 commits into
mainfrom
deep-sweep-performance-cost_distance-2026-06-15
Jun 15, 2026
Merged

cost_distance: route bounded large-radius dask path to iterative Dijkstra (#3342)#3345
brendancol merged 2 commits into
mainfrom
deep-sweep-performance-cost_distance-2026-06-15

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Closes #3342.

Problem

cost_distance on a dask+numpy raster with a finite max_cost could rechunk
the input toward a single monolithic chunk and OOM. _cost_distance_dask
gated the map_overlap vs iterative-tile-Dijkstra branch on the pad radius
versus the full array dimensions:

if not np.isfinite(max_radius) or pad >= height or pad >= width:

map_overlap's depth must stay within the chunk size. When
chunk_size <= pad < full_dim, this still took the map_overlap branch with
depth > chunk_size, and dask silently rechunked toward larger blocks. With a
large radius it collapsed to one chunk, the same OOM class as #880 (fixed in
#888 for the unbounded path only).

The dask+cupy path already guarded correctly against chunk size.

Fix

Compare pad against the max chunk dimension and route to the iterative tile
Dijkstra when it would overflow a chunk, matching the dask+cupy guard.

Verification

  • New regression test_bounded_large_radius_no_chunk_collapse: chunks (10,10),
    max_cost=95 (pad 96). Asserts the result keeps >1 partition and matches the
    numpy result. Fails on main (collapses to 1 chunk, no iterative warning),
    passes with the fix.
  • Full suite: 56 passed, including the GPU and dask+cupy paths (CUDA host).

OOM verdict for the bounded large-radius case: was RISKY, now bounded.
Bottleneck: memory-bound. Affected backend: dask+numpy.

…stra (#3342)

The map_overlap vs iterative branch in _cost_distance_dask compared the pad
radius against the full array dimensions instead of the chunk size. When the
radius exceeded the chunk size but stayed below the full dimension, the path
still used map_overlap with depth greater than the chunk size, so dask
silently rechunked the inputs toward larger (eventually single) blocks. That
reintroduces the #880-class OOM on large dask rasters with a finite max_cost.

Compare pad against the chunk size and route to the iterative tile Dijkstra
when it would otherwise overflow a chunk, matching the dask+cupy guard.

Adds a regression test asserting the bounded large-radius path keeps more than
one chunk and matches the numpy result.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label Jun 15, 2026
@brendancol

Copy link
Copy Markdown
Contributor Author

Review: route bounded large-radius dask path to iterative Dijkstra

Reviewed from a worktree on the head branch. The fix is correct and narrowly scoped. No blockers.

Blockers

None.

Suggestions

  • The ASV benchmark (benchmarks/benchmarks/cost_distance.py) only calls cost_distance with the default max_cost=inf, so it always runs the unbounded iterative path. The bounded map_overlap routing that this PR changes is never benchmarked, so a regression here would not show up in ASV. Worth adding a finite-max_cost case.

Nits

  • The guard compares pad against max(chunks) (cost_distance.py:1167-1168), which matches the dask+cupy guard at cost_distance.py:484. For ragged chunks, a pad that lands between the smallest and largest chunk still makes dask rechunk the smaller boundary chunk. That is a local merge, not the single-chunk collapse this PR fixes, so it is harmless. min(chunks) would avoid any rechunk at the cost of sending more cases to the iterative path. Matching the sibling guard is the reasonable call; noting it only so the choice is on the record.

What looks good

  • The OOM mechanism is diagnosed correctly: a depth larger than the chunk size makes map_overlap rechunk toward a single block.
  • The new guard now agrees with the dask+cupy path (cost_distance.py:484) and with the module docstring (cost_distance.py:24), which already said the iterative path is used when "the implied radius exceeds chunk dimensions". The code now does what the docstring claimed.
  • Single-chunk inputs are unaffected: max(chunks) equals the full dimension, so the condition reduces to the original pad >= height or pad >= width.
  • The regression test fails on main and checks both npartitions > 1 and value parity against the numpy result, not just that it runs.

Checklist

  • Algorithm matches intent
  • Backends consistent (guard matches dask+cupy)
  • NaN handling unchanged
  • Single-chunk edge case preserved
  • Dask chunk boundaries handled correctly
  • No premature materialization introduced
  • Benchmark covers the changed path (see suggestion)
  • Docstring accurate (now matches behavior)

The ASV CostDistance benchmark only ran cost_distance with the default
max_cost=inf, so it always exercised the unbounded iterative path and
never the bounded map_overlap branch this PR touches. Add a
time_cost_distance_bounded case whose pixel radius stays inside one
chunk so the dask path runs map_overlap, giving that branch perf
coverage.
@brendancol brendancol merged commit 95be949 into main Jun 15, 2026
7 checks passed
@brendancol brendancol deleted the deep-sweep-performance-cost_distance-2026-06-15 branch June 25, 2026 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cost_distance: bounded path rechunks toward single chunk when radius exceeds chunk size (OOM)

1 participant