Guard cost_distance numpy dask chunk path with _check_memory (#3343) #3346

Merged
brendancol merged 2 commits into main from deep-sweep-security-cost_distance-2026-06-15-01
Jun 15, 2026

@brendancol

Contributor

Closes #3343

What

  • Call _check_memory(h, w) inside the _make_chunk_func closure before running _cost_distance_kernel, so the numpy da.map_overlap path raises a clean MemoryError (pointing at max_cost= or smaller chunks) instead of an opaque allocator failure on an oversized chunk.

This was the only allocation path in the module without a guard. The direct numpy path (_cost_distance_numpy), the cupy and dask+cupy paths (_check_gpu_memory), and the iterative tile Dijkstra path already had one.
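
The guard placement described above can be sketched roughly as follows. This is a minimal illustration, not the module's code: the bodies of `_check_memory` and `_make_chunk_func`, the single-argument chunk signature, `_kernel_stub`, and both `ASSUMED_*` constants are assumptions made for the example. Only the helper names and the idea of checking `(h, w)` before the kernel runs come from this PR.

```python
import numpy as np

# Assumed numbers for illustration only; the module's real estimate may differ.
ASSUMED_BYTES_PER_CELL = 4 * 8          # e.g. four float64 work arrays per cell
ASSUMED_AVAILABLE_BYTES = 8 * 1024**3   # pretend the worker has 8 GiB free


def _check_memory(height, width):
    """Stand-in guard: raise if the estimate exceeds 50% of available memory."""
    needed = height * width * ASSUMED_BYTES_PER_CELL
    if needed > 0.5 * ASSUMED_AVAILABLE_BYTES:
        raise MemoryError(
            f"cost_distance chunk of {height}x{width} needs ~{needed} bytes; "
            "pass a finite max_cost= or use smaller chunks"
        )


def _kernel_stub(source_block):
    """Placeholder for _cost_distance_kernel: allocates one output array."""
    return np.full(source_block.shape, np.inf, dtype=np.float64)


def _make_chunk_func():
    def _chunk(source_block):
        # The block arrives with its overlap halo attached, so its shape is
        # the size the kernel will actually allocate for.
        h, w = source_block.shape
        _check_memory(h, w)  # fail fast, before any kernel allocation
        return _kernel_stub(source_block)

    return _chunk


chunk_func = _make_chunk_func()
small = chunk_func(np.zeros((4, 5)))  # well under the limit, so the kernel runs

# A huge read-only *view* (no real allocation) shows the guard firing first.
huge = np.broadcast_to(np.float64(0.0), (10**6, 10**6))
try:
    chunk_func(huge)
except MemoryError as exc:
    message = str(exc)
```

Because the check sits inside the closure, it runs once per chunk when the task executes, not when the graph is built.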

Backend coverage

  • numpy: unchanged (already guarded).
  • dask+numpy: now guarded per chunk (the fix).
  • cupy / dask+cupy: unchanged (already guarded via _check_gpu_memory).

Test plan

  • New test_dask_map_overlap_chunk_memory_guard_raises forces the map_overlap path with a finite max_cost and a chunk size larger than the overlap pad, mocks _available_memory_bytes low, and asserts a MemoryError mentioning max_cost at compute time.
  • Confirmed the test fails on main (DID NOT RAISE) and passes with the fix.
  • Full test_cost_distance.py suite passes (56 passed), covering numpy, dask+numpy, cupy, and dask+cupy.
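
The shape of such a test can be sketched with the standard library alone. Everything here is a stand-in: `cd` is a hypothetical namespace playing the role of the module under test, the 32 bytes per cell is an assumed estimate, and a `functools.partial` stands in for a lazy dask task so that "build" and "compute" are separate steps. The real test presumably drives `cost_distance` through dask and the project's test framework.

```python
import functools
from types import SimpleNamespace
from unittest import mock


def _available_memory_bytes():
    # Stand-in probe; pretend 16 GiB is free.
    return 16 * 1024**3


def _check_memory(height, width):
    # Assumed estimate: 32 bytes per cell, limit at 50% of available memory.
    needed = height * width * 32
    if needed > 0.5 * cd._available_memory_bytes():
        raise MemoryError(
            f"chunk {height}x{width} needs ~{needed} bytes; "
            "pass a finite max_cost= or use smaller chunks"
        )


# Hypothetical namespace standing in for the module under test.
cd = SimpleNamespace(
    _available_memory_bytes=_available_memory_bytes,
    _check_memory=_check_memory,
)


def test_chunk_memory_guard_raises():
    with mock.patch.object(cd, "_available_memory_bytes", return_value=1024):
        # "Graph construction": nothing runs yet, so nothing raises here.
        task = functools.partial(cd._check_memory, 512, 512)
        try:
            task()  # "compute": the guard executes inside the task
        except MemoryError as exc:
            assert "max_cost" in str(exc)
        else:
            raise AssertionError("DID NOT RAISE")


test_chunk_memory_guard_raises()
```

Patching the memory probe rather than allocating a genuinely oversized array keeps the test fast and independent of the machine it runs on.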

The numpy map_overlap chunk function called _cost_distance_kernel
directly, allocating several height*width arrays per chunk without the
_check_memory guard that _cost_distance_numpy and the cupy/iterative
paths apply. An oversized chunk could exhaust a worker with an opaque
allocator error instead of a MemoryError pointing at max_cost= or
smaller chunks.

Add _check_memory(h, w) inside the chunk closure so every numpy
allocation path is guarded consistently.
@github-actions (bot) added the performance label (PR touches performance-sensitive code) on Jun 15, 2026

@brendancol left a comment

Contributor Author

PR Review: Guard cost_distance numpy dask chunk path with _check_memory (#3343)

Blockers

None.

Suggestions

None.

Nits

None.

What looks good

  • The fix is one line in the right place. _make_chunk_func's closure receives source_block, whose shape already includes the map_overlap depth, so _check_memory(h, w) measures the actual per-chunk allocation the kernel is about to make.
  • Backend parity is the point of the change: the direct numpy path (_cost_distance_numpy), the cupy and dask+cupy paths (_check_gpu_memory), and the iterative tile path were all guarded already. This was the lone gap.
  • _check_memory is a no-op for normal chunks (it only raises above 50% of available RAM), so well-sized dask graphs are unaffected.
  • The new test forces the map_overlap path on purpose (finite max_cost, chunk size larger than the overlap pad), mocks memory low, and asserts the MemoryError surfaces at compute time. I confirmed it fails on main (DID NOT RAISE) and passes with the fix, so it actually pins the bug.
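
To see why measuring the padded block matters, here is a small arithmetic sketch. The helper names, the 32 bytes per cell, and the 320 MiB figure are all assumptions chosen for the example; only the 50% threshold and the fact that the block shape includes the overlap depth come from the review. For an interior chunk, an overlap of `depth` cells adds `2 * depth` to each axis.

```python
def padded_shape(chunk_shape, depth):
    """Shape of an interior block once `depth` cells are added on every side."""
    return tuple(n + 2 * depth for n in chunk_shape)


def exceeds_guard(shape, available_bytes, bytes_per_cell=32, fraction=0.5):
    """True if the estimated working set is above `fraction` of available memory."""
    h, w = shape
    return h * w * bytes_per_cell > fraction * available_bytes


core = (2048, 2048)
padded = padded_shape(core, depth=256)   # (2560, 2560)
available = 320 * 1024**2                # assume 320 MiB free on the worker

core_trips = exceeds_guard(core, available)      # False: 128 MiB vs 160 MiB limit
padded_trips = exceeds_guard(padded, available)  # True: 200 MiB vs 160 MiB limit
```

Under these assumed numbers, a check on the nominal chunk size would pass while the block the kernel actually receives would not, which is why the guard reads the shape inside the chunk function.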

Checklist

  • Algorithm unchanged; only a pre-allocation guard added
  • Backends consistent (numpy direct + dask numpy chunk now both guarded)
  • NaN handling unaffected
  • Edge case covered (oversized chunk raises a clear MemoryError)
  • Dask chunk path handled (guard runs inside the task, fires at compute)
  • No premature materialization or extra copies
  • Benchmark not needed (no perf-path change)
  • README/docs not applicable (no public API change)
  • Pure bug fix, no docstring changes needed

@brendancol merged commit bd4ec30 into main on Jun 15, 2026
7 checks passed
@brendancol deleted the deep-sweep-security-cost_distance-2026-06-15-01 branch on June 25, 2026 14:55
Development

Successfully merging this pull request may close these issues.

cost_distance: numpy dask map_overlap chunk path skips the _check_memory guard