Skip to content

from_template: lazy dask grids skip the cell cap when chunks is supplied#3556

Merged
brendancol merged 2 commits into
mainfrom
issue-3554
Jun 27, 2026
Merged

from_template: lazy dask grids skip the cell cap when chunks is supplied#3556
brendancol merged 2 commits into
mainfrom
issue-3554

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Closes #3554

Supplying chunks now gives back a lazy dask grid instead of silently dropping the chunk spec, and the 500M-cell cap only guards the eager backends.

  • When chunks is passed, an eager backend is promoted to its dask variant (numpy to dask+numpy, cupy to dask+cupy), so the result is lazy and chunked.
  • The cell cap applies only to numpy and cupy, which materialize the whole grid. A dask grid is a lazy da.full chunk graph, so it skips the cap. The eager cap is unchanged: from_template('conus', resolution=1) on numpy still raises.
  • The chunks default changes from 'auto' to None as a "not supplied" sentinel; the dask paths still use 'auto' when it is omitted.

Backend coverage: numpy and cupy stay eager and capped; dask+numpy and dask+cupy are lazy and uncapped. The promotion paths exercise both dask variants.

Test plan:

  • from_template('new_england', chunks=512, resolution=10) returns a lazy dask array with chunksize (512, 512) and no cap error
  • an explicit backend='dask+numpy' over the cap is lazy without chunks
  • chunks promotes cupy to dask+cupy on GPU
  • eager cap unchanged: from_template('conus', resolution=1) still raises

…3554)

Supplying chunks now promotes an eager backend to its dask variant
(numpy->dask+numpy, cupy->dask+cupy) and returns a lazy grid. The
500M-cell cap applies only to the eager backends, which materialize the
whole array; a dask grid is a lazy chunk graph, so the cap is skipped.
The eager cap is unchanged.

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: lazy dask grids skip the cell cap when chunks is supplied

Blockers

None.

Suggestions

  • Dropping the cap for dask leaves that path with no guard at all, and the new footgun is chunk count rather than memory. With a fixed chunks=512, the old 500M-cell cap also bounded the task graph at ~1900 chunks. An over-fine resolution now builds an unbounded graph: from_template('conus', resolution=1, chunks=512) is ~50 million chunks, and graph construction alone hangs the client even though no data is materialized. The lazy grid is cheap; building tens of millions of tasks is not. Worth a sanity check on chunk count (not cells), or at least a docstring note that an over-fine resolution on a dask backend can build a huge task graph. (templates.py:390-402)

Nits

  • The Examples block in the from_template docstring does not show the chunks/lazy path. A one-line example would make the new behavior discoverable. (templates.py Examples)

What looks good

  • Minimal and surgical. The cap is now backend-aware, the stale comment that claimed it applied to dask is fixed, and the error message points users at chunks=.
  • Tests cover the three new paths (numpy promotion, explicit dask without chunks, cupy promotion) plus the eager cap staying intact. The over-cap region is picked to keep the chunk count small so the tests run fast.

Checklist

  • Algorithm matches reference/paper: n/a (grid construction)
  • All implemented backends consistent: yes, promotion verified for numpy and cupy
  • NaN handling correct: yes, fill unchanged
  • Edge cases covered: eager cap retained; lazy paths tested
  • Dask chunk boundaries: n/a (empty grid)
  • No premature materialization: yes, da.full stays lazy
  • Benchmark: not needed
  • README matrix: not needed (no backend support change)
  • Docstrings present and accurate: yes

…3554)

Docstring caution that a very fine resolution on a dask backend builds a
large task graph, plus an Examples entry for the lazy chunked path. The
actual chunk-count guard is deferred to #3557.

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review (after docstring commit)

Disposition of the first-pass findings:

  • Nit (docstring example): fixed in 587dfe2 -- added an Examples entry for the lazy chunked path.
  • Suggestion (dask task-graph footgun): split. The documentation half is fixed in 587dfe2 (the chunks param now warns that a very fine resolution builds a large task graph and suggests coarser resolution or larger chunks). The actual chunk-count guard is a separate design decision (threshold, chunks='auto' handling, raise vs warn) and reintroduces a gate the issue asked to remove for dask, so it is deferred to #3557.

The follow-up commit is docstring-only; no logic change. The new Examples line evaluates (type(agg.data).__name__ == 'Array') and the cap tests still pass.

Remaining open item is the deferred guard (#3557). Nothing blocking on this PR.

@brendancol brendancol merged commit e01a4f8 into main Jun 27, 2026
10 checks passed
@brendancol brendancol deleted the issue-3554 branch July 1, 2026 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

from_template: lazy dask grids should skip the cell-count cap when chunks is supplied

1 participant