Optimize chunk management

Large datasets need to be chunked in space, even if they already have singleton chunks in time (`{'time': 1}`). By default, this triggers errors in several `xgcm.Grid` operations used in `xwmb`:
- `grid.transform(da, "Z", ...)` if the data is chunked along the `Z` axis, which is used many times to transform data from its original vertical coordinate to tracer coordinates.
- `grid.interp(da, "Z")` if the data is chunked along the `Z` axis, which is used to interpolate the tracer field onto interfaces, required by the `method=conservative` option in the `grid.transform` call above.
- `grid.interp(da, "X")` or `grid.interp(da, "Y")`, which is used to interpolate the tracer field onto horizontal cell faces, required to then transform normal transports into tracer coordinates.
- `grid.diff(da, "X")` or `grid.diff(da, "Y")`, which is used to estimate horizontal convergences.

Since the horizontal and vertical operations are often used in quick succession, variables need to be rechunked multiple times so that the chunks remain manageable but still have `{operation_dim: -1}` before each corresponding operation. This excessive rechunking currently seems to be a main bottleneck, especially for computing the transport terms, which require all four of these operations!
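To make the `{operation_dim: -1}` pattern concrete, here is a minimal sketch of how the chunk specification could be built before each operation. The helper `chunks_for_operation`, the dimension names (`zl`, `yh`, `xh`) and the chunk sizes are all hypothetical and are not part of `xwmb` or `xgcm`; the only convention relied on is that `-1` means a single chunk spanning the whole dimension.

```python
def chunks_for_operation(default_chunks, operation_dim):
    """Return a copy of `default_chunks` with one chunk along `operation_dim`.

    `default_chunks` maps dimension names to chunk sizes. A size of -1 asks
    for a single chunk covering the whole dimension.
    """
    if operation_dim not in default_chunks:
        raise KeyError(f"unknown dimension: {operation_dim!r}")
    chunks = dict(default_chunks)  # copy, so the defaults stay untouched
    chunks[operation_dim] = -1
    return chunks


# Hypothetical "manageable" chunking for a (time, zl, yh, xh) variable.
DEFAULT_CHUNKS = {"time": 1, "zl": 25, "yh": 200, "xh": 200}

# Spec needed before a vertical operation (transform or interp along "Z").
vertical_chunks = chunks_for_operation(DEFAULT_CHUNKS, "zl")

# Spec needed before a horizontal operation along "X" (interp or diff).
x_chunks = chunks_for_operation(DEFAULT_CHUNKS, "xh")

print(vertical_chunks)
print(x_chunks)
```

With an xarray `DataArray` named `da`, a dictionary like this would be passed as `da.chunk(vertical_chunks)` before the vertical step and `da.chunk(x_chunks)` before the horizontal one. Each switch between the two specs is a full rechunk of the variable, which is the cost described above.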


