Large datasets need to be chunked in space, even if they already have singleton chunks in time ({'time': 1}). By default, this triggers errors in several different xgcm.Grid operations used in xwmb:
grid.transform(da, "Z", ...) if the data is chunked along the Z axis, which is used many times to transform data from its original vertical coordinate to tracer coordinates.
grid.interp(da, "Z") if the data is chunked along the Z axis, which is used to interpolate the tracer field onto interfaces, required by the method=conservative option in the grid.transform call above.
grid.interp(da, "X") or grid.interp(da, "Y"), which is used to interpolate the tracer field onto horizontal cell faces, required to then transform normal transports into tracer coordinates.
grid.diff(da, "X") or grid.diff(da, "Y"), which is used to estimate horizontal convergences.
Since the horizontal and vertical operations are often used in quick succession, variables need to be rechunked multiple times so that the chunks remain manageable but still have {operation_dim: -1} (a single chunk spanning the whole dimension) before each corresponding operation. This excessive rechunking currently seems to be a main bottleneck, especially for computing the transport terms, which require all four of these operations!
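To make the rechunking pattern above concrete, here is a minimal pure-Python sketch of how the chunk spec has to change before each operation. The helper names (`chunks_for`, `count_rechunks`), the dimension names (`zl`, `yh`, `xh`) and the chunk sizes are assumptions for illustration only; they are not part of xgcm or xwmb.

```python
def chunks_for(operation_dim, base_chunks):
    """Return a chunk spec with a single chunk along `operation_dim`.

    `base_chunks` maps dimension names to the usual chunk sizes. A value of
    -1 means "one chunk spanning the whole dimension" in xarray/dask.
    """
    if operation_dim not in base_chunks:
        raise KeyError(f"unknown dimension: {operation_dim!r}")
    chunks = dict(base_chunks)  # copy so the caller's spec is not modified
    chunks[operation_dim] = -1
    return chunks


def count_rechunks(operation_dims, base_chunks):
    """Count how often the chunk spec must change over a sequence of operations."""
    current = dict(base_chunks)
    n_rechunks = 0
    for dim in operation_dims:
        needed = chunks_for(dim, base_chunks)
        if needed != current:
            n_rechunks += 1
            current = needed
    return n_rechunks


# Hypothetical dimension names and chunk sizes, for illustration only.
base = {"time": 1, "zl": 15, "yh": 200, "xh": 200}

print(chunks_for("zl", base))
# With xarray, the spec would be applied as: da = da.chunk(chunks_for("zl", base))

# Vertical transform, vertical interp, then horizontal interp and diff:
print(count_rechunks(["zl", "zl", "xh", "yh"], base))
```

Under these assumptions, consecutive operations along the same dimension share one rechunk, but every switch between the vertical and a horizontal dimension forces another one, which is where the cost described above accumulates.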