Skip to content

Propagate friction geo-attrs through least_cost_corridor (#3446)#3449

Merged
brendancol merged 2 commits into
mainfrom
deep-sweep-metadata-corridor-2026-06-22-01
Jun 22, 2026
Merged

Propagate friction geo-attrs through least_cost_corridor (#3446)#3449
brendancol merged 2 commits into
mainfrom
deep-sweep-metadata-corridor-2026-06-22-01

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Closes #3446

least_cost_corridor builds the corridor as cd_a + cd_b, where each cost-distance surface carries the attrs of its source raster. Source rasters are usually plain marker masks with no geo-attrs, so xarray's binary sum dropped res/crs/transform/nodatavals from the output even when the friction surface that defines the grid had them.

Changes

  • Add _apply_geo_metadata to re-emit the friction surface's attrs and name on the corridor output.
  • Wire it into the non-precomputed path (single, threshold, unreachable, and pairwise outputs). The precomputed path keeps its source-derived behavior, since there is no friction surface to draw from.
  • Only .attrs and .name are set; data, coords, dims, and dtype are untouched, and dask stays lazy.

Backends

Verified live on numpy, cupy, dask+numpy, and dask+cupy (CUDA host).

Test plan

  • test_corridor_inherits_friction_geo_attrs across all four backends
  • test_corridor_threshold_keeps_geo_attrs across all four backends
  • test_corridor_unreachable_keeps_geo_attrs
  • test_pairwise_inherits_friction_geo_attrs
  • test_precomputed_keeps_source_attrs_not_friction
  • Full corridor suite: 43 passed

Found by the metadata propagation sweep (/sweep-metadata).

The corridor is cd_a + cd_b, and each cost-distance surface carries its
source raster's attrs. Source masks usually have no geo-attrs, so the
xarray binary sum dropped res/crs/transform/nodatavals even when the
friction surface had them. Re-emit friction's attrs and name on the
output for the non-precomputed path; leave the precomputed path on its
source-derived behaviour.

Covers single, threshold, unreachable, and pairwise outputs across all
four backends.
@brendancol brendancol added bug Something isn't working severity:high Sweep finding: HIGH sweep-metadata Found by /sweep-metadata labels Jun 22, 2026

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Propagate friction geo-attrs through least_cost_corridor (#3446)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None.

Nits (optional improvements)

  • In the pairwise branch, _apply_geo_metadata calls .rename(geo_source.name) on each corridor, but the DataArray then goes into xr.Dataset(corridors) keyed by corridor_i_j, and the dict key wins as the variable name. So the rename is a no-op for the Dataset variables; only the attrs propagation does anything there. Harmless, just slightly redundant, not worth a change on its own.

What looks good

  • The fix goes at the root cause: the corridor is cd_a + cd_b, each surface carries its source raster's attrs, and the binary sum drops anything that differs between the two operands. Re-emitting friction's attrs and name is the right call, since friction is the raster that defines the grid.
  • The precomputed path is left alone, which is correct. There is no friction surface to draw from there, and test_precomputed_keeps_source_attrs_not_friction pins that behavior.
  • _apply_geo_metadata calls result.copy() before touching attrs, so it does not mutate the input and keeps dask arrays lazy (checked: no compute triggered). .rename(None) yields a None name when friction is unnamed.
  • Test coverage is thorough: all four backends for the main and threshold paths, plus the unreachable all-NaN and pairwise paths. Full corridor suite passes (43 tests).

Checklist

  • Algorithm matches reference/paper (N/A; metadata-only change, values untouched)
  • All implemented backends produce consistent results (numpy/cupy/dask+numpy/dask+cupy verified live)
  • NaN handling is correct (unreachable all-NaN path keeps attrs)
  • Edge cases are covered by tests (unreachable, pairwise, precomputed)
  • Dask chunk boundaries handled correctly (no neighborhood op; copy stays lazy)
  • No premature materialization or unnecessary copies (.copy() is a shallow metadata copy)
  • Benchmark exists or is not needed (not needed; no perf-sensitive change)
  • README feature matrix updated (N/A; no new function or backend change)
  • Docstrings present and accurate (helper docstring explains the root cause)

@brendancol brendancol merged commit 2b823ca into main Jun 22, 2026
10 checks passed
@brendancol brendancol deleted the deep-sweep-metadata-corridor-2026-06-22-01 branch June 25, 2026 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working severity:high Sweep finding: HIGH sweep-metadata Found by /sweep-metadata

Projects

None yet

Development

Successfully merging this pull request may close these issues.

least_cost_corridor drops friction surface geo-attrs (res/crs/transform/nodatavals)

1 participant