Add .xrs.validate() to check a raster against the xarray-spatial contract #3486

Merged: brendancol merged 5 commits into main from issue-3485 on Jun 25, 2026

@brendancol

Contributor

Closes #3485

Adds validate() to the .xrs accessor on both DataArray and Dataset. It checks a raster against the input contract every xrspatial op assumes and hands back a report of every violation, each with a suggested fix, instead of letting ops raise one at a time partway through a pipeline.

  • New xrspatial/validate.py: validate() / validate_dataset() plus ValidationReport, DatasetValidationReport, ValidationIssue, and XrsContractError. The report doesn't raise and is truthy when there are no error-level issues; pass raise_on_error=True to raise instead.
  • Two severities. An error means a spatial op will fail (wrong type/ndim/dtype, non-numeric or non-monotonic coords, non-finite cell size). A warning means behavior degrades (unconventional dim names, missing coords, uneven spacing, projected lat/lon, missing CRS). is_valid keys off errors only.
  • Checks reuse the existing utils.py validators and the polygonize CRS-resolution order. They look at structure only (dims, coords, dtype, attrs) and never read the data buffer, so dask stays lazy and cupy stays on device.
  • Accessor methods delegate to the standalone functions and show up under a new Diagnostics banner in the categorized repr.
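
The report semantics described above (collect every issue, truthy when there are no errors, optional raise) can be sketched without xrspatial. This is a minimal illustration, not the code in validate.py: `SketchIssue`, `SketchReport`, `SketchContractError`, and `validate_sketch` are hypothetical names, and the specific rules (2D or 3D accepted, integer/float dtype kinds) are assumptions made for the example.

```python
from dataclasses import dataclass, field


class SketchContractError(ValueError):
    """Hypothetical stand-in for the PR's XrsContractError."""


@dataclass(frozen=True)
class SketchIssue:
    check: str
    severity: str  # "error" or "warning"
    message: str
    fix: str  # suggested remedy shown alongside the violation


@dataclass
class SketchReport:
    issues: list = field(default_factory=list)

    @property
    def errors(self):
        return [i for i in self.issues if i.severity == "error"]

    @property
    def warnings(self):
        return [i for i in self.issues if i.severity == "warning"]

    @property
    def is_valid(self):
        # Validity keys off errors only; warnings never flip it.
        return not self.errors

    def __bool__(self):
        return self.is_valid


def validate_sketch(ndim, dtype_kind, has_crs, raise_on_error=False):
    """Collect every violation from metadata alone (no data buffer is read)."""
    issues = []
    if ndim not in (2, 3):  # assumption: 2D and 3D rasters are accepted
        issues.append(SketchIssue(
            "ndim", "error", f"expected 2D or 3D, got {ndim}D",
            "select or reshape to a 2D/3D raster"))
    if dtype_kind not in ("i", "u", "f"):  # assumption: numeric dtype kinds
        issues.append(SketchIssue(
            "dtype", "error", f"non-numeric dtype kind {dtype_kind!r}",
            "cast to a numeric dtype"))
    if not has_crs:
        issues.append(SketchIssue(
            "crs", "warning", "no CRS found", "attach a CRS to the raster"))
    report = SketchReport(issues)
    if raise_on_error and not report:
        raise SketchContractError("; ".join(i.message for i in report.errors))
    return report
```

With this sketch, a 2D float raster with no CRS yields a truthy report carrying one warning, while a 1D string array yields a falsy report listing both errors at once rather than stopping at the first.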

Backends: numpy, dask+numpy, cupy, dask+cupy. The check is structural metadata only, so it behaves the same on all four (the cupy and dask+cupy tests run on the dev box, not skipped).

Test plan:

  • New xrspatial/tests/test_contract_validate.py covering each error and warning check, severity semantics, raise_on_error, per-variable Dataset reports, repr, and all four backends
  • test_accessor.py expected-method lists updated; full accessor suite passes
  • test_validation.py still green
  • flake8 clean on new files

@brendancol brendancol left a comment

Contributor Author

PR Review: Add .xrs.validate() to check a raster against the xarray-spatial contract

Reviewed in the worktree on the head branch. Two findings, both fixed in the follow-up commit on this branch.

Blockers

None.

Suggestions

  • xrspatial/validate.py _check_geographic_range — false warning on a bare lat/lon dim. The check read coordinates with coords.get(dim), which synthesizes a default integer index (0..N-1) for a dimension that has no real coordinate. A DataArray with a dim named lat but no coordinate and more than 91 rows would get synthesized indices whose max exceeds 90, producing a misleading "looks projected" warning on top of the correct coords_present warning. Fixed: the geographic check now runs only against a real coordinate (dim in agg.coords), with a regression test.
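
The failure mode and the fix can be reproduced with plain dicts standing in for the DataArray. This is a sketch only: `coord_or_default_index`, `looks_projected_buggy`, and `looks_projected_fixed` are hypothetical helpers, and the 90-degree limit is the latitude bound implied by the description above.

```python
def coord_or_default_index(coords, sizes, dim):
    """Mimic a lookup that falls back to 0..N-1 when a dim has no coordinate."""
    if dim in coords:
        return list(coords[dim])
    return list(range(sizes[dim]))


def looks_projected_buggy(coords, sizes, dim, limit=90):
    # Reads the synthesized index for a bare dim, so row counts leak in.
    values = coord_or_default_index(coords, sizes, dim)
    return max(abs(v) for v in values) > limit


def looks_projected_fixed(coords, sizes, dim, limit=90):
    # Gate on a real coordinate first; a bare dim is a different check's job.
    if dim not in coords:
        return False
    return max(abs(v) for v in coords[dim]) > limit


sizes = {"lat": 92}
bare = {}                                      # dim named lat, no coordinate
real = {"lat": [-45.5 + i for i in range(92)]}  # genuine degrees, -45.5..45.5

print(looks_projected_buggy(bare, sizes, "lat"))  # True: index max is 91
print(looks_projected_fixed(bare, sizes, "lat"))  # False
print(looks_projected_fixed(real, sizes, "lat"))  # False
```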

Nits

  • even_spacing reference step. The comment claimed it mirrors utils._warn_if_irregular_spacing, but that helper compares each step to the averaged span/(n-1) resolution while this used diffs[0]. Fixed to use the averaged step so behavior matches the cited helper.
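
The two reference steps can disagree on the same coordinates. The sketch below is illustrative only: `uneven_steps` is a hypothetical helper, and the relative tolerance is an assumption rather than the value used in utils.py.

```python
import math


def uneven_steps(coords, rtol=1e-3, use_first_step=False):
    """Return indices of steps that differ from the reference step."""
    diffs = [b - a for a, b in zip(coords, coords[1:])]
    if len(diffs) < 2:
        return []  # zero or one step cannot be uneven
    if use_first_step:
        reference = diffs[0]
    else:
        # Averaged resolution: total span divided by the number of steps.
        reference = (coords[-1] - coords[0]) / (len(coords) - 1)
    return [i for i, d in enumerate(diffs)
            if not math.isclose(d, reference, rel_tol=rtol)]


coords = [0.0, 1.0, 2.12, 3.24, 4.36]  # steps: 1.0, 1.12, 1.12, 1.12

# Against diffs[0], one short first step makes every later step look uneven.
print(uneven_steps(coords, rtol=0.1, use_first_step=True))  # [1, 2, 3]
# Against the averaged step (1.09), every step is within tolerance.
print(uneven_steps(coords, rtol=0.1))  # []
```

Using the averaged step keeps a single atypical first interval from dominating the comparison.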

What looks good

  • Checks are structural only (dims/coords/dtype/attrs) and never read the data buffer; verified dask stays lazy and cupy stays on device. Backend tests run unskipped on the dev box.
  • CRS resolution reuses the polygonize order. Severity split is clean and is_valid keys off errors only.

Checklist

  • All implemented backends produce consistent results (structural; tested across all four)
  • NaN handling correct (isfinite guards, nan-safe min/max)
  • Edge cases covered (1D/4D, warnings-only, non-finite cell size, bare dims)
  • No premature materialization or unnecessary copies
  • README feature matrix updated
  • Docstrings present and accurate

@brendancol brendancol left a comment

Contributor Author

Follow-up review (after fixes)

Re-reviewed the two changed helpers in validate.py:

  • _check_geographic_range now gates on dim in agg.coords, so a bare lat/lon dimension no longer reads xarray's synthesized integer index. Covered by test_bare_latlon_dim_does_not_trigger_geographic_warning.
  • even_spacing compares each step to the averaged span/(n-1) resolution, matching utils._warn_if_irregular_spacing.

No new findings. Full suite: 117 passed (accessor + contract-validate), flake8 clean. Backend tests (cupy, dask+cupy) ran unskipped.

@brendancol

Contributor Author

Heads up on the red run (..., 3.14) checks: they are not caused by this PR. The only real failure is xrspatial/tests/test_perlin.py::test_perlin_drops_input_coords, which is broken on main (the macOS/Windows/3.14t jobs show "operation canceled" from matrix fail-fast). It is a merge race between perlin PRs #3470 and #3472, fixed in #3488. Every test in this PR passes on every platform it ran on (test_contract_validate.py is green on ubuntu and windows fast lanes). Once #3488 lands on main, I will re-merge main into this branch and the checks will go green.

@brendancol brendancol merged commit 52105da into main Jun 25, 2026
12 checks passed