Add .xrs.validate() to check a raster against the xarray-spatial contract#3486
Conversation
brendancol
left a comment
There was a problem hiding this comment.
PR Review: Add .xrs.validate() to check a raster against the xarray-spatial contract
Reviewed in the worktree on the head branch. Two findings, both fixed in the follow-up commit on this branch.
Blockers
None.
Suggestions
xrspatial/validate.py_check_geographic_range— false warning on a bare lat/lon dim. The check read coordinates withcoords.get(dim), which synthesizes a default integer index (0..N-1) for a dimension that has no real coordinate. A DataArray with a dim namedlatbut no coordinate and more than 91 rows would get synthesized indices whose max exceeds 90, producing a misleading "looks projected" warning on top of the correctcoords_presentwarning. Fixed: the geographic check now runs only against a real coordinate (dim in agg.coords), with a regression test.
Nits
even_spacingreference step. The comment claimed it mirrorsutils._warn_if_irregular_spacing, but that helper compares each step to the averagedspan/(n-1)resolution while this useddiffs[0]. Fixed to use the averaged step so behavior matches the cited helper.
What looks good
- Checks are structural only (dims/coords/dtype/attrs) and never read the data buffer; verified dask stays lazy and cupy stays on device. Backend tests run unskipped on the dev box.
- CRS resolution reuses the polygonize order. Severity split is clean and
is_validkeys off errors only.
Checklist
- All implemented backends produce consistent results (structural; tested across all four)
- NaN handling correct (
isfiniteguards, nan-safe min/max) - Edge cases covered (1D/4D, warnings-only, non-finite cell size, bare dims)
- No premature materialization or unnecessary copies
- README feature matrix updated
- Docstrings present and accurate
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review (after fixes)
Re-reviewed the two changed helpers in validate.py:
_check_geographic_rangenow gates ondim in agg.coords, so a bare lat/lon dimension no longer reads xarray's synthesized integer index. Covered bytest_bare_latlon_dim_does_not_trigger_geographic_warning.even_spacingcompares each step to the averagedspan/(n-1)resolution, matchingutils._warn_if_irregular_spacing.
No new findings. Full suite: 117 passed (accessor + contract-validate), flake8 clean. Backend tests (cupy, dask+cupy) ran unskipped.
|
Heads up on the red |
Closes #3485
Adds
validate()to the.xrsaccessor on both DataArray and Dataset. It checks a raster against the input contract every xrspatial op assumes and hands back a report of every violation, each with a suggested fix, instead of letting ops raise one at a time partway through a pipeline.xrspatial/validate.py:validate()/validate_dataset()plusValidationReport,DatasetValidationReport,ValidationIssue, andXrsContractError. The report doesn't raise and is truthy when there are no error-level issues; passraise_on_error=Trueto raise instead.errormeans a spatial op will fail (wrong type/ndim/dtype, non-numeric or non-monotonic coords, non-finite cell size). Awarningmeans behavior degrades (unconventional dim names, missing coords, uneven spacing, projected lat/lon, missing CRS).is_validkeys off errors only.utils.pyvalidators and the polygonize CRS-resolution order. They look at structure only (dims, coords, dtype, attrs) and never read the data buffer, so dask stays lazy and cupy stays on device.Backends: numpy, dask+numpy, cupy, dask+cupy. The check is structural metadata only, so it behaves the same on all four (the cupy and dask+cupy tests run on the dev box, not skipped).
Test plan:
xrspatial/tests/test_contract_validate.pycovering each error and warning check, severity semantics,raise_on_error, per-variable Dataset reports, repr, and all four backendstest_accessor.pyexpected-method lists updated; full accessor suite passestest_validation.pystill green