diff --git a/.claude/sweep-documentation-state.csv b/.claude/sweep-documentation-state.csv
index fe6b63381..84fe31d0c 100644
--- a/.claude/sweep-documentation-state.csv
+++ b/.claude/sweep-documentation-state.csv
@@ -2,7 +2,7 @@ module,last_inspected,issue,severity_max,categories_found,notes,doc_coverage
 classify,2026-06-25,3506,MEDIUM,1;3,"Cat3: reclassify (numpy/dask/cupy blocks) + equal_interval example outputs were stale/wrong, binary used np.nan in array repr; corrected to actual output (tests confirm code is correct). Cat1: added missing Examples to std_mean, head_tail_breaks, percentiles, maximum_breaks, box_plot. Fixed in deep-sweep-documentation-classify-2026-06-25 (PR for #3506). Cat2 natural_breaks num_sample-None omission already tracked in #3501 (left alone). All 10 public funcs listed in reference/classification.rst (no Cat4 gap). CUDA available: ran numpy examples; cupy/dask reprs reviewed statically.",10/10
 fire,2026-06-25,,MEDIUM,1;5,"all 7 public funcs (dnbr, rdnbr, burn_severity_class, fireline_intensity, flame_length, rate_of_spread, kbdi) lacked Examples section (Cat1 MEDIUM) and backend-support note (Cat5 MEDIUM); fixed in deep-sweep-documentation-fire-2026-06-25-01; repo issues disabled so no issue number; examples run and outputs match numpy backend; all 7 listed in reference/fire.rst; no Cat2/Cat3/Cat4 issues",7/7
 flood,2026-06-25,,HIGH,1;4;5,"Cat4 HIGH: vegetation_roughness, vegetation_curve_number, flood_depth_vegetation public but absent from reference/flood.rst; Cat1 MEDIUM: no Examples on any of 7 public funcs; Cat5 MEDIUM: backend support undocumented (all 4 backends) + NaN propagation undocumented for curve_number_runoff/travel_time. Fixed in deep-sweep-documentation-flood-2026-06-25: added 3 rst entries, Examples+backend Notes to all 7 funcs (examples executed OK on CUDA host), NaN notes. PR #3502 opened with the fix; gh issue create blocked by auto-mode classifier so no issue number.",7/7
-geotiff,2026-06-25,,MEDIUM,1,"to_geotiff (public write entry point) had Parameters/Returns/Raises but no Examples section while open_geotiff does (Cat1 MEDIUM); added Examples block (plain GeoTIFF, cog=True, .vrt mosaic) modeled on open_geotiff; fixed on deep-sweep-documentation-geotiff-2026-06-25; repo issues disabled so no issue number. Cat2/3/4/5 clean: open_geotiff/to_geotiff signature-docstring parity locked by parity/test_signature_contract.py + write/test_bigtiff.py + test_polish.py; both funcs in autosummary in reference/geotiff.rst; reference page mirrors SUPPORTED_FEATURES tiers (tier-parity gate); CUDA available, all docstring examples are +SKIP illustrative only",2/2
+geotiff,2026-07-01,3592,MEDIUM,5,"Cat5 MEDIUM: PAM categorical sidecar behavior undocumented -- open_geotiff merges category_names/category_colors from .aux.xml on all 4 read paths and to_geotiff writes the sidecar keyed on attrs alone (no kwarg); absent from both docstrings and attrs_contract.rst. Doc-only fix on deep-sweep-documentation-geotiff-2026-07-01: open_geotiff Notes paragraph, to_geotiff description paragraph, 2 attrs_contract rows; issue #3592. Cat1/2/3/4 clean: 25 signature-contract parity tests pass incl. tier table vs SUPPORTED_FEATURES; examples executed copy-paste (plain/cog/vrt/color_ramp) and gpu=True->cupy / chunks->dask / gpu+chunks->dask+cupy verified on CUDA host; both funcs in reference/geotiff.rst autosummary, no dupes. LOW noted not fixed: categorical RAT + color_ramp symbology sidecars have no SUPPORTED_FEATURES key or release-contract row (runtime-constant change, out of doc-only scope).",2/2
 interpolate-idw,2026-06-26,,MEDIUM,1;5,"Cat1 MEDIUM: idw (only public func in _idw.py) had Parameters/Returns/Raises but no Examples section. Cat5 MEDIUM: backend support undocumented (numpy/cupy/dask+numpy/dask+cupy via ArrayTypeFunctionMapping) + k-nearest GPU rejection (NotImplementedError) and NaN/inf input-point dropping undocumented. Doc-only fix on deep-sweep-documentation-interpolate-idw-2026-06-26: added backend-support line, Examples block (executed, output matches), Returns dtype/NaN/fill_value notes. Cat2 clean (all 9 params documented, defaults/types match signature). Cat3 n/a (no prior examples). Cat4 clean (idw listed in reference/interpolation.rst autosummary). CUDA available: ran numpy example; cupy/dask paths covered by test_interpolation.py (92 pass). repo issues disabled so no issue number.",1/1
 mahalanobis,2026-06-30,3579,MEDIUM,1;2;5,"Cat1 MEDIUM: mahalanobis (only public func) had Parameters/Returns but no Examples section (peers normalize.rescale/standardize do). Cat5 MEDIUM: NaN propagation (any non-finite band -> NaN pixel) and auto-stats N+1 all-finite requirement undocumented. Cat2 LOW: name param default not noted. Doc-only fix on deep-sweep-documentation-mahalanobis-2026-06-30: added Notes (NaN+backends) and a runnable Examples block (executed, output matches incl. NaN), set name default='mahalanobis'. Returns float64/shape and 4-backend claim verified against ArrayTypeFunctionMapping (accurate, left as-is). Cat3 n/a (no prior examples). Cat4 clean (listed in reference/utilities.rst). issue #3579. CUDA available: ran numpy example; cupy/dask covered by test_mahalanobis.py (24 pass).",1/1
 perlin,2026-06-23,,MEDIUM,2;5,"name param undocumented (Cat2) + float-dtype requirement/ValueError undocumented, no Raises section (Cat5); fixed in deep-sweep-documentation-perlin-2026-06-23; repo has issues disabled so no issue number; example runs and output matches; 1 public func (perlin) listed in reference/surface.rst",1/1
diff --git a/docs/source/user_guide/attrs_contract.rst b/docs/source/user_guide/attrs_contract.rst
index 6ddfc31d7..7cf0cc932 100644
--- a/docs/source/user_guide/attrs_contract.rst
+++ b/docs/source/user_guide/attrs_contract.rst
@@ -137,6 +137,23 @@ write.
 - str
 - Verbatim XML string of the ``GDAL_METADATA`` tag. Preferred over ``gdal_metadata`` by writers when both are present.
+ * - ``category_names``
+ - list of str
+ - Class labels for a categorical raster; the list index is the
+ pixel value. Unlike the other canonical keys, this one lives in
+ the GDAL PAM sidecar (``.aux.xml``), not in the TIFF
+ itself: ``to_geotiff`` writes the sidecar whenever the attr is
+ present on a string destination, and ``open_geotiff`` merges the
+ attr back from the sidecar on local string-path reads (all
+ backends). A missing, malformed, or foreign sidecar (e.g. GDAL's
+ statistics-only PAM) leaves the attr unset. See issue #3482.
+ * - ``category_colors``
+ - list of tuples
+ - One ``(r, g, b, a)`` tuple (components 0-255) per entry in
+ ``category_names``. Carried in the same PAM sidecar as a
+ thematic RAT with color columns; round-trips through
+ ``to_geotiff`` / ``open_geotiff`` alongside ``category_names``.
+ Ignored when its length disagrees with ``category_names``.
 * - ``x_resolution``
 - float
 - ``XResolution`` TIFF tag value.
diff --git a/xrspatial/geotiff/__init__.py b/xrspatial/geotiff/__init__.py
index 41e7c7fca..c3ad25a19 100644
--- a/xrspatial/geotiff/__init__.py
+++ b/xrspatial/geotiff/__init__.py
@@ -885,6 +885,19 @@ def open_geotiff(source: str | BinaryIO, *,
 non-finite, or fractional) leaves the source dtype, so ``dtype=`` works in that case.
+ Local string-path reads also pick up categorical metadata from a
+ GDAL PAM sidecar. When a ``.aux.xml`` file sits next to the
+ source, ``category_names`` (and ``category_colors``, when the
+ sidecar's thematic RAT carries color columns) are merged into the
+ result's attrs. This runs on every backend -- eager, dask, GPU, and
+ VRT reads alike. ``to_geotiff`` writes the same sidecar for a
+ DataArray carrying ``attrs['category_names']``, so a categorical
+ write -> read round-trip keeps class labels and colors. A missing,
+ malformed, or foreign sidecar (for example GDAL's statistics-only
+ PAM) is ignored, and file-like sources skip the lookup. See the
+ attrs contract page (``docs/source/user_guide/attrs_contract.rst``)
+ for the key definitions.
+
 Examples
 --------
 Safe VRT usage. Write a ``.vrt`` mosaic with ``to_geotiff`` and read
diff --git a/xrspatial/geotiff/_writers/eager.py b/xrspatial/geotiff/_writers/eager.py
index dfa148cbe..927033e1c 100644
--- a/xrspatial/geotiff/_writers/eager.py
+++ b/xrspatial/geotiff/_writers/eager.py
@@ -131,6 +131,21 @@ def to_geotiff(data: xr.DataArray | np.ndarray,
 GPU write uses nvCOMP batch compression (deflate/ZSTD) and keeps the array on device. Falls back to CPU if nvCOMP is not available.
+ Styling sidecars for QGIS/GDAL land next to the file, not inside
+ it. A DataArray carrying ``attrs['category_names']`` (for example a
+ categorical ``rasterize`` result) gets a PAM ``.aux.xml``
+ sidecar with the class labels and, when
+ ``attrs['category_colors']`` is present, an RGBA column per class.
+ No kwarg is involved; the attrs alone trigger the write, on every
+ dispatch path (eager, dask, GPU, and ``.vrt``). Continuous rasters
+ opt in through ``color_ramp``, which writes a QGIS ``.qml`` style
+ plus PAM statistics instead; the categorical sidecar wins when both
+ apply. File-like destinations skip sidecars (no path to anchor
+ them). ``open_geotiff`` merges the categorical attrs back on read,
+ so the labels round-trip; see
+ ``docs/source/user_guide/attrs_contract.rst`` for the key
+ definitions.
+
 Parameters
 ----------
 data : xr.DataArray or np.ndarray