Merged
63 commits
62aaa6c
docs(spec): gbx.viz + pyrx escape-hatches design
Jun 23, 2026
731f778
docs(plan): gbx.viz + pyrx escape-hatches implementation plan
Jun 23, 2026
2f60f2b
feat(viz): [viz] extra + package skeleton + dep guard
Jun 23, 2026
4c39104
feat(viz): raster decimation + percentile-stretch pipeline
Jun 23, 2026
a7f4e8f
feat(viz): plot_raster + plot_file public raster plotters
Jun 23, 2026
a70f02d
fix(viz): robust headless backend selection in _render
Jun 23, 2026
05c99bb
feat(viz): as_gdf + cells_as_gdf Spark→GeoDataFrame adapters
Jun 23, 2026
aa189be
feat(pyrx): tile_to_numpy + rst_apply escape-hatches
Jun 23, 2026
25b1357
ci(viz): run test/viz in the lightweight tier (heavy skips)
Jun 23, 2026
529740f
docs(viz): gbx.viz page + raster escape-hatches section
Jun 23, 2026
ba12726
docs(viz): correct dev-container-match note in light lock comment
Jun 23, 2026
57c2dee
docs(eo-series): migrate notebooks onto gbx.viz + pyrx escape-hatches…
Jun 23, 2026
d6ea882
docs(eo-series): executed notebooks on gbx.viz + cleaned config_nb
Jun 23, 2026
ab3d0e3
docs(eo-series): refresh notebooks doc page + README for gbx.viz migr…
Jun 23, 2026
0719b94
docs(spec): H3 cell rasterizer (rst_h3_rasterize_agg + rst_h3_gridspe…
Jun 23, 2026
b02a420
docs(plan): H3 cell rasterizer implementation plan
Jun 23, 2026
3ca10ee
feat(pyrx): H3 cell rasterize core (centroid burn + gridspec)
Jun 24, 2026
4f8b221
fix(pyrx): drop redundant H3 round-trip in cells_to_raster hot loop
Jun 24, 2026
c11c10c
feat(pyrx): gbx_h3_cell_bbox scalar + rst_h3_gridspec helper
Jun 24, 2026
064aebe
fix(pyrx): rst_h3_gridspec auto-pixel-size parity + empty-input guard
Jun 24, 2026
6d387a2
feat(pyrx): rst_h3_rasterize_agg grouped aggregator
Jun 24, 2026
ac4e57b
refactor(pyrx): drop dead value-None branch in rst_h3_rasterize_agg udf
Jun 24, 2026
e1efb7f
docs(function-info): register gbx_rst_h3_rasterize_agg + gbx_h3_cell_…
Jun 24, 2026
cd30d8e
fix(function-info): h3_cell_bbox SQL example args + rasterize_agg ord…
Jun 24, 2026
b843a8a
test(pyrx): H3 rasterize round-trip + partition validation
Jun 24, 2026
766cf54
test(pyrx): sound nearest-value round-trip oracle + partition pixel-c…
Jun 24, 2026
ed1ff4e
test(pyrx): FCC fixed-wireless H3 rasterize fixture + test
Jun 24, 2026
c5ca073
test(pyrx): multi-speed-tier FCC fixture for the rasterize-per-thresh…
Jun 24, 2026
2d9fa25
feat(rasterx): RST_H3_RasterizeAgg UDAF + gbx_h3_cell_bbox (heavy tier)
Jun 24, 2026
97f4250
test(rasterx): heavy<->light H3 rasterize parity (JAR-gated)
Jun 24, 2026
5100af7
feat(rasterx): heavy Python wrappers for rst_h3_rasterize_agg + gbx_h…
Jun 24, 2026
c53456d
style(test): black-normalize _LIGHT_TEST_DIRS list in conftest
Jun 24, 2026
c499591
docs(rasterx): rst_h3_rasterize_agg + gbx_h3_cell_bbox + rst_h3_gridspec
Jun 24, 2026
35f569f
docs(notebooks): H3 cell rasterize + stacking demo (DEM isobands)
Jun 24, 2026
af8e89f
docs(rasterx): correct rst_h3_gridspec API doc + reconcile spec snapp…
Jun 24, 2026
30897c1
docs(rasterx): diagram + release-notes + tile-output/table for H3 ras…
Jun 24, 2026
ed0c2d9
docs(notebooks): rename to h3_rasterize_isobands + auto-stage DEM cell
Jun 24, 2026
3899ac1
fix(test): guard h3/rasterio/pandas imports in heavy<->light parity test
Jun 24, 2026
888e2d0
feat(viz): grid_as_gdf canvas helper + cells_as_gdf dissolve_by
Jun 24, 2026
bea2c86
docs(notebooks): render DEM/polyfill/grid via gbx.viz in isobands demo
Jun 24, 2026
a47713c
docs(notebooks): switch isobands demo to San Francisco DEM (N37W123)
Jun 24, 2026
da407fe
docs(notebooks): clear stale NYC outputs from SF isobands demo
Jun 24, 2026
e1cf35a
docs(notebooks): SF-specific narrative + distributed-scaling notes fo…
Jun 24, 2026
04f4f14
feat(viz): plot_raster composite="depth" coverage-depth mode
Jun 24, 2026
3c8b37d
docs(notebooks): finer pixel size + coverage-depth stack render (SF i…
Jun 24, 2026
5924846
fix(viz): render constant single-band masks (presence) with visible b…
Jun 24, 2026
f33c25a
docs(notebooks): materialize band tiles to session temp table
Jun 24, 2026
509e2f0
docs(notebooks): inspect highest-two band coverage shapes
Jun 24, 2026
854ff31
docs(notebooks): inspect mid-coverage bands (clear footprint shape)
Jun 24, 2026
abe0491
fix(viz): draw single-band masks via imshow, not rasterio show()
Jun 24, 2026
8af9563
style(pyrx): clear flake8 dead code + docker-black reformat
Jun 24, 2026
491c2cf
feat(viz): plot_mask_layers overlay + use it in isobands cell 8
Jun 24, 2026
fe8433d
docs(notebooks): match polyfill markdown to undissolved render + viz …
Jun 24, 2026
3f66060
docs: H3 rasterize example page + viz functions + release notes
Jun 24, 2026
33c954f
docs(raster-functions): link the H3 rasterize notebook from rst_h3_ra…
Jun 24, 2026
34f3753
bench: add rst_h3_rasterize_agg to the cluster benchmark (both tiers)
Jun 24, 2026
f245e5b
bench: heavy rst_h3_rasterize_agg returns a tile struct, do not wrap …
Jun 24, 2026
664118a
fix(pyrx): rst_h3_rasterize_agg null value column burns presence 1.0,…
Jun 24, 2026
f425477
docs(benchmarking): add rst_h3_rasterize_agg spark-path row (heavy 1.…
Jun 24, 2026
e0e5ed6
docs(performance): classify rst_h3_rasterize_agg + the custom-grid fa…
Jun 24, 2026
032eabb
docs(images): regenerate landscape function-categories PNG (108 fns, H3)
Jun 24, 2026
0a76980
docs(examples): hero diagrams at the top of each example notebook
Jun 24, 2026
5547b38
docs(images): refresh eo-series diagrams to the lightweight tier
Jun 24, 2026
6 changes: 3 additions & 3 deletions .github/actions/pyrx_build/action.yml
@@ -63,6 +63,6 @@ runs:
# The lightweight tier is exercised ONLY here (the heavy phase skips these
# dirs via test/conftest.py collect_ignore). Every light test dir must be
# listed: pyrx, ds, pyvx, pygx (light GridX), pmtiles_light (light
# pmtiles_agg), stac (light STAC client, [stac] extra). See
# test/conftest.py for the maintained condition.
pytest test/pyrx test/ds test/pyvx test/pygx test/pmtiles_light test/stac -m "not integration" -v
# pmtiles_agg), stac (light STAC client, [stac] extra),
# viz (gbx.viz, [viz] extra). See test/conftest.py for the maintained condition.
pytest test/pyrx test/ds test/pyvx test/pygx test/pmtiles_light test/stac test/viz -m "not integration" -v
3 changes: 3 additions & 0 deletions docs/docs/api/benchmarking.mdx
@@ -545,6 +545,7 @@ Pure-core measures the algorithm; **spark-path measures the job**. Each function
| `rst_h3_rastertogridmin` | 56.51 | 79.28 | 0.71× | timing-only |
| `rst_h3_rastertogridcount` | 55.34 | 79.39 | 0.70× | timing-only |
| `rst_fromcontent` | 4.03 | 5.90 | 0.68× | timing-only |
| `rst_h3_rasterize_agg`¹ | 1.50 | 2.26 | 0.66× | exact |
| `rst_asformat` | 4.00 | 6.04 | 0.66× | timing-only |
| `rst_frombands` | 2.31 | 4.94 | 0.47× | timing-only |
| `rst_cog_convert` | 4.23 | 11.13 | 0.38× | timing-only |
@@ -556,6 +557,8 @@ Pure-core measures the algorithm; **spark-path measures the job**. Each function
| `rst_gridfrompoints_agg` | 2.66 | 33.75 | 0.08× | within_tol |
| `rst_proximity` | 0.52 | 8.12 | 0.06× | timing-only |

¹ `rst_h3_rasterize_agg` is a cell→raster aggregator with a different workload than the 1024² rows: each of the 1,000 groups burns a fixed 331-cell H3 set (resolution 9, presence mask) onto one small 39×24 canvas, so its per-tile figure reflects that small output rather than a 1024² tile. Read the **cross-tier ratio** (heavyweight ~1.5× faster) and **exact** parity as the comparable result, not the absolute ms against the 1024² rows. Measured on DBR 18.x at the same fixed 20-worker cluster (the 1024² rows above are DBR 17.3 LTS).
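
The arithmetic behind the footnote can be checked with a short sketch. It assumes the ratio column is the first timing divided by the second (which matches the other rows), and that the first timing is the heavyweight tier. The `hex_disk_size` helper is an illustrative addition: the footnote does not say how the 331-cell set was built, but 331 is the size of a hexagonal disk of radius 10 when no pentagons are involved.

```python
# Figures copied from the rst_h3_rasterize_agg row (ms per tile).
# Assumption: first column is heavyweight, second is lightweight.
heavy_ms = 1.50
light_ms = 2.26


def cross_tier_ratio(heavy: float, light: float) -> float:
    """Heavy/light time ratio, assuming that is how the ratio column is derived."""
    return heavy / light


def hex_disk_size(k: int) -> int:
    """Number of cells in a hexagonal disk of radius k (no pentagons): 1 + 3k(k+1)."""
    return 1 + 3 * k * (k + 1)


ratio = cross_tier_ratio(heavy_ms, light_ms)
speedup = light_ms / heavy_ms  # how many times faster the heavyweight tier is

print(f"ratio {ratio:.2f}x, heavyweight about {speedup:.1f}x faster")
print("cells in a radius-10 disk:", hex_disk_size(10))
```

Reading the ratio rather than the absolute milliseconds keeps the comparison meaningful even though this row uses a much smaller canvas than the 1024² rows.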

#### Repartition strategy

The spark-path numbers above are wall-clock for a whole distributed job, so they're sensitive to how evenly the 1,000 tiles spread across the cluster's task slots. The benchmark harness tunes the partitioning so the comparison reflects the engines, not a straggler tail:
15 changes: 15 additions & 0 deletions docs/docs/api/performance.mdx
@@ -138,6 +138,7 @@ These functions are Arrow-backed `pandas_udf(BinaryType())` aggregators used ins
| `gbx_rst_derivedband_agg` | `rst_derivedband_agg` | Apply a user-supplied Python expression across a tile group |
| `gbx_rst_gridfrompoints_agg` | `rst_gridfrompoints_agg` | Interpolate a raster from point observations |
| `gbx_rst_dtmfromgeoms_agg` | `rst_dtmfromgeoms_agg` | Build a DTM raster from elevation geometries |
| `gbx_rst_h3_rasterize_agg` | `rst_h3_rasterize_agg` | Burn a group of H3 cells onto one raster (presence mask, or per-cell value) |

**VectorX (pyvx)**

@@ -327,6 +328,20 @@ Same per-shape split as quadbin. **Scalar/bounded-output** functions (`pointasce
| `gbx_bng_geomkloop` | `bng_geomkloop` | geometry → array of k-loop cell IDs |
| `gbx_bng_tessellate` | `bng_tessellate` | geometry → array of `STRUCT<cellid, core, chip>` |

**GridX (pygx) — custom grid**

Same per-shape split as quadbin and BNG. **Scalar/bounded-output** functions (`grid`, `pointascell`, `cellaswkb`, `cellaswkt`, `centroid`) register as **`pandas_udf`s** — a pure-Python grid codec (extent + split factor + root cell size → integer row/col cell index, and the inverse cell→polygon), exact-parity with the heavyweight tier. **Array-returning** functions (`polyfill`, `kring`) register as **plain `@udf`s** so variable-length outputs stream row-by-row rather than buffering a whole Arrow batch. All stay Serverless-safe (`spark.udf.register` only, no Spark-config or JVM access). Custom grids index against an arbitrary projected grid (extent + resolution + SRID) — for a project- or nation-specific tiling that H3, quadbin, or BNG cells don't match.

| SQL name | Python name | Operation |
|---|---|---|
| `gbx_custom_grid` | `custom_grid` | extent + split factor + root cell size + SRID → grid spec |
| `gbx_custom_pointascell` | `custom_pointascell` | point → cell ID on the custom grid |
| `gbx_custom_cellaswkb` | `custom_cellaswkb` | cell ID → footprint polygon (WKB) |
| `gbx_custom_cellaswkt` | `custom_cellaswkt` | cell ID → footprint polygon (WKT) |
| `gbx_custom_centroid` | `custom_centroid` | cell ID → centroid point |
| `gbx_custom_polyfill` | `custom_polyfill` | geometry → array of covering cell IDs |
| `gbx_custom_kring` | `custom_kring` | cell ID → array of cells within distance `k` |
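
A minimal sketch of the kind of grid codec described above (extent + split factor + root cell size → integer row/col index, and the inverse cell → footprint). Everything here is illustrative: the `GridSpec` class, the function names, and the row-major cell ID packing are assumptions, not the actual `pygx` encoding.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GridSpec:
    """Hypothetical grid description; field names are illustrative only."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    root_cell_size: float  # cell edge length at resolution 0, in CRS units
    split_factor: int      # each resolution step divides the edge by this factor

    def cell_size(self, resolution: int) -> float:
        return self.root_cell_size / (self.split_factor ** resolution)

    def n_cols(self, resolution: int) -> int:
        return math.ceil((self.xmax - self.xmin) / self.cell_size(resolution))


def point_as_cell(spec: GridSpec, x: float, y: float, resolution: int) -> int:
    """Point -> cell ID, packing (row, col) row-major (an assumed encoding)."""
    size = spec.cell_size(resolution)
    col = math.floor((x - spec.xmin) / size)
    row = math.floor((y - spec.ymin) / size)
    return row * spec.n_cols(resolution) + col


def cell_as_bbox(spec: GridSpec, cell_id: int, resolution: int):
    """Cell ID -> (xmin, ymin, xmax, ymax) of the cell footprint."""
    size = spec.cell_size(resolution)
    row, col = divmod(cell_id, spec.n_cols(resolution))
    x0 = spec.xmin + col * size
    y0 = spec.ymin + row * size
    return (x0, y0, x0 + size, y0 + size)


spec = GridSpec(0.0, 0.0, 1000.0, 1000.0, root_cell_size=100.0, split_factor=2)
cell = point_as_cell(spec, 125.0, 260.0, resolution=1)
print(cell, cell_as_bbox(spec, cell, resolution=1))
```

Because both directions are plain integer and float arithmetic, a codec of this shape vectorizes naturally over an Arrow batch, which is consistent with the scalar functions being registered as `pandas_udf`s.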

</TabItem>

<TabItem value="cores" label="Vectorized cores">
152 changes: 148 additions & 4 deletions docs/docs/api/raster-functions.mdx
@@ -46,7 +46,7 @@ RasterX exposes 87+ SQL functions (registered as `gbx_rst_*`; available in Pytho
![RasterX function categories — Constructors, Accessors, Aggregators, Generators, Operations, H3 Grid](../../../resources/images/rasterx-function-categories.png)

- **Accessor Functions**: Read raster properties and metadata (bounds, dimensions, CRS, bands, pixel size, georeference, format, type, NoData, subdatasets, summary, etc.)
- **Aggregator Functions**: Combine or merge rasters in group-by (combineavg_agg, derivedband_agg, merge_agg)
- **Aggregator Functions**: Combine or merge rasters in group-by (combineavg_agg, derivedband_agg, merge_agg, h3_rasterize_agg)
- **Constructor Functions**: Create or load rasters from paths, binary content, or bands
- **Generator Functions**: Produce multiple tiles or bands (h3_tessellate, maketiles, retile, separatebands, tooverlappingtiles)
- **Grid Functions (H3)**: Aggregate raster values to H3 cells (rastertogrid avg/count/max/min/median)
@@ -542,7 +542,7 @@ Powered by **rasterio**.

## Aggregator Functions

Combine or merge rasters in group-by (6 total).
Combine or merge rasters in group-by (7 total).

### rst_combineavg_agg

@@ -758,6 +758,52 @@ Streaming IDW-interpolation aggregator that accepts one point geometry and one s

<CodeFromTest language="sql" source="docs/tests/python/api/rasterx_functions_sql.py" testFile="docs/tests/python/api/test_rasterx_functions_sql.py" functionName="rst_gridfrompoints_agg_sql_example" outputConstant="rst_gridfrompoints_agg_sql_example_output" code={rasterxSqlCode} />

### rst_h3_rasterize_agg

<Tier both/> <Impl groupedAgg/>

:::note Lightweight tier (pyrx)
Powered by **rasterio** + **h3**. Used as a grouped aggregate, `groupBy(...).agg(rx.rst_h3_rasterize_agg("cellid", "value", ...))` burns each H3 cell's centroid pixel (or its spatial-envelope pixels with `mode='spatial_envelope'`) into one raster tile per group. When `value` is omitted or null, all burned pixels carry `1.0` (presence mask). The extent and pixel size are derived automatically from the H3 resolution unless explicit bounds are supplied; `kring_pad` (default 1) expands the canvas by that many rings to avoid clipping edge cells.
:::

:::warning Lightweight SQL returns BINARY (not the tile struct)
Heavyweight `gbx_rst_h3_rasterize_agg` returns a tile **`STRUCT<cellid, raster, metadata>`**, while the lightweight **SQL** function returns **`BINARY`** (the raster bytes), because a PySpark grouped-aggregate `pandas_udf` cannot return a `StructType`. The lightweight **Python** wrapper `rx.rst_h3_rasterize_agg(...)` composes the aggregate with a tile-wrapping step and returns the full tile struct, so only raw SQL differs. To rebuild the tile-struct equivalent in SQL, select the group key as `cellid` and wrap the BINARY with `gbx_rst_fromcontent`:

```sql
-- Lightweight SQL: rebuild the (cellid, raster) the heavyweight struct would carry
SELECT
  region_id AS cellid,
  gbx_rst_fromcontent(
    gbx_rst_h3_rasterize_agg(
      cellid, burn_value, 4326,
      null,                    -- pixel_size
      null, null, null, null,  -- xmin, ymin, xmax, ymax
      null, null,              -- width, height
      'centroids', 1           -- mode, kring_pad
    ),
    'GTiff'
  ) AS tile
FROM h3_cell_values
GROUP BY region_id
```
:::

Streaming aggregator that burns H3 cell centroid pixels (or spatial-envelope pixels) into one raster tile per group. This is the **inverse** of [`rst_h3_rastertogrid*`](#rst_h3_rastertogridavg): where those functions reduce raster pixels to per-cell statistics, `rst_h3_rasterize_agg` reconstructs a raster from per-cell values. Use [`rst_frombands_agg`](#rst_frombands_agg) to stack per-threshold rasters (each produced by one `rst_h3_rasterize_agg` call) into a single multi-band output.

:::tip Worked example
The [H3 Rasterize notebook](../notebooks/h3-rasterize) walks this through end to end on a San Francisco Bay Area DEM: elevation isobands → H3 polyfill → a shared canvas from [`rst_h3_gridspec`](#h3-grid) → per-band `rst_h3_rasterize_agg` → multi-band stack via [`rst_frombands_agg`](#rst_frombands_agg), visualized with the [`gbx.viz`](./viz) helpers. The same pattern maps directly to a telco multi-threshold signal-coverage stack.
:::

**Signature:** `rst_h3_rasterize_agg(cellid: Column, value: Column, srid: Column, pixel_size: Column, xmin: Column, ymin: Column, xmax: Column, ymax: Column, width: Column, height: Column, mode: Column, kring_pad: Column): Column`

**Parameters:**
- `cellid` — H3 cell ID (BIGINT or STRING) to burn (one per row)
- `value` — numeric burn value; pass `null` for a presence mask (all pixels → `1.0`)
- `srid` — EPSG code for the output CRS; defaults to `4326` (WGS 84)
- `pixel_size` — ground resolution in CRS units (derives from H3 resolution when `null`)
- `xmin/ymin/xmax/ymax` — output canvas extent; auto-computed from cell bounds + `kring_pad` when `null`
- `width/height` — output raster dimensions in pixels; auto-derived from extent + pixel_size when `null`
- `mode` — `'centroids'` (default, burns the cell-centroid pixel only) or `'spatial_envelope'` (burns all pixels inside the hexagon envelope)
- `kring_pad` — ring count by which to expand the auto-computed canvas (default `1`)

**SQL:**

<CodeFromTest language="sql" source="docs/tests/python/api/rasterx_functions_sql.py" testFile="docs/tests/python/api/test_rasterx_functions_sql.py" functionName="rst_h3_rasterize_agg_sql_example" outputConstant="rst_h3_rasterize_agg_sql_example_output" code={rasterxSqlCode} />
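
To make the `'centroids'` burn concrete, here is a minimal pure-Python sketch of the idea: map each cell centroid to a pixel on a north-up canvas and write either the value or `1.0` for presence. It is not the `pyrx` implementation. The `burn_centroids` name, the `NODATA` fill value, the per-row handling of `None`, and the last-write-wins behavior when two centroids share a pixel are all assumptions made for illustration.

```python
import math

NODATA = -9999.0  # hypothetical fill value; the real NoData value is not stated here


def burn_centroids(cells, xmin, ymax, pixel_size, width, height):
    """Burn (x, y, value) centroids onto a height x width grid (row 0 = top)."""
    grid = [[NODATA] * width for _ in range(height)]
    for x, y, value in cells:
        col = math.floor((x - xmin) / pixel_size)
        row = math.floor((ymax - y) / pixel_size)  # y decreases as row increases
        if 0 <= row < height and 0 <= col < width:
            # A missing value burns 1.0, giving a presence mask.
            grid[row][col] = 1.0 if value is None else float(value)
    return grid


# Centroid coordinates are made up; in practice they would come from the H3 cell IDs.
cells = [(0.5, 3.5, None), (2.2, 1.7, 7), (9.0, 9.0, 3)]  # last one is off-canvas
grid = burn_centroids(cells, xmin=0.0, ymax=4.0, pixel_size=1.0, width=4, height=4)

for row in grid:
    print(row)
```

Passing explicit `xmin/ymin/xmax/ymax` and `width/height` to every call is what lets several per-threshold rasters land on the same pixel grid before stacking.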

---

## Constructor Functions
@@ -965,9 +1011,9 @@ Powered by **rasterio**. Streams one tile row per overlapping region via streami

---

## Grid Functions (H3)
## Grid Functions (H3) {#h3-grid}

Aggregate raster values to H3 grid cells (5 total).
Aggregate raster values to H3 grid cells, and utility functions for H3-based canvas setup (7 total).

### rst_h3_rastertogridavg

@@ -1039,6 +1085,73 @@ Powered by **rasterio** + **h3**. Returns an `ARRAY` (one element per band) of `

<CodeFromTest language="sql" source="docs/tests/python/api/rasterx_functions_sql.py" testFile="docs/tests/python/api/test_rasterx_functions_sql.py" functionName="rst_h3_rastertogridmedian_sql_example" outputConstant="rst_h3_rastertogridmedian_sql_example_output" code={rasterxSqlCode} />

### gbx_h3_cell_bbox

<Tier both/>

:::note Lightweight tier (pyrx)
Powered by **h3**. Returns a `STRUCT<xmin DOUBLE, ymin DOUBLE, xmax DOUBLE, ymax DOUBLE>` bounding box for the given H3 cell in the requested `srid`. In `'centroids'` mode the box tightly wraps the centroid point; in `'spatial_envelope'` mode it wraps the full hexagon outline. When `kring_pad > 0` the k-ring of that radius is computed first and the bounding box covers all cells in the ring. The lightweight SQL function requires all four arguments explicitly; the Python API (`rx.h3_cell_bbox(cellid, srid, mode, kring_pad)`) honors the same defaults as the Scala implementation.
:::

Scalar function — returns the bounding box `STRUCT<xmin, ymin, xmax, ymax>` for one H3 cell in the given CRS. Use this to drive the `xmin/ymin/xmax/ymax` and grid-size parameters of [`rst_h3_rasterize_agg`](#rst_h3_rasterize_agg) when you need a consistent per-cell canvas, or to clip and inspect cell extents in downstream queries.

**Signature:** `h3_cell_bbox(cellid: Column, srid: Column, mode: Column, kring_pad: Column): Column`

**Parameters:**
- `cellid` — H3 cell ID (BIGINT or STRING)
- `srid` — EPSG code; `4326` for WGS 84 lon/lat
- `mode` — `'centroids'` (centroid point envelope) or `'spatial_envelope'` (hexagon boundary envelope)
- `kring_pad` — expand by this many k-rings before computing the bounding box; `0` = no expansion

**Returns:** `STRUCT<xmin DOUBLE, ymin DOUBLE, xmax DOUBLE, ymax DOUBLE>`

**SQL:**

<CodeFromTest language="sql" source="docs/tests/python/api/rasterx_functions_sql.py" testFile="docs/tests/python/api/test_rasterx_functions_sql.py" functionName="h3_cell_bbox_sql_example" outputConstant="h3_cell_bbox_sql_example_output" code={rasterxSqlCode} />
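
The three behaviors described above (centroid envelope, hexagon envelope, and k-ring expansion) all reduce to bounding boxes over point sets. The sketch below shows that reduction in plain Python. The helper names and all coordinates are hypothetical; in real use the centroid and boundary vertices would come from the H3 library for each cell in the ring.

```python
def bbox_of_points(points):
    """(xmin, ymin, xmax, ymax) of an iterable of (x, y) points."""
    pts = list(points)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def union_bbox(boxes):
    """Smallest box covering every (xmin, ymin, xmax, ymax) box in the input."""
    bs = list(boxes)
    return (
        min(b[0] for b in bs),
        min(b[1] for b in bs),
        max(b[2] for b in bs),
        max(b[3] for b in bs),
    )


# 'centroids' mode: the box collapses onto the centroid point (made-up coordinates).
centroid_box = bbox_of_points([(-122.41, 37.765)])

# 'spatial_envelope' mode: the box wraps the cell outline (made-up vertices).
outline = [(-122.42, 37.77), (-122.41, 37.78), (-122.40, 37.77),
           (-122.40, 37.76), (-122.41, 37.75), (-122.42, 37.76)]
envelope_box = bbox_of_points(outline)

# kring_pad > 0: union the boxes of every cell in the ring (one neighbor shown).
neighbor_box = (-122.40, 37.76, -122.38, 37.79)
padded_box = union_bbox([envelope_box, neighbor_box])

print(centroid_box, envelope_box, padded_box)
```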

### rst_h3_gridspec (Python / DataFrame helper)

:::note Lightweight (pyrx) Python helper only
`rst_h3_gridspec` is **not registered as a SQL function** and is **not available in the heavyweight tier**. It is a pure-Python / PySpark DataFrame helper in the `pyrx` package.

For the heavyweight tier, compose the equivalent shared canvas using the registered scalar `gbx_h3_cell_bbox(cellid, srid, mode, kring_pad)` with native Spark `min`/`max` aggregates and the same floor/ceil snap arithmetic:
```sql
SELECT min(cell_bbox.xmin) AS xmin, min(cell_bbox.ymin) AS ymin,
       max(cell_bbox.xmax) AS xmax, max(cell_bbox.ymax) AS ymax
FROM (SELECT gbx_h3_cell_bbox(cellid, 4326, 'centroids', 1) AS cell_bbox FROM cells) AS b
```
:::

`rst_h3_gridspec` computes the shared, snapped canvas (extent + pixel size) that a group of H3 cells should use so that per-threshold rasters align on a common grid and can be stacked via `rst_frombands_agg` or mosaicked via `rst_merge_agg`.

**Signature:**
```python
rx.rst_h3_gridspec(df, cell_col="cellid", *group_cols,
                   srid=4326, pixel_size=None,
                   mode="centroids", kring_pad=1)
```

**Typical multi-threshold workflow:**

1. Call `rx.rst_h3_gridspec(df, cell_col="cellid", srid=4326, mode='centroids', kring_pad=1)` once on the distinct cell set. It returns the grouped DataFrame with a `grid` struct column — one row per group containing the shared canvas.
2. For each threshold band, run `rst_h3_rasterize_agg` with those fixed bounds — all output tiles share the same origin and pixel grid.
3. Stack aligned bands with `rst_frombands_agg` (ordered by `band_index`), or mosaic per-cell tiles with `rst_merge_agg`.

**Parameters:**
- `df` — input Spark DataFrame containing H3 cell IDs
- `cell_col` — column name holding H3 cell IDs (integer or string); default `"cellid"`
- `*group_cols` — additional grouping columns (e.g. a transmitter ID, year, month); one grid spec row is produced per group
- `srid` — EPSG code for the output CRS; `4326` for WGS 84 (default)
- `pixel_size` — ground resolution in CRS units; `None` = auto-derived from the H3 resolution via an edge-length heuristic (default)
- `mode` — `'centroids'` (default) or `'spatial_envelope'`
- `kring_pad` — k-ring expansion applied per cell before computing its bounding box (default `1`)

**Returns:** the grouped DataFrame with a `grid` column of type:
```
STRUCT<xmin DOUBLE, ymin DOUBLE, xmax DOUBLE, ymax DOUBLE,
pixel_size DOUBLE, width INT, height INT, srid INT>
```
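
The floor/ceil snap arithmetic mentioned in the heavyweight note can be sketched as follows. This is an illustration under stated assumptions, not the `pyrx` source: it assumes the extent is snapped outward to multiples of `pixel_size` measured from the CRS origin, and the `snap_gridspec` name is hypothetical. The returned keys mirror the `grid` struct fields listed above.

```python
import math


def snap_gridspec(xmin, ymin, xmax, ymax, pixel_size, srid=4326):
    """Snap a raw extent outward to the pixel grid and derive width/height."""
    sx0 = math.floor(xmin / pixel_size) * pixel_size  # min edges snap down
    sy0 = math.floor(ymin / pixel_size) * pixel_size
    sx1 = math.ceil(xmax / pixel_size) * pixel_size   # max edges snap up
    sy1 = math.ceil(ymax / pixel_size) * pixel_size
    return {
        "xmin": sx0,
        "ymin": sy0,
        "xmax": sx1,
        "ymax": sy1,
        "pixel_size": pixel_size,
        "width": round((sx1 - sx0) / pixel_size),
        "height": round((sy1 - sy0) / pixel_size),
        "srid": srid,
    }


# Raw extent as it might come out of min/max over per-cell bounding boxes.
spec = snap_gridspec(-122.6, 37.3, -122.1, 37.9, pixel_size=0.25)
print(spec)
```

Snapping outward means the canvas never clips a cell, and any two groups that use the same `pixel_size` end up on grids whose pixel edges coincide, which is what makes the later stack or mosaic step line up.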

---

## Grid Functions (quadbin)
@@ -1898,6 +2011,37 @@ GDAL also accepts `GDAL_VRT_ENABLE_PYTHON=TRUSTED_MODULES` plus a `GDAL_VRT_PYTH

---

## Escape hatches

When a raster operation isn't in the `rst_*` surface, drop down to rasterio/NumPy per tile (lightweight tier). These are **Python-only** helpers on `databricks.labs.gbx.pyrx.functions` — not SQL functions, and not registered as `gbx_rst_*`.

### tile_to_numpy

```python
from databricks.labs.gbx.pyrx.functions import tile_to_numpy

arr = tile_to_numpy(tile_or_bytes) # -> np.ndarray, shape (bands, rows, cols)
```

Read a tile's raster into a NumPy array (all bands). Accepts either a tile struct (a `Row`/dict with a `raster` field) or raw `bytes`. Useful when you've collected a tile to the driver, or inside your own UDF.

### rst_apply

```python
from pyspark.sql.types import DoubleType
from databricks.labs.gbx.pyrx.functions import rst_apply

df.select(
    rst_apply("tile", lambda ds: float(ds.read(1).mean()), returnType=DoubleType()).alias("band1_mean")
)
```

Apply your own function to each tile's open rasterio dataset, returning one scalar per row. `fn` receives a rasterio `DatasetReader`; the return value must match `returnType` (default `DoubleType()`; any Spark `DataType`). A null/empty tile yields null. This is the "GeoBrix doesn't have function X — run my own rasterio per tile" path; it returns a scalar (raster→raster transforms are the domain of `rst_mapalgebra` / `rst_derivedband`).

For rendering tiles and building maps from results, see the [Visualization (`gbx.viz`)](./viz) page.

---

## Next Steps

- [GridX Function Reference](./gridx-functions)