Merged
4 changes: 2 additions & 2 deletions docs/01-design-overview.md
@@ -48,7 +48,7 @@ numpy arrays + metadata (bands, CRS, bounds, timestamps)
CoverageInput (intermediate representation)
|
v
RasterCovJSONModeler (conversion logic)
Modeler functions (to_coverage; conversion logic)
|
v
covjson-pydantic Models (Coverage, Domain, Range, Parameter...)
@@ -60,7 +60,7 @@ JSON Response (application/prs.coverage+json)
### Key Design Decisions

- **TiTiler extension**: Implemented as a FastAPI router that plugs into TiTiler, not a standalone service
- **Data-agnostic modeler**: A `RasterCovJSONModeler` converts an intermediate `CoverageInput` to CovJSON, decoupled from specific readers
- **Data-agnostic modeler**: Stateless module-level functions (`to_coverage`) convert an intermediate `CoverageInput` to CovJSON, decoupled from specific readers
- **covjson-pydantic**: Uses the [KNMI covjson-pydantic](https://github.com/KNMI/covjson-pydantic) library (Pydantic v2) for spec-compliant model serialization
- **Domain type auto-detection**: Geometry type determines CovJSON domain type (Grid for raster, Point/PointSeries for point queries, Trajectory for transects, etc.)
- **Grid-native**: Raster data naturally maps to CovJSON Grid domains, the most common case
102 changes: 76 additions & 26 deletions docs/04-modeler-converter-design.md
@@ -5,7 +5,7 @@
The **Modeler** is the layer that converts raster data (rio-tiler ImageData, numpy arrays, STAC metadata) into CovJSON model objects (via covjson-pydantic). It follows a clean separation of concerns:

```plain
rio-tiler data --> CoverageInput (intermediate) --> RasterCovJSONModeler --> covjson-pydantic Coverage
rio-tiler data --> CoverageInput (intermediate) --> modeler (to_coverage) --> covjson-pydantic Coverage
```

## 2. Conversion Flow
@@ -72,7 +72,7 @@ class CoverageInput:
    crs: rasterio.CRS
    geometry: BaseGeometry | None = None  # For non-grid domains

    # Band/variable metadata (may be empty; modeler synthesizes identities)
    # Band/variable metadata; resolved to one entry per band at construction
    bands: tuple[BandInfo, ...] = ()

    # Temporal info (optional)
@@ -89,6 +89,26 @@ Domain-dependent consistency (geometry vs. timestamps vs. array shape) is
deferred to the modeler -- see Section 7 for the planned evolution that removes
this split.

`__post_init__` also **resolves `bands` once, at construction**: when `bands` is
empty it synthesizes one `BandInfo` per leading-axis band (`b1, b2, ...`,
matching rio-tiler's default band naming), assigning through
`object.__setattr__` because the dataclass is frozen. So `bands` is never empty
afterwards and every consumer reads a populated tuple -- the default-band naming
convention lives here, in one place, rather than being duplicated in the modeler.

This intentionally erases the distinction between "the caller supplied no band
metadata" and "the caller supplied bands". That is consistent with
`CoverageInput`'s role as the *post-resolution* representation: all precedence
and enrichment (explicit `bands` > per-attribute kwargs > the reader's own
`band_names`) is resolved upstream in the converters (Section 5), which still
hold the raw reader info. The realistic consumers of "were bands supplied?" --
e.g., a strict mode that rejects placeholder parameters, or metadata enrichment
from a STAC item's `eo:bands` -- all live at that converter/endpoint layer,
where the signal is still available; none need it on `CoverageInput`. If such a
need ever does reach this layer, the clean fix is an explicit converter flag or
a `bands_supplied` field, not reconstructing intent from a resolved
`CoverageInput`.
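
The resolve-at-construction pattern described above can be sketched in isolation. This is a minimal illustration, not the project's code: `BandSketch` and `InputSketch` are hypothetical stand-ins for `BandInfo` and `CoverageInput`, and the array is reduced to a plain band count so the example needs only the standard library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BandSketch:
    """Hypothetical stand-in for BandInfo; only the name matters here."""

    name: str


@dataclass(frozen=True)
class InputSketch:
    """Hypothetical stand-in for CoverageInput, reduced to a band count."""

    n_bands: int
    bands: tuple[BandSketch, ...] = ()

    def __post_init__(self) -> None:
        if self.bands and len(self.bands) != self.n_bands:
            raise ValueError("band count does not match data")
        if not self.bands:
            # Frozen dataclass: plain attribute assignment raises
            # FrozenInstanceError, so assign through object.__setattr__.
            resolved = tuple(BandSketch(f"b{i + 1}") for i in range(self.n_bands))
            object.__setattr__(self, "bands", resolved)


synthesized = InputSketch(n_bands=3)
supplied = InputSketch(n_bands=1, bands=(BandSketch("red"),))
band_names = [band.name for band in synthesized.bands]
```

After construction both instances expose a populated `bands` tuple, and nothing on the instance records which of the two paths produced it, which is the erased distinction the paragraph above discusses.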

### 3.1 Single-array, data-cube constraint

`CoverageInput.data` is a single masked array with a leading band axis:
@@ -117,33 +137,63 @@ its own array and dtype) rather than a single `(bands, ...)` array. Defer
this until a concrete endpoint requires it; the Section 7 union refactor is
independent and addresses domain shape, not band heterogeneity.

## 4. RasterCovJSONModeler
## 4. Modeler

The modeler is a set of **stateless module-level functions** in `modeler.py`,
not a class. The conversion holds no state and depends only on the neutral
`CoverageInput`, so a class would add ceremony without benefit. (If
configuration ever needs to be threaded through -- e.g., a `TiledNdArray` size
threshold in Story 12 -- introduce a function argument or a small frozen config
object rather than reviving a stateful class.)
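
As a rough sketch of the "small frozen config object" option, assuming a threshold-style setting: `ModelerConfig`, its `tiled_threshold` field, and `choose_encoding` are all hypothetical names invented for illustration and do not exist in `modeler.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelerConfig:
    """Hypothetical config object; the field name is illustrative only."""

    tiled_threshold: int = 1_000_000


def choose_encoding(n_values: int, config: ModelerConfig = ModelerConfig()) -> str:
    """Pick a range encoding from the value count; remains a pure function."""
    return "TiledNdArray" if n_values > config.tiled_threshold else "NdArray"


default_choice = choose_encoding(10)
custom_choice = choose_encoding(10, ModelerConfig(tiled_threshold=5))
```

Because the config is immutable and passed as an argument, the function stays stateless and callers can vary behavior per call without any shared instance.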

The public entry point is `to_coverage`:

```python
class RasterCovJSONModeler:
    """Converts raster data to CovJSON Coverage objects."""

    def to_coverage(self, input: CoverageInput) -> Coverage:
        domain = self._create_domain(input)
        parameters = self._create_parameters(input)
        ranges = self._create_ranges(input, domain)
        return Coverage(domain=domain, parameters=parameters, ranges=ranges)

    def to_coverage_collection(self, inputs: list[CoverageInput]) -> CoverageCollection:
        parameters = self._create_parameters(inputs[0])
        references = self._get_references(inputs[0])
        coverages = []
        for inp in inputs:
            cov = self.to_coverage(inp)
            cov.parameters = {}  # Hoisted to collection level
            coverages.append(cov)
        return CoverageCollection(
            coverages=coverages, parameters=parameters, referencing=references
        )
def to_coverage(coverage_input: CoverageInput) -> Coverage:
    domain = _create_grid_domain(coverage_input)
    parameters = _create_parameters(coverage_input)
    ranges = _create_grid_ranges(coverage_input)
    return Coverage(domain=domain, parameters=parameters, ranges=ranges)
```

**Current status (Grid only).** `to_coverage` implements only the Grid domain
(gridded rasters, `geometry is None`). It raises `NotImplementedError` for the
cases it does not yet handle: a non-`None` `geometry` (a non-grid domain), and
data that is not 3-D `(bands, height, width)` (`CoverageInput` also permits 2-D
point/profile data, which must not reach the grid path). Multi-domain support
will arrive via the per-domain input union and `match` dispatch in Section 7.
That approach is to be adopted when the second domain type lands, in preference
to building out the `_get_domain_type` inference sketched in Section 4.1.
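
The two guards can be pictured roughly as follows. This is a sketch under stated assumptions, not the code in `modeler.py`: `check_grid_only` and `is_supported` are hypothetical helpers, and the array is represented only by its `ndim` so the example runs without numpy.

```python
from typing import Optional


def check_grid_only(geometry: Optional[object], ndim: int) -> None:
    """Hypothetical helper mirroring the two Grid-only guards."""
    if geometry is not None:
        # A geometry signals a non-grid domain (point, transect, polygon, ...).
        raise NotImplementedError("non-grid domains are not supported yet")
    if ndim != 3:
        # 2-D point/profile data is valid input but not valid Grid data.
        raise NotImplementedError("Grid conversion needs (bands, height, width) data")


def is_supported(geometry: Optional[object], ndim: int) -> bool:
    """Report whether an input would pass both guards."""
    try:
        check_grid_only(geometry, ndim)
    except NotImplementedError:
        return False
    return True


grid_ok = is_supported(None, 3)
profile_ok = is_supported(None, 2)
point_ok = is_supported(object(), 3)
```

Raising `NotImplementedError` rather than `ValueError` keeps "valid input, unsupported domain" distinct from "invalid input", which `CoverageInput.__post_init__` already rejects.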

`to_coverage_collection` is its planned sibling for multi-result responses (**not
yet implemented**): build one coverage per input, then hoist the shared
`parameters` and `referencing` to the collection level and clear them on the
member coverages.

```python
def to_coverage_collection(inputs: list[CoverageInput]) -> CoverageCollection:
    parameters = _create_parameters(inputs[0])
    references = _get_references(inputs[0])
    coverages = []
    for coverage_input in inputs:
        cov = to_coverage(coverage_input)
        cov.parameters = {}  # Hoisted to collection level
        coverages.append(cov)
    return CoverageCollection(
        coverages=coverages, parameters=parameters, referencing=references
    )
```

### 4.1 Domain Type Detection

> **Status**: not implemented as written. The Grid-only modeler (Section 4)
> guards non-grid inputs with `NotImplementedError` instead of detecting a
> domain type. When the second domain type lands, the per-domain input union
> (Section 7) supersedes this inference entirely -- the variant is selected
> *explicitly* by the endpoint, not detected from `geometry` + `shape`. The
> sketch below is retained for the design rationale (and the Polygon discussion
> that follows).

```python
def _get_domain_type(self, input: CoverageInput) -> DomainType:
    has_time = input.timestamps is not None and len(input.timestamps) > 0
@@ -179,7 +229,7 @@ def _get_domain_type(self, input: CoverageInput) -> DomainType:

| Domain Type | Axes Produced |
| --- | --- |
| Grid | `x: CompactAxis(start=west, stop=east, num=w)`, `y: CompactAxis(start=north, stop=south, num=h)` |
| Grid | `x`/`y` `CompactAxis` of cell *centers*, inset half a cell from the bounds edges: `x` runs `west + dx/2 .. east - dx/2` (`dx = (east-west)/w`), `y` runs `north + dy/2 .. south - dy/2` (`dy = (south-north)/h`) |
| Point / PointSeries | `x: ValuesAxis[float]`, `y: ValuesAxis[float]`, optionally `z`, optionally `t` |
| MultiPoint | `composite: ValuesAxis[Tuple]` |
| Polygon / PolygonSeries | `composite: ValuesAxis` with polygon rings, optionally `t` |
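
The Grid row's cell-center arithmetic can be checked with a small worked example. This is an illustrative sketch: `cell_center_axis` is a hypothetical helper, and the bounds and grid size are made-up values chosen so the numbers come out exact.

```python
def cell_center_axis(first_edge: float, last_edge: float, num: int) -> tuple[float, float, int]:
    """Return (start, stop, num) for cell centers between two bounds edges."""
    # The step is signed: negative for a north-to-south y axis.
    step = (last_edge - first_edge) / num
    return first_edge + step / 2, last_edge - step / 2, num


# Assumed example bounds: west=0, south=0, east=10, north=10 on a 5x5 grid.
x_axis = cell_center_axis(0.0, 10.0, 5)  # west -> east, dx = 2
y_axis = cell_center_axis(10.0, 0.0, 5)  # north -> south, dy = -2
```

Here `x` runs from 1.0 to 9.0 and `y` from 9.0 to 1.0, each with 5 values: both axes sit half a cell inside the bounds, and the negative `dy` is what moves the first `y` center below the northern edge.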
@@ -399,8 +449,8 @@ exhaustiveness checking:
```python
from typing import assert_never # typing_extensions on Python < 3.11

def to_coverage(self, input: CoverageInput) -> Coverage:
    match input:
def to_coverage(coverage_input: CoverageInput) -> Coverage:
    match coverage_input:
        case GridInput():
            ...
        case GridSeriesInput():
6 changes: 3 additions & 3 deletions docs/05-implementation-roadmap.md
@@ -60,16 +60,16 @@ Add CoverageJSON (CovJSON) as a new output format to TiTiler via the `titiler-co

**Estimated effort**: S (1-2 days)

### Story 3: RasterCovJSONModeler - Core Conversion Logic
### Story 3: Modeler - Core Conversion Logic

**Priority**: P0 (Foundation)

**Description**: Implement the modeler that converts CoverageInput to CovJSON Coverage objects.

**Tasks**:

- [ ] Create `titiler_covjson/modeler.py` with RasterCovJSONModeler class
- [ ] Implement domain type detection (`_get_domain_type`)
- [x] Create `titiler_covjson/modeler.py` with the modeler conversion functions (`to_coverage`); stateless module functions, not a class
- [ ] Implement domain dispatch (per-domain input union + `match`, per doc 04 Section 7) -- the Grid-only path currently guards non-grid inputs rather than detecting a domain type
- [ ] Implement axis creation for all domain types (`_create_axes`)
  - Grid (start/stop/num)
  - Point/PointSeries (x, y, optional z, optional t)
4 changes: 2 additions & 2 deletions docs/06-existing-libraries-analysis.md
@@ -125,7 +125,7 @@ A JSON Schema-based validator with a Python CLI tool and runtime validation logi
**No custom Pydantic models needed.** Add `covjson-pydantic` as a dependency and use its models directly. Custom code is limited to:

1. **`CoverageInput` dataclass** (bridge between rio-tiler and covjson-pydantic)
2. **`RasterCovJSONModeler`** (conversion logic: numpy arrays -> covjson-pydantic objects)
2. **Modeler functions** (`to_coverage`; conversion logic: numpy arrays -> covjson-pydantic objects)
3. **FastAPI routes** (endpoint definitions, parameter handling)
4. **TiTiler integration** (router extension, content type registration)
5. **Unit/CRS mapping helpers** (UCUM symbols, EPSG->OGC URI conversion)
@@ -170,5 +170,5 @@ httpx # FastAPI test client
| 2 | **Use `covjson-validator` in integration tests** | Catches spec violations that Pydantic alone might miss (axis/shape consistency, monotonicity) |
| 3 | **Vendor validator schemas, don't depend on the repo** | The repo isn't pip-installable and has deprecated deps; extract the JSON Schema files |
| 4 | **Contribute Polygon domain type upstream** | Benefits the community; reduces local maintenance burden |
| 5 | **Keep `CoverageInput` + `RasterCovJSONModeler` as custom code** | Neither library provides the raster->CovJSON conversion bridge |
| 5 | **Keep `CoverageInput` + the modeler functions as custom code** | Neither library provides the raster->CovJSON conversion bridge |
| 6 | **Pin `covjson-pydantic` minor version** | Pre-1.0 library; avoid surprise breaking changes |
3 changes: 3 additions & 0 deletions src/titiler_covjson/helpers.py
@@ -244,9 +244,12 @@ def numpy_dtype_to_ndarray(
        >>> nd = numpy_dtype_to_ndarray(arr, np.float32, ["y", "x"])
        >>> nd.shape
        [2, 2]
        >>> nd.dataType
        'float'
        >>> nd.values
        [1.5, 2.5, 3.5, 4.5]
    """

    covjson_dtype = numpy_to_covjson_dtype(dtype)
    shape = list(data.shape)

57 changes: 51 additions & 6 deletions src/titiler_covjson/input.py
@@ -94,8 +94,9 @@ class CoverageInput:
        geometry: Source geometry for non-grid domains -- e.g., the queried
            point, the transect line, or the aggregation polygon; ``None``
            for gridded rasters.
        bands: Per-band metadata. May be empty, in which case the modeler
            synthesizes generic band identities.
        bands: Per-band metadata, one entry per band. Resolved at construction:
            when not supplied, generic ``b1, b2, ...`` identities are synthesized
            (see ``__post_init__``), so this is always populated afterwards.
        timestamps: ISO 8601 / RFC 3339 timestamps for temporal data (e.g.,
            one per STAC item in a time series); ``None`` for purely spatial
            data.
@@ -132,7 +133,7 @@ class CoverageInput:
    item_ids: tuple[str, ...] | None = None

    def __post_init__(self) -> None:
        """Validate array dimensionality, band count, and 2-D timestamps.
        """Validate the data/band/timestamp invariants, then resolve ``bands``.

        Mostly domain-independent invariants. The one domain-shaped exception
        is timestamps against 2-D point/profile data: there the sample axis is
@@ -143,11 +144,18 @@ def __post_init__(self) -> None:
        eventually, the per-domain input variants -- see
        ``docs/04-modeler-converter-design.md``, Section 7).

        As a final step ``bands`` is resolved: when empty it is populated with
        synthesized ``b1, b2, ...`` identities (assigned via
        ``object.__setattr__``, as the dataclass is frozen), so it is never empty
        after construction.

        Raises:
            ValueError: If ``data`` is not 2-D or 3-D with at least 1 band; if
                ``bands`` is non-empty and its length does not match
                ``data.shape[0]``; or if ``data`` is 2-D and ``len(timestamps)``
                does not match the sample axis ``data.shape[-1]``.
                any ``data`` axis is empty (size 0); if ``bands`` is non-empty and
                its length does not match ``data.shape[0]``; if two ``bands`` share
                a name (names become CoverageJSON keys, so must be unique); or if
                ``data`` is 2-D and ``len(timestamps)`` does not match the sample
                axis ``data.shape[-1]``.
        """

        if self.data.ndim not in {2, 3} or self.data.shape[0] == 0:
@@ -157,12 +165,31 @@ def __post_init__(self) -> None:
                f"with {self.data.shape[0]} band(s)"
            )
            raise ValueError(msg)

        # No data axis may be empty: a zero-size height/width/sample axis is a
        # degenerate coverage and would otherwise surface as an opaque CompactAxis
        # error (num must be a positive cell count) deep in the modeler.
        if 0 in self.data.shape:

Contributor

🟨 Reviewer-found.

nit: `0 in self.data.shape` also matches `shape[0] == 0`, already rejected above with a different message, so a `(0, h, w)` array reports "must be 2-D or 3-D" instead of "non-empty". Harmless; could narrow to `data.shape[-2:]` for a consistent message.

@chuckwondo chuckwondo Jun 23, 2026

Contributor Author

The proposed change does not alter the error a user sees, and it makes the code harder to reason about: the slice forces the reader to pause and work out why it is there, which is only to avoid a redundant check of `shape[0]` against 0. That added cognitive load is not worth it just to skip one unnecessary check of a single element of the shape tuple, with no change in behavior.

            msg = (
                "CoverageInput data axes must all be non-empty; "
                f"got shape {self.data.shape}"
            )
            raise ValueError(msg)

        if self.bands and len(self.bands) != self.data.shape[0]:
            msg = (
                f"Number of bands ({len(self.bands)}) does not match "
                f"data.shape[0] ({self.data.shape[0]})"
            )
            raise ValueError(msg)

        # Band names become CoverageJSON range/parameter keys, so they must be
        # unique; duplicates would silently collapse entries in the modeler.
        if self.bands and len({band.name for band in self.bands}) != len(self.bands):
            names = [band.name for band in self.bands]
            msg = f"CoverageInput band names must be unique; got {names}"
            raise ValueError(msg)

        if (
            self.timestamps is not None
            and self.data.ndim == 2
@@ -174,6 +201,24 @@ def __post_init__(self) -> None:
            )
            raise ValueError(msg)

        # Resolve bands once, at construction, so every consumer can read a
        # populated `bands` tuple without re-deriving defaults. It arrives
        # populated from converters (e.g., imagedata_to_coverage_input, via the
        # image's band_names); it is empty only on direct array construction
        # without metadata -- the modeler's array-only test path. Synthesize
        # b1, b2, ... then, matching rio-tiler's default band naming so
        # synthesized and converter-supplied identities are indistinguishable.
        # (frozen dataclass: assign through object.__setattr__.)
        if not self.bands:
            object.__setattr__(
                self,
                "bands",
                tuple(
                    BandInfo(name=f"b{i + 1}", dtype=self.data.dtype)
                    for i in range(self.data.shape[0])
                ),
            )


def band_info_from_reader_info(info: Info) -> list[BandInfo]:
    """Build per-band metadata from a rio-tiler reader ``info()`` result.