Merged
23 commits
6c4b58e
COMP: Zarr Python 3 StoreLike support
thewtex Feb 11, 2025
124d21b
COMP: Use LocalStore for Zarr Python 3
thewtex Feb 11, 2025
855fc3d
COMP: Remove dimension_deparator arg to MemoryStore
thewtex Feb 11, 2025
9151bc4
Remove home-page from pyproject.toml
thewtex Feb 11, 2025
4786f38
COMP: Update pixi.lock
thewtex Feb 14, 2025
ab2560c
BUG: Remove Python version upper bound constraint
thewtex Feb 24, 2025
364da41
BUG: Fix dependencies version constraint format
thewtex Feb 24, 2025
4e9ef05
BUG: Fix dependency specification
thewtex Feb 25, 2025
a4123d4
COMP: Add itk and dask-image extras for CI tests
thewtex Feb 25, 2025
808c029
COMP: Update CI os versions, add Python 3.13
thewtex Feb 25, 2025
15556ef
DOC: Add Python 3.13 to supported package versions
thewtex Feb 25, 2025
0a92585
WIP: COMP: Bump xarray dep to >=2025.1.2
thewtex Feb 25, 2025
1de2c89
change to xarray-dataclass
melonora Jul 30, 2025
9b6a2e0
DOC: Init copilot instructions
thewtex Jul 30, 2025
12f1f51
COMP: Update test_ngff_validation for zarr-python 3
thewtex Jul 30, 2025
a1da0c1
COMP: pass expected keyword arguments to DataArray.reindex
thewtex Jul 30, 2025
5229616
COMP: Update pixi.lock
thewtex Jul 30, 2025
90226cd
COMP: Bump xarray-dataclass dep to 3.0.0
thewtex Jul 30, 2025
c8881ac
WIP: COMP: Zarr 3 testing updates
thewtex Jul 31, 2025
b0979d8
fix: Add zarr v3 compatibility to test_ngff_validation.py
thewtex Aug 1, 2025
52efe7f
fix: use zarr format 2 with NGFF v0.4
thewtex Aug 5, 2025
639d70c
ci: run notebook tests on ubuntu-22.04
thewtex Aug 5, 2025
debef90
fix run test-notebooks
melonora Aug 6, 2025
125 changes: 125 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,125 @@
# Copilot Instructions for multiscale-spatial-image

## Project Overview

This library generates multiscale, chunked, multi-dimensional spatial image data
structures serializable to OME-NGFF format. It's built on Xarray DataTree with
spatial-image Dataset nodes, creating image pyramids for visualization and
analysis.

## Core Architecture

### Key Components

- **MultiscaleSpatialImage**: Xarray DataTree accessor
(`@register_datatree_accessor("msi")`) providing NGFF-compatible multiscale
operations
- **to_multiscale()**: Main entry point for creating pyramids with configurable
downsampling methods
- **Methods enum**: Defines downsampling algorithms (`XARRAY_COARSEN`, `ITK_*`,
`DASK_IMAGE_*`)
- **Operations module**: Provides dataset operations that skip non-dimensional
nodes

### Data Structure Pattern

```python
# DataTree structure: /scale0, /scale1, /scale2, etc.
# Each scale contains Dataset with same variable name as input
multiscale = to_multiscale(image, [2, 4]) # Creates 3 scales total
multiscale['scale0'].ds # Original resolution
multiscale['scale1'].ds # 2x downsampled
multiscale['scale2'].ds # 8x downsampled (2*4)
```

### Zarr Integration Patterns

- **Always use `dimension_separator='/'`** for NGFF compliance
- Support both Zarr Python 2 (`DirectoryStore`) and Zarr Python 3 (`LocalStore`)
with a fallback pattern in `_data.py`
- NGFF metadata automatically added in `to_zarr()` method with coordinate
transformations
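
To illustrate the last point, NGFF v0.4 records each scale's `coordinateTransformations` as a `scale` entry followed by a `translation` entry. The following is a minimal sketch that assumes evenly spaced coordinates. `coordinate_transformations` is a hypothetical helper written for illustration, not the library's implementation:

```python
def coordinate_transformations(coords):
    """Build NGFF-style scale/translation entries from per-dimension coordinates.

    `coords` maps a dimension name to its evenly spaced coordinate values
    (hypothetical input shape, chosen for this sketch).
    """
    scale = []
    translation = []
    for dim, values in coords.items():
        # Spacing is the step between neighbors; use 1.0 for length-1 axes.
        spacing = float(values[1] - values[0]) if len(values) > 1 else 1.0
        scale.append(spacing)
        # The first coordinate is the offset of the first sample.
        translation.append(float(values[0]))
    return [
        {"type": "scale", "scale": scale},
        {"type": "translation", "translation": translation},
    ]


transforms = coordinate_transformations({"y": [0.5, 2.5, 4.5], "x": [1.0, 2.0, 3.0]})
```

Deriving both values from the coordinates themselves means a downsampled scale, whose coordinates are already coarser and shifted, needs no separate bookkeeping.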

## Development Workflows

### Environment Management (Pixi-based)

```bash
pixi install -a # Install all environments
pixi run -e test test # Run unit tests
pixi run test-notebooks # Test example notebooks
pixi shell # Activate development shell
```

### Testing Patterns

- **Baseline comparison testing**: Use `verify_against_baseline()` for image
output validation
- **Test data**: IPFS-hosted via pooch, SHA256-verified downloads
- **Notebook testing**: nbmake integration tests all examples
- **Multiple backends**: Tests run against ITK, dask-image, and xarray methods

### Adding Test Data

```python
# Temporarily add this line to generate a new baseline:
store_new_image(dataset_name, baseline_name, multiscale)
# Remove after running tests, then update data.tar.gz and SHA256 in _data.py
```

## Critical Patterns

### Dimension Handling

- **skip_non_dimension_nodes decorator**: Essential for operations on DataTree
root nodes without dimensions
- **Default chunking**: Uses 64 for 3D (with 'z'), 256 for 2D, 1 for 't'
dimension
- **Coordinate transforms**: Scale and translation automatically computed from
coordinate spacing
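
The default chunking rule can be sketched as a small function. This is an illustration of the rule as stated here, not the library's code; `default_chunks` is a hypothetical name, and leaving non-spatial, non-time dimensions such as 'c' out of the result is an assumption:

```python
def default_chunks(dims):
    """Return a per-dimension chunk size following the defaults listed above."""
    # 3D data (a 'z' axis is present) uses smaller spatial chunks than 2D data.
    spatial_chunk = 64 if "z" in dims else 256
    chunks = {}
    for dim in dims:
        if dim == "t":
            chunks[dim] = 1
        elif dim in ("x", "y", "z"):
            chunks[dim] = spatial_chunk
        # Assumption: other dimensions (e.g. 'c') get no explicit default here.
    return chunks


chunks_3d = default_chunks(("t", "z", "y", "x"))
chunks_2d = default_chunks(("y", "x"))
```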

### Multi-Backend Support

```python
# Pattern for optional dependencies
try:
    from zarr.storage import DirectoryStore  # Zarr v2
except ImportError:
    from zarr.storage import LocalStore  # Zarr v3
```

### Downsampling Method Dispatch

- Each method in `to_multiscale/` subdirectory follows `_downsample_{method}`
naming
- Methods handle chunking alignment via `_align_chunks()` helper
- Scale factors can be uniform int or per-dimension dict:
`[2, {'x': 2, 'y': 4}]`
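
One way to handle the mixed int/dict form is to expand every entry into a per-dimension dict before dispatching. A minimal sketch, where `normalize_scale_factors` is a hypothetical helper and the choice to treat dimensions missing from a dict as factor 1 is an assumption:

```python
def normalize_scale_factors(scale_factors, spatial_dims):
    """Expand each scale factor into a {dim: factor} dict over spatial_dims."""
    normalized = []
    for factor in scale_factors:
        if isinstance(factor, int):
            # A uniform int applies to every spatial dimension.
            normalized.append({dim: factor for dim in spatial_dims})
        else:
            # Assumption: a dimension absent from the dict is not downsampled.
            normalized.append({dim: factor.get(dim, 1) for dim in spatial_dims})
    return normalized


normalized = normalize_scale_factors([2, {"x": 2, "y": 4}], ("y", "x"))
```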

## Integration Points

### External Dependencies

- **spatial-image**: Base SpatialImage input type
- **xarray-datatree**: Core DataTree functionality
- **OME-NGFF**: Metadata standard compliance via `multiscales` attribute
- **Optional**: ITK (medical imaging), dask-image (distributed), pyimagej

### File Patterns

- `multiscale_spatial_image.py`: Core accessor class with to_zarr() and
operations
- `to_multiscale/`: Downsampling method implementations
- `operations/`: Dataset operations with dimension-aware decorators
- `examples/`: Jupyter notebooks demonstrating usage patterns

## Key Conventions

- Use `promote_attrs=True` when converting DataArrays to Datasets to preserve
metadata
- Coordinate names follow spatial conventions: 't' (time), 'c' (channel),
'x'/'y'/'z' (space)
- Error handling validates scale factors against current image dimensions before
processing
- NGFF axes metadata includes type classification (time/channel/space) and
optional units
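
The axes convention in the last bullet can be sketched as follows. The `name`/`type`/`unit` keys follow NGFF v0.4; `AXIS_TYPES` and `ngff_axes` are hypothetical names used only for this illustration:

```python
# Mapping from the coordinate names listed above to NGFF axis types.
AXIS_TYPES = {"t": "time", "c": "channel", "x": "space", "y": "space", "z": "space"}


def ngff_axes(dims, units=None):
    """Build an NGFF-style `axes` list for the given dimension names."""
    units = units or {}
    axes = []
    for dim in dims:
        axis = {"name": dim, "type": AXIS_TYPES[dim]}
        # Units are optional, so only emit the key when one is provided.
        if dim in units:
            axis["unit"] = units[dim]
        axes.append(axis)
    return axes


axes = ngff_axes(("c", "y", "x"), units={"y": "micrometer", "x": "micrometer"})
```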
2 changes: 1 addition & 1 deletion .github/workflows/notebook-test.yml
@@ -4,7 +4,7 @@ on: [push, pull_request]

 jobs:
   run:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
     name: Test notebooks with nbmake
     steps:
       - uses: actions/checkout@v4
6 changes: 3 additions & 3 deletions .github/workflows/test.yml
@@ -8,8 +8,8 @@ jobs:
     strategy:
       max-parallel: 5
       matrix:
-        os: [ubuntu-22.04, windows-2022, macos-12]
-        python-version: ["3.10", "3.11", "3.12"]
+        os: [ubuntu-24.04, windows-2022, macos-13]
+        python-version: ["3.10", "3.11", "3.12", "3.13"]

     steps:
       - uses: actions/checkout@v4
@@ -20,7 +20,7 @@ jobs:
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-          python -m pip install -e ".[test]"
+          python -m pip install -e ".[test,itk,dask-image]"
       - name: Test with pytest
         run: |
           pytest --junitxml=junit/test-results.xml
8 changes: 0 additions & 8 deletions README.md
@@ -153,14 +153,6 @@ method and `assign_coords`, equivalent to `xr.Dataset` `assign_coords` method.
 Store as an Open Microscopy Environment-Next Generation File Format ([OME-NGFF])
 / [netCDF] [Zarr] store.

-It is highly recommended to use `dimension_separator='/'` in the construction of
-the Zarr stores.
-
-```python
-store = zarr.storage.DirectoryStore('multiscale.zarr', dimension_separator='/')
-multiscale.to_zarr(store)
-```
-
 **Note**: The API is under development, and it may change until 1.0.0 is
 released. We mean it :-).
4 changes: 3 additions & 1 deletion examples/ConvertPyImageJDataset.ipynb
@@ -9,7 +9,9 @@
    "source": [
     "import sys, os\n",
     "!conda install --yes --prefix {sys.prefix} -c conda-forge openjdk=8\n",
-    "os.environ['JAVA_HOME'] = os.sep.join(sys.executable.split(os.sep)[:-2] + ['jre'])\n",
+    "# In case of being already installed through pixi it should not be set to this path.\n",
+    "if 'JAVA_HOME' not in os.environ:\n",
+    "    os.environ['JAVA_HOME'] = os.sep.join(sys.executable.split(os.sep)[:-2] + ['jre'])\n",
     "!{sys.executable} -m pip install multiscale-spatial-image matplotlib zarr pyimagej"
    ]
   },
15 changes: 12 additions & 3 deletions multiscale_spatial_image/multiscale_spatial_image.py
@@ -4,13 +4,19 @@
 import numpy as np
 from collections.abc import MutableMapping, Hashable
 from pathlib import Path
-from zarr.storage import BaseStore
+import zarr.storage
 from multiscale_spatial_image.operations import (
     transpose,
     reindex_data_arrays,
     assign_coords,
 )

+# Zarr Python 3
+if hasattr(zarr.storage, "StoreLike"):
+    StoreLike = zarr.storage.StoreLike
+else:
+    StoreLike = Union[MutableMapping, str, Path, zarr.storage.BaseStore]
+

 @register_datatree_accessor("msi")
 class MultiscaleSpatialImage:
@@ -33,7 +39,7 @@ def __init__(self, xarray_obj: DataTree):

     def to_zarr(
         self,
-        store: Union[MutableMapping, str, Path, BaseStore],
+        store: StoreLike,
         mode: str = "w",
         encoding=None,
         **kwargs,
@@ -43,7 +49,7 @@ def to_zarr(

         Metadata is added according the OME-NGFF standard.

-        store : MutableMapping, str or Path, or zarr.storage.BaseStore
+        store : StoreLike
             Store or path to directory in file system
         mode : {{"w", "w-", "a", "r+", None}, default: "w"
             Persistence mode: “w” means create (overwrite if exists); “w-” means create (fail if exists);
@@ -125,6 +131,9 @@ def to_zarr(
         ngff_metadata = {"multiscales": multiscales, "multiscaleSpatialImageVersion": 1}
         self._dt.ds = self._dt.ds.assign_attrs(**ngff_metadata)

+        # Ensure zarr v2 format for NGFF v0.4
+        if "zarr_format" not in kwargs:
+            kwargs["zarr_format"] = 2
         self._dt.to_zarr(store, mode=mode, **kwargs)

     def transpose(self, *dims: Hashable) -> DataTree:
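
The `zarr_format` guard in the last hunk of this file only fills in a default when the caller has not chosen a format, so an explicit `zarr_format=3` still passes through. The same behavior can be sketched with `dict.setdefault`. `with_default_zarr_format` is a hypothetical helper for illustration only, and it copies the dict so the caller's kwargs are not mutated:

```python
def with_default_zarr_format(kwargs, default=2):
    """Return a copy of kwargs with zarr_format set only if the caller omitted it."""
    merged = dict(kwargs)
    # setdefault leaves an existing "zarr_format" untouched.
    merged.setdefault("zarr_format", default)
    return merged


defaulted = with_default_zarr_format({"consolidated": True})
explicit = with_default_zarr_format({"zarr_format": 3})
```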
20 changes: 19 additions & 1 deletion multiscale_spatial_image/operations/operations.py
@@ -15,4 +15,22 @@ def transpose(ds: Dataset, *args: Any, **kwargs: Any) -> Dataset:

 @skip_non_dimension_nodes
 def reindex_data_arrays(ds: Dataset, *args: Any, **kwargs: Any) -> Dataset:
-    return ds["image"].reindex(*args, **kwargs).to_dataset()
+    # Extract the first argument as indexers, and pass the rest as keyword arguments
+    if args:
+        indexers = args[0]
+        # Map positional arguments to their parameter names
+        reindex_kwargs = {}
+        if len(args) > 1:
+            reindex_kwargs["method"] = args[1]
+        if len(args) > 2:
+            reindex_kwargs["tolerance"] = args[2]
+        if len(args) > 3:
+            reindex_kwargs["copy"] = args[3]
+        if len(args) > 4:
+            reindex_kwargs["fill_value"] = args[4]
+        # Add any additional keyword arguments
+        reindex_kwargs.update(kwargs)
+        return ds["image"].reindex(indexers, **reindex_kwargs).to_dataset()
+    else:
+        # Fall back to original behavior if no arguments
+        return ds["image"].reindex(**kwargs).to_dataset()
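
The chain of `len(args)` checks in this hunk maps positional arguments onto keyword names in a fixed order. A more compact sketch of the same mapping uses `zip`. The parameter order is taken from the diff itself, and `positional_to_keywords` is a hypothetical helper, not part of the PR:

```python
# Keyword names in the order the diff assigns them to args[1:].
REINDEX_PARAMS = ("method", "tolerance", "copy", "fill_value")


def positional_to_keywords(args, kwargs):
    """Split reindex-style arguments into (indexers, keyword arguments)."""
    if not args:
        return None, dict(kwargs)
    indexers, *rest = args
    # zip stops at the shorter sequence, so unused names are skipped and
    # any positional arguments beyond the known names are ignored.
    reindex_kwargs = dict(zip(REINDEX_PARAMS, rest))
    # Explicit keyword arguments take precedence, as in the diff.
    reindex_kwargs.update(kwargs)
    return indexers, reindex_kwargs


indexers, reindex_kwargs = positional_to_keywords(
    ({"x": [0, 1]}, "nearest"), {"fill_value": 0}
)
```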