
Port to iohub 0.3.2 ImageArray API#407

Open
alxndrkalinin wants to merge 7 commits into modular-viscy-staging from port/iohub-0.3.2-api

Conversation

Collaborator

@alxndrkalinin alxndrkalinin commented Apr 22, 2026

Summary

Staging commit a10d4c4 bumped packages/viscy-data to iohub>=0.3.2, which rewrote ImageArray and requires Python ≥3.12. Fresh uv sync breaks ("No solution found"); once forced through, 81 tests fail / 1090 pass against this tip because callers still use the removed .name / .tensorstore(context=...) attributes. This PR ports all callers to the new API while preserving the tensorstore backend (async concurrent reads with data_copy_concurrency / cache_pool_bytes).

After this PR: fresh sync resolves cleanly, 8 fail / 1163 pass — the remaining 8 are the pre-existing out-of-scope failures listed below, not caused or touched by this change.

Commits

  1. chore(deps) — bump requires-python >=3.12 in root + 9 sub-project pyprojects, drop the now-unsupported Python :: 3.11 classifier, regenerate uv.lock (iohub==0.3.2).
  2. refactor(viscy-data) — port ImageArray.name → ImageArray.path at 5 call sites (3 source files). Sliding-window sample-id tuple is emitted as f"/{img.path}" so PredictionWriter._create_image / MaskTestDataset continue to parse the 5-part /A/1/0/array layout via split("/").
  3. refactor(data) — port .tensorstore(context=...) → .native and move tensorstore context config from per-dataset to per-plate (at open_ome_zarr time) via iohub.core.config.TensorStoreConfig.
  4. refactor(data): use context managers for one-shot plate openers — wrap generate_normalization_metadata and run_qc_metrics plate opens in with blocks, per CLAUDE.md's context-manager rule.
  5. refactor(data): drop unused cache_pool_bytes from Dataset classes — remove the no-op param from TripletDataset / MultiExperimentTripletDataset and the three call sites that still passed it. DataModule-level cache_pool_bytes stays (it's the one that actually feeds TensorStoreConfig at plate-open time), so YAML configs are unaffected.
  6. docs(data): align docstrings and error text with .name -> .path port — update HCSStackIndex.image comment, _read_img_window return-docstring, and _get_windows error message to match the /A/1/0/array format the port emits.
  7. fix(docs): address Copilot review typos — Z-sclice → Z-slice in prediction_writer.py; fix stale qc_metrics.run_qc_metrics reference in focus.py comment (real function is generate_qc_metadata).
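The leading-slash detail in commit 2 can be checked with plain strings, no iohub needed: on the new API the path comes back without a leading slash, so prefixing "/" is what keeps the downstream split("/") parsers seeing five parts. A minimal sketch (the literal path is illustrative):

```python
# Sketch of the sample-id path handling described in commit 2.
# ImageArray.path on the new API returns e.g. "A/1/0/array" (no leading
# slash), while PredictionWriter._create_image / MaskTestDataset expect
# the five-part "/A/1/0/array" layout when splitting on "/".
img_path = "A/1/0/array"      # what ImageArray.path now returns
sample_id = f"/{img_path}"    # what the port emits into the sample-id tuple

parts = sample_id.split("/")
print(parts)            # ['', 'A', '1', '0', 'array'] -- five parts
assert len(parts) == 5  # downstream parsing unchanged

# without the prefix, the split would only yield four parts
assert len(img_path.split("/")) == 4
```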

How the tensorstore backend is preserved

iohub 0.3.2 removed per-array .tensorstore(context=...). The context is now set once at plate open:

from iohub.core.config import TensorStoreConfig
with open_ome_zarr(
    path,
    implementation="tensorstore",
    implementation_config=TensorStoreConfig(
        data_copy_concurrency=N,
        cache_pool_bytes=M,
    ),
) as plate:
    handle = plate[pos]["0"].native  # raw ts.TensorStore
    data = handle[sel].read().result()
    # or ts.stack([handle[sel].translate_to[0] for _ in patches]).read().result()

Four plate openers are upgraded to the tensorstore impl — the only ones in the repo whose positions flow into .native[sel].read().result() / ts.stack(...) reads:

Opener — Call site
TripletDataModule._align_tracks_tables_with_positions — packages/viscy-data/src/viscy_data/triplet.py
MultiExperimentIndex._store_cache opener — applications/dynaclr/src/dynaclr/data/index.py
generate_normalization_metadata — packages/viscy-utils/src/viscy_utils/meta_utils.py
run_qc_metrics → generate_qc_metadata — applications/qc/src/qc/qc_metrics.py

SLURM-aware CPU count (SLURM_CPUS_PER_TASK → os.cpu_count() fallback) is preserved. cache_pool_bytes is owned by the DataModule and threaded into TensorStoreConfig at plate-open time. The Dataset classes no longer carry the parameter.

All downstream .oindex[...], .translate_to[...], ts.stack(...).read().result() — including the batched-read pattern documented in applications/dynaclr/CLAUDE.md — work unchanged because .native on a tensorstore-backed plate returns exactly the same raw ts.TensorStore the old .tensorstore(context=ctx) factory returned. Other openers (hcs.py, mmap_cache.py, segmentation.py, cell_index.py, etc.) stay on the default zarr impl — they only use .oindex / img[...] / metadata, which work on both impls.
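The shape semantics of the batched-read pattern can be mimicked without tensorstore; here NumPy arrays stand in for the real ts.TensorStore handles (the array shapes mirror the smoke output, the data is synthetic):

```python
import numpy as np

# NumPy stand-in for the tensorstore batched-read pattern: several
# same-shaped per-patch views stacked into one (N, Y, X) batch, the
# way ts.stack([...]).read().result() batches real handles.
vol = np.zeros((44, 10, 10), dtype=np.float32)  # shape mirrors the smoke slab
patches = [vol[i] for i in range(3)]            # three per-patch views
batch = np.stack(patches)
print(batch.shape)  # (3, 10, 10)
```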

recheck_cached_data="open" is no longer plumbable through iohub and has been dropped; benign for the read-only training/predict paths that were the only consumers.

Smoke verification

handle type: TensorStore
slab dtype/shape: float32 (44, 10, 10)
ts.stack shape: (3, 10, 10)

An HCSDataModule smoke test on real SEC61B zarr builds and iterates through a batch end-to-end.

Not fixed in this PR (pre-existing, unrelated)

These 8 tests are still red on the base branch and this PR does not attempt them:

  • applications/airtable/tests/test_database.py — 2 failures (mock call-count drift, env-vars fixture).
  • applications/dynaclr/tests/test_multi_experiment_integration.py — 4 failures (missing configs/training/{fit,multi_experiment_fit}.yml after the dynaclr-dino / cleanup PRs).
  • applications/dynaclr/tests/test_training_integration.py — 2 failures (one config-missing, one pytorch_metric_learning labels shape regression).

Test plan

  • uv sync --all-packages --all-extras resolves cleanly on a fresh venv.
  • Smoke: open_ome_zarr(..., implementation="tensorstore", implementation_config=TensorStoreConfig(data_copy_concurrency=8)) on SEC61B zarr returns a TensorStore handle with working .read().result() and ts.stack.
  • uv run pytest across packages + applications: 8 failed / 1163 passed / 1 xfailed (vs. 81 failed / 1090 passed pre-port).
  • HCSDataModule setup + batch iteration works end-to-end against real zarr.
  • Full dynacell fit --trainer.fast_dev_run=true on a GPU (deferred — blocked on a /-partition space issue unrelated to this PR).

🤖 Generated with Claude Code

alxndrkalinin and others added 3 commits April 21, 2026 17:02
iohub 0.3.2 only ships wheels for Python >=3.12, so a workspace
that still advertises >=3.11 forces uv to resolve for a Python
where the viscy-data dep cannot be satisfied ("No solution found"
during fresh sync).

Bumps the pin on the root workspace plus every sub-project (four
packages and five applications) and drops the now-unsupported
Python 3.11 classifier from each pyproject. uv.lock shrinks
significantly because large portions of the 3.11-only resolve
graph drop out.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
iohub 0.3.2 removed ImageArray.name; .path returns the equivalent
zarr node path (minus the leading slash on the zarr-python backend
these call sites use). Prefix a literal "/" in the sliding-window
sample-id tuple so downstream consumers that split the path —
notably PredictionWriter._create_image and MaskTestDataset —
continue to see the five-part /A/1/0/array layout they already
parse via split("/").

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
iohub 0.3.2 dropped per-array ImageArray.tensorstore(context=...).
The tensorstore context is now set once at plate-open time via
open_ome_zarr(implementation="tensorstore",
implementation_config=TensorStoreConfig(...)), and the raw
ts.TensorStore handle is reachable through ImageArray.native.

Moves context construction (data_copy_concurrency from
SLURM_CPUS_PER_TASK or os.cpu_count(), cache_pool_bytes from the
DataModule config) out of the Dataset layer and onto the plate
opener in each of the four call paths that actually need
tensorstore reads: TripletDataModule, MultiExperimentIndex (for
MultiExperimentTripletDataset), viscy-utils meta_utils grid-sample,
and qc.run_qc_metrics.

Every downstream .oindex / .translate_to / ts.stack(...).read()
call — including the batched-read pattern documented in
applications/dynaclr/CLAUDE.md — keeps working unchanged because
.native returns the same raw ts.TensorStore the old
.tensorstore(context=ctx) factory used to return. Dataset-level
cache_pool_bytes params are retained as no-ops for call-site
compat, documented as such.

The recheck_cached_data="open" kwarg is no longer plumbable through
iohub's API; dropping it is benign for the read-only training and
predict paths that were the only consumers.

qc/tests/test_focus.py::test_focus_slice_metric_call is updated to
open its fixture plate with the tensorstore impl, matching how
FocusSliceMetric is invoked from production
qc_metrics.run_qc_metrics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment


Pull request overview

Ports the codebase to iohub>=0.3.2’s updated ImageArray API and preserves the tensorstore backend by configuring tensorstore at plate-open time (via TensorStoreConfig) and switching call sites from .tensorstore(context=...) to .native.

Changes:

  • Bump requires-python to >=3.12 across the root project and multiple sub-projects, removing the Python 3.11 classifier.
  • Update ImageArray.name usages to ImageArray.path, and adjust sliding-window sample IDs to keep the existing "/A/1/0/array" parsing behavior.
  • Move tensorstore configuration to open_ome_zarr(..., implementation="tensorstore", implementation_config=TensorStoreConfig(...)) and update downstream reads to use .native.

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 2 comments.

File — Description
pyproject.toml — Raise minimum Python to 3.12; drop 3.11 classifier.
packages/viscy-utils/src/viscy_utils/meta_utils.py — Open plates with tensorstore + TensorStoreConfig; switch sampling to .native.
packages/viscy-utils/src/viscy_utils/callbacks/prediction_writer.py — Replace image.name with image.path in resize debug logging.
packages/viscy-utils/pyproject.toml — Raise minimum Python to 3.12; drop 3.11 classifier.
packages/viscy-transforms/pyproject.toml — Raise minimum Python to 3.12; drop 3.11 classifier.
packages/viscy-models/pyproject.toml — Raise minimum Python to 3.12; drop 3.11 classifier.
packages/viscy-data/src/viscy_data/triplet.py — Switch tensorstore access to .native; configure tensorstore at plate open with TensorStoreConfig.
packages/viscy-data/src/viscy_data/sliding_window.py — Replace img.name with img.path; emit sample IDs as f"/{img.path}" to preserve downstream parsing.
packages/viscy-data/src/viscy_data/segmentation.py — Replace target_img.name with target_img.path in debug logging.
packages/viscy-data/pyproject.toml — Raise minimum Python to 3.12; normalize iohub>=0.3.2 spec formatting.
applications/qc/tests/test_focus.py — Open test plate using tensorstore impl + TensorStoreConfig to support .native.read().result().
applications/qc/src/qc/qc_metrics.py — Open plate using tensorstore impl + TensorStoreConfig for metric execution.
applications/qc/src/qc/focus.py — Switch focus metric reads to .native[...].read().result() and drop direct tensorstore context creation.
applications/qc/pyproject.toml — Raise minimum Python to 3.12; drop 3.11 classifier.
applications/dynaclr/src/dynaclr/data/index.py — Cache plates opened with tensorstore impl + configurable TensorStoreConfig.
applications/dynaclr/src/dynaclr/data/dataset.py — Switch tensorstore handle acquisition to .native; keep cache_pool_bytes as documented no-op.
applications/dynaclr/src/dynaclr/data/datamodule.py — Build and pass a shared TensorStoreConfig (SLURM-aware concurrency + cache pool) into the index.
applications/dynaclr/pyproject.toml — Raise minimum Python to 3.12; drop 3.11 classifier.
applications/dynacell/pyproject.toml — Raise minimum Python to 3.12; drop 3.11 classifier.
applications/cytoland/pyproject.toml — Raise minimum Python to 3.12; drop 3.11 classifier.
applications/airtable/pyproject.toml — Raise minimum Python to 3.12; drop 3.11 classifier.


Comment thread packages/viscy-utils/src/viscy_utils/callbacks/prediction_writer.py Outdated
Comment thread applications/qc/src/qc/focus.py Outdated
alxndrkalinin and others added 4 commits April 21, 2026 22:19
Both generate_normalization_metadata and run_qc_metrics open the plate
in read/write mode, use it through a sizable loop body, then close it
explicitly at the end. A mid-loop exception would leak the write lock
and the tensorstore handle. CLAUDE.md requires context managers for
any external resource — wrap both openers in `with` blocks so cleanup
happens on every exit path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
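The leak this commit closes can be demonstrated with a toy resource (the class name below is illustrative, not from the codebase): an exception in the loop body skips a trailing close() call, but cannot skip a context manager's __exit__.

```python
# Toy resource showing why the openers were wrapped in `with` blocks:
# cleanup in __exit__ runs on every exit path, including exceptions.
class Plate:
    def __init__(self):
        self.closed = False
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        self.closed = True
        return False  # do not swallow the exception

plate = Plate()
try:
    with plate:
        raise RuntimeError("mid-loop failure")
except RuntimeError:
    pass
print(plate.closed)  # True -- cleanup ran despite the exception
```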
The iohub 0.3.2 port moved tensorstore cache-pool config to the
plate-open site (TripletDataModule and DynaCLR DataModule). The
parameter on TripletDataset and MultiExperimentTripletDataset was
retained in the previous commit as a no-op for call-site compat,
but CLAUDE.md explicitly says to delete unused code rather than
carry backwards-compat ballast. Removing it — and the three
Dataset-construction call sites that still pass it — closes the
loop. DataModule-level cache_pool_bytes stays (it's the one that
actually feeds TensorStoreConfig at open time), so existing YAML
configs are unaffected.

Also updates the TripletDataModule docstring for cache_pool_bytes
to describe the new plate-level semantic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
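The ownership move this commit finishes can be sketched without iohub; all class names and the dict stand-in for TensorStoreConfig below are illustrative:

```python
# Sketch of config ownership after this commit: the DataModule owns
# cache_pool_bytes and spends it once at plate-open time; the Dataset
# no longer accepts the parameter at all.
class DataModule:
    def __init__(self, cache_pool_bytes: int):
        self.cache_pool_bytes = cache_pool_bytes
    def open_plate(self) -> dict:
        # stand-in for TensorStoreConfig built at open_ome_zarr time
        return {"cache_pool_bytes": self.cache_pool_bytes}

class Dataset:
    # note: no cache_pool_bytes parameter anymore
    def __init__(self, plate: dict):
        self.plate = plate

dm = DataModule(cache_pool_bytes=2**30)
ds = Dataset(plate=dm.open_plate())
print(ds.plate["cache_pool_bytes"])  # 1073741824
```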
Three text spots still described the pre-port shape:

- HCSStackIndex.image comment still said "name" and showed the old
  four-segment format without the leading slash.
- SlidingWindowDataset._read_img_window docstring still referred to
  the returned string as "image name".
- The IndexError in _get_windows formatted the FOV as img_arr.path
  without the leading slash prefix the read sites prepend, so the
  error message's FOV identifier didn't match what the sample-id
  tuple stores.

Update all three to match the ported format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- prediction_writer.py:63 — debug f-string said "Z-sclice"; fix to
  "Z-slice" to match the parameter docstring a few lines above.
- qc/focus.py:54 — inline comment referenced a non-existent
  qc_metrics.run_qc_metrics; point to the actual orchestrator,
  generate_qc_metadata.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@edyoshikun
Member

Here are some of my findings:

  • @srivarra can we expose some TensorStoreConfig variables? 0.3.2's file_io_concurrency for NFS I/O parallelism, recheck_cached_data for the heavy-load NFS reads of metadata. These are critical for I/O boosts and need to be exposed for us to evaluate.
  • Why do we have TensorStoreConfig instead of just the raw ts.Context (naive question). In VisCy we use the ts.Context. Currently, we do ts.stack([pos1["0"].native, pos2["0"].native, ...]).read().result() which works, but each .native call goes through iohub's Position wrapper and I think it re-parses .zattrs/.zarray on first access. An iohub helper like plate.stack_positions([path1, path2, ...], array="0") -> ts.TensorStore that batches the metadata reads would cut the per-FOV Python overhead. Even better: a plate-level "open all FOVs under this well" call that shares one ts.Context and one metadata read pass.
  • @srivarra do we always need to go through the ImageArray? Can we have Position.array_tensorstore(array="0") return a ts.TensorStore directly without building a wrapper? Or is this because the default backend is zarr-python?

@ieivanov ieivanov requested a review from srivarra April 22, 2026 19:40
@ieivanov
Copy link
Copy Markdown
Contributor

@srivarra is there an opportunity to introduce better iohub defaults that will reduce some of the boilerplate code in this PR?
