Refactor applications/DynaCell into a reusable benchmark package#404

Draft
alxndrkalinin wants to merge 164 commits into modular-viscy-staging from dynacell-models

Conversation

@alxndrkalinin
Collaborator

@alxndrkalinin alxndrkalinin commented Apr 7, 2026

Summary

Makes applications/dynacell into the reusable dynacell package in VisCy.

Splits config responsibilities more cleanly:

  • VisCy keeps reusable dynacell code plus generic model recipes/examples.
  • Paper-specific benchmark state such as SEC61B run configs is removed from VisCy and is expected to live in dynacell-paper.

This PR also adds opt-in validation loss support for CellDiff flow-matching so benchmark reruns can log a scalar validation curve without forcing extra validation work on every flow-matching job.

What changed

Config and package reorg

  • Move reusable configs from applications/dynacell/examples/configs/ into applications/dynacell/configs/.
  • Keep configs/recipes/ for reusable fragments and configs/examples/ for generic runnable examples.
  • Refresh generic fit/predict examples for CellDiff, FNet3D, and UNetViT3D, and add generic UNeXt2 fit/predict examples.
  • Remove committed lightning_logs/, SEC61B paper configs, and the hard-coded hcs_sec61b_3d.yml recipe.
  • Update the dynacell README, config-discovery tests, and local ignore rules to match the new layout.

Absorb reusable benchmark modules from dynacell-paper

  • Add dynacell.data with path-based manifest/collection/spec loaders and Pydantic schemas.
  • Add dynacell.reporting with Hydra config, tables, figures, and tests.
  • Add dynacell.evaluation with configs, spectral-PCC tooling, I/O helpers, segmentation, metrics, and tests.
  • Add dynacell.preprocess with the reusable config.py and zarr_utils.py utilities.

Package and CLI shape

  • Extend the dynacell CLI so evaluate and report route to Hydra entry points while fit/predict/test/validate still go through viscy_utils.
  • Ship Hydra _configs inside the Python package so installed dynacell evaluate/report commands resolve config from package data instead of the source tree.
  • Make dynacell.__init__ lazy so lightweight imports do not pull in the full training stack.
  • Add optional dependency extras for eval, report, and preprocess, plus clearer missing-extra install hints from the CLI router.

Runtime, training, and transform fixes

  • Add overwrite= support to HCSPredictionWriter, including explicit handling/logging when prediction channels already exist.
  • Adjust VisCyCLI checkpoint hparam precedence so predict/test/validate honor user config over stale checkpoint values, while fit resume still keeps checkpoint training hparams.
  • Add direct ckpt_path loading to DynacellFlowMatching so predict-time settings come from config rather than checkpoint metadata.
  • Add opt-in compute_validation_loss to DynacellFlowMatching. When enabled it logs loss/val/<idx> and aggregate loss/validate while preserving epoch-end ODE sample generation; when disabled it keeps the previous cheaper validation path.
  • Extract a shared _aggregate_validation_losses() helper so UNet and flow-matching use the same weighted aggregation logic.
  • Skip the strict HCS spatial-shape check when GPU augmentations intentionally handle cropping.
  • Extend BatchedRandAffined with configurable padding_mode, safe_crop_size, and safe_crop_coverage, plus warnings/tests around unsupported X/Y-rotation guarantees.
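The shared `_aggregate_validation_losses()` helper presumably computes a sample-weighted mean across validation dataloaders; a minimal sketch of that aggregation (the name, signature, and accumulator shape here are assumptions, not the actual code):

```python
# Sketch of weighted validation-loss aggregation shared between UNet and
# flow-matching. Assumes each dataloader contributes a (loss_sum, n_samples)
# accumulator pair; the real helper's signature is not shown in this PR.

def aggregate_validation_losses(per_loader: list[tuple[float, int]]) -> float:
    """Return the sample-weighted mean loss over all validation dataloaders."""
    total_loss = sum(loss_sum for loss_sum, _ in per_loader)
    total_count = sum(count for _, count in per_loader)
    if total_count == 0:
        raise ValueError("no validation samples seen")
    return total_loss / total_count
```

Weighting by sample count (rather than averaging per-loader means) keeps a small dataloader from dominating the aggregate `loss/validate` scalar.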

Evaluation and reporting follow-up fixes

  • Fix corr_coef/PCC computation and add regression tests.
  • Make cached evaluate_model() returns consistent with fresh evaluation output types.
  • Fix report barplots for mixed-result inputs where some models are missing requested metrics.
  • Respect use_gpu in evaluation instead of always taking the CUDA path when visible.
  • Raise FileNotFoundError for missing preprocess configs instead of failing less clearly.

Benchmark schema + launcher (landed earlier on this branch)

  • Commit the configs/benchmarks/virtual_staining/ layout with shared axes (train_sets/, targets/, model_overlays/, launcher_profiles/, predict_sets/) and per-leaf train/predict files composed via viscy_utils.compose.load_composed_config.
  • Migrate the CellDiff train + predict configs for ER/mito/nucleus/membrane onto schema leaves; add FNet3D paper, UNetViT3D, and UNeXt2 train leaves for ER.
  • Add applications/dynacell/tools/submit_benchmark_job.py with --dry-run, --print-script, --print-resolved-config, and --override. It composes a leaf, strips the launcher: + benchmark: reserved keys, freezes the resolved config to {run_root}/resolved/, and renders an sbatch script to {run_root}/slurm/.
  • Extend _maybe_compose_config in viscy_utils/cli.py so direct dynacell fit -c <leaf> also strips those reserved keys.
  • Add a ckpt_sha256_12 sidecar to the evaluation cache so repeated lookups skip full-file hashing.
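The `ckpt_sha256_12` sidecar idea, hash once and cache the 12-hex-character prefix next to the checkpoint, can be sketched as follows (the sidecar naming and mtime-based invalidation policy are assumptions):

```python
# Sketch of the ckpt_sha256_12 sidecar cache: hash the checkpoint once,
# store the 12-char SHA-256 prefix next to it, and reuse it on later lookups.
# Sidecar file naming and the mtime invalidation rule are assumptions.
import hashlib
from pathlib import Path

def ckpt_sha256_12(ckpt: Path) -> str:
    """Return the first 12 hex chars of the checkpoint's SHA-256,
    skipping full-file hashing when a fresh sidecar already exists."""
    sidecar = ckpt.with_suffix(ckpt.suffix + ".sha256_12")
    if sidecar.exists() and sidecar.stat().st_mtime >= ckpt.stat().st_mtime:
        return sidecar.read_text().strip()
    h = hashlib.sha256()
    with ckpt.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # stream in 1 MiB chunks
            h.update(chunk)
    digest12 = h.hexdigest()[:12]
    sidecar.write_text(digest12)
    return digest12
```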

Topology / trainer-recipe ownership cleanup

Four commits (9734d07 → f9b8f1e → 5b7eaae → 3fdb7cf) untangle three layers that were previously mingled in recipes/trainer/fit_*.yml:

  • Runtime profile renamed. runtime_single_gpu.yml → runtime_shared.yml: the file's content (srun + PYTHONUNBUFFERED/NCCL_DEBUG/PYTHONFAULTHANDLER env vars) has nothing single-GPU-specific, and every leaf — including the 4-GPU UNeXt2 leaf — composed it. Rename + 14 leaf-base updates + schema-doc references.
  • New topology layer (recipes/topology/single_gpu.yml + ddp_4gpu.yml) owns trainer.accelerator/devices/strategy/num_nodes. Duplicated independently under applications/dynacell/configs/recipes/topology/ and applications/cytoland/examples/configs/recipes/topology/ because CLAUDE.md forbids cross-application imports.
  • Unified trainer recipes: recipes/trainer/fit.yml + predict.yml per app, replacing fit_1gpu/fit_4gpu/fit_fm_4gpu/predict_gpu. Mode invariants only (seed, log cadence, callbacks, WandbLogger class pinned with per-app project). Topology and precision are owned by the topology recipes and model overlays respectively. Every benchmark model overlay and example leaf now composes [fit.yml + topology/*.yml] and sets precision + max_epochs explicitly.
  • Hardware profiles (hardware_h200_single, hardware_gpu_any_long, hardware_4gpu) drop their trainer: block; they now own only launcher.sbatch.*.
  • Intentional behavior deltas (spelled out, not masked): strategy: ddp → auto on every single-GPU leaf (Lightning-equivalent at devices=1, just intent-honest); dynacell examples fnet3d/unetvit3d/unext2/fit.yml and benchmark unext2.yml now pin WandbLogger project=dynacell instead of falling through to Lightning's default TensorBoard; fit_1gpu-derived paths move from save_top_k: 4 to save_top_k: 5.
  • Preserved via leaf-body overrides: FNet3D paper keeps precision: 32-true; examples/celldiff/fit.yml keeps FM-style epoch-based checkpointing (save_top_k: -1, every_n_epochs: 10, no monitor); cytoland FCMAE pretraining keeps strategy: ddp_find_unused_parameters_true; cytoland→dynacell bridge configs keep project: dynacell override over cytoland's default.
  • LEGACY deletion: applications/dynacell/tools/LEGACY/examples_configs/ (pre-schema CellDiff configs) is removed outright per CLAUDE.md's "avoid backwards-compatibility hacks" rule. The equivalence tests that composed LEGACY (test_*_leaf_matches_legacy, test_fnet3d_paper_leaf_matches_ran_config, test_byte_equivalence_sec61b_train_leaf) are replaced with forward-looking composition sanity tests.
  • Plan + review trail: plan is in ~/.claude/plans/vectorized-sleeping-clock.md; commits ran through /review-plan and /verify-plan subagent rounds before landing.
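As described, a topology recipe is a trainer-only fragment; a hypothetical sketch of `recipes/topology/single_gpu.yml` (field names follow Lightning's Trainer arguments; the actual file contents are not shown in this PR):

```yaml
# recipes/topology/single_gpu.yml -- hypothetical sketch.
# Owns only the process topology; precision, max_epochs, callbacks, and
# logging stay with the trainer recipe and model overlays.
trainer:
  accelerator: gpu
  devices: 1
  num_nodes: 1
  strategy: auto
```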

Dependency and packaging updates

  • Pin microssim to a compatible Git revision and refresh uv.lock so the new dynacell[eval] extra resolves with the current workspace.
  • Allow direct references in hatch metadata so the workspace can build/package those dependency pins cleanly.

Why

This is the VisCy side of the dynacell-paper split:

  • dynacell in VisCy becomes the reusable runtime.
  • dynacell-paper owns benchmark-instance configs, orchestration, and manuscript-specific assets.

That gives us a cleaner boundary for future work like multi-experiment collections and benchmark-specific runners without keeping paper state in the shared application package.

The flow-matching validation-loss change is intentionally opt-in because it adds extra validation compute and the resulting metric is still stochastic; default flow-matching checkpointing remains epoch-based.

The topology / trainer-recipe cleanup at the end removes a silent-drop-DDP trap: before this change, pairing a mismatched trainer recipe and benchmark hardware profile let the hardware profile's trainer.devices win (both layers redundantly set it), which could silently run a DDP training on 1 GPU. Topology is now the single source of truth.

Verification

  • uvx ruff check applications/dynacell/
  • uvx ruff format --check applications/dynacell/
  • .venv/bin/python -m pytest applications/dynacell/tests -q
  • .venv/bin/python -m pytest applications/dynacell/tests/test_data_manifests.py applications/dynacell/tests/test_reporting_tables.py applications/dynacell/tests/test_reporting_tables_extended.py applications/dynacell/tests/test_reporting_figures.py applications/dynacell/tests/test_evaluation_metrics.py applications/dynacell/tests/test_evaluation_pipeline.py applications/dynacell/tests/test_evaluation_io.py applications/dynacell/tests/test_preprocess_config.py applications/dynacell/tests/test_preprocess_zarr_utils.py applications/dynacell/tests/test_cli_routing.py -q
  • .venv/bin/python -m pytest applications/dynacell/tests/test_engine.py -q
  • .venv/bin/python -m pytest applications/dynacell/tests/test_training_integration.py::test_celldiff_fm_validation_loss_keeps_generation applications/dynacell/tests/test_training_integration.py::test_celldiff_fm_warmup_cosine_fast_dev_run applications/dynacell/tests/test_training_integration.py::test_celldiff_fm_constant_schedule_fast_dev_run -q
  • .venv/bin/dynacell fit --print_config -c applications/dynacell/configs/examples/unext2/fit.yml
  • uv run pytest applications/dynacell/tests/test_benchmark_config_composition.py applications/dynacell/tests/test_submit_benchmark_job.py -v (benchmark schema + submit-tool tests)
  • Topology/trainer cleanup end-to-end:
    • Composition snapshot of 35 touched leaves before vs after the cleanup; only expected intentional deltas (strategy ddp → auto at devices=1, WandbLogger added where previously default TB).
    • uv run dynacell fit -c <benchmark-unext2-leaf> --trainer.fast_dev_run=true --trainer.devices=1 --trainer.strategy=auto ran one training step cleanly on an A40 interactive node (loss 1.048).
    • submit_benchmark_job.py <benchmark-unext2-leaf> queued a real 4-GPU DDP job (31122607) via the new schema path.

alxndrkalinin and others added 30 commits April 7, 2026 14:28
When gpu_augmentations include a spatial crop (e.g. BatchedCenterSpatialCropd),
the output shape intentionally differs from z_window_size/yx_patch_size.
The validation was raising a false ValueError for configs like UNeXt2
(z_window=20 read, gpu crop to 15) and CellDiff (z_window=13 read, crop to 8).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
z_window_size 15→20: restores the 5-slice Z margin for affine augmentation
(20 read → 15 after GPU center crop), matching the original VSCyto3D
finetune_3d.py pipeline.

batch_size 8→64 with num_samples 2→4: each GPU now processes 16 samples
(~13 GB VRAM) instead of 2 (~2 GB), reducing GPU idle time from ~97% to
~84%. LR scaled by sqrt(8) for Adam (0.0002→0.0006).

SLURM: mem-per-cpu 12G→20G for /dev/shm mmap buffer headroom. Removed
--ckpt_path for fresh training with new hyperparameters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…l crop

z_window 8→13 with spatial_size [13,624,624]: provides 5-slice Z margin
(13→8 after GPU center crop) and 112px YX margin (624→512) for affine
augmentation artifacts, matching the UNeXt2 two-stage crop strategy.

batch_size 4→8: doubles throughput without VRAM pressure.

Phase3D normalization: median/iqr→mean/std to match UNeXt2 and the
original VSCyto3D pipeline.

val_gpu_augmentations: DivisibleCropd→BatchedCenterSpatialCropd because
CellDiff's ViT requires exact input_spatial_size [8,512,512] (fixed
positional embeddings), not just divisible dimensions.

max_epochs 200→10: initial smoke-test run before scaling up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
crop_at_read was removed in the mmap preload refactor but this error
message still referenced it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Spell out full read/input sizes in header comment.
Add single-GPU SLURM script for A100/H100 (80GB).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add scratch_dir: /dev/shm and /dev/shm cleanup to SLURM trap,
matching the UNeXt2 config. 256G RAM allocation covers the 86 GB
mmap buffer in tmpfs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fit_unext2_continue.yml had no ckpt_path, causing the job to train
from scratch instead of resuming from epoch=15 (step=10128).
Points to last-v1.ckpt which is the actual latest checkpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reduce lr 0.0006→0.0004 and batch_size 64→32 to fit within node RAM
budget. Fix val_gpu_augmentations: BatchedDivisibleCropd with k=
[1,64,64] left Z at 20 slices (no Z reduction); replace with
BatchedCenterSpatialCropd roi_size=[15,384,384] to match the model's
expected input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ViT with 8×512×512 patches is significantly larger than UNeXt2;
batch_size=8 OOMs on a single GPU. Reduce to 2 (1 FOV → 2 patches).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previous checkpoints used wrong val aug (BatchedDivisibleCropd,
no Z reduction) and different lr/batch_size. Resuming would mix
incompatible training dynamics. Start fresh with corrected config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Single GPU gave 10,816 steps/epoch (2h) due to 32 overlapping
z-windows per FOV × 338 FOVs. With 4-GPU DDP at batch=2/GPU, steps
drop to 2,704/epoch (~31 min). mmap buffer is OS-shared so RAM stays
flat. Reduce num_workers 8→4 (4 workers × 4 ranks = 32 processes
for 32 CPUs).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Kornia 0.8.x hard-codes padding_mode='zeros' in RandomAffine3D.
Add _PaddedRandomAffine3D subclass that overrides apply_transform
to pass the user-specified mode to warp_affine3d (which already
supports it).

Expose padding_mode='border'|'reflection' in BatchedRandAffined
for configs where crop/output ratio < √2 and the oversized border
cannot absorb large rotations without zero-corner artifacts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
624/512 = 1.22 < √2, so the 56px oversized-crop border cannot absorb
rotations larger than ~14°. With rotate_range=±π, large rotations
leave visible zero-corner gaps in the training batches (not an issue
for UNeXt2 which has ratio 600/384=1.56 > √2).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LightningCLI._parse_ckpt_path merges checkpoint hyper_parameters back
into the config before class instantiation, so a checkpoint trained
with predict_method='generate' would silently override the user's
predict_method='sliding_window' in the YAML config.

Fix by mirroring DynacellUNet: add ckpt_path to __init__, load state
dict there directly, and exclude it (along with all predict-time params)
from save_hyperparameters. With no top-level ckpt_path in the config,
_parse_ckpt_path is never triggered.

Move ckpt_path and output_store from the predict_gpu.yml recipe into
the per-experiment predict config; recipes stay as pure templates.
Add sec61b/predict_celldiff.yml as the canonical SEC61B CellDiff
predict config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LightningCLI._parse_ckpt_path applies checkpoint hyper_parameters as
the highest-priority config layer, overriding values the user explicitly
set in the YAML. The correct hierarchy is:
  base-class defaults → checkpoint hparams → user config

Snapshot model init_args before the checkpoint merge and restore them
after, so any value present in the user's config always wins.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
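The snapshot/restore precedence above can be sketched on a plain config dict (the real implementation works on LightningCLI's parsed namespaces; the dict shape and function name here are illustrative):

```python
# Sketch of the hparam-precedence fix: snapshot user-set init_args before
# the checkpoint merge and restore them afterwards, so the hierarchy is
# defaults -> checkpoint hparams -> user config. Dict shape is illustrative.

def merge_with_user_precedence(config: dict, checkpoint_hparams: dict) -> dict:
    """Apply checkpoint hparams, then restore every key the user set
    explicitly, so values in the user's config always win."""
    user_snapshot = dict(config["model"]["init_args"])        # before the merge
    config["model"]["init_args"].update(checkpoint_hparams)   # checkpoint layer
    config["model"]["init_args"].update(user_snapshot)        # user wins back
    return config
```

In the scenario described earlier, a checkpoint trained with `predict_method='generate'` no longer overrides a YAML `predict_method='sliding_window'`, while hparams the user never set still flow through from the checkpoint.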
When rotation + zoom-out combine, the backward-warp footprint can
exceed the source crop, creating zero-corner artifacts. The new
safe_crop_size parameter computes a per-sample scale floor from the
sampled rotation angle: s_min = coverage * k(θ) * D / S, where
k = |cos θ| + |sin θ|. safe_crop_coverage (default 1.0) relaxes the
constraint — 0.9 allows small corners as extra augmentation while
eliminating the worst ~30% of artifacts.

Also removes padding_mode=border from CellDiff config (the scale
clamping makes it unnecessary).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
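The scale-floor formula above can be written out directly; a small sketch (parameter names are illustrative, the math follows the commit message):

```python
# Sketch of the safe_crop_size scale floor: s_min = coverage * k(theta) * D / S
# with k = |cos theta| + |sin theta|, so the backward-warp footprint of a
# crop of size D stays inside the source crop of size S.
import math

def safe_scale_floor(theta_rad: float, crop_size: float, source_size: float,
                     coverage: float = 1.0) -> float:
    """Per-sample minimum zoom for a rotation of theta_rad; coverage < 1.0
    relaxes the constraint and tolerates small zero corners."""
    k = abs(math.cos(theta_rad)) + abs(math.sin(theta_rad))
    return coverage * k * crop_size / source_size
```

For the 624 → 512 crop discussed above, theta = 0 gives s_min = 512/624 ≈ 0.82, while theta = 45° (the worst case, k = √2) gives s_min ≈ 1.16, i.e. a forced zoom-in to avoid zero corners.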
Reorganize into four sections: Project & Context (what is this),
Development (how to get running), Project Conventions (patterns here),
Engineering Standards (how to write code).  Remove duplicate Code Style
heading.  Consistent structure with dynacell CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
`str or None` → `str | None` for consistency with the rest of the
file and project conventions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The docstring said validation always runs after gpu_augmentations,
but the code skips validation when gpu_augmentations is present
(they handle cropping themselves). Update to match actual behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When overwrite=True, existing prediction channels in the output store
are silently reused instead of raising FileExistsError. Default False
preserves the previous error-on-duplicate behavior. Needed for
re-running predictions on the same output store during iteration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SEC61B benchmark configs (fit, predict, SLURM scripts) are paper-specific
and belong in the dynacell-paper repo. The hcs_sec61b_3d data recipe
hardcodes HPC paths. All preserved on preserve/sec61b-configs branch.

Also adds .gitignore for lightning_logs/ and outputs/, and updates
README to reference the new config layout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three-layer config model:
- configs/recipes/ — reusable fragments (model, trainer, data, modes)
- configs/examples/ — generic fit/predict pair per model family

Fix base: references (../recipes/ → ../../recipes/), scrub hardcoded
SEC61B paths from celldiff predict, add missing ckpt_path to fnet3d
and unetvit3d predict, change preload default to false in generic
recipe, update test config discovery path and __main__ docstring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
UNeXt2 had a model recipe but no generic example. Adds fit.yml and
predict.yml following the same pattern as fnet3d/unetvit3d/celldiff
examples. Config discovery test now finds 8 leaf configs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Absorb dataset manifest schemas from dynacell-paper and add new
benchmark collection and spec schemas per VISCY_HANDOFF.md.

- manifests.py: DatasetManifest, TargetConfig, VoxelSpacing,
  StoreLocations, SplitDefinition + load_manifest, load_splits,
  get_target
- collections.py: Provenance, ChannelEntry, CollectionExperiment,
  BenchmarkCollection + load_collection
- specs.py: BenchmarkSpec + load_benchmark_spec

No registry dict, no import-time side effects. Callers pass paths
explicitly. Add pydantic>=2 and omegaconf to base dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Reuse viscy_data.collection.ChannelEntry instead of duplicating it
- Extract shared OmegaConf+Pydantic loading to _yaml.load_yaml()
- Keep Provenance local (stricter than viscy_data version: required
  created_at/created_by for benchmark traceability)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Migrate tables, figures, and Hydra CLI entry point for benchmark
reporting. tables.py is pure pandas, figures.py preserves Agg backend
ordering, cli.py uses parents[3] for config resolution.

Tests use module-level pytest.importorskip() for pandas/matplotlib
so they skip gracefully without those deps installed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Migrate rewrite_zarr() and load_preprocess_config() — the clearly
reusable, dependency-light preprocess helpers. Dataset-specific
utilities (selection, segmentation, workflow) remain in dynacell-paper.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lightning subcommands (fit, predict, test, validate) delegate to
viscy_utils.cli.main(). Hydra subcommands (evaluate, report) lazily
import their entry points. ModuleNotFoundError prints an install hint
for the missing extra instead of a raw traceback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
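The routing described above can be sketched as a small dispatcher; the subcommand and extras names come from this PR, but the entry-point module paths and function shape are assumptions:

```python
# Sketch of the CLI router: Lightning subcommands delegate, Hydra subcommands
# import lazily, and a missing optional extra yields an install hint instead
# of a raw traceback. Entry-point module paths below are assumed.
import importlib

_EXTRA_FOR_SUBCOMMAND = {"evaluate": "eval", "report": "report"}
_ENTRY_MODULE = {"evaluate": "dynacell.evaluation.cli",
                 "report": "dynacell.reporting.cli"}

def route(subcommand: str) -> str:
    """Return where a subcommand is routed, or an install hint if the
    optional extra backing it is not installed."""
    if subcommand in ("fit", "predict", "test", "validate"):
        return "viscy_utils.cli.main"
    try:
        importlib.import_module(_ENTRY_MODULE[subcommand])
    except ModuleNotFoundError as err:
        extra = _EXTRA_FOR_SUBCOMMAND[subcommand]
        return (f"missing dependency {err.name!r}: "
                f"pip install 'dynacell[{extra}]'")
    return _ENTRY_MODULE[subcommand]
```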
Full evaluation pipeline: pixel metrics (PCC, SSIM, NRMSE, PSNR),
segmentation metrics (Dice, IoU), feature metrics (DINOv3, DynaCLR FID),
spectral PCC subpackage, and Hydra CLI entry point.

Heavy deps (segmenter_model_zoo, aicssegmentation, cubic, microssim,
transformers, dynaclr, skimage) gated with try/except ImportError.
Pipeline imports segmentation/utils lazily inside functions so
`import dynacell.evaluation.pipeline` succeeds without heavy deps.

eval.yaml uses OmegaConf ??? markers for all required fields so
running without overrides gives MissingMandatoryValue, not AttributeError.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- eval: all deps for dynacell evaluate (segmentation, metrics, cubic,
  transformers, dynaclr, etc.)
- report: pandas + matplotlib + hydra for dynacell report
- preprocess: iohub + tqdm for zarr rewriting utilities

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
alxndrkalinin and others added 9 commits April 20, 2026 18:47
Restructure applications/dynacell/configs/benchmarks/virtual_staining/
so train, predict, and eval leaves for the same benchmark cell live
under one directory `<org>/<train_set>/<model>/`, with shared building
blocks split by consumer:

- `shared/model/` — train/predict building blocks (train_sets,
  predict_sets, targets, model_overlays, launcher_profiles)
- `shared/eval/` — HPC-bound Hydra eval groups (target, feature_extractor/dynaclr)
- `<org>/<train_set>/<model>/{train.yml, predict/<predset>.yml, eval/<predset>.yaml}`

Fold `configs/evaluation/` into the benchmark tree and delete the
sibling root. The eval selector renames from `benchmark=<path>` to
`leaf=<path>` to reflect the new filesystem model; leaves are addressed
through a committed symlink tree at `virtual_staining/leaf/` because
Hydra's group resolution requires a physical `leaf/` prefix. Eval
leaves keep `.yaml` (Hydra only discovers `.yaml` for groups); train
and predict leaves stay `.yml`.

`dynacell.__main__` now injects two `hydra.searchpath` roots — the
benchmark tree (for `leaf/`) and `shared/eval/` (for `target/` and
`feature_extractor/dynaclr/`). The `_EXTERNAL_CONFIGS_SUBPATH` constant
becomes `_EXTERNAL_SEARCHPATHS`, and the helper is renamed to
`_external_configs_dirs` (plural, returning a list).

Also fixes a pre-existing bug: the three unetvit3d train leaves under
mito/nucleus/membrane referenced a non-existent `runtime_single_gpu.yml`
launcher profile; they now use `runtime_shared.yml`. Model-overlay
`base:` entries targeting `recipes/` gain an extra `..` segment to
account for the new `shared/model/` depth.

Tests:
- test_benchmark_config_composition.py: path literals, three new
  (unetvit3d × {mito,nucleus,membrane}) pairs, new
  test_eval_leaf_symlink_resolves parametrised over all 8 eval leaves.
- test_cli_routing.py: three `_external_configs_dir` patches renamed
  (plural) and argv selector strings updated to `leaf=`.
- test_submit_benchmark_job.py: parametrised leaf paths.

Docs: virtual_staining/README.md (full rewrite to match new tree),
evaluation/README.md (groups table, external-users section, benchmark
eval leaves section), UNEXT2_VS_FCMAE_CLASSES.md (stale path), dynacell/
README.md (sbatch example path), BENCHMARK_CONFIG_SCHEMA.md (Hydra
search-path contract note about the `leaf/` symlink adapter).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…l/ move

The preceding refactor commit renamed train and predict leaves to
<org>/<train_set>/<model>/{train.yml, predict/<predset>.yml} but the
content updates that prepend the shared/model/ segment to every base:
line were unstaged when that commit landed. This catches the 25 train
and predict leaves up with the new shared/model/ layout, and tightens a
prose comment in fnet3d_paper_fit.yml that still referenced the
pre-refactor shared/targets/ path.

Without this fix a fresh checkout of the reorg commit has train and
predict leaves pointing at shared/train_sets/, shared/targets/, etc. —
paths that no longer exist — so `load_composed_config` would raise
FileNotFoundError on every leaf.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trim the WHAT-narration from docstrings and comments in
dynacell.__main__: drop the redundant "two roots are injected"
enumeration (the tuple is self-describing), fix the "repo root"
claim in _external_configs_dirs (the walk stops at the app's
pyproject.toml, not the workspace root), and drop the
"joined with commas" line from _inject_external_configs' docstring.
Bind ``p = parent / sub`` via walrus inside the existence filter
so the path is computed once instead of twice per subpath.

In the composition tests, delete the ``EVAL_LEAVES = PREDICT_LEAVES``
alias (the aliased list was the same object, causing silent coupling)
and drive the symlink-integrity test off ``PREDICT_LEAVES`` directly.
Drop the ``@package _global_`` header assertion from the symlink test
— that check belongs to composition, not to symlink integrity. Add a
``REPO_ROOT`` sanity assert so drift of the ``parents[3]`` magic
count fails loudly.

Tighten ``TestInjectExternalConfigs`` class docstring to a single
line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the "in-package / repo checkout" shorthand in the config-groups
table with concrete source paths, so readers see exactly where each
group lives after the reorg: shared/eval/ for HPC-bound groups, leaf/
for the eval-leaf symlink tree, _configs/ for packaged schema.

Tighten the "Running an evaluation" intro so it no longer implies all
groups live under _configs/, and spell out the two hydra.searchpath
roots dynacell.__main__ injects.

Add a footgun entry for external users about Hydra's .yaml-only group
resolution — saving a group as .yml makes the selector silently fail
with MissingConfigException.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Keep the top level of virtual_staining/ biology-only: `er/`, `membrane/`,
`mito/`, `nucleus/` plus README.md. Everything else — shared building
blocks and the leaf/ Hydra symlink adapter — now lives under a single
_internal/ directory whose leading underscore signals "implementation
detail; don't browse here for science."

Layout is now:

    virtual_staining/
      <org>/<train_set>/<model>/{train.yml, predict/, eval/}
      _internal/
        shared/
          model/         # train/predict building blocks
          eval/          # target/, feature_extractor/dynaclr/
        leaf/            # symlink tree aliasing canonical eval leaves

dynacell.__main__ injects two hydra.searchpath roots under _internal/
(the tree root for leaf/, and shared/eval/ for target + dynaclr) —
same cardinality as before, just under the hidden support tree.

Train/predict leaves get `_internal/` prepended to every base: path.
Model overlays gain one more `../` to reach configs/recipes/ (now 6
segments deep instead of 5). The 8 leaf/ symlinks are recreated with
6 relative-dot segments (was 5 at virtual_staining/leaf/, now 6 at
virtual_staining/_internal/leaf/).

Docs refreshed across BENCHMARK_CONFIG_SCHEMA.md, virtual_staining/
README.md, and evaluation/README.md: every layout tree, base: example,
group-source column, and search-path reference points at _internal/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related bugs in predict-path tiling:

P1. DynacellFlowMatching.predict_step routed predict_method='sliding_window'
to CELLDiff3DVS.generate_sliding_window(), which is the non-overlapping
tiler and does not consume predict_overlap. Any config that set both
predict_method='sliding_window' and predict_overlap=[...] silently got
non-overlapping tiles, producing different predictions than the user
expected. Now raise ValueError when sliding_window is combined with a
non-zero overlap, pointing users to predict_method='iterative' (the
overlap-anchored velocity tiler) instead. Also refresh the docstring
to spell out which methods consume predict_overlap. Fix the membrane
predict leaf to match its peers (er, mito): predict_method=iterative.
Example config at configs/examples/celldiff/predict.yml gets the same
fix plus an inline comment clarifying which methods overlap.

P2. DynacellUNet.predict_sliding_window allocated both accumulator
buffers (prediction_sum, prediction_count) with zeros_like(source),
so their channel dimension was always in_channels. For models whose
out_channels != in_channels (common: 1 phase in -> 2 target out), the
first `prediction_sum[slicer] += patch_out` raised a broadcast error.
Allocate the accumulators lazily from the first patch output so they
take the model's actual out_channels and dtype.

Regression tests added for both paths:
- sliding_window + non-zero overlap must raise ValueError
- predict_sliding_window returns the correct shape when out_channels=2

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
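The P2 fix, allocating the accumulators from the first patch output instead of `zeros_like(source)`, looks roughly like this in 1-D pure Python (the real code operates on multi-channel torch tensors; the window/model details here are illustrative):

```python
# Sketch of lazy accumulator allocation for overlap-averaged sliding-window
# prediction. Buffers are sized from the first patch output, so out_channels
# may differ from in_channels (e.g. 1 phase channel in -> 2 targets out).

def predict_sliding_window(source, patch, stride, model):
    """source: list of C_in channel lists, each length N.
    model: maps a (C_in, patch) window to a (C_out, patch) output."""
    n = len(source[0])
    pred_sum = pred_cnt = None
    for start in range(0, n - patch + 1, stride):
        window = [ch[start:start + patch] for ch in source]
        out = model(window)                       # shape (C_out, patch)
        if pred_sum is None:
            c_out = len(out)                      # lazy: from first output,
            pred_sum = [[0.0] * n for _ in range(c_out)]  # not from source
            pred_cnt = [0] * n
        for c in range(len(out)):
            for i, v in enumerate(out[c]):
                pred_sum[c][start + i] += v
        for i in range(patch):
            pred_cnt[start + i] += 1
    # Average overlapping contributions; uncovered positions stay 0.0.
    return [[s / cnt if cnt else 0.0 for s, cnt in zip(row, pred_cnt)]
            for row in pred_sum]
```

With eager `zeros_like(source)` allocation, the first `pred_sum[...] += out` would fail whenever `C_out != C_in`; sizing from the first output sidesteps the mismatch and also inherits the model's output dtype in the tensor version.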
The earlier "always append" strategy broke argparse when users placed
diagnostic flags (-c job, --info, --help, ...) after positional
overrides:

    $ dynacell evaluate leaf=er/... -c job
    dynacell: error: unrecognized arguments: hydra.searchpath=[file://...]

Hydra's argparse declares ``overrides`` as a single positional with
nargs="*". Once that nargs run is interrupted by a flag, any later
positional is reported unrecognized. Appending the injected token at
the end scatters a positional across the flag in the override-first
layout; prepending scatters it across the flag in the flag-first layout.

Fix: insert the token adjacent to an existing positional override when
one is present, so positionals stay contiguous whichever side the
flags sit on. Fall back to append when the user passed no positional
overrides (e.g. bare --help).

Three regression tests cover the three layouts: flag-leading,
flag-trailing, and no-positional. The former "appends always" test is
replaced by the layout-specific pair.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
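The insertion rule described above can be sketched as a simple scan; this is a simplified heuristic (the real implementation may treat flag arguments more carefully, and the token value is illustrative):

```python
# Sketch of argparse-safe searchpath injection: keep Hydra's nargs="*"
# positional overrides contiguous by inserting the new token next to an
# existing positional, falling back to append when there are none.

def inject_override(argv: list[str], token: str) -> list[str]:
    """Insert `token` adjacent to the first positional override so the
    positionals stay contiguous whether flags lead or trail."""
    for i, arg in enumerate(argv):
        if not arg.startswith("-"):        # first positional, e.g. key=value
            return argv[:i + 1] + [token] + argv[i + 1:]
    return argv + [token]                  # no positionals, e.g. bare --help
```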
Tighten the "Running an evaluation" lead so readers see up front that
repo-checkout groups live under virtual_staining/_internal/ (the
hidden support tree), not loose under virtual_staining/. The detailed
bullet list below already spelled this out; the intro now matches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 191 out of 201 changed files in this pull request and generated 1 comment.



Comment on lines +65 to +69
    if target_name in ["nucleus", "membrane"]:
        _require_segmenter_model_zoo()
        if seg_model is None:
            raise ValueError("SegModel must be provided for nucleus and membrane segmentation.")
        mask = seg_model.apply_on_single_zstack(img[None, ...])
Copilot AI Apr 21, 2026
The ValueError message for missing seg_model says "SegModel must be provided", but the parameter is seg_model and the required type here is the loaded SuperModel. Updating the message to reference seg_model/SuperModel will make the failure clearer for users.

alxndrkalinin and others added 18 commits April 21, 2026 08:09
Lightning's SLURMEnvironment rejects --ntasks at runtime
(slurm.py:_validate_srun_variables) and demands --ntasks-per-node.
Yesterday's fix emitted the wrong directive; all four FCMAE jobs
failed within 50 s on the cluster. Switch the sbatch generator,
hardware profiles, pre-submit validator, and invariant test to use
ntasks_per_node throughout. The enforced invariant is now
devices == ntasks_per_node and gpus == nodes × devices.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
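The enforced invariant can be sketched as a pre-submit check (the function and parameter names are illustrative, not the actual validator's signature):

```python
def validate_slurm_profile(nodes, devices, ntasks_per_node, gpus):
    """Reject profiles Lightning's SLURMEnvironment would refuse at runtime."""
    if ntasks_per_node != devices:
        raise ValueError(
            f"devices ({devices}) must equal ntasks_per_node ({ntasks_per_node})"
        )
    if gpus != nodes * devices:
        raise ValueError(f"gpus ({gpus}) must equal nodes * devices ({nodes * devices})")
```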
Gated HF repos (DINOv3) need per-user access grants, but the downloaded
weights are identical. Before this change every team member who ran
`dynacell evaluate compute_feature_metrics=true` re-downloaded ~2 GB
of DINOv3 weights into their own `~/.cache/huggingface/`. Even ignoring
the network cost, only one team member has HF gate access right now —
every other member's runs fail with a 403 at model fetch.

Fix: point HF_HOME at a team-shared project-storage dir. First user
with gated access downloads once; everyone else reads the shared copy.

Two layers so the behavior is automatic regardless of invocation:

1. `dynacell.__main__._maybe_set_shared_hf_cache()` runs before Hydra
   subcommands (evaluate, precompute-gt, report). It defaults HF_HOME
   to
   `/hpc/projects/comp.micro/virtual_staining/models/dynacell/evaluation/hf_cache/`
   when three conditions hold: the user hasn't set HF_HOME themselves,
   we're on a repo checkout (same signal as external configs), and the
   shared dir exists on this machine. Wheel installs and non-HPC
   machines fall through to the normal ~/.cache/huggingface default.

2. `runtime_shared.yml` also sets HF_HOME for sbatch-rendered fit /
   predict jobs. Those jobs don't currently load HF models, but having
   the env var declared in the launcher profile future-proofs any
   later workflow that does and keeps the team convention visible in
   config.

Four regression tests cover the auto-setter: user-set wins, wheel
install skips, missing shared dir skips, repo checkout + existing dir
sets. README documents the mechanism and first-time download contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
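A sketch of the auto-setter's decision order, under the three conditions stated above; the name and path mirror the commit message, but the body is illustrative, not the shipped implementation:

```python
import os
from pathlib import Path

SHARED_HF_CACHE = (
    "/hpc/projects/comp.micro/virtual_staining/models/dynacell/evaluation/hf_cache/"
)

def maybe_set_shared_hf_cache(on_repo_checkout: bool) -> None:
    if "HF_HOME" in os.environ:
        return  # a user-set value always wins
    if not on_repo_checkout:
        return  # wheel installs keep the ~/.cache/huggingface default
    if not Path(SHARED_HF_CACHE).is_dir():
        return  # non-HPC machines fall through
    os.environ["HF_HOME"] = SHARED_HF_CACHE
```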
Three defaults were required on every eval / precompute-gt invocation
even though they are effectively fixed for our benchmark set:

  feature_extractor/dinov3=lvd1689m       (only DINOv3 we evaluate with)
  feature_extractor/dynaclr=default       (team DynaCLR checkpoint)
  io.gt_cache_dir=/hpc/.../eval_cache/<TARGET>   (one dir per target)

Hoist them so the minimal command

  dynacell precompute-gt target=er_sec61b predict_set=ipsc_confocal

composes everything needed for all four cache families, and

  dynacell evaluate leaf=<...>

carries the same defaults without the redundant override directives.

- eval.yaml defaults list switches
    optional feature_extractor/dinov3: null   -> lvd1689m
    optional feature_extractor/dynaclr: null  -> default
  The `optional` keyword means wheel installs that don't ship
  dynaclr=default (it lives in the repo's shared/eval/ tree) silently
  skip the default — wheel users supply their own via --config-dir.
- Each target YAML now carries io.gt_cache_dir for its target's cache
  path (SEC61B, TOMM20, nucleus, membrane).
- The 8 canonical eval leaves drop the redundant
    override /feature_extractor/dinov3: lvd1689m
    override /feature_extractor/dynaclr: default
  and the per-leaf io.gt_cache_dir — all now inherited.
- README updated: "Enable feature metrics" and "Priming the cache"
  examples show the one-line minimal forms.

To swap variants at call-time nothing changes:
  dynacell evaluate ... feature_extractor/dinov3=<other_variant>
  dynacell evaluate ... feature_extractor.dynaclr.checkpoint=/other.ckpt
  dynacell evaluate ... feature_extractor/dinov3=null   # disable

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
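The hoisted defaults amount to an eval.yaml fragment like the following (group and option names from the commit message; surrounding defaults-list entries omitted):

```yaml
defaults:
  - optional feature_extractor/dinov3: lvd1689m
  - optional feature_extractor/dynaclr: default
```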
Three issues from /simplify agents:

- Hardcoded HPC path in _SHARED_HF_CACHE makes the module CZ Biohub-
  specific; other sites couldn't override without editing source. Read
  DYNACELL_SHARED_HF_CACHE at module load and fall back to the CZ
  Biohub default only when the env var is absent. Extension story is
  now: set DYNACELL_SHARED_HF_CACHE in your shell / sbatch env block
  and the rest of the mechanism just works.
- Docstring on _maybe_set_shared_hf_cache named the sibling helper
  _external_configs_dirs; rephrase in semantic terms ("external Hydra
  searchpaths resolve") so the docstring doesn't rot if the detection
  mechanism ever changes.
- Four test methods in TestMaybeSetSharedHfCache imported `os` inline
  instead of at module top (violates CLAUDE.md). Plus a redundant
  monkeypatch.delenv() cleanup at the end of the last test that
  pytest's monkeypatch fixture already handles on teardown. Lift
  `import os` to module top, drop the inline imports, drop the
  redundant cleanup.

Skipped /simplify nitpicks:
- functools.cache on _external_configs_dirs (6-8 stat calls saved
  per CLI invocation; not worth the cognitive cost of a cache that
  has to coexist with tests that patch the function reference)
- Parametrize the 4 tests (4 is small enough to leave explicit)
- Inline comment on opaque group names (marginal signal)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Setting HF_HOME to the team-shared project-storage dir inadvertently
broke per-user gated-repo authentication: HF_HOME relocates the entire
HF directory tree including the auth token file. The first precompute
attempt with DINOv3 enabled came back 401 "Access to model ... is
restricted. You must have access to it and be authenticated to access
it" — not a missing grant (the user had been granted access, which
would have returned 403), but a missing token. HF was looking under
$HF_HOME/token, where no token file lives.

Switch to HF_HUB_CACHE, which only relocates the weights/datasets
cache and leaves auth state at its default ~/.cache/huggingface/token.
Each user's personal token now enforces gate ACLs on their own
account; the cache of downloaded weights is shared on project
storage as intended.

The _maybe_set_shared_hf_cache helper and its four tests flip from
HF_HOME → HF_HUB_CACHE. The runtime_shared.yml launcher env block
does the same. README gains a paragraph explaining *why* we chose
HF_HUB_CACHE over HF_HOME, and clarifies that subsequent reads from
the shared cache are plain disk reads that don't round-trip to HF and
so don't need a token at all.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
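A hypothetical launcher env block illustrating the convention: only the weights cache is shared, while auth state stays at each user's default location.

```shell
# Share only the weights/datasets cache on project storage;
# the auth token stays at ~/.cache/huggingface/token, so each
# user's own grant still gates downloads from the Hub.
export HF_HUB_CACHE=/hpc/projects/comp.micro/virtual_staining/models/dynacell/evaluation/hf_cache
```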
Previous 4-GPU FCMAE runs OOM-killed before first training step. On
gpu-c-1 sstat showed MaxVMSize=103G/rank at startup — 4 ranks × 103G
≈ 412G of virtual memory, plus mmap_preload /dev/shm staging, pushed
past the 512G cgroup cap. num_workers=8 per rank meant 32 dataloader
workers inflating per-rank VM before the first step. Halving to 4
cuts startup VM roughly in half while keeping prefetch above the
single-worker bottleneck.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…odel>/

Regroup the virtual_staining benchmark tree so a training experiment
lives in one directory. Paths go from <org>/<train_set>/<model>/ with
separate train/predict/eval subdirs to <org>/<model>/<train_set>/ with
flat predict__<predict_set>.yml and eval__<predict_set>.yaml filenames.

The old layout assumed each (org, model) cell had exactly one training
run. New research requirements — joint ipsc_confocal + a549_mantis
training, plus evaluating each trained model on both held-out splits —
make train_set an index within a cell rather than a hierarchy. Grouping
by train set puts a training run (train + its predicts + its evals)
in a single subdir so adding or removing an experiment is one `mkdir`
or one `rm -r`.

33 leaf moves (17 train + 8 predict + 8 eval), 8 _internal/leaf symlink
rewrites (5-up target instead of 6-up, new eval__<X>.yaml filename),
predict base: paths lose one ../ (depth 4 -> 3), tests + tool docstring
+ 4 READMEs track the new layout. New selectors: leaf=er/celldiff/
ipsc_confocal/eval__ipsc_confocal. No __main__.py code change — search
path roots (_internal/ and _internal/shared/eval/) are unaffected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ext early return, fill eval defaults

- nrmse/psnr/ssim now normalize each image independently (min-max) instead of jointly
- init_cache_context early return (no gt_cache_dir) was missing dinov3/dynaclr fields
- eval.yaml: fill in default dinov3 model name and dynaclr encoder config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
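The per-image normalization change can be sketched as follows; the helper names are hypothetical and the real metric implementations live in dynacell.evaluation:

```python
import numpy as np

def minmax(img):
    """Scale one image to [0, 1] by its own min and max."""
    lo, hi = float(img.min()), float(img.max())
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img, dtype=float)

def nrmse(pred, target):
    # Normalize each image independently, not jointly over the pair.
    p, t = minmax(pred), minmax(target)
    return float(np.sqrt(np.mean((p - t) ** 2)))
```

With independent normalization, a prediction that is a uniform rescaling of the target scores a perfect 0, which joint normalization would have penalized.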
…aves

Stage 1 of the A549 expansion roadmap. Benchmark leaves now declare
`benchmark.dataset_ref: {dataset, target}` instead of hardcoding
`data_path`, `source_channel`, and `target_channel` per target. A
composition-time resolver reads a Pydantic DatasetManifest YAML from
roots configured via (in order) CLI roots → DYNACELL_MANIFEST_ROOTS env
var → entry points under `dynacell.manifest_roots`. The resolver is a
plain callable threaded through `viscy_utils.compose.load_composed_config`
as a keyword-only `resolver=` kwarg so viscy-utils stays generic.

Partial `dataset_ref` is a strict no-op: leaves that inherit only one
half (e.g. `{dataset: aics-hipsc}` from the shared ipsc_confocal
train_set) compose byte-identically until their target fragment is also
migrated. This makes Stage 2 (per-organelle migration) safe to roll out
incrementally.

Migrates the full ER-on-iPSC slice as a proof of end-to-end parity:
targets/er_sec61b.yml, train_sets/ipsc_confocal.yml,
predict_sets/ipsc_confocal.yml, and both ER predict leaves (celldiff,
unetvit3d). Non-ER leaves inherit partial refs and are untouched until
their targets migrate. Shared `source_channel: Phase3D` stays in the
ipsc_confocal fragments; the resolver tolerates matching values and
raises only on mismatch, which keeps non-ER leaves composing correctly
during the staged rollout.

The Lightning branch of dynacell's `main_cli` catches
NoManifestRootsError / ManifestNotFoundError / TargetNotFoundError and
prints the message to stderr with exit code 2 — no stack trace for user
misconfiguration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
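The root-resolution order (CLI roots → DYNACELL_MANIFEST_ROOTS → entry points) can be sketched like this; entry-point discovery is stubbed out as a plain argument, and the function name is hypothetical:

```python
import os

def manifest_roots(cli_roots=None, entry_point_roots=()):
    """CLI roots win, then the env var, then entry-point-registered roots."""
    if cli_roots:
        return list(cli_roots)
    env = os.environ.get("DYNACELL_MANIFEST_ROOTS")
    if env:
        # os.pathsep-separated, like PATH.
        return env.split(os.pathsep)
    return list(entry_point_roots)
```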
A549_EXPANSION_ROADMAP.md sequences the 6-stage rollout that moves the
benchmark tree from hardcoded per-target paths to manifest-driven
resolution, so iPSC confocal and a549 mantis datasets can coexist
without path duplication.

DATASET_REF_RESOLVER_SPEC.md is the authoritative spec for the Stage 1
resolver that just landed in 0aa7e1a. Both docs live in-tree so future
readers can reconstruct why the resolver exists and how the remaining
stages (per-target migration, eval-side resolution, eval target
migration, a549 manifest, a549 leaf expansion) should land.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ot fixture

Resolve drift from the Stage 1 plan: source_channel moves out of the
shared ipsc_confocal train/predict fragments and onto each target
fragment (mito/nucleus/membrane), since dataset_ref is the single
source of truth for data_path/source_channel/target_channel on
migrated ER leaves. Restore the strict collision check that raises
whenever a composed fragment also declares a resolved field, dropping
the agreement-tolerant transitional policy.

Also: cache load_manifest with lru_cache (same policy as
_load_yaml_cached); move DatasetManifest.source_channel validation
onto the schema; delete the unused _DATA_FIELDS constant; rename a
shadowing loop variable in the conflict-details comprehension; switch
DYNACELL_MANIFEST_ROOTS parsing to os.pathsep; consolidate the
autouse manifest-root fixture into conftest.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mirrors the companion dynacell-paper manifest additions so that Stage 2
resolver tests can splice data_path / target_channel for the two
multi-marker targets derived from cell.zarr. No production-side effect
— the fixture is consumed only by the autouse test conftest.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stage 2 of the A549 expansion roadmap. Each target fragment now
declares benchmark.dataset_ref.target and drops its hardcoded
data_path / source_channel / target_channel; the resolver splices
concrete values from the manifest (either the repo-local fixture in
tests or the dynacell-paper canonical manifest at runtime). The 6
non-ER predict leaves drop their explicit data.init_args.data_path
for the same reason — strict collision would otherwise raise.

Tests swap the mito-only partial-ref no-op parity pair for
parametrized resolver tests that cover every model in TRAIN_LEAVES /
PREDICT_LEAVES composing a migrated fragment (pytest.skip handles
missing FCMAE leaves for nucleus/membrane). The roadmap Stage 2
paragraph is updated to match the batched single-PR shape.

Requires the dynacell-paper companion PR to land before production
nucleus/membrane runs — VisCy CI is self-contained via the fixture.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d organelles

Collapses the ER-specific train/predict resolver tests and the parametrized
non-ER block into unified table-driven tests keyed off a single
_MIGRATED_TARGET_INFO dict. Eliminates duplicate skip/fixture setup and the
N=5 model sub-parametrization that existed solely to accommodate
pytest.skip for non-existent leaves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends StoreLocations with gt_cache_dir: Path | None, and extends
ResolvedDataset + resolve_dataset_ref to surface cell_segmentation_path
and gt_cache_dir alongside the existing manifest fields. Adds a
Hydra-side apply_dataset_ref hook in evaluation/_ref_hook.py that
mirrors the Lightning-side compose hook: partial refs are no-ops,
collisions raise ValueError, and _splice writes io.* + derived
pred_channel_name + pixel_metrics.spacing into a DictConfig.

Introduced as unused code — the hook is only wired at eval/precompute
entry points in the follow-up commit, which keeps this change
bisect-safe. Fixture manifest now carries gt_cache_dir on all four
targets so the next commit's integration tests can assert resolved
values byte-for-byte against the pre-migration hardcoded paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires apply_dataset_ref at the top of evaluate_model and precompute_gt,
and extends the __main__ Hydra branch with the same
ManifestNotFound/NoManifestRoots/TargetNotFound catch the Lightning
branch already has. Sets HYDRA_FULL_ERROR=1 so @hydra.main does not
swallow resolver errors before our clean SystemExit(2) catch.

Migrates all four eval target YAMLs (er_sec61b, mito_tomm20, nucleus,
membrane) plus predict_set/ipsc_confocal.yaml together so io.* and
pixel_metrics.spacing come from the manifest via benchmark.dataset_ref.
Adds benchmark: null placeholder to eval.yaml so struct-mode merges
accept the new key. Nucleus and membrane share cell.zarr but get
distinct gt_cache_dir paths, keyed by organelle not channel.

Test coverage: integration tests compose every canonical eval leaf and
verify splice output (Layer 1) and invoke evaluate_model.__wrapped__ /
precompute_gt.__wrapped__ to prove the hook is actually wired at both
@hydra.main entry points (Layer 2). Router tests assert clean SystemExit(2)
with the expected stderr for all three resolver error classes on both
evaluate and precompute-gt.

Companion dynacell-paper manifest bump adds gt_cache_dir to the
canonical aics-hipsc manifest; safe to merge either order because
Pydantic extra='ignore' makes the field optional.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Factors the partial-ref no-op + Pydantic validation pipeline shared by
_compose_hook and _ref_hook into a single dataset_ref_from_dict helper
on dynacell.data.resolver, so REQUIRED_REF_KEYS lives in exactly one
place and both hook surfaces use identical semantics (ValidationError
propagates as-is rather than being rewrapped).

Restores the caller's struct-mode state in _splice via try/finally so
the hook no longer silently unlocks the whole Hydra config for the
rest of evaluate_model / precompute_gt.

Test cleanup: promotes clear_global_hydra to a shared conftest fixture
(deletes the two local copies), drops the redundant
DYNACELL_MANIFEST_ROOTS setenv calls now that conftest autouses it,
removes the dead _FIXTURE_ROOT module constants, drops the
_clear_manifest_cache autouse (the path-keyed lru cache is already
stable across tests), and aligns the null-dataset scenario on the
pydantic.ValidationError the hook now surfaces.

YAML cleanup: deletes the WHAT-narrating header blocks from the four
migrated eval target YAMLs, predict_set/ipsc_confocal.yaml, and
eval.yaml — the behavior is already visible from dataset_ref + the
hook module, so the comments added noise without WHY.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
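The shared gate can be sketched as follows; the contents of REQUIRED_REF_KEYS and the return shape are assumptions, and the real helper validates with Pydantic rather than returning a plain dict:

```python
REQUIRED_REF_KEYS = ("dataset", "target")

def dataset_ref_from_dict(ref):
    """Return a validated ref, or None for a partial (no-op) ref."""
    if not ref or not all(key in ref for key in REQUIRED_REF_KEYS):
        return None  # partial refs are a strict no-op for both hooks
    return {key: ref[key] for key in REQUIRED_REF_KEYS}
```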
…ints/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>