…thods (i.e., PHATE, PCA, UMAP)
Add normalization columns (norm_mean/std/median/iqr/max/min), z_focus_mean, and TCZYX shape columns to the cell index schema.
preprocess_cell_index reads per-FOV zattrs and writes stats as parquet columns for fast per-row normalization at training time.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ExperimentRegistry.from_cell_index: build registry directly from preprocessed parquet + zarr metadata (no collection YAML needed)
- datamodule: cell_index_path as primary entry point, _train_final_crop changed from BatchedRandSpatialCropd to BatchedCenterSpatialCropd (random crop for Z/XY translation is now a user-configured augmentation)
- dataset: read norm stats from parquet columns, build_norm_meta fallback
- index: _align_parquet_columns, _resolve_dims from parquet Y/X_shape
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- DynaCLR-3D-BagOfChannels-v2: z_window=32, yx_patch=256, RandSpatialCrop(40,228,228) after affine for Z focus invariance + XY translation, CenterCrop(32,160,160) auto-appended. batch_size=256, 2 GPUs, 2-day wall time.
- Add dataloader_demo.py: Jupyter-style visualization of raw vs augmented anchor/positive batches with per-sample metadata
- Update demo configs and inspection scripts for new pipeline
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
np.nanmin/nanmax fail on scipy sparse arrays. Convert to dense before computing range stats so the command works on Seurat-exported anndata zarr stores.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
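A minimal sketch of the densify-before-stats pattern described above; `range_stats` is a hypothetical helper name, not the package API:

```python
import numpy as np
from scipy import sparse

def range_stats(x):
    # np.nanmin/np.nanmax raise on scipy sparse matrices, so densify first.
    # Hypothetical helper illustrating the fix; not the package implementation.
    if sparse.issparse(x):
        x = x.toarray()
    return float(np.nanmin(x)), float(np.nanmax(x))

mat = sparse.csr_matrix([[0.0, 2.5], [1.0, 0.0]])
lo, hi = range_stats(mat)
```

For large AnnData `X` matrices, densifying per-chunk rather than all at once would bound memory, but the principle is the same.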
- CLI for running evals
- DAG for evals
- yaml files for evals
… 3 base callbacks
- model/contrastive_encoder_convnext_tiny.yml: ConvNeXt-Tiny class_paths
- model/dinov3_frozen_mlp.yml: frozen DINOv3 + MLP projection block
- augmentations/ops_2d_mild.yml: OPS-specific mild augmentation pipeline
- data/ops_gene_reporter.yml: OPS data defaults (patch sizes, sampling)
- train_linear_classifier() now returns a third value: raw val outputs (y_val, y_val_proba, classes) for downstream ROC curve plotting
- orchestrated run-linear-classifiers generates metrics_summary.pdf alongside the CSV: bar chart of AUROC/accuracy/F1 + per-task ROC curves
- Delete evaluate_dataset.py (argparse-based, not in CLI, superseded by orchestrator) and its example config
- Strip generate_comparison_report and its helpers from report.py; file is now CV-only
- Remove dead _detect_n_features() from cross_validation.py
- Update all callers of train_linear_classifier() to unpack 3-tuple
- Update DAG doc and linear classifiers README
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- FOVRecord.channel_markers: dict[str, str] maps zarr channel name to marker for a specific well (populated from Airtable channel_N_marker fields)
- ChannelEntry.wells: list[str] restricts a channel to a subset of wells; empty means valid in all wells
- build_collection auto-populates wells by comparing which wells have a non-None marker for each channel across all FOVRecords
- _build_experiment_tracks skips channel rows where ch.wells is non-empty and the current well is not in that set, preventing noise rows from mixed-plate experiments (e.g. viral sensor only in B/3, C/2)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The glob `*/*/*` on zarr v3 stores yields zarr.json files (e.g. A/2/zarr.json) in addition to position directories. The previous check only stripped names starting with "." (.zattrs, .zgroup) but missed zarr.json.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
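The filter described above can be sketched as follows; `iter_position_dirs` is a hypothetical helper name, demonstrated on a throwaway directory tree:

```python
import tempfile
from pathlib import Path

def iter_position_dirs(store: Path):
    """Yield position directories under a plate store, skipping metadata
    entries: dot-files (.zattrs, .zgroup) and zarr v3's zarr.json.
    Hypothetical helper mirroring the fix described above."""
    for p in sorted(store.glob("*/*/*")):
        if p.name.startswith(".") or p.name == "zarr.json":
            continue
        if p.is_dir():
            yield p

# Minimal demonstration: one position dir and two metadata files.
root = Path(tempfile.mkdtemp())
(root / "A" / "2" / "0").mkdir(parents=True)
(root / "A" / "2" / "zarr.json").write_text("{}")
(root / "A" / "2" / ".zattrs").write_text("{}")
positions = [p.name for p in iter_position_dirs(root)]
```

The `is_dir()` check alone would also exclude zarr.json; checking the name as well makes the intent explicit and keeps the dot-file rule from the original code.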
…ollection
- DynaCLR-2D-MIP-BagOfChannels: add viral_sensor + Phase3D for 2025_01_28, 2024_10_09, 2024_10_16; fix dragonfly tracks_path to point to inner zarr store (tracking.zarr/2024_08_14_...zarr)
- DynaCLR-3D-BagOfChannels-v2: add viral_sensor + Phase3D for 2025_01_28, 2024_10_09, 2024_10_16
- DynaCLR-3D-BagOfChannels-v3: new collection copied from v2 with dragonfly tracks_path fix; v2 left intact for running training job
- DynaCLR-BoC-lc-evaluation-v1: add viral_sensor for all datasets; add Phase3D for 2025_01_28
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wire load_config to delegate to load_composed_config so eval configs support base: recipe inheritance (same mechanism as training configs)
- Extract shared eval settings into 4 recipes: predict.yml, reduce.yml, plot_infectomics.yml, linear_classifiers_infectomics.yml
- Slim down DynaCLR-2D-BagOfChannels-v3, DynaCLR-2D-MIP-BagOfChannels-v1, DINOv3-temporal-MLP-2D-BagOfChannels-v1, and test_evaluation configs to use base: references — eliminating copy-pasted 14-experiment annotation blocks and shared step configs
- Fix ONNX inference to use GPU (CUDAExecutionProvider) and suppress pthread_setaffinity_np noise with intra/inter_op_num_threads=1
- Switch CTC tracking SLURM script to gpu partition
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix \bbf[\b_] -> \bbf(\b|_): inside a character class, \b is a backspace character, not a word boundary
- Add \bphc\b to detect phase-contrast (PhC) as label-free
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
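The backspace-vs-boundary pitfall above can be reproduced directly; the patterns are from the commit, the sample strings are illustrative:

```python
import re

# Inside a character class, \b means backspace (\x08), not a word boundary,
# so [\b_] only matches a literal backspace or an underscore.
buggy = re.compile(r"\bbf[\b_]")
fixed = re.compile(r"\bbf(\b|_)")

assert buggy.search("bf channel") is None         # misses "bf" at a plain word edge
assert fixed.search("bf channel") is not None     # word boundary now matches
assert fixed.search("bf_retardance") is not None  # underscore alternative still works
assert buggy.search("bf\x08") is not None         # what [\b] really matched: backspace
```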
pandas 3+ uses Arrow-backed strings by default, which breaks anndata's
zarr writer. Apply the same fix in two code paths:
- embedding_writer.py: replace select_dtypes("string") with per-column
isinstance checks for pd.StringDtype and Arrow-backed Categoricals
- zarr_utils.py: convert ArrowStringArray columns and index to object
dtype before calling append_to_anndata_zarr
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
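A sketch of the per-column dtype check described above; `downcast_string_columns` is a hypothetical name, and the real fix also handles Arrow-backed Categoricals, which this minimal version omits:

```python
import pandas as pd

def downcast_string_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Convert string-dtype columns (including Arrow-backed ones under
    pandas 3) to plain object dtype before handing the frame to a zarr
    writer that predates Arrow-backed strings. Illustrative sketch only."""
    out = df.copy()
    for col in out.columns:
        # isinstance covers both python- and pyarrow-backed StringDtype,
        # unlike select_dtypes("string") which misses some storage variants
        if isinstance(out[col].dtype, pd.StringDtype):
            out[col] = out[col].astype(object)
    return out

df = pd.DataFrame({"fov": pd.array(["A/1", "A/2"], dtype="string"), "t": [0, 1]})
clean = downcast_string_columns(df)
```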
- PHATE: default n_jobs from -1 (all cores) to 1 to prevent hogging shared SLURM nodes; exposed in PHATEConfig and compute_phate()
- Annotation: support (fov_name, t, track_id) join as fallback when both sides lack an 'id' column; normalize fov_name by stripping leading/trailing slashes to prevent join mismatches
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
For multiclass problems, compute one-vs-rest AUROC per class and report
as val_{class_name}_auroc columns in the results DataFrame.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
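The one-vs-rest metric above can be sketched with a rank-based AUROC; all names here (`binary_auroc`, `per_class_auroc`) and the class labels are illustrative, not the package API:

```python
import numpy as np

def binary_auroc(y_true, scores):
    """Mann–Whitney form of AUROC: P(score_pos > score_neg), ties half."""
    pos, neg = scores[y_true == 1], scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def per_class_auroc(y_true, y_proba, classes):
    """One-vs-rest AUROC per class, keyed as val_{class_name}_auroc.
    Assumes y_proba columns are ordered like `classes`."""
    y_true = np.asarray(y_true)
    return {
        f"val_{cls}_auroc": binary_auroc((y_true == cls).astype(int), y_proba[:, i])
        for i, cls in enumerate(classes)
    }

classes = ["background", "infected", "uninfected"]
y_true = np.array(["infected", "uninfected", "background", "infected"])
y_proba = np.array([[0.1, 0.8, 0.1],
                    [0.1, 0.1, 0.8],
                    [0.8, 0.1, 0.1],
                    [0.2, 0.6, 0.2]])
scores = per_class_auroc(y_true, y_proba, classes)
```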
- viscy-utils: add onnx, onnxscript to core deps; copairs to eval extras - dynaclr: add tracking optional group (gurobipy, onnxruntime-gpu, py-ctcmetrics, tabulate, tracksdata) for CTC tracking benchmark - Regenerate uv.lock Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- index.py: replace O(N*tau) Python loop in _compute_valid_anchors with vectorized pd.MultiIndex.isin(); add fit=False predict-mode fast path that skips anchor computation; add precomputed_valid_anchors to clone_with_subset() to avoid redundant recomputation; accept cell_index_df to avoid double-reading parquet
- dataset.py: replace per-row loops in _build_match_lookup with groupby().indices; skip lookup build in predict mode; add organelle, well, microscope to exported metadata columns
- datamodule.py: tune defaults (num_workers=4, cache_pool=500MB, pin_memory=True, buffer_size=4); use vectorized MultiIndex.isin for FOV split; reuse pre-loaded cell_index_df from ExperimentRegistry
- experiment.py: from_cell_index returns (registry, dataframe) tuple so callers can reuse the DataFrame without re-reading from disk
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
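The MultiIndex.isin replacement for the per-row anchor-validity loop can be sketched like this, for a single tau and with hypothetical column names:

```python
import numpy as np
import pandas as pd

def valid_anchor_mask(tracks: pd.DataFrame, tau: int) -> np.ndarray:
    """An anchor row is valid when the same lineage has a row at t + tau.
    One MultiIndex.isin over all rows replaces the O(N*tau) Python loop.
    Illustrative sketch; the real check iterates the tau range."""
    existing = pd.MultiIndex.from_arrays([tracks["lineage_id"], tracks["t"]])
    shifted = pd.MultiIndex.from_arrays([tracks["lineage_id"], tracks["t"] + tau])
    return shifted.isin(existing)

tracks = pd.DataFrame({"lineage_id": ["L1", "L1", "L2"], "t": [0, 1, 0]})
mask = valid_anchor_mask(tracks, tau=1)
```

Only the first row is valid here: lineage L1 has a row at t=1, but neither L1 at t=2 nor L2 at t=1 exists.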
Use .get() with None default for transcriptome_anndata and skip the barcode join when it is absent, allowing embeddings on datasets that lack paired scRNA-seq.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Centralize cell_index_path to shared /hpc/projects/.../collections/ dir across all training configs
- MIP model: extend z_extraction_window 11->20, z_focus_offset 0.5->0.3, yx_patch_size 192->256, add BatchedRandSpatialCropd for Z-invariance
- 3D BoC: num_workers 2->4; SLURM time limit 2d->4d
- Collection: mark DynaCLR-2D-BagOfChannels-v3 as [LEGACY]; fix well assignments in BoC-lc-evaluation-v1 (add A/1 for 07_24, remove incorrect B/1 and B/2 from 01_28)
- Add new collections: annotated MIP subset, test subset, alfi-eval (ALFI mitosis, 3 cell lines), microglia-eval (5 perturbations), benchmark_2exp (dataloader profiling)
- predict.yml: add TQDMProgressBar callback (refresh_rate=10)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- evaluate.py: remove all SLURM script generation (_generate_*_sh, _slurm_header, _run_local*); replace with prepare_configs() that generates YAML configs and prints a JSON manifest to stdout; rename CLI command evaluate -> prepare-eval-configs; add MMD config generators
- evaluate_config.py: remove SlurmConfig; add MMDStepConfig and ComparisonSpec imports; split PlotStepConfig.color_by into per-exp and combined_color_by; update TaskSpec.marker_filters docstring for auto-expand behaviour
- cli.py: add prepare-eval-configs, check-evals, append-annotations, append-predictions, split-embeddings, compute-mmd, plot-mmd-heatmap, evaluate-tracking-accuracy commands
- split_embeddings.py: new CLI to split combined embeddings.zarr by experiment, replacing inline SLURM script logic
- check_evals.py: new CLI to print eval completion status from registry
- eval_registry.yaml: declarative registry of models to evaluate
- Delete 4 stale SLURM-era eval configs (SlurmConfig schema removed)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three modes for measuring embedding-space distribution shifts:
- Per-experiment (explicit comparison pairs, faceted by marker)
- Combined (pairwise cross-experiment with batch centering)
- Pooled (concatenates all experiments, BH FDR correction)
Core implementation:
- viscy_utils/evaluation/mmd.py: kernel MMD with median heuristic, Gaussian RBF kernel, unbiased estimator, and vectorized permutation test (avoids Python loops via binary label matrix multiplication)
- viscy_utils/evaluation/embedding_map.py: mAP via copairs for phenotypic profiling (optional dependency)
- evaluation/mmd/config.py: Pydantic config hierarchy for all three modes; temporal binning, shared bandwidth, balance_samples
- evaluation/mmd/compute_mmd.py: orchestrates the three analysis modes; computes activity_zscore = (mmd2 - null_mean) / null_std for cross-marker comparability; outputs per-marker CSV files
- evaluation/mmd/plotting.py: kinetics lines, heatmaps, activity z-score heatmaps, combined cross-experiment heatmaps, multi-panel grids, paired heatmaps with shared colorbar
- configs/evaluation/recipes/mmd_defaults.yml: shared algorithm defaults (1000 permutations, max 2000 cells, seed 42) for YAML inheritance
- tests/test_mmd.py: unit tests for MMD implementation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
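The MMD machinery above — RBF kernel with median-heuristic bandwidth, unbiased estimator, permutation null via a binary label matrix, and the activity z-score — can be sketched in a few dozen lines. This is an illustrative NumPy reimplementation under those stated assumptions, not the viscy_utils code:

```python
import numpy as np

def rbf_kernel_median(x, y):
    """Gaussian RBF kernel over the pooled samples, bandwidth set to the
    median pairwise distance (median heuristic)."""
    z = np.vstack([x, y])
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)
    sigma = np.median(np.sqrt(d2[d2 > 0]))
    return np.exp(-d2 / (2 * sigma**2)), len(x)

def mmd2_permutation_test(x, y, n_perm=200, seed=42):
    """Unbiased MMD^2 plus a permutation null. Each permutation is a binary
    column of Z, so every quadratic-form sum becomes a matrix product with
    the kernel instead of a Python loop over permutations."""
    K, m = rbf_kernel_median(x, y)
    n = K.shape[0]
    K0 = K - np.diag(np.diag(K))  # drop self-similarity for the unbiased sums

    def stat(z):  # z: (n, P) binary assignments, 1 = "x" group
        s_xx = (z * (K0 @ z)).sum(0)
        s_yy = ((1 - z) * (K0 @ (1 - z))).sum(0)
        s_xy = (z * (K @ (1 - z))).sum(0)
        return (s_xx / (m * (m - 1))
                + s_yy / ((n - m) * (n - m - 1))
                - 2 * s_xy / (m * (n - m)))

    z_obs = np.zeros((n, 1)); z_obs[:m] = 1.0
    observed = stat(z_obs)[0]
    rng = np.random.default_rng(seed)
    Z = np.zeros((n, n_perm))
    for p in range(n_perm):  # drawing shuffles; the statistic stays vectorized
        Z[rng.permutation(n)[:m], p] = 1.0
    null = stat(Z)
    pval = (1 + (null >= observed).sum()) / (1 + n_perm)
    zscore = (observed - null.mean()) / null.std()  # activity_zscore analogue
    return observed, pval, zscore

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(50, 2))
y = rng.normal(3.0, 1.0, size=(50, 2))
obs, pval, zscore = mmd2_permutation_test(x, y, n_perm=100)
```

On two well-separated Gaussians the observed MMD² dominates every permutation, so the p-value hits its floor of 1/(n_perm+1).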
…ver-time
- orchestrated.py: when marker_filters is None, auto-discover all unique
obs["marker"] values and run one classifier per marker; save trained
pipelines as {task}_{marker}.joblib with manifest.json; add
_plot_f1_over_time for per-class F1 at each timepoint; output one
{task}_summary.pdf per task (was a single merged PDF)
- orchestrated_test.py: update fixtures to expect 2 rows per task with
auto-expansion; add test for sparse-marker skipping and F1-over-time
plot generation
- append_annotations.py: new CLI to persist ground-truth annotation
columns directly into per-experiment zarr obs
- append_predictions.py: new CLI to apply saved classifier pipelines to
all cells in per-experiment zarrs, writing predicted_{task} to obs and
predicted_{task}_proba to obsm
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When group_by is set (default "marker"), evaluate_smoothness iterates over unique group values, computes smoothness per group, saves per-group CSV, generates per-group plots, then aggregates via mean/std. Output filenames now include experiment_name for disambiguation.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Evaluates whether DynaCLR embeddings improve cell tracking on Cell Tracking Challenge datasets vs an IoU baseline.
- tracking_accuracy/config.py: Pydantic models for ONNX model entries, CTC dataset entries, ILP solver weights, and full benchmark config
- tracking_accuracy/utils.py: seg_dir layout helper, pad_to_shape, normalize_crop (z-score using whole-frame statistics)
- tracking_accuracy/evaluate_tracking.py: main benchmark driver
- ctc_tracking_2d_mip_boc.yaml: DynaCLR-2D-MIP vs IoU on DIC-C2DL-HeLa
- ctc_tracking_2d_mip_boc_all.yaml: all CTC sequences variant
- export_onnx_2d_mip_boc.yml: config for exporting the MIP model to ONNX
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
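The two utils named above can be sketched as follows; these are minimal illustrative versions consistent with their descriptions (trailing-edge zero padding, z-score with whole-frame statistics), not the package code:

```python
import numpy as np

def pad_to_shape(arr: np.ndarray, shape: tuple) -> np.ndarray:
    """Zero-pad an array at the trailing edge of each axis up to `shape`.
    Sketch of the layout helper; no-op on axes already large enough."""
    pad = [(0, max(0, want - have)) for have, want in zip(arr.shape, shape)]
    return np.pad(arr, pad)

def normalize_crop(crop: np.ndarray, frame: np.ndarray) -> np.ndarray:
    """Z-score a crop using whole-frame statistics, so every crop from one
    frame shares a consistent intensity scale regardless of local content."""
    mu, sigma = float(frame.mean()), float(frame.std())
    return (crop - mu) / (sigma + 1e-8)

frame = np.arange(16, dtype=float).reshape(4, 4)
crop = normalize_crop(frame[:2, :2], frame)
padded = pad_to_shape(frame, (6, 6))
```

Normalizing with frame-level rather than crop-level statistics keeps a dim cell crop dim after normalization, which matters when embeddings feed a tracking cost.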
- Pairplot: change diag_kind kde -> hist; rasterize scatter points to prevent PDF bloat; improve legend (alpha=1.0, larger marker sizes)
- Scatter 2D: improve legend (markerscale=6, fontsize=10, framealpha=1.0, edgecolor="black")
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace 5 monolithic analysis scripts with a structured 5-stage pipeline using DTW Barycenter Averaging (DBA) for principled trajectory alignment.
Core library (evaluation/pseudotime/dtw_alignment.py):
- build_infection_template(): DBA with medoid initialization from annotated transitioning cells; per-experiment z-score -> PCA -> L2-normalize preprocessing; time calibration maps template positions to real minutes
- dtw_align_tracks(): per-track DTW to template, produces pseudotime in [0,1] and label propagation fractions per template position
- alignment_results_to_dataframe(): assembles results DataFrame
Pipeline stages (scripts/pseudotime/):
- 0-build_templates: build DBA templates from annotated transitions, diagnostic lineage overview
- 1-align_cells: DTW-align all cell trajectories to template; alignment diagnostic plots (pseudotime vs real time, cost distributions, PCA)
- 2-evaluate_dtw: evaluate alignment against annotations (AUC, onset concordance, IoU)
- 3-organelle_dynamics: per-organelle embedding dynamics along infection pseudotime, remodeling heatmaps and montage grids
- 4-export_anndata: merge DTW results back into AnnData zarr copies
- cell_count_funnel.py: summarize cell/track filtering across all stages
Configs and tests:
- multi_template.yaml: switch to MIP embeddings dir, update embedding patterns for viral_sensor, G3BP1, SEC61 channels
- test_pseudotime.py: add TestTimeCalibration (monotonicity, round-trip) and TestMetricsContinuous (onset/peak detection)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
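The per-track alignment step can be sketched with textbook DTW; this is an illustrative implementation of the alignment-and-pseudotime idea only (the DBA template construction and preprocessing are out of scope), with all names hypothetical:

```python
import numpy as np

def dtw_align(track: np.ndarray, template: np.ndarray):
    """Classic DTW between a (T, d) trajectory and a (M, d) template.
    Returns the warping path and a pseudotime in [0, 1] per track timepoint
    (the matched template position, scaled)."""
    n, m = len(track), len(template)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(track[i - 1] - template[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the corner to recover the optimal path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    pseudo = np.zeros(n)
    for ti, tj in path:  # last match wins when a timepoint maps to several
        pseudo[ti] = tj / (m - 1)
    return path, pseudo

track = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
template = track.copy()
path, pseudo = dtw_align(track, template)
```

On identical sequences the path is the diagonal and pseudotime is linear, which is a useful sanity check before aligning real embedding trajectories.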
- profile_stages.py: extend z_window 16->32; add I/O bandwidth reporting (MB/s, MB read per anchor+positive)
- benchmark_setup_time.py: benchmark _compute_valid_anchors and _build_match_lookup on 3.3M-row parquet to validate vectorization
- profile_num_workers.py: sweep num_workers to find optimal parallelism
- profile_predict_batch_size.py: sweep predict batch sizes
- test_2d_mip_augmentation.py: visual verification of 2D MIP augmentation pipeline (z-crop + MIP)
- explore_gut_parquet.py: exploratory script for gut dataset parquet
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- compare_evals.py: cross-model evaluation comparison that reads eval_registry.yaml outputs and generates comparison plots for smoothness, AUROC, and MMD activity z-scores across models
- microglia_alfi_analysis.py: PCA/UMAP embedding analysis for microglia (by perturbation) and ALFI HeLa (by cell cycle phase)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Config-driven pipeline for NFS-to-VAST dataset preparation:
- prepare.py: orchestrates concatenation, QC, and preprocessing steps driven by Airtable metadata
- prepare_cli.py: CLI entry point for the prepare pipeline
- configs/prepare_config.yml: example config for dataset preparation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- configs/cellanome/: per-run embed_dinov3.yml and embed_dynaclr.yml configs for 5 Cellanome flow cell runs (A549 infectomics panels, mixed GFP+RFP, SEC61B/G3BP1/pAL40 DENV rerun); embed_all.sh helper
- docs/DAGs/ai_ready_datasets.md: DAG for AI-ready dataset preparation pipeline
- docs/DAGs/pseudotime.md: DAG for DTW pseudotime pipeline stages
- docs/DAGs/training.md: DAG for model training workflow
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
anndata 0.12.9+ pulls pandas <3, so we pin 0.12.6 with pandas 3 and manually downcast Arrow-backed strings. Remove once anndata 0.13 supports pandas 3 natively.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…es on registration
- Add microscope, labelfree_modality, treatment, hours_post_treatment to FOVRecord and DatasetRecord; parse from Airtable singleSelect responses
- Add all four fields to WELL_TEMPLATE_FIELDS so they propagate to per-FOV records
- Raise ValueError when a well template has no cell_line set (required for channel marker derivation — previously silently skipped)
- Auto-delete well template records after registration batch: register_fovs populates template_ids_to_delete; CLI calls batch_delete after create/update
- Add batch_delete to AirtableDatasets
- Wire microscope into build_collection so it flows to ExperimentEntry and cell_index parquet (was previously always empty string)
- Update all tests; fix pre-existing test regressions from is_dir() filter
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
compute_timing_metrics.py reduces each cell's cosine-distance-from-pre-baseline
curve to SNR-robust scalars (t_onset_abs, t50, t_peak, delta_peak,
rise_rate_per_hour) and pools into per-organelle distributions.
compute_label_timing.py does the same from LC predicted_{state} labels
(t_first_pos, t_run_start, pos_fraction, flips). Supervised projection
gives sharper cross-organelle separation (e.g. SEC61 pos_fraction=0.81 vs
G3BP1=0.00, p=1.6e-4) than unsupervised cosine distance.
Both ship a compute sub-command for per-organelle per-cell parquet plus
summary markdown, and a compare sub-command that merges parquets and
emits strip plots plus pairwise rank-sum tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds directory-layout entries for compute_timing_metrics.py (embedding cosine-distance timing) and compute_label_timing.py (LC-prediction timing), plus dedicated sections documenting per-cell scalars, outputs, the aligned-only vs whole-track asymmetry, and example numbers for SEC61 vs G3BP1 on sensor_all_07_24. Notes the next planned iteration: configurable multi-dataset pool with ZIKV/DENV virus-stratified comparison.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When multiple per-cell parquets from `compute` share an organelle_channel but differ in query_set (e.g. ZIKV pool vs DENV pool, both on sensor), the old compare step collapsed them into one group. Now:
- Auto-detect: split by organelle_channel if >1 present, else query_set.
- --group-by CLI flag to override the default.
- Markdown + plot headers reflect the grouping column.
Unblocks cross-virus comparison via paired single-virus query sets in align_cells.yaml (sensor_zikv_pool, sensor_denv_pool).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five zarrs were generated by predict but skipped by the LC step because they weren't listed in the annotations block:
- 2025_01_28_A549_viral_sensor_ZIKV_DENV
- 2025_01_28_A549_Phase3D_ZIKV_DENV
- 2024_11_07_A549_SEC61_DENV_viral_sensor
- 2025_01_24_A549_G3BP1_DENV_viral_sensor
- 2025_08_26_A549_viral_sensor_ZIKV
All five reuse their dataset's existing combined annotations CSV. The effect for downstream Stage 3d label-timing: the ZIKV pool (07_22 + 07_24 + 08_26 + 01_28 ZIKV) gains predicted_infection_state on every sensor zarr, and DENV gets full coverage across 2024_11_07, 2025_01_24, and the 2025_01_28 DENV well.
Re-run: `nextflow run main.nf --eval_config ... -resume` will skip cached predict/split/reduce and only rerun LC + append_predictions + plot.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`_get_position` and `_get_tensorstore` were keyed by `fov_name` alone, so the same FOV path (e.g. `A/3/0`, `0/3/000000`) shared across experiments in a MultiExperimentDataModule returned the first-cached experiment's zarr for every subsequent lookup. This caused samples from later experiments to read pixels from the wrong store while metadata still reported the correct experiment — silently corrupting training batches.
Key the caches by `(store_path, fov_name)` instead. Verified by Pearson-correlating dataloader output against direct zarr reads at the same coordinates: all 8 SEC61B anchors from 3 experiments sharing `A/1/0`/`A/2/0`/`A/3/0` now match 1.0 (previously 2/8 matched, 6/8 had ~0 correlation).
Also explains previously-observed edge artifacts in patches despite clamping: the cached zarr was from a different experiment with different FOV dimensions, so clamp margins no longer matched the actual image bounds.
Affects OPS and every DynaCLR training run with multiple experiments sharing FOV names (DynaCLR-2D-MIP-BagOfChannels: 157 collisions, DynaCLR-3D-BagOfChannels-v2: 112 collisions).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
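The compound-key fix can be sketched with a minimal cache; `PositionCache` and the stand-in opener are hypothetical, illustrating only the keying change:

```python
class PositionCache:
    """Cache opened zarr positions keyed by (store_path, fov_name).
    Keying by fov_name alone aliases FOVs that share a path (e.g. A/3/0)
    across experiments. Minimal sketch with a caller-supplied opener."""

    def __init__(self, opener):
        self._opener = opener
        self._cache = {}

    def get(self, store_path: str, fov_name: str):
        key = (store_path, fov_name)  # fov_name alone would collide across stores
        if key not in self._cache:
            self._cache[key] = self._opener(store_path, fov_name)
        return self._cache[key]

opened = []
def opener(store, fov):
    opened.append((store, fov))
    return (store, fov)  # stand-in for a zarr position handle

cache = PositionCache(opener)
a = cache.get("exp1.zarr", "A/3/0")
b = cache.get("exp2.zarr", "A/3/0")  # same FOV path, different experiment
a_again = cache.get("exp1.zarr", "A/3/0")
```

With the single-field key, the second call would have silently returned exp1's handle; the compound key opens each store once and caches them independently.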
The per-row pandas .iloc / .iterrows pattern in positive-pair lookup
was the dominant per-batch bottleneck: 4500 ms/batch at batch=512 on
the 81.5M-row OPS index. Each anchor triggered multiple pd.Series
constructions (~9 ms each) to look up match-key columns, resolve
lineage timepoints, and filter candidates by marker. At 50% GPU
utilization in the lite run, this bottleneck gated the whole pipeline.
Replace with a precomputed NumPy column cache:
- `_build_anchor_cache()` extracts every valid_anchors column and
the hot tracks columns (marker, channel_name, experiment, t,
lineage_id) as `np.ndarray` at dataset __init__.
- `_sample_positives_temporal()` vectorizes the lineage + tau
lookup using NumPy fancy-index filtering.
- `_sample_positives()` for column-match (SupCon) mode takes
positional anchor indices from the sampler and does NumPy-direct
key construction, with a single batched tracks.iloc gather at
the end (one call instead of 512).
- `_match_lookup` now stores np.ndarray values (zero-copy random
choice) instead of Python lists.
- `_extract_meta` uses NumPy label arrays instead of .iterrows().
- SimCLR (`positive_cell_source="self"`) now clones the anchor
tensor directly instead of running a second zarr read + meta
extraction — halves per-batch wall time for SimCLR baselines.
- `__getitems__` bag-of-channels path reads channel_name from the
NumPy cache.
- Predict branch replaces .iterrows() with NumPy column arrays.
Delete the now-unused per-row paths (`_find_positive`,
`_find_temporal_positive`, `_find_column_match_positive`) entirely —
keeping them as fallbacks would be a performance footgun for future
contributors.
Measured per-batch wall time (batch=64, demo subsample):
- SupCon OPS: ~80 ms (was 4500 ms at batch=512)
- SimCLR self: ~30 ms
- Temporal: ~200 ms (2D-MIP)
Correctness verified end-to-end:
- Pearson correlation anchor vs direct zarr read = 1.0
- SupCon positives share (gene_name, marker) 64/64
- Temporal positives share lineage 64/64, all non-zero Δt
- 22/22 existing dataset unit tests pass after test refactor to
call the vectorized entry points
Affects every DynaCLR training configuration: OPS (SupCon),
DynaCLR-2D-MIP, DynaCLR-3D-BagOfChannels (temporal), and any SimCLR
baseline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
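The cache-then-gather pattern above — NumPy column arrays at `__init__`, a `groupby().indices` match lookup with np.ndarray values, and no per-row Series construction at sample time — can be sketched as follows. All column and class names here are illustrative, not the package API:

```python
import numpy as np
import pandas as pd

class AnchorCache:
    """Precompute the hot tracks columns as NumPy arrays and a lookup of
    (lineage_id, marker, t) -> row-index array, so per-batch positive
    sampling is dictionary gets plus integer gathers."""

    def __init__(self, tracks: pd.DataFrame):
        self.lineage = tracks["lineage_id"].to_numpy()
        self.marker = tracks["marker"].to_numpy()
        self.t = tracks["t"].to_numpy()
        # One groupby().indices call replaces per-row loops; values are
        # np.ndarray, so random choice is a zero-copy integer pick.
        self.lookup = dict(tracks.groupby(["lineage_id", "marker", "t"]).indices)

    def temporal_positives(self, anchor_idx: np.ndarray, tau: int, rng) -> np.ndarray:
        """Row index of a same-lineage, same-marker positive at t + tau
        for each anchor; -1 when no such row exists."""
        keys = zip(self.lineage[anchor_idx],
                   self.marker[anchor_idx],
                   self.t[anchor_idx] + tau)
        return np.array([
            rng.choice(self.lookup[k]) if k in self.lookup else -1
            for k in keys
        ])

tracks = pd.DataFrame({
    "lineage_id": ["L1", "L1", "L1", "L2"],
    "marker": ["GFP", "GFP", "RFP", "GFP"],
    "t": [0, 1, 1, 0],
})
cache = AnchorCache(tracks)
pos = cache.temporal_positives(np.array([0, 3]), tau=1, rng=np.random.default_rng(0))
```

Anchor 0 (L1, GFP, t=0) finds row 1 at t=1; anchor 3 (L2, GFP, t=0) has no shifted row and returns -1, which is the failure case the validity mask is meant to exclude upstream.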
Two independent fixes for FlexibleBatchSampler on 16M+ row valid_anchors:
1. `__iter__` materialized the full epoch upfront — blocking DDP for several minutes before batch 0. Now yields batches lazily while preserving RNG draws across all ranks so DDP stays bit-identical.
2. `_precompute_groups` called pandas groupby on Arrow-backed columns, which routes every group slice through pyarrow.compute.take and took tens of minutes. Categorical fast path uses `cat.codes` + `np.flatnonzero`, and per-group-per-stratum uses `np.intersect1d` between prebuilt group/strat arrays.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
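The Categorical fast path in point 2 can be sketched like this; `precompute_groups` is a hypothetical name for the idea only:

```python
import numpy as np
import pandas as pd

def precompute_groups(col: pd.Series) -> dict:
    """Group row positions by a column using Categorical integer codes and
    np.flatnonzero, bypassing pandas groupby (which routes each group slice
    through pyarrow.compute.take on Arrow-backed columns). Sketch of the
    fast path described above."""
    cat = col if isinstance(col.dtype, pd.CategoricalDtype) else col.astype("category")
    codes = cat.cat.codes.to_numpy()
    return {
        category: np.flatnonzero(codes == code)
        for code, category in enumerate(cat.cat.categories)
    }

markers = pd.Series(["GFP", "RFP", "GFP", "Phase3D", "RFP"])
groups = precompute_groups(markers)
```

Each `codes == code` comparison is a single pass over an int8/int16 array, so the whole precompute is a handful of vectorized scans rather than per-group string takes; per-group-per-stratum sets then fall out of `np.intersect1d` on these arrays.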
MultiExperimentTripletDataset caching fixes for 81M-row indices:
- `_build_anchor_cache` cached every column of valid_anchors/tracks, blowing per-rank RSS. Whitelist the 13 columns actually read in the hot path (store_path, fov_name, experiment, t, y_clamp, x_clamp, norm_*, channel_name, marker, lineage_id) plus user-supplied positive_match_columns and label columns.
- Cast high-cardinality string columns to Categorical before caching so indexing hits 4-8 byte codes instead of 40-80 byte object refs.
- Wrap cat-array lookups with `str()` in `_sample_positive_indices_temporal` and in `_build_match_lookup` because `_materialize_strings` upstream leaves these columns as Categorical — hashing a Categorical scalar would not match the str keys in `_lineage_timepoints`.
- Precompute per-experiment `tau_range_frames` to drop a registry call per anchor in the temporal sampling hot path.
- Refactor `_slice_patch` / `_slice_patches` / `_sample_positives` to take (arrays, indices) instead of DataFrame rows, eliminating `iterrows()` and per-row Series construction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deferred Categorical cast for `fov_name` and `well_name` in `_align_parquet_columns` — upstream `cell_index.py` already casts the low-cardinality text columns on load, but `fov_name` is rewritten here by the position-prefix logic (Categorical columns would reject the string concatenation), so the cast has to happen after the rewrite.
Makes the downstream train/val boolean-mask slice a fast int-code gather instead of pyarrow.compute.take over the string buffer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three setup-time fixes for MultiExperimentDataModule._setup_fov_split on ~80M-row indices:
- `_materialize_strings`: cast ArrowStringArray columns to Categorical before slicing. `df[bool_mask]` on Arrow-backed string columns routes through pyarrow.compute.take and scales catastrophically (7-8 min per call on 16M rows × 15 string cols). Categorical codes + categories make slicing pure NumPy fancy indexing on int codes.
- Replace `pd.MultiIndex.from_arrays / from_tuples` (hashes a Python tuple per row) with a per-experiment groupby walk that writes a row-aligned boolean mask, eliminating the 80M-tuple index build.
- Guard `val_index` / `val_dataset` construction on `val_tracks.empty` instead of `val_keys`, which gets dropped in the new mask-based flow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`read_cell_index` now casts low-cardinality string columns (experiment, marker, store_path, microscope, organelle, reporter, channel_name) to pandas Categorical. ArrowStringArray-backed columns route every boolean mask slice through pyarrow.compute.take, which allocates a fresh buffer per string column and spiked peak RSS by 50+ GiB during train/val FOV partitioning on 80M-row indices.
High-cardinality columns (cell_id, tracks_path, lineage_id) stay ArrowStringArray so we don't allocate millions of Python string objects up front — the dataset reads them via the NumPy column cache. `fov_name` is intentionally left as-is because `_align_parquet_columns` rewrites it via string concatenation, which Categorical doesn't support; it gets cast after the rewrite in the runtime index layer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
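The selective cast described above can be sketched as follows; `materialize_strings` and the skip set are illustrative stand-ins for the real load-time logic:

```python
import pandas as pd

def materialize_strings(df: pd.DataFrame, skip=("fov_name",)) -> pd.DataFrame:
    """Cast string-dtype (e.g. Arrow-backed) columns to Categorical so
    boolean-mask slicing becomes an integer-code gather instead of
    pyarrow.compute.take over string buffers. Columns in `skip` stay as-is
    (fov_name is rewritten by string concatenation later, which Categorical
    would reject). Hypothetical sketch of the cast described above."""
    out = df.copy()
    for col in out.columns:
        if col in skip:
            continue
        if isinstance(out[col].dtype, pd.StringDtype):
            out[col] = out[col].astype("category")
    return out

df = pd.DataFrame({
    "experiment": pd.array(["e1", "e2", "e1"], dtype="string"),
    "fov_name": pd.array(["A/1", "A/2", "A/3"], dtype="string"),
})
cast = materialize_strings(df)
```

A real version would also skip high-cardinality columns (cell_id, lineage_id), where Categorical's categories array would itself materialize millions of strings.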
Subclassing Lightning's SaveConfigCallback to call `wandb_logger.experiment` inside the setup hook deadlocked DDP on ≥2 ranks: non-zero ranks blocked at the wandb init barrier while rank 0 was inside the hook, so the setup fence never cleared. Bug was hidden under `fast_dev_run=True` because Lightning swaps the real logger for DummyLogger, which doesn't touch wandb internals.
The resulting config saved to `trainer.log_dir` is already picked up by the wandb files tab automatically when `save_dir` matches, so the custom callback was net-negative — delete rather than patch.
Removes:
- `packages/viscy-utils/.../save_config_wandb.py`
- `SaveConfigToWandb` export in callbacks/`__init__.py`
- Entry in shared `trainer.yml` recipe
- Entry in OPS-1000genes-lite.yml
See `feedback_wandb_ddp_deadlock.md` for the full postmortem.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds OPS-style single-marker batch composition variants (`batch_group_by: marker`, one reporter per batch) to complement the default mixed-markers runs (`stratify_by=[perturbation, marker]`).
Run pairs for direct A/B comparison:
- DynaCLR-2D-MIP-BagOfChannels: mixed vs single-marker
- DynaCLR-3D-BagOfChannels-v2: mixed vs single-marker
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Keeps the diagnostic configs accessible for reproducing DDP hangs, memory profiling, and fast_dev_run sanity checks without cluttering the production training directory. Production entry points stay in `configs/training/`; `debug/` holds the single-node/single-GPU variants that were used to isolate the SaveConfigToWandb DDP deadlock and the ArrowStringArray memory spike.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
In flat-parquet / bag-of-channels mode (one row per cell × channel), `_pick_temporal_candidate` restricts positive candidates to rows with the same marker as the anchor. But `_compute_valid_anchors` only checked (lineage_id, t+tau) existence, so an anchor with (lid, marker=Phase3D, t=50) could pass validation when (lid, marker=GFP, t=51) exists — and then crash at sample time with "No positive found" because no same-marker row exists in the window.
Fix: include `marker` in the match key when it's present as a column in `tracks`. Validity now requires the shifted (lineage_id, marker, t+tau) tuple to exist, matching what the sampler actually enforces.
Detected in SLURM job 31265738 (2D-MIP single-marker): 268 "No positive found" errors across 66 epochs of training, with the validation dataloader failing to complete even once — which is why `loss/val` never appeared in wandb despite train loss logging.
Non-flat-parquet configs (one row per cell) are unaffected since marker is constant per (lineage, t) there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`_reconstruct_lineage` grouped tracks by `(experiment, fov)`, which fuses cells from different wells that share an FOV number (e.g. B/2/002001 and C/2/002001 both have `fov="002001"`). The per-group track_id → global_track_id map then routes parent_track_id lookups across wells, producing `lineage_id` strings that alias across wells. Downstream this crashes the temporal positive sampler with "No positive found" because `_lineage_timepoints[(exp, lid)]` holds rows from multiple wells mashed together. About 15-30% of lineages in the 2D-MIP-BagOfChannels dataset were affected (29 of 30 experiments had cross-well collisions).
Fix: group by `(experiment, well, fov)` when the `well` column is available. `global_track_id` already embeds well/fov, so root-walks inside each group only see track_ids from one biological FOV. Existing parquets built with the old code carry the aliased lineage IDs and need to be regenerated; a later commit can flag that at load time once the rebuild lands.
Also adds:
- `_compute_valid_anchors`: includes `marker` in the validity key when present, matching the same-marker filter `_pick_temporal_candidate` enforces in flat-parquet / bag-of-channels mode.
- Unit tests: `TestReconstructLineage` in `test_cell_index.py` and `test_valid_anchors_marker.py` for the index fix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rebuilds the two timelapse parquets with the fixed `_reconstruct_lineage` that scopes by (experiment, well, fov) instead of (experiment, fov).
- `collections/DynaCLR-2D-MIP-BagOfChannels-v2.yml`: copy of the unversioned collection YAML.
- `collections/DynaCLR-3D-BagOfChannels-v4.yml`: copy of v2 with the dragonfly `tracks_path` corrected to point at the nested `2024_08_14_ZIKV_pal17_48h.zarr` (zarr v2 tracking store; the outer `tracking.zarr` is just a container).
- Training configs updated to the new parquet paths.
Verified collision-free (0 cross-well lineage aliasing) on both:
- 2D-MIP v2: 3.36M rows across 32 experiments
- 3D-BoC v4: 766k rows across 26 experiments
Also drops the `SaveConfigToWandb` callback entry that was still referenced in these two training configs (missed in 40ed2f7).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before: `val_dataloader` was a plain torch DataLoader with `shuffle=False`, ignoring `batch_group_by` and `stratify_by`. That served val in parquet order (one FOV/marker at a time), so the first N val batches all shared the same marker (visually confirmed in dataloader_demo), and in DDP the `loss/val` ALLREDUCE silently desynced because each rank's shard saw a different subset of markers.

After: val uses the same `FlexibleBatchSampler` as train with identical `batch_group_by` / `stratify_by` / `group_weights` / `seed` settings. For the BoC configs this means:
- mixed-markers (`batch_group_by=None`, `stratify_by=[perturbation, marker]`) produces diverse val batches that mirror train batches.
- single-marker (`batch_group_by=marker`) produces per-marker val batches that cycle through all markers across the val epoch instead of stalling on one.

Temporal enrichment is disabled for val (no biology-of-interest oversampling skewing `loss/val`).

Also:
- `dataloader_demo.py`: add a "Validation dataloader" section that iterates val batches, flags NaN/Inf before and after normalization, and plots with the same `plot_batch` helper. Confirms val now serves diverse markers matching the train composition.
- `OnlineEvalCallback.effective_rank`: guard against NaN/Inf in features so a degenerate validation epoch can't crash the whole run with "SVD did not converge" from `np.linalg.svd`. Drops affected rows and returns NaN when no finite rows remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
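The effective-rank guard can be sketched as follows. This is an illustrative implementation, not the project's `OnlineEvalCallback.effective_rank`: it filters non-finite rows before the SVD and uses the common entropy-of-singular-value-spectrum definition of effective rank, which is an assumption about the metric the callback computes.

```python
import numpy as np


def effective_rank(features: np.ndarray) -> float:
    """Entropy-based effective rank with a NaN/Inf guard.

    Rows containing NaN/Inf are dropped before the SVD so a degenerate
    validation epoch cannot raise "SVD did not converge"; returns NaN
    when no finite rows remain.
    """
    finite = np.isfinite(features).all(axis=1)
    feats = features[finite]
    if feats.shape[0] == 0:
        return float("nan")
    s = np.linalg.svd(feats, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return float(np.exp(-(p * np.log(p)).sum()))
```

For an orthonormal feature matrix the singular values are equal, so the effective rank equals the full rank; injecting a NaN row leaves the result unchanged instead of crashing the SVD.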
Split the flat applications/dynaclr/configs/training/ directory into per-family subfolders so related runs stay grouped and the root directory is skimmable:
- DynaCLR-2D/: 2D (and MIP) time-lapse contrastive runs
- DynaCLR-3D/: 3D time-lapse contrastive runs
- DINOv3/: DINOv3 frozen-encoder + MLP probes
- Phase-contrastive/: Phase-contrastive-timeaware

Each .yml and its paired .sh stay together in the same folder. OPS/ is organized separately (not included in this commit).

Mechanical updates:
- `base:` paths in leaf YAMLs rewritten from `recipes/...` to `../recipes/...` so composition still resolves relative to the YAML.
- `CONFIGS=` in each sbatch script now points at the new subfolder.
- `sbatch ...` comment headers in YAML and SH files updated.
- debug/ sbatch comment headers also updated for references to the renamed launch scripts.

Also:
- Deleted stale `slurm-287*.out` logs and the stray `wandb/` directory that had accumulated in the configs directory.
- Rewrote README.md to document the new layout, composition rules via `base:`, SLURM entry points, and resume semantics.

Verified composition still works via `viscy_utils.compose.load_composed_config` on a representative yml from each subfolder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
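The `base:` rewrite works because composition resolves relative to the YAML that declares the base, so moving a leaf one directory deeper only requires prefixing `../`. A stdlib-only sketch of that resolution rule (the helper name echoes but does not reproduce `viscy_utils.compose.load_composed_config`, and the flat `key: value` reader stands in for a real YAML loader):

```python
from pathlib import Path


def _read_flat_config(path: Path) -> dict:
    # Minimal "key: value" reader standing in for a real YAML loader;
    # illustration only, not a YAML parser.
    out = {}
    for line in path.read_text().splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            out[key.strip()] = value.strip()
    return out


def load_composed_config(path) -> dict:
    """Compose a leaf config onto its recipe via a relative `base:` path."""
    path = Path(path)
    cfg = _read_flat_config(path)
    base_ref = cfg.pop("base", None)
    if base_ref is not None:
        # Resolve the base relative to the YAML that references it,
        # so `../recipes/...` works from any per-family subfolder.
        base = load_composed_config((path.parent / base_ref).resolve())
        base.update(cfg)  # leaf keys override the recipe
        cfg = base
    return cfg
```

Under this rule a leaf in `DynaCLR-2D/` pointing at `../recipes/foo.yml` composes exactly as the old flat layout pointing at `recipes/foo.yml` did.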