67 commits
c54f568
utility to combine multiple ann datasets and compute dim reduction me…
edyoshikun Mar 31, 2026
497bcfa
batch z transform for 2D MIP
edyoshikun Apr 1, 2026
5f5acea
cell_index: add preprocess_cell_index and flat parquet schema extensions
edyoshikun Apr 3, 2026
f536c5a
DynaCLR data: parquet-first pipeline + CenterCrop final crop
edyoshikun Apr 3, 2026
90c697a
training configs + dataloader demo script
edyoshikun Apr 3, 2026
b50e81b
dynaclr info: handle sparse X matrices
edyoshikun Apr 4, 2026
474b66d
adding files for training
edyoshikun Apr 7, 2026
55d2004
spurious slash in the file
edyoshikun Apr 7, 2026
8bea25b
-multiexperiment prediction
edyoshikun Apr 8, 2026
2f3c1bc
Merge branch 'modular-viscy-staging' into dynadtw
edyoshikun Apr 8, 2026
bd44317
add recipes " - trainer.yml: shared seed, accelerator, logger enti…
edyoshikun Apr 8, 2026
8e0b3b9
Add linear classifier summary plots and remove evaluate_dataset.py
edyoshikun Apr 9, 2026
04e68e4
Add per-well channel validity to ChannelEntry and FOVRecord
edyoshikun Apr 10, 2026
34f04bf
Fix position filter to use is_dir() instead of name prefix check
edyoshikun Apr 10, 2026
e8ff671
Add viral_sensor and Phase3D channels to BoC collections; add v3 3D c…
edyoshikun Apr 10, 2026
e03ddda
Add base: inheritance to eval configs via load_composed_config
edyoshikun Apr 14, 2026
d6d3614
Fix channel_utils regex for PhC and BF label-free detection
edyoshikun Apr 14, 2026
44e0545
Fix ArrowStringArray compatibility in embedding writer and zarr utils
edyoshikun Apr 14, 2026
42d0879
Add PHATE n_jobs control and improve annotation join flexibility
edyoshikun Apr 14, 2026
c796a4d
Add per-class AUROC to linear classifier metrics
edyoshikun Apr 14, 2026
6e1d6c3
Add onnx, copairs, and tracking optional dependencies
edyoshikun Apr 14, 2026
00d2166
Vectorize data pipeline for large-scale performance
edyoshikun Apr 14, 2026
84a9140
Make cellanome embedding scripts work without transcriptome data
edyoshikun Apr 14, 2026
bd63bc9
Update training and collection configs; add new dataset collections
edyoshikun Apr 14, 2026
210efb0
Refactor eval orchestrator: replace SLURM scripts with Nextflow manifest
edyoshikun Apr 14, 2026
031c40f
Add MMD perturbation evaluation system
edyoshikun Apr 14, 2026
45dd151
Improve linear classifiers: auto-expand markers, save pipelines, F1-o…
edyoshikun Apr 14, 2026
7193d48
Add per-marker smoothness grouping with mean/std aggregation
edyoshikun Apr 14, 2026
f10fb51
Add CTC tracking accuracy benchmark
edyoshikun Apr 14, 2026
364f83c
Improve embedding plot rendering: rasterization, legends, histograms
edyoshikun Apr 14, 2026
a5657fd
Add eval configs for ALFI mitosis and microglia datasets
edyoshikun Apr 14, 2026
6ac0a0d
Update evaluation DAG documentation for Nextflow pipeline
edyoshikun Apr 14, 2026
b8b0fce
Rewrite pseudotime pipeline with DTW-based alignment
edyoshikun Apr 14, 2026
50cf2bb
Add dataloader profiling and inspection scripts
edyoshikun Apr 14, 2026
2a95062
Add evaluation comparison and analysis scripts
edyoshikun Apr 14, 2026
ea5f8e2
Add Airtable dataset preparation utilities
edyoshikun Apr 14, 2026
1f8f885
Add cellanome embedding configs and DAG documentation
edyoshikun Apr 14, 2026
087c4e3
Add TODO notes for ArrowStringArray workarounds pending anndata 0.13
edyoshikun Apr 14, 2026
5f49bc8
Add microscope/modality/treatment fields and auto-delete well templat…
edyoshikun Apr 15, 2026
a45b061
remove the cellanome configs
edyoshikun Apr 15, 2026
02bca0b
restructure pseudotime evals
edyoshikun Apr 15, 2026
332a8b8
Add per-cell timing metrics (Stage 3c/3d) for organelle remodeling
edyoshikun Apr 18, 2026
833f917
Document Stage 3c/3d timing metrics in pseudotime DAG
edyoshikun Apr 18, 2026
e1d0cb1
Auto-select group-by in label-timing compare step
edyoshikun Apr 18, 2026
14aefd9
Add missing viral_sensor + Phase3D experiments to LC recipe
edyoshikun Apr 18, 2026
1435f49
Fix cross-experiment FOV cache collision in triplet dataset
edyoshikun Apr 18, 2026
e500a05
Vectorize per-batch positive lookup in triplet dataset
edyoshikun Apr 18, 2026
162790a
Lazy batch generation + NumPy Categorical groupby in sampler
edyoshikun Apr 19, 2026
73fe3e1
Whitelist anchor-cache columns and coerce Categorical keys in dataset
edyoshikun Apr 19, 2026
08c0c92
Cast fov_name and well_name to Categorical after alignment
edyoshikun Apr 19, 2026
015b526
Materialize strings, mask-based FOV split, val-empty guard in datamodule
edyoshikun Apr 19, 2026
b721cd6
Cast low-cardinality strings to Categorical at parquet load
edyoshikun Apr 19, 2026
40ed2f7
Delete SaveConfigToWandb callback (DDP setup-hook deadlock)
edyoshikun Apr 19, 2026
b1730e2
Add single-marker A/B variants for 2D-MIP and 3D-BoC-v2
edyoshikun Apr 19, 2026
249e1bf
Move fastdev/tiny diagnostic configs to training/debug/
edyoshikun Apr 19, 2026
80edf4c
Include marker in temporal valid_anchors match key
edyoshikun Apr 20, 2026
1bf15de
Scope lineage reconstruction by well, not just fov
edyoshikun Apr 20, 2026
a78ad08
Bump 2D-MIP-BoC (→v2) and 3D-BoC (→v4) parquets after lineage fix
edyoshikun Apr 20, 2026
43263fe
Use FlexibleBatchSampler for val so composition matches train
edyoshikun Apr 20, 2026
ba81457
Organize training configs into per-model-family subfolders
edyoshikun Apr 20, 2026
1632b5f
remove spurious file
edyoshikun Apr 20, 2026
f4f40c3
Fix FlexibleBatchSampler DDP wiring + epoch advance + dataset mixed-C…
edyoshikun Apr 23, 2026
83cfd02
Merge branch 'modular-viscy-staging' into dynadtw
edyoshikun Apr 24, 2026
7b3daed
Delete stale DynaCLR tests after iohub 0.3.2 merge
edyoshikun Apr 24, 2026
9c14f8a
Fix MultiExperimentIndex.clone_with_subset to propagate tensorstore_c…
edyoshikun Apr 24, 2026
bc8c8bd
Make _ddp_topology robust to trainer stubs without DDP attrs
edyoshikun Apr 24, 2026
742b426
move profiling
edyoshikun Apr 24, 2026
47 changes: 47 additions & 0 deletions applications/airtable/configs/prepare_config.yml
@@ -0,0 +1,47 @@
# Dataset preparation pipeline: NFS -> VAST rechunked zarr v3
# Usage: prepare run <dataset_name> -c prepare_config.yml [--dry-run]

nfs_root: /hpc/projects/intracellular_dashboard/organelle_dynamics
vast_root: /hpc/projects/organelle_phenotyping/datasets
workspace_dir: /hpc/mydata/eduardo.hirata/repos/viscy

concatenate:
  # null = auto-detect raw channels (Phase3D + raw *). Set explicitly to override.
  channel_names: null
  chunks_czyx: [1, 16, 256, 256]
  shards_ratio: [1, 1, 8, 8, 8]
  output_ome_zarr_version: "0.5"
  conda_env: biahub
  # Override biahub's internal SLURM settings (passed via -sb flag)
  # Set to null to use biahub defaults
  sbatch_overrides:
    partition: cpu

qc:
  channel_names: [Phase3D]
  NA_det: 1.35
  lambda_ill: 0.450
  pixel_size: 0.1494
  midband_fractions: [0.125, 0.25]
  device: cuda
  num_workers: 16

preprocess:
  channel_names: -1
  num_workers: 32
  block_size: 32

# biahub concatenate submits its own SLURM jobs via submitit (no config needed)
# QC and preprocess run as separate SLURM jobs (no race condition)
slurm:
  qc:
    partition: gpu
    gres: "gpu:1"
    cpus_per_task: 16
    mem_per_cpu: 4G
    time: "00:30:00"
  preprocess:
    partition: cpu
    cpus_per_task: 32
    mem_per_cpu: 4G
    time: "04:00:00"
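As a rough illustration of how the `slurm:` section could map onto `sbatch` options, here is a minimal sketch. The `SLURM` dict is a hand-copied mirror of the config above, and `sbatch_flags` and `total_memory_gb` are hypothetical helpers, not part of this PR. The actual `prepare` CLI may build its job submissions differently.

```python
# Hypothetical mirror of the `slurm:` section of prepare_config.yml.
# The real pipeline may load and submit these settings differently.
SLURM = {
    "qc": {
        "partition": "gpu",
        "gres": "gpu:1",
        "cpus_per_task": 16,
        "mem_per_cpu": "4G",
        "time": "00:30:00",
    },
    "preprocess": {
        "partition": "cpu",
        "cpus_per_task": 32,
        "mem_per_cpu": "4G",
        "time": "04:00:00",
    },
}


def sbatch_flags(settings: dict) -> list[str]:
    """Turn snake_case config keys into `--kebab-case=value` sbatch options."""
    return [f"--{key.replace('_', '-')}={value}" for key, value in settings.items()]


def total_memory_gb(settings: dict) -> int:
    """Memory requested for a single-task job: cpus_per_task * mem_per_cpu.

    Assumes mem_per_cpu always carries a 'G' suffix, as in the config above.
    """
    per_cpu_gb = int(settings["mem_per_cpu"].removesuffix("G"))
    return settings["cpus_per_task"] * per_cpu_gb


qc_flags = sbatch_flags(SLURM["qc"])
print(" ".join(["sbatch", *qc_flags]))
print("preprocess memory (GB):", total_memory_gb(SLURM["preprocess"]))
```

Under these assumptions the QC job requests 64 GB in total and the preprocess job 128 GB, which is worth checking against node sizes on the target partitions.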
3 changes: 3 additions & 0 deletions applications/airtable/scripts/write_experiment_metadata.py
@@ -68,6 +68,9 @@ def register(position_paths: list[Path], dry_run: bool = False, dataset: str | N
    if result.updated:
        db.batch_update(result.updated)
        logger.info("Updated %d existing records", len(result.updated))
    if result.template_ids_to_delete:
        db.batch_delete(result.template_ids_to_delete)
        logger.info("Deleted %d well template records", len(result.template_ids_to_delete))

    print(format_register_summary(result, dry_run=dry_run))
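The update-then-delete ordering in this hunk can be exercised without touching Airtable. The sketch below uses a hypothetical `RegisterResult` dataclass and a `RecordingDB` fake. Only the field names `updated` and `template_ids_to_delete` come from the diff. The `dry_run` guard is an assumption, since the hunk does not show how `register` gates its writes.

```python
from dataclasses import dataclass, field


@dataclass
class RegisterResult:
    """Hypothetical stand-in for the result object used by `register`."""

    updated: list[dict] = field(default_factory=list)
    template_ids_to_delete: list[str] = field(default_factory=list)


class RecordingDB:
    """Fake database that records calls instead of contacting Airtable."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, int]] = []

    def batch_update(self, records: list[dict]) -> None:
        self.calls.append(("update", len(records)))

    def batch_delete(self, record_ids: list[str]) -> None:
        self.calls.append(("delete", len(record_ids)))


def apply_result(db, result: RegisterResult, dry_run: bool = False) -> None:
    """Apply updates first, then delete well templates (assumed dry-run guard)."""
    if dry_run:
        return  # assumption: a dry run performs no writes at all
    if result.updated:
        db.batch_update(result.updated)
    if result.template_ids_to_delete:
        db.batch_delete(result.template_ids_to_delete)


result = RegisterResult(
    updated=[{"id": "rec1", "fields": {}}],
    template_ids_to_delete=["recA", "recB"],
)
db = RecordingDB()
apply_result(db, result)
print(db.calls)
```

Skipping the calls when a list is empty mirrors the `if` guards in the diff and avoids issuing a request that has nothing to do.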

15 changes: 15 additions & 0 deletions applications/airtable/src/airtable_utils/database.py
@@ -143,3 +143,18 @@ def batch_create(self, records: list[dict]) -> list[dict]:
            Created records as returned by the Airtable API.
        """
        return self._table.batch_create([r["fields"] for r in records])

    def batch_delete(self, record_ids: list[str]) -> list[dict]:
        """Batch-delete records by ID.

        Parameters
        ----------
        record_ids : list[str]
            Airtable record IDs to delete.

        Returns
        -------
        list[dict]
            Deletion confirmations from the Airtable API.
        """
        return self._table.batch_delete(record_ids)
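One way a caller might check the returned confirmations is sketched below. `StubTable`, the trimmed `Database` wrapper, and `undeleted_ids` are hypothetical. The confirmation shape (`{"id": ..., "deleted": True}` per record) is an assumption about what the underlying table client returns and should be checked against its documentation.

```python
class StubTable:
    """Hypothetical stand-in for the table object behind `self._table`."""

    def __init__(self, record_ids: list[str]) -> None:
        self.record_ids = set(record_ids)

    def batch_delete(self, record_ids: list[str]) -> list[dict]:
        # Assumed confirmation shape: one dict per deleted record.
        self.record_ids -= set(record_ids)
        return [{"id": rid, "deleted": True} for rid in record_ids]


class Database:
    """Trimmed-down wrapper exposing only the method added in this diff."""

    def __init__(self, table) -> None:
        self._table = table

    def batch_delete(self, record_ids: list[str]) -> list[dict]:
        return self._table.batch_delete(record_ids)


def undeleted_ids(requested: list[str], confirmations: list[dict]) -> list[str]:
    """Return requested IDs that have no positive deletion confirmation."""
    confirmed = {c["id"] for c in confirmations if c.get("deleted")}
    return [rid for rid in requested if rid not in confirmed]


stub_db = Database(StubTable(["recA", "recB", "recC"]))
confirmations = stub_db.batch_delete(["recA", "recB"])
print(undeleted_ids(["recA", "recB"], confirmations))
```

Comparing requested IDs against confirmations gives the caller a cheap way to log or retry anything that was not acknowledged, rather than assuming the whole batch succeeded.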