Merged
46 changes: 5 additions & 41 deletions notebooks/braggtrack_demo.ipynb
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "id": "e65e834e",
 "metadata": {},
-"source": "# BraggTrack end-to-end demo\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/BASE-Laboratory/BraggTrack/blob/main/notebooks/braggtrack_demo.ipynb)\n\nRuns the full pipeline on the bundled `data/sample_operando/` scans:\n\n1. **Discover** — find the per-scan H5 files.\n2. **Segment (Week 2)** — LoG → h-maxima → seeded watershed → instance features.\n3. **Track physics-only (Week 3)** — Hungarian over a geometry cost with per-axis gating; build a lifecycle DAG.\n4. **Semantic descriptors (Week 4)** — orthogonal MIPs + frozen-encoder embeddings.\n5. **Geometry + semantic tracking (Week 4)** — compose `α · geometry + β · (1 − cos)`.\n6. **α/β ablation** — how the semantic weight shifts tracking metrics.\n7. **Synthetic crossing** — a case where geometry alone fails and semantics recover identity.\n\nFinal section shows the one-line CLI equivalents for each stage.\n\nThis notebook uses the **mock** DINO backend by default, so no PyTorch / HuggingFace weights are required. Set `BRAGGTRACK_DINO_BACKEND=torch` if you have them installed and want real embeddings."
+"source": "# BraggTrack end-to-end demo\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/BASE-Laboratory/BraggTrack/blob/main/notebooks/braggtrack_demo.ipynb)\n\nRuns the full pipeline on the bundled `data/sample_operando/` scans:\n\n1. **Discover** — find the per-scan H5 files.\n2. **Segment** — LoG → h-maxima → seeded watershed → instance features.\n3. **Track (physics-only)** — Hungarian over a geometry cost with per-axis gating; build a lifecycle DAG.\n4. **Semantic descriptors** — orthogonal MIPs + frozen-encoder embeddings.\n5. **Geometry + semantic tracking** — compose `α · geometry + β · (1 − cos)`.\n6. **α/β ablation** — how the semantic weight shifts tracking metrics.\n7. **Synthetic crossing** — a case where geometry alone fails and semantics recover identity.\n\nFinal section shows the one-line CLI equivalents for each stage.\n\nThis notebook uses the **mock** DINO backend by default, so no PyTorch / HuggingFace weights are required. Set `BRAGGTRACK_DINO_BACKEND=torch` if you have them installed and want real embeddings."
 },
 {
 "cell_type": "markdown",
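Note on step 5 of the overview cell: the composition `α · geometry + β · (1 − cos)` can be sketched in a few lines of NumPy. This is an illustrative sketch only. The helper name `combined_cost`, its signature, and the default weights are assumptions made for this note, not BraggTrack's API, and the embeddings are assumed to be non-zero row vectors.

```python
import numpy as np


def combined_cost(geometry_cost, emb_prev, emb_next, alpha=1.0, beta=0.5):
    """Hypothetical helper: alpha * geometry + beta * (1 - cosine similarity).

    geometry_cost: (n_prev, n_next) pairwise geometry cost.
    emb_prev, emb_next: (n_prev, d) and (n_next, d) embeddings (assumed non-zero).
    """
    a = emb_prev / np.linalg.norm(emb_prev, axis=1, keepdims=True)
    b = emb_next / np.linalg.norm(emb_next, axis=1, keepdims=True)
    cosine = a @ b.T  # pairwise cosine similarity, shape (n_prev, n_next)
    return alpha * np.asarray(geometry_cost, dtype=float) + beta * (1.0 - cosine)


# Geometry cannot separate the two candidates; the embeddings can.
geometry = np.full((2, 2), 0.1)
emb_prev = np.array([[1.0, 0.0], [0.0, 1.0]])
emb_next = np.array([[0.0, 2.0], [3.0, 0.0]])
cost = combined_cost(geometry, emb_prev, emb_next)
print(cost)
```

With a flat geometry cost, the semantic term alone decides the match: the off-diagonal pairs, whose embeddings point the same way, come out cheaper.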
@@ -103,7 +103,7 @@
 "cell_type": "markdown",
 "id": "d74a25a8",
 "metadata": {},
-"source": "## 2 — Week 2: classical segmentation\n\n`segment_classical` runs 3-D Gaussian blur → Laplacian → h-maxima seeds → seeded watershed.\n\n### Threshold stabilisation across scans\n\nEach scan produces its own Otsu threshold on the raw intensity histogram.\nIn theory these should be nearly identical for back-to-back operando acquisitions,\nbut minor intensity fluctuations (beam drift, detector warm-up, etc.) cause\nper-frame Otsu to jitter — and because everything downstream (foreground mask → seed\nfloor → watershed) is threshold-sensitive, small jitter produces wildly different\nspot counts.\n\n**Fix:** compute per-frame Otsu thresholds, then pass them through a\nrolling-median smoother (`smooth_thresholds`). The median suppresses isolated\noutliers (beam drops, detector flashes) while still tracking genuine long-term\ndrift. For 500+ frame sequences this runs in O(N·W) on scalar thresholds —\nno need to pool raw volumes in memory.\n\nTwo further knobs that matter for real data:\n\n* `threshold` — **intensity-domain** foreground, now smoothed across scans. Controls the watershed mask.\n* `seed_peak_fraction` / `seed_response_percentile` — **LoG-response-domain** admissibility floor inside the foreground."
+"source": "## 2 — Classical segmentation\n\n`segment_classical` runs 3-D Gaussian blur → Laplacian → h-maxima seeds → seeded watershed.\n\n### Threshold stabilisation across scans\n\nEach scan produces its own Otsu threshold on the raw intensity histogram.\nIn theory these should be nearly identical for back-to-back operando acquisitions,\nbut minor intensity fluctuations (beam drift, detector warm-up, etc.) cause\nper-frame Otsu to jitter — and because everything downstream (foreground mask → seed\nfloor → watershed) is threshold-sensitive, small jitter produces wildly different\nspot counts.\n\n**Fix:** compute per-frame Otsu thresholds, then pass them through a\nrolling-median smoother (`smooth_thresholds`). The median suppresses isolated\noutliers (beam drops, detector flashes) while still tracking genuine long-term\ndrift. For 500+ frame sequences this runs in O(N·W) on scalar thresholds —\nno need to pool raw volumes in memory.\n\nTwo further knobs that matter for real data:\n\n* `threshold` — **intensity-domain** foreground, now smoothed across scans. Controls the watershed mask.\n* `seed_peak_fraction` / `seed_response_percentile` — **LoG-response-domain** admissibility floor inside the foreground."
 },
 {
 "cell_type": "code",
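Note on the threshold-stabilisation cell: the rolling-median idea can be illustrated on a plain list of scalar thresholds. The sketch below is not the library's `smooth_thresholds`. The function name `rolling_median`, the centred window, and the truncation at the sequence ends are all assumptions chosen for illustration.

```python
from statistics import median


def rolling_median(values, window=5):
    """Centred rolling median over scalars; the window is truncated at both ends.

    Hypothetical stand-in for illustration, not BraggTrack's smooth_thresholds.
    """
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Frame 2 simulates an isolated beam drop in otherwise slowly drifting thresholds.
thresholds = [100.0, 101.0, 40.0, 102.0, 103.0, 104.0]
smoothed = rolling_median(thresholds, window=3)
print(smoothed)  # [100.5, 100.0, 101.0, 102.0, 103.0, 103.5]
```

The outlier at frame 2 is replaced by a neighbouring value while the slow upward drift survives, which is the behaviour the cell describes. Only one scalar per frame is held in memory.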
@@ -229,11 +229,7 @@
 "cell_type": "markdown",
 "id": "6fc3abf4",
 "metadata": {},
-"source": [
-"## 3 — Week 3: physics-only tracking\n",
-"\n",
-"`PositionShapeCost` combines squared centroid distance with squared eigenvalue distance; `build_tracks` runs pairwise Hungarian assignments and stitches them into a NetworkX `DiGraph` with `TrackEvent` annotations."
-]
+"source": "## 3 — Physics-only tracking\n\n`PositionShapeCost` combines squared centroid distance with squared eigenvalue distance; `build_tracks` runs pairwise Hungarian assignments and stitches them into a NetworkX `DiGraph` with `TrackEvent` annotations."
 },
 {
 "cell_type": "code",
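Note on the physics-only tracking cell: one pairwise step (squared centroid distance plus squared eigenvalue distance, per-axis gating, then Hungarian assignment) might look roughly like the sketch below. It uses SciPy's `linear_sum_assignment`. The function `match_spots`, the gate values, and the large-cost trick for forbidden pairs are assumptions for illustration and do not reproduce `PositionShapeCost` or `build_tracks`.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_spots(cent_a, eig_a, cent_b, eig_b, gate=(5.0, 5.0, 5.0)):
    """Hypothetical pairwise matcher between two consecutive scans.

    Cost = squared centroid distance + squared eigenvalue distance.
    Pairs whose centroid offset exceeds the gate on any axis are forbidden.
    """
    cent_a, cent_b = np.asarray(cent_a, float), np.asarray(cent_b, float)
    eig_a, eig_b = np.asarray(eig_a, float), np.asarray(eig_b, float)
    delta = cent_a[:, None, :] - cent_b[None, :, :]  # (n_a, n_b, 3)
    cost = (delta ** 2).sum(-1) + ((eig_a[:, None, :] - eig_b[None, :, :]) ** 2).sum(-1)
    allowed = (np.abs(delta) <= np.asarray(gate)).all(-1)
    # Forbidden pairs get a large finite cost, then are dropped after assignment.
    rows, cols = linear_sum_assignment(np.where(allowed, cost, 1e9))
    return [(int(r), int(c)) for r, c in zip(rows, cols) if allowed[r, c]]


cent_prev = [[0, 0, 0], [10, 10, 10]]
cent_next = [[10, 11, 10], [1, 0, 0], [50, 50, 50]]
eig_prev = np.ones((2, 3))
eig_next = np.ones((3, 3))
matches = match_spots(cent_prev, eig_prev, cent_next, eig_next)
print(matches)  # [(0, 1), (1, 0)]
```

The third spot in the later scan is left unmatched. In a lifecycle graph, an unmatched spot like this would typically be recorded as a birth rather than forced onto a distant partner.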
@@ -382,11 +378,7 @@
 "cell_type": "markdown",
 "id": "94bdcacb",
 "metadata": {},
-"source": [
-"## 4 — Week 4: multi-view MIPs\n",
-"\n",
-"For each spot, crop a padded sub-volume, zero out voxels that don't belong to the instance, and take three maximum-intensity projections — one along each physical axis."
-]
+"source": "## 4 — Multi-view MIPs\n\nFor each spot, crop a padded sub-volume, zero out voxels that don't belong to the instance, and take three maximum-intensity projections — one along each physical axis."
 },
 {
 "cell_type": "code",
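Note on the multi-view MIP cell: the three steps it lists (padded crop, masking to the instance, one projection per axis) can be sketched with NumPy as follows. The helper `instance_mips` and its `pad` parameter are hypothetical names for this note. The sketch assumes the label is present in `labels` and that intensities are non-negative, so zeroed voxels never win the maximum.

```python
import numpy as np


def instance_mips(volume, labels, label_id, pad=1):
    """Hypothetical helper: three max-intensity projections of one labelled spot."""
    mask = labels == label_id
    idx = np.argwhere(mask)  # voxel coordinates of the instance (assumed non-empty)
    lo = np.maximum(idx.min(axis=0) - pad, 0)
    hi = np.minimum(idx.max(axis=0) + 1 + pad, volume.shape)
    crop = tuple(slice(a, b) for a, b in zip(lo, hi))
    # Zero out voxels in the padded crop that belong to other instances or background.
    sub = np.where(mask[crop], volume[crop], 0)
    return [sub.max(axis=axis) for axis in range(3)]


volume = np.arange(4 * 5 * 6, dtype=float).reshape(4, 5, 6)
labels = np.zeros(volume.shape, dtype=int)
labels[1:3, 2:4, 3:5] = 7
mips = instance_mips(volume, labels, label_id=7)
print([m.shape for m in mips])
```

Each projection collapses one axis of the padded crop, so the three views have the shapes of the crop's three faces.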
@@ -724,35 +716,7 @@
 "cell_type": "markdown",
 "id": "7e605de2",
 "metadata": {},
-"source": [
-"## 8 — The same pipeline from the command line\n",
-"\n",
-"Every library call above is exposed as a CLI — feed a dataset root and an output directory, get reproducible artifacts under `artifacts/`.\n",
-"\n",
-"```bash\n",
-"# 1. Segment every scan under data/sample_operando/\n",
-"python -m braggtrack.cli.segment_dataset --outdir artifacts/week2\n",
-"\n",
-"# 2. Compute mock multi-view embeddings\n",
-"python -m braggtrack.cli.embed_dataset --segdir artifacts/week2 --outdir artifacts/week4 --backend mock\n",
-"\n",
-"# 3. Track with geometry + semantic cost (β=0.5)\n",
-"python -m braggtrack.cli.track_dataset artifacts/week2 \\\n",
-" --outdir artifacts/week3 \\\n",
-" --embedding-dir artifacts/week4 \\\n",
-" --cost-alpha 1.0 --cost-beta 0.5\n",
-"\n",
-"# 4. Ablate α/β and write a JSON report\n",
-"python scripts/ablation_week4.py \\\n",
-" --indir artifacts/week2 \\\n",
-" --embedding-dir artifacts/week4 \\\n",
-" --betas 0,0.25,0.5,1.0 \\\n",
-" --output artifacts/week4_ablation/report.json\n",
-"\n",
-"# 5. Full CI-equivalent check (unit tests + all weekly acceptance gates)\n",
-"python scripts/ci_report.py\n",
-"```"
-]
+"source": "## 8 — The same pipeline from the command line\n\nEvery library call above is exposed as a CLI — feed a dataset root and an output directory, get reproducible artifacts under `artifacts/`.\n\n```bash\n# 1. Segment every scan under data/sample_operando/\npython -m braggtrack.cli.segment_dataset --outdir artifacts/segmentation\n\n# 2. Compute mock multi-view embeddings\npython -m braggtrack.cli.embed_dataset --segdir artifacts/segmentation --outdir artifacts/embedding --backend mock\n\n# 3. Track with geometry + semantic cost (β=0.5)\npython -m braggtrack.cli.track_dataset artifacts/segmentation \\\n --outdir artifacts/tracking \\\n --embedding-dir artifacts/embedding \\\n --cost-alpha 1.0 --cost-beta 0.5\n\n# 4. Ablate α/β and write a JSON report\npython scripts/ablation_semantic.py \\\n --indir artifacts/segmentation \\\n --embedding-dir artifacts/embedding \\\n --betas 0,0.25,0.5,1.0 \\\n --output artifacts/ablation/report.json\n\n# 5. Full CI-equivalent check (unit tests + all acceptance gates)\npython scripts/ci_report.py\n```"
 }
 ],
 "metadata": {