diff --git a/docs/AI_DOCUMENTATION_INDEX.md b/docs/AI_DOCUMENTATION_INDEX.md new file mode 100644 index 00000000..c3bd5d6e --- /dev/null +++ b/docs/AI_DOCUMENTATION_INDEX.md @@ -0,0 +1,117 @@ +# Documentation Index + +> **Note**: This documentation was generated with AI assistance (Claude, Anthropic) + +--- + +## Overview + +Complete documentation for the linumpy microscopy processing library. The library provides tools for processing Serial Optical Coherence Tomography (S-OCT) data into reconstructed 3D volumes. + +--- + +## Pipeline Documentation + +### Workflow Guides + +1. **[PIPELINE_OVERVIEW.md](PIPELINE_OVERVIEW.md)** - Complete overview of preprocessing and 3D reconstruction pipelines +2. **[NEXTFLOW_WORKFLOWS.md](NEXTFLOW_WORKFLOWS.md)** - Nextflow workflow configuration and execution guide + +### Data Formats + +3. **[MOSAIC_GRID_FORMAT.md](MOSAIC_GRID_FORMAT.md)** - OME-Zarr mosaic grid format specification +4. **[SHIFTS_FILE_FORMAT.md](SHIFTS_FILE_FORMAT.md)** - XY shifts CSV file format and usage + +--- + +## Feature Documentation + +5. **[SLICE_CONFIG_FEATURE.md](SLICE_CONFIG_FEATURE.md)** - Slice selection and filtering system +6. **[SLICE_INTERPOLATION_FEATURE.md](SLICE_INTERPOLATION_FEATURE.md)** - Missing slice reconstruction using registration-based morphing +7. **[GPU_ACCELERATION.md](GPU_ACCELERATION.md)** - GPU acceleration using NVIDIA CUDA/CuPy + +--- + +## Reference + +8. **[SCRIPTS_REFERENCE.md](SCRIPTS_REFERENCE.md)** - Command-line scripts reference guide +9. **[LIBRARY_MODULES.md](LIBRARY_MODULES.md)** - Python library module documentation + +--- + +## Contributing + +10. 
**[CONTRIBUTING.md](CONTRIBUTING.md)** - Contribution guidelines + +--- + +## Source Code Structure + +``` +linumpy/ +├── _thread_config.py # CPU thread management (NumPy/SciPy/Dask/JAX/ITK) +├── io/ # Input/output modules +│ ├── allen.py # Allen Brain Atlas integration +│ ├── npz.py # NPZ file handling +│ ├── test_data.py # Test dataset access +│ ├── thorlabs.py # Thorlabs microscope support +│ └── zarr.py # OME-Zarr I/O +├── gpu/ # GPU acceleration +│ ├── __init__.py # GPU detection & utilities +│ ├── array_ops.py # Per-pixel operations +│ ├── corrections.py # Galvo detection +│ ├── cuda_env.py # CUDA environment utilities +│ ├── fft_ops.py # FFT & phase correlation +│ ├── image_quality.py # GPU image quality assessment +│ ├── interpolation.py # Resampling & transforms +│ ├── morphology.py # Binary operations & filtering +│ └── registration.py # Hybrid GPU/CPU registration +├── microscope/ # Microscope-specific modules +│ └── oct.py # OCT tile reading +├── preproc/ # Preprocessing modules +│ ├── icorr.py # Illumination correction +│ ├── normalization.py # Intensity normalization +│ ├── resampling.py # Mosaic grid resampling utilities +│ └── xyzcorr.py # XYZ correction & galvo shift detection +├── psf/ # Point spread function +│ └── psf_estimator.py # PSF estimation +├── stitching/ # Image stitching +│ ├── FileUtils.py # File handling utilities +│ ├── interpolation.py # Missing-slice interpolation +│ ├── manual_registration.py # GUI-based manual registration +│ ├── motor.py # Motor-position-based tile placement +│ ├── registration.py # Image registration +│ ├── stacking.py # 3D slice stacking utilities +│ ├── stitch_utils.py # Stitching utilities +│ └── topology.py # Mosaic topology +├── utils/ # Utility modules +│ ├── data_io.py # Data I/O helpers +│ ├── image_quality.py # Image quality assessment (SSIM, edge, variance) +│ ├── io.py # CLI argument helpers +│ ├── metrics.py # Quality metrics collection +│ ├── mosaic_grid.py # Mosaic grid utilities +│ ├── 
orientation.py # Volume orientation codes & RAS transforms +│ ├── shifts.py # XY shift loading & outlier filtering +│ └── visualization.py # Orthogonal view screenshots +├── reconstruction.py # Core reconstruction +├── segmentation.py # Segmentation tools +└── utils_images.py # Image utilities +``` + +--- + +## Workflow Files + +``` +workflows/ +├── preproc/ +│ ├── nextflow.config # Preprocessing config +│ └── preproc_rawtiles.nf # Raw tiles → mosaic grids +├── reconst_3d/ +│ ├── nextflow.config # 3D reconstruction config +│ └── soct_3d_reconst.nf # Mosaic grids → 3D volume +└── reconst_2.5d/ + ├── soct_2.5d_reconst.nf # 2.5D reconstruction workflow + ├── soct_2.5d_reconst_beluga.config # Beluga HPC cluster config + └── soct_2.5d_reconst_docker.config # Docker container config +``` diff --git a/docs/GPU_ACCELERATION.md b/docs/GPU_ACCELERATION.md new file mode 100644 index 00000000..2499ea0e --- /dev/null +++ b/docs/GPU_ACCELERATION.md @@ -0,0 +1,289 @@ +# GPU Acceleration + + +--- + +## Overview + +linumpy supports GPU acceleration for compute-intensive operations using NVIDIA CUDA via CuPy. GPU acceleration is **optional** - all functions automatically fall back to CPU (NumPy/SciPy) if: + +- CuPy is not installed +- No CUDA-capable GPU is available +- GPU memory is insufficient + +--- + +## Quick Start + +```bash +# Check your CUDA version +nvidia-smi | grep "CUDA Version" + +# Install linumpy with GPU support (choose your CUDA version) +pip install linumpy[gpu] # CUDA 12.x (default) +pip install linumpy[gpu-cuda11] # CUDA 11.x +pip install linumpy[gpu-cuda13] # CUDA 13.x (requires extra setup for JAX) + +# Verify GPU +linum_gpu_info.py +linum_diagnose_pipeline.py --benchmark +``` + +--- + +## Installation + +### Requirements + +- NVIDIA GPU with CUDA Compute Capability 3.0+ +- CUDA Toolkit 11.x, 12.x, or 13.x +- CuPy matching your CUDA version + +**Recommended GPU:** NVIDIA A6000 (48GB) or similar professional GPU. 
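The automatic CPU fallback described in the overview can be sketched as follows. This is an illustrative pattern, not linumpy's internal code: `clip_percentile` here is a standalone stand-in for the version in `linumpy.gpu.array_ops`, and only the NumPy/CuPy calls shown are real APIs.

```python
import numpy as np

try:
    import cupy as cp
    # A device query confirms a usable GPU, not merely an installed CuPy.
    GPU_OK = cp.cuda.runtime.getDeviceCount() > 0
except Exception:  # ImportError, or CUDA runtime errors on broken installs
    cp = None
    GPU_OK = False

def clip_percentile(image, p_low=1.0, p_high=99.0, use_gpu=True):
    """Clip intensities to a percentile range, on GPU when possible.

    Hypothetical wrapper illustrating the fallback policy: any GPU
    failure (no CuPy, no device, out of memory) degrades to NumPy.
    """
    if use_gpu and GPU_OK:
        try:
            d = cp.asarray(image)                  # host -> device transfer
            lo, hi = cp.percentile(d, [p_low, p_high])
            return cp.asnumpy(cp.clip(d, lo, hi))  # device -> host
        except cp.cuda.memory.OutOfMemoryError:
            pass                                   # fall through to CPU
    lo, hi = np.percentile(image, [p_low, p_high])
    return np.clip(image, lo, hi)
```

The same call works on a CPU-only machine, so pipeline scripts can expose a single `--use_gpu` flag without branching elsewhere.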
+ +### CuPy Version Reference + +| CUDA Version | CuPy Package | +|--------------|--------------| +| CUDA 11.x | `cupy-cuda11x` | +| CUDA 12.x | `cupy-cuda12x` | +| CUDA 13.x | `cupy-cuda13x` | + +--- + +## JAX GPU for BaSiCPy (fix_illumination) + +The `fix_illumination` step uses BaSiCPy which is built on JAX. JAX GPU requires additional setup. + +### Important: JAX 0.4.23 Library Requirements + +BaSiCPy requires `jax<=0.4.23`. JAX 0.4.23 was compiled against specific library versions: +- cuSOLVER 11 (libcusolver.so.11) +- cuSPARSE 12 (libcusparse.so.12) +- cuFFT 11 (libcufft.so.11) +- cuBLAS 12 (libcublas.so.12) +- cuDNN 8 (libcudnn.so.8) + +These exact versions are only available in **specific pinned versions** of the `nvidia-xxx-cu12` packages. Newer versions of these packages have different `.so` versions that are **incompatible**. + +### Automated Setup (Recommended) + +```bash +# Run the fix script - handles everything +source shell_scripts/fix_jax_cuda_plugin.sh +``` + +This script: +1. Removes conflicting nvidia packages +2. Installs JAX 0.4.23 with **pinned nvidia package versions**: + - `nvidia-cublas-cu12==12.3.4.1` + - `nvidia-cudnn-cu12==8.9.7.29` + - `nvidia-cusolver-cu12==11.5.4.101` + - etc. +3. Applies patchelf fix (required for Linux 6.x+ kernels) +4. Sets up LD_LIBRARY_PATH +5. Tests JAX CUDA with SVD operation + +### Manual Setup + +If you prefer manual setup: + +```bash +# 1. Uninstall all conflicting packages +pip uninstall -y jax jaxlib jax-cuda12-plugin nvidia-cusolver nvidia-cufft \ + nvidia-cusparse nvidia-cublas nvidia-cuda-runtime nvidia-cudnn nvidia-nvjitlink \ + nvidia-cublas-cu12 nvidia-cuda-cupti-cu12 nvidia-cuda-runtime-cu12 \ + nvidia-cudnn-cu12 nvidia-cufft-cu12 nvidia-cusolver-cu12 nvidia-cusparse-cu12 \ + nvidia-nccl-cu12 nvidia-nvjitlink-cu12 + +# 2. Install JAX 0.4.23 with CUDA wheel +pip install 'jax==0.4.23' 'jaxlib==0.4.23+cuda12.cudnn89' \ + -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html + +# 3. 
Install PINNED nvidia package versions (critical - newer versions won't work!) +pip install \ + 'nvidia-cublas-cu12==12.3.4.1' \ + 'nvidia-cuda-cupti-cu12==12.3.101' \ + 'nvidia-cuda-runtime-cu12==12.3.101' \ + 'nvidia-cudnn-cu12==8.9.7.29' \ + 'nvidia-cufft-cu12==11.0.12.1' \ + 'nvidia-cusolver-cu12==11.5.4.101' \ + 'nvidia-cusparse-cu12==12.2.0.103' \ + 'nvidia-nccl-cu12==2.19.3' \ + 'nvidia-nvjitlink-cu12==12.3.101' + +# 4. Apply patchelf fix (required for modern Linux kernels) +sudo apt install patchelf +JAXLIB_PATH=$(python -c "import jaxlib; print(jaxlib.__path__[0])") +find "$JAXLIB_PATH" -name "*.so" -exec patchelf --clear-execstack {} \; +find $(python -c "import site; print(site.getsitepackages()[0])")/jax_plugins \ + -name "*.so" -exec patchelf --clear-execstack {} \; + +# 5. Set LD_LIBRARY_PATH (before running JAX/BaSiCPy) +SP=$(python -c "import site; print(site.getsitepackages()[0])") +export LD_LIBRARY_PATH="${SP}/nvidia/cublas/lib:${SP}/nvidia/cuda_runtime/lib:${SP}/nvidia/cusolver/lib:${SP}/nvidia/cusparse/lib:${SP}/nvidia/cufft/lib:${SP}/nvidia/cudnn/lib:${LD_LIBRARY_PATH}" + +# 6. 
Test +python -c "import jax; print(jax.devices()); import jax.numpy as jnp; print(jnp.linalg.svd(jnp.eye(2)))" +``` + +### Verify Installation + +```bash +# Check GPU availability +linum_gpu_info.py + +# Run full pipeline diagnostics +linum_diagnose_pipeline.py --benchmark +``` + +--- + +## GPU-Accelerated Scripts + +| GPU Script | CPU Equivalent | Typical Speedup | +|------------|----------------|-----------------| +| `linum_estimate_transform_gpu.py` | `linum_estimate_transform.py` | 8-47x | +| `linum_create_masks_gpu.py` | `linum_create_masks.py` | 7-67x | +| `linum_create_mosaic_grid_3d_gpu.py` | `linum_create_mosaic_grid_3d.py` | 5-12x | +| `linum_resample_mosaic_grid_gpu.py` | `linum_resample_mosaic_grid.py` | 5-12x | +| `linum_normalize_intensities_per_slice_gpu.py` | `linum_normalize_intensities_per_slice.py` | 4-10x | +| `linum_fix_illumination_3d_gpu.py` | `linum_fix_illumination_3d.py` | 2-5x | +| `linum_assess_slice_quality_gpu.py` | `linum_assess_slice_quality.py` | 3-8x | +| `linum_aip_gpu.py` | `linum_aip_png.py` | ≤1x (mean projection; transfer overhead dominates for typical sizes) | + +### Usage + +```bash +# Use GPU (default) +linum_create_masks_gpu.py input.ome.zarr output.ome.zarr + +# Disable GPU (force CPU) +linum_create_masks_gpu.py input.ome.zarr output.ome.zarr --no-use_gpu +``` + +--- + +## GPU-Accelerated Operations + +### Major Improvements (7-70x speedup) + +| Operation | Function | Typical Speedup | +|-----------|----------|-----------------| +| Binary Morphology | `binary_closing()`, etc. 
| 7-67x | +| FFT/iFFT | `fft2()`, `ifft2()` | 9-47x | +| Gaussian Filter | `gaussian_filter()` | 7-20x | +| Phase Correlation | `phase_correlation()` | 8-16x | +| Resampling | `resize()` | 5-12x | + +### Medium Improvements (4-10x speedup) + +| Operation | Typical Speedup | +|-----------|-----------------| +| Normalization | 4-10x | +| Percentile Clipping | 4-10x | +| Interpolation | 5-10x | +| Mask Creation | 2-4x | + +### No GPU Benefit (use CPU) + +| Operation | Reason | +|-----------|--------| +| Mean/Max Projection | Simple reduction, transfer overhead dominates | +| Galvo Detection | Simple computation | + +--- + +## Multi-GPU Systems + +```bash +# Show status of all GPUs +linum_gpu_info.py --status + +# Select best GPU (most free memory) +linum_gpu_info.py --select-best + +# Use specific GPU via environment +CUDA_VISIBLE_DEVICES=1 nextflow run pipeline.nf --use_gpu true +``` + +--- + +## Memory Management + +The NVIDIA A6000's 48GB VRAM typically holds entire mosaic grids. For larger volumes: + +```python +import cupy as cp + +# Clear GPU memory cache +cp.get_default_memory_pool().free_all_blocks() + +# Check memory usage +mempool = cp.get_default_memory_pool() +print(f"GPU memory used: {mempool.used_bytes() / 1e9:.2f} GB") +``` + +--- + +## Troubleshooting + +### GPU Not Detected + +```bash +nvidia-smi +python -c "import cupy; print(cupy.cuda.runtime.getDeviceCount())" +linum_gpu_info.py +``` + +### JAX CUDA Issues + +```bash +# Run the fix script +source shell_scripts/fix_jax_cuda_plugin.sh + +# Or check diagnostics +linum_diagnose_pipeline.py --debug-cuda +``` + +### Out of Memory + +- Reduce batch size / chunk size +- Use `cp.get_default_memory_pool().free_all_blocks()` between operations +- Set `use_gpu=False` for specific operations + +--- + +## Reference Benchmarks + +Actual benchmarks on NVIDIA RTX A6000 (48GB): + +| Operation | Image Size | CPU Time | GPU Time | Speedup | +|-----------|------------|----------|----------|---------| +| FFT2 | 
2048×2048 | 207ms | 4.4ms | 47x | +| Phase Correlation | 2048×2048 | 3876ms | 240ms | 16x | +| Gaussian Filter | 2048×2048 | 79ms | 4.1ms | 20x | +| Binary Closing | 2048×2048 | 99ms | 1.5ms | 67x | +| Resize | 2048→1024 | 29ms | 3.4ms | 9x | +| Mean Projection | 100×2048×2048 | 71ms | 146ms | **0.5x** | + +--- + +## Module Reference + +```python +from linumpy.gpu import ( + GPU_AVAILABLE, # bool: Is GPU available? + CUPY_AVAILABLE, # bool: Is CuPy installed? + GPU_DEVICE_NAME, # str: GPU name + GPU_MEMORY_GB, # float: GPU memory in GB + to_gpu, to_cpu, # Transfer functions + print_gpu_info, # Print info +) + +# Submodules +from linumpy.gpu.fft_ops import fft2, ifft2, phase_correlation +from linumpy.gpu.interpolation import resize, affine_transform +from linumpy.gpu.morphology import binary_closing, gaussian_filter +from linumpy.gpu.array_ops import normalize_percentile, clip_percentile +``` diff --git a/docs/LIBRARY_MODULES.md b/docs/LIBRARY_MODULES.md new file mode 100644 index 00000000..7fc3bb3a --- /dev/null +++ b/docs/LIBRARY_MODULES.md @@ -0,0 +1,819 @@ +# Library Modules Documentation + + +--- + +## Overview + +The linumpy library provides Python modules for microscopy data processing. This document describes the main modules and their functionality. + +--- + +## Module Structure + +``` +linumpy/ +├── _thread_config.py # CPU thread management +├── __init__.py +├── io/ # Input/Output modules +├── gpu/ # GPU acceleration modules +├── microscope/ # Microscope-specific modules +├── preproc/ # Preprocessing modules +├── psf/ # Point spread function +├── stitching/ # Image stitching +├── utils/ # Utility modules +├── reconstruction.py # Core reconstruction +├── segmentation.py # Segmentation tools +└── utils_images.py # Image utilities +``` + +--- + +## I/O Modules (`linumpy.io`) + +### zarr.py - OME-Zarr I/O + +Read and write OME-Zarr format files. 
+ +```python +from linumpy.io.zarr import read_omezarr, save_omezarr, OmeZarrWriter, AnalysisOmeZarrWriter + +# Read OME-Zarr +image, resolution = read_omezarr("input.ome.zarr") +# image: dask.array.Array +# resolution: tuple (z, x, y) in mm/pixel + +# Save OME-Zarr +save_omezarr(image, "output.ome.zarr", resolution, chunks=(10, 128, 128)) + +# Write large volumes incrementally (traditional power-of-2 pyramid) +writer = OmeZarrWriter("output.ome.zarr", shape, chunks, dtype) +writer[0:10] = data_slice +writer.finalize(resolution, n_levels=5) # 2x downsampling per level + +# Write with analysis-optimized resolutions (10, 25, 50, 100 µm) +writer = AnalysisOmeZarrWriter("output.ome.zarr", shape, chunks, dtype) +writer[0:10] = data_slice +writer.finalize(resolution, [10, 25, 50, 100]) # or use default +writer.finalize(resolution) # defaults to [10, 25, 50, 100] µm +``` + +### allen.py - Allen Brain Atlas + +Download and access Allen Brain Atlas data. + +```python +from linumpy.io.allen import download_allen_atlas + +# Download atlas +download_allen_atlas(output_dir) +``` + +### npz.py - NPZ Files + +Handle NumPy compressed archives. + +```python +from linumpy.io.npz import load_npz, save_npz + +data = load_npz("data.npz") +save_npz("output.npz", array) +``` + +### thorlabs.py - Thorlabs Files + +Read Thorlabs microscope data formats. + +```python +from linumpy.io.thorlabs import read_thorlabs + +data = read_thorlabs("thorlabs_file") +``` + +### test_data.py - Test Data + +Access test datasets for development. + +```python +from linumpy.io.test_data import get_data + +# Get path to test data +raw_tiles_path = get_data('raw_tiles') +``` + +--- + +## Microscope Module (`linumpy.microscope`) + +### oct.py - OCT Tile Reading + +Read raw OCT tiles with metadata and optional corrections. 
+ +```python +from linumpy.microscope.oct import OCT + +# Open tile +tile = OCT("/path/to/tile_x00_y00_z00") + +# Access properties +shape = tile.shape # (z, x, y) +resolution = tile.resolution # (res_z, res_x, res_y) +dimension = tile.dimension # Physical dimensions +position = tile.position # Stage position (if available) + +# Read data with corrections +data = tile.load_image( + crop=True, # Crop galvo return region + fix_galvo_shift=True, # Auto-detect and fix galvo artifacts + fix_camera_shift=False # Fix camera timing shift (old data) +) +``` + +**Properties:** +- `shape`: Data shape (z, x, y) +- `resolution`: Pixel size in mm +- `dimension`: Physical dimensions in mm +- `position`: Stage position (x, y, z) in mm +- `position_available`: Whether position metadata exists + +**load_image() Parameters:** +- `crop`: Remove extra pixels from galvo return +- `fix_galvo_shift`: + - `True`: Auto-detect artifact and fix if confident (≥0.3 confidence) + - `int`: Apply specific shift value + - `False`: No correction +- `fix_camera_shift`: Correct camera timing offset (legacy data) + +--- + +## Preprocessing Modules (`linumpy.preproc`) + +### normalization.py - Intensity Normalization + +Normalize OCT volume intensities based on agarose background. + +```python +from linumpy.preproc.normalization import normalize_volume + +# Normalize volume intensities +# agarose_mask: 2D binary mask indicating agarose regions +normalized, background_thresholds = normalize_volume( + vol, # Input volume (Z, X, Y) + agarose_mask, # 2D mask for agarose detection + percentile_max=99.9 # Clip values above this percentile +) +``` + +### icorr.py - Illumination Correction + +Correct illumination inhomogeneity. 
+ +```python +from linumpy.preproc.icorr import estimate_illumination, apply_illumination_correction + +# Estimate illumination profile +profile = estimate_illumination(image) + +# Apply correction +corrected = apply_illumination_correction(image, profile) +``` + +### xyzcorr.py - XYZ Corrections + +Apply spatial corrections including galvo shift detection and correction. + +```python +from linumpy.preproc.xyzcorr import ( + detect_galvo_shift, + detect_galvo_for_slice, + detect_galvo_artifact_presence, + fix_galvo_shift, + findTissueInterface, + cropVolume +) + +# Detect galvo shift with confidence score +aip = volume.mean(axis=0) # Average intensity projection +shift, confidence = detect_galvo_shift( + aip, + n_pixel_return=40, # Number of pixels in galvo return region + return_confidence=True # Return confidence score (0-1) +) + +# Only apply fix if confident (artifact is present) +if confidence >= 0.3: + corrected = fix_galvo_shift(volume, shift=shift, axis=1) + +# Or apply with known shift value +corrected = fix_galvo_shift(volume, shift=15, axis=1) + +# For slice-level detection (samples multiple tiles, skips background) +shift, confidence = detect_galvo_for_slice( + tiles, # zarr array of tiles + n_extra=40, # Number of extra pixels from galvo return + threshold=0.6, # Minimum confidence threshold + n_samples=5, # Number of tiles to sample + min_intensity=20.0 # Skip tiles with mean intensity below this +) + +# For batch processing: check artifact presence separately +presence_score = detect_galvo_artifact_presence( + aip, + n_pixel_return=40, + detected_shift=shift +) +``` + +**Galvo Shift Detection:** + +The galvo mirror in OCT systems can cause horizontal banding artifacts when the galvo return region is not at the edge of the raw tile data. The detection system has three parts: + +1. 
**`detect_galvo_shift()`** - Finds *where* the galvo return boundary is located
   - Analyzes the average A-line intensity profile
   - Searches for the shift that maximizes boundary discontinuities
   - Returns the shift value (in pixels) and an optional confidence score

2. **`detect_galvo_for_slice(tiles, n_extra, ...)`** - Slice-level detection
   - Samples tiles from the center of the mosaic (more likely to contain tissue)
   - Skips background tiles with low mean intensity
   - Returns the detection with the highest confidence
   - Returns `(0, confidence)` if no artifact is detected above the threshold

3. **`detect_galvo_artifact_presence(aip, n_pixel_return, detected_shift)`** - Presence check for batch processing
   - Scores how strongly the artifact appears at an already-detected shift
   - Lets batch pipelines verify presence separately from shift detection

**Confidence Score Interpretation (0-1):**

| Score | Meaning | Action |
|-------|---------|--------|
| < 0.5 | No clear artifact detected | Skip correction |
| ≥ 0.5 | Galvo artifact likely present | Apply correction |
| > 0.7 | Clear galvo artifact | High confidence |

**Key Algorithm Details:**
- Uses **gradient-based detection** to find intensity discontinuities
- Finds pairs of high gradients separated by exactly `n_pixel_return` pixels
- Checks B-scan subregions for subtle artifacts only visible in parts of the tile
- Validates using peak dominance, boundary gradient ranking, and intensity contrast
- Default threshold: 0.6 (configurable via the `galvo_threshold` parameter)

### resampling.py - Mosaic Grid Resampling

Resample mosaic grid volumes to a target isotropic resolution, processing tile-by-tile to avoid loading the full volume into memory.

```python
from linumpy.preproc.resampling import resample_mosaic_grid

# Resample to 10 µm isotropic
resample_mosaic_grid(
    vol,                        # Dask/Zarr array with chunk structure (Z, nx*h, ny*w)
    source_res,                 # Source resolution (res_z, res_y, res_x)
    target_res_um=10.0,         # Target isotropic resolution in µm
    n_levels=5,                 # Number of output pyramid levels
    out_path="output.ome.zarr"
)
```

---

## PSF Module (`linumpy.psf`)

### psf_estimator.py - PSF Estimation

Estimate and model point spread functions.
+ +```python +from linumpy.psf.psf_estimator import estimate_psf, apply_deconvolution + +# Estimate PSF from data +psf = estimate_psf(image) + +# Apply deconvolution +deconvolved = apply_deconvolution(image, psf) +``` + +--- + +## Stitching Module (`linumpy.stitching`) + +### registration.py - Image Registration + +Register images using various methods. + +```python +from linumpy.stitching.registration import ( + register_2d_images_sitk, + apply_transform, + pairWisePhaseCorrelation +) + +# Get initial translation estimate using phase correlation +deltas = pairWisePhaseCorrelation(fixed_image, moving_image) +initial_translation = (deltas[1], deltas[0]) # (x, y) order for SimpleITK + +# Register 2D images with phase correlation initialization +transform, moving_registered, error = register_2d_images_sitk( + fixed_image, + moving_image, + metric='MSE', # MSE, CC, AntsCC, MI + method='affine', # affine, euler, translation + max_iterations=2500, + grad_mag_tol=1e-6, + moving_mask=None, + fixed_mask=None, + return_3d_transform=True, + initial_translation=initial_translation, # Optional: phase correlation result + initial_step=None # Optional: optimizer step size (auto-reduced with initial_translation) +) + +# Apply transform to volume +transformed = apply_transform(volume, transform) +``` + +**Note:** When `initial_translation` is provided, the optimizer uses a smaller step size (1.0 vs 4.0 pixels) to prevent drifting away from the correct solution. + +### stitch_utils.py - Stitching Utilities + +Helper functions for stitching operations. + +```python +from linumpy.stitching.stitch_utils import ( + compute_overlap, + blend_images +) +``` + +### topology.py - Mosaic Topology + +Manage tile arrangements and topology. 
+ +```python +from linumpy.stitching.topology import ( + build_topology_graph, + find_optimal_path +) +``` + +### motor.py - Motor-Position Tile Placement + +Compute tile pixel positions from motor grid geometry (regular grid with configurable overlap, scale, and rotation). + +```python +from linumpy.stitching.motor import compute_motor_positions + +positions, step_y, step_x = compute_motor_positions( + nx=3, # Number of tiles in X + ny=3, # Number of tiles in Y + tile_shape=(100, 512, 512), + overlap_fraction=0.2, + scale_factor=1.0, # Scale step size (default: no scaling) + rotation_deg=0.0 # Global grid rotation +) +# positions: list of (row_pos, col_pos) pixel positions +``` + +### stacking.py - 3D Slice Stacking Utilities + +Z-overlap detection between consecutive slices using normalized cross-correlation. + +```python +from linumpy.stitching.stacking import find_z_overlap + +best_overlap, best_corr = find_z_overlap( + fixed_vol, # Bottom (fixed) slice volume (Z, Y, X) + moving_vol, # Top (moving) slice volume (Z, Y, X) + slicing_interval_mm=0.05, + search_range_mm=0.1, + resolution_um=3.5 +) +# best_overlap: optimal overlap in Z voxels +# best_corr: correlation score at optimal overlap +``` + +### interpolation.py - Missing Slice Interpolation + +Compute the affine transform halfway between two images for morphing-based slice interpolation. + +```python +from linumpy.stitching.interpolation import compute_half_affine_transform + +half_transform = compute_half_affine_transform(full_transform) +# full_transform: sitk.Transform from image A to image B +# Returns: sitk.AffineTransform representing half the transformation +``` + +### manual_registration.py - Manual Registration + +GUI-based manual registration tools. + +```python +from linumpy.stitching.manual_registration import ManualRegistrationGUI +``` + +### FileUtils.py - File Utilities + +File handling utilities for stitching. 
+ +```python +from linumpy.stitching.FileUtils import ( + list_tiles, + parse_tile_name +) +``` + +--- + +## Utilities Module (`linumpy.utils`) + +### mosaic_grid.py - Mosaic Grid Utilities + +Functions for working with mosaic grids. + +```python +from linumpy.utils.mosaic_grid import ( + getDiffusionBlendingWeights, + compute_grid_shape +) + +# Compute blending weights for slice fusion +weights = getDiffusionBlendingWeights( + mask_fixed, + mask_moving, + factor=2 +) +``` + +### io.py - I/O Utilities + +Command-line argument helpers. + +```python +from linumpy.utils.io import ( + add_overwrite_arg, + add_processes_arg, + assert_output_exists, + get_available_cpus, + parse_processes_arg +) + +# Add standard arguments to parser +parser = argparse.ArgumentParser() +add_overwrite_arg(parser) +add_processes_arg(parser) + +# Parse and validate +args = parser.parse_args() +n_processes = parse_processes_arg(args.n_processes) +assert_output_exists(args.output, parser, args) + +# Get available CPUs (respects LINUMPY_MAX_CPUS and LINUMPY_RESERVED_CPUS env vars) +available = get_available_cpus() +``` + +#### CPU Core Management + +The `get_available_cpus()` function respects environment variables for limiting CPU usage: + +```python +import os +from linumpy.utils.io import get_available_cpus + +# Default: uses all CPUs minus 1 +cpus = get_available_cpus() # e.g., 15 on a 16-core system + +# With LINUMPY_RESERVED_CPUS=4 +os.environ['LINUMPY_RESERVED_CPUS'] = '4' +cpus = get_available_cpus() # e.g., 12 on a 16-core system + +# With LINUMPY_MAX_CPUS=8 (takes precedence) +os.environ['LINUMPY_MAX_CPUS'] = '8' +cpus = get_available_cpus() # 8 (or total if less than 8) +``` + +### data_io.py - Data I/O Helpers + +Data reading/writing utilities. + +```python +from linumpy.utils.data_io import ( + load_image, + save_image +) +``` + +### metrics.py - Pipeline Quality Metrics + +Collect, save, and aggregate quality metrics from pipeline steps. 
+ +```python +from linumpy.utils.metrics import ( + PipelineMetrics, + collect_mask_metrics, + collect_normalization_metrics, + collect_xy_transform_metrics, + collect_pairwise_registration_metrics, + collect_interface_crop_metrics, + collect_psf_compensation_metrics, + collect_stack_metrics, + collect_stitch_3d_metrics, + aggregate_metrics, + compute_summary_statistics +) + +# Manual metrics collection +metrics = PipelineMetrics('my_step', output_dir) +metrics.add_info('input', input_path, 'Input file path') +metrics.add_metric('error', 0.05, unit='pixels', threshold_name='registration_error') +metrics.save() + +# Use step-specific collectors (recommended - simpler) +collect_mask_metrics(mask, input_vol, output_path, input_path, params={'sigma': 5.0}) +collect_normalization_metrics(vol, agarose_mask, otsu_thresh, bg_thresh, output_path) +collect_pairwise_registration_metrics(error, tx, ty, rot, best_z, expected_z, output_path) + +# Aggregate metrics from pipeline run +aggregated = aggregate_metrics('/path/to/pipeline/output') +# Returns: {'step_name': [list of metrics dicts], ...} + +# Compute summary statistics +summary = compute_summary_statistics(aggregated['pairwise_registration']) +# Returns: {'count': N, 'status_counts': {...}, 'metric_name': {'mean': ..., 'std': ...}} +``` + +**Available Threshold Names:** +| Name | Warning | Error | Higher is Better | +|------|---------|-------|------------------| +| `registration_error` | 0.05 | 0.15 | No | +| `translation_magnitude` | 30.0 | 50.0 | No | +| `rotation_degrees` | 1.0 | 2.0 | No | +| `correlation` | 0.7 | 0.5 | Yes | +| `mask_coverage` | 0.05 | 0.01 | Yes | +| `agarose_coverage` | 0.05 | 0.01 | Yes | +| `rms_residual` | 5.0 | 15.0 | No | +| `z_offset_std` | 10.0 | 25.0 | No | +| `z_offset_range` | 15.0 | 30.0 | No | + +### shifts.py - XY Shift Utilities + +Load and process XY shift CSV files for inter-slice alignment. 
+ +```python +from linumpy.utils.shifts import load_shifts_csv, detect_shift_units + +# Load shifts and compute cumulative positions +cumsum, all_ids = load_shifts_csv("shifts_xy.csv") +# cumsum: {slice_id: (cumulative_dx_mm, cumulative_dy_mm)} +# all_ids: sorted list of all slice IDs + +# Detect resolution units (mm vs µm) +res_x_um, res_y_um = detect_shift_units(resolution) +``` + +### orientation.py - Volume Orientation + +Parse 3-letter orientation codes and compute axis permutations/flips for RAS alignment. + +```python +from linumpy.utils.orientation import parse_orientation_code + +# Parse orientation for RAS alignment +axis_permutation, axis_flips = parse_orientation_code('PIR') +# axis_permutation: source indices for each target dimension +# axis_flips: +1 (keep) or -1 (flip) per axis after permutation + +# Example: PIR → (1, 2, 0), (-1, 1, -1) +# Example: SRA → (0, 1, 2), (1, 1, 1) # identity +``` + +**Supported letters:** R/L (Right/Left), A/P (Anterior/Posterior), S/I (Superior/Inferior) + +### image_quality.py - Image Quality Assessment + +CPU-based image quality metrics for slice analysis. + +```python +from linumpy.utils.image_quality import ( + compute_ssim_2d, + compute_ssim_3d, + compute_edge_score, + compute_variance_score, + assess_slice_quality, +) + +# Compare two 3D volumes +ssim = compute_ssim_3d(vol1, vol2) + +# Assess overall quality of a slice relative to its neighbors +quality, metrics = assess_slice_quality( + vol, # The slice to assess + vol_before, # Previous slice + vol_after # Next slice +) +# quality: float in [0, 1] +# metrics: dict with ssim, edge_score, variance_score, etc. +``` + +For GPU-accelerated versions see `linumpy.gpu.image_quality`. + +### visualization.py - Orthogonal View Screenshots + +Save orthogonal (XY, XZ, YZ) views of a 3D volume as a figure. 
+ +```python +from linumpy.utils.visualization import save_orthogonal_views, save_annotated_views + +# Basic orthogonal view +save_orthogonal_views( + image, # 3D volume (Z, X, Y) + "view.png", + z_slice=None, # Default: center + x_slice=None, + y_slice=None, + cmap='magma', + percentile_max=99.9 +) + +# Annotated view with Z-slice index labels +save_annotated_views( + image, + "annotated_view.png", + n_slices=50, # Number of input slices (for label spacing) + slice_ids=None, # Optional explicit slice IDs + font_size=7, + label_every=1, + show_lines=False +) +``` + +--- + +## Core Modules + +### reconstruction.py - Core Reconstruction + +Core reconstruction functions. + +```python +from linumpy.reconstruction import ( + get_tiles_ids, + get_mosaic_info, + getLargestCC +) + +# Get tile IDs from directory +tiles, tile_ids = get_tiles_ids(directory, z=None) +# tiles: list of Path objects +# tile_ids: list of (mx, my, mz) tuples + +# Get mosaic information +mosaic_info = get_mosaic_info( + directory, + z=0, + overlap_fraction=0.2, + use_stage_positions=True +) +# Returns dict with: +# - mosaic_shape, tile_positions, etc. +# - mosaic_xmin_mm, mosaic_ymin_mm +# - tile_resolution + +# Get largest connected component +largest_cc = getLargestCC(binary_mask) +``` + +### segmentation.py - Segmentation + +Image segmentation tools. + +```python +from linumpy.segmentation import ( + segment_tissue, + create_mask +) +``` + +### utils_images.py - Image Utilities + +General image processing utilities. 
+
+```python
+from linumpy.utils_images import (
+    apply_xy_shift,
+    normalize_image
+)
+
+# Apply XY shift to image
+shifted = apply_xy_shift(
+    image,      # Source image
+    reference,  # Reference (determines output shape)
+    dy,         # Y shift in pixels
+    dx          # X shift in pixels
+)
+```
+
+---
+
+## Common Patterns
+
+### Reading and Processing a Mosaic Grid
+
+```python
+import dask.array as da
+import numpy as np
+
+from linumpy.io.zarr import read_omezarr, save_omezarr
+
+# Read
+image, resolution = read_omezarr("input.ome.zarr")
+
+# Process (example: min-max normalization)
+data = image[:]  # Load into memory
+data = (data - np.min(data)) / (np.max(data) - np.min(data))
+
+# Save
+save_omezarr(da.from_array(data), "output.ome.zarr", resolution)
+```
+
+### Parallel Processing with Multiple Tiles
+
+```python
+from linumpy.reconstruction import get_tiles_ids
+from linumpy.microscope.oct import OCT
+from tqdm.contrib.concurrent import process_map
+
+def process_tile(tile_path):
+    tile = OCT(tile_path)
+    # Process the tile here and return whatever your pipeline needs;
+    # returning the path is only a placeholder.
+    return tile_path
+
+tiles, tile_ids = get_tiles_ids(directory)
+results = process_map(process_tile, tiles, max_workers=8)
+```
+
+### Registration Workflow
+
+```python
+from linumpy.io.zarr import read_omezarr
+from linumpy.stitching.registration import register_2d_images_sitk, apply_transform
+
+# Load images
+fixed, _ = read_omezarr("fixed.ome.zarr")
+moving, _ = read_omezarr("moving.ome.zarr")
+
+# Register
+transform, _, error = register_2d_images_sitk(
+    fixed[0],   # 2D slice
+    moving[0],
+    method='affine',
+    metric='MSE'
+)
+
+# Apply to full volume
+registered = apply_transform(moving, transform)
+```
+
+---
+
+## Type Hints
+
+Most functions include type hints for better IDE support:
+
+```python
+def read_omezarr(path: str | Path) -> tuple[da.Array, tuple[float, ...]]:
+    ...
+
+def apply_xy_shift(
+    image: np.ndarray,
+    reference: np.ndarray,
+    dy: float,
+    dx: float
+) -> np.ndarray:
+    ...
+```
+
+---
+
+## Error Handling
+
+Functions typically raise standard Python exceptions:
+
+```python
+try:
+    image, res = read_omezarr("nonexistent.ome.zarr")
+except FileNotFoundError:
+    print("File not found")
+except ValueError as e:
+    print(f"Invalid format: {e}")
+```
+
+---
+
+## Dependencies
+
+Key dependencies used by the library:
+
+| Package | Purpose |
+|---------|---------|
+| `numpy` | Array operations |
+| `dask` | Lazy/parallel arrays |
+| `zarr` | Chunked array storage |
+| `SimpleITK` | Image registration |
+| `scikit-image` | Image processing |
+| `scipy` | Scientific computing |
+| `pandas` | Data manipulation |
+| `tqdm` | Progress bars |
diff --git a/docs/MOSAIC_GRID_FORMAT.md b/docs/MOSAIC_GRID_FORMAT.md
new file mode 100644
index 00000000..8248eb52
--- /dev/null
+++ b/docs/MOSAIC_GRID_FORMAT.md
@@ -0,0 +1,385 @@
+# Mosaic Grid Format (OME-Zarr)
+
+
+---
+
+## Overview
+
+Mosaic grids in linumpy are stored in the **OME-Zarr** format, a cloud-optimized, chunked array format designed for large microscopy datasets. The format supports multi-resolution pyramids, metadata, and efficient partial reads.
+
+---
+
+## File Structure
+
+### Directory Layout
+
+```
+mosaic_grid_3d_z00.ome.zarr/
+├── .zattrs            # Root attributes (OME metadata)
+├── .zgroup            # Zarr group marker
+├── 0/                 # Full resolution level
+│   ├── .zarray        # Array metadata
+│   └── <chunks>       # Data chunk files (named by index, e.g. 0.0.0)
+├── 1/                 # 2x downsampled (optional)
+│   ├── .zarray
+│   └── <chunks>
+├── 2/                 # 4x downsampled (optional)
+│   └── ...
+└── ...
+```
+
+### Resolution Levels
+
+Two pyramid creation modes are available:
+
+#### Traditional Power-of-2 Pyramids (Mosaic Grids)
+
+Used during preprocessing for intermediate mosaic grids:
+- **Level 0**: Full resolution
+- **Level 1**: 2x downsampled
+- **Level 2**: 4x downsampled
+- etc.
+
+The number of levels is controlled by the `--n_levels` parameter during creation.
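The level/resolution relationship of a power-of-2 pyramid can be sketched with a small helper (illustrative only; this is not a linumpy function):

```python
def pyramid_levels(shape, res_um, n_levels):
    """Lateral shape and voxel size (µm) for each power-of-2 pyramid level."""
    levels = []
    for lvl in range(n_levels):
        factor = 2 ** lvl  # level 0 is full resolution
        levels.append((tuple(s // factor for s in shape), res_um * factor))
    return levels

# Each level halves the shape and doubles the voxel size.
print(pyramid_levels((512, 512), 10.0, 3))
# → [((512, 512), 10.0), ((256, 256), 20.0), ((128, 128), 40.0)]
```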
+ +#### Analysis-Optimized Pyramids (Final 3D Volume) + +For the final stacked 3D volume, specific analysis-friendly resolutions are used: + +| Level | Resolution | Use Case | +|-------|------------|----------| +| 0 | 10 µm | High-resolution cellular analysis | +| 1 | 25 µm | Standard analysis resolution | +| 2 | 50 µm | Overview, atlas registration | +| 3 | 100 µm | Quick visualization, large-scale | + +This is controlled by the `--pyramid_resolutions` parameter in `linum_stack_slices_3d.py` or the `pyramid_resolutions` parameter in the Nextflow workflow. + +**Note:** Only resolutions ≥ the base processing resolution are included. For example, processing at 25 µm will create levels at 25, 50, and 100 µm. + +--- + +## Array Dimensions + +### 3D Mosaic Grid + +``` +Shape: (Z, X, Y) +``` + +| Dimension | Description | +|-----------|-------------| +| Z | Depth/axial dimension | +| X | First lateral dimension | +| Y | Second lateral dimension | + +### 2D Mosaic Grid (AIP) + +``` +Shape: (X, Y) +``` + +--- + +## Metadata + +### OME Metadata (`.zattrs`) + +```json +{ + "multiscales": [{ + "version": "0.4", + "name": "mosaic_grid_3d_z00", + "axes": [ + {"name": "z", "type": "space", "unit": "micrometer"}, + {"name": "y", "type": "space", "unit": "micrometer"}, + {"name": "x", "type": "space", "unit": "micrometer"} + ], + "datasets": [ + { + "path": "0", + "coordinateTransformations": [{ + "type": "scale", + "scale": [1.5, 10.0, 10.0] + }] + }, + { + "path": "1", + "coordinateTransformations": [{ + "type": "scale", + "scale": [1.5, 20.0, 20.0] + }] + } + ] + }] +} +``` + +### Key Metadata Fields + +| Field | Description | +|-------|-------------| +| `axes` | Dimension names, types, and units | +| `datasets` | Resolution levels with paths and transforms | +| `coordinateTransformations` | Pixel-to-physical coordinate mapping | +| `scale` | Resolution per dimension (µm/pixel) | + +--- + +## Reading OME-Zarr Files + +### Using linumpy + +```python +from linumpy.io.zarr import 
read_omezarr + +# Read image and resolution +image, resolution = read_omezarr("mosaic_grid_3d_z00.ome.zarr") + +# image: dask.array.Array with shape (Z, X, Y) +# resolution: tuple (res_z, res_x, res_y) in mm/pixel + +print(f"Shape: {image.shape}") +print(f"Resolution: {resolution} mm/pixel") +print(f"Chunks: {image.chunks}") +``` + +### Using zarr directly + +```python +import zarr + +# Open store +store = zarr.open("mosaic_grid_3d_z00.ome.zarr", mode='r') + +# Read full resolution +data = store['0'][:] + +# Read downsampled level +data_2x = store['1'][:] + +# Access metadata +import json +with open("mosaic_grid_3d_z00.ome.zarr/.zattrs") as f: + metadata = json.load(f) +``` + +### Using napari + +```bash +napari mosaic_grid_3d_z00.ome.zarr +``` + +--- + +## Writing OME-Zarr Files + +### Using linumpy + +```python +from linumpy.io.zarr import save_omezarr +import dask.array as da +import numpy as np + +# Create sample data +data = da.from_array(np.random.rand(100, 512, 512), chunks=(10, 128, 128)) +resolution = (0.0015, 0.010, 0.010) # mm/pixel + +# Save +save_omezarr(data, "output.ome.zarr", resolution, chunks=data.chunks) +``` + +### With Multiple Resolution Levels + +```python +from linumpy.io.zarr import OmeZarrWriter, AnalysisOmeZarrWriter + +# Traditional power-of-2 pyramid (2x downsampling per level) +writer = OmeZarrWriter( + "output.ome.zarr", + shape=(100, 512, 512), + chunks=(10, 128, 128), + dtype=np.float32 +) +writer[:] = data[:] +writer.finalize(resolution, n_levels=5) # Creates levels 0-5 + +# Analysis-optimized pyramid (specific resolutions in µm) +writer = AnalysisOmeZarrWriter( + "output.ome.zarr", + shape=(100, 512, 512), + chunks=(10, 128, 128), + dtype=np.float32 +) +writer[:] = data[:] +writer.finalize(resolution, [10, 25, 50, 100]) # Creates 10, 25, 50, 100 µm levels +# or simply: writer.finalize(resolution) # uses default [10, 25, 50, 100] +``` + +--- + +## Chunking and Sharding + +### Chunk Size + +Chunks determine how data is stored and 
accessed. Optimal chunk sizes balance: +- Read/write performance +- Compression efficiency +- Memory usage + +Typical values: +``` +3D: (16, 128, 128) to (64, 256, 256) +2D: (256, 256) to (1024, 1024) +``` + +### Sharding (Zarr v3) + +Sharding groups multiple chunks into larger files, reducing filesystem overhead: + +``` +sharding_factor = 4 # 4x4 chunks per shard +``` + +Controlled by `--sharding_factor` parameter in `linum_create_mosaic_grid_3d.py`. + +--- + +## Data Types + +### Common Types + +| Type | Description | Typical Use | +|------|-------------|-------------| +| `uint8` | 0-255 | Display, compressed | +| `uint16` | 0-65535 | Raw microscopy data | +| `float32` | 32-bit float | Processed data | +| `float64` | 64-bit float | High-precision processing | + +### Type Conversion + +```python +# During reading +image = image.astype(np.float32) + +# During writing (specify dtype) +save_omezarr(data.astype(np.uint16), "output.ome.zarr", resolution, ...) +``` + +--- + +## Naming Convention + +### Mosaic Grid Files + +``` +mosaic_grid_3d_z{slice_id}.ome.zarr +``` + +- `mosaic_grid_3d`: Indicates 3D mosaic grid +- `z{slice_id}`: Two-digit slice identifier (e.g., `z00`, `z01`, `z15`) +- `.ome.zarr`: OME-Zarr format extension + +### Examples + +``` +mosaic_grid_3d_z00.ome.zarr # First slice +mosaic_grid_3d_z01.ome.zarr # Second slice +mosaic_grid_3d_z15.ome.zarr # 16th slice +``` + +### Processed Files + +``` +slice_z{slice_id}_{process}.ome.zarr +``` + +Examples: +``` +slice_z00_stitch_3d.ome.zarr +slice_z00_axial_corr.ome.zarr +slice_z00_normalize.ome.zarr +``` + +--- + +## Viewing OME-Zarr Files + +### Napari (Recommended) + +```bash +pip install napari[all] +napari mosaic_grid_3d_z00.ome.zarr +``` + +### Neuroglancer + +OME-Zarr files can be viewed in Neuroglancer if hosted on a web server. 
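Neuroglancer fetches zarr chunks over HTTP and requires CORS headers, which Python's built-in `http.server` does not send. A minimal sketch of a CORS-enabled static server (any CORS-capable web server works just as well):

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

class CORSHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds the CORS header Neuroglancer needs."""

    def end_headers(self):
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()

# Run from the directory containing the .ome.zarr store, then point
# Neuroglancer at http://localhost:8000/mosaic_grid_3d_z00.ome.zarr:
#     ThreadingHTTPServer(("", 8000), CORSHandler).serve_forever()
```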
+ +### Using linumpy Scripts + +```bash +# View OME-Zarr +linum_view_omezarr.py mosaic_grid_3d_z00.ome.zarr + +# Take screenshot +linum_screenshot_omezarr.py mosaic_grid_3d_z00.ome.zarr screenshot.png +``` + +--- + +## Compression + +### Supported Codecs + +| Codec | Description | Use Case | +|-------|-------------|----------| +| `blosc` | Fast, general purpose | Default | +| `zstd` | High compression ratio | Archival | +| `lz4` | Very fast | Real-time | +| `gzip` | Widely compatible | Exchange | + +### Default Configuration + +linumpy uses `blosc` with `lz4` compressor by default. + +--- + +## Troubleshooting + +### File Won't Open + +```bash +# Check file structure +ls -la mosaic_grid_3d_z00.ome.zarr/ + +# Verify .zattrs exists +cat mosaic_grid_3d_z00.ome.zarr/.zattrs +``` + +### Memory Issues + +```python +# Use dask for lazy loading +image, res = read_omezarr("large_file.ome.zarr") +# image is a dask array - not loaded into memory + +# Load only a subset +subset = image[0:10, 100:200, 100:200].compute() +``` + +### Wrong Dimensions + +Check the axes metadata in `.zattrs` to verify dimension order. + +--- + +## Related Tools + +| Tool | Purpose | +|------|---------| +| `linum_create_mosaic_grid_3d.py` | Create mosaic from raw tiles | +| `linum_resample_mosaic_grid.py` | Resample to different resolution | +| `linum_convert_omezarr_to_nifti.py` | Convert to NIfTI format | +| `linum_view_omezarr.py` | Interactive viewer | +| `linum_screenshot_omezarr.py` | Generate preview images | diff --git a/docs/NEXTFLOW_WORKFLOWS.md b/docs/NEXTFLOW_WORKFLOWS.md new file mode 100644 index 00000000..ad952f04 --- /dev/null +++ b/docs/NEXTFLOW_WORKFLOWS.md @@ -0,0 +1,924 @@ +# Nextflow Workflows Guide + + +--- + +## Overview + +linumpy uses [Nextflow](https://www.nextflow.io/) for orchestrating complex processing pipelines. 
Nextflow provides: + +- **Parallelization**: Automatic parallel execution of independent tasks +- **Portability**: Run on local machines, clusters, or cloud +- **Reproducibility**: Containerized execution with Apptainer/Singularity +- **Fault tolerance**: Automatic retry and error handling + +--- + +## Available Workflows + +| Workflow | Location | Purpose | +|----------|----------|---------| +| `preproc_rawtiles.nf` | `workflows/preproc/` | Raw tiles → Mosaic grids | +| `soct_3d_reconst.nf` | `workflows/reconst_3d/` | Mosaic grids → 3D volume | + +--- + +## Prerequisites + +### Nextflow Installation + +```bash +# Install Nextflow +curl -s https://get.nextflow.io | bash + +# Or via conda +conda install -c bioconda nextflow + +# Verify installation +nextflow -version +``` + +**Required version**: >= 23.10 + +### Apptainer/Singularity (Optional) + +For containerized execution: + +```bash +# Install Apptainer +sudo apt install apptainer + +# Or Singularity +sudo apt install singularity +``` + +--- + +## Preprocessing Workflow + +### Location + +``` +workflows/preproc/ +├── preproc_rawtiles.nf # Workflow definition +└── nextflow.config # Default configuration +``` + +### Purpose + +Converts raw OCT tiles into organized mosaic grids and extracts metadata. 
+ +### Running + +```bash +cd workflows/preproc + +# Basic usage +nextflow run preproc_rawtiles.nf \ + --input /path/to/raw/tiles \ + --output /path/to/output + +# With options +nextflow run preproc_rawtiles.nf \ + --input /path/to/raw/tiles \ + --output /path/to/output \ + --processes 8 \ + --resolution 10 \ + --axial_resolution 1.36 +``` + +### Parameters + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `input` | (required) | Raw tiles directory | +| `output` | `"output"` | Output directory | +| `use_old_folder_structure` | `false` | Use flat folder structure | +| `use_gpu` | `true` | Enable GPU acceleration (auto-fallback to CPU) | +| `processes` | `1` | Parallel Python processes per task (CPU mode only) | +| `max_mosaic_forks` | `4` | Max concurrent `create_mosaic_grid` GPU jobs | +| `max_aip_forks` | `4` | Max concurrent `generate_aip` GPU jobs | +| `max_quality_forks` | `2` | Max concurrent `assess_slice_quality` GPU jobs | +| `axial_resolution` | `1.36` | Axial resolution (µm) | +| `resolution` | `-1` | Output resolution (-1 = full native resolution) | +| `sharding_factor` | `4` | Zarr sharding (NxN chunks/shard) | +| `fix_galvo_shift` | `true` | Correct galvo shifts | +| `fix_camera_shift` | `false` | Correct camera shifts | +| `galvo_confidence_threshold` | `0.6` | Minimum confidence to apply galvo fix | +| `generate_slice_config` | `true` | Generate slice_config.csv | +| `exclude_first_slices` | `1` | Number of leading slices to mark as excluded | +| `detect_galvo` | `false` | Include galvo detection results in slice_config.csv | +| `generate_previews` | `false` | Generate orthogonal view previews of mosaic grids | +| `generate_aips` | `false` | Generate AIP images from mosaic grids for QC | +| `assess_quality` | `false` | Run quality assessment and update slice_config | +| `min_quality_score` | `0.2` | Minimum quality score to include slice (0 = report only) | +| `quality_sample_depth` | `10` | Z-planes sampled per slice 
during quality assessment | + +### Outputs + +``` +output/ +├── mosaic_grid_3d_z00.ome.zarr/ +├── mosaic_grid_3d_z01.ome.zarr/ +├── ... +├── shifts_xy.csv +├── slice_config.csv +├── aips/ # Only when generate_aips = true +│ ├── aip_z00.png +│ └── ... +└── previews/ # Only when generate_previews = true + ├── mosaic_grid_z00_preview.png + └── ... +``` + +--- + +## 3D Reconstruction Workflow + +### Location + +``` +workflows/reconst_3d/ +├── soct_3d_reconst.nf # Workflow definition +└── nextflow.config # Default configuration +``` + +### Purpose + +Processes mosaic grids through multiple correction and stitching steps to produce a final 3D volume. + +### Running + +```bash +cd workflows/reconst_3d + +# Basic usage +nextflow run soct_3d_reconst.nf \ + --input /path/to/mosaic/grids \ + --shifts_xy /path/to/shifts_xy.csv \ + --output /path/to/output + +# With slice config +nextflow run soct_3d_reconst.nf \ + --input /path/to/mosaic/grids \ + --shifts_xy /path/to/shifts_xy.csv \ + --slice_config /path/to/slice_config.csv \ + --output /path/to/output + +# Full options +nextflow run soct_3d_reconst.nf \ + --input /path/to/mosaic/grids \ + --shifts_xy /path/to/shifts_xy.csv \ + --output /path/to/output \ + --resolution 10 \ + --processes 4 \ + --fix_curvature_enabled true \ + --fix_illum_enabled true \ + --create_registration_masks true +``` + +### Parameters + +#### Input/Output + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `input` | `"."` | Mosaic grids directory | +| `shifts_xy` | `""` | XY shifts file (default: `{input}/shifts_xy.csv`) | +| `slice_config` | `""` | Optional slice config | +| `output` | `"."` | Output directory | +| `subject_name` | `""` | Subject identifier (auto-extracted from path) | +| `processes` | `8` | Parallel Python processes per task | + +#### Compute Resources + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `use_gpu` | `true` | Enable GPU acceleration (auto-fallback to CPU) 
| +| `enable_cpu_limits` | `true` | Enable CPU limiting | +| `max_cpus` | `16` | Maximum CPUs to use (0 = no limit) | +| `reserved_cpus` | `4` | CPUs reserved for system overhead | + +#### Resolution & Basic Settings + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `resolution` | `10` | Target resolution (µm/pixel) | +| `clip_percentile_upper` | `99.9` | Upper percentile for intensity clipping | +| `fix_curvature_enabled` | `false` | Detect and compensate focal curvature artifacts | +| `fix_illum_enabled` | `true` | Fix illumination inhomogeneity (BaSiCPy algorithm) | +| `crop_interface_out_depth` | `600` | Maximum tissue depth after interface crop (µm) | +| `normalize_min_contrast` | `0.1` | Min contrast fraction to prevent over-amplification of empty slices (0–1) | + +#### Tile Stitching + +These parameters control how tiles within each slice are assembled in XY. + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `use_motor_positions_for_stitching` | `true` | Use motor encoder positions for tile layout (recommended) | +| `stitch_overlap_fraction` | `0.2` | Expected tile overlap fraction — should match acquisition settings | +| `stitch_blending_method` | `'diffusion'` | Tile blending: `'none'`, `'average'`, `'diffusion'` | +| `max_blend_refinement_px` | `10` | Maximum sub-pixel refinement shift for blending (pixels) | + +#### Common Space Alignment + +Aligns each slice into a shared XY canvas using `shifts_xy.csv` motor positions, with optional outlier and step filtering. 
+ +| Parameter | Default | Description | +|-----------|---------|-------------| +| `filter_shift_outliers` | `true` | Enable outlier filtering (strongly recommended) | +| `outlier_method` | `'iqr'` | Detection method: `iqr`, `local`, `median`, `clamp`, or `zero` | +| `outlier_iqr_multiplier` | `1.5` | IQR multiplier for outlier detection | +| `max_shift_mm` | `0.5` | Maximum allowed shift (mm) — floor on IQR threshold | +| `common_space_max_step_mm` | `0.5` | Maximum per-step shift change (mm, 0 = disabled) | +| `common_space_step_window` | `3` | Window size for step outlier detection | +| `common_space_step_method` | `'local_median'` | Step correction method: `local_median`, `clamp`, `local_mad` | +| `common_space_step_mad_threshold` | `3.0` | MADs above local median to flag a step outlier (only with `local_mad`) | +| `common_space_excluded_slice_mode` | `'local_median'` | Shift interpolation for excluded slices | +| `common_space_excluded_slice_window` | `2` | Window for excluded slice interpolation | + +**Outlier Methods:** +- `iqr` (recommended): Auto-detect outliers using IQR statistics, replace with local median +- `local`: Replace outliers with local median of neighboring shifts +- `median`: Replace outliers with global median +- `clamp`: Limit magnitude to `max_shift_mm` while preserving direction +- `zero`: Replace outliers with zero shift + +#### Pairwise Registration + +Computes small corrections (rotation, sub-pixel translation) between consecutive slices. The main XY alignment comes from motor positions; these transforms are refinements applied on top. 
+ +| Parameter | Default | Description | +|-----------|---------|-------------| +| `registration_transform` | `'euler'` | Transform type: `euler` (XY + rotation) or `translation` (XY only) | +| `registration_max_translation` | `200.0` | Optimizer bound on translation (pixels) | +| `registration_max_rotation` | `5.0` | Optimizer bound on rotation (degrees) | +| `moving_slice_first_index` | `4` | Starting Z-index in the moving volume | +| `registration_slicing_interval_mm` | `0.200` | Physical slice thickness (mm) | +| `registration_allowed_drifting_mm` | `0.100` | Z-search range (mm) | + +**Registration Masks:** + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `create_registration_masks` | `true` | Create tissue masks to focus registration on tissue | +| `mask_smoothing_sigma` | `5.0` | Gaussian smoothing sigma for mask creation (µm) | +| `selem_radius` | `1` | Morphological structuring element radius (pixels) | +| `min_size` | `100` | Minimum mask component size (pixels²) | +| `mask_normalize` | `true` | Normalize intensities before masking | +| `mask_fill_holes` | `'slicewise'` | Hole filling: `none`, `3d`, `slicewise` | + +#### Stacking & Output + +Choose a stacking method with `stacking_method`: +- **`motor`** (recommended): XY from motor positions (`shifts_xy.csv`), Z from correlation +- **`registration`**: XY and Z both from pairwise registration + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `stacking_method` | `'motor'` | Stacking method: `motor` or `registration` | +| `stack_blend_enabled` | `true` | Blend overlapping regions between slices | + +**Motor stacking (stacking_method = 'motor'):** + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `use_expected_z_overlap` | `true` | Use expected Z-overlap instead of correlation-based matching | +| `apply_pairwise_transforms` | `true` | Apply pairwise registration transforms during stacking | +| 
`apply_rotation_only` | `true` | Apply only the rotation component from registration (keeps XY from motor) | +| `max_rotation_deg` | `5.0` | Clamp rotation values larger than this before application | +| `skip_error_transforms` | `true` | Skip transforms flagged as `error` status (prevents artifacts near interpolated slices) | +| `skip_warning_transforms` | `false` | Skip transforms flagged as `warning` status | +| `stack_accumulate_translations` | `false` | Accumulate pairwise translations as cumulative canvas offsets | +| `stack_max_pairwise_translation` | `0` | Max pairwise translation (pixels) included in accumulation (0 = include all) | +| `stitch_rehoming_enabled` | `false` | Apply one-time segment offset at re-homing event boundaries | +| `stitch_rehoming_threshold_mm` | `0.7` | Motor shift magnitude that identifies a re-homing event (mm) | +| `stitch_rehoming_use_motor` | `false` | Use motor delta instead of pairwise registration for re-homing corrections | +| `stack_smooth_window` | `5` | Moving-average window (slices) for smoothing per-slice rotations (0 = disabled) | + +**Registration stacking (stacking_method = 'registration'):** + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `stack_max_overlap` | `10` | Maximum blending overlap in voxels (-1 = unlimited) | +| `stack_no_accumulate_transforms` | `true` | Apply each transform independently (recommended when slices are already XY-aligned) | + +**Output pyramid:** + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `pyramid_resolutions` | `[10, 25, 50, 100]` | Target resolutions (µm) for output pyramid levels | +| `pyramid_n_levels` | `null` | Fixed level count (overrides `pyramid_resolutions`) | +| `pyramid_make_isotropic` | `true` | Resample to isotropic voxel spacing | + +The `pyramid_resolutions` parameter controls the multi-resolution pyramid in the final 3D volume. 
Instead of power-of-2 downsampling, specific analysis-friendly resolutions are used: + +- **10 µm**: High-resolution analysis +- **25 µm**: Standard analysis resolution +- **50 µm**: Overview and atlas registration +- **100 µm**: Quick visualization and large-scale analysis + +**Note:** Only resolutions ≥ the base `resolution` parameter will be included. For example, if `resolution = 25`, then only 25, 50, and 100 µm levels will be created. + +#### Z-Intensity Normalization + +Corrects slow intensity drift across serial sections after stacking. Disabled by default. + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `normalize_z_slices` | `false` | Enable post-stacking Z-intensity normalization | +| `znorm_mode` | `'histogram'` | Normalization mode: `histogram` (preserves contrast) or `percentile` (linear scaling) | +| `znorm_strength` | `0.5` | Correction mixing strength (0 = passthrough, 1 = full correction) | + +**Histogram mode** (`znorm_mode = 'histogram'`): + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `znorm_tissue_threshold` | `0.02` | Minimum intensity to classify as tissue (below this left unchanged) | + +**Percentile mode** (`znorm_mode = 'percentile'`): + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `znorm_smooth_sigma` | `10.0` | Gaussian smoothing sigma (sections); ~10 corrects ~2mm drift and preserves anatomy | +| `znorm_percentile` | `80.0` | Percentile of non-zero tissue voxels used as intensity reference | +| `znorm_max_scale` | `2.0` | Maximum correction scale factor | +| `znorm_min_scale` | `0.5` | Minimum correction scale factor | + +#### Atlas Registration (RAS Alignment) + +Register the final reconstructed volume to the Allen Mouse Brain Atlas (CCF) to produce an RAS-aligned OME-Zarr output. Atlas data is downloaded automatically. 
+ +| Parameter | Default | Description | +|-----------|---------|-------------| +| `align_to_ras_enabled` | `false` | Enable Allen atlas registration | +| `allen_resolution` | `25` | Atlas resolution for registration (µm): 10, 25, 50, 100 | +| `allen_metric` | `'MI'` | Registration metric: `MI`, `MSE`, `CC`, `AntsCC` | +| `allen_max_iterations` | `1000` | Maximum registration iterations | +| `allen_registration_level` | `2` | Pyramid level of input zarr for registration (0 = full; level 2 ≈ 50 µm, fast) | +| `ras_input_orientation` | `''` | 3-letter orientation code of the INPUT brain volume (see table below) | +| `ras_initial_rotation` | `''` | Initial rotation hint `'Rx Ry Rz'` in degrees (leave empty for automatic MOMENTS initialization) | +| `allen_preview` | `true` | Generate a 3-panel alignment comparison image | + +**Output:** `align_to_ras/{subject}_ras.ome.zarr` with all pyramid resolutions. + +--- + +#### RAS Orientation Lookup Table + +The `ras_input_orientation` parameter is a 3-letter code describing the anatomical direction each axis of the **input ZYX zarr** points toward. The code is interpreted as: + +``` +letter 1 → dim0 (zarr Z) = slice stacking direction (perpendicular to cutting plane) +letter 2 → dim1 (zarr Y) = in-plane row direction +letter 3 → dim2 (zarr X) = in-plane column direction +``` + +Each letter is one of: `R`/`L` (right/left), `A`/`P` (anterior/posterior), `S`/`I` (superior/inferior). + +The script `linum_align_to_ras.py` uses the code to permute and flip axes before registration, bringing the volume into approximate RAS space. The `ras_initial_rotation` then seeds the registration optimizer with a coarse rotation, which is essential for oblique cuts. + +**Standard setup assumption** used in the table below: + +> Brain mounted with dorsal side up. OCT motor rows (zarr Y) scan dorsal→ventral (I). OCT motor columns (zarr X) scan left→right for coronal/axial (R), or posterior→anterior for sagittal (A). 
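The permutation/flip semantics above can be condensed into a few lines. This is an illustrative reimplementation matching the documented examples (`PIR`, and `SRA` as the identity for the ZYX zarr), not the actual `linumpy.utils.orientation` source:

```python
# Each letter names the anatomical direction a zarr axis points toward.
OPPOSITE = {"R": "L", "L": "R", "A": "P", "P": "A", "S": "I", "I": "S"}

def parse_code(code):
    """Return (axis_permutation, axis_flips) mapping `code` onto S, R, A."""
    perm, flips = [], []
    for target in "SRA":  # target order per zarr dim (SRA is the identity code)
        for i, letter in enumerate(code):
            if letter == target:          # already points the right way
                perm.append(i); flips.append(1); break
            if letter == OPPOSITE[target]:  # points the opposite way: flip
                perm.append(i); flips.append(-1); break
    return tuple(perm), tuple(flips)

print(parse_code("PIR"))  # → ((1, 2, 0), (-1, 1, -1))
print(parse_code("SRA"))  # → ((0, 1, 2), (1, 1, 1))
```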
+ + + +##### Cardinal (in-plane) cutting orientations + +| Cutting plane | Stack direction | Row dir (Y) | Col dir (X) | `ras_input_orientation` | +|---|---|---|---|---| +| Coronal — anterior→posterior | A→P | Dorsal→Ventral (I) | Left→Right (R) | `PIR` | +| Coronal — posterior→anterior | P→A | Dorsal→Ventral (I) | Left→Right (R) | `AIR` | +| Sagittal — left→right | L→R | Dorsal→Ventral (I) | Posterior→Anterior (A) | `RIA` | +| Sagittal — right→left | R→L | Dorsal→Ventral (I) | Posterior→Anterior (A) | `LIA` | +| Axial/Horizontal — dorsal→ventral | D→V | Anterior→Posterior (P) | Left→Right (R) | `IPR` | +| Axial/Horizontal — ventral→dorsal | V→D | Anterior→Posterior (P) | Left→Right (R) | `SPR` | + +> **Important:** The in-plane letters (2nd and 3rd) depend on the physical stage motor orientation and brain mounting. If the output looks mirrored or rotated 90°, swap or negate the in-plane letters. Run `linum_align_to_ras.py --preview-only` to inspect the raw volume orientation before registering. + +##### 45° oblique cutting orientations + +For cuts between two cardinal planes, use the closest cardinal code plus `ras_initial_rotation` to seed the registration with the approximate tilt angle. The sign depends on which specific diagonal direction the cut follows — verify with `--preview` after registration. + +| Cutting plane | Between planes | `ras_input_orientation` | `ras_initial_rotation`¹ | Rotation axis | +|---|---|---|---|---| +| Corono-sagittal 45° | Coronal ↔ Sagittal | `PIR` | `'0 0 ±45'` | Around RAS Superior-Inferior (Rz) | +| Corono-axial 45° | Coronal ↔ Axial | `PIR` | `'±45 0 0'` | Around RAS Right-Left (Rx) | +| Sagitto-axial 45° | Sagittal ↔ Axial | `RIA` | `'0 ±45 0'` | Around RAS Anterior-Posterior (Ry) | + +¹ Sign (+ or −) depends on the specific oblique direction. Start with +45 and inspect the preview; negate if the alignment is worse. 
+ +**Rotation axis guide** (applied in the approximately-RAS frame after orientation correction): +- `Rx` — tilts the A-P axis toward/away from S-I (e.g., pitch) +- `Ry` — tilts the R-L axis toward/away from S-I (e.g., roll) +- `Rz` — rotates in the axial plane, mixing R-L and A-P (e.g., yaw) + +**Example config (coronal A→P, standard setup):** +```groovy +align_to_ras_enabled = true +ras_input_orientation = 'PIR' +ras_initial_rotation = '' // automatic MOMENTS initialization +allen_resolution = 25 +allen_registration_level = 2 // ~50 µm pyramid level for speed +``` + +**Example config (corono-sagittal 45° oblique cut):** +```groovy +align_to_ras_enabled = true +ras_input_orientation = 'PIR' +ras_initial_rotation = '0 0 45' // adjust sign after checking preview +allen_resolution = 25 +allen_registration_level = 2 +``` + +--- + +#### Previews & Reports + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `stitch_preview` | `true` | Generate stitched slice preview images | +| `common_space_preview` | `false` | Generate common space alignment previews | +| `interpolation_preview` | `false` | Generate interpolated slice previews | +| `generate_report` | `true` | Generate HTML quality report after stacking | +| `report_verbose` | `false` | Include detailed per-slice metrics in report | +| `annotated_label_every` | `1` | Label every Nth slice in annotated preview (1 = all slices) | +| `annotated_show_lines` | `false` | Draw slice boundary lines on annotated preview | + +#### Debugging + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `analyze_shifts` | `false` | Generate shifts analysis report and drift plots | +| `mask_preview` | `false` | Save mask preview images alongside mask zarrs | +| `debug_slices` | `""` | Comma-separated slice IDs or ranges to process (e.g. 
`"25,26"` or `"25-29"`); leave empty to process all | + +The `analyze_shifts` option runs drift analysis on the shifts file before processing, producing: +- A text report with statistics and outlier detection +- A PNG plot showing drift patterns +- A filtered shifts CSV file + +#### Diagnostic Mode + +Diagnostic mode enables additional analysis processes for troubleshooting reconstruction artifacts (edge mismatches, overhangs, alignment issues). + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `diagnostic_mode` | `false` | Master switch: enables all diagnostic analyses | +| `analyze_rotation_drift` | `false` | Analyze cumulative rotation between slices | +| `analyze_acquisition_rotation` | `false` | Analyze acquisition-time rotation from shifts + registration | +| `analyze_tile_dilation` | `false` | Analyze tile position refinements for scale drift (works best with `max_blend_refinement_px = 0`) | +| `motor_only_stitch` | `false` | Stitch slices using motor positions only (no image registration) | +| `motor_only_stack` | `false` | Stack slices using motor positions only (no pairwise registration) | +| `compare_stitching` | `false` | Compare motor-only vs refined stitching side-by-side | + +Diagnostic parameters: + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `motor_only_overlap` | `0.2` | Expected tile overlap for motor-only diagnostics (matches `stitch_overlap_fraction`) | +| `motor_only_stitch_blending` | `'diffusion'` | Blending for motor-only stitching: `none`, `average`, `diffusion` | +| `motor_only_stack_blending` | `'none'` | Blending for motor-only stacking: `none`, `average`, `max`, `feather` | +| `diagnostic_rotation_threshold` | `2.0` | Rotation warning threshold (degrees) | +| `save_refinement_data` | `false` | Save refined stitching transform data as JSON | +| `comparison_tile_step` | `60` | Tile step for seam detection in stitching comparison | + +Diagnostic outputs are written to 
`{output}/diagnostics/` and include rotation plots, dilation reports, motor-only stitch results, and side-by-side comparisons. + +#### GPU Acceleration + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `use_gpu` | `true` | Enable GPU acceleration (auto-fallback to CPU) | + +### Outputs + +``` +output/ +├── README/readme.txt +├── resample_mosaic_grid/ +├── fix_focal_curvature/ +├── fix_illumination/ +├── generate_aip/ +├── estimate_xy_transformation/ +├── stitch_3d/ +├── beam_profile_correction/ +├── crop_interface/ +├── normalize/ +├── bring_to_common_space/ +├── create_registration_masks/ +├── register_pairwise/ +├── stack/ +│ ├── 3d_volume.ome.zarr +│ ├── 3d_volume.ome.zarr.zip +│ └── 3d_volume.png +├── normalize_z_intensity/ # Only when normalize_z_slices = true +│ └── 3d_volume_znorm.ome.zarr +├── align_to_ras/ # Only when align_to_ras_enabled = true +│ ├── {subject}_ras.ome.zarr # RAS-aligned volume (all pyramid levels) +│ ├── {subject}_ras_transform.tfm # Registration transform (SimpleITK) +│ └── {subject}_ras_preview.png # 3-panel alignment comparison +├── diagnostics/ # Only when diagnostic_mode = true or individual flags set +│ ├── rotation_analysis/ +│ ├── acquisition_rotation/ +│ ├── dilation_analysis/ +│ ├── aggregated_dilation/ +│ ├── motor_only_stitch/ +│ ├── motor_only_stack/ +│ └── stitch_comparison/ +└── {subject}_quality_report.html +``` + +--- + +## GPU Acceleration + +Both workflows support GPU acceleration using NVIDIA CUDA via CuPy. GPU processing is enabled by default and automatically falls back to CPU if no GPU is available. 
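The detect-then-fall-back behaviour described above can be sketched in a few lines of Python. This is an illustrative sketch, not linumpy's actual API: `mean_projection` and the backend-selection names are hypothetical, and the only assumption is that CuPy mirrors the NumPy interface when a CUDA device is present.

```python
import numpy as np

try:
    import cupy as cp  # GPU backend via CUDA
    cp.cuda.runtime.getDeviceCount()  # raises if no usable GPU is present
    xp, HAS_GPU = cp, True
except Exception:  # CuPy missing or no CUDA device -> transparent CPU fallback
    xp, HAS_GPU = np, False

def mean_projection(volume):
    """Average intensity projection along Z on the selected backend."""
    aip = xp.asarray(volume).mean(axis=0)
    # Hand back a NumPy array either way, so callers stay backend-agnostic.
    return cp.asnumpy(aip) if HAS_GPU else aip

vol = np.arange(24, dtype=np.float32).reshape(2, 3, 4)  # tiny (Z, Y, X) stack
aip = mean_projection(vol)  # shape (3, 4)
```

Because the fallback happens at import time, the same script runs unchanged on GPU and CPU nodes, which is what lets `use_gpu = true` remain the default.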
+ +### GPU-Accelerated Processes + +| Workflow | Process | GPU Operations | +|----------|---------|----------------| +| `preproc_rawtiles.nf` | `create_mosaic_grid` | Galvo detection, volume resize | +| `preproc_rawtiles.nf` | `generate_aip` | Mean projection | +| `preproc_rawtiles.nf` | `assess_slice_quality` | SSIM, edge detection (Sobel) | +| `soct_3d_reconst.nf` | `resample_mosaic_grid` | Volume resize | +| `soct_3d_reconst.nf` | `fix_illumination` | BaSiCPy background correction (JAX on GPU) | +| `soct_3d_reconst.nf` | `estimate_xy_transformation` | Phase correlation (FFT) | +| `soct_3d_reconst.nf` | `normalize` | Intensity normalization, percentile clipping | +| `soct_3d_reconst.nf` | `create_registration_masks` | Gaussian filter, morphology | + +### Usage + +```bash +# GPU enabled (default) +nextflow run preproc_rawtiles.nf --input /data --output /output + +# Disable GPU +nextflow run preproc_rawtiles.nf --input /data --output /output --use_gpu false + +# 3D reconstruction with GPU +nextflow run soct_3d_reconst.nf --input /mosaics --output /output --use_gpu true +``` + +### Config-Based Control + +```groovy +// In nextflow.config +params { + use_gpu = true // Enable GPU (default) + // use_gpu = false // Force CPU only +} +``` + +### Requirements + +For GPU support: +- NVIDIA GPU with CUDA support +- CuPy installed: `pip install cupy-cuda12x` +- See [GPU_ACCELERATION.md](GPU_ACCELERATION.md) for detailed setup + +### Expected Speedups + +On NVIDIA A6000 (48GB): + +| Operation | Speedup | +|-----------|---------| +| Phase correlation | 10-15x | +| Volume resize | 5-10x | +| AIP projection | 3-4x | +| Mask creation | 2-4x | + +--- + +## CPU Core Management + +The pipelines provide fine-grained control over CPU usage, allowing you to reserve cores for system overhead and manage the interplay between Nextflow parallelism and Python multiprocessing. 
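The precedence between an explicit maximum and a reserved-core count can be sketched as follows. The helper name `cpu_budget` is illustrative only, not linumpy's actual function:

```python
import os

def cpu_budget(max_cpus=None, reserved_cpus=2):
    """Resolve the CPU budget: an explicit maximum takes precedence;
    otherwise a few cores are kept free for system overhead."""
    total = os.cpu_count() or 1
    if max_cpus is not None:
        # Explicit limit wins, clamped to what the machine actually has.
        return max(1, min(max_cpus, total))
    # Reserve cores for overhead, but never drop below one worker.
    return max(1, total - reserved_cpus)
```

On a 16-core machine this would yield 14 cores by default, 12 with `reserved_cpus = 4`, and exactly 8 with `max_cpus = 8` regardless of the reservation.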
+ +### Configuration Options + +Both pipelines support two approaches: + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `max_cpus` | `null` | Explicit maximum CPUs to use (takes precedence) | +| `reserved_cpus` | `2` | Number of cores to keep free for overhead | +| `processes` | `1` | Python processes per Nextflow task | + +### Usage Examples + +#### Reserve Cores for Overhead (Recommended) + +```bash +# Keep 2 cores free for system overhead (default) +nextflow run soct_3d_reconst.nf \ + --input /path/to/data \ + --reserved_cpus 2 + +# Keep 4 cores free on a heavily-loaded system +nextflow run soct_3d_reconst.nf \ + --input /path/to/data \ + --reserved_cpus 4 +``` + +#### Set Explicit Core Limit + +```bash +# Use exactly 16 cores maximum +nextflow run soct_3d_reconst.nf \ + --input /path/to/data \ + --max_cpus 16 +``` + +### Understanding the Interplay + +The total CPU usage depends on three factors: + +1. **Nextflow parallelism**: How many tasks run simultaneously +2. **Python processes per task**: The `processes` parameter +3. **Thread libraries**: NumPy/SciPy threading (OMP, MKL, OpenBLAS) + +The effective formula is: +``` +Total threads ≈ (Nextflow parallel tasks) × (processes) × (threads per process) +``` + +The pipeline automatically: +- Sets `LINUMPY_MAX_CPUS` or `LINUMPY_RESERVED_CPUS` environment variables for Python scripts +- Configures `OMP_NUM_THREADS`, `MKL_NUM_THREADS`, `OPENBLAS_NUM_THREADS`, and `ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS` to prevent thread oversubscription + +### Disabling CPU Limits + +If the CPU limiting system causes issues (e.g., unexpectedly slow performance), you can disable it entirely: + +```bash +# Disable CPU limits - all cores will be used +nextflow run workflow.nf --enable_cpu_limits false +``` + +This will skip all environment variable settings and let processes use all available cores. 
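When limits are enabled, the per-process environment setup amounts to pinning the thread-pool variables listed above before any numerical library starts. A minimal sketch (the `limit_threads` helper is illustrative, not linumpy's actual code):

```python
import os

THREAD_VARS = (
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS",
)

def limit_threads(n):
    """Pin the BLAS/OpenMP/ITK thread pools to n threads. This must run
    before NumPy/SciPy/SimpleITK are imported to be fully effective."""
    for var in THREAD_VARS:
        os.environ[var] = str(n)

limit_threads(4)
```

Setting the variables after the libraries have initialized their pools has no effect for most backends, which is why the pipeline exports them before launching Python.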
+ +### Recommended Configurations + +| System Type | reserved_cpus | processes | Notes | +|-------------|--------------|-----------|-------| +| Workstation (8-16 cores) | 2 | 2-4 | Good balance | +| Server (32+ cores) | 4 | 4-8 | Leave room for I/O | +| Shared system | 8+ | 2 | Conservative to avoid impacting others | +| Dedicated processing | 1 | auto | Maximum throughput | + +### Environment Variables + +Python scripts in linumpy respect these environment variables: + +| Variable | Description | +|----------|-------------| +| `LINUMPY_MAX_CPUS` | Maximum CPUs to use (explicit limit) | +| `LINUMPY_RESERVED_CPUS` | CPUs to reserve for overhead | +| `ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS` | Thread limit for SimpleITK operations | + +These can also be set manually when running scripts directly: + +```bash +# Reserve 4 cores when running standalone scripts +LINUMPY_RESERVED_CPUS=4 linum_create_mosaic_grid_3d.py input.ome.zarr output.ome.zarr + +# Or set explicit max +LINUMPY_MAX_CPUS=8 linum_stitch_3d.py mosaic_grid.ome.zarr transform.npy output.ome.zarr +``` + +--- + +## Configuration Files + +### nextflow.config Structure + +```groovy +manifest { + nextflowVersion = '>= 23.10' +} + +params { + // Default parameter values + input = "." + output = "." + // ... more parameters +} + +process { + // Process-level settings + publishDir = {"$params.output/$slice_id/$task.process"} + scratch = true + errorStrategy = { task.attempt <= 2 ? 
'retry' : 'ignore' } + maxRetries = 2 +} + +apptainer { + autoMounts = true + enabled = true +} + +profiles { + // Environment-specific profiles + calliste { + // HPC cluster settings + } +} +``` + +### Using Custom Config + +```bash +# Use custom config file +nextflow run workflow.nf -c my_config.config + +# Override specific parameters +nextflow run workflow.nf --resolution 5 --processes 8 +``` + +--- + +## Execution Profiles + +### Local Execution + +```bash +nextflow run workflow.nf +``` + +### HPC Cluster (SLURM) + +```groovy +// In nextflow.config +profiles { + slurm { + process.executor = 'slurm' + process.queue = 'normal' + process.memory = '16 GB' + process.cpus = 4 + } +} +``` + +```bash +nextflow run workflow.nf -profile slurm +``` + +### Containerized Execution + +```groovy +// In nextflow.config +apptainer { + enabled = true + cacheDir = '/path/to/cache' +} +``` + +```bash +nextflow run workflow.nf -with-apptainer linumpy.sif +``` + +--- + +## Monitoring and Debugging + +### Progress Monitoring + +```bash +# Real-time progress +nextflow run workflow.nf + +# With execution report +nextflow run workflow.nf -with-report report.html + +# With timeline +nextflow run workflow.nf -with-timeline timeline.html + +# With DAG visualization +nextflow run workflow.nf -with-dag dag.png +``` + +### Resume Failed Runs + +```bash +# Resume from last checkpoint +nextflow run workflow.nf -resume +``` + +### Clean Up + +```bash +# Clean work directory +nextflow clean -f + +# Clean specific run +nextflow clean -f +``` + +### Log Files + +``` +.nextflow.log # Main log file +.nextflow/ # Nextflow cache and history +work/ # Task working directories +``` + +--- + +## Common Issues + +### Out of Memory + +```groovy +// Increase memory in config +process { + memory = '32 GB' +} +``` + +### Disk Space + +```bash +# Check work directory size +du -sh work/ + +# Clean after successful run +rm -rf work/ +``` + +### Container Issues + +```bash +# Pull container manually +apptainer 
pull linumpy.sif docker://ghcr.io/linum/linumpy:latest + +# Run with explicit container +nextflow run workflow.nf -with-apptainer linumpy.sif +``` + +### Permission Errors + +```bash +# Check file permissions +ls -la work/ + +# Fix ownership +sudo chown -R $USER:$USER work/ +``` + +--- + +## Best Practices + +### 1. Use Version Control for Configs + +```bash +# Track your custom configs +git add nextflow.config +git commit -m "Add custom pipeline config" +``` + +### 2. Test with Small Data First + +```bash +# Run on subset +nextflow run workflow.nf --input /path/to/test_data +``` + +### 3. Monitor Resource Usage + +```bash +# With resource report +nextflow run workflow.nf -with-report -with-trace +``` + +### 4. Use Profiles for Different Environments + +```groovy +profiles { + local { /* laptop settings */ } + hpc { /* cluster settings */ } + cloud { /* AWS/GCP settings */ } +} +``` + +### 5. Keep Work Directory on Fast Storage + +```bash +# Set work directory +nextflow run workflow.nf -w /fast/storage/work +``` + +--- + +## Reference + +- [Nextflow Documentation](https://www.nextflow.io/docs/latest/) +- [Nextflow Patterns](https://nextflow-io.github.io/patterns/) +- [nf-core Guidelines](https://nf-co.re/docs/) diff --git a/docs/PIPELINE_OVERVIEW.md b/docs/PIPELINE_OVERVIEW.md new file mode 100644 index 00000000..1a292247 --- /dev/null +++ b/docs/PIPELINE_OVERVIEW.md @@ -0,0 +1,599 @@ +# Pipeline Overview + + +--- + +## Overview + +The linumpy processing pipeline converts raw S-OCT (Serial Optical Coherence Tomography) microscopy data into reconstructed 3D volumes. The pipeline consists of two main stages: + +1. **Preprocessing Pipeline** (`preproc_rawtiles.nf`) - Converts raw tiles to mosaic grids +2. 
**3D Reconstruction Pipeline** (`soct_3d_reconst.nf`) - Creates 3D volumes from mosaic grids + +![Workflow Diagram](workflow_reconstruction_2-5d.png) + +--- + +## Data Flow + +``` +Raw Tiles (tile_x*_y*_z*) + ↓ +┌───────────────────────────────┐ +│ PREPROCESSING PIPELINE │ +│ preproc_rawtiles.nf │ +├───────────────────────────────┤ +│ • Create 3D mosaic grids │ +│ • Estimate XY shifts │ +│ • Generate slice config │ +│ • [opt] Generate AIPs │ +│ • [opt] Assess slice quality │ +│ • [opt] Generate previews │ +└───────────────────────────────┘ + ↓ +Mosaic Grids (*.ome.zarr) + shifts_xy.csv + slice_config.csv + ↓ +┌───────────────────────────────┐ +│ 3D RECONSTRUCTION PIPELINE │ +│ soct_3d_reconst.nf │ +├───────────────────────────────┤ +│ • [opt] Resample mosaic grids │ +│ • [opt] Fix focal curvature │ +│ • [opt] Fix illumination │ +│ • Generate AIP │ +│ • Estimate XY transforms │ +│ • Stitch tiles in 3D │ +│ • Beam profile correction │ +│ • Crop at interface │ +│ • Normalize intensities │ +│ • Align to common space │ +│ • [opt] Create reg. masks │ +│ • Pairwise registration │ +│ • Stack into 3D volume │ +│ • [opt] Z-intensity normalize │ +│ • [opt] Register to atlas │ +└───────────────────────────────┘ + ↓ +3D Volume (3d_volume.ome.zarr) +``` + +--- + +## Preprocessing Pipeline + +### Purpose + +Converts raw OCT tiles into organized mosaic grids and extracts metadata for subsequent reconstruction. + +### Input + +- **Raw tiles directory**: Contains `tile_x*_y*_z*` folders with raw OCT data +- **Folder structure**: Can be flat or organized by Z slice + +### Output + +1. **Mosaic grids**: `mosaic_grid_3d_z{slice_id}.ome.zarr` files +2. **XY shifts file**: `shifts_xy.csv` containing pairwise slice shifts +3. 
**Slice config** (optional): `slice_config.csv` for controlling slice selection + +### Key Parameters + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `input` | (required) | Raw tiles directory | +| `output` | `"output"` | Output directory | +| `use_gpu` | `true` | Enable GPU acceleration (auto-fallback to CPU) | +| `processes` | `1` | Parallel Python processes per task (CPU mode only) | +| `max_cpus` | `null` | Maximum CPUs to use (null = auto) | +| `reserved_cpus` | `2` | CPUs to keep free for overhead | +| `max_mosaic_forks` | `4` | Max concurrent `create_mosaic_grid` GPU jobs | +| `max_aip_forks` | `4` | Max concurrent `generate_aip` GPU jobs | +| `max_quality_forks` | `2` | Max concurrent `assess_slice_quality` GPU jobs | +| `axial_resolution` | `1.36` | Axial resolution in microns | +| `resolution` | `-1` | Output resolution (-1 = full native resolution) | +| `sharding_factor` | `4` | Zarr sharding factor | +| `fix_galvo_shift` | `true` | Enable galvo shift detection and correction | +| `fix_camera_shift` | `false` | Correct camera shifts (old data) | +| `galvo_confidence_threshold` | `0.6` | Minimum confidence to apply galvo fix | +| `generate_slice_config` | `true` | Generate slice_config.csv | +| `exclude_first_slices` | `1` | Number of leading slices to mark as excluded in slice_config | +| `detect_galvo` | `false` | Include galvo detection results in slice_config.csv | +| `generate_previews` | `false` | Generate orthogonal view previews of mosaic grids | +| `generate_aips` | `false` | Generate AIP images from mosaic grids for QC | +| `assess_quality` | `false` | Run quality assessment and update slice_config | +| `min_quality_score` | `0.2` | Minimum quality score to include slice (0 = report only) | +| `quality_sample_depth` | `10` | Z-planes sampled per slice during quality assessment | + +### Processes + +1. **create_mosaic_grid**: Creates 3D OME-Zarr mosaic from raw tiles +2. 
**estimate_xy_shifts_from_metadata**: Extracts XY shifts from tile metadata +3. **generate_slice_config**: Creates slice configuration file +4. **generate_aip** *(optional, `generate_aips = true`)*: Generates AIP images for QC visualisation +5. **generate_mosaic_preview** *(optional, `generate_previews = true`)*: Generates orthogonal view previews +6. **assess_slice_quality** *(optional, `assess_quality = true`)*: Runs GPU-accelerated quality assessment and updates slice_config + +### Galvo Shift Correction + +The galvo mirror in OCT systems can introduce horizontal banding artifacts. During acquisition, the galvo mirror sweeps across the sample and then returns to its starting position. Data collected during this "return" period creates a distinctive intensity discontinuity if not handled correctly. + +**How it works:** +- When `fix_galvo_shift = true`, each tile is analyzed for galvo artifacts +- The detection algorithm analyzes the Average Intensity Projection (AIP) of each raw tile +- It looks for **intensity discontinuities** at the galvo return region boundaries +- Detection uses **absolute intensity difference** (the return region can be brighter OR darker) +- A **confidence score** (0-1) indicates how certain the artifact is present +- The correction is **only applied if confidence ≥ threshold** (default 0.6) + +**Detection algorithm details:** +- Computes three metrics: boundary contrast (50%), edge sharpness (30%), anomaly score (20%) +- Boundary contrast: How different is the return region intensity from surrounding image? +- Edge sharpness: Are there sharp transitions at the expected boundary locations? +- Anomaly score: Is the return region statistically different (z-score > 1.5)? 
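As a toy illustration of the weighting above — the function names are hypothetical and the inputs are assumed pre-normalized to [0, 1], neither of which is taken from the library's implementation:

```python
def galvo_confidence(boundary_contrast, edge_sharpness, anomaly_score):
    """Combine the three normalized metrics with the documented weights:
    boundary contrast 50%, edge sharpness 30%, anomaly score 20%."""
    return 0.5 * boundary_contrast + 0.3 * edge_sharpness + 0.2 * anomaly_score

def should_apply_fix(confidence, threshold=0.6):
    # The correction is applied only when confidence meets the threshold.
    return confidence >= threshold
```

For example, a tile with strong boundary contrast but soft edges (0.9, 0.6, 0.5) scores 0.73 and is corrected at the default threshold, while a uniformly weak candidate (0.4, 0.4, 0.4) scores 0.4 and is left untouched.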
+ +**When to use:** +- Set `fix_galvo_shift = true` for acquisitions that may contain galvo artifacts (most new data) +- Set `fix_galvo_shift = false` if you know the data is clean (skips detection entirely) +- Adjust `galvo_confidence_threshold` if needed: + - Lower (e.g., 0.4): More aggressive, applies fix more often + - Higher (e.g., 0.7): More conservative, only fixes obvious artifacts + +**Note:** The artifact appears differently in raw tiles vs. stitched mosaics. In raw tiles, it's an intensity band at the return position. In stitched images, it appears as horizontal banding across the full mosaic. + +--- + +## 3D Reconstruction Pipeline + +### Purpose + +Processes mosaic grids through multiple correction and stitching steps to produce a final 3D volume. + +### Input + +1. **Mosaic grids**: `mosaic_grid*_z*.ome.zarr` files +2. **XY shifts file**: `shifts_xy.csv` +3. **Slice config** (optional): `slice_config.csv` + +### Output + +1. **3D volume**: `3d_volume.ome.zarr` +2. **Compressed volume**: `3d_volume.ome.zarr.zip` +3. **Preview image**: `3d_volume.png` + +### Processing Steps + +#### 1. Resampling (Optional) + +``` +resample_mosaic_grid +``` +- Resamples mosaic grids to target resolution +- Skip if `resolution = -1` + +#### 2. Focal Curvature Correction (Optional) + +``` +fix_focal_curvature +``` +- Detects and compensates for focal plane curvature +- Enabled by `fix_curvature_enabled = true` + +#### 3. Illumination Correction (Optional) + +``` +fix_illumination +``` +- Compensates for XY illumination inhomogeneity +- Uses BaSiC algorithm +- Enabled by `fix_illum_enabled = true` + +#### 4. Average Intensity Projection + +``` +generate_aip +``` +- Creates 2D AIP from 3D mosaic for registration + +#### 5. XY Transformation Estimation + +``` +estimate_xy_transformation +``` +- Estimates tile positions from AIP mosaic grid + +#### 6. 
3D Stitching + +``` +stitch_3d +``` +- Stitches tiles into 3D slice using estimated transforms +- Tile blending is controlled by `stitch_blending_method` (default: `'diffusion'`); sub-pixel refinement by `max_blend_refinement_px` +- Motor-only stitching is available for diagnostics via `motor_only_stitch = true` + +#### 7. Beam Profile Correction + +``` +beam_profile_correction +``` +- Model-free PSF compensation +- Corrects axial intensity variations + +#### 8. Interface Cropping + +``` +crop_interface +``` +- Crops volume below sample interface +- Removes agarose/mounting medium + +#### 9. Intensity Normalization + +``` +normalize +``` +- Normalizes intensities per slice +- Compensates signal attenuation with depth + +#### 10. Common Space Alignment + +``` +bring_to_common_space +``` +- Aligns all slices using XY shifts from microscope metadata +- Resamples to common shape +- **Outlier Filtering** (enabled by default): + - Detects erroneous large shifts using IQR statistics + - Replaces outliers with local median of neighboring shifts + - Prevents slices from drifting out of the common volume +- Centers drift around the middle slice to keep tissue centered + +**Key Parameters:** +- `filter_shift_outliers`: Enable outlier filtering (default: `true`) +- `outlier_method`: Detection method - `iqr` recommended (default: `'iqr'`) +- `outlier_iqr_multiplier`: IQR multiplier for detection (default: `1.5`) + +**Debugging:** +- Enable `common_space_preview = true` to generate preview images +- Check `bring_to_common_space/` output directory for aligned slices + +#### 11. Registration Mask Creation (Optional) + +``` +create_registration_masks +``` +- Creates binary masks for pairwise registration +- Enabled by `create_registration_masks = true` + +#### 12. 
Slice Registration + +``` +register_pairwise +``` +- Registers consecutive slices to align them in 3D +- Finds the best matching Z-plane using zero-lag NCC (Pearson correlation) over a centre-cropped ROI +- XY alignment is seeded from motor positions; only small corrections are computed +- Refines with SimpleITK gradient descent (translation, or rotation + translation via `euler` transform) +- Falls back to identity transform if registration exceeds thresholds + +The `bring_to_common_space` step (step 10) provides initial XY alignment using microscope metadata, while pairwise registration fine-tunes the alignment between adjacent slices. + +#### 13. Volume Stacking + +``` +stack +``` +- Stacks all slices into final 3D volume +- Creates multi-resolution pyramid with analysis-friendly resolutions (10, 25, 50, 100 µm) +- Optional blending between slices +- **Two stacking methods** (controlled by `stacking_method`): + - **`motor`** (default, recommended): XY positions from motor encoder data (`shifts_xy.csv`), Z from correlation matching between slices. Pairwise registration transforms are applied as rotation-only corrections on top. + - **`registration`**: Both XY and Z from pairwise image registration. More fully image-driven but accumulates registration errors. + +#### 14. Z-Intensity Normalization (Optional) + +``` +normalize_z_intensity +``` +- Corrects slow intensity drift across serial sections after stacking +- Enabled by `normalize_z_slices = true` +- Two modes: `histogram` (preserves relative tissue contrast) or `percentile` (linear scaling) +- Controlled by `znorm_strength` (0 = passthrough, 1 = full correction) + +#### 15. 
Atlas Registration (Optional) + +``` +align_to_ras +``` +- Registers the final 3D volume to RAS orientation via rigid registration to the Allen Mouse Brain Atlas (CCF) +- Enabled by `align_to_ras_enabled = true` +- Atlas data is downloaded automatically at the specified resolution (10/25/50/100 µm) +- Input volume orientation must be specified via `ras_input_orientation` (see [RAS Orientation Lookup Table](NEXTFLOW_WORKFLOWS.md#ras-orientation-lookup-table)) +- Output is an OME-Zarr at all pyramid resolutions in RAS space + +### Pyramid Resolution Levels + +The final 3D volume is stored as an OME-Zarr with multiple resolution levels optimized for analysis: + +| Resolution | Use Case | +|------------|----------| +| 10 µm | High-resolution cellular analysis | +| 25 µm | Standard analysis resolution | +| 50 µm | Overview, atlas registration | +| 100 µm | Quick visualization, large-scale analysis | + +**Note:** Only resolutions ≥ the base `resolution` parameter are included. For example, if processing at 25 µm resolution, the pyramid will contain 25, 50, and 100 µm levels. 
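The level-selection rule in the note above amounts to a one-line filter. Sketch only — `pyramid_levels` is an illustrative name, not a linumpy function:

```python
def pyramid_levels(base_resolution, levels=(10, 25, 50, 100)):
    """Keep only pyramid levels at or above the base processing
    resolution, since finer levels cannot be synthesized from it."""
    return [r for r in levels if r >= base_resolution]
```

So a 25 µm base resolution yields the 25, 50, and 100 µm levels, while a 10 µm base keeps all four.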
+ +### Key Parameters + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `input` | `"."` | Input mosaic grids directory | +| `shifts_xy` | `""` | XY shifts file (default: `{input}/shifts_xy.csv`) | +| `slice_config` | `""` | Optional slice config file | +| `output` | `"."` | Output directory | +| `resolution` | `10` | Target resolution (µm/pixel) | +| `clip_percentile_upper` | `99.9` | Upper percentile for clipping | +| `fix_curvature_enabled` | `false` | Enable focal curvature fix | +| `fix_illum_enabled` | `true` | Enable illumination fix | +| `crop_interface_out_depth` | `600` | Crop depth in microns | +| `use_motor_positions_for_stitching` | `true` | Use motor encoder positions for tile layout | +| `stitch_overlap_fraction` | `0.2` | Expected tile overlap fraction | +| `stitch_blending_method` | `'diffusion'` | Tile blending: `'none'`, `'average'`, `'diffusion'` | +| `max_blend_refinement_px` | `10` | Maximum sub-pixel refinement shift during blending (pixels) | +| `create_registration_masks` | `true` | Create registration masks | +| `registration_transform` | `'euler'` | Transform type: `euler` (XY + rotation) or `translation` | +| `registration_max_translation` | `200.0` | Optimizer bound on translation (pixels) | +| `registration_max_rotation` | `5.0` | Optimizer bound on rotation (degrees) | +| `stacking_method` | `'motor'` | Stacking method: `motor` (recommended) or `registration` | +| `stack_blend_enabled` | `true` | Enable blending between slices | +| `apply_rotation_only` | `true` | Apply only rotation from pairwise registration during stacking | +| `interpolation_blend_method` | `'gaussian'` | Blend method: `gaussian` (feathered) or `linear` | +| `normalize_z_slices` | `false` | Enable post-stacking Z-intensity normalization | +| `pyramid_resolutions` | `[10, 25, 50, 100]` | Pyramid resolution levels (µm) | +| `pyramid_make_isotropic` | `true` | Resample to isotropic voxel spacing | +| `use_gpu` | `true` | Enable GPU 
acceleration (auto-fallback to CPU) | + +### Registration Algorithm + +The pairwise registration uses a two-step approach: + +1. **Z-Plane Matching**: The best-matching Z-plane in the fixed volume is found by scanning a search range around the expected position and scoring each candidate with zero-lag NCC (Pearson correlation) computed on a centre-cropped ROI. XY initial alignment comes from motor encoder positions. + +2. **Intensity-Based Refinement**: SimpleITK gradient descent with a Pearson correlation metric (`SetMetricAsCorrelation`) using a multiscale pyramid. The transform type is controlled by `registration_transform`: `euler` (rotation + translation) or `translation` (XY only). + +3. **Masking**: Registration masks created by `create_registration_masks` focus registration on tissue content, reducing sensitivity to background edges. The `mask_fill_holes` parameter (`none`, `3d`, `slicewise`) controls hole-filling in masks. + +4. **Transform application** (motor stacking): When `apply_rotation_only = true` only the rotation component is used during stacking; XY translation comes from motor positions. Rotation is clamped to `max_rotation_deg`. Transforms flagged as `error` are skipped when `skip_error_transforms = true`. + +**Recommendations:** +- Use `euler` transform to correct small rotations between slices +- Keep `registration_max_translation` large (optimizer bound only — actual corrections are controlled by `apply_rotation_only` and `max_rotation_deg`) +- Enable `create_registration_masks` to focus registration on tissue +- Set `registration_slicing_interval_mm` to match your actual slice thickness + + +--- + +## GPU Acceleration + +Both pipelines support optional GPU acceleration using NVIDIA CUDA via CuPy. GPU acceleration is enabled by default (`use_gpu = true`) and automatically falls back to CPU if no GPU is available. 
+ +### GPU-Accelerated Processes + +| Pipeline | Process | GPU Operations | +|----------|---------|----------------| +| Preprocessing | `create_mosaic_grid` | Galvo detection, volume resize | +| Preprocessing | `generate_aip` | Mean projection | +| Preprocessing | `assess_slice_quality` | SSIM, edge detection (Sobel) | +| 3D Reconstruction | `resample_mosaic_grid` | Volume resize | +| 3D Reconstruction | `fix_illumination` | BaSiCPy background correction (JAX on GPU) | +| 3D Reconstruction | `estimate_xy_transformation` | Phase correlation (FFT) | +| 3D Reconstruction | `normalize` | Intensity normalization, percentile clipping | +| 3D Reconstruction | `create_registration_masks` | Filtering, morphology | + +### Running with GPU + +```bash +# GPU enabled (default) +nextflow run preproc_rawtiles.nf --input /path/to/data --output /path/to/output + +# Explicitly disable GPU +nextflow run preproc_rawtiles.nf --input /path/to/data --output /path/to/output --use_gpu false +``` + +### Requirements + +- NVIDIA GPU with CUDA support +- CuPy installed (`pip install cupy-cuda12x`) +- See [GPU_ACCELERATION.md](GPU_ACCELERATION.md) for detailed setup + +--- + +## Running the Pipelines + +### Prerequisites + +- Nextflow >= 23.10 +- linumpy package installed +- Apptainer/Singularity (optional, for containerized execution) + +### Preprocessing + +```bash +nextflow run preproc_rawtiles.nf \ + --input /path/to/raw/tiles \ + --output /path/to/output \ + --processes 4 +``` + +### 3D Reconstruction + +```bash +nextflow run soct_3d_reconst.nf \ + --input /path/to/mosaic/grids \ + --shifts_xy /path/to/shifts_xy.csv \ + --output /path/to/output +``` + +### With Slice Config (Subset of Slices) + +```bash +nextflow run soct_3d_reconst.nf \ + --input /path/to/mosaic/grids \ + --shifts_xy /path/to/shifts_xy.csv \ + --slice_config /path/to/slice_config.csv \ + --output /path/to/output +``` + +--- + +## Quality Metrics and Reporting + +The pipeline automatically collects quality metrics at each 
processing step to help identify potential issues. At the end of the run, a comprehensive HTML report is generated. + +### Collected Metrics + +| Step | Metrics Collected | +|------|-------------------| +| **XY Transform Estimation** | Tile pairs used, transform matrix, estimated overlap, RMS residual | +| **Create Masks** | Mask coverage, per-slice coverage, min/std slice coverage | +| **Crop Interface** | Interface depth (voxels/µm), crop indices, interface quality | +| **PSF Compensation** | PSF max, peak depth, agarose coverage, profile quality | +| **Normalize Intensities** | Agarose coverage, Otsu threshold, background stats | +| **Pairwise Registration** | Registration error, translation (x/y), rotation, z-drift | +| **Stack Slices** | Total depth, mean/std z-offsets, z-offset range | + +### Quality Thresholds + +Metrics are automatically evaluated against configurable thresholds: + +- **OK** (green): Value within acceptable range +- **Warning** (yellow): Value approaching problematic range +- **Error** (red): Value indicates likely issue + +### Report Parameters + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `generate_report` | `true` | Generate quality report at end of pipeline | +| `report_verbose` | `false` | Include detailed per-file metrics | + +### Generating Reports Manually + +You can regenerate reports from existing metrics files: + +```bash +# HTML report (recommended) +linum_generate_pipeline_report.py /path/to/pipeline/output report.html --format html + +# Text report +linum_generate_pipeline_report.py /path/to/pipeline/output report.txt --format text + +# Verbose report with all details +linum_generate_pipeline_report.py /path/to/pipeline/output report.html --verbose +``` + +### Interpreting the Report + +1. **Summary Section**: Shows overall status (OK/Warnings/Errors) and counts +2. **Per-Step Sections**: Statistics and issues for each pipeline step +3. 
**Warnings/Errors**: Specific metrics that exceeded thresholds + +Common issues indicated by metrics: +- High registration error → Check slice alignment +- Large z-offset variance → Inconsistent slice thickness or registration problems +- Low mask coverage → Segmentation issues or sparse tissue +- High translation magnitude → Significant sample drift between slices + +--- + +## Error Handling + +### Common Issues + +1. **Missing shifts file**: Ensure `shifts_xy.csv` exists and contains all required slices +2. **Slice mismatch**: Use `slice_config.csv` to exclude problematic slices +3. **Memory issues**: Reduce `processes` parameter or increase available memory +4. **Registration failures**: Try different `pairwise_transform` or `pairwise_registration_metric` + +### Troubleshooting Reconstruction Artifacts + +#### Boomerang/Curved Shape (Cumulative Drift) + +**Symptoms:** Brain appears curved when viewed from the side instead of straight. + +**Cause:** Small registration biases accumulate over many slices. + +**Solutions:** +- Enable registration masks: `--create_registration_masks true` (focuses registration on tissue) +- Enable outlier filtering: `--filter_shift_outliers true --outlier_method iqr` +- Use `--stack_no_accumulate_transforms true` since slices are already aligned + +#### Shadowing/Banding Effect + +**Symptoms:** Horizontal bands visible in sagittal/coronal views; slices appear to have different sizes. + +**Cause:** Inconsistent normalization, varying tissue coverage, or registration errors. + +**Solutions:** +- Enable illumination correction: `--fix_illum_enabled true` +- Enable stack blending: `--stack_blend_enabled true` +- Check agarose mask quality in normalization step + +#### Double Edges in Interpolated Slices + +**Symptoms:** Visible edge artifacts at slice boundaries after interpolation. + +**Cause:** Linear blending creates hard transitions at volume boundaries. 
+ +**Solutions:** +- Use feathered blending: `--interpolation_blend_method gaussian` (default) +- The gaussian method uses distance transform to create soft edges + +### Slice Selection + +When working with a subset of slices: + +1. Generate slice config: `linum_generate_slice_config.py` +2. Edit `slice_config.csv` to set `use=false` for excluded slices +3. Run reconstruction with `--slice_config` parameter + +--- + +## Output Structure + +``` +output/ +├── README/ +│ └── readme.txt # Pipeline parameters +├── resample_mosaic_grid/ # Resampled mosaics (if enabled) +├── fix_focal_curvature/ # Curvature-corrected mosaics +├── fix_illumination/ # Illumination-corrected mosaics +├── generate_aip/ # AIP projections +├── estimate_xy_transformation/ # XY transforms +├── stitch_3d/ # Stitched 3D slices +├── beam_profile_correction/ # PSF-corrected slices +├── crop_interface/ # Cropped slices +├── normalize/ # Normalized slices +├── bring_to_common_space/ # Aligned slices +├── create_registration_masks/ # Registration masks (with *_metrics.json) +├── register_pairwise/ # Pairwise transforms (with *_metrics.json) +├── stack/ +│ ├── 3d_volume.ome.zarr # Final 3D volume +│ ├── 3d_volume.ome.zarr.zip # Compressed volume +│ └── 3d_volume.png # Preview image +├── normalize_z_intensity/ # Z-normalized volume (only when normalize_z_slices = true) +│ └── 3d_volume_znorm.ome.zarr +├── align_to_ras/ # RAS alignment (only when align_to_ras_enabled = true) +│ ├── {subject}_ras.ome.zarr # RAS-aligned volume (all pyramid levels) +│ ├── {subject}_ras_transform.tfm # Registration transform (SimpleITK) +│ └── {subject}_ras_preview.png # 3-panel alignment comparison +├── diagnostics/ # (only when diagnostic_mode = true or individual flags set) +│ ├── rotation_analysis/ +│ ├── acquisition_rotation/ +│ ├── dilation_analysis/ +│ ├── motor_only_stitch/ +│ ├── motor_only_stack/ +│ └── stitch_comparison/ +└── {subject}_quality_report.html # Quality report +``` diff --git 
a/docs/PIPELINE_PERFORMANCE_ANALYSIS.md b/docs/PIPELINE_PERFORMANCE_ANALYSIS.md new file mode 100644 index 00000000..f8b669a7 --- /dev/null +++ b/docs/PIPELINE_PERFORMANCE_ANALYSIS.md @@ -0,0 +1,302 @@ +# 3D Reconstruction Pipeline Performance Analysis + +## Executive Summary + +**Root cause of slowdown: `params.processes = 1` default** + +The `fix_illumination` step is now the bottleneck because it defaults to running with `processes = 1`, meaning each slice is processed sequentially. With ~20 slices and the BaSiC algorithm taking 5-10 minutes per slice, this alone could account for 2-3 hours of additional runtime. + +However, the 8x slowdown (4-6h → 32h) suggests **additional factors** may be at play: + +1. Thread limiting being too aggressive +2. GPU acceleration not being utilized +3. **SimpleITK thread pool not being limited** (CRITICAL) +4. Multiprocessing workers not respecting thread limits + +--- + +## Identified CPU Limiting Gaps + +### Gap 1: SimpleITK Thread Pool (CRITICAL) +**Problem**: SimpleITK spawns its own thread pool for registration operations that **ignores** environment variables like `OMP_NUM_THREADS`. + +**Impact**: Each `register_pairwise` process can spawn 48+ threads, leading to massive thread oversubscription when multiple slices are processed. + +**Fix Applied**: Added `configure_all_libraries()` calls after SimpleITK import in: +- `linum_estimate_transform.py` +- `linum_estimate_transform_gpu.py` +- `linum_interpolate_missing_slice.py` +- `linum_stack_slices_3d.py` + +### Gap 2: Multiprocessing Workers Re-Import Libraries +**Problem**: When using `multiprocessing.Pool` or `pqdm`, each worker process is a fresh Python interpreter that re-imports all libraries. Even though environment variables are inherited, libraries like SimpleITK and numpy need runtime configuration. + +**Impact**: Worker processes don't respect thread limits configured in the main process. 
+ +**Fix Applied**: +- Added `worker_initializer` function in `_thread_config.py` +- Updated `multiprocessing.Pool` calls to use the initializer +- Added `apply_threadpool_limits()` call in `process_tile()` for pqdm workers + +### Gap 3: configure_sitk() Never Called +**Problem**: The `configure_sitk()` function existed but was never called anywhere in the codebase. + +**Fix Applied**: Added automatic SimpleITK configuration in `configure_all_libraries()`. + +### Gap 4: Dask Configuration Only in zarr.py +**Problem**: `configure_dask()` was only called from `linumpy/io/zarr.py`, so scripts that use Dask without zarr.py wouldn't have proper thread limits. + +**Fix Applied**: Included Dask configuration in `configure_all_libraries()`. + +### Gap 5: JAX/XLA Thread Pool (BaSiCPy) +**Problem**: JAX (used by BaSiCPy for the BaSiC algorithm) has its own thread pool controlled by XLA_FLAGS, which was not being set. + +**Impact**: The `fix_illumination` step could spawn excessive threads even when OMP_NUM_THREADS was limited, because JAX ignores OpenMP settings. 
+ +**Fix Applied**: +- Added `XLA_FLAGS` environment variable setting in `_thread_config.py` +- Added `XLA_FLAGS` to Nextflow `beforeScript` in both workflow configs +- Added explicit `XLA_FLAGS` setting in `linum_fix_illumination_3d.py` before JAX import + +**XLA_FLAGS format**: +```bash +export XLA_FLAGS='--xla_cpu_multi_thread_eigen=false intra_op_parallelism_threads=N' +``` + +--- + +## Thread Configuration Architecture + +### Before (Gaps Present) +``` +Nextflow beforeScript → sets OMP_NUM_THREADS + ↓ +Python script imports linumpy._thread_config → sets env vars + ↓ +NumPy/SciPy import → ✓ respects OMP_NUM_THREADS +SimpleITK import → ✗ IGNORES limits, spawns all CPU threads +Dask import → ✗ May not be configured + ↓ +Subprocess workers (pqdm/Pool) → Re-import libraries + ↓ +Workers: NumPy → ✓ inherits env vars +Workers: SimpleITK → ✗ IGNORES limits again +``` + +### After (Gaps Fixed) +``` +Nextflow beforeScript → sets OMP_NUM_THREADS, XLA_FLAGS + ↓ +Python script imports linumpy._thread_config → sets env vars (incl. XLA_FLAGS) + ↓ +All imports complete + ↓ +configure_all_libraries() called → + ✓ NumPy/SciPy (threadpoolctl) + ✓ SimpleITK (ProcessObject.SetGlobalDefaultNumberOfThreads) + ✓ Dask (dask.config) + ✓ Numba (set_num_threads) + ✓ JAX/XLA (XLA_FLAGS environment variable) + ↓ +Subprocess workers created with worker_initializer → + ✓ Re-applies all limits in each worker +``` + +### Process-by-Process Breakdown + +| Process | GPU Support | Parallelism | Est. 
Time/Slice | Bottleneck Risk | +|---------|-------------|-------------|-----------------|-----------------| +| `resample_mosaic_grid` | ✅ Yes | Per-slice parallel | ~2 min | Low | +| `fix_focal_curvature` | ❌ No | Per-slice parallel | ~1 min | Low | +| **`fix_illumination`** | ❌ No | **Uses `params.processes`** | **5-10 min** | **HIGH** | +| `generate_aip` | ❌ No | Per-slice parallel | ~30 sec | Low | +| `estimate_xy_transformation` | ✅ Yes | Per-slice parallel | ~1 min | Low | +| `stitch_3d` | ❌ No | Per-slice parallel | ~2 min | Medium | +| `beam_profile_correction` | ❌ No | Per-slice parallel | ~2 min | Medium | +| `crop_interface` | ❌ No | Per-slice parallel | ~1 min | Low | +| `normalize` | ✅ Yes | Per-slice parallel | ~1 min | Low | +| `create_registration_masks` | ✅ Yes | Per-slice parallel | ~1 min | Low | +| `register_pairwise` | ❌ No | Sequential (by design) | ~5 min | Medium | +| `stack` | ❌ No | Single process | ~10 min | Low | + +### Critical Issue: `fix_illumination` + +```groovy +// In soct_3d_reconst.nf line 62-72 +process fix_illumination { + cpus params.processes // <-- Uses params.processes + + script: + """ + linum_fix_illumination_3d.py ${mosaic_grid} ... --n_processes ${params.processes} + """ +} +``` + +```groovy +// In nextflow.config line 7 +params { + processes = 1 // <-- DEFAULT IS 1! +} +``` + +**Impact**: With 20 slices × 10 minutes each = 200 minutes (~3.3 hours) when running sequentially. + +With `processes = 12`: All slices can run in parallel with internal parallelism, reducing to ~15-20 minutes total. 
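The batch arithmetic above can be sketched as a quick back-of-the-envelope helper (a hypothetical function for illustration, not part of linumpy):

```python
import math

def fix_illum_wall_time(n_slices, minutes_per_slice, processes):
    """Estimate wall-clock minutes for fix_illumination: slices are
    distributed across workers, so runtime is minutes_per_slice for
    each of ceil(n_slices / processes) sequential rounds."""
    return math.ceil(n_slices / processes) * minutes_per_slice

print(fix_illum_wall_time(20, 10, 1))   # sequential: 200 min (~3.3 h)
print(fix_illum_wall_time(20, 10, 12))  # parallel: 20 min (two rounds)
```

This simple model reproduces both figures quoted above: 200 minutes with `processes = 1`, and ~20 minutes with `processes = 12` (20 slices split into two rounds of 12 and 8).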
+ +--- + +## Thread Configuration Analysis + +The pipeline has a complex thread limiting system: + +### Nextflow `beforeScript` (nextflow.config lines 110-140) +```groovy +int threadsPerProcess = Math.max(1, (int)(maxCpus / numProcesses)) +envVars << "export OMP_NUM_THREADS=${threadsPerProcess}" +``` + +### Python Override (linum_fix_illumination_3d.py lines 13-14) +```python +from os import environ +environ["OMP_NUM_THREADS"] = "1" # <-- HARDCODED TO 1! +``` + +**Problem**: The Python script overrides Nextflow's thread configuration, forcing BaSiC to use single-threaded execution. + +--- + +## Recommended Configuration + +For your 48-core, 512GB RAM, dual A6000 server: + +### nextflow.config changes: + +```groovy +params { + // Change from 1 to 12-16 for parallel processing + processes = 12 // Recommended for 48-core server + + // CPU management + enable_cpu_limits = true + reserved_cpus = 4 // Leave 4 cores for system overhead + + // GPU (already enabled) + use_gpu = true +} +``` + +### Explanation of `processes = 12`: +- 48 cores total - 4 reserved = 44 available +- BaSiC algorithm benefits from ~3-4 threads per worker +- 44 / 3 ≈ 14-15 workers maximum +- Using 12 provides headroom for I/O and other processes + +--- + +## Diagnostic Scripts Created + +### 1. `linum_diagnose_pipeline.py` +Comprehensive Python diagnostic script: +```bash +# Quick check +python scripts/linum_diagnose_pipeline.py --quick + +# Full benchmark +python scripts/linum_diagnose_pipeline.py --benchmark + +# Save to file +python scripts/linum_diagnose_pipeline.py --output diagnosis.json +``` + +### 2. Server Checks +Quick server verification commands: +```bash +nproc && nvidia-smi && python -c "import cupy; print('CuPy OK')" +``` + +--- + +## Verification Checklist + +Run these on your server before the next pipeline execution: + +### 1. Check CPU Configuration +```bash +nproc # Should show 48 +``` + +### 2. Check GPU Availability +```bash +nvidia-smi # Should show 2x A6000 +``` + +### 3. 
Check CuPy Installation +```bash +python3 -c "import cupy; print(cupy.__version__)" +``` + +### 4. Check linumpy GPU Module +```bash +python3 -c "from linumpy.gpu import GPU_AVAILABLE; print(f'GPU: {GPU_AVAILABLE}')" +``` + +### 5. Verify nextflow.config Parameters +```bash +grep "processes" /path/to/workflows/reconst_3d/nextflow.config +# Should show: processes = 12 (or similar) +``` + +--- + +## Quick Fix + +If you want to run immediately without modifying config files: + +```bash +nextflow run soct_3d_reconst.nf \ + --input /path/to/input \ + --output /path/to/output \ + --processes 12 \ + --use_gpu true \ + --reserved_cpus 4 \ + -resume +``` + +--- + +## Additional Optimizations + +### 1. Consider GPU-accelerated Illumination Correction +The BaSiC algorithm doesn't have a GPU version, but the processing could potentially be optimized by: +- Using JAX backend for BaSiC (if available) +- Processing multiple slices simultaneously on GPU + +### 2. I/O Optimization +```groovy +// In nextflow.config +process { + scratch = true // Currently enabled + stageInMode = 'symlink' // Uses symlinks (good) + stageOutMode = 'rsync' // Consider 'move' for faster output +} +``` + +### 3. Memory-Intensive Process Limits +For processes that use a lot of memory, add `maxForks`: +```groovy +withName: "fix_illumination" { + maxForks = 6 // Limit concurrent instances +} +``` + +--- + +## Expected Performance After Fixes + +| Scenario | fix_illumination Time | Total Pipeline | +|----------|----------------------|----------------| +| Current (processes=1) | ~3-4 hours | 25-35 hours | +| Optimized (processes=12) | ~20-30 min | 4-6 hours | + +The 8x improvement comes primarily from parallelizing the fix_illumination step. 
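To make the Gap 2 fix concrete, a minimal version of the worker-initializer pattern might look like the sketch below. This is illustrative only — the real `worker_initializer` lives in `linumpy/_thread_config.py` and configures more libraries — but it shows why a runtime call is needed: SimpleITK ignores environment variables, so each fresh worker must call `ProcessObject.SetGlobalDefaultNumberOfThreads` itself.

```python
import os
import multiprocessing as mp

def worker_initializer(n_threads=1):
    """Re-apply thread limits inside each worker process: workers are
    fresh interpreters that re-import libraries, and SimpleITK ignores
    OMP_NUM_THREADS entirely, so it needs an explicit runtime call."""
    os.environ["OMP_NUM_THREADS"] = str(n_threads)
    try:
        import SimpleITK as sitk
        sitk.ProcessObject.SetGlobalDefaultNumberOfThreads(n_threads)
    except ImportError:
        pass  # SimpleITK not needed by this worker

def process_slice(idx):
    return idx * idx  # placeholder for real per-slice work

if __name__ == "__main__":
    # Each of the 4 workers runs worker_initializer(1) before any task.
    with mp.Pool(4, initializer=worker_initializer, initargs=(1,)) as pool:
        results = pool.map(process_slice, range(8))
```

The same `initializer`/`initargs` pair works for `pqdm`, which forwards them to the underlying pool.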
diff --git a/docs/RECONSTRUCTION_DIAGNOSTICS.md b/docs/RECONSTRUCTION_DIAGNOSTICS.md new file mode 100644 index 00000000..bf8ae898 --- /dev/null +++ b/docs/RECONSTRUCTION_DIAGNOSTICS.md @@ -0,0 +1,380 @@ +# Diagnostic Tools for 3D Reconstruction Troubleshooting + +This document describes diagnostic tools for identifying and fixing reconstruction artifacts +in serial OCT microscopy data, particularly for **obliquely-mounted samples**. + +## Context: Oblique Sample Mounting + +Standard serial blockface histology acquisitions image samples in the three normal anatomical +directions (coronal, sagittal, axial). However, for studying how contrast changes across +different orientations, samples can be mounted at angles relative to these standard planes +(e.g., **45° between sagittal and coronal**). + +While the imaging acquisition itself is standard serial blockface histology, the oblique +mounting introduces additional challenges for 3D reconstruction: +- Sample edges may not align with image borders +- Tissue deformation patterns differ from standard orientations +- Registration between slices may need rotation correction + +## Common Artifacts and Their Causes + +### Edge Mismatches / "Overhangs" +**Symptoms**: Visible discontinuities at slice boundaries, tissue edges don't align in 3D view. + +**Possible Causes**: +1. **Cumulative rotation drift** (mosaic-level): Small rotations between slices accumulate, causing edges to diverge +2. **Tile dilation** (tile-level): Physical positions from microscope motor don't match actual image positions after tissue cutting/relaxation +3. 
**Registration failures**: Poor-quality slices cause incorrect transform estimation + +## Pipeline Diagnostic Options + +### Master Switch: `diagnostic_mode` + +The simplest way to enable all diagnostics is with a single flag: + +```bash +nextflow run soct_3d_reconst.nf \ + --input /path/to/data \ + --diagnostic_mode true \ + --debug_slices "10-20" # Limit to subset for faster testing +``` + +When `diagnostic_mode=true`, **all** diagnostic analyses are enabled: +- ✓ Rotation drift analysis (mosaic-level) +- ✓ Tile dilation analysis (tile-level) +- ✓ Motor-only stitching + +### Granular Control + +If you only need specific diagnostics, keep `diagnostic_mode=false` and enable individual flags: + +```groovy +params { + // Enable specific diagnostics + analyze_rotation_drift = true // Analyze cumulative rotation between slices + analyze_tile_dilation = true // Compare motor vs registration positions + motor_only_stitch = true // Create slices using only motor positions + + // Configure diagnostics + diagnostic_rotation_threshold = 2.0 // Flag rotations above this (degrees) + motor_only_overlap = 0.1 // Expected tile overlap for motor-only stitch +} +``` + +## Diagnostic Scripts + +### 1. Acquisition Rotation Analysis (`linum_analyze_acquisition_rotation.py`) + +**NEW**: Analyzes rotation patterns from raw acquisition shift data (before registration). 
+ +This examines the direction of shift vectors between consecutive slices to detect: +- Consistent drift direction vs varying direction (rotation indicator) +- Systematic angular drift (stage rotation) +- Sudden direction changes (sample movement) + +```bash +linum_analyze_acquisition_rotation.py \ + shifts_xy.csv \ + output_dir \ + --registration_dir /path/to/register_pairwise # optional: for comparison +``` + +**Output**: +- `acquisition_rotation_data.csv`: Shift angles and angular velocity per slice +- `acquisition_rotation_analysis.png`: Vector plots and angle trends +- `acquisition_rotation_analysis.txt`: Text report with interpretation + +**Interpretation**: +- **High angle std (>90°)**: Shift directions vary widely - complex drift or sample movement +- **Low angle std (<30°)**: Consistent drift direction - minimal rotation +- **Systematic angular drift**: Stage rotating during acquisition +- Compare with registration rotation to see if registration is compensating + +### 2. Registration Rotation Drift Analysis (`linum_analyze_registration_transforms.py`) + +Analyzes cumulative rotation between consecutive slices from pairwise registration outputs. + +```bash +linum_analyze_registration_transforms.py \ + /path/to/register_pairwise \ + output_dir \ + --rotation_threshold 2.0 +``` + +**Output**: +- `rotation_data.csv`: Per-slice rotation values +- `rotation_analysis.png`: Visualization plots +- `rotation_analysis.txt`: Text report + +**Interpretation**: +- **Mean rotation ≠ 0**: Systematic bias, may indicate tilted sample or stage +- **High cumulative drift**: Edges will diverge in 3D; consider `registration_transform='euler'` +- **Sudden large rotations**: Check slice quality, may need exclusion + +### 3. Tile Dilation Analysis (`linum_analyze_tile_dilation.py`) + +Compares expected motor positions to registration-derived positions. 
+ +```bash +linum_analyze_tile_dilation.py \ + mosaic_grid_z10.ome.zarr \ + transform_xy.npy \ + output_dir \ + --overlap_fraction 0.1 +``` + +**Output**: +- `dilation_analysis.json`: Quantitative metrics +- `dilation_analysis.png`: Vector field and heatmap visualizations +- `dilation_analysis.txt`: Text report with interpretation + +**Interpretation**: +- **Scale factor > 1**: Tiles spread more than expected (tissue expansion after cutting) +- **Scale factor < 1**: Tiles spread less than expected (stage calibration issue) +- **Anisotropic scaling**: Different X/Y scales cause shearing in 3D + +### 4. Motor-Only Stitching (`linum_stitch_motor_only.py`) + +Creates stitched mosaic using ONLY motor positions (bypassing image registration). + +```bash +linum_stitch_motor_only.py \ + mosaic_grid_z10.ome.zarr \ + slice_z10_motor_only.ome.zarr \ + --overlap_fraction 0.1 +``` + +**Interpretation**: +By comparing motor-only vs fully-registered stitches: +- **Large differences**: Registration is compensating for motor errors (may indicate dilation) +- **Systematic offsets**: Stage calibration issue +- **Random scatter**: Normal registration refinement + +### 5. Aggregated Dilation Analysis (`linum_aggregate_dilation_analysis.py`) + +**NEW**: Aggregates dilation analysis from multiple slices to compute recommended scale correction factors. 
+ +```bash +linum_aggregate_dilation_analysis.py \ + /path/to/dilation_analysis \ + output_dir +``` + +**Output**: +- `aggregated_dilation_analysis.json`: Summary statistics and correction factors +- `per_slice_correction_factors.csv`: Per-slice factors for advanced use +- `aggregated_dilation_report.txt`: Text report with recommendations +- `aggregated_dilation_analysis.png`: Visualizations across slices + +**Interpretation**: +- **Recommended scale_y, scale_x**: Apply these to compensate for systematic dilation +- **Deviation from unity**: How much mosaics contract/expand vs expected +- **Anisotropy**: If X/Y differ, use separate correction factors + +### 6. Comprehensive Diagnostics (`linum_diagnose_reconstruction.py`) + +Runs all analyses and generates a unified report. + +```bash +linum_diagnose_reconstruction.py \ + /path/to/pipeline_output \ + output_dir \ + --rotation_threshold 2.0 \ + --slice_range "10-40" +``` + +**Output**: +- `diagnostic_report.txt`: Comprehensive text report +- `diagnostic_results.json`: All metrics in JSON format +- Individual analysis plots + +## Troubleshooting Workflow + +### Step 1: Quick Assessment +```bash +# Run diagnostics on existing output +linum_diagnose_reconstruction.py /path/to/sub-18 diagnostics +``` + +Review `diagnostic_report.txt` for issues flagged. + +### Step 2: Identify Root Cause + +**If rotation drift is high**: +```groovy +// Ensure Euler transform is used +registration_transform = 'euler' +registration_max_rotation = 35.0 // Increase if needed for oblique cuts +``` + +**If dilation is detected**: + +The most common cause of edge misalignment ("overhangs") is systematic tile dilation/contraction. +The diagnostic analysis reveals that motor positions don't match the actual image positions. 
+ +**Solution**: Apply scale correction during 3D assembly: + +#### Using Nextflow Pipeline (Recommended) + +**Option A: Auto-aggregate and apply** (fully automated): +```bash +nextflow run soct_3d_reconst.nf \ + --input /path/to/data \ + --scale_correction_enabled true \ + --aggregate_dilation true +``` +This will: +1. Run dilation analysis on all slices +2. Aggregate results to compute optimal correction factors +3. Apply scale correction during common space alignment + +**Option B: Use pre-computed correction factors**: +```bash +# First run diagnostic mode to generate analysis +nextflow run soct_3d_reconst.nf \ + --input /path/to/data \ + --diagnostic_mode true \ + --debug_slices "30-40" # Test subset first + +# Then re-run with scale correction using the generated JSON +nextflow run soct_3d_reconst.nf \ + --input /path/to/data \ + --scale_correction_enabled true \ + --dilation_json /path/to/output/diagnostics/aggregated_dilation/aggregated_dilation_analysis.json +``` + +**Option C: Manual scale factors**: +```bash +nextflow run soct_3d_reconst.nf \ + --input /path/to/data \ + --scale_correction_enabled true \ + --scale_correction_y 1.031 \ + --scale_correction_x 1.009 +``` + +#### Using Command Line (Manual) + +```bash +# Step 1: Aggregate dilation analysis from multiple slices +linum_aggregate_dilation_analysis.py \ + sub-18/diagnostics/dilation_analysis \ + sub-18/diagnostics/aggregated_dilation + +# Step 2: Apply correction during 3D alignment +linum_align_mosaics_3d_from_shifts.py \ + mosaics_dir \ + shifts_xy.csv \ + aligned_output \ + --dilation_json sub-18/diagnostics/aggregated_dilation/aggregated_dilation_analysis.json + +# Or use manual values: +linum_align_mosaics_3d_from_shifts.py \ + mosaics_dir \ + shifts_xy.csv \ + aligned_output \ + --scale_y 1.031 \ + --scale_x 1.009 +``` + +**Understanding scale factors**: +- `scale < 1` measured → mosaic is smaller than expected → use correction `> 1` to expand +- `scale > 1` measured → mosaic is larger than 
expected → use correction `< 1` to shrink +- The aggregation script outputs the **correction** factors directly (inverse of measured) + +**If specific slices are problematic**: +```groovy +// Exclude degraded slices via slice_config.csv +// Or limit analysis: +debug_slices = "20-40" // Focus on problem region +``` + +### Step 3: Reprocess with Fixes +```bash +# Re-run with adjusted parameters +nextflow run soct_3d_reconst.nf \ + --input /path/to/data \ + --registration_transform euler \ + --registration_max_rotation 40.0 \ + --debug_slices "20-40" # Test on subset first +``` + +### Step 4: Validate Fix +```bash +# Re-run diagnostics to confirm improvement +linum_diagnose_reconstruction.py /path/to/new_output diagnostics_after_fix +``` + +## Example: Troubleshooting sub-18 Oblique Brain + +Based on a 45° oblique-cut mouse brain with edge matching issues: + +1. **Run initial diagnostics**: +```bash +linum_diagnose_reconstruction.py sub-18 sub-18/diagnostics +``` + +2. **Check rotation analysis** (`diagnostics/rotation_analysis.txt`): + - If cumulative rotation > 5°, edges will misalign + +3. **Check dilation analysis** (`diagnostics/dilation_analysis/*/dilation_analysis.txt`): + - Scale factor deviation from 1.0 indicates motor vs registration mismatch + +4. 
**Compare motor-only stitches** with registered stitches: + - Visual inspection of differences reveals registration contribution + +## Parameters Reference + +### Diagnostic Parameters + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `diagnostic_mode` | false | **Master switch**: enables ALL diagnostics when true | +| `analyze_rotation_drift` | false | Analyze inter-slice rotation (mosaic-level) | +| `analyze_tile_dilation` | false | Compare motor vs registration positions (tile-level) | +| `motor_only_stitch` | false | Create motor-position-only stitches | +| `motor_only_stack` | false | Create 3D stack using motor positions only | +| `diagnostic_rotation_threshold` | 2.0 | Flag rotations above this (degrees) | +| `motor_only_overlap` | 0.2 | Expected tile overlap fraction | + +### Tile Stitching Parameters + +The pipeline uses motor positions for tile placement by default, which provides better +alignment than registration-based positioning. Motor positions are precise and don't +introduce the systematic compression seen with image-based registration. + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `use_motor_positions_for_stitching` | **true** | Use motor positions for tile placement (recommended) | +| `stitch_overlap_fraction` | 0.2 | Expected tile overlap fraction (must match acquisition) | + +**Why motor positions?** Analysis of registration-based stitching showed systematic compression +(scale_y ≈ 0.97, scale_x ≈ 0.99), meaning tiles were placed ~3% closer than motor positions +indicated. This caused "overhang" artifacts when stacking slices. Motor positions are precise +and don't introduce this compression. + +**Note**: When `diagnostic_mode=true`, the individual diagnostic flags (`analyze_rotation_drift`, +`analyze_tile_dilation`, `motor_only_stitch`, `motor_only_stack`) are automatically enabled. 
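The measured-scale vs. correction-factor convention described above reduces to a one-line inversion. A tiny illustrative helper (hypothetical, not a linumpy function):

```python
def correction_from_measured(scale_y, scale_x):
    """Correction factors are the inverse of the measured mosaic scale:
    a mosaic measured ~3% too small (scale_y = 0.97) needs a correction
    of ~1.031 to expand it back to its expected size."""
    return 1.0 / scale_y, 1.0 / scale_x

cy, cx = correction_from_measured(0.97, 0.99)
print(round(cy, 3))  # 1.031, matching the correction_y example above
```

This is what the aggregation script outputs directly, so its recommended `scale_y`/`scale_x` values can be passed to `--scale_correction_y`/`--scale_correction_x` without further inversion.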
+
+## Output Directory Structure
+
+When diagnostics are enabled, outputs appear in:
+
+```
+output/
+├── diagnostics/
+│   ├── rotation_analysis/
+│   │   ├── rotation_data.csv
+│   │   ├── rotation_analysis.png
+│   │   └── rotation_analysis.txt
+│   ├── dilation_analysis/
+│   │   ├── 02/
+│   │   │   ├── dilation_analysis.json
+│   │   │   └── dilation_analysis.png
+│   │   ├── 03/
+│   │   └── ...
+│   └── motor_only_stitch/
+│       ├── slice_z02_motor_only.ome.zarr
+│       ├── slice_z03_motor_only.ome.zarr
+│       └── ...
+└── ...
+```
diff --git a/docs/SCRIPTS_REFERENCE.md b/docs/SCRIPTS_REFERENCE.md
new file mode 100644
index 00000000..b12f5367
--- /dev/null
+++ b/docs/SCRIPTS_REFERENCE.md
@@ -0,0 +1,1116 @@
+# Scripts Reference
+
+---
+
+## Overview
+
+linumpy provides a comprehensive set of command-line scripts for microscopy data processing. All scripts follow a consistent interface; run any script with `--help` for full usage information.
+
+---
+
+## Script Categories
+
+1. [Data Conversion](#data-conversion)
+2. [Mosaic Grid Operations](#mosaic-grid-operations)
+3. [Preprocessing](#preprocessing)
+4. [Reconstruction](#reconstruction)
+5. [Registration & Stitching](#registration--stitching)
+6. [Diagnostics](#diagnostics)
+7. [Slice Configuration](#slice-configuration)
+8. [Visualization](#visualization)
+9. [Utilities](#utilities)
+10. [GPU-Accelerated Scripts](#gpu-accelerated-scripts)
+
+---
+
+## Data Conversion
+
+### linum_convert_tiff_to_nifti.py
+
+Convert TIFF stack to NIfTI format.
+
+```bash
+linum_convert_tiff_to_nifti.py <input> <output>
+```
+
+### linum_convert_tiff_to_omezarr.py
+
+Convert TIFF to OME-Zarr format.
+
+```bash
+linum_convert_tiff_to_omezarr.py <input> <output>
+```
+
+### linum_convert_nifti_to_nrrd.py
+
+Convert NIfTI to NRRD format.
+
+```bash
+linum_convert_nifti_to_nrrd.py <input> <output>
+```
+
+### linum_convert_nifti_to_zarr.py
+
+Convert NIfTI to Zarr format.
+
+```bash
+linum_convert_nifti_to_zarr.py <input> <output>
+```
+
+### linum_convert_omezarr_to_nifti.py
+
+Convert OME-Zarr to NIfTI format.
+
+```bash
+linum_convert_omezarr_to_nifti.py <input> <output>
+```
+
+### linum_convert_zarr_to_omezarr.py
+
+Convert Zarr to OME-Zarr format.
+
+```bash
+linum_convert_zarr_to_omezarr.py <input> <output>
+```
+
+### linum_convert_bin_to_nii.py
+
+Convert binary file to NIfTI.
+
+```bash
+linum_convert_bin_to_nii.py <input> <output>
+```
+
+---
+
+## Mosaic Grid Operations
+
+### linum_create_mosaic_grid_2d.py
+
+Create 2D mosaic grid from tiles.
+
+```bash
+linum_create_mosaic_grid_2d.py <input> <output> --from_tiles_list
+```
+
+### linum_create_mosaic_grid_3d.py
+
+Create 3D mosaic grid from raw OCT tiles.
+
+```bash
+linum_create_mosaic_grid_3d.py <input> <output> \
+    --from_tiles_list \
+    --resolution <resolution> \
+    --n_processes <n> \
+    --axial_resolution <resolution> \
+    --sharding_factor <factor>
+```
+
+**Options:**
+- `--resolution`: Output resolution in µm/pixel (-1 for full)
+- `--n_processes`: Number of parallel processes
+- `--axial_resolution`: Axial resolution in µm
+- `--sharding_factor`: Zarr sharding factor
+- `--fix_galvo_shift`: Correct galvo shifts
+- `--fix_camera_shift`: Correct camera shifts
+
+### linum_create_all_mosaic_grids_2d.py
+
+Create all 2D mosaic grids from a tiles directory.
+
+```bash
+linum_create_all_mosaic_grids_2d.py <tiles_directory> <output_directory>
+```
+
+### linum_generate_mosaic_aips.py
+
+Generate AIP (Average Intensity Projection) PNG previews from a directory of mosaic grid OME-Zarr files. Useful for quick QC visualization of all slices at once.
+
+```bash
+linum_generate_mosaic_aips.py <input_directory> <output_directory> \
+    [--level <level>]
+```
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `--level` | `0` | Pyramid level to use (higher = faster, lower resolution) |
+
+### linum_resample_mosaic_grid.py
+
+Resample mosaic grid to a different resolution.
+
+```bash
+linum_resample_mosaic_grid.py <input> <output> -r <resolution>
+```
+
+### linum_resample_mosaic_grid_gpu.py
+
+GPU-accelerated version of mosaic grid resampling (5-12x speedup).
+
+```bash
+linum_resample_mosaic_grid_gpu.py <input> <output> -r <resolution> --use_gpu
+```
+
+Falls back to CPU if GPU is not available.
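The CPU fallback behaviour of the `*_gpu` scripts can be illustrated with a small array-module selector. This is a sketch of the general CuPy/NumPy fallback pattern, not linumpy's actual `linumpy.gpu` API:

```python
import numpy as np

def get_array_module():
    """Return CuPy when a CUDA device is usable, otherwise NumPy,
    so downstream array code runs unchanged on either backend."""
    try:
        import cupy
        cupy.cuda.runtime.getDeviceCount()  # raises if no usable GPU
        return cupy
    except Exception:
        return np

xp = get_array_module()
tile = xp.arange(8, dtype=xp.float32).reshape(2, 4)
profile = tile.mean(axis=0)  # identical result on GPU or CPU
```

Because CuPy mirrors the NumPy API, the same processing code works with either module; only the `xp` binding changes.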
+
+---
+
+## Preprocessing
+
+### linum_fix_illumination_3d.py
+
+Correct illumination inhomogeneity using the BaSiC algorithm.
+
+```bash
+linum_fix_illumination_3d.py <input> <output> \
+    --n_processes <n> \
+    --percentile_max <percentile>
+```
+
+### linum_detect_focal_curvature.py
+
+Detect and correct focal plane curvature.
+
+```bash
+linum_detect_focal_curvature.py <input> <output>
+```
+
+### linum_compensate_illumination.py
+
+Apply illumination compensation.
+
+```bash
+linum_compensate_illumination.py <input> <output>
+```
+
+### linum_estimate_illumination.py
+
+Estimate illumination profile from data.
+
+```bash
+linum_estimate_illumination.py <input> <output>
+```
+
+### linum_compensate_attenuation.py
+
+Compensate for signal attenuation with depth.
+
+```bash
+linum_compensate_attenuation.py <input> <output>
+```
+
+### linum_compute_attenuation.py
+
+Compute attenuation profile.
+
+```bash
+linum_compute_attenuation.py <input> <output>
+```
+
+### linum_compute_attenuation_bias_field.py
+
+Compute attenuation bias field.
+
+```bash
+linum_compute_attenuation_bias_field.py <input> <output>
+```
+
+### linum_compensate_psf_model_free.py
+
+Model-free PSF compensation (beam profile correction).
+
+```bash
+linum_compensate_psf_model_free.py <input> <output> \
+    --percentile_max <percentile>
+```
+
+### linum_compensate_psf_from_model.py
+
+PSF compensation using a model.
+
+```bash
+linum_compensate_psf_from_model.py <input> <output>
+```
+
+### linum_clip_percentile.py
+
+Clip image intensities at percentiles.
+
+```bash
+linum_clip_percentile.py <input> <output> --lower <percentile> --upper <percentile>
+```
+
+### linum_crop_tiles.py
+
+Crop tiles to a specific region.
+
+```bash
+linum_crop_tiles.py <input> <output> --region <region>
+```
+
+### linum_crop_3d_mosaic_below_interface.py
+
+Crop 3D mosaic below the sample interface.
+
+```bash
+linum_crop_3d_mosaic_below_interface.py <input> <output> \
+    --depth <voxels> \
+    --crop_before_interface <voxels> \
+    --percentile_max <percentile>
+```
+
+### linum_normalize_intensities_per_slice.py
+
+Normalize intensities per slice.
+
+```bash
+linum_normalize_intensities_per_slice.py <input> <output> \
+    --percentile_max <percentile>
+```
+
+### linum_intensity_normalization.py
+
+General intensity normalization.
+
+```bash
+linum_intensity_normalization.py <input> <output>
+```
+
+### linum_normalize_z_intensity.py
+
+Correct slow intensity drift across serial sections after stacking. Two modes are supported: `histogram` (per-section histogram matching that preserves relative contrast) and `percentile` (linear scaling to a smoothed percentile curve).
+
+```bash
+linum_normalize_z_intensity.py <input_zarr> <output_zarr> \
+    [--mode {histogram,percentile}] \
+    [--strength <0.0-1.0>] \
+    [--tissue_threshold <threshold>] \
+    [--smooth_sigma <sigma>] \
+    [--percentile <percentile>] \
+    [--max_scale <scale>] \
+    [--min_scale <scale>]
+```
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `--mode` | `histogram` | Normalization mode: `histogram` or `percentile` |
+| `--strength` | `0.5` | Mixing strength (0 = passthrough, 1 = full correction) |
+| `--tissue_threshold` | `0.02` | Minimum intensity to classify as tissue (histogram mode) |
+| `--smooth_sigma` | `10.0` | Smoothing sigma in sections for trend estimation (percentile mode) |
+| `--percentile` | `80.0` | Tissue percentile used as reference intensity (percentile mode) |
+| `--max_scale` | `2.0` | Maximum scale factor |
+| `--min_scale` | `0.5` | Minimum scale factor |
+
+---
+
+## Reconstruction
+
+### linum_aip.py
+
+Generate Average Intensity Projection.
+
+```bash
+linum_aip.py <input> <output>
+```
+
+### linum_stitch_2d.py
+
+Stitch 2D tiles into a mosaic.
+
+```bash
+linum_stitch_2d.py <input> <output>
+```
+
+### linum_stitch_3d.py
+
+Stitch 3D tiles into a volume using a pre-computed transform.
+
+```bash
+linum_stitch_3d.py <mosaic_grid> <transform> <output>
+```
+
+### linum_stitch_3d_refined.py
+
+Stitch 3D tiles with image-registration-refined blend transitions to reduce visible tile seams.
+ +```bash +linum_stitch_3d_refined.py \ + [--overlap_fraction ] \ + [--blending_method {none,average,diffusion}] \ + [--refinement_mode blend_shift] \ + [--max_refinement_px ] \ + [--output_refinements ] +``` + +| Option | Description | +|--------|-------------| +| `--overlap_fraction` | Expected tile overlap fraction (default: 0.2) | +| `--blending_method` | Tile blending method: `none`, `average`, `diffusion` | +| `--refinement_mode` | How refinement shifts are used (e.g. `blend_shift`) | +| `--max_refinement_px` | Maximum sub-pixel refinement shift (pixels) | +| `--output_refinements` | Optional JSON file to save refinement data | + +### linum_stitch_motor_only.py + +Stitch tiles using motor encoder positions only (no image registration). Useful for diagnostics and comparing against refined stitching. + +```bash +linum_stitch_motor_only.py \ + [--overlap_fraction ] \ + [--blending_method {none,average,diffusion}] +``` + +### linum_stack_slices.py + +Stack 2D slices into 3D volume. + +```bash +linum_stack_slices.py --xy_shifts +``` + +### linum_stack_slices_3d.py + +Stack 3D mosaics into final volume with analysis-optimized multi-resolution pyramid. + +```bash +linum_stack_slices_3d.py \ + [--blend] [--overlap ] [--pyramid_resolutions 10 25 50 100] +``` + +**Options:** +- `--blend`: Enable diffusion blending between slices +- `--overlap`: Maximum overlap voxels for blending +- `--pyramid_resolutions`: Target resolutions (µm) for pyramid levels (default: 10 25 50 100) +- `--n_levels`: Number of traditional power-of-2 downsample levels (alternative to pyramid_resolutions) + +**Pyramid Resolution Modes:** + +1. **Custom resolutions (default)**: Specify exact analysis-friendly resolutions + ```bash + linum_stack_slices_3d.py mosaics/ transforms/ output.ome.zarr \ + --pyramid_resolutions 10 25 50 100 + ``` + +2. 
**Traditional power-of-2**: Use `--n_levels` for 2x downsampling + ```bash + linum_stack_slices_3d.py mosaics/ transforms/ output.ome.zarr \ + --n_levels 4 + ``` + +**Note:** Only resolutions ≥ the base resolution will be created. The base resolution is automatically determined from the input data. + +### linum_stack_slices_motor.py + +Stack slices into a 3D volume using motor positions for XY placement and correlation-based Z matching. This is the primary stacking script used by the `motor` stacking method in the pipeline. + +```bash +linum_stack_slices_motor.py \ + [--blending {none,average,max,feather}] \ + [--apply_rotation_only] \ + [--max_rotation_deg ] \ + [--skip_error_transforms] \ + [--rehoming_threshold_mm ] \ + [--smooth_window ] +``` + +| Option | Description | +|--------|-------------| +| `--blending` | Blending method for overlapping regions | +| `--apply_rotation_only` | Apply only the rotation component from pairwise registration | +| `--max_rotation_deg` | Clamp rotations larger than this value | +| `--skip_error_transforms` | Skip transforms with error status | +| `--rehoming_threshold_mm` | Motor shift threshold to detect re-homing events | +| `--smooth_window` | Moving-average window for smoothing per-slice rotations | + +### linum_stack_motor_only.py + +Stack slices using motor positions only (no pairwise registration). Used for diagnostics to isolate the motor-position contribution. + +```bash +linum_stack_motor_only.py \ + [--blending {none,average,max,feather}] \ + [--preview ] +``` + +--- + +## Registration & Stitching + +### linum_estimate_transform.py + +Estimate XY transformation from mosaic grid. + +```bash +linum_estimate_transform.py +``` + +### linum_estimate_xy_shift_from_metadata.py + +Estimate XY shifts from tile metadata. + +```bash +linum_estimate_xy_shift_from_metadata.py \ + --n_processes +``` + +### linum_align_mosaics_3d_from_shifts.py + +Align mosaics to common space using shifts file. 
This script brings all slices into a common coordinate system based on the physical positions recorded by the microscope. + +```bash +linum_align_mosaics_3d_from_shifts.py \ + [--slice_config ] \ + [--filter_outliers] \ + [--outlier_method {clamp,median,zero,local,iqr}] \ + [--max_shift_mm ] \ + [--iqr_multiplier ] \ + [--no_center_drift] +``` + +**Options:** +- `--slice_config`: Optional CSV file to filter which slices to process +- `--filter_outliers`: Enable outlier detection and filtering (recommended) +- `--outlier_method`: Method to handle outliers: + - `clamp`: Limit shift magnitude to `--max_shift_mm` + - `median`: Replace with global median of non-outliers + - `zero`: Replace with zero shift + - `local`: Replace with local median of neighbors (preserves trends) + - `iqr`: Auto-detect using IQR statistics and replace with local median (recommended) +- `--max_shift_mm`: Maximum allowed shift in mm (default: 0.5, only used if method != 'iqr') +- `--iqr_multiplier`: IQR multiplier for outlier detection (default: 1.5, only with 'iqr' method) +- `--no_center_drift`: Don't center drift around middle slice + +**Example with outlier filtering:** +```bash +linum_align_mosaics_3d_from_shifts.py mosaics/ shifts_xy.csv output/ \ + --slice_config slice_config.csv \ + --filter_outliers \ + --outlier_method iqr +``` + +**Note:** The shifts file may contain erroneous large shifts due to stage positioning errors. The IQR-based filtering automatically detects these outliers and replaces them with reasonable values based on neighboring shifts. + +### linum_apply_slices_transforms.py + +Apply transforms to slices. + +```bash +linum_apply_slices_transforms.py +``` + +### linum_estimate_slices_transforms_gui.py + +GUI for manual slice transform estimation. + +```bash +linum_estimate_slices_transforms_gui.py +``` + +### linum_create_masks.py + +Create binary masks for registration. Masks are saved with pyramid levels matching the input image. 
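The mask-creation recipe (smooth, threshold, clean up small objects) can be sketched with `scipy.ndimage`; the global-mean threshold here is a simplification for illustration, not the script's actual logic:

```python
import numpy as np
from scipy import ndimage as ndi

def tissue_mask(img, sigma=5.0, min_size=100):
    """Smooth, threshold, close small gaps, then drop tiny components."""
    smooth = ndi.gaussian_filter(img.astype(np.float64), sigma)
    mask = smooth > smooth.mean()                       # simple global threshold
    mask = ndi.binary_closing(mask, structure=np.ones((3, 3)))
    labels, n = ndi.label(mask)
    sizes = ndi.sum(mask, labels, range(1, n + 1))      # pixels per component
    keep_ids = 1 + np.flatnonzero(np.atleast_1d(sizes) >= min_size)
    return np.isin(labels, keep_ids)

img = np.zeros((100, 100))
img[20:80, 20:80] = 1.0     # "tissue" block
img[5, 5] = 1.0             # isolated noise pixel
mask = tissue_mask(img)
```

The size filter is what removes isolated bright specks that would otherwise distract the registration metric.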
+
+```bash
+linum_create_masks.py <input> <output> \
+    --sigma <sigma> \
+    --selem_radius <radius> \
+    --min_size <n> \
+    [--normalize] \
+    [--n_levels <n>] \
+    [--preview <preview.png>]
+```
+
+**Options:**
+- `--sigma`: Gaussian smoothing sigma (default: 5.0)
+- `--selem_radius`: Structuring element radius (default: 1)
+- `--min_size`: Minimum object size in pixels (default: 100)
+- `--normalize`: Normalize image before processing
+- `--n_levels`: Number of pyramid levels (default: matches input image)
+- `--preview`: Path to save a preview PNG for visual verification
+
+### linum_register_pairwise.py
+
+Perform pairwise registration between consecutive slices to compute small rotation and Z-overlap corrections. This is the primary registration script used by the motor-based reconstruction pipeline. It **does not** compute large XY translations (those are handled by motor positions from `shifts_xy.csv`).
+
+Three outputs are produced per pair:
+- `transform.tfm`: SimpleITK transform (rotation + sub-pixel translation)
+- `offsets.txt`: Z-index correspondence between fixed and moving slices
+- `metrics.json`: Registration quality metrics
+
+```bash
+linum_register_pairwise.py <fixed> <moving> <output> \
+    [--slicing_interval_mm <mm>] \
+    [--search_range_mm <mm>] \
+    [--robustness {0,1,2}] \
+    [--use_mask] \
+    [--mask_dir <dir>]
+```
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `--slicing_interval_mm` | `0.05` | Expected physical slice thickness in mm |
+| `--search_range_mm` | `0.1` | Z search range around the expected overlap |
+| `--robustness` | `1` | Robustness level: 0=fast, 1=balanced, 2=thorough |
+| `--use_mask` | off | Use tissue masks to focus registration on tissue regions |
+| `--mask_dir` | — | Directory containing mask OME-Zarr files |
+
+---
+
+### linum_align_to_ras.py
+
+Align a 3D brain volume to RAS orientation by rigid registration to the Allen Mouse Brain Atlas (CCF). The result is an OME-Zarr at all pyramid resolutions in RAS space.
+ +```bash +linum_align_to_ras.py \ + [--allen-resolution {10,25,50,100}] \ + [--metric {MI,MSE,CC,AntsCC}] \ + [--max-iterations N] \ + [--level L] \ + [--input-orientation ] \ + [--initial-rotation RX RY RZ] \ + [--preview ] +``` + +| Option | Default | Description | +|--------|---------|-------------| +| `--allen-resolution` | `100` | Allen atlas resolution in µm: 10, 25, 50, 100 | +| `--metric` | `MI` | Registration metric: `MI`, `MSE`, `CC`, `AntsCC` | +| `--max-iterations` | `1000` | Maximum optimizer iterations | +| `--level` | `0` | Pyramid level for registration (0 = full) | +| `--input-orientation` | — | 3-letter orientation code (see below) | +| `--initial-rotation` | `0 0 0` | Initial rotation hint Rx Ry Rz (degrees) | +| `--preview` | — | Save a 3-panel alignment comparison image | +| `--preview-only` | — | Only generate preview without registering | +| `--store-transform-only` | — | Store transform in metadata without resampling | +| `--verbose` | — | Print registration progress | + +**Orientation codes** — see the [RAS Orientation Lookup Table](NEXTFLOW_WORKFLOWS.md#ras-orientation-lookup-table) for a complete reference. + +--- + +## Slice Configuration + +### linum_generate_slice_config.py + +Generate slice configuration file with optional galvo shift detection. 
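The resulting file is plain CSV with `slice_id,use,notes` columns (see the slice-config format documented below), so it can also be written or filtered programmatically; a sketch with hypothetical values:

```python
import csv
import io

rows = [
    {"slice_id": "00", "use": "true", "notes": ""},
    {"slice_id": "01", "use": "true", "notes": ""},
    {"slice_id": "02", "use": "false", "notes": "Motion artifact during cut"},
]

# Write a slice_config.csv (in-memory here; use open(...) for a real file)
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["slice_id", "use", "notes"])
writer.writeheader()
writer.writerows(rows)
config_text = buf.getvalue()

# Reading back: only slices flagged use=true are processed
kept = [r["slice_id"] for r in csv.DictReader(io.StringIO(config_text))
        if r["use"] == "true"]
```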
+ +```bash +# From mosaic grids +linum_generate_slice_config.py + +# From raw tiles +linum_generate_slice_config.py --from_tiles + +# From shifts file +linum_generate_slice_config.py --from_shifts + +# With exclusions +linum_generate_slice_config.py --exclude 1 2 5 + +# With galvo detection (adds galvo_confidence and galvo_fix columns) +linum_generate_slice_config.py --from_tiles --detect_galvo + +# From shifts with galvo detection +linum_generate_slice_config.py --from_shifts \ + --detect_galvo --tiles_dir + +# Custom galvo threshold +linum_generate_slice_config.py --from_tiles \ + --detect_galvo --galvo_threshold 0.6 +``` + +**Galvo Detection Options:** +| Option | Default | Description | +|--------|---------|-------------| +| `--detect_galvo` | off | Enable galvo shift detection | +| `--tiles_dir` | - | Raw tiles directory (if input is shifts file) | +| `--galvo_threshold` | 0.5 | Confidence threshold for galvo fix | + +--- + +## Diagnostics + +Scripts for troubleshooting reconstruction artifacts. These are typically invoked by the pipeline's diagnostic mode but can also be run standalone. + +### linum_analyze_shifts.py + +Analyze XY shifts from a shifts file and generate a drift analysis report with summary statistics, outlier detection, and cumulative drift visualization. + +```bash +linum_analyze_shifts.py \ + [--resolution ] \ + [--iqr_multiplier ] \ + [--slice_config ] +``` + +| Option | Default | Description | +|--------|---------|-------------| +| `--resolution` | `10.0` | Resolution in µm/pixel (for mm → pixel conversion) | +| `--iqr_multiplier` | `1.5` | IQR multiplier for outlier detection | +| `--slice_config` | — | Optional slice config CSV to filter slices | + +### linum_assess_slice_quality.py + +Assess mosaic grid slice quality and optionally create or update a slice configuration file. Uses SSIM, edge preservation, and variance consistency metrics. 
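For intuition about the SSIM component: SSIM compares means, variances, and covariance between two images. A single-window (global) SSIM in NumPy — real implementations such as scikit-image average SSIM over local windows, so treat this as a sketch only:

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM; libraries average this over local windows instead."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))

rng = np.random.default_rng(1)
clean = rng.random((32, 32))
noisy = np.clip(clean + 0.05 * rng.standard_normal(clean.shape), 0, 1)
```

Identical images score exactly 1.0; added noise pulls the score below 1, which is how degraded slices fall under a `--min_quality` cutoff.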
+ +```bash +linum_assess_slice_quality.py \ + [--min_quality <0.0-1.0>] \ + [--exclude_first ] \ + [--update_existing] +``` + +| Option | Description | +|--------|-------------| +| `--min_quality` | Automatically exclude slices below this quality score | +| `--exclude_first` | Exclude the first N calibration slices | +| `--update_existing` | Update an existing slice config with quality info | + +### linum_assess_slice_quality_gpu.py + +GPU-accelerated version of `linum_assess_slice_quality.py` (3-8x speedup). Accepts the same arguments plus `--use_gpu / --no-use_gpu`. + +```bash +linum_assess_slice_quality_gpu.py [options] +``` + +### linum_analyze_registration_transforms.py + +Analyze cumulative rotation and translation across pairwise registration transforms to detect drift. + +```bash +linum_analyze_registration_transforms.py \ + [--resolution ] \ + [--rotation_threshold ] +``` + +### linum_analyze_acquisition_rotation.py + +Analyze acquisition-time rotation from the shifts file combined with registration outputs. + +```bash +linum_analyze_acquisition_rotation.py \ + [--registration_dir ] \ + [--resolution ] +``` + +### linum_analyze_tile_dilation.py + +Analyze tile position refinements to detect scale drift (mosaic dilation). + +```bash +linum_analyze_tile_dilation.py \ + [--resolution ] \ + [--overlap_fraction ] \ + [--slice_id ] +``` + +### linum_aggregate_dilation_analysis.py + +Aggregate per-slice tile dilation analysis results across the full sample. + +```bash +linum_aggregate_dilation_analysis.py \ + [--pattern ] +``` + +### linum_compare_stitching.py + +Compare motor-only vs refined stitching side-by-side by computing seam sharpness metrics and generating comparison visualizations. + +```bash +linum_compare_stitching.py \ + [--label1 ] \ + [--label2 ] \ + [--tile_step ] +``` + +### linum_diagnose_pipeline.py + +High-level pipeline diagnostic script. Aggregates metrics and produces a summarized diagnostic report. 
+ +```bash +linum_diagnose_pipeline.py \ + [--resolution ] +``` + +### linum_diagnose_reconstruction.py + +Detailed reconstruction diagnostic script. Checks registration transforms, rotation drift, and alignment quality. + +```bash +linum_diagnose_reconstruction.py \ + [--resolution ] \ + [--rotation_threshold ] +``` + +--- + +## Visualization + +### linum_view_omezarr.py + +Interactive OME-Zarr viewer. + +```bash +linum_view_omezarr.py +``` + +### linum_view_zarr.py + +Interactive Zarr viewer. + +```bash +linum_view_zarr.py +``` + +### linum_view_oct_raw_tile.py + +View raw OCT tile. + +```bash +linum_view_oct_raw_tile.py +``` + +### linum_screenshot_omezarr.py + +Generate screenshot from OME-Zarr. + +```bash +linum_screenshot_omezarr.py +``` + +### linum_screenshot_omezarr_annotated.py + +Generate annotated orthogonal view screenshots from an OME-Zarr volume. Adds Z-slice index labels to the coronal and sagittal views so each input slice can be easily identified in the reconstruction. + +```bash +linum_screenshot_omezarr_annotated.py \ + [--x_slice ] \ + [--y_slice ] \ + [--n_slices ] \ + [--slice_ids ] \ + [--font_size ] \ + [--label_every ] \ + [--show_lines] +``` + +| Option | Default | Description | +|--------|---------|-------------| +| `--x_slice` | center | X-axis slice index for ZY view | +| `--y_slice` | center | Y-axis slice index for ZX view | +| `--n_slices` | auto | Number of input slices (auto-detected from OME-Zarr metadata) | +| `--slice_ids` | — | Comma-separated actual slice IDs (e.g. `05,12,18`) | +| `--font_size` | `7` | Font size for slice labels | +| `--label_every` | `1` | Label every Nth slice | +| `--show_lines` | off | Draw horizontal lines at slice boundaries | + +### linum_generate_pipeline_report.py + +Generate a quality report from pipeline metrics. Aggregates metrics JSON files from all processing steps and produces an HTML or text report. 
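Conceptually, report generation starts by sweeping the output tree for per-step metrics JSON files; a hypothetical sketch (file names and metric keys are illustrative, not the pipeline's actual layout):

```python
import json
import tempfile
from pathlib import Path

def collect_metrics(input_dir):
    """Gather every *metrics*.json under input_dir into one summary dict."""
    summary = {}
    for path in sorted(Path(input_dir).rglob("*metrics*.json")):
        summary[path.stem] = json.loads(path.read_text())
    return summary

# Build a fake output tree and aggregate it
with tempfile.TemporaryDirectory() as d:
    step = Path(d) / "registration"
    step.mkdir()
    (step / "pair_00_metrics.json").write_text(json.dumps({"ssim": 0.91}))
    (step / "pair_01_metrics.json").write_text(json.dumps({"ssim": 0.88}))
    report = collect_metrics(d)
```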
+
+```bash
+# Generate HTML report (default)
+linum_generate_pipeline_report.py <input_dir> report.html
+
+# Generate text report
+linum_generate_pipeline_report.py <input_dir> report.txt --format text
+
+# Verbose report with all metric details
+linum_generate_pipeline_report.py <input_dir> report.html --verbose
+
+# Custom title
+linum_generate_pipeline_report.py <input_dir> report.html --title "My Pipeline Report"
+```
+
+**Parameters:**
+| Parameter | Default | Description |
+|-----------|---------|-------------|
+| `input_dir` | (required) | Directory containing pipeline output with metrics files |
+| `output_report` | (required) | Output report file (.html or .txt) |
+| `--format` | `auto` | Output format: `html`, `text`, or `auto` (infer from extension) |
+| `--title` | `"Pipeline Quality Report"` | Report title |
+| `--verbose` | `false` | Include all metric details in report |
+
+---
+
+## Utilities
+
+### linum_extract_pyramid_levels.py
+
+Extract one or more pyramid levels from an OME-Zarr volume and save them as NIfTI files. Useful for exporting analysis-specific resolutions without converting the full volume.
+
+```bash
+# List available pyramid levels
+linum_extract_pyramid_levels.py <input.ome.zarr> --list
+
+# Extract specific levels
+linum_extract_pyramid_levels.py <input.ome.zarr> 0 2
+```
+
+Output files are named `<input>_level_<level>.nii.gz` and saved next to the input.
+
+### linum_resample.py
+
+Resample an image to a different resolution.
+
+```bash
+linum_resample.py <input> <output> -r <resolution>
+```
+
+### linum_reorient_nifti_to_ras.py
+
+Reorient a NIfTI volume to RAS orientation.
+
+```bash
+linum_reorient_nifti_to_ras.py <input> <output>
+```
+
+### linum_axis_XYZ_to_ZYX.py
+
+Transpose axes from XYZ to ZYX.
+
+```bash
+linum_axis_XYZ_to_ZYX.py <input> <output>
+```
+
+### linum_segment_brain_3d.py
+
+Segment brain tissue in a 3D volume.
+
+```bash
+linum_segment_brain_3d.py <input> <output>
+```
+
+### linum_download_allen.py
+
+Download Allen Brain Atlas data.
+
+```bash
+linum_download_allen.py <output>
+```
+
+### linum_merge_slices_into_folders.py
+
+Organize slices into a folder structure.
+ +```bash +linum_merge_slices_into_folders.py +``` + +### linum_suggest_params.py + +Suggest 3D reconstruction pipeline parameters from raw input files. Analyses the motor-positions file (`shifts_xy.csv`) and, optionally, the raw data directory produced by the preprocessing pipeline to automatically estimate suitable `nextflow.config` parameters. + +```bash +linum_suggest_params.py \ + [--data_dir ] \ + [--n_calibration_slices N] \ + [--axial_res_um UM] \ + [--resolution_um UM] \ + [-f] +``` + +| Option | Default | Description | +|--------|---------|-------------| +| `shifts_file` | (required) | Motor-positions CSV file (`shifts_xy.csv`) | +| `output_dir` | (required) | Directory for the report and suggested config snippet | +| `--data_dir` | — | Raw data directory (contains `state.json` and `slice_z##/` subdirectories) | +| `--n_calibration_slices` | `1` | Leading calibration slices to skip when reading per-slice metadata (`slice_z00` is always calibration) | +| `--axial_res_um` | `3.5` | OCT axial resolution in µm/pixel | +| `--resolution_um` | — | Override target pipeline resolution in µm/pixel (derived automatically if omitted) | +| `-f`, `--overwrite` | — | Overwrite existing output directory | + +**Estimated parameters:** + +From `shifts_xy.csv`: +- `stitch_rehoming_threshold_mm` — re-homing boundary threshold (MAD-robust detection) +- `stitch_rehoming_enabled` — true if re-homing events are detected +- `stitch_rehoming_use_motor` — always recommended true when re-homing is present +- `max_shift_mm` — IQR upper bound of normal inter-slice shifts +- `common_space_max_step_mm` — 95th percentile of consecutive normal shift changes + +From `--data_dir` (raw data directory): +- `registration_slicing_interval_mm` — from `slice_thickness` in `metadata.json` / `state.json` +- `stitch_overlap_fraction` — from `overlap_fraction` in `metadata.json` / `state.json` +- `resolution` — smallest standard resolution ≥ native lateral pixel size +- `crop_interface_out_depth` — 
estimate based on OCT depth and focus position (must be verified) + +**Outputs:** +- `param_estimation_report.txt` — human-readable analysis report with shift statistics +- `suggested_params.config` — annotated `nextflow.config` parameter block ready to copy-paste + +**Example:** +```bash +linum_suggest_params.py /data/sub-01/shifts_xy.csv /tmp/sub-01_params \ + --data_dir /data/sub-01/raw \ + --n_calibration_slices 1 +``` + +--- + +## GPU-Accelerated Scripts + +GPU-accelerated versions of key scripts. These use NVIDIA CUDA via CuPy for acceleration and automatically fall back to CPU if no GPU is available. + +### linum_gpu_info.py + +Check GPU availability and run quick benchmarks. + +```bash +# Show GPU info +linum_gpu_info.py + +# Run quick performance test +linum_gpu_info.py --test + +# Output as JSON +linum_gpu_info.py --json +``` + +### linum_benchmark_gpu.py + +Comprehensive CPU vs GPU benchmark suite. + +```bash +# Quick benchmark (512x512) +linum_benchmark_gpu.py + +# Full benchmark with multiple sizes +linum_benchmark_gpu.py --full + +# Custom sizes +linum_benchmark_gpu.py --sizes 1024 2048 4096 + +# With real data +linum_benchmark_gpu.py --input /path/to/mosaic.ome.zarr + +# Save results +linum_benchmark_gpu.py --output results.json --iterations 10 +``` + +| Option | Description | +|--------|-------------| +| `--input` | Path to OME-Zarr for real-data benchmark | +| `--output, -o` | Save results to JSON file | +| `--iterations, -n` | Iterations per test (default: 3) | +| `--full` | Run with multiple sizes | +| `--sizes` | Custom sizes to test | +| `--skip-correctness` | Skip result verification | + +### linum_estimate_transform_gpu.py + +GPU-accelerated transform estimation using phase correlation. 
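Phase correlation estimates a translation from the phase of the cross-power spectrum; a CPU NumPy sketch of the principle (the GPU script performs the same math through CuPy's FFT API):

```python
import numpy as np

def phase_correlation_shift(fixed, moving):
    """Estimate the integer (dy, dx) that maps `moving` onto `fixed`."""
    F = np.fft.fft2(fixed)
    M = np.fft.fft2(moving)
    cross = F * np.conj(M)
    cross /= np.maximum(np.abs(cross), 1e-12)     # keep phase only
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap large indices around to negative shifts
    if dy > fixed.shape[0] // 2:
        dy -= fixed.shape[0]
    if dx > fixed.shape[1] // 2:
        dx -= fixed.shape[1]
    return dy, dx

rng = np.random.default_rng(0)
fixed = rng.random((64, 64))
moving = np.roll(fixed, shift=(-3, 5), axis=(0, 1))   # known displacement
```

Rolling `moving` by the returned shift recovers `fixed`, which is a quick way to validate the estimate.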
+ +```bash +linum_estimate_transform_gpu.py [--use_gpu] [-v] +``` + +| Option | Description | +|--------|-------------| +| `--initial_overlap` | Initial overlap fraction (default: 0.3) | +| `--tile_shape` | Tile shape in pixels | +| `--n_samples` | Max tile pairs for optimization | +| `--use_gpu/--no-use_gpu` | Enable/disable GPU | + +### linum_create_masks_gpu.py + +GPU-accelerated tissue mask creation. + +```bash +linum_create_masks_gpu.py [options] +``` + +| Option | Description | +|--------|-------------| +| `--sigma` | Gaussian smoothing sigma (default: 5.0) | +| `--selem_radius` | Structuring element radius (default: 1) | +| `--min_size` | Minimum object size (default: 100) | +| `--normalize` | Normalize before processing | +| `--use_gpu/--no-use_gpu` | Enable/disable GPU | + +### linum_create_mosaic_grid_3d_gpu.py + +GPU-accelerated mosaic grid creation with galvo detection. + +```bash +linum_create_mosaic_grid_3d_gpu.py --from_tiles_list [options] +``` + +| Option | Description | +|--------|-------------| +| `--resolution` | Output resolution in µm/pixel | +| `--fix_galvo_shift` | Enable galvo correction | +| `--galvo_threshold` | Galvo detection threshold (default: 0.6) | +| `--use_gpu/--no-use_gpu` | Enable/disable GPU | + +### linum_normalize_intensities_per_slice_gpu.py + +GPU-accelerated normalization of intensities per slice (4-10x speedup). + +```bash +linum_normalize_intensities_per_slice_gpu.py \ + [--percentile_max ] \ + [--use_gpu/--no-use_gpu] +``` + +### linum_fix_illumination_3d_gpu.py + +GPU-accelerated illumination correction using BaSiCPy on JAX (2-5x speedup). Requires JAX GPU setup — see [GPU_ACCELERATION.md](GPU_ACCELERATION.md#jax-gpu-for-basicpy-fix_illumination). + +```bash +linum_fix_illumination_3d_gpu.py \ + [--n_processes ] \ + [--use_gpu/--no-use_gpu] +``` + +### linum_generate_mosaic_aips_gpu.py + +GPU-accelerated version of `linum_generate_mosaic_aips.py`. 
Note: mean projection offers no genuine speedup (≤1x); this script is provided for pipeline consistency. + +```bash +linum_generate_mosaic_aips_gpu.py \ + [--level ] \ + [--use_gpu/--no-use_gpu] +``` + +### GPU Script Comparison + +| GPU Script | CPU Equivalent | Accelerated Operations | +|------------|----------------|------------------------| +| `linum_estimate_transform_gpu.py` | `linum_estimate_transform.py` | FFT (9-47x), phase correlation (8-16x) | +| `linum_create_masks_gpu.py` | `linum_create_masks.py` | Gaussian filter (7-20x), binary morphology (7-67x) | +| `linum_create_mosaic_grid_3d_gpu.py` | `linum_create_mosaic_grid_3d.py` | Resize (5-12x) | +| `linum_resample_mosaic_grid_gpu.py` | `linum_resample_mosaic_grid.py` | Resize (5-12x) | +| `linum_normalize_intensities_per_slice_gpu.py` | `linum_normalize_intensities_per_slice.py` | Normalization (4-10x) | +| `linum_fix_illumination_3d_gpu.py` | `linum_fix_illumination_3d.py` | BaSiCPy via JAX (2-5x) | +| `linum_assess_slice_quality_gpu.py` | `linum_assess_slice_quality.py` | SSIM, morphology (3-8x) | +| `linum_generate_mosaic_aips_gpu.py` | `linum_generate_mosaic_aips.py` | AIP mean projection (≤1x) | + +*Note: `linum_aip_gpu.py` exists but offers no speedup for mean projection (0.5x = GPU is 2x slower due to transfer overhead). 
Use `linum_aip_png.py` or `linum_generate_mosaic_aips.py` instead.* + +--- + +## Common Options + +Most scripts support these common options: + +| Option | Description | +|--------|-------------| +| `--help`, `-h` | Show help message | +| `--overwrite`, `-f` | Overwrite existing output | +| `--n_processes`, `-p` | Number of parallel processes | + +--- + +## Getting Help + +```bash +# Show script help +linum_.py --help + +# Example +linum_create_mosaic_grid_3d.py --help +``` diff --git a/docs/SHIFTS_FILE_FORMAT.md b/docs/SHIFTS_FILE_FORMAT.md new file mode 100644 index 00000000..54b2e55e --- /dev/null +++ b/docs/SHIFTS_FILE_FORMAT.md @@ -0,0 +1,303 @@ +# XY Shifts File Format + + +--- + +## Overview + +The `shifts_xy.csv` file contains pairwise XY shifts between consecutive slices in a serial sectioning dataset. This file is essential for aligning slices during 3D reconstruction. + +--- + +## File Format + +### Location + +Generated by preprocessing pipeline at: `{output}/shifts_xy.csv` + +### Structure + +CSV format with header row: + +```csv +fixed_id,moving_id,x_shift,y_shift,x_shift_mm,y_shift_mm +0,1,156.234,-23.456,0.0234,-0.0035 +1,2,142.567,-18.234,0.0214,-0.0027 +2,3,161.890,-25.678,0.0243,-0.0039 +``` + +### Columns + +| Column | Type | Description | +|--------|------|-------------| +| `fixed_id` | int | Slice ID of the fixed (reference) slice | +| `moving_id` | int | Slice ID of the moving slice | +| `x_shift` | float | X shift in **pixels** | +| `y_shift` | float | Y shift in **pixels** | +| `x_shift_mm` | float | X shift in **millimeters** | +| `y_shift_mm` | float | Y shift in **millimeters** | + +--- + +## Shift Direction + +The shifts represent the **transformation needed to align the moving slice to the fixed slice**: + +``` +moving_slice_aligned = moving_slice + (x_shift, y_shift) +``` + +Or equivalently, the shifts represent the **position difference** between consecutive slices: + +``` +x_shift = mosaic_xmin[fixed] - mosaic_xmin[moving] +y_shift 
= mosaic_ymin[fixed] - mosaic_ymin[moving] +``` + +--- + +## Generation + +### Script + +```bash +linum_estimate_xy_shift_from_metadata.py [--n_processes N] +``` + +### Process + +1. Scans tile directory for all slices +2. Extracts stage positions from tile metadata +3. Computes mosaic origin (xmin, ymin) for each slice +4. Calculates pairwise shifts between consecutive slices +5. Converts to pixels using tile resolution +6. Writes CSV file + +### Source Code Reference + +```python +# From linum_estimate_xy_shift_from_metadata.py + +# Compute the shift between slices in mm +x_shifts_mm = [] +y_shifts_mm = [] +for i in range(n_slices - 1): + dx = xmin_mm[i] - xmin_mm[i + 1] + dy = ymin_mm[i] - ymin_mm[i + 1] + x_shifts_mm.append(dx) + y_shifts_mm.append(dy) + +# Convert the shifts in pixel +x_shift_px = np.array(x_shifts_mm) / tile_resolution[0] +y_shift_px = np.array(y_shifts_mm) / tile_resolution[1] +``` + +--- + +## Usage in Reconstruction + +### Common Space Alignment + +The `linum_align_mosaics_3d_from_shifts.py` script uses shifts to bring all slices into a common coordinate space: + +1. **Load shifts**: Read all pairwise shifts from CSV +2. **Compute cumulative shifts**: Sum shifts from first slice to each subsequent slice +3. **Determine common shape**: Find bounding box encompassing all aligned slices +4. **Apply shifts**: Translate each slice by its cumulative shift + +### Cumulative Shift Calculation + +```python +# Cumulative shifts (position relative to first slice) +cumsum = [0] # First slice has zero offset +for i in range(n_slices - 1): + cumsum.append(cumsum[i] + shifts[i]) + +# Example: +# Pairwise shifts: [10, 8, 12, 5] +# Cumulative: [0, 10, 18, 30, 35] +``` + +--- + +## Handling Missing or Skipped Slices + +### Problem + +If some slices are excluded from reconstruction, the pairwise shifts must be accumulated correctly. + +### Solution + +See [SLICE_CONFIG_FEATURE.md](SLICE_CONFIG_FEATURE.md) for the slice configuration system that handles this. 
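The accumulation rule can be sketched directly — the essential point is that pairwise shifts belonging to excluded slices are still summed:

```python
import numpy as np

def cumulative_shifts(pairwise, keep):
    """Sum pairwise shifts (slice i -> i+1) into absolute offsets.

    pairwise[i] is the shift from slice i to slice i+1; `keep` lists the
    slice ids that remain in the reconstruction.
    """
    cumsum = np.concatenate([[0.0], np.cumsum(pairwise)])
    return {s: cumsum[s] for s in keep}

# Slices 0..3 with shifts 0->1: 10, 1->2: 8, 2->3: 12; slice 2 is excluded
offsets = cumulative_shifts([10.0, 8.0, 12.0], keep=[0, 1, 3])
```

Slice 3 still lands at offset 30 (10 + 8 + 12) even though slice 2 is skipped.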
+ +### Example + +Original slices: 0, 1, 2, 3 +Shifts: 0→1: 10, 1→2: 8, 2→3: 12 + +If slice 2 is skipped: +- Processing slices: 0, 1, 3 +- Cumulative for slice 3: 10 + 8 + 12 = 30 (NOT just 12!) + +--- + +## Validation + +### Check File Contents + +```bash +# View shifts file +cat shifts_xy.csv + +# Count entries +wc -l shifts_xy.csv + +# Check for NaN or invalid values +grep -E "nan|inf|NaN" shifts_xy.csv +``` + +### Validate Against Slices + +```python +import pandas as pd + +# Load shifts +df = pd.read_csv('shifts_xy.csv') + +# Get all slice IDs mentioned +slice_ids = set(df['fixed_id'].tolist() + df['moving_id'].tolist()) +print(f"Slices in shifts file: {sorted(slice_ids)}") + +# Check for consecutive IDs +expected = set(range(min(slice_ids), max(slice_ids) + 1)) +missing = expected - slice_ids +if missing: + print(f"WARNING: Missing slice IDs: {missing}") +``` + +--- + +## Outlier Filtering + +### Problem + +The shifts file may contain erroneous large shifts due to: +- Stage positioning errors during acquisition +- Mechanical issues with the microscope stage +- Metadata recording errors + +These outliers cause slices to drift out of the common volume during alignment. 
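The IQR rule that the aligner applies can be sketched on the raw shift columns (a standalone NumPy illustration with hypothetical values, not the script's code):

```python
import numpy as np

def flag_outliers(x_mm, y_mm, iqr_multiplier=1.5):
    """Flag shifts whose magnitude exceeds Q3 + k*IQR."""
    mag = np.hypot(np.asarray(x_mm), np.asarray(y_mm))
    q1, q3 = np.percentile(mag, [25, 75])
    threshold = q3 + iqr_multiplier * (q3 - q1)
    return mag > threshold, threshold

# Five normal inter-slice shifts plus one stage glitch (index 4)
x = [0.020, 0.030, 0.025, 0.021, 1.800, 0.028]
y = [-0.003, -0.002, -0.004, -0.003, 0.900, -0.002]
outliers, thr = flag_outliers(x, y)
```

Only the glitch is flagged; the threshold adapts to each dataset's normal shift distribution instead of relying on a fixed cutoff.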
+ +### Solution: IQR-Based Filtering + +The `linum_align_mosaics_3d_from_shifts.py` script supports automatic outlier detection and correction: + +```bash +linum_align_mosaics_3d_from_shifts.py mosaics/ shifts_xy.csv output/ \ + --filter_outliers --outlier_method iqr +``` + +### Outlier Detection Methods + +| Method | Description | Best For | +|--------|-------------|----------| +| `iqr` | Uses Interquartile Range (Q3 + 1.5×IQR) to detect outliers, replaces with local median | Most cases (recommended) | +| `local` | Uses fixed threshold, replaces with local median of neighbors | When threshold is known | +| `median` | Uses fixed threshold, replaces with global median | Simple cases | +| `clamp` | Limits magnitude to max_shift_mm while preserving direction | When direction matters | +| `zero` | Replaces outliers with zero shift | When shifts are unreliable | + +### How IQR Detection Works + +```python +# Calculate shift magnitudes +magnitude = sqrt(x_shift_mm² + y_shift_mm²) + +# IQR-based threshold +Q1 = 25th percentile of magnitudes +Q3 = 75th percentile of magnitudes +IQR = Q3 - Q1 +threshold = Q3 + 1.5 × IQR + +# Outliers are shifts with magnitude > threshold +``` + +### Example + +For a typical dataset: +- Q1 = 0.115 mm, Q3 = 0.331 mm, IQR = 0.216 mm +- Threshold = 0.331 + 1.5 × 0.216 = 0.655 mm +- Any shift > 0.655 mm is flagged as an outlier + +### Pipeline Configuration + +In `nextflow.config`: + +```groovy +params { + filter_shift_outliers = true // Enable outlier filtering (recommended) + outlier_method = 'iqr' // Auto-detect using IQR + outlier_iqr_multiplier = 1.5 // IQR multiplier (standard: 1.5) + max_shift_mm = 0.5 // Only used if method != 'iqr' +} +``` + +--- + +## Troubleshooting + +### Shift Values Too Large + +**Symptom**: Large pixel shifts (>1000 pixels) +**Cause**: Stage position metadata may be incorrect or in wrong units +**Solution**: Check tile metadata; verify stage coordinates are in mm + +### Inconsistent Shift Signs + +**Symptom**: 
Shifts alternate between positive and negative +**Cause**: Stage movement direction may vary between slices +**Solution**: This is usually normal; verify reconstructed volume looks correct + +### Missing Slices in File + +**Symptom**: Some slice IDs missing from shifts file +**Cause**: Those slices weren't found during preprocessing +**Solution**: Check if raw tiles exist for missing slices; re-run preprocessing + +### Zero Shifts + +**Symptom**: All shifts are exactly 0 +**Cause**: Stage positions not available in tile metadata +**Solution**: Check that tiles have position metadata; use `use_stage_positions=True` + +--- + +## Related Files + +| File | Description | +|------|-------------| +| `slice_config.csv` | Controls which slices to include | +| `mosaic_grid_3d_z*.ome.zarr` | Mosaic grids aligned using these shifts | +| `3d_volume.ome.zarr` | Final reconstructed volume | + +--- + +## Example + +### Sample shifts_xy.csv + +```csv +fixed_id,moving_id,x_shift,y_shift,x_shift_mm,y_shift_mm +0,1,156.234,-23.456,0.0234,-0.0035 +1,2,142.567,-18.234,0.0214,-0.0027 +2,3,161.890,-25.678,0.0243,-0.0039 +3,4,148.123,-20.567,0.0222,-0.0031 +4,5,155.456,-22.890,0.0233,-0.0034 +``` + +### Interpretation + +- Slices: 0 through 5 (6 total slices) +- Average X shift: ~153 pixels (~0.023 mm) +- Average Y shift: ~-22 pixels (~-0.003 mm) +- Slices are shifting consistently in negative Y direction diff --git a/docs/SLICE_CONFIG_FEATURE.md b/docs/SLICE_CONFIG_FEATURE.md new file mode 100644 index 00000000..8f87295c --- /dev/null +++ b/docs/SLICE_CONFIG_FEATURE.md @@ -0,0 +1,642 @@ +# Slice Configuration Feature + + +--- + +## Overview + +The slice configuration feature allows users to control which slices are included in the 3D reconstruction pipeline. This is essential when: + +- Some slices have quality issues (motion artifacts, bad cuts, etc.) 
+- You want to reconstruct only a subset of the data +- Debugging reconstruction issues with specific slices +- Reviewing which slices have galvo shift artifacts detected + +--- + +## Problem Statement + +### Previous Behavior + +The reconstruction pipeline would process all slices found in the input directory. If some slices were problematic, users had to: + +1. Manually delete or move problematic slice files +2. Edit the `shifts_xy.csv` file to match remaining slices +3. Risk errors from mismatched slice IDs between files and shifts + +### The Bug + +The `linum_align_mosaics_3d_from_shifts.py` script had a critical bug where slice IDs extracted from filenames were used as array indices: + +```python +# BUG: slice IDs used as array indices +slice_ids -= np.min(slice_ids) # Normalize to start from 0 +img, res = read_omezarr(mosaics_files[slice_ids[0]]) # Uses ID as index! +dx_cumsum[id] # Assumes consecutive IDs +``` + +This caused errors when: +- Slices were non-consecutive (e.g., 0, 2, 5) +- The shifts file had more entries than slices being processed + +--- + +## Solution + +### Slice Configuration File + +A CSV file (`slice_config.csv`) controls which slices to include: + +**Basic format:** +```csv +slice_id,use,notes +00,true, +01,true, +02,false,Motion artifact during cut +03,true, +04,true, +05,false,Bad image quality +``` + +**With galvo detection (when `detect_galvo = true`):** +```csv +slice_id,use,galvo_confidence,galvo_fix,notes +00,true,0.234,false, +01,true,0.891,true, +02,false,0.756,true,Motion artifact +03,true,0.512,false, +04,true,0.189,false, +05,true,0.923,true, +``` + +### File Format + +| Column | Type | Description | +|--------|------|-------------| +| `slice_id` | string | Two-digit slice identifier (e.g., "00", "01") | +| `use` | boolean | `true`/`false` whether to include this slice | +| `galvo_confidence` | float | (Optional) Galvo detection confidence score (0-1) | +| `galvo_fix` | boolean | (Optional) Whether galvo fix would be applied 
| +| `notes` | string | Optional notes explaining exclusion | + +### Boolean Values + +The `use` column accepts: +- **True**: `true`, `1`, `yes` +- **False**: `false`, `0`, `no` + +### Galvo Detection Columns + +When `detect_galvo = true` in the preprocessing pipeline: +- **galvo_confidence**: Detection confidence (0.0 - 1.0) + - High (≥0.6): Galvo artifact likely present + - Low (<0.6): No clear artifact detected +- **galvo_fix**: Whether the fix would be applied during mosaic creation + - `true`: Confidence ≥ threshold, fix applied + - `false`: Confidence < threshold, fix skipped + +--- + +## Quality Assessment + +### Automatic Quality Detection + +The `linum_assess_slice_quality.py` script (and GPU version `linum_assess_slice_quality_gpu.py`) can analyze mosaic grids to detect quality issues and update the slice configuration. + +**Quality Metrics:** +| Metric | Weight | Description | +|--------|--------|-------------| +| **SSIM** | 50% | Structural Similarity with neighboring slices | +| **Edge Preservation** | 30% | Correlation of edge maps with expected structure | +| **Variance Ratio** | 20% | Consistency of signal variance | + +**Quality Score Formula:** +``` +Quality Score = 0.5 × SSIM + 0.3 × EdgeScore + 0.2 × VarianceScore +``` + +### Calibration Slice Detection + +The first slice in an acquisition is typically a **calibration slice** that is thicker than the others and should be excluded. The scripts support two methods: + +1. **Automatic exclusion**: Use `--exclude_first N` to exclude the first N slices (default: 1) +2. 
**Thickness detection**: Use `--detect_calibration` to automatically detect slices that are significantly thicker than the median + +### Quality Assessment with Extended Format + +When quality assessment is enabled, the config file includes additional columns: + +```csv +slice_id,use,quality_score,ssim_mean,edge_score,variance_score,depth,exclude_reason +00,false,0.000,0.000,0.000,0.000,120,calibration_slice +01,true,0.756,0.891,0.612,0.534,85, +02,false,0.198,0.231,0.112,0.198,85,low_quality +03,true,0.812,0.923,0.701,0.645,85, +``` + +### Usage Examples + +```bash +# Create new config with quality assessment (exclude first slice) +linum_assess_slice_quality.py /path/to/mosaics slice_config.csv + +# Update existing config with quality info +linum_assess_slice_quality.py /path/to/mosaics slice_config.csv \ + --update_existing --existing_config existing_config.csv + +# Automatically exclude low quality slices +linum_assess_slice_quality.py /path/to/mosaics slice_config.csv \ + --min_quality 0.3 + +# GPU-accelerated quality assessment +linum_assess_slice_quality_gpu.py /path/to/mosaics slice_config.csv + +# Report only (don't write file) +linum_assess_slice_quality.py /path/to/mosaics slice_config.csv --report_only +``` + +--- + +## Galvo Shift Detection Algorithm + +### Background: The Galvo Artifact Problem + +In serial OCT (SOCT) imaging, the galvo mirror physically sweeps across the sample during acquisition. At the end of each sweep, the mirror must return to its starting position. During this "galvo return" period, the system continues acquiring data, but this data represents the mirror's return path rather than the sample. 
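The effect of a mistimed galvo return can be made concrete with a toy A-line profile (illustrative values only, not acquisition code):

```python
import numpy as np

# Toy A-line profile: 100 image samples plus a 20-sample galvo-return band.
# In a clean acquisition the return band sits at the edge of the tile; a
# timing offset (here, a circular shift of 30 samples) moves it into the
# middle of the data, producing the banding artifact described below.
n_image, n_return = 100, 20
profile = np.concatenate([np.full(n_image, 50.0),   # tissue signal
                          np.full(n_return, 5.0)])  # galvo return (darker here)

shifted = np.roll(profile, 30)  # artifact: return band now sits mid-array

# The band boundaries show up as large neighbouring-sample differences
diffs = np.abs(np.diff(shifted))
print(np.argsort(diffs)[-2:])  # the two band boundaries: indices 9 and 29 (either order)
```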
+ +**The artifact occurs when:** +- The galvo return region is **not** at the edge of the raw tile data +- Instead, it appears somewhere in the middle of the A-line data +- This creates a visible intensity discontinuity (banding) in stitched images + +**The fix:** +- Apply a circular shift to move the galvo return region to the edge +- The shift amount is determined by detecting where the intensity discontinuity occurs + +### Detection Algorithm Overview + +The detection happens in two stages: + +1. **Shift Detection** (`detect_galvo_shift`): Find *where* the galvo return boundary is located +2. **Slice-level Detection** (`detect_galvo_for_slice`): Sample multiple tiles to find best detection + +### Tile Sampling Strategy + +When detecting galvo artifacts for a slice, the algorithm samples multiple tiles because: +- Background/empty tiles at the edges of the mosaic have low intensity +- Low-intensity tiles produce unreliable detection (low confidence) +- Tiles with actual tissue content give more reliable results + +**Key parameters:** +- `n_samples`: Number of tiles to sample (default: 5) +- `min_intensity`: Minimum mean intensity threshold (default: 20.0) +- Tiles below `min_intensity` are skipped as background + +The algorithm samples up to 3× more candidate tiles than `n_samples` to account for empty tiles, ensuring enough high-quality tiles are tested. 
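Once a shift has been detected, the correction itself is conceptually a circular shift along the A-line axis. A hypothetical sketch (`apply_galvo_fix` is not the library's actual function name; the real fix in `linumpy.preproc.xyzcorr` may differ in axis convention and sign):

```python
import numpy as np

def apply_galvo_fix(tile, shift):
    """Move the galvo-return band back to the edge of the tile.

    tile  : 2D array with the A-line axis first (sketch assumption)
    shift : detected galvo shift, as returned by the detection stage
    """
    # A circular shift is lossless: no samples are discarded, the
    # return region is simply relocated to the tile boundary.
    return np.roll(tile, -shift, axis=0)
```

Since `np.roll` is circular, the operation is reversible: `np.roll(fixed, shift, axis=0)` restores the original tile.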
### Stage 1: Shift Detection

```python
import numpy as np

def detect_galvo_shift(aip, n_pixel_return):
    # Compute average A-line profile from the AIP (Average Intensity Projection)
    profile = aip.mean(axis=1)

    # Intensity differences between neighbouring samples (sketch; the
    # library may compute these differently). The correct shift makes
    # both boundaries of the galvo return show high differences, since
    # they transition to/from image data.
    differences = np.abs(np.diff(profile))

    similarities = []
    for s in range(len(differences) - n_pixel_return):
        # Product of the differences at both boundaries of the candidate region
        similarities.append(differences[s] * differences[s + n_pixel_return])
    similarities = np.asarray(similarities)

    # The shift with the maximum similarity product is the correct one
    shift = int(np.argmax(similarities))

    # Illustrative confidence: how much the best product stands out from
    # the average (the library's actual confidence metric may differ)
    confidence = float(1.0 - similarities.mean() / (similarities[shift] + 1e-12))
    return shift, confidence
```

### Stage 2: Slice-Level Detection (`detect_galvo_for_slice`)

This function samples multiple tiles from a slice to find the most reliable galvo shift detection.

**Why sample multiple tiles?**
- Edge tiles in a mosaic often contain mostly background (air/mounting medium)
- Background tiles have low intensity and produce unreliable detection
- Tiles with actual tissue content give consistent, high-confidence results

**Sampling strategy:**
```python
def detect_galvo_for_slice(tiles, n_extra, threshold=0.6, n_samples=5,
                           axial_resolution=None, min_intensity=20.0):
    # Sample 3x more candidates than needed to account for empty tiles
    # (evenly spaced here; the library's sampling may differ)
    n_candidates = n_samples * 3
    candidate_indices = np.linspace(0, len(tiles) - 1, n_candidates).astype(int)

    valid_detections = []
    for idx in candidate_indices:
        tile = tiles[idx]

        # Skip low-intensity (background) tiles
        if tile.mean() < min_intensity:
            continue

        # Detect shift and confidence on this tile's AIP
        # (axis choice is illustrative)
        tile_aip = tile.mean(axis=0)
        shift, confidence = detect_galvo_shift(tile_aip, n_extra)

        # Keep track of valid detections
        if confidence >= threshold:
            valid_detections.append((shift, confidence))
        if len(valid_detections) >= n_samples:
            break

    if not valid_detections:
        return 0, 0.0

    # Return the detection with the highest confidence
    return max(valid_detections, key=lambda d: d[1])
```

**Key parameters:**
- `n_samples=5`: Number of valid (high-intensity) tiles to test
- `min_intensity=20.0`: Skip 
tiles with mean intensity below this value +- `threshold=0.6`: Minimum confidence required to consider detection valid + +### Stage 3: Artifact Presence Detection + +The key insight is that **not all slices have the galvo artifact**. Even when `fix_galvo_shift = true` in the pipeline, we should only apply the fix to slices that actually need it. + +**Detection Method: Intensity Discontinuity Analysis** + +The algorithm analyzes the raw tile AIP (Average Intensity Projection) looking for telltale signs of a misplaced galvo return region: + +``` +Image Region | Galvo Return | Image Region +-------------|--------------|------------- + normal | different | normal + intensity | intensity | intensity +``` + +**Three metrics are combined:** + +1. **Boundary Contrast (50% weight)** + - Compare intensity of the suspected return region vs. surrounding image regions + - The galvo return can be **brighter OR darker** than the image (depends on sample) + - Algorithm uses **absolute difference** to catch both cases + - A 15%+ relative intensity difference → high score + +2. **Edge Sharpness (30% weight)** + - Sharp intensity transitions at the return region boundaries + - Compare edge strength at boundaries vs. typical gradient in the image + - Strong edges at expected locations → higher confidence + +3. **Return Region Anomaly (20% weight)** + - How statistically different is the return region intensity? 
   - Uses z-score: `|image_intensity - return_intensity| / std`
   - z-score > 1.5 indicates significant anomaly

**Final Score Calculation:**
```python
artifact_score = (
    contrast_score * 0.50 +
    sharpness_score * 0.30 +
    anomaly_score * 0.20
)
```

### Score Interpretation

| Score Range | Interpretation | Action |
|-------------|----------------|--------|
| 0.0 - 0.2 | No clear discontinuity | Skip correction |
| 0.2 - 0.5 | Borderline, likely clean | Skip correction |
| 0.5 - 0.6 | Mild discontinuity | Consider correction |
| 0.6 - 1.0 | Clear discontinuity | Apply correction |

**Default threshold:** 0.6 (configurable via `galvo_confidence_threshold`)

### Why Absolute Difference Matters

Early versions of the algorithm assumed the galvo return region would always be **darker** than the image. However, in practice:

- Galvo return can be **brighter** if surrounding tissue is dark
- Galvo return can be **darker** if surrounding tissue is bright
- The key feature is the **discontinuity**, not the direction

Using absolute difference ensures both cases are detected:
```python
# CORRECT: Detects both brighter and darker return regions
relative_diff = abs(image_intensity - return_intensity) / image_intensity
```

### Raw Tiles vs. 
Stitched Images

**Important:** The galvo artifact appears differently in:

| View | Appearance |
|------|------------|
| **Raw tile AIP** | Intensity band at galvo return position |
| **Stitched mosaic** | Horizontal banding across the full image |

The detection runs on **raw tiles** (individual tiles rather than the stitched mosaic) because:
- More memory efficient (no need to load full mosaic)
- Galvo shift is consistent within a slice
- Direct access to the underlying data structure

---

## Implementation

### New Script: `linum_generate_slice_config.py`

Generates a slice configuration file from existing data:

```bash
# From mosaic grids directory
linum_generate_slice_config.py /path/to/mosaics slice_config.csv

# From raw tiles directory
linum_generate_slice_config.py /path/to/raw_tiles slice_config.csv --from_tiles

# From existing shifts file
linum_generate_slice_config.py /path/to/shifts_xy.csv slice_config.csv --from_shifts

# With pre-excluded slices
linum_generate_slice_config.py /path/to/shifts_xy.csv slice_config.csv --from_shifts --exclude 2 5

# With galvo detection (requires raw tiles)
linum_generate_slice_config.py /path/to/raw_tiles slice_config.csv --from_tiles --detect_galvo

# From shifts file with galvo detection
linum_generate_slice_config.py /path/to/shifts_xy.csv slice_config.csv --from_shifts \
    --detect_galvo --tiles_dir /path/to/raw_tiles

# With custom galvo threshold
linum_generate_slice_config.py /path/to/raw_tiles slice_config.csv --from_tiles \
    --detect_galvo --galvo_threshold 0.4
```

### Updated Script: `linum_align_mosaics_3d_from_shifts.py`

Fixed bugs and added slice config support:

```bash
# Without slice config (backward compatible)
linum_align_mosaics_3d_from_shifts.py inputs shifts_xy.csv output

# With slice config
linum_align_mosaics_3d_from_shifts.py inputs shifts_xy.csv output --slice_config slice_config.csv
```

**Key Improvements:**
- Matches slice IDs from filenames to shifts 
file by ID (not array index) +- Properly accumulates shifts when intermediate slices are skipped +- Validates slices against shifts file +- Prints informative messages about processing + +### Updated Preprocessing Pipeline + +New parameters and process: + +```groovy +// Parameters in nextflow.config +params.generate_slice_config = true // Generate slice_config.csv +params.detect_galvo = false // Include galvo detection in slice_config +params.galvo_confidence_threshold = 0.6 // Threshold for galvo fix + +// Process +process generate_slice_config { + publishDir "$params.output", mode: 'copy' + + input: + tuple path(shifts_file), path(input_dir) + + output: + path("slice_config.csv") + + script: + String galvo_opts = params.detect_galvo ? + "--detect_galvo --tiles_dir ${input_dir} --galvo_threshold ${params.galvo_confidence_threshold}" : "" + """ + linum_generate_slice_config.py ${shifts_file} slice_config.csv --from_shifts ${galvo_opts} + """ +} +``` + +### Updated Reconstruction Pipeline + +```groovy +// Parameter in nextflow.config +params.slice_config = "" // Optional path to slice_config.csv + +// Workflow logic +def slicesToUse = null +if (params.slice_config && params.slice_config != "") { + slicesToUse = parseSliceConfig(params.slice_config) + log.info "Slice config loaded: using ${slicesToUse.size()} slices" +} + +// Filter input slices +inputSlices = channel + .fromFilePairs(...) + .filter { slice_id, _files -> + if (slicesToUse != null) { + return slicesToUse.contains(slice_id) + } + return true + } +``` + +--- + +## Cumulative Shift Handling + +### Problem + +The shifts file contains pairwise shifts between consecutive slices: + +```csv +fixed_id,moving_id,x_shift,y_shift,x_shift_mm,y_shift_mm +0,1,10,5,0.01,0.005 +1,2,8,3,0.008,0.003 +2,3,12,7,0.012,0.007 +``` + +If slice 2 is excluded, the shift from slice 1 to slice 3 must be the **sum** of shifts 1→2 and 2→3. + +### Solution + +The `build_cumulative_shifts()` function: + +1. 
Builds cumulative shifts for **all** slices in the shifts file +2. Extracts only the values for selected slices + +```python +def build_cumulative_shifts(shifts_df, selected_slice_ids, resolution): + # Build cumulative for ALL slices first + cumsum_all = {all_slice_ids[0]: (0.0, 0.0)} + for i in range(len(all_slice_ids) - 1): + fixed_id = all_slice_ids[i] + moving_id = all_slice_ids[i + 1] + dx_mm, dy_mm = shift_lookup[(fixed_id, moving_id)] + prev_dx, prev_dy = cumsum_all[fixed_id] + cumsum_all[moving_id] = (prev_dx + dx_mm, prev_dy + dy_mm) + + # Extract only for selected slices + cumsum_selected = {} + for slice_id in selected_slice_ids: + cumsum_selected[slice_id] = cumsum_all[slice_id] + + return cumsum_selected +``` + +### Example + +Original slices: 0, 1, 2, 3, 4, 5 +Shifts: 0→1: (10, 5), 1→2: (8, 3), 2→3: (12, 7), 3→4: (5, 2), 4→5: (9, 4) + +Cumulative shifts for all slices: +- Slice 0: (0, 0) +- Slice 1: (10, 5) +- Slice 2: (18, 8) +- Slice 3: (30, 15) +- Slice 4: (35, 17) +- Slice 5: (44, 21) + +If processing only slices 0, 2, 4: +- Slice 0: (0, 0) +- Slice 2: (18, 8) ← Correct! Sum of 0→1 + 1→2 +- Slice 4: (35, 17) ← Correct! Sum of all up to slice 4 + +--- + +## Usage Workflow + +### For New Datasets + +1. Run preprocessing pipeline (generates `slice_config.csv` automatically) +2. Review and edit `slice_config.csv` if needed +3. Run reconstruction pipeline + +```bash +# Preprocessing (generates slice_config.csv) +nextflow run preproc_rawtiles.nf --input /raw/data --output /output + +# Review slice config +cat /output/slice_config.csv + +# Edit if needed (set use=false for bad slices) +nano /output/slice_config.csv + +# Reconstruction +nextflow run soct_3d_reconst.nf \ + --input /output \ + --slice_config /output/slice_config.csv +``` + +### For Existing Datasets + +1. Generate slice config from existing files +2. Edit to exclude problematic slices +3. 
Run reconstruction + +```bash +# Generate config from existing shifts file +linum_generate_slice_config.py /output/shifts_xy.csv slice_config.csv --from_shifts + +# Edit to exclude slice 2 +# slice_id,use,notes +# 00,true, +# 01,true, +# 02,false,Bad quality +# 03,true, + +# Reconstruction with slice config +nextflow run soct_3d_reconst.nf \ + --input /output \ + --slice_config slice_config.csv +``` + +--- + +## Backward Compatibility + +- **Slice config is optional**: If not provided, all slices are processed +- **Preprocessing parameter**: Set `generate_slice_config = false` to skip generation +- **Existing pipelines**: Continue to work without modification + +--- + +## Validation + +The pipeline performs validation: + +1. **Slice config parse errors**: Reports malformed CSV files +2. **Missing slices in shifts**: Warns if selected slices aren't in shifts file +3. **Empty selection**: Errors if no slices remain after filtering +4. **Logging**: Prints which slices are included/excluded + +--- + +## Files Changed + +| File | Changes | +|------|---------| +| `scripts/linum_generate_slice_config.py` | **NEW** - Generate slice config | +| `scripts/linum_align_mosaics_3d_from_shifts.py` | Fixed indexing bug, added `--slice_config` | +| `workflows/preproc/preproc_rawtiles.nf` | Added `generate_slice_config` process | +| `workflows/reconst_3d/soct_3d_reconst.nf` | Added slice filtering logic | +| `workflows/reconst_3d/nextflow.config` | Added `slice_config` parameter | + +--- + +## Testing + +```bash +# Test help +linum_generate_slice_config.py --help + +# Test generation from shifts file +linum_generate_slice_config.py shifts_xy.csv test_config.csv --from_shifts + +# Test with exclusions +linum_generate_slice_config.py shifts_xy.csv test_config.csv --from_shifts --exclude 1 2 + +# Test with galvo detection +linum_generate_slice_config.py /path/to/tiles test_config.csv --from_tiles --detect_galvo + +# Test align script with config +linum_align_mosaics_3d_from_shifts.py 
inputs shifts_xy.csv output --slice_config test_config.csv +``` + +--- + +## Troubleshooting Galvo Detection + +### Problem: Galvo fix not applied despite `galvo_fix=true` + +**Symptoms:** +- `slice_config.csv` shows `galvo_fix=true` with confidence > 0.5 +- Log shows "No galvo fix needed for slice" or shift=0 +- Banding artifacts still visible in mosaic + +**Cause:** +The detection during mosaic creation re-samples tiles and may sample different tiles than the original detection. If it samples background tiles (low intensity), it gets low confidence and returns shift=0. + +**Solutions:** +1. The detection skips tiles with mean intensity < 20.0 +2. Detection samples 3× more candidate tiles to find enough high-intensity ones + +### Problem: Low confidence scores for all tiles + +**Symptoms:** +- All tiles return confidence < 0.6 +- `galvo_fix` set to `false` for all slices + +**Possible causes:** +1. Tiles genuinely don't have galvo artifact (correct behavior) +2. Very homogeneous tissue with weak intensity variation +3. `n_extra` parameter doesn't match actual galvo return size + +**Debugging:** +```python +from linumpy.preproc.xyzcorr import detect_galvo_for_slice + +# Test with verbose output +shift, conf = detect_galvo_for_slice( + tiles, n_extra=40, threshold=0.6, n_samples=5, min_intensity=20.0 +) +print(f"shift={shift}, confidence={conf}") +``` diff --git a/docs/SLICE_INTERPOLATION_FEATURE.md b/docs/SLICE_INTERPOLATION_FEATURE.md new file mode 100644 index 00000000..188fd5e6 --- /dev/null +++ b/docs/SLICE_INTERPOLATION_FEATURE.md @@ -0,0 +1,360 @@ +# Slice Interpolation Feature + + +--- + +## Overview + +The slice interpolation feature reconstructs missing or degraded slices in Serial OCT datasets using information from adjacent slices. This is particularly useful when a single physical slice was lost or damaged during acquisition but surrounding slices are intact. 
+ +**Key Features**: +- Pure interpolation from neighboring slices when slice is completely missing +- **Quality-weighted blending** with degraded slices that have partial usable data +- Automatic quality assessment to determine how much to trust degraded data + +**Important Limitation**: This method only works for **single missing slices**. When two or more consecutive slices are missing, there is insufficient information for accurate reconstruction. + +--- + +## Background + +### The Problem + +In Serial Optical Coherence Tomography (SOCT), as described in Lefebvre et al. (2017)[^1], mouse brains are imaged through serial sectioning with a ~200 µm slice interval. Occasionally, a slice may be: + +- Damaged during cutting +- Lost during handling +- Unusable due to imaging artifacts + +When a single slice is missing, the gap can be filled using registration-based interpolation. + +### Scientific Basis + +The implemented method is based on **registration-based morphing interpolation**, which: + +1. Registers the slice before the gap to the slice after +2. Computes a "half-transform" representing the midpoint +3. Warps both adjacent slices toward the midpoint +4. 
Blends the warped volumes + +This approach is inspired by: +- **Video frame interpolation** techniques (Bao et al., CVPR 2019) +- **Shape-based interpolation** in medical imaging (Lee et al., IEEE TMI 1991) +- **Motion-compensated interpolation** for volumetric data + +--- + +## Method Details + +### Registration-Based Morphing (Recommended) + +``` +Slice N-1 (before) Slice N+1 (after) + \ / + \ / + ↓ ↓ + [Find best overlap planes] + (NCC search at volume boundaries) + ↓ + [Register best plane pair (2D affine)] + ↓ + [Compute half-transform H] + ↓ + ┌─────────┴─────────┐ + ↓ ↓ +[Warp N-1 by H] [Warp N+1 by H⁻¹] + ↓ ↓ + └─────────┬─────────┘ + ↓ + [Blend 50/50] + ↓ + Interpolated Slice N +``` + +#### Overlap Plane Selection + +In serial sectioning, the physically adjacent tissue is near the **bottom** of +vol_before and the **top** of vol_after (the surface exposed after each cut). +Because the exact cut depth can vary slightly, `find_best_overlap_planes` searches +the last `overlap_search_window` z-planes of vol_before against the first +`overlap_search_window` z-planes of vol_after using normalized cross-correlation +on the central ROI. The pair with the highest correlation is selected for +registration. + +The correlation score also acts as a **quality gate**: if no pair exceeds +`min_overlap_correlation` (default 0.1), the volumes cannot be reliably aligned and +the method falls back to a simple average rather than producing a bad warp. 
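The boundary search described above can be sketched in plain NumPy (a simplified illustration — the library's `find_best_overlap_planes` restricts the comparison to a central ROI and may differ in details):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally shaped 2D planes."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float((a * b).mean())

def find_best_overlap_planes(vol_before, vol_after, window=5):
    """Search the last `window` z-planes of vol_before against the first
    `window` z-planes of vol_after; return (i_before, i_after, score)."""
    nz = vol_before.shape[0]
    best = (nz - 1, 0, -np.inf)
    for i in range(nz - window, nz):
        for j in range(window):
            score = ncc(vol_before[i], vol_after[j])
            if score > best[2]:
                best = (i, j, score)
    return best
```

The returned score is what the quality gate compares against `min_overlap_correlation`: a score below the threshold means no plane pair aligns well enough to trust the registration.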
### Half-Transform Computation

For an affine transform with center *c*, matrix *M*, and translation *t*:

```
T(x) = M(x - c) + c + t
```

The half-transform *H* satisfying *H(H(x)) = T(x)* is computed as:

```python
import numpy as np

# Matrix square root via eigendecomposition
eigenvalues, eigenvectors = np.linalg.eig(M)
half_matrix = eigenvectors @ np.diag(np.sqrt(eigenvalues)) @ np.linalg.inv(eigenvectors)

# Correct half-translation: (H_m + I) * h_t = t
half_translation = np.linalg.solve(half_matrix + np.eye(dim), t)
```

Note: a naive `t / 2` is only correct for pure translations. The solve-based
formula correctly accounts for the interaction between the matrix and translation
components for any affine transform.

### Alternative Methods

| Method | Description | Use Case |
|--------|-------------|----------|
| `registration` | Registration-based morphing | **Recommended** - handles deformation |
| `average` | Simple 50/50 pixel average | Fast baseline, no deformation handling |
| `weighted` | Gaussian-smoothed average | Reduces discontinuities |

### Blending Methods

When combining the warped before and after volumes, two blending methods are available:

| Blend Method | Description |
|--------------|-------------|
| `gaussian` | **Recommended.** Uses distance transform to create feathered edges. Interior pixels get more weight than edge pixels, eliminating visible double-edge artifacts. |
| `linear` | Simple 50/50 blend. Fast but may show visible edges where the two warped volumes have different boundaries. |

---

## Degraded Slice Support

When a slice exists but has quality issues (motion artifacts, partial data loss, etc.), it can still provide valuable structural information. The interpolation can blend the degraded data with the pure interpolation result.
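A minimal sketch of this quality-gated blending (the helper name `blend_with_degraded` is hypothetical, and using the quality score directly as the blend weight is a simplification; in the pipeline the score comes from the automatic assessment described in this section):

```python
import numpy as np

def blend_with_degraded(interpolated, degraded, quality_score,
                        min_quality_threshold=0.2, weight=None):
    """Quality-gated blend of a degraded slice with the interpolated one.

    Sketch only: the quality score is simply passed in here rather than
    computed from SSIM / edge / variance metrics. `weight` overrides the
    automatic quality-derived weight when given.
    """
    if quality_score < min_quality_threshold:
        # Too damaged: fall back to pure interpolation
        return interpolated
    w = quality_score if weight is None else weight
    return w * degraded + (1.0 - w) * interpolated
```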
+ +### Automatic Quality Assessment + +The quality of a degraded slice is automatically assessed using three metrics: + +| Metric | Weight | Description | +|--------|--------|-------------| +| **SSIM** | 50% | Structural Similarity with neighboring slices | +| **Edge Preservation** | 30% | Correlation of edge maps with expected structure | +| **Variance Ratio** | 20% | Consistency of signal variance | + +``` +Quality Score = 0.5 × SSIM + 0.3 × EdgeScore + 0.2 × VarianceScore +``` + +### Quality Thresholding + +- If `quality_score >= min_quality_threshold`: blend degraded with interpolated +- If `quality_score < min_quality_threshold`: use pure interpolation (degraded is too damaged) + +Default threshold: **0.2** (very damaged slices are rejected) + +### Blending Formula + +```python +final = quality_weight × degraded + (1 - quality_weight) × interpolated +``` + +Where `quality_weight` is either: +- Automatically computed from quality assessment +- Manually specified via `--degraded_weight` + +--- + +## Usage + +### Enable in Pipeline + +```bash +nextflow run soct_3d_reconst.nf \ + --input /path/to/mosaics \ + --interpolate_missing_slices true \ + --interpolation_method registration +``` + +### Configuration Parameters + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `interpolate_missing_slices` | `true` | Enable interpolation | +| `interpolation_method` | `'registration'` | Method: `registration`, `average`, `weighted` | +| `interpolation_blend_method` | `'gaussian'` | Blend method: `gaussian` (feathered), `linear` (50/50) | +| `interpolation_registration_metric` | `'MSE'` | Registration metric: `MSE`, `CC`, `MI` | +| `interpolation_max_iterations` | `1000` | Max registration iterations | +| `interpolation_overlap_search_window` | `5` | Z-planes to search at each boundary for best overlap pair | +| `interpolation_min_overlap_correlation` | `0.1` | Min cross-correlation to proceed with registration; below this falls back to simple 
average | +| `interpolation_use_degraded` | `true` | Use degraded slice data if available | +| `interpolation_min_quality` | `0.2` | Minimum quality score to use degraded slice | + +### Standalone Script + +```bash +# Basic usage (pure interpolation) +linum_interpolate_missing_slice.py slice_z00.ome.zarr slice_z02.ome.zarr \ + slice_z01_interpolated.ome.zarr + +# With options +linum_interpolate_missing_slice.py slice_z00.ome.zarr slice_z02.ome.zarr \ + slice_z01_interpolated.ome.zarr \ + --method registration \ + --registration_metric MSE \ + --max_iterations 1000 \ + --blend_method linear + +# With degraded slice (automatic quality assessment) +linum_interpolate_missing_slice.py slice_z00.ome.zarr slice_z02.ome.zarr \ + slice_z01_interpolated.ome.zarr \ + --degraded_slice slice_z01_damaged.ome.zarr + +# With manual quality weight override +linum_interpolate_missing_slice.py slice_z00.ome.zarr slice_z02.ome.zarr \ + slice_z01_interpolated.ome.zarr \ + --degraded_slice slice_z01_damaged.ome.zarr \ + --degraded_weight 0.4 + +# With custom quality threshold +linum_interpolate_missing_slice.py slice_z00.ome.zarr slice_z02.ome.zarr \ + slice_z01_interpolated.ome.zarr \ + --degraded_slice slice_z01_damaged.ome.zarr \ + --min_quality_threshold 0.3 +``` + +--- + +## Gap Detection + +The pipeline automatically detects gaps in the slice sequence: + +```groovy +// Example: slices 00, 01, 03, 04 present +// Detected: slice 02 missing (single gap - can interpolate) + +// Example: slices 00, 01, 04, 05 present +// Detected: slices 02, 03 missing (multiple gap - CANNOT interpolate) +``` + +### Gap Detection Logic + +1. Extract numeric slice IDs from filenames +2. Sort slice IDs +3. Find gaps where `next_id - current_id == 2` (single gap) +4. 
Warn if `next_id - current_id > 2` (multiple gap) + +--- + +## Workflow Integration + +The interpolation happens after `bring_to_common_space` and before `register_pairwise`: + +``` +bring_to_common_space + ↓ +[detect_gaps] + ↓ +┌───────┴───────┐ +│ Single gap? │ +└───────┬───────┘ + ↓ yes +interpolate_missing_slice + ↓ +[merge with existing slices] + ↓ +register_pairwise + ↓ +stack +``` + +--- + +## Limitations + +### Only Single Gaps + +❌ **Cannot interpolate** when 2+ consecutive slices are missing: +- Insufficient information to estimate intermediate content +- Deformation between non-adjacent slices is too large + +### Structural Assumptions + +The method assumes: +- Adjacent slices have similar content +- Tissue deformation between slices is smooth +- No major structural changes between slices + +### Quality Considerations + +- Interpolated slices are **estimates**, not measured data +- Quality depends on registration accuracy +- Fine structures may be blurred or misaligned +- Consider marking interpolated slices in downstream analysis + +--- + +## Output + +### Interpolated Slice Naming + +``` +slice_z{missing_id}_interpolated.ome.zarr +``` + +Example: If slice 02 is missing, output is `slice_z02_interpolated.ome.zarr` + +### Published Location + +``` +{output}/interpolate_missing_slice/ +└── slice_z02_interpolated.ome.zarr +``` + +--- + +## Validation + +### Visual Inspection + +Compare interpolated slice with adjacent slices: + +```bash +# View all three slices +napari slice_z01.ome.zarr slice_z02_interpolated.ome.zarr slice_z03.ome.zarr +``` + +### Quantitative Metrics + +If ground truth is available (e.g., from a complete dataset): + +```python +from skimage.metrics import structural_similarity as ssim +from skimage.metrics import peak_signal_noise_ratio as psnr + +# Compare interpolated to ground truth +ssim_score = ssim(interpolated, ground_truth) +psnr_score = psnr(ground_truth, interpolated) +``` + +--- + +## References + +[^1]: J. Lefebvre, A. 
Castonguay, P. Pouliot, M. Descoteaux, and F. Lesage, "Whole mouse brain imaging using optical coherence tomography: reconstruction, normalization, segmentation, and comparison with diffusion MRI," *Neurophotonics*, vol. 4, no. 4, p. 041501, July 2017, doi: 10.1117/1.NPh.4.4.041501.

### Additional References

- Lee, T. Y., & Wang, W. H. (2000). "Morphology-based three-dimensional interpolation." IEEE TMI, 19(7), 711-720.
- Raya, S. P., & Udupa, J. K. (1990). "Shape-Based Interpolation of Multidimensional Objects." IEEE TMI, 9(1), 32-42.
- Bao, W., et al. (2019). "Depth-Aware Video Frame Interpolation." CVPR.
- Penney, G. P., et al. (1998). "A comparison of similarity measures for use in 2-D-3-D medical image registration." IEEE TMI, 17(4), 586-595.

---

## Files

| File | Description |
|------|-------------|
| `scripts/linum_interpolate_missing_slice.py` | Standalone interpolation script |
| `workflows/reconst_3d/soct_3d_reconst.nf` | Pipeline with interpolation process |
| `workflows/reconst_3d/nextflow.config` | Configuration with interpolation params |