
A framework for cryo-EM micrographs using semantic segmentation trained by projection of reconstructed 3D mask.

phonchi/CryoParticleSegment

 
 


CRISP

A Modular Framework for Cryo-EM Image Segmentation and Processing with Conditional Random Fields

CRISP is a modular framework that facilitates experimentation with advanced image segmentation strategies for cryo-electron microscopy (cryo-EM). It streamlines the process of generating high-quality segmentation maps and integrates seamlessly with downstream workflows for particle picking and analysis.


Features

  • Automated Segmentation Mask Generation: Automatically create high-quality reference segmentation maps.
  • Modular Segmentation Package: Customize and experiment with a variety of segmentation strategies.
  • Advanced CRF Layer: Integrates a novel Conditional Random Fields layer utilizing a regularized Frank-Wolfe algorithm and class-discriminative features to refine coarse pixel-level predictions.
  • Center Finding Algorithm with Hyperparameter Search: Integrates several center-finding algorithms for downstream particle picking tasks, with optimal configurations determined through our proposed hyperparameter search algorithms.

Manuscript

  • Title: CRISP: A Framework for Cryo-EM Image Segmentation and Processing with Conditional Random Field
  • Authors: Szu-Chi Chung and Po-Cheng Chou
  • Read the Manuscript


Installation

While we recommend using Google Colab for the best user experience and easier setup, you can also install CRISP locally by following these steps:

Prerequisites

  • Anaconda Python Distribution
    If you don’t have it installed, download it from here.

Create a Conda Environment

conda create --name CRISP python=3.10
conda activate CRISP

Install Dependencies

pip install mrcfile torch scikit-image ipython_genutils notebook segmentation_models_pytorch

Note: Additional dependencies may be required depending on your specific use case. Please refer to the individual notebook files for any specialized requirements.
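To confirm that the packages listed above are importable in the active environment, a small check along the following lines may help. This is a sketch rather than part of the repository: the helper name `find_missing` is hypothetical, and the module list simply mirrors the pip command (note that scikit-image is imported as `skimage`).

```python
import importlib.util

# Import names for the packages in the pip command above
# (scikit-image is imported as "skimage").
REQUIRED_MODULES = [
    "mrcfile",
    "torch",
    "skimage",
    "ipython_genutils",
    "notebook",
    "segmentation_models_pytorch",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Any name reported as missing can be installed with pip inside the activated CRISP environment.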


Setup

Clone the repository and change into the project directory:

git clone https://github.com/phonchi/CryoParticleSegment.git
cd CryoParticleSegment

Supported Components

| Component | Short Description |
|---|---|
| Segmentation Mask Generation Pipeline | (Automates the creation of training masks from particle coordinates) |
| Projection-Based Mask Generation | Generates masks by creating an initial 3D map from a few particle coordinates, reprojecting a 3D mask into 2D, and binarizing the result. This is the recommended default method. |
| Circle-Based Mask Approximation | An alternative method for lower-quality data that generates coarse masks by drawing fixed-radius circles at particle coordinates, requiring no 3D reconstruction. |
| Synthetic Dataset Simulation | Creates artificial micrographs and corresponding perfect masks from known 3D structures, simulating realistic noise levels (e.g., SNR 0.005) for controlled experiments and validation. |
| Preprocessing | (Prepares images for efficient model training) |
| Patch-based approach | Divides large micrographs into smaller patches (e.g., $512 \times 512$ pixels). During training, random patches are used to save memory; during inference, overlapping patches are processed and merged with a soft weighting scheme for a seamless result. |
| Modular Segmentation Pipeline | (The core deep learning framework for segmenting particles) |
| Segmentation Models & Encoders | A flexible framework allowing users to combine various models and encoders. The default configuration uses a UNet++ model with an EfficientNet-B5 encoder. See: Supported segmentation models and Supported encoders. |
| Loss Functions | Supports multiple loss functions, with Dice Loss as the default to handle class imbalance. It also includes Tversky Loss to fine-tune the trade-off between false positives and false negatives. See: Supported Loss functions. |
| Optimizer & Scheduler | Employs the Adam optimizer with a One-Cycle Learning rate scheduler to facilitate faster and more stable model training. See: Supported Optimizers. |
| Spatial Regularization | (Refines the model's raw output for cleaner, more accurate boundaries) |
| CRF / CD-CRF Layer | A Conditional Random Field (CRF) layer is used as a post-processing step to refine coarse segmentation maps by enforcing spatial consistency. The novel Class-Discriminative CRF (CD-CRF) uses learned, class-discriminative features from the CNN instead of raw pixel intensities, making it more robust to high noise levels in cryo-EM data. |
| CRF Solvers | Supports both the classic mean-field solver and a regularized Frank-Wolfe algorithm, which is the default due to its faster convergence and improved performance. |
| Center-Finding | (Extracts final particle coordinates from the segmentation map) |
| Traditional Computer Vision Methods | An algorithm that processes the segmentation map with normalization and erosion, detects object contours, filters them by area, and identifies particle centers from the minimal enclosing circle of each valid contour. |
| Crocker-Grier Algorithm | A classic blob detection method that identifies initial centers as local brightness maxima and refines their positions by calculating the intensity-weighted centroid. Overlapping blobs are resolved by discarding the one with the smaller "mass". |
| Non-Maximum Suppression (NMS) | A method where each pixel's probability is a confidence score. The algorithm retains only the highest-scoring candidate in local grids and then suppresses (discards) candidates that have significant overlap with a higher-scoring neighbor. |
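The circle-based mask approximation described above can be illustrated with a few lines of NumPy. This is a minimal sketch, not the repository's implementation: the function name `circle_mask` is hypothetical, and centers are assumed to be given as (row, column) pixel positions, so coordinates stored as (x, y) would need to be swapped first.

```python
import numpy as np


def circle_mask(shape, centers, radius):
    """Binary mask with a filled circle of `radius` pixels at each (row, col) center."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    mask = np.zeros(shape, dtype=np.uint8)
    for r, c in centers:
        # Mark every pixel whose distance to the center is at most `radius`.
        mask[(rows - r) ** 2 + (cols - c) ** 2 <= radius ** 2] = 1
    return mask


# Example: two particles on a 64 x 64 micrograph, radius 5 pixels.
mask = circle_mask((64, 64), [(10, 10), (40, 50)], radius=5)
```

Because the radius is fixed, the resulting mask is coarse by design; it trades boundary accuracy for not needing a 3D reconstruction.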
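For the synthetic dataset row, one way to reach a target noise level such as SNR 0.005 is sketched below. It assumes SNR is defined as signal variance divided by noise variance and that the noise is additive white Gaussian; the simulation scripts in the repository may define or model noise differently. The function name `add_noise_at_snr` is hypothetical.

```python
import numpy as np


def add_noise_at_snr(clean, snr, rng=None):
    """Add white Gaussian noise so that var(clean) / var(noise) is about `snr`."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumed definition: SNR = signal variance / noise variance.
    noise_std = np.sqrt(np.var(clean) / snr)
    return clean + rng.normal(0.0, noise_std, size=clean.shape)
```

At SNR 0.005 the noise variance is 200 times the signal variance, which is why particles are barely visible by eye in such micrographs.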
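The patch-based inference step (overlapping patches merged with a soft weighting scheme) can be sketched as a weighted average over tiles. The exact weighting used by CRISP is not stated here, so this example assumes a separable Hann-style window; `soft_window`, `patch_starts`, `predict_tiled`, and the `predict_patch` callback are hypothetical names, and the stride is assumed to be no larger than the patch size.

```python
import numpy as np


def soft_window(size, eps=1e-3):
    """2D weight that peaks at the patch center and falls off toward the edges."""
    w = np.hanning(size + 2)[1:-1]  # drop the zero-valued endpoints
    return np.outer(w, w) + eps


def patch_starts(length, patch, stride):
    """Start offsets covering [0, length), with the last patch flush to the edge."""
    if length <= patch:
        return [0]
    starts = list(range(0, length - patch, stride))
    starts.append(length - patch)
    return starts


def predict_tiled(image, predict_patch, patch=512, stride=256):
    """Run `predict_patch` on overlapping tiles and blend the results."""
    h, w = image.shape
    acc = np.zeros((h, w), dtype=np.float64)
    weight = np.zeros((h, w), dtype=np.float64)
    win = soft_window(patch)
    for top in patch_starts(h, patch, stride):
        for left in patch_starts(w, patch, stride):
            tile = image[top:top + patch, left:left + patch]
            th, tw = tile.shape
            wt = win[:th, :tw]
            acc[top:top + th, left:left + tw] += wt * predict_patch(tile)
            weight[top:top + th, left:left + tw] += wt
    # Every pixel is covered at least once, so the weights are strictly positive.
    return acc / weight
```

Down-weighting patch borders reduces visible seams, since predictions near a patch edge see less context than those near its center.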
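The relationship between the Dice and Tversky losses mentioned in the table can be made concrete with a small NumPy sketch. It is illustrative only and not the loss code used for training; in this sketch `alpha` weights false positives and `beta` weights false negatives, and setting both to 0.5 recovers the Dice loss.

```python
import numpy as np


def tversky_loss(pred, target, alpha=0.5, beta=0.5, eps=1e-7):
    """1 - TP / (TP + alpha * FP + beta * FN) for soft predictions in [0, 1]."""
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    tp = np.sum(pred * target)
    fp = np.sum(pred * (1.0 - target))
    fn = np.sum((1.0 - pred) * target)
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)


def dice_loss(pred, target, eps=1e-7):
    """Dice loss, i.e. the Tversky loss with alpha = beta = 0.5."""
    return tversky_loss(pred, target, alpha=0.5, beta=0.5, eps=eps)
```

Raising `alpha` above `beta` penalizes false positives more heavily, while the reverse favors recall, which is the trade-off the Tversky loss exposes.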
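The NMS-based center finding in the last row can be approximated as follows. This is a simplified sketch under stated assumptions: "significant overlap" is replaced by a minimum center-to-center distance, and the names `grid_candidates` and `nms_centers` and the parameters `cell`, `threshold`, and `min_dist` are hypothetical rather than taken from the repository.

```python
import numpy as np


def grid_candidates(prob, cell, threshold):
    """Best-scoring pixel of each cell x cell block, if it exceeds `threshold`."""
    cands = []
    h, w = prob.shape
    for top in range(0, h, cell):
        for left in range(0, w, cell):
            block = prob[top:top + cell, left:left + cell]
            r, c = np.unravel_index(np.argmax(block), block.shape)
            score = float(block[r, c])
            if score > threshold:
                cands.append((score, top + int(r), left + int(c)))
    return cands


def nms_centers(prob, cell=8, threshold=0.5, min_dist=20.0):
    """Greedy suppression: keep a candidate only if no kept center is too close."""
    kept = []
    # Visit candidates from highest to lowest confidence.
    for score, r, c in sorted(grid_candidates(prob, cell, threshold), reverse=True):
        if all((r - kr) ** 2 + (c - kc) ** 2 >= min_dist ** 2 for _, kr, kc in kept):
            kept.append((score, r, c))
    return [(r, c) for _, r, c in kept]
```

In practice `min_dist` would be chosen relative to the particle diameter, which is the kind of setting a hyperparameter search could tune.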

Tutorials and Guides

For detailed documentation and analysis on both synthetic and real datasets, check out our Example Notebook.


Data

  • Synthetic Data: Generate synthetic data using the scripts in the simulation directory.
  • Tested Data: Download tested data from CryoPPP.

License

CRISP is open-source software released under the GNU General Public License, Version 3.


Credits
