Python class for improved detection of cells using cellpose in shallow 3D stacks using K-means clustering

thomasmichaelkane/haystack


Logo

haystack

Finding cells in 3D stacks using cellpose and K-means clustering
Explore the docs »

Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. License
  5. Contact

About The Project

This project locates cell types that are difficult to detect for two reasons: they are hard to differentiate from other cells given the imaging method, and they are hard to map in 2D because the cells occupy a shallow 3D space and appear at different depths.

The example below shows imaging with two-channel staining (for human validation/training) and with a single channel (the desired imaging method for automated cell differentiation). In the green-only channel, the two cell types are often indistinguishable from one another.

dual-channel

single-channel

Haystack is built on top of cellpose to detect difficult cells such as these across a shallow 3D stack (one in which cells do not significantly overlap). Running the detection model on each slice individually improves robustness and allows false positives that do not reach a threshold to be removed. The detection process is shown below on an input stack, on a black background, and added cumulatively.

detections

Clustering these detections with sklearn algorithms then identifies the 'true' cells. The output is a set of coordinates estimating the centre of each cell in a 2D representation, which makes the detection of difficult-to-distinguish cells significantly more robust.

clustering

clusters

(back to top)

Built With

  • Python
  • OpenCV
  • Scikit-learn

Cellpose

cellpose-logo

(back to top)

Getting Started

To get a local copy up and running follow these simple example steps.

Installation

  1. Clone the repo at the desired location
    git clone https://github.com/thomasmichaelkane/haystack.git
    cd haystack
  2. Create a virtual environment
    python -m venv .venv
  3. Activate the environment
    Windows:
    .venv/Scripts/activate.ps1
    Mac/Linux:
    source .venv/bin/activate
  4. Install prerequisites
    python -m pip install -r requirements.txt

(back to top)

Usage

This project includes one YAML config file and two scripts that can be used to run the haystack process. One script creates training data from stacks for building cellpose models; the other provides the primary haystack functionality. An API for using the classes in your own scripts is also described further on.

Scripts

These are example scripts that can be run from the command line to use the Haystack class easily. They rely on the settings in the config.yaml file, so edit those settings first.

run_haystack.py

Uses the Haystack class and the settings in the config.yaml file to detect cells using cellpose (or load ROIs directly), cluster them, and save all processing outputs.

python run_haystack.py path/to/stack [rois_exist]
  • Arguments:
    • path/to/stack (path): The path to the image stack. This can be a path to a directory containing individual images, or to a single file tiff stack.
    • rois_exist (bool, optional): Whether cellpose has already been run and the ROIs already exist in the same directory. The default is False.
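
If you want to accept the same two positional arguments in a script of your own, the sketch below shows one possible way to do it with argparse. It is not taken from run_haystack.py: the function names parse_args and str_to_bool and the accepted true/false spellings are assumptions made for illustration.

```python
import argparse


def str_to_bool(value):
    """Convert common true/false spellings to a bool.

    A plain type=bool would treat any non-empty string, including
    "False", as True, so the text is interpreted explicitly here.
    """
    lowered = value.strip().lower()
    if lowered in {"true", "t", "yes", "y", "1"}:
        return True
    if lowered in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def parse_args(argv=None):
    # Hypothetical parser mirroring the documented arguments.
    parser = argparse.ArgumentParser(description="Run haystack on a stack")
    parser.add_argument(
        "stack_path",
        help="directory of individual images, or a single-file tiff stack",
    )
    parser.add_argument(
        "rois_exist",
        nargs="?",          # optional positional argument
        type=str_to_bool,
        default=False,
        help="whether cellpose ROIs already exist next to the stack",
    )
    return parser.parse_args(argv)
```

Calling parse_args(["path/to/stack"]) leaves rois_exist at its default of False, and parse_args(["path/to/stack", "True"]) sets it to True.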

random_subsect.py

Creates small random subsections of images from a stack, to be used as cellpose training data when building detection models for the Haystack class.

python random_subsect.py path/to/stack [num_squares] [size]
  • Arguments:
    • path/to/stack (path): The path to the image stack. This can be a path to a directory containing individual images, or to a single file tiff stack.
    • num_squares (int, optional): The number of squares to cut from each slice. The default is 1.
    • size (int, optional): The size of each square (pixels) to be cut. The default is 300px.

(back to top)

API

Class: Haystack

The haystack class is built by loading an image stack into the object. The class can then be used to detect cells in the stack using cellpose models, cluster these using sklearn algorithms, show videos and images of these processes, and save clustered cell positions that represent a flattened 2D representation of the cells from the shallow 3D space input.

The class can be imported like so:

from haystack import Haystack

Constructor

__init__(path):

The class is constructed by loading a tiff stack or an image directory.

  • Parameters:
    • path (str): The path to the image stack. This can be a path to a directory containing individual images, or to a single file tiff stack.

Methods

choose_colormap(cm):

Assign a different colormap for cell detections. The default is 'jet'.

  • Parameters:
    • cm (str): The name of the colormap to be used.

load_rois_directly(roi_path=None):

If ROIs already exist they can be loaded directly without rerunning the cellpose algorithm.

  • Parameters:
    • roi_path (str, optional): The path to the region of interest (ROI) directory. If not provided, ROIs are expected to be in the same directory as the image stacks.

write_images_dir():

Creates a directory of images from a loaded stack (needed for the cellpose algorithm).


detect_cells(model_path, cellprob_threshold, channels):

Detect cells using the cellpose algorithm. Check the cellpose documentation for details on the model and attributes.

  • Parameters:
    • model_path (str): The path to the cell detection model.
    • cellprob_threshold (float): The threshold for cell probability.
    • channels (list of str): List of the channel names.

cluster_cells(min_samples, max_clustering_distance):

Use cell detection across layers to find clusters using a max distance between detections and a minimum number of samples.

  • Parameters:
    • min_samples (int): The number of detections in a neighborhood for a cluster to be considered a true cell detection.
    • max_clustering_distance (float): The maximum distance between two detections for one to be considered as in the neighborhood of the other.
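
To give a feel for what these two parameters control, here is a minimal pure-Python sketch of distance-threshold grouping. It is a simplified stand-in, not the sklearn algorithm that haystack uses: detections closer than max_distance are linked into groups, groups with fewer than min_samples members are discarded, and the centroid of each remaining group is returned. The function name group_detections and the sample coordinates are assumptions for illustration.

```python
from math import dist  # available from Python 3.8


def group_detections(points, max_distance, min_samples):
    """Link 2D points that lie within max_distance of each other and
    return the centroid of every group with at least min_samples points."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the groups of every pair of points that are close enough.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= max_distance:
                parent[find(i)] = find(j)

    # Collect the members of each group.
    groups = {}
    for i, point in enumerate(points):
        groups.setdefault(find(i), []).append(point)

    # Keep only well-supported groups and reduce each one to its centroid.
    centres = []
    for members in groups.values():
        if len(members) >= min_samples:
            cx = sum(x for x, _ in members) / len(members)
            cy = sum(y for _, y in members) / len(members)
            centres.append((cx, cy))
    return centres


# Detections pooled from several slices (made-up coordinates).
detections = [(10, 10), (11, 10), (10, 11), (50, 50), (51, 50), (90, 90)]
centres = group_detections(detections, max_distance=2, min_samples=2)
```

In this example the isolated detection at (90, 90) is dropped because it has no support from other slices, which is the sense in which a minimum sample count filters out false positives.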

show_stack(fps=8):

Plays the raw stack as a video for visualization.

  • Parameters:
    • fps (int, optional): Frames per second for display (default is 8).

show_detections_stack(on_frames=False, fps=8):

Plays all detections as a video for visualization, either on a black background or on the stack itself.

  • Parameters:
    • on_frames (bool, optional): Indicates if detections should be overlaid on frames (default is False).
    • fps (int, optional): Frames per second for display (default is 8).

show_cumulative_stack(fps=8):

Plays all detections as a video for visualization on a black background. Detections from previous frames remain through the stack.

  • Parameters:
    • fps (int, optional): Frames per second for display (default is 8).
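
The idea of detections remaining through the stack can be sketched as a running union over slices. This is only an illustration of the concept, assuming each slice's detections are represented as a set of (x, y) centres; haystack's own rendering is not shown here, and the name cumulative_detections is hypothetical.

```python
from itertools import accumulate


def cumulative_detections(per_slice):
    """Given one set of (x, y) detections per slice, return a list whose
    entry i contains every detection from slices 0 to i inclusive."""
    return list(accumulate(per_slice, lambda seen, current: seen | current))


# Three slices with made-up detections; the last slice has none.
per_slice = [{(1, 1)}, {(2, 2)}, set()]
frames = cumulative_detections(per_slice)
```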

show_clustering():

Display an image with all detections and how the cells were clustered.


show_clustered_cells():

Display an image with all locations of clusters.


save_raw():

Saves stack of raw frames.


save_detections_stack():

Saves the stack with cell detections on a black background.


save_detections_stack_on_frames():

Saves the stack with cell detections overlaid on the frames.


save_cumulative_stack():

Saves the stack with cumulative cell detections on a black background.


save_clustering_image():

Saves an image showing the clustering process.


save_clustered_cells_image():

Saves an image with cells from cluster centres.


save_all_processing_images(save_raw=False):

Saves all of the images and stacks that show the processing steps in a single call.

  • Parameters:
    • save_raw (bool, optional): Indicates if raw images should be saved as well (default is False).

save_cells_as_coords():

Saves the cell coordinates from cluster centres as a .txt file.
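
The methods above can be chained into a single detect, cluster, and save workflow. The sketch below uses only the method names and parameter orders documented in this section; the wrapper function run_pipeline and its haystack_cls argument are assumptions added so that the flow can be exercised without the real class. Whether write_images_dir() must be called before detect_cells() is not stated here, so the sketch leaves it out.

```python
def run_pipeline(stack_path, model_path, cellprob_threshold, channels,
                 min_samples, max_clustering_distance, haystack_cls=None):
    """Detect cells in a stack, cluster them, and save the outputs.

    haystack_cls is a hypothetical hook for substituting a stand-in
    class; by default the real Haystack class is imported.
    """
    if haystack_cls is None:
        # Imported lazily so the function can be defined without haystack.
        from haystack import Haystack as haystack_cls

    stack = haystack_cls(stack_path)

    # Run the cellpose model on the stack.
    stack.detect_cells(model_path, cellprob_threshold, channels)

    # Merge detections across slices into estimated cell positions.
    stack.cluster_cells(min_samples, max_clustering_distance)

    # Save the processing images and the final coordinates.
    stack.save_all_processing_images(save_raw=False)
    stack.save_cells_as_coords()
    return stack
```

The argument values would normally come from your own settings, for example the dictionary returned by load_config(); no particular values are suggested here.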


(back to top)

Class: SubsectionGenerator

A class for taking random image samples of a set size from a tiff stack or a directory of images. The samples are intended as training data for cellpose model training.

The class can be imported like so:

from haystack import SubsectionGenerator

Constructor

__init__(path, square_size):
  • Parameters:
    • path (str): The path to the image stack. This can be a path to a directory containing individual images, or to a single file tiff stack.
    • square_size (int): The size of the random image samples (pixels) to be cut from the original image. It is recommended that this is not too close to the original image size, to avoid too much duplication of sections in the training data.

Methods

make_Samples(squares_per_slice):

Cuts a set number of random image samples from the stack.

  • Parameters:
    • squares_per_slice (int): The number of random image samples to be cut from each slice of the stack.
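
As a rough illustration of random sampling from a slice, the sketch below picks top-left corners so that each square fits fully inside the image, then crops a square from an image stored as a list of rows. It is not the SubsectionGenerator implementation; the function names, the seed argument, and the list-of-rows image representation are assumptions.

```python
import random


def random_square_origins(height, width, square_size, squares_per_slice, seed=None):
    """Return (top, left) corners for squares that fit inside the image."""
    if square_size > height or square_size > width:
        raise ValueError("square_size must fit inside the image")
    rng = random.Random(seed)  # a seed makes the sampling repeatable
    origins = []
    for _ in range(squares_per_slice):
        # randint is inclusive, so the square can touch the far edge.
        top = rng.randint(0, height - square_size)
        left = rng.randint(0, width - square_size)
        origins.append((top, left))
    return origins


def crop(image_rows, top, left, square_size):
    """Cut a square out of an image given as a list of pixel rows."""
    return [row[left:left + square_size]
            for row in image_rows[top:top + square_size]]
```

The closer square_size is to the image size, the smaller the range of possible corners, which is why samples from near-full-size squares overlap heavily.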

Functions

The load_config function loads all additional settings from the config.yaml file into a dictionary, here called config.

from haystack import load_config

config = load_config()

(back to top)

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)

Contact

Thomas Kane - thomas.kane.ucl@gmail.com

Me: https://thomasmichaelkane.github.io/me/

Project Link: https://github.com/thomasmichaelkane/haystack

(back to top)
