This project develops a 3D segmentation model that identifies and segments four abdominal organs (Liver, Right Kidney, Left Kidney, and Spleen) in CT scans. The model uses a VNet architecture and is trained on a public dataset of abdominal CT scans. The primary goal is to automate organ segmentation in medical imaging, which can aid disease diagnosis, surgical planning, and treatment monitoring.
Ensure that you have Python 3.8 or higher installed. You will also need to install the following Python libraries:
pip install torch torchvision nibabel numpy matplotlib scikit-image monai
The model used in this project is based on the VNet architecture, which is specifically designed for 3D medical image segmentation. The key features of the model include:
- Input: Single-channel (grayscale) 3D CT scans.
- Output: 5-channel output, one channel each for the background, Liver, Right Kidney, Left Kidney, and Spleen.
- Layers: The VNet is composed of several stages of 3D convolutions followed by ReLU activations, with downsampling and upsampling operations between stages.
The training process includes the following steps:
- Data Loading: The CT scans and labels are loaded and preprocessed using a custom dataset class.
- Loss Function: The model is trained using the Dice Loss, which optimizes region overlap directly and is less sensitive than voxel-wise losses to the imbalance between small organs and the background.
- Optimizer: The Adam optimizer is used to update model weights during training.
- Metrics: Performance is evaluated using the Dice Score, computed separately for each organ.
Example command: python scripts/train.py
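The training steps above could look roughly like the following single optimization step. This is a minimal sketch, not the contents of scripts/train.py: the `soft_dice_loss` helper, the one-layer stand-in network, the learning rate, and the tensor shapes are all hypothetical, chosen only to keep the example self-contained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def soft_dice_loss(logits, target, eps=1e-5):
    """Soft Dice loss averaged over classes (hypothetical helper).

    logits: (N, C, D, H, W) raw network outputs
    target: (N, D, H, W) integer class labels in [0, C)
    """
    num_classes = logits.shape[1]
    probs = torch.softmax(logits, dim=1)
    # One-hot encode the labels and move the class axis to position 1.
    one_hot = F.one_hot(target, num_classes).permute(0, 4, 1, 2, 3).float()
    dims = (0, 2, 3, 4)  # sum over batch and spatial axes, keep classes
    intersection = (probs * one_hot).sum(dims)
    denominator = probs.sum(dims) + one_hot.sum(dims)
    dice = (2 * intersection + eps) / (denominator + eps)
    return 1 - dice.mean()


torch.manual_seed(0)
# Stand-in for the VNet: one 3D convolution mapping 1 channel to 5 classes.
model = nn.Conv3d(1, 5, kernel_size=3, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is an assumption

scans = torch.randn(2, 1, 8, 8, 8)           # dummy CT patches
labels = torch.randint(0, 5, (2, 8, 8, 8))   # dummy label volumes

# One training step: forward pass, Dice loss, backward pass, weight update.
optimizer.zero_grad()
loss = soft_dice_loss(model(scans), labels)
loss.backward()
optimizer.step()
print(f"Dice loss: {loss.item():.4f}")
```

In a real run this step would sit inside a loop over batches from the dataset class, with the stand-in layer replaced by the VNet.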
- Validation: The model’s performance is validated using a separate validation dataset, and the Dice score is computed for each organ.
- Inference: After training, the model is used to generate segmentation masks for unseen CT scans.
Example command: python scripts/inference.py
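At inference time, a multi-channel network output is typically converted to a single label volume by taking the highest-scoring channel at each voxel. The sketch below shows that step with NumPy on random scores. The channel-to-organ ordering in `LABELS` is an assumption, and the code is not taken from scripts/inference.py.

```python
import numpy as np

# Assumed channel order; the project may use a different mapping.
LABELS = {0: "background", 1: "liver", 2: "right kidney", 3: "left kidney", 4: "spleen"}

rng = np.random.default_rng(0)
scores = rng.normal(size=(5, 16, 16, 16))  # (channels, depth, height, width)

# Pick the most likely class for every voxel.
mask = scores.argmax(axis=0).astype(np.uint8)

for index, name in LABELS.items():
    print(f"{name}: {(mask == index).sum()} voxels")
```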
Dice Score = (2 × |A∩B|) / (|A| + |B|)
Where:
- |A∩B| is the number of elements (e.g., pixels or voxels) where both the predicted mask A and the ground truth mask B are 1 (i.e., the intersection of A and B).
- |A| is the number of elements equal to 1 in the predicted mask A.
- |B| is the number of elements equal to 1 in the ground truth mask B.
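To make the formula concrete, here is a small NumPy sketch that computes the Dice score separately for each label in a pair of label arrays. The `dice_score` helper and the toy arrays are illustrative, and returning 1.0 when both masks are empty is a convention chosen here, not something the formula itself specifies.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice score between two binary masks of the same shape."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # convention: two empty masks count as a perfect match
    return 2.0 * np.logical_and(pred, truth).sum() / total


# Toy 1D "volumes" with labels 0 (background), 1 and 2.
truth = np.array([0, 1, 1, 1, 2, 2, 0, 0])
pred = np.array([0, 1, 1, 0, 2, 2, 0, 0])

# Score each label on its own binary mask, as done per organ.
for label in (1, 2):
    print(label, dice_score(pred == label, truth == label))
```

For label 1 the prediction covers two of the three ground truth elements, giving 2×2/(2+3) = 0.8. Label 2 matches exactly, giving 1.0.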

