ELELAB/Moonlight2_GMA_case_studies

Cancer Systems Biology, Section of Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800, Lyngby, Denmark

Moonlight2_GMA_case_studies

This repository contains scripts for case studies related to the discovery of cancer driver genes using the Moonlight framework. The case studies are conducted on basal-like breast cancer, lung adenocarcinoma, and thyroid cancer using data from The Cancer Genome Atlas (TCGA). The associated publication is:

Revealing cancer driver genes through integrative transcriptomic and epigenomic analyses with Moonlight. Mona Nourbakhsh, Yuanning Zheng, Humaira Noor, Matteo Tiberti, Olivier Gevaert, Elena Papaleo. bioRxiv 2024.03.14.584946; doi: https://doi.org/10.1101/2024.03.14.584946

Please cite the above publication if you use the contents, scripts or results for your own research.

Below are instructions for reproducing the analyses.

Structure and content of GitHub and OSF repositories

This GitHub repository contains scripts associated with the publication with a main folder for each cancer (sub)type. Within each cancer (sub)type folder, a subfolder called scripts contains the associated scripts. The scripts are numbered according to the order in which they are run.

The corresponding OSF repository contains data and results associated with the scripts and is organized in the same way as the GitHub repository with a main folder for each cancer (sub)type. Within each cancer (sub)type folder, subfolders called data and results contain the associated data and results, respectively. The results files in results are numbered according to the script that generated them.
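Because the scripts are numbered by run order, a short shell helper can list them in that order for one cancer (sub)type. This is a minimal sketch, not part of the repository: the function name `list_scripts_in_order` is hypothetical, and it assumes each script file name begins with its run-order number.

```shell
# List the scripts of one cancer (sub)type in the order they are run.
# Assumption: every script file name starts with its run-order number.
list_scripts_in_order() {
    local cancer_dir="$1"   # e.g. breast_basal, lung or thyroid
    # sort -n orders by the leading number, so 10_* comes after 2_*
    ls -1 "${cancer_dir}/scripts" | sort -n
}

# Example usage from the repository root:
# list_scripts_in_order breast_basal
```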

Installing requirements and reproducing the analysis

All the analyses have been performed on a GNU/Linux server.

NB: Reproduced results are not expected to be identical to those of the case studies reported in the publication, because the GRN step of the Moonlight protocol involves stochastic processes.

Computing environment

To reproduce the data in the paper, you need to set up a conda environment in which the expected version of R and the required packages are installed. This requires being able to run Anaconda through the conda executable.

If you don't have access to conda, please see the Miniconda installer page for instructions on how to install Miniconda.
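To check whether the conda executable is already available before continuing, something like the following can be used. It is a minimal sketch: the helper name `have_cmd` is hypothetical and only wraps the standard `command -v` lookup.

```shell
# Return success if the given command is found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd conda; then
    # Print the installed conda version
    conda --version
else
    echo "conda not found: install Miniconda before continuing" >&2
fi
```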

Once you have access to conda, follow the instructions below:

  1. Clone our GitHub repository into a directory on your local machine:
git clone https://github.com/ELELAB/Moonlight2_GMA_case_studies.git
cd Moonlight2_GMA_case_studies
  2. Create a virtual environment using conda and activate it. The environment directory should be placed in the Moonlight2_GMA_case_studies folder:
conda env create --prefix ./methyl_case --file conda_environment.yml
conda activate ./methyl_case
  3. Download data from the COSMIC Cancer Gene Census. The data can be downloaded from https://cancer.sanger.ac.uk/census by exporting it as a csv file, or from https://cancer.sanger.ac.uk/cosmic/download/cosmic/v99/cancergenecensus by choosing the file for the GRCh38 genome and then converting it to a csv file. The downloaded file must be a csv file named cancer_gene_census.csv, and it must be placed in the data folder of each cancer (sub)type:
breast_basal/data/cancer_gene_census.csv
lung/data/cancer_gene_census.csv
thyroid/data/cancer_gene_census.csv
  4. Run the analyses:
bash ./run_all.sh
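The Cancer Gene Census file has to be present in three data folders before the analyses are run, and a small helper can do the copying in one go. The sketch below is not part of the repository. The function name `place_census` and the example download path are hypothetical, and creating a missing data folder with `mkdir -p` is an assumption about what is acceptable for your setup.

```shell
# Copy a downloaded Cancer Gene Census csv file into the data folder of
# each cancer (sub)type, using the file name the scripts expect.
place_census() {
    local src="$1"         # path to the downloaded csv file
    local repo_root="$2"   # path to the cloned repository
    local cancer
    for cancer in breast_basal lung thyroid; do
        # Assumption: create the data folder if it does not exist yet
        mkdir -p "${repo_root}/${cancer}/data" || return 1
        cp "${src}" "${repo_root}/${cancer}/data/cancer_gene_census.csv" || return 1
    done
}

# Example usage from inside the cloned repository (hypothetical source path):
# place_census ~/Downloads/census_export.csv .
```

Run it after downloading the file and before running the analyses.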

WARNING: Our scripts use the renv R package to handle automatic dependency installation. renv writes packages to its own cache folder, which is located in the user's home directory by default. This may be undesirable if free space in the home directory is limited. You can change the location of the renv root folder by setting a system environment variable; please see the comments in the run_all.sh script.
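As an illustration of moving the renv root folder, the sketch below sets `RENV_PATHS_ROOT`, the environment variable that renv itself documents for this purpose. Whether run_all.sh expects this exact variable, or sets it internally, is an assumption, so the comments in run_all.sh take precedence. The helper name `use_renv_root` and the example path are hypothetical.

```shell
# Point renv at a root folder outside the home directory.
# Assumption: run_all.sh does not override RENV_PATHS_ROOT itself.
use_renv_root() {
    local dir="$1"   # folder with enough free space
    mkdir -p "${dir}" || return 1
    export RENV_PATHS_ROOT="${dir}"
}

# Example usage before running the analyses (hypothetical path):
# use_renv_root /scratch/my_user/renv
# bash ./run_all.sh
```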

The run_all.sh script will perform the following steps to reproduce all results and data:

  1. Download data from the corresponding OSF repository, which contains the data required to run the analyses and all results associated with the analyses.

  2. Install all packages necessary to run the analyses in the environment.

  3. Perform all analyses for basal-like breast cancer.

  4. Perform all analyses for lung adenocarcinoma.

  5. Perform all analyses for thyroid cancer.

  6. Compare results across cancer (sub)types.

About

Repository for case studies in which the Gene Methylation Analysis in Moonlight2R is used
