An R package for molecular subtyping of colorectal cancer that integrates tumor microenvironment heterogeneity
install.packages("devtools")
devtools::install_github("yswutan/CMSPlus")
library(CMSPlus);
library(GSVA);
library(CMSclassifier);
library(ComplexHeatmap);
library(circlize);
library(randomForest);
library(ranger);
library(parallel)
res <- CMSPlus(exp2symbol=test.profile)
names(res)
# [1] "CMSPlusLabel" "CMSLabel" "gsva_matrix" "HeatmapPlot"

CMSPlus Parameters

- `exp2symbol`: A data frame of gene expression values, with samples in columns, genes in rows, and gene symbols as row names.
- `prob`: A numeric value between 0 and 1 specifying the minimum posterior probability required to assign a predicted subtype. Samples whose maximum subtype probability is below this threshold will be classified as "unassigned". Default is 0.5.
- `plot`: `TRUE` produces plots; `FALSE` suppresses plotting. Default is `TRUE`.
- `CMSlabel`: A character string specifying the label used in plots to denote CMS subtypes. Default is "RF.nearestCMS".
- `CMSPluslabel`: A character string specifying the label used in plots to denote CMSPlus subtypes. Default is "nearest".
- `parallel.sz`: Integer specifying the number of parallel workers used for GSVA computation. Default is 1.
- `InterGroupRandomize`: Logical value indicating whether to randomize the column order within subtypes in plots. Default is `TRUE`.
- `seed`: Integer specifying the random seed used to make the randomization step reproducible.
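
Before calling `CMSPlus()`, it can help to confirm that the input follows the layout described for `exp2symbol`. The following is a minimal base-R sketch under stated assumptions: `check_exp2symbol` is a hypothetical helper that is not part of the package, and the gene symbols and values are toy placeholders rather than real data.

```r
# Hypothetical helper (not part of CMSPlus): checks the layout described for
# exp2symbol, i.e. genes in rows, samples in columns, gene symbols as row names.
check_exp2symbol <- function(x) {
  stopifnot(is.data.frame(x))
  # Every column should hold numeric expression values for one sample.
  stopifnot(all(vapply(x, is.numeric, logical(1))))
  # Row names should be unique, non-empty gene symbols.
  rn <- rownames(x)
  stopifnot(!anyDuplicated(rn), all(nzchar(rn)))
  # Default integer row names usually mean the symbols were never set.
  if (all(grepl("^[0-9]+$", rn))) {
    stop("row names look like row indices, not gene symbols")
  }
  invisible(TRUE)
}

# Toy data frame in the expected layout (placeholder values, not real data).
toy <- data.frame(
  sample1 = c(5.1, 7.3, 2.0),
  sample2 = c(4.8, 6.9, 2.4),
  row.names = c("TP53", "KRAS", "BRAF")
)
ok <- check_exp2symbol(toy)
```

A check like this only validates the shape of the input; it does not verify that the gene symbols match those expected by the classifier.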

res$CMSPlusLabel

- Content: A sample-by-subtype probability matrix generated by the CMSPlus model, giving the predicted posterior probability that each colorectal cancer sample belongs to each of the five CMSPlus subtypes, together with the nearest subtype and the final predicted label.

| Column | Description |
| --- | --- |
| CMS1 | Predicted probability that the sample belongs to the CMS1 subtype, as estimated by the CMSPlus classification model. |
| CMS2 | Predicted probability that the sample belongs to the CMS2 subtype. |
| CMS3 | Predicted probability that the sample belongs to the CMS3 subtype. |
| CMS4-TME- | Predicted probability that the sample belongs to the CMS4 subtype with low tumor microenvironment (TME) infiltration. |
| CMS4-TME+ | Predicted probability that the sample belongs to the CMS4 subtype with high tumor microenvironment (TME) infiltration. |
| nearest | Subtype with the maximum posterior probability for the sample, irrespective of any probability threshold. |
| predict | Subtype assigned only if its probability is both the maximum among all subtypes and greater than or equal to the user-defined probability threshold (`prob`). Samples not meeting this criterion are labeled as `mix`. |

res$CMSLabel

- Content: Subtype probabilities and CMS subtype assignments generated by the `CMSclassifier` package using a Random Forest (RF) based classification model.

| Column | Description |
| --- | --- |
| RF.CMS1.posteriorProb | Posterior probability that the sample belongs to CMS1, estimated by the Random Forest classifier. |
| RF.CMS2.posteriorProb | Posterior probability that the sample belongs to CMS2, estimated by the Random Forest classifier. |
| RF.CMS3.posteriorProb | Posterior probability that the sample belongs to CMS3, estimated by the Random Forest classifier. |
| RF.CMS4.posteriorProb | Posterior probability that the sample belongs to CMS4, estimated by the Random Forest classifier. |
| RF.nearestCMS | CMS subtype with the highest posterior probability for the sample, regardless of confidence threshold. |
| RF.predictedCMS | Subtype assigned only if its posterior probability is both the maximum among all subtypes and greater than 0.5. |
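
The difference between a "nearest" label and a thresholded "predicted" label can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: `assign_subtype` is a hypothetical helper, the probabilities are toy values, and the `mix` fallback label is taken from the `predict` column description.

```r
# Hypothetical helper illustrating the thresholding rule; not the package's code.
assign_subtype <- function(probs, prob = 0.5) {
  # nearest: subtype with the largest probability in each row (sample).
  nearest <- colnames(probs)[max.col(probs, ties.method = "first")]
  # predict: keep the nearest subtype only when its probability reaches the threshold.
  top <- unname(apply(probs, 1, max))
  predict <- ifelse(top >= prob, nearest, "mix")
  data.frame(nearest = nearest, predict = predict, row.names = rownames(probs))
}

subtypes <- c("CMS1", "CMS2", "CMS3", "CMS4-TME-", "CMS4-TME+")
# Toy probabilities (each row sums to 1); not real model output.
probs <- rbind(
  s1 = c(0.70, 0.10, 0.10, 0.05, 0.05),
  s2 = c(0.30, 0.25, 0.20, 0.15, 0.10)
)
colnames(probs) <- subtypes
calls <- assign_subtype(probs, prob = 0.5)
print(calls)
```

In this toy input, `s1` keeps its nearest subtype because 0.70 reaches the threshold, while `s2` falls back to `mix`. The sketch uses `>=`, matching the "greater than or equal to" wording for `prob`; the Random Forest column is described with a strict "greater than 0.5" instead.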

res$gsva_matrix

- Content: A matrix of single-sample gene set enrichment scores computed from gene expression profiles using GSVA (Gene Set Variation Analysis). The matrix represents pathway-level activity inferred from gene expression data across individual samples, based on a predefined set of 42 biological pathways.

| Dimension | Description |
| --- | --- |
| Rows | 42 curated pathway gene sets used for subtype inference |
| Columns | Individual samples |
| Values | GSVA enrichment scores, reflecting the relative activity of each pathway within each sample |
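
One way to explore a pathway-by-sample score matrix is to average the scores within each subtype. The sketch below assumes only the layout of pathways in rows and samples in columns; the matrix is simulated, the pathway names are placeholders, and `labels` is a hypothetical per-sample subtype vector standing in for real predictions.

```r
set.seed(1)
# Simulated stand-in for a GSVA score matrix: 4 placeholder pathways x 6 samples.
gsva <- matrix(runif(24, -1, 1), nrow = 4,
               dimnames = list(paste0("pathway", 1:4), paste0("s", 1:6)))
# Placeholder subtype labels, one per sample (column).
labels <- c("CMS1", "CMS1", "CMS2", "CMS2", "CMS3", "CMS3")

# Mean enrichment score of each pathway within each subtype.
subtype_means <- sapply(split(seq_len(ncol(gsva)), labels), function(idx) {
  rowMeans(gsva[, idx, drop = FALSE])
})

# Pathway with the highest mean score in each subtype.
top_pathway <- rownames(subtype_means)[apply(subtype_means, 2, which.max)]
names(top_pathway) <- colnames(subtype_means)
```

With real output, the simulated matrix would be replaced by the GSVA score matrix and `labels` by the per-sample subtype calls, in the same sample order as the matrix columns.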

res$HeatmapPlot

- Content: A pathway-level heatmap visualizing GSVA enrichment scores across samples, with integrated molecular subtype annotations.