This ongoing project applies machine learning to TCGA transcriptomic data to build a classifier and identify the genes that contribute to it. So far, the transcriptomic data have been downloaded with the TCGAbiolinks package, and differential gene expression analysis has been run with DESeq2 to find genes that differentiate the cancer groups. A PCA plot and a dendrogram were also generated to check how the cancer groups cluster. A random forest model and a gradient boosting model were fitted, and variable importance values were generated.
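To illustrate what a variable importance value measures, here is a minimal, self-contained sketch in plain Python. It is not the repository's code. It swaps in a nearest-centroid classifier and permutation importance (the drop in accuracy after shuffling one gene's values) in place of the random forest and gradient boosting models, and it runs on synthetic data. The gene names, group labels, and sample sizes are all hypothetical.

```python
import random

def nearest_centroid_fit(X, y):
    """Return {label: centroid}, the per-gene mean of each group's samples."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(centroids, row):
    """Assign a sample to the group with the closest centroid."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(row, centroids[lab])),
    )

def accuracy(centroids, X, y):
    return sum(predict(centroids, r) == t for r, t in zip(X, y)) / len(y)

def permutation_importance(centroids, X, y, rng, n_repeats=10):
    """Mean drop in accuracy when one gene's values are shuffled across samples."""
    base = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(base - accuracy(centroids, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Synthetic expression matrix (assumption): gene_A separates the two groups,
# gene_B and gene_C are pure noise.
rng = random.Random(0)
genes = ["gene_A", "gene_B", "gene_C"]  # hypothetical gene names
X, y = [], []
for label, shift in (("group_1", 0.0), ("group_2", 5.0)):  # hypothetical groups
    for _ in range(100):
        X.append([rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)

centroids = nearest_centroid_fit(X, y)
importances = permutation_importance(centroids, X, y, rng)
ranked = sorted(zip(genes, importances), key=lambda pair: pair[1], reverse=True)

for gene, score in ranked:
    print(f"{gene}: {score:.3f}")
```

For brevity the sketch scores importance on the same samples used for fitting. On real data, a held-out test set would give a more honest estimate.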
das2000sidd/Machine-learning-with-TCGA