GREE

Genome-wide cis-Regulatory Element Explorer

Installation

To install GREE, first install the devtools package, if it is not already installed:

install.packages("devtools")

Then, install GREE from GitHub:

devtools::install_github("JiekaiLab/GREE")

This should take no more than a few minutes.

Usage

Firstly , we create a tile matrix (user-defined tile size, default is 500 bp),the exon region, blacklist region and CpG island region overlapped by TSS are removed. This task is done by the function getBinByCellMatrix, and it takes as input a preprocessed fragment information in scATAC-seq data with 'RG' column containing cell barcodes. And genes, CpG, exon, blacklist region information with the same genomic version as the fragment information. These messages are in GRanges format. Then we can get the BinByCell matrix.

lnames = load("./data/data_examples.RData")
binByCell = getBinByCellMatrix(frag,genes,CpG,exon,Blacklist)

Secondly, 1000 cells were randomly selected from each group, and sum the accessibility of cells extracted from each group. The summed accessibility then multiplied by normalization factors. The normalization factors to counterbalance sequencing depth were calculated as 10^7 divided by the sum of nFrags per group. The getBinByGroupMatrix function, takes as input the BinByCell matrix obtained by the getBinByCellMatrix function and a data frame containing metadata associated with each cell, minimally including the cell group labels. You can specify group column these using the group_col arguments:

binByGroup_norm=getBinByGroupMatrix(binByCell,cell_anno,group_col="CellTtypes")

Thirdly, we filter out the bin that might be noisy, that is, the bin that has a smaller maximum accessibility. And select of highly variable bin based on the ratio of maximum to minimum accessibility of bin in each group. The getHVbinMatrix function, takes as input the binByGroup_norm matrix obtained by the getBinByGroupMatrix function and the threshold bin_max_thr and bin_fc_thr to get high variation bin.

binByGroup_norm_HVbin = getHVbinMatrix(binByGroup_norm,bin_max_thr=7,bin_fc_thr=70)

Finally, for each gene, GREE use getGeneByGroupMatrix function to sum the accessibility of highly variable bins that overlaps with the gene window (user-defined, default is 2 kb upstream and downstream of the gene body) and do not cross another gene region. And we can get the GeneByGroup matrix.

GeneByGroup = getGeneByGroupMatrix(genes,binByGroup_norm_HVbin)

Plot the results

We can use the peatmap package to show some marker genes.

library(stringr)
library(pheatmap)
library(ggplot2)

gene = c('GAD1','SLC17A6','TFAP2A','LMX1A','FEZF2') 
data = as.data.frame(t(as.matrix(GeneByGroup[gene,])))

options(repr.plot.width=8, repr.plot.height=5)
pal <- colorRampPalette( c("#7431A6","#000000","#F2F208"),bias = 1.7 )(255)
pheatmap(data,
         fontsize_col=12,
         color = pal,
         border_color = "#dcdcdc",
         scale = "column",
         cluster_rows = F,
         cluster_cols = F,
         width = 6,
         height = 3
)

Motif enrichment

Get the top 10% bins for each cell type and perform motif enrichment analysis using Homer.

bin_list = getGroupBinList(binByGroup_norm_HVbin,probs = 0.90)

PATH_findMotifsGenome = '/public/home/yyzhao_gibh/opt/biosoft/homer-4.11/bin/findMotifsGenome.pl'
motifAnalysis(bin_list,PATH_findMotifsGenome,output = "motif_result")

# plot
KnownMotif = read.csv("./motif_result/motif.csv")
options(repr.plot.width=8, repr.plot.height=6)
tf = c('DLX1',"LHX2","SOX3","RFX","AP-2ALPHA")
KnownMotif2 = KnownMotif[KnownMotif$TF %in% tf,]
KnownMotif2$TF = factor(KnownMotif2$TF,levels = rev(tf))
KnownMotif2$cluster = factor(KnownMotif2$cluster,levels = c('GABAergic','Glutamate','RG','RFX4+','Neural_Crest','Retinal'))
KnownMotif2$rank[KnownMotif2$rank>15]=15

ggplot(KnownMotif2, aes(x=cluster, y=TF,color=-Log.P.value,size=-rank)) + 
  geom_point() + scale_colour_gradient2(low = "white",high = "red") + 
  theme_bw(base_size = 20)+
  scale_size_continuous(breaks = -c(1,5,10,15,20), labels = c(1,5,10,15,20),range = c(1, 8), name = "Rank") +
  ylab("") + xlab("") + labs(title="") +
  theme(panel.grid=element_blank(),
        axis.text.x = element_text(angle = 60,vjust = 1,hjust = 1))

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
Figures		Figures
R		R
man		man
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
GREE.Rproj		GREE.Rproj
LICENSE		LICENSE
LICENSE.md		LICENSE.md
NAMESPACE		NAMESPACE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Repository files navigation

GREE

Installation

Usage

Plot the results

Motif enrichment

About

Licenses found

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

License

Licenses found

JiekaiLab/GREE

Folders and files

Latest commit

History

Repository files navigation

GREE

Installation

Usage

Plot the results

Motif enrichment

About

Resources

License

Licenses found

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages