DEtail-seq

DEtail-seq provides a powerful approach to detect DSB ends in multiple species.
This pipeline is used to analyze DEtail-seq data.

Requirements

python	Bowtie2	bedtools	samtools	deeptTools
pyBigWig	numpy	scipy	statsmodels	biopython

Documentation

First, duplicated reads which had same sequences for both forward and reverse reads were removed. And our own scripts RemoveSamReads.py were used to remove duplicated reads. Then, reads were aligned to the reference genome with Bowtie2 using --local settings. For visualization and spliting strand, the aligned reads files (BAM) were converted to single base bigWig file with 1 bp bins using bamCoverage from deepTools. Finally, poisson distribution is used for Hotspots calling, which is implemented by our own script Hotspotcalling.py.

The detailed protocols were as follows:

1.Removing duplication reads

$ RemoveSamReads.py test_R1.fastq.gz test_R2.fastq.gz test

2.Alignment

$ bowtie2-build --threads 20 RefGenome.fasta RefGenome
$ bowtie2 --local --phred33 -p 20 -t -x RefGenome -1 test_dup_R1.fq.gz -2 test_dup_R2.fq.gz\
2>test_align.info|samtools view -bS -1 |samtools sort -@ 10 -m 5G -l 9 -o test.sort.bam

3.Converting BAM to single base bigWig and splitting strand

--minMappingQuality 30 may sometimes be needed

$ samtools index test.sort.bam
$ bamCoverage -v -p 60 -b test.sort.bam -o test_Crick.bw --binSize 1\
--Offset 1 --samFlagInclude 128 --filterRNAstrand forward
$ bamCoverage -v -p 60 -b test.sort.bam -o test_Watson.bw --binSize 1\
--Offset 1 --samFlagInclude 128 --filterRNAstrand reverse

4.Hotspot calling

usage: Hotspotcalling.py [-h] --bw BIGWIG [--strand {fwd,rev}] [--qvalue QVALUE] [--filter FILTER] [--res REST]
                         [--norm {MaxCut,AvgCut,Mean}] [--debug] --prefix PREFIX

Hotspotcalling.py

options:
  -h, --help            show this help message and exit
  --bw BIGWIG
  --strand {fwd,rev}
  --qvalue QVALUE
  --filter FILTER       A list of comma-delimited chromosome names,containing those chromosomes that should be
                        excluded for computing the hotspotcalling and normalization.
  --res REST            Restriction region bed file
  --norm {MaxCut,AvgCut,Mean}
                        normalizing bw file
  --debug
  --prefix PREFIX

Only calling hotspot

$ Hotspotcalling.py --bw test_Watson.bw --strand fwd --qvalue 0.01 --prefix test_Watson
$ Hotspotcalling.py --bw test_Crick.bw --strand rev --qvalue 0.01 --prefix test_Crick

Calling hotspot and Normalizing bw

$ Hotspotcalling.py --bw test_Watson.bw --strand fwd --qvalue 0.01 --prefix test_Watson --norm AvgCut --res cut.bed
$ Hotspotcalling.py --bw test_Crick.bw --strand rev --qvalue 0.01 --prefix test_Crick --norm AvgCut --res cut.bed

cut.bed is the location information of restriction digestion site.
--norm has three options:
MaxCut need restriction region bed file (--res), It uses the maximum value of the base signal in the restriction region for normalization.
AvgCut need restriction region bed file (--res), It sum the base signals of all restricted regions, and then divide by the number of restricted regions to obtain AvgCut.
Mean don't need restriction region bed file (--res), It uses the mean value of all non-zero base signals.

output test_Crick_basepeaks.bed and test_Watson_basepeaks.bed are the hotspot results in bed format.
test_Crick_basepeaks.xls and test_Watson_basepeaks.xls are the detailed hotspot results. test_Crick_norm_nocut.bw and test_Watson_norm_nocut.bw are the Normalizing bw and restricted region are removed from bw files.
If you want to process multiple samples in batches, please refer to DEtailPip.sh

Citing this work

Xu, W., Liu, C., Zhang, Z., Sun, C., Li, Q., Li, K., Jiang, H., Li, W., & Sun, Q. (2023). DEtail-seq is an ultra-efficient and convenient method for meiotic DNA break profiling in multiple organisms. Science China. Life sciences, 66(6), 1392–1407.

Name		Name	Last commit message	Last commit date
Latest commit History 46 Commits
DEtailPip.sh		DEtailPip.sh
Hotspotcalling.py		Hotspotcalling.py
LICENSE		LICENSE
README.md		README.md
RemoveSamReads.py		RemoveSamReads.py
norm.sh		norm.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DEtail-seq

Requirements

Documentation

1.Removing duplication reads

2.Alignment

3.Converting BAM to single base bigWig and splitting strand

4.Hotspot calling

Citing this work

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

DEtail-seq

Requirements

Documentation

1.Removing duplication reads

2.Alignment

3.Converting BAM to single base bigWig and splitting strand

4.Hotspot calling

Citing this work

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages