leenaput/microcredential-nextflow-project

Introduction

leenaput/microcredential-nextflow-project
For the Microcredential Nextflow project, I developed a pipeline to processes nanopore sequencing data from raw FASTQ to alignment, coverage calculation, and QC summary evaluation. A step-by-step outline of how the project was developed can be found here.

Pipeline overview

Pipeline steps:

Raw read QC - Generates stats for raw using NanoPlot
Read filtering - Quality and length filtering of raw reads using Chopper
Filtered reads QC - Generates stats for filtered reads using NanoPlot
Read alignment -Maps reads to reference genome using minimap2, sorts and indexes with SAMtools
Alignment QC – Computes coverage using SAMtools and generates stats of mapped reads using NanoPlot
QC Summary – Evaluates QC values against user-defined thresholds using custom script
Final report – Quickly displays QC pass/fail of sample in stdout

Usage

Installation

Clone the repository:

git clone git@github.com:leenput/microcredential-nextflow-project.git

Note: Nextflow should be installed on your system.
If working on VSC, make sure to carry out the following configurations before running the pipeline:

module load Nextflow/24.10.2
export APPTAINER_CACHEDIR=${VSC_SCRATCH}/.apptainer_cache
export APPTAINER_TMPDIR=${VSC_SCRATCH}/.apptainer_tmp

Set parameters

The parameters are included in params.config. Please modify according to your experimental needs.

General parameters

Parameter	Description	Example
`--reads`	Path to input FASTQ files	`./data/*.fastq`
`--fasta`	Reference genome FASTA file	`./data/genome.fasta`

Please make sure to store your genome sequence file (.fasta) and basecalled ONT reads (.fastq) in the /data workfolder.

For now, you can find the following test data there:

reference sequence: chr21 of the new human reference genome T2T-CHM13v2
ONT data: subsampled reads of GIAB sample HG002

Optional parameters

Parameter	Description	Default
`--qscore`	Minimum quality score (CHOPPER)	`7`
`--minlength`	Minimum read length (CHOPPER)	`1000`
`--pct_filtered`	% reads passing filtering (QC summary)	`80`
`--pct_mapped`	% reads mapped (QC summary)	`50`
`--coverage`	Minimum average coverage	`10`
`--filt_n50`	Minimum N50 for filtered reads	`5000`

Run the pipeline

Run the workflow with the following command:

nextflow run main.nf -profile <docker/singularity/conda>

Output structure

results/
    ├── pipeline_info/
    ├── <sample>/
           ├── filtered_reads/
           ├── alignment/
           ├── quality-control/
                     ├── raw/
                     ├── filtered/
                     ├── mapped/
                     ├── <sample>_QC_summary.txt

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
bin		bin
data		data
modules		modules
workflows		workflows
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
STEPBYSTEP.md		STEPBYSTEP.md
main.nf		main.nf
nextflow.config		nextflow.config
params.config		params.config

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

leenaput/microcredential-nextflow-project

Introduction

Pipeline overview

Usage

Installation

Set parameters

General parameters

Optional parameters

Run the pipeline

Output structure

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

leenaput/microcredential-nextflow-project

Introduction

Pipeline overview

Usage

Installation

Set parameters

General parameters

Optional parameters

Run the pipeline

Output structure

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages