This page is a guide for QUT users on running the nf-core/rnaseq workflow on the QUT HPC.
Further details on the workflow can be found at:
https://nf-co.re/rnaseq/3.0/usage
https://nf-co.re/rnaseq/3.0/output
Install Nextflow
The nf-core/rnaseq workflow requires Nextflow to be installed in your account on the HPC. Find details here on how to install and test Nextflow, prepare a nextflow.config file, and run a PBS Pro submission script for Nextflow pipelines.
Additional information is available here: https://nf-co.re/usage/installation
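As a rough illustration of the two files mentioned above, the following shell sketch writes a minimal nextflow.config and a PBS Pro launch script into a working directory. The `pbspro` executor and the `singularity` scope are standard Nextflow settings. Everything else is an assumption made for illustration and should be checked against the local HPC setup: the resource requests, the job name, the file name launch_rnaseq.pbs, the WORKDIR variable, and the use of Singularity with the pipeline's `test` profile.

```shell
#!/bin/bash
# Sketch only: writes a minimal nextflow.config and a PBS Pro launch script.
# Directory and file names, resources and the container engine are assumptions.
set -euo pipefail

WORKDIR="${WORKDIR:-$PWD/rnaseq_test}"   # hypothetical working directory
mkdir -p "$WORKDIR"

# Ask Nextflow to submit each process as a PBS Pro job and to use Singularity.
cat > "$WORKDIR/nextflow.config" <<'EOF'
process {
    executor = 'pbspro'
}
singularity {
    enabled    = true
    autoMounts = true
}
EOF

# Head job that launches the workflow; adjust resources to your site limits.
cat > "$WORKDIR/launch_rnaseq.pbs" <<'EOF'
#!/bin/bash -l
#PBS -N nfrnaseq
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=24:00:00

cd "$PBS_O_WORKDIR"
nextflow run nf-core/rnaseq -r 3.0 -profile test,singularity
EOF

echo "Wrote $WORKDIR/nextflow.config and $WORKDIR/launch_rnaseq.pbs"
```

With Nextflow installed, the launch script could then be submitted from the working directory with qsub launch_rnaseq.pbs.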
Pipeline summary
FastQC - Raw read QC
UMI-tools extract - UMI barcode extraction
TrimGalore - Adapter and quality trimming
SortMeRNA - Removal of ribosomal RNA (optional)
STAR and Salmon - Fast splice-aware genome alignment and transcriptome quantification
STAR via RSEM - Alignment and quantification of expression levels
HISAT2 - Memory-efficient splice-aware alignment to a reference
SAMtools - Sort and index alignments
UMI-tools dedup - UMI-based deduplication
picard MarkDuplicates - Duplicate read marking
StringTie - Transcript assembly and quantification
BEDTools and bedGraphToBigWig - Create bigWig coverage files
RSeQC - Various RNA-seq QC metrics
Qualimap - Various RNA-seq QC metrics
dupRadar - Assessment of technical / biological read duplication
Preseq - Estimation of library complexity
featureCounts - Read counting relative to gene biotype
DESeq2 - PCA plot and sample pairwise distance heatmap and dendrogram
MultiQC - Present QC for raw reads, alignment, read counting and sample similarity
Pseudo-alignment and quantification
Salmon - Wicked fast gene and isoform quantification relative to the transcriptome
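The three alignment routes listed in the summary (STAR and Salmon, STAR via RSEM, HISAT2) and the optional Salmon pseudo-alignment are selected on the command line. The sketch below builds, but does not run, a command for a chosen route. It assumes the `--aligner` values `star_salmon`, `star_rsem` and `hisat2` and the `--pseudo_aligner salmon` option as described in the nf-core/rnaseq usage documentation linked above; please verify them there for the release you run. The function name build_rnaseq_cmd is hypothetical, and other required parameters (input samplesheet, reference genome, profile) are deliberately left out.

```shell
#!/bin/bash
# Sketch: build (but do not run) an nf-core/rnaseq command line for a chosen
# alignment route. The function and variable names are hypothetical.

build_rnaseq_cmd() {
    local aligner="$1"        # star_salmon, star_rsem or hisat2
    local pseudo="${2:-no}"   # "yes" also requests Salmon pseudo-alignment

    # Reject anything that is not one of the three routes in the summary.
    case "$aligner" in
        star_salmon|star_rsem|hisat2) ;;
        *) echo "unknown aligner: $aligner" >&2; return 1 ;;
    esac

    local cmd="nextflow run nf-core/rnaseq -r 3.0 --aligner $aligner"
    if [ "$pseudo" = "yes" ]; then
        cmd="$cmd --pseudo_aligner salmon"
    fi
    echo "$cmd"
}

# Print two example command lines.
build_rnaseq_cmd hisat2
build_rnaseq_cmd star_rsem yes
```

Keeping the route in a single variable makes it easy to reuse one PBS Pro submission script for different alignment strategies.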
Workflow reporting and genomes
Reference genome files - Saving reference genome indices/files
Pipeline information - Report metrics generated during the workflow execution