This page guides QUT users through installing and running the Nextflow nf-core/rnaseq workflow on the HPC.

Pre-requisites

Install Nextflow

The nf-core/rnaseq workflow requires Nextflow to be installed in your account on the HPC. Find details on how to install and test Nextflow here. You will also need to prepare a nextflow.config file and a PBS Pro submission script for running Nextflow pipelines.

Additional information is available here: https://nf-co.re/usage/installation

Additional details on the workflow can be found at:

Overview: https://nf-co.re/rnaseq/3.0

...

GitHub: https://github.com/nf-core/rnaseq

Pipeline Summary

...

Getting Started

Download and run the workflow on the minimal test data provided by nf-core/rnaseq. We recommend the ‘singularity’ profile on QUT’s HPC; ‘conda’ is an alternative. Note: the ‘docker’ profile is unavailable on the HPC.

Code Block
nextflow run nf-core/rnaseq -profile test,singularity --outdir results -r 3.10.1

Running the pipeline using custom data

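Example of a typical command to run an RNA-seq analysis. The command below uses the human GRCh38 reference; for mouse samples, set --genome to a mouse reference such as GRCm38: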

...

Code Block
nextflow run nf-core/rnaseq --input samplesheet.csv \
        --outdir results \
        -r 3.10.1 \
        --genome GRCh38 \
        -profile singularity \
        --aligner star_rsem \
        --clip_r1 10 \
        --clip_r2 10 \
        --three_prime_clip_r1 1 \
        --three_prime_clip_r2 1 \
        -resume

Preparing a ‘samplesheet.csv’ file

Prepare a sample sheet file that specifies the input files to be used. To do this, we use an nf-core script to generate the ‘samplesheet.csv’ file as follows:

...

Code Block
sample,fastq_1,fastq_2,strandedness
control_1,/path/to/fastq/control-1_R1.fastq.gz,/path/to/fastq/control-1_R2.fastq.gz,unstranded
control_2,/path/to/fastq/control-2_R1.fastq.gz,/path/to/fastq/control-2_R2.fastq.gz,unstranded
control_3,/path/to/fastq/control-3_R1.fastq.gz,/path/to/fastq/control-3_R2.fastq.gz,unstranded
infected_1,/path/to/fastq/infected-1_R1.fastq.gz,/path/to/fastq/infected-1_R2.fastq.gz,unstranded
infected_2,/path/to/fastq/infected-2_R1.fastq.gz,/path/to/fastq/infected-2_R2.fastq.gz,unstranded
infected_3,/path/to/fastq/infected-3_R1.fastq.gz,/path/to/fastq/infected-3_R2.fastq.gz,unstranded
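
Because a wrong path in the sample sheet only surfaces once the job is already running, it can help to check the file before submitting. The following is a minimal sketch, not part of nf-core/rnaseq: the function name check_samplesheet is hypothetical, and it assumes the four-column layout shown above with no commas inside the paths.

```shell
# check_samplesheet (hypothetical helper): list FASTQ paths in a sample sheet
# that do not exist on disk. Assumes the columns
# sample,fastq_1,fastq_2,strandedness and no commas inside paths.
check_samplesheet() {
    local sheet="$1"
    local missing=0
    local header sample fastq_1 fastq_2 strandedness fq
    {
        # Discard the header row.
        read -r header
        # Read one sample per row; the extra test keeps a final row
        # that lacks a trailing newline.
        while IFS=, read -r sample fastq_1 fastq_2 strandedness || [ -n "$sample" ]; do
            for fq in "$fastq_1" "$fastq_2"; do
                # An empty field (e.g. fastq_2 for single-end data) is skipped.
                [ -z "$fq" ] && continue
                if [ ! -f "$fq" ]; then
                    echo "missing: $sample -> $fq"
                    missing=$((missing + 1))
                fi
            done
        done
    } < "$sheet"
    # Succeed only when every listed file was found.
    [ "$missing" -eq 0 ]
}
```

After pasting the function into your shell, run `check_samplesheet samplesheet.csv`. It prints one line per missing file and returns a non-zero exit status if any file is missing.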

Preparing to run on the HPC

To run the workflow on the HPC, create a PBS submission script. For example, create a file called launch.pbs in a text editor of your choice (e.g., vi or nano), then copy and paste the code below:

...

Code Block
#!/bin/bash -l
#PBS -N nfrna2
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=24:00:00

#work on current directory (folder)
cd $PBS_O_WORKDIR

#load java and set up memory settings to run nextflow
module load java
export NXF_OPTS='-Xms1g -Xmx4g'

#run the rnaseq pipeline
nextflow run nf-core/rnaseq --input samplesheet.csv \
        --outdir results \
        -r 3.10.1 \
        --genome GRCh38 \
        -profile singularity \
        --aligner star_rsem \
        --clip_r1 10 \
        --clip_r2 10 \
        --three_prime_clip_r1 1 \
        --three_prime_clip_r2 1

Submitting the job

Once you have created the folder for the run, the samplesheet.csv file, nextflow.config, and launch.pbs, you are ready to submit.
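
A quick pre-submission check can confirm that these files are in place. This is a minimal sketch under stated assumptions: the function name preflight is hypothetical, and the file names are those used on this page (samplesheet.csv as passed to --input, plus nextflow.config and launch.pbs), all expected in the run folder.

```shell
# preflight (hypothetical helper): confirm the files needed for the run
# exist in the given folder (defaults to the current folder).
preflight() {
    local run_dir="${1:-.}"
    local status=0
    local f
    for f in samplesheet.csv nextflow.config launch.pbs; do
        if [ -f "$run_dir/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            status=1
        fi
    done
    # Non-zero when at least one file is missing.
    return "$status"
}
```

Run `preflight` from inside the run folder; it returns a non-zero exit status when any of the three files is missing.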

...

Code Block
qsub launch.pbs

Monitoring the Run

You can check on your submitted jobs with the following command:

Code Block
qstat -u $USER

...