This page provides a basic introduction to Unix commands for HPC users with no previous Unix experience.

...

Running the Nextflow nf-core/rnaseq pipeline

Requirements:

  • index.csv → a comma-separated sample sheet that lists each sample ID, its associated FASTQ files (read 1 and read 2) and the library strandedness

  • launch.pbs → a script to submit the job to the HPC cluster

Example index.csv file for nf-core/rnaseq version 3.3:

Code Block
sample,fastq_1,fastq_2,strandedness
control_r1,/work/kenna_team/data/raw_data/SRR1039508_1.fastq.gz,/work/kenna_team/data/raw_data/SRR1039508_2.fastq.gz,unstranded
dex_r1,/work/kenna_team/data/raw_data/SRR1039509_1.fastq.gz,/work/kenna_team/data/raw_data/SRR1039509_2.fastq.gz,unstranded
control_r2,/work/kenna_team/data/raw_data/SRR1039512_1.fastq.gz,/work/kenna_team/data/raw_data/SRR1039512_2.fastq.gz,unstranded
dex_r2,/work/kenna_team/data/raw_data/SRR1039513_1.fastq.gz,/work/kenna_team/data/raw_data/SRR1039513_2.fastq.gz,unstranded
control_r3,/work/kenna_team/data/raw_data/SRR1039516_1.fastq.gz,/work/kenna_team/data/raw_data/SRR1039516_2.fastq.gz,unstranded
dex_r3,/work/kenna_team/data/raw_data/SRR1039517_1.fastq.gz,/work/kenna_team/data/raw_data/SRR1039517_2.fastq.gz,unstranded
control_r4,/work/kenna_team/data/raw_data/SRR1039520_1.fastq.gz,/work/kenna_team/data/raw_data/SRR1039520_2.fastq.gz,unstranded
dex_r4,/work/kenna_team/data/raw_data/SRR1039521_1.fastq.gz,/work/kenna_team/data/raw_data/SRR1039521_2.fastq.gz,unstranded
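
Before submitting a job, it can save time to confirm that every FASTQ path in the sample sheet actually exists. The following is a minimal sketch, not part of nf-core/rnaseq: the function name check_samplesheet is hypothetical, and it assumes a plain comma-separated file with one header row and no quoted fields.

```shell
# Hypothetical helper (not part of nf-core/rnaseq): report FASTQ paths
# listed in a sample sheet that do not exist on disk.
check_samplesheet() {
    local sheet="$1"
    local missing=0
    local header sample fastq_1 fastq_2 strandedness fq
    {
        # Discard the header row
        read -r header
        # Split each remaining row on commas; also handle a last row
        # that has no trailing newline
        while IFS=, read -r sample fastq_1 fastq_2 strandedness || [ -n "$sample" ]; do
            for fq in "$fastq_1" "$fastq_2"; do
                # Skip empty fields (blank rows, or no read 2 file)
                [ -z "$fq" ] && continue
                if [ ! -f "$fq" ]; then
                    echo "MISSING ($sample): $fq"
                    missing=$((missing + 1))
                fi
            done
        done
    } < "$sheet"
    echo "$missing missing file(s)"
    # Return success only when nothing is missing
    [ "$missing" -eq 0 ]
}
```

Usage: run check_samplesheet index.csv in the folder that contains the sample sheet. The function returns a non-zero exit status when at least one file is missing, so it can also be used as a guard at the top of a script.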

Example launch.pbs script:

Code Block
#!/bin/bash -l
#PBS -N nfrnaseq
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=24:00:00

# Run the workflow from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

module load java
# Export the JVM memory settings so that Nextflow can see them
export NXF_OPTS='-Xms1g -Xmx4g'

# Run the Nextflow RNA-seq pipeline:
nextflow run nf-core/rnaseq -profile singularity -r 3.3 --aligner star_salmon --input index.csv --genome GRCh38 -resume
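
The launch.pbs script is handed to the scheduler rather than run directly. The following is a minimal sketch of how this could be done, assuming a PBS installation that provides the standard qsub command and that launch.pbs and index.csv are in the current folder. The function name submit_rnaseq is hypothetical.

```shell
# Hypothetical wrapper: check that the inputs exist, then submit the
# launch script to the PBS scheduler.
submit_rnaseq() {
    local script="${1:-launch.pbs}"
    local sheet="${2:-index.csv}"
    local f
    for f in "$script" "$sheet"; do
        if [ ! -f "$f" ]; then
            echo "ERROR: $f not found in $(pwd)" >&2
            return 1
        fi
    done
    # qsub is normally only available on the HPC login nodes
    if ! command -v qsub >/dev/null 2>&1; then
        echo "ERROR: qsub not available; run this on an HPC login node" >&2
        return 1
    fi
    # On success, qsub prints the ID of the submitted job
    qsub "$script"
}
```

Usage: run submit_rnaseq from the folder that holds both files, then check the job status with qstat -u "$USER". Nextflow also writes its own log to the hidden file .nextflow.log in the folder it was launched from, which is a good first place to look if the job stops early.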

...

Code Block
cp /work/kenna_team/scripts/star_rsem/* .

Visualizing results

The results generated by the pipeline are written to the ‘results’ folder, where they can be browsed.

Code Block
# Go to the results folder. Note: by default, all Nextflow pipelines place the key outputs in the 'results' folder, while the 'work' folder contains all intermediate files generated during execution.
cd results

#list folders and files
ls

Example output (long-listing format):

Code Block
drwxrws---  2 barrero 4.0K Sep  7 20:05 fastqc/
drwxrws---  3 barrero 4.0K Sep  7 20:16 trimgalore/
drwxrws---  3 barrero   23 Sep  9 13:03 multiqc/
drwxrws---  2 barrero 4.0K Sep  9 13:03 pipeline_info/
drwxrws--- 20 barrero 4.0K Sep 14 23:23 star_rsem/
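
Most of the folders listed above contain HTML reports that can be opened in a web browser, for example the MultiQC summary (multiqc_report.html is MultiQC's default report name). The following is a minimal sketch for locating them. The function name find_reports is hypothetical, and it assumes the reports may sit in sub-folders of the results folder.

```shell
# Hypothetical helper: list the HTML reports under a results folder.
find_reports() {
    local results_dir="${1:-results}"
    if [ ! -d "$results_dir" ]; then
        echo "ERROR: $results_dir is not a directory" >&2
        return 1
    fi
    # Search recursively, since reports may sit in sub-folders
    find "$results_dir" -type f -name '*.html' | sort
}
```

Usage: run find_reports results from the folder where the pipeline was launched. To see how much disk space the outputs and the intermediate files occupy, du -sh results work prints one total per folder.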

Access the HPC files from your laptop

Mac laptop (note: you need to be connected via VPN)

  • Open a ‘Finder’ window

  • Click on the Finder window and press “Command + K” (equivalent to Go > Connect to Server in the menu bar)

  • This will open a new window:

    [Image: ‘Connect to Server’ window showing the server address]
  • Type the server address shown above to connect to the shared ‘work’ space, then click Connect. To access your personal space, replace ‘work’ with ‘home’.