Session 2: RNA-seq expression

This page gives a brief recap of basic Unix commands for HPC users with no previous experience, then walks through running the nf-core/rnaseq pipeline on the HPC.

Log into the HPC

ssh userID@lyra.qut.edu.au

Brief basic Unix commands recap

Once you log into the HPC, you will land in your personal home space (e.g. /home/myStudentID/). This space is accessible only to you. To collaborate with others, we use shared workspaces (e.g. /work/myProjectName/).

To go to the shared directory for your project (in this example, “chlamydia_carey”), type the following command and press Enter:

cd /work/chlamydia_carey/

Display list of files in a directory

ls -lh

Print working directory
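The standard command for this step is 'pwd', which prints the full path of the folder you are currently in. A minimal example:

```shell
# Print the full path of the current folder.
pwd
```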

Create a folder
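Folders are created with 'mkdir'. In this sketch the folder name 'session2_practice' is only a placeholder; choose any name you like.

```shell
# 'session2_practice' is a hypothetical folder name.
# -p avoids an error if the folder already exists.
mkdir -p session2_practice
```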

Enter new folder
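To enter a folder, use 'cd' followed by the folder name. The sketch below assumes the placeholder folder name 'session2_practice' and creates it first so the example works on its own.

```shell
# Create the hypothetical folder if it does not exist yet, then enter it.
mkdir -p session2_practice
cd session2_practice
pwd    # confirm the new location
```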

Move back to the previous folder
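One way to do this, assuming "previous folder" means the folder one level up, is 'cd ..'. The folder name below is a placeholder used only to make the example complete.

```shell
start_dir="$(pwd)"            # remember where we started
mkdir -p session2_practice    # hypothetical folder name
cd session2_practice          # enter the folder
cd ..                         # move back up one level
pwd                           # confirm we are back where we started
```

If you instead want to return to whichever folder you were in last, wherever it is, 'cd -' does that.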

Make a backup copy of the file
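Copies are made with 'cp', giving the original file first and the name of the copy second. The file 'example_script.sh' below is a stand-in created only so the sketch runs anywhere; on the HPC you would copy your own file.

```shell
# Create a small stand-in file (hypothetical name).
printf '#!/bin/bash\n' > example_script.sh

# Make a backup copy next to the original.
cp example_script.sh example_script.sh.bak

# List both files to confirm the copy exists.
ls -lh example_script.sh example_script.sh.bak
```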

Move a copy of a file to a newly created folder. Note: it is recommended to copy important files before modifying them or running commands on them.
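Files are moved with 'mv'. This sketch uses placeholder names ('example_script.sh', 'backups') and creates the stand-in file itself so that it is complete on its own.

```shell
printf '#!/bin/bash\n' > example_script.sh    # stand-in file (hypothetical name)
cp example_script.sh example_script.sh.bak    # make the copy
mkdir -p backups                              # create the new folder (hypothetical name)
mv example_script.sh.bak backups/             # move the copy into it
ls -lh backups                                # confirm the copy arrived
```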

View the content of a file (note: a hash symbol # at the start of a line marks a comment that describes the code underneath it)
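Common commands for viewing a file are 'cat' (print the whole file) and 'head' (print the first lines). The file below is a stand-in with one comment line and one command, created only for illustration.

```shell
# Create a small stand-in file so the example runs anywhere.
printf '# This line is a comment describing the command below\necho "hello"\n' > example_script.sh

cat example_script.sh          # print the whole file to the screen
head -n 1 example_script.sh    # print only the first line
```

For long files, 'less example_script.sh' lets you scroll through the content; press 'q' to quit.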

Go back to your personal space. Type 'cd' and press Enter. This will move you to /home/myStudentID/

Interactive session:
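On a PBS-managed cluster, an interactive session is typically requested with 'qsub -I'. The sketch below assumes the PBS Professional 'select' syntax, and the CPU, memory and wall time values are illustrative only; check your HPC documentation for the recommended settings.

```shell
# Resource values are illustrative assumptions; adjust them to your needs.
resources="select=1:ncpus=2:mem=8gb"
walltime="walltime=02:00:00"

# qsub only exists on the HPC, so the guard keeps this snippet harmless elsewhere.
if command -v qsub >/dev/null 2>&1; then
    # -I requests an interactive job; -S sets the shell used inside the job.
    qsub -I -S /bin/bash -l "$resources" -l "$walltime"
else
    echo "qsub not found: run this after logging into the HPC"
fi
```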

Running the Nextflow nf-core/rnaseq pipeline

Requirements:

  • index.csv → a file that provides a list of sample IDs and their associated FASTQ files (read 1 and read 2)

  • launch.pbs → a script to submit the job to the HPC cluster

Example index.csv file for nf-core/rnaseq version 3.3:
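As an illustration, the snippet below writes a small index.csv. The four-column layout (sample, fastq_1, fastq_2, strandedness) is the samplesheet format as I understand it for the 3.x releases; please confirm it against the usage page for version 3.3. Sample names and FASTQ paths are placeholders, not real data.

```shell
# Write a hypothetical samplesheet; sample names and paths are placeholders.
cat > index.csv <<'EOF'
sample,fastq_1,fastq_2,strandedness
control_rep1,/path/to/control_rep1_R1.fastq.gz,/path/to/control_rep1_R2.fastq.gz,unstranded
control_rep2,/path/to/control_rep2_R1.fastq.gz,/path/to/control_rep2_R2.fastq.gz,unstranded
treated_rep1,/path/to/treated_rep1_R1.fastq.gz,/path/to/treated_rep1_R2.fastq.gz,unstranded
EOF

# Show the result.
cat index.csv
```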

Example launch.pbs script:
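A minimal sketch of such a script is written out below. The PBS resource requests, the 'singularity' profile and the 'GRCh38' genome are assumptions made for illustration; your HPC may need different values or extra lines (for example, loading Java or Nextflow).

```shell
# Write a hypothetical launch.pbs; resource values, profile and genome are assumptions.
cat > launch.pbs <<'EOF'
#!/bin/bash -l
#PBS -N nfrnaseq
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=24:00:00

# Run from the folder the job was submitted from
cd "$PBS_O_WORKDIR"

# Launch the pipeline
nextflow run nf-core/rnaseq -r 3.3 \
    -profile singularity \
    --input index.csv \
    --genome GRCh38 \
    --aligner star_salmon
EOF

# Syntax check only; nothing is executed.
bash -n launch.pbs
```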

where:

--aligner Specifies the alignment algorithm to use - available options are 'star_salmon', 'star_rsem', and 'hisat2'. The default option is 'star_salmon'.

More information at:

https://nf-co.re/rnaseq/3.3/usage

More advanced launch.pbs script example:
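One possible shape for a more advanced script is sketched below. It is not the workshop's original script: it swaps the built-in genome for your own reference files ('--fasta', '--gtf'), names the output folder, caps the resources the pipeline may request, and adds '-resume' so a restarted job reuses finished steps. All paths and resource values are placeholders.

```shell
# Write a hypothetical, more detailed launch.pbs; paths and values are placeholders.
cat > launch.pbs <<'EOF'
#!/bin/bash -l
#PBS -N nfrnaseq
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=48:00:00

cd "$PBS_O_WORKDIR"

# Limit the memory used by the Nextflow head process (example values)
export NXF_OPTS='-Xms1g -Xmx4g'

nextflow run nf-core/rnaseq -r 3.3 \
    -profile singularity \
    --input index.csv \
    --fasta /path/to/genome.fa \
    --gtf /path/to/genes.gtf \
    --aligner star_salmon \
    --outdir results \
    --max_cpus 8 \
    --max_memory '32.GB' \
    -resume
EOF

# Syntax check only; nothing is executed.
bash -n launch.pbs
```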

Session 2 exercises:

  1. Run the nf-core/rnaseq pipeline on the airway smooth muscle public dataset (PMID: 24926665; GEO: GSE52778) with the aligner option set to ‘star_salmon’

  2. Same as above, but with the aligner option set to ‘star_rsem’

Create a new working folder:

Copy index.csv and launch.pbs files to the newly created folder

Check that files were copied into the new working folder
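The three steps above (create the folder, copy the files, check the copy) can be sketched together as follows. The folder names 'templates' and 'run1_star_salmon' are placeholders, and the empty stand-in files exist only so the example is complete; on the HPC, copy from wherever your index.csv and launch.pbs actually live.

```shell
# Stand-in source folder with placeholder files (hypothetical names).
mkdir -p templates
touch templates/index.csv templates/launch.pbs

# 1. Create a new working folder for the star_salmon run.
mkdir -p run1_star_salmon

# 2. Copy index.csv and launch.pbs into it.
cp templates/index.csv templates/launch.pbs run1_star_salmon/

# 3. Check that both files arrived.
ls -lh run1_star_salmon
```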

Run the workflow:
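On a PBS cluster, a job script is submitted with 'qsub'. The sketch below assumes you are inside the working folder that holds launch.pbs and index.csv.

```shell
job_script="launch.pbs"

# qsub only exists on the HPC, so the guard keeps this snippet harmless elsewhere.
if command -v qsub >/dev/null 2>&1; then
    qsub "$job_script"    # prints the job ID on success
else
    echo "qsub not found: run this after logging into the HPC"
fi
```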

Monitor the progress of the workflow:
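A common way to check on PBS jobs is 'qstat', restricted to your own user name with '-u'. This is a sketch of that approach; your HPC may also offer site-specific job monitoring commands.

```shell
# Work out the current user name.
job_owner="${USER:-$(id -un)}"

# qstat only exists on the HPC, so the guard keeps this snippet harmless elsewhere.
if command -v qstat >/dev/null 2>&1; then
    qstat -u "$job_owner"    # list this user's queued and running jobs
else
    echo "qstat not found: run this after logging into the HPC"
fi
```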

or
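Another option, offered here as a suggestion rather than the workshop's original command, is to read the log file that Nextflow writes to the folder the job was launched from (.nextflow.log).

```shell
log_file=".nextflow.log"

if [ -f "$log_file" ]; then
    tail -n 20 "$log_file"    # show the last 20 lines of the log
else
    echo "No $log_file here yet: run this from the folder where the job was submitted"
fi
```

To keep watching the log as it grows, use 'tail -f .nextflow.log' and press Ctrl + C to stop.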

Repeat the above process for ‘star_rsem’

The only difference is in how the index.csv and launch.pbs files are copied into the new working folder, as follows:
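One way to do this, sketched under the assumption that the first run lives in a folder called 'run1_star_salmon' (a placeholder name), is to copy index.csv unchanged and write a copy of launch.pbs with the aligner switched to 'star_rsem'. The stand-in files are created only so the example is complete.

```shell
# Stand-in files from the first run (hypothetical folder name and content).
mkdir -p run1_star_salmon
touch run1_star_salmon/index.csv
printf 'nextflow run nf-core/rnaseq -r 3.3 --input index.csv --aligner star_salmon\n' > run1_star_salmon/launch.pbs

# New working folder for the second run.
mkdir -p run2_star_rsem

# Copy the samplesheet unchanged.
cp run1_star_salmon/index.csv run2_star_rsem/

# Copy the launch script while replacing the aligner option.
sed 's/--aligner star_salmon/--aligner star_rsem/' run1_star_salmon/launch.pbs > run2_star_rsem/launch.pbs

# Confirm the change.
grep -- '--aligner' run2_star_rsem/launch.pbs
```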

Visualizing results

The results generated by the pipeline are written to the ‘results’ folder, where they can be viewed.

example output:

Access the HPC files from your laptop

Mac laptop (note: you need to be connected via VPN)

  • Open a ‘Finder’ window

  • Click on the search file tab and press the “Command + K” keys simultaneously (in Finder this shortcut opens the “Connect to Server” window)

  • This will open a new window:

  • Type the above to connect to the shared ‘work’ space. To access your personal space, replace ‘work’ with ‘home’.
