Session 2: RNA-seq expression
This page gives a brief recap of basic Unix commands for HPC users with no previous knowledge, then walks through running the nf-core/rnaseq pipeline.
Log into the HPC
ssh userID@lyra.qut.edu.au
Brief basic Unix commands recap
Once you log into the HPC, you will land in your personal home space (e.g. /home/myStudentID/). This space is only accessible to you. To work in collaboration with others we use workspaces (e.g. /work/myProjectName/).
To go to the shared directory for your project (named “chlamydia_carey” in this example), type the following command and press Enter:
cd /work/chlamydia_carey/
Display list of files in a directory
ls -lh
Print working directory
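The command for this step is `pwd`, which prints the absolute path of the directory you are currently in:

```shell
# Print the absolute path of the current working directory
pwd
```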
Create a folder
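Folders are created with `mkdir`. The folder name `session2_demo` below is only a placeholder chosen for illustration; use any name you like.

```shell
# Create a new folder (the name "session2_demo" is a placeholder)
# -p avoids an error if the folder already exists
mkdir -p session2_demo
```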
Enter new folder
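Use `cd` followed by the folder name. The sketch below assumes the placeholder folder `session2_demo` and creates it first if it is missing, so the example runs on its own.

```shell
# Make sure the placeholder folder exists, then move into it
mkdir -p session2_demo
cd session2_demo
# Confirm where you are now
pwd
```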
Move back to the previous folder
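Assuming "previous folder" means the parent of the current folder, `cd ..` moves up one level:

```shell
# Show where you are, move up one level, then show the new location
echo "Before: $PWD"
cd ..    # ".." always refers to the parent of the current folder
echo "After:  $PWD"
```

If you instead want to return to the last directory you visited, wherever it was, `cd -` does that.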
Make a backup copy of the file
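A backup is made with `cp`, giving the copy a new name. The file `example_file.txt` is a hypothetical placeholder, created here only so the example is self-contained.

```shell
# Create a small placeholder file (substitute your own file on the HPC)
echo "# example content" > example_file.txt
# Copy it under a new name; the original stays untouched
cp example_file.txt example_file.txt.bak
# Both files should now be listed
ls -lh example_file.txt example_file.txt.bak
```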
Copy a file into a newly created folder. Note: it is recommended to make a copy of important files before modifying them or running commands on them.
View the content of a file (note: a hash symbol # at the start of a line marks a comment that describes the code beneath it)
Go back to your personal space. Type 'cd' and press Enter. This moves you to /home/myStudentID/
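One way to do this is `cp` with a folder as the destination. The file and folder names below are placeholders, created in the example so it can run by itself.

```shell
# Placeholder file and folder (substitute your own names)
echo "# example content" > example_file.txt
mkdir -p session2_demo
# Put a copy of the file inside the folder; the original stays where it is
cp example_file.txt session2_demo/
# List the folder to confirm the copy arrived
ls -lh session2_demo
```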
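`cat` prints a whole file to the terminal. The placeholder file below is written first so that there is something to display; it holds one comment line and one command line.

```shell
# Write a two-line placeholder file: a comment (starts with #) and a command
printf '# List files with sizes in human-readable form\nls -lh\n' > example_file.txt
# Print the file content to the terminal
cat example_file.txt
```

For long files, `less example_file.txt` lets you scroll through the content; press `q` to quit.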
Interactive session:
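On a PBS-managed cluster, an interactive session is typically requested with `qsub -I`. The sketch below is an assumption about how this could look here: the resource values (2 CPUs, 4 GB of memory, 2 hours) are illustrative placeholders, and the check for `qsub` is only there so the snippet does nothing harmful when run off the HPC.

```shell
# Request an interactive shell on a compute node through PBS.
# Resource values are placeholders; adjust them to your needs and site limits.
if command -v qsub >/dev/null 2>&1; then
    qsub -I -S /bin/bash -l select=1:ncpus=2:mem=4gb -l walltime=02:00:00
else
    echo "qsub not found: run this on the HPC login node"
fi
```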
Running the Nextflow nf-core/rnaseq pipeline
Requirements:
index.csv → a file that provides a list of sample IDs and their associated FASTQ files (read 1 and read 2)
launch.pbs → a script to submit the job to the HPC cluster
Example index.csv file for nf-core/rnaseq version 3.3:
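A minimal sketch of such a file is written below with a here-document. The header row follows the samplesheet layout documented for nf-core/rnaseq 3.3 (`sample,fastq_1,fastq_2,strandedness`); the sample names and FASTQ paths are hypothetical placeholders, and `unstranded` is assumed for the library type.

```shell
# Write a two-sample, paired-end samplesheet.
# Sample names and FASTQ paths are hypothetical; replace them with your own.
cat > index.csv <<'EOF'
sample,fastq_1,fastq_2,strandedness
control_rep1,/work/myProjectName/fastq/control_rep1_R1.fastq.gz,/work/myProjectName/fastq/control_rep1_R2.fastq.gz,unstranded
treated_rep1,/work/myProjectName/fastq/treated_rep1_R1.fastq.gz,/work/myProjectName/fastq/treated_rep1_R2.fastq.gz,unstranded
EOF
# Display the file to check its layout
cat index.csv
```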
Example launch.pbs script:
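The sketch below writes a basic script to `launch.pbs` and checks its syntax. The job name, resource requests, the `module load java` line, the `singularity` profile, and the `GRCh38` genome are all assumptions made for illustration; check what applies on your HPC before submitting.

```shell
# Write a basic PBS job script (resource values and module name are assumptions)
cat > launch.pbs <<'EOF'
#!/bin/bash -l
#PBS -N nfrnaseq
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=24:00:00

# Work from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

# Nextflow needs Java; the module name is site-specific
module load java

# Launch the pipeline on the samples listed in index.csv
nextflow run nf-core/rnaseq -r 3.3 \
    -profile singularity \
    --input index.csv \
    --genome GRCh38 \
    --aligner star_salmon
EOF
# Check the script for shell syntax errors without running it
bash -n launch.pbs
```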
where:
--aligner: specifies the alignment algorithm to use. Available options are 'star_salmon', 'star_rsem', and 'hisat2'. The default option is 'star_salmon'.
More information at:
https://nf-co.re/rnaseq/3.3/usage
More advanced launch.pbs script example:
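As one possible illustration, the sketch below supplies a custom reference genome and annotation, names the output folder, and resumes from cached results if the job is rerun. It is written to a separate file, `launch_advanced.pbs`, a hypothetical name used so the basic script is not overwritten. The reference paths, resource values, and module name are placeholders.

```shell
# Write a more detailed PBS job script (all paths and resources are placeholders)
cat > launch_advanced.pbs <<'EOF'
#!/bin/bash -l
#PBS -N nfrnaseq_adv
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=48:00:00

cd "$PBS_O_WORKDIR"
module load java

# --fasta/--gtf: use your own reference files instead of a named genome
# --outdir:      folder where results are written
# -resume:       reuse cached results from a previous run where possible
nextflow run nf-core/rnaseq -r 3.3 \
    -profile singularity \
    --input index.csv \
    --fasta /work/myProjectName/reference/genome.fa \
    --gtf /work/myProjectName/reference/genes.gtf \
    --aligner star_salmon \
    --outdir results \
    -resume
EOF
# Check the script for shell syntax errors without running it
bash -n launch_advanced.pbs
```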
Session 2 exercises:
Run the nf-core/rnaseq pipeline using the airway smooth muscle public data (PMID: 24926665; GEO: GSE52778) with the aligner option set to ‘star_salmon’
Same as above, but with the aligner option set to ‘star_rsem’
Create a new working folder:
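For example, using the placeholder name `run1_star_salmon` (any name works):

```shell
# Create a working folder for the star_salmon run (placeholder name)
mkdir -p run1_star_salmon
```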
Copy index.csv and launch.pbs files to the newly created folder
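A sketch of the copy step is shown below. `SRC` is a hypothetical variable pointing at wherever the prepared files live; the setup lines create empty stand-ins so the example can run on its own, and should be replaced by the real location on the HPC.

```shell
# SRC is a hypothetical source folder; point it at the real location of the files
SRC=./session2_files
# Stand-in setup so the example is self-contained (touch never overwrites content)
mkdir -p "$SRC" run1_star_salmon
touch "$SRC/index.csv" "$SRC/launch.pbs"
# Copy both files into the working folder
cp "$SRC/index.csv" "$SRC/launch.pbs" run1_star_salmon/
```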
Check that files were copied into the new working folder
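Listing the folder shows what it contains. The folder name is the same placeholder as before.

```shell
# Make sure the placeholder folder exists (no effect if it already does)
mkdir -p run1_star_salmon
# List its content with sizes in human-readable form
ls -lh run1_star_salmon
```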
Run the workflow:
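Jobs are submitted to PBS with `qsub`, from inside the folder that holds `launch.pbs`. The sketch below assumes the placeholder folder `run1_star_salmon` and only submits when both `qsub` and the script are present.

```shell
# Move into the working folder (placeholder name; created here if missing)
mkdir -p run1_star_salmon
cd run1_star_salmon
# Submit the job script to the PBS scheduler
if command -v qsub >/dev/null 2>&1 && [ -f launch.pbs ]; then
    qsub launch.pbs
else
    echo "Nothing submitted: qsub or launch.pbs is not available here"
fi
```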
Monitor the progress of the workflow:
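One option, assuming a PBS scheduler, is to list your own jobs with `qstat`:

```shell
# List your jobs in the PBS queue; the S column shows the state (Q = queued, R = running)
if command -v qstat >/dev/null 2>&1; then
    qstat -u "$USER"
else
    echo "qstat not found: run this on the HPC"
fi
```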
or
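Another option, offered here as an assumption rather than the original command, is to read the log that Nextflow writes to `.nextflow.log` in the folder where the pipeline was launched:

```shell
# Show the last 20 lines of the Nextflow log, if one exists in this folder
if [ -f .nextflow.log ]; then
    tail -n 20 .nextflow.log
else
    echo "No .nextflow.log here yet: run this in the folder the job was submitted from"
fi
```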
Repeat the above process for ‘star_rsem’
The only variation is in copying the index.csv and launch.pbs script, as follows:
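A hedged sketch of this step: copy the same two files into a second working folder and switch the aligner in the copied script. `SRC` and `run2_star_rsem` are hypothetical names, and the stand-in `launch.pbs` is written only when no non-empty script exists at that location.

```shell
# Run from the folder that contains your run folders.
# SRC is a hypothetical source folder; point it at the real location of the files.
SRC=./session2_files
mkdir -p "$SRC" run2_star_rsem
# Stand-in files so the example is self-contained
touch "$SRC/index.csv"
[ -s "$SRC/launch.pbs" ] || echo "nextflow run nf-core/rnaseq -r 3.3 --input index.csv --aligner star_salmon" > "$SRC/launch.pbs"
# Copy the samplesheet unchanged
cp "$SRC/index.csv" run2_star_rsem/
# Copy the job script while replacing the aligner option
sed 's/--aligner star_salmon/--aligner star_rsem/' "$SRC/launch.pbs" > run2_star_rsem/launch.pbs
# Confirm the aligner in the new script
grep -- '--aligner' run2_star_rsem/launch.pbs
```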
Visualizing results
The results generated by the pipeline are written to the ‘results’ folder, where they can be viewed.
example output:
Access the HPC files from your laptop
Mac laptop (note: you need to be connected via VPN)
Open the ‘Finder’ window
Click on the search file tab and hit the “Command + K” keys simultaneously
This will open a new window:
Type the above to connect to the shared ‘work’ space. To access your personal space, replace ‘work’ with ‘home’.
Next
Differential expression analysis using BioJupies | Get Started