2024-2: 7c.1 Running R scripts on HPC
Running R Scripts on the HPC
If all your data is on the HPC, or your analysis is too large or takes too long on your desktop/laptop, it is possible to run the R scripts on the HPC.
Preparing your R script for the HPC
QUT’s HPC is based on Linux, so the paths to your files are likely to be different there, and we must update them in the script to the HPC paths.
Using RStudio, you can adjust the paths in your script. In the DESeq2.R script, there are a number of places that we need to change for it to work on the HPC:
# The setwd line needs to be changed to:
setwd("~/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/DESeq2")
# The line where you read in the mature counts table needs to be changed:
metacounts <- read.table("~/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/DESeq2/mature_counts.txt", header = TRUE, row.names = 1)
# The line where you read in the metadata table needs to be changed:
meta <- read.table("~/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/DESeq2/metadata_microRNA.txt", header = TRUE)
The H: and W: drives do not exist on the HPC. The folders are there, just under a different path.
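As a small illustration of the path change, the sketch below translates a Windows H:\ path into its HPC equivalent. It assumes H: maps to your HPC home folder (as noted further down this page); the function name win_to_hpc is made up for this example, and W: is deliberately not handled because its HPC location is not given here.

```shell
# Translate a Windows H:\ path into the matching HPC path.
# Assumption: H: maps to $HOME. Other drive letters (e.g. W:) are rejected.
win_to_hpc() {
    case "$1" in
        [Hh]:\\*)
            rest=${1#?:\\}    # drop the leading "H:\"
            # swap every backslash for a forward slash and prefix $HOME
            printf '%s/%s\n' "$HOME" "$(printf '%s' "$rest" | tr '\\' '/')"
            ;;
        *)
            echo "unhandled path: $1" >&2
            return 1
            ;;
    esac
}

win_to_hpc 'H:\workshop\2024-2\session6_smallRNAseq\runs\run1_human_miRBase\DESeq2'
```

The printed path is the same folder that the setwd line above refers to with ~, since ~ and $HOME both mean your home folder.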
Preparing a Script to run the R script on the HPC
Log on to the HPC via PuTTY (or, on a Mac, ssh to lyra from the terminal), then change directory to where we saved the DESeq2.R file:
cd $HOME/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/DESeq2
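Before going further, it can help to confirm that the script and the two input tables it reads are actually in that folder. This is only a sketch: check_inputs is a hypothetical helper name, and the file names are the ones used in the read.table lines above.

```shell
# Report any of the expected files that are missing from a directory.
# Returns 0 when everything is present, 1 otherwise.
check_inputs() {
    dir=$1
    missing=0
    for f in DESeq2.R mature_counts.txt metadata_microRNA.txt; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}

check_inputs "$HOME/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/DESeq2" \
    || echo "Copy the missing files into place before submitting the job."
```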
We will need to import the R libraries that we need for this R analysis (unlike when we used the rVDI, where we had imported them for you).
We will do this by making a folder for the libraries:
mkdir -p $HOME/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/r_library
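As a side note, R also reads the R_LIBS_USER environment variable at startup and adds that folder to its library search path if the folder exists. The sketch below shows this as an optional extra, not as a replacement for the script changes described next. It assumes a standard R installation; the variable R_LIB_DIR is just a local name chosen for this example.

```shell
# Folder created in the step above (R_LIB_DIR is an illustrative variable name)
R_LIB_DIR="$HOME/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/r_library"
mkdir -p "$R_LIB_DIR"

# Tell R to treat this folder as the user library for this shell session.
# A batch job does not inherit this automatically, so the same export
# would also be needed inside the job script.
export R_LIBS_USER="$R_LIB_DIR"
echo "R_LIBS_USER is set to: $R_LIBS_USER"
```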
Then we will need to add these sections at the start of our DESeq2.R script:
A job script is also needed to request resources from the scheduler and run the R script. This one works well for the DESeq2.R script…
Using RStudio, create a Text File (or Shell Script) and paste in the contents of this script…
File → New File → Text File (or Shell Script)
Save it as launch_R.pbs (or as launch_R if saving as a Shell Script, since .sh is added automatically) in H:\workshop\2024-2\session6_smallRNAseq\runs\run1_human_miRBase\DESeq2, the same folder as DESeq2.R. Remember, H: points to your HPC home folder.
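The workshop's own job script is not reproduced on this page. Purely as a rough sketch of the general shape of a PBS job script for this task, the command below writes one from the HPC command line using a heredoc. The job name, the CPU, memory and walltime requests, and the module name are all assumptions made for illustration, so replace them with the values given in the workshop.

```shell
# Write a minimal PBS job script. The job name, resource requests and the
# "module load" line are illustrative assumptions, not the workshop's values.
cat > launch_R.pbs <<'EOF'
#!/bin/bash -l
#PBS -N DESeq2_R
#PBS -l select=1:ncpus=1:mem=8gb
#PBS -l walltime=01:00:00

# Start in the directory the job was submitted from (set by PBS)
cd "$PBS_O_WORKDIR"

# Hypothetical module name; check `module avail` for the R module on your system
module load R

# Run the R script non-interactively
Rscript DESeq2.R
EOF
```

A file written this way on the HPC already has Linux line endings, unlike one saved from RStudio on Windows.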
Running the Script on the HPC
Now that the script is on the HPC, we can run it, but we have to convert it first. RStudio on Windows saves the text file in “Windows” format (with Windows-style line endings), which the HPC has trouble reading, so we convert it to “Linux” format. Once the file is converted, we can submit the script to the scheduler and wait for it to run. Copy and paste each of the unhashed lines (those not starting with #) into the Linux command line on the HPC.
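The exact commands are not shown on this page, so the following is a minimal sketch of the conversion and submission steps. It uses tr to strip the carriage returns that Windows adds to each line ending. The helper name to_unix and the demo file are made up for this example, and qsub and qstat are assumed to be available because the job script is a PBS script.

```shell
# Convert a file saved with Windows (CRLF) line endings to Linux (LF) line endings.
to_unix() {
    # tr deletes every carriage return; write to a temp file, then replace the original
    tr -d '\r' < "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Demonstration on a throwaway file (hypothetical name) rather than the real job script
printf 'echo one\r\necho two\r\n' > demo_crlf.sh
to_unix demo_crlf.sh

# On the HPC the equivalent steps would be:
#   to_unix launch_R.pbs      # or launch_R.sh if it was saved as a Shell Script
#   qsub launch_R.pbs         # submit the job to the PBS scheduler
#   qstat -u $USER            # check the job's status in the queue
```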