2024-2: 7c.1 Running R scripts on HPC

Running R Scripts on the HPC

If all your data is on the HPC, or your analysis is too large or too slow for your desktop or laptop, you can run your R scripts on the HPC instead.

Preparing your R script for the HPC

QUT’s HPC runs Linux, so the paths to your files are likely to differ from those on your desktop. The paths in your script must be updated to their HPC equivalents.

Using RStudio, you can adjust the paths in your script. In the DESeq2.R script, several lines need to change for it to work on the HPC:

# The setwd line needs to be changed to:
setwd("~/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/DESeq2")

# The line where you read in the mature counts table needs to be changed:
metacounts <- read.table("~/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/DESeq2/mature_counts.txt", header = TRUE, row.names = 1)

# The line where you read in the metadata table needs to be changed:
meta <- read.table("~/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/DESeq2/metadata_microRNA.txt", header = TRUE)

The H: and W: drives do not exist on the HPC. The folders are there, just under a different path.
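Before running the script on the HPC, it can help to check that no Windows drive paths were missed and that the input files sit where the script now expects them. The following is a minimal sketch, run from the HPC command line; the function name find_windows_paths is hypothetical, and the folder and file names are taken from the paths shown above.

```shell
#!/bin/bash
# Folder that the updated DESeq2.R script points to on the HPC.
DESEQ_DIR="$HOME/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/DESeq2"

find_windows_paths() {
    # Print, with line numbers, every line of the given file that still
    # mentions an H: or W: drive path (followed by / or \).
    grep -nE '[HW]:[\\/]' "$1"
}

# Report which of the expected files are present in the HPC folder.
for f in DESeq2.R mature_counts.txt metadata_microRNA.txt; do
    if [ -e "$DESEQ_DIR/$f" ]; then
        echo "found:   $f"
    else
        echo "missing: $f"
    fi
done

# Scan the R script for leftover Windows paths, if the script is there.
if [ -f "$DESEQ_DIR/DESeq2.R" ]; then
    find_windows_paths "$DESEQ_DIR/DESeq2.R" || echo "No H: or W: paths found."
fi
```

Any line the scan prints still needs its path changed to the HPC form.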

Preparing a Script to run the R script on the HPC

Log on to the HPC via PuTTY (or ssh to lyra from the terminal on a Mac), then change directory to where we saved the DESeq2.R file:

cd $HOME/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/DESeq2

We will need to import the R libraries required for this analysis ourselves (on the rVDI, we had imported them for you).

We will do this by making a folder for the libraries:

mkdir -p $HOME/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/r_library

Then we will need to add these sections at the start of our DESeq2.R script:
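As an alternative to editing the script, R can also be pointed at the library folder from the shell. This is a minimal sketch, assuming the r_library folder created with mkdir above; R_LIBS_USER is a standard R environment variable, but using it here is an assumption and not necessarily what the workshop's own script section does. The variable name R_LIB_DIR is hypothetical.

```shell
# Folder created with mkdir -p in the previous step.
R_LIB_DIR="$HOME/workshop/2024-2/session6_smallRNAseq/runs/run1_human_miRBase/r_library"

# R ignores library paths that do not exist, so make sure the folder is there.
mkdir -p "$R_LIB_DIR"

# R adds this folder to its library search path when it starts.
export R_LIBS_USER="$R_LIB_DIR"

echo "R_LIBS_USER=$R_LIBS_USER"
```

The export only lasts for the current shell session, so it would need to be repeated inside the job script (or set in your shell profile) to apply to a submitted job.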

A job script is also needed to request resources and run the R script. This one works well for the DESeq2.R script…

Using RStudio, create a Text File (or Shell Script) and paste in the contents of this script…

File → New File → Text File (or Shell Script)

Save it as launch_R.pbs (or launch_R if saving as a Shell Script, since .sh is added automatically) in H:\workshop\2024-2\session6_smallRNAseq\runs\run1_human_miRBase\DESeq2 (the same folder as DESeq2.R). Remember, H: points to your HPC home folder.
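For reference, a job script of this kind could also be written directly on the HPC from the command line. The following is only a minimal sketch, assuming a PBS scheduler (suggested by the .pbs extension); the job name, the resource values (1 CPU, 8 GB of memory, 1 hour) and the module load r line are hypothetical placeholders, not the workshop's actual settings.

```shell
#!/bin/bash
# Write a minimal PBS job script into the current directory.
# All #PBS values below are hypothetical placeholders; use the values
# recommended for your HPC and your analysis.
cat > launch_R.pbs <<'EOF'
#!/bin/bash -l
#PBS -N DESeq2_R
#PBS -l select=1:ncpus=1:mem=8gb
#PBS -l walltime=01:00:00

# Start in the directory the job was submitted from (set by PBS).
cd "$PBS_O_WORKDIR"

# Assumed module name; check `module avail` for the real one.
module load r

# Run the R script non-interactively.
Rscript DESeq2.R
EOF

echo "Wrote launch_R.pbs"
```

The quoted 'EOF' keeps $PBS_O_WORKDIR from being expanded while the file is written, so the variable is only resolved when the job runs.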

Running the Script on the HPC

Now that the script is on the HPC, we can run it, but we have to convert it first. RStudio on Windows saves the text file in “Windows” format (with Windows line endings). The HPC has trouble reading this format, but the file is easily converted to “Linux” format. Once the file is converted, we can submit the script to the scheduler and wait for it to run. Copy and paste each of the lines that does not start with # into the Linux command line on the HPC.
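The convert-and-submit steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: the job script is named launch_R.pbs and sits in the current directory, carriage returns are stripped with tr (the dos2unix utility, where installed, does the same job), and the scheduler is PBS, whose submit command is qsub. The function name to_unix is hypothetical.

```shell
#!/bin/bash
to_unix() {
    # Delete every carriage-return character from the file, in place,
    # turning Windows (CRLF) line endings into Linux (LF) ones.
    local file="$1"
    local tmp
    tmp="$(mktemp)" || return 1
    # Write back with cat so the original file keeps its permissions.
    tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Only act when the job script is present in the current directory.
if [ -f launch_R.pbs ]; then
    to_unix launch_R.pbs
    # Submit only where the PBS client is available (i.e. on the HPC).
    if command -v qsub >/dev/null 2>&1; then
        qsub launch_R.pbs
    else
        echo "qsub not found; converted launch_R.pbs but did not submit it."
    fi
fi
```

On a PBS system, qstat -u $USER lists your queued and running jobs while you wait.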
