25S1W1 - 2. Initial setup
Overview
Initial requirements
To be able to run these exercises, you’ll need:
An HPC account
Nextflow installed on the HPC
PuTTY (Windows) or Terminal (Mac) on your local computer
Access to your HPC home directory from your PC
Instructions for getting an HPC account are here: https://qutvirtual4.qut.edu.au/group/staff/research/conducting/facilities/advanced-research-computing-storage/supercomputing/getting-started-with-hpc
If you haven’t installed Nextflow, follow the instructions in this link: 25S1W1 - 1. Getting started with Nextflow
You’ll need Terminal (Mac users) or PuTTY (Windows users) on your PC to access the HPC.
You can download PuTTY from here: https://the.earth.li/~sgtatham/putty/latest/w64/putty.exe
Open PuTTY, enter the HPC (Lyra) address, aqua.qut.edu.au, in the Host Name field, and then click ‘Open’.
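Mac users can make the same connection from Terminal with `ssh`. The sketch below assumes your HPC username is held in a hypothetical `HPC_USER` variable (the placeholder `your_username` is not a real account); the host address is the one given above. It only prints the command, so you can check it before connecting.

```shell
# Hypothetical placeholder: replace with your own HPC username.
HPC_USER="${HPC_USER:-your_username}"
# Host address as given in the instructions above.
HPC_HOST="aqua.qut.edu.au"

# Print the command first so it can be checked before running it.
echo "ssh ${HPC_USER}@${HPC_HOST}"

# Uncomment the next line to actually connect:
# ssh "${HPC_USER}@${HPC_HOST}"
```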
Set up Windows File Explorer to access your HPC home directory. Follow the instructions here:
Interactive HPC session
Open Terminal (Mac users) or PuTTY (Windows users) and paste the command below at the prompt to start an interactive session:
qsub -I -S /bin/bash -l walltime=4:00:00 -l select=1:ncpus=4:mem=8gb
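It should take less than a minute for the interactive session to start. The interactive session will allow you to run commands on a compute node with the resources requested above (4 CPUs and 8 GB of memory, for up to 4 hours).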
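Once the prompt returns, a quick sanity check can confirm that you are inside a PBS job and that Nextflow is reachable. This is a minimal sketch: it relies on PBS setting the `PBS_JOBID` environment variable inside a running job, and the function name `in_pbs_job` is illustrative only. Depending on how Nextflow was installed, it may need to be added to your `PATH` first.

```shell
# PBS sets PBS_JOBID inside a running job; it is normally unset on a login node.
in_pbs_job() {
    [ -n "${PBS_JOBID:-}" ]
}

if in_pbs_job; then
    echo "Inside PBS job ${PBS_JOBID}"
else
    echo "No PBS job detected: you may still be on the login node"
fi

# Confirm that Nextflow can be found from this session.
if command -v nextflow >/dev/null 2>&1; then
    nextflow -version
else
    echo "nextflow not found on PATH" >&2
fi
```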
Create working directories
We’ll be analysing Illumina amplicon data, so first we need to create the workshop directories in your home directory on the HPC. Copy and paste the commands below into PuTTY or Terminal.
To make sure you are in your home directory, run the following command:
cd $HOME
Next, let’s create working folders for today's exercises:
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/scripts
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/runs
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/runs/run1_test
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/runs/run2_ampliseq
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/data/illumina
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/data/mydata
cd $HOME/workshop/2025/S1W1/metagenomics
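The same layout can also be created in a single loop and then listed to confirm nothing was missed. This is an optional sketch rather than a required step; `BASE` and `make_workshop_dirs` are illustrative names, and the subdirectories are the ones created by the commands above.

```shell
# Root of the workshop tree (same path as used above).
BASE="${BASE:-$HOME/workshop/2025/S1W1/metagenomics}"

# Create every workshop subdirectory under the given root.
# mkdir -p also creates missing parents and is safe to re-run.
make_workshop_dirs() {
    base="$1"
    for sub in scripts runs/run1_test runs/run2_ampliseq data/illumina data/mydata; do
        mkdir -p "$base/$sub" || return 1
    done
}

if make_workshop_dirs "$BASE"; then
    # List the directories that now exist under the root.
    find "$BASE" -type d | sort
else
    echo "Could not create directories under $BASE" >&2
fi
```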
Copy scripts for exercises
Let’s now copy the scripts for today’s workshop:
cp /work/training/2025/S1W1/session3_metagenomics/scripts/* $HOME/workshop/2025/S1W1/metagenomics/scripts
Check the list of scripts:
ls -l $HOME/workshop/2025/S1W1/metagenomics/scripts
├── create_samplesheet_nfcore_ampliseq.py
├── launch_nfcore_ampliseq_illumina.pbs
├── launch_nfcore_ampliseq_test.pbs
└── samplesheet.tsv
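If you prefer an explicit check over reading the listing, the sketch below reports any of the four files shown above that did not arrive. The names `SCRIPTS_DIR` and `check_scripts` are illustrative; the expected file names are taken from the listing.

```shell
SCRIPTS_DIR="${SCRIPTS_DIR:-$HOME/workshop/2025/S1W1/metagenomics/scripts}"

# Print each expected file that is missing from the given directory.
# Returns non-zero if at least one file is missing.
check_scripts() {
    dir="$1"
    missing=0
    for f in create_samplesheet_nfcore_ampliseq.py \
             launch_nfcore_ampliseq_illumina.pbs \
             launch_nfcore_ampliseq_test.pbs \
             samplesheet.tsv; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

if check_scripts "$SCRIPTS_DIR"; then
    echo "All workshop scripts are in place"
fi
```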
Copy public data
Now let’s copy previously downloaded Illumina amplicon data:
cp /work/training/2025/S1W1/session3_metagenomics/data/illumina/* $HOME/workshop/2025/S1W1/metagenomics/data/illumina
This can take a couple of minutes. To check that you have copied the data you can do the following:
ls $HOME/workshop/2025/S1W1/metagenomics/data/illumina
create_samplesheet_nfcore_ampliseq.py* Illumina24.fastq.gz Illumina39.fastq.gz Illumina53.fastq.gz
Illumina10.fastq.gz Illumina25.fastq.gz Illumina3.fastq.gz Illumina54.fastq.gz
Illumina11.fastq.gz Illumina26.fastq.gz Illumina40.fastq.gz Illumina55.fastq.gz
Illumina12.fastq.gz Illumina27.fastq.gz Illumina41.fastq.gz Illumina56.fastq.gz
Illumina13.fastq.gz Illumina28.fastq.gz Illumina42.fastq.gz Illumina57.fastq.gz
Illumina14.fastq.gz Illumina29.fastq.gz Illumina43.fastq.gz Illumina58.fastq.gz
Illumina15.fastq.gz Illumina2.fastq.gz Illumina44.fastq.gz Illumina59.fastq.gz
Illumina16.fastq.gz Illumina30.fastq.gz Illumina45.fastq.gz Illumina5.fastq.gz
Illumina17.fastq.gz Illumina31.fastq.gz Illumina46.fastq.gz Illumina6.fastq.gz
Illumina18.fastq.gz Illumina32.fastq.gz Illumina47.fastq.gz Illumina7.fastq.gz
Illumina19.fastq.gz Illumina33.fastq.gz Illumina48.fastq.gz Illumina8.fastq.gz
Illumina1.fastq.gz Illumina34.fastq.gz Illumina49.fastq.gz Illumina9.fastq.gz
Illumina20.fastq.gz Illumina35.fastq.gz Illumina4.fastq.gz samplesheet.tsv
Illumina21.fastq.gz Illumina36.fastq.gz Illumina50.fastq.gz
Illumina22.fastq.gz Illumina37.fastq.gz Illumina51.fastq.gz
Illumina23.fastq.gz Illumina38.fastq.gz Illumina52.fastq.gz
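The listing above contains 59 FASTQ files (Illumina1 to Illumina59). Rather than counting by eye, you could count them and test that each archive is readable. This is a minimal sketch: `DATA_DIR`, `count_fastq` and `list_corrupt_fastq` are illustrative names, and `gzip -t` only checks that the compressed file is intact, not that its contents are valid FASTQ.

```shell
DATA_DIR="${DATA_DIR:-$HOME/workshop/2025/S1W1/metagenomics/data/illumina}"

# Count the files named Illumina*.fastq.gz directly inside a directory.
count_fastq() {
    find "$1" -maxdepth 1 -type f -name 'Illumina*.fastq.gz' 2>/dev/null | wc -l | tr -d ' '
}

# Run gzip -t on each archive and print the path of any that fails.
list_corrupt_fastq() {
    for f in "$1"/Illumina*.fastq.gz; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
        gzip -t "$f" 2>/dev/null || echo "$f"
    done
}

n="$(count_fastq "$DATA_DIR")"
echo "Found $n FASTQ files in $DATA_DIR"
list_corrupt_fastq "$DATA_DIR"
```

No output from the last command means every archive passed the integrity test.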
Let’s move to the data folder:
cd $HOME/workshop/2025/S1W1/metagenomics/data
Now we are ready for the next exercise: downloading public data from the European Nucleotide Archive (ENA) using the ENA Browser.