Overview

Initial requirements

To be able to run these exercises, you’ll need:

An HPC account
Nextflow installed on the HPC
PuTTY installed on your local computer (Windows users)
Access to your HPC home directory from your PC

Instructions for getting an HPC account are here: https://qutvirtual4.qut.edu.au/group/staff/research/conducting/facilities/advanced-research-computing-storage/supercomputing/getting-started-with-hpc

If you haven’t installed Nextflow, follow the instructions in this link: 25S1W1 - 1. Getting started with Nextflow
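To confirm that Nextflow is available before starting, a quick check along these lines can be run on the HPC. This is a minimal sketch: `check_tool` is a hypothetical helper name, and it assumes Nextflow is exposed as a `nextflow` command on your PATH.

```shell
# Report whether a command is available on the PATH.
# check_tool is a hypothetical helper, not part of the workshop scripts.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found on PATH"
    fi
}

check_tool nextflow
```

If the command is found, `nextflow -version` prints the installed version.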

You’ll need Terminal (Mac users) or PuTTY (Windows users) on your computer to access the HPC.

You can download PuTTY from here: https://the.earth.li/~sgtatham/putty/latest/w64/putty.exe

Then open PuTTY, enter the HPC (Lyra) address aqua.qut.edu.au in the ‘Host Name’ field, and click ‘Open’.
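Mac users connecting from Terminal do not have a ‘Host Name’ field to fill in. A sketch of the equivalent step, assuming the HPC accepts standard SSH logins with your own username (`your_username` is a placeholder, not a real account):

```shell
# Placeholder: replace with your own HPC username (hypothetical value).
HPC_USER="your_username"
# Host address given in the instructions above.
HPC_HOST="aqua.qut.edu.au"

# Print the command first so you can check it before connecting.
echo "ssh ${HPC_USER}@${HPC_HOST}"

# Uncomment the next line to connect:
# ssh "${HPC_USER}@${HPC_HOST}"
```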

Set up Windows File Explorer to access your HPC home directory. Follow the instructions here:

https://qutvirtual4.qut.edu.au/group/staff/research/conducting/facilities/advanced-research-computing-storage/supercomputing/using-hpc-filesystems

Interactive HPC session

Open Terminal (Mac users) or PuTTY (Windows users) and paste the text below into the command prompt to start an interactive session:

qsub -I -S /bin/bash -l walltime=4:00:00 -l select=1:ncpus=4:mem=8gb

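The interactive session should start in less than a minute. It will allow you to run commands directly on a compute node with the resources requested above (4 CPUs and 8 GB of memory, for up to 4 hours).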
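One way to check that the session has actually started is to look for the `PBS_JOBID` environment variable, which PBS sets for shells it launches. This is a minimal sketch; `pbs_session_status` is a hypothetical helper name.

```shell
# Report whether the current shell is running inside a PBS job.
# pbs_session_status is a hypothetical helper, not a PBS command.
pbs_session_status() {
    if [ -n "${PBS_JOBID:-}" ]; then
        echo "Inside PBS job ${PBS_JOBID} on $(uname -n)"
    else
        echo "Not inside a PBS job (possibly still on the login node)"
    fi
}

pbs_session_status
```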

Create working directories

We’ll be analysing Illumina amplicon data, so first we need to create the workshop directories in your home directory on the HPC. Copy and paste the commands in this section into PuTTY or Terminal.

To make sure you are in your home directory, run the following command:

cd $HOME

Next, let’s create working folders for today's exercises:

mkdir -p $HOME/workshop/2025/S1W1/metagenomics/scripts
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/runs
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/runs/run1_test
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/runs/run2_ampliseq
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/data/illumina
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/data/mydata
cd $HOME/workshop/2025/S1W1/metagenomics
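To confirm that all of the folders were created, a short check such as the following may help. It is only a sketch: `check_workshop_dirs` is a hypothetical helper, and the folder names are taken from the `mkdir` commands above.

```shell
# Check that each expected workshop sub-directory exists under a base directory.
# check_workshop_dirs is a hypothetical helper, not part of the workshop scripts.
check_workshop_dirs() {
    base="$1"
    for d in scripts runs/run1_test runs/run2_ampliseq data/illumina data/mydata; do
        if [ -d "$base/$d" ]; then
            echo "ok: $d"
        else
            echo "missing: $d"
        fi
    done
}

check_workshop_dirs "$HOME/workshop/2025/S1W1/metagenomics"
```

Any line starting with `missing:` points to a `mkdir` command that should be run again.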

Copy scripts for exercises

Let’s now copy the scripts for today’s workshop:

cp /work/training/2025/S1W1/session3_metagenomics/scripts/* $HOME/workshop/2025/S1W1/metagenomics/scripts

Check the list of scripts:

ls -l $HOME/workshop/2025/S1W1/metagenomics/scripts

├── create_samplesheet_nfcore_ampliseq.py
├── launch_nfcore_ampliseq_illumina.pbs
├── launch_nfcore_ampliseq_test.pbs
└── samplesheet.tsv
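Rather than comparing the listing by eye, the four expected files can be checked by name. A sketch, assuming the file names shown above; `check_files` is a hypothetical helper.

```shell
# Check that each named file exists in a directory.
# check_files is a hypothetical helper, not part of the workshop scripts.
check_files() {
    dir="$1"
    shift
    for f in "$@"; do
        if [ -f "$dir/$f" ]; then
            echo "ok: $f"
        else
            echo "missing: $f"
        fi
    done
}

check_files "$HOME/workshop/2025/S1W1/metagenomics/scripts" \
    create_samplesheet_nfcore_ampliseq.py \
    launch_nfcore_ampliseq_illumina.pbs \
    launch_nfcore_ampliseq_test.pbs \
    samplesheet.tsv
```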

Copy public data

Now let’s copy previously downloaded Illumina amplicon data:

cp /work/training/2025/S1W1/session3_metagenomics/data/illumina/* $HOME/workshop/2025/S1W1/metagenomics/data/illumina

This can take a couple of minutes. To check that you have copied the data you can do the following:

ls $HOME/workshop/2025/S1W1/metagenomics/data/illumina

create_samplesheet_nfcore_ampliseq.py*  Illumina24.fastq.gz  Illumina39.fastq.gz  Illumina53.fastq.gz
Illumina10.fastq.gz                     Illumina25.fastq.gz  Illumina3.fastq.gz   Illumina54.fastq.gz
Illumina11.fastq.gz                     Illumina26.fastq.gz  Illumina40.fastq.gz  Illumina55.fastq.gz
Illumina12.fastq.gz                     Illumina27.fastq.gz  Illumina41.fastq.gz  Illumina56.fastq.gz
Illumina13.fastq.gz                     Illumina28.fastq.gz  Illumina42.fastq.gz  Illumina57.fastq.gz
Illumina14.fastq.gz                     Illumina29.fastq.gz  Illumina43.fastq.gz  Illumina58.fastq.gz
Illumina15.fastq.gz                     Illumina2.fastq.gz   Illumina44.fastq.gz  Illumina59.fastq.gz
Illumina16.fastq.gz                     Illumina30.fastq.gz  Illumina45.fastq.gz  Illumina5.fastq.gz
Illumina17.fastq.gz                     Illumina31.fastq.gz  Illumina46.fastq.gz  Illumina6.fastq.gz
Illumina18.fastq.gz                     Illumina32.fastq.gz  Illumina47.fastq.gz  Illumina7.fastq.gz
Illumina19.fastq.gz                     Illumina33.fastq.gz  Illumina48.fastq.gz  Illumina8.fastq.gz
Illumina1.fastq.gz                      Illumina34.fastq.gz  Illumina49.fastq.gz  Illumina9.fastq.gz
Illumina20.fastq.gz                     Illumina35.fastq.gz  Illumina4.fastq.gz   samplesheet.tsv
Illumina21.fastq.gz                     Illumina36.fastq.gz  Illumina50.fastq.gz
Illumina22.fastq.gz                     Illumina37.fastq.gz  Illumina51.fastq.gz
Illumina23.fastq.gz                     Illumina38.fastq.gz  Illumina52.fastq.gz
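The listing above contains 59 FASTQ files (Illumina1 to Illumina59). Counting the copied files is a quick way to spot an incomplete copy. This is a minimal sketch; `count_fastq` is a hypothetical helper.

```shell
# Count the .fastq.gz files directly inside a directory.
# count_fastq is a hypothetical helper, not part of the workshop scripts.
count_fastq() {
    find "$1" -maxdepth 1 -type f -name '*.fastq.gz' 2>/dev/null | wc -l | tr -d ' '
}

n=$(count_fastq "$HOME/workshop/2025/S1W1/metagenomics/data/illumina")
echo "Found $n FASTQ files (the listing above shows 59)"
```

If the count is lower than 59, rerun the `cp` command above.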

Let’s move to the data folder:

cd $HOME/workshop/2025/S1W1/metagenomics/data

Now we are ready for the next exercise: downloading public data from the European Nucleotide Archive (ENA), https://www.ebi.ac.uk/ena/browser/view/PRJEB28612
