25S1W1 - 2. Initial setup

Overview

Initial requirements

To be able to run these exercises, you’ll need:

  • An HPC account

  • Nextflow installed on the HPC

  • PuTTY installed on your local computer (Windows users)

  • Access to your HPC home directory from your PC

 

Instructions for getting an HPC account are here: https://qutvirtual4.qut.edu.au/group/staff/research/conducting/facilities/advanced-research-computing-storage/supercomputing/getting-started-with-hpc

 

If you haven’t installed Nextflow, follow the instructions in this link: 25S1W1 - 1. Getting started with Nextflow
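Before starting, it can help to confirm that Nextflow is reachable from your shell on the HPC. The snippet below is a minimal sketch; the helper name `tool_status` is made up for this example, and it only checks whether a command is on your PATH, not which version is installed.

```shell
# Hypothetical helper: report whether a command can be found on the PATH.
tool_status() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
    fi
}

tool_status nextflow
```

If Nextflow is reported as not found, revisit the installation instructions linked above before continuing.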

 

You’ll need Terminal (Mac users) or PuTTY (Windows users) on your PC to access the HPC.

You can download PuTTY from here: https://the.earth.li/~sgtatham/putty/latest/w64/putty.exe

Then enter the HPC (Lyra) address, aqua.qut.edu.au, and click ‘Open’.

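Mac users do not need PuTTY, because the `ssh` client available in Terminal can connect to the same address. The following is a minimal sketch, assuming your HPC account name goes in the placeholder variable `HPC_USER` (replace `your_username` with your own). It only prints the command so you can check it before running it yourself.

```shell
# Placeholder value: replace your_username with your own HPC account name.
HPC_USER="your_username"
HPC_HOST="aqua.qut.edu.au"   # HPC address given in this section

# Build the connection command and print it for checking.
SSH_CMD="ssh ${HPC_USER}@${HPC_HOST}"
echo "$SSH_CMD"
```

Typing the printed command into Terminal should prompt for your password and then open a session on the HPC.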

 

Set up Windows File Explorer to access your HPC home directory. Follow the instructions here:

https://qutvirtual4.qut.edu.au/group/staff/research/conducting/facilities/advanced-research-computing-storage/supercomputing/using-hpc-filesystems

Interactive HPC session

Open Terminal (Mac users) or PuTTY (Windows users) and paste the text below into the command prompt to start an interactive session:

qsub -I -S /bin/bash -l walltime=4:00:00 -l select=1:ncpus=4:mem=8gb

The interactive session should start in less than a minute. It will allow you to run commands on a compute node with the resources requested above (4 CPUs and 8 GB of memory, for up to 4 hours).
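To confirm that the session has started, you can look for the job ID that PBS sets inside a running job. This is a sketch that assumes the scheduler behind `qsub` is PBS, which exports `PBS_JOBID` inside jobs; the function name `describe_session` is made up for this example.

```shell
# Hypothetical helper: describe the session from a PBS job ID (empty if none).
describe_session() {
    if [ -n "$1" ]; then
        echo "inside PBS job $1"
    else
        echo "not inside a PBS job"
    fi
}

# PBS_JOBID is assumed to be set by the scheduler inside a job; default to empty.
describe_session "${PBS_JOBID:-}"
```

If the output says you are not inside a PBS job, the interactive session has probably not started yet and you are still on the login node.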

Create working directories

We’ll be analysing Illumina amplicon data, so first we need to create the workshop directories in your home directory on the HPC. Run the following commands in PuTTY or Terminal.

First, make sure you are in your home directory:

cd $HOME

Next, let’s create working folders for today's exercises:

mkdir -p $HOME/workshop/2025/S1W1/metagenomics/scripts
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/runs
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/runs/run1_test
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/runs/run2_ampliseq
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/data/illumina
mkdir -p $HOME/workshop/2025/S1W1/metagenomics/data/mydata
cd $HOME/workshop/2025/S1W1/metagenomics
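To check that every working folder was created, a small loop can test each expected path. This is only a sketch: the function name `check_workshop_dirs` is made up for this example, and the folder list is copied from the `mkdir` commands in this section.

```shell
# Hypothetical helper: check that the workshop folders exist under base directory $1.
# Returns 0 if all folders are present and 1 if any are missing.
check_workshop_dirs() {
    cwd_base="$1"
    cwd_missing=0
    for cwd_sub in scripts runs/run1_test runs/run2_ampliseq data/illumina data/mydata; do
        if [ -d "$cwd_base/$cwd_sub" ]; then
            echo "ok       $cwd_sub"
        else
            echo "MISSING  $cwd_sub"
            cwd_missing=1
        fi
    done
    return "$cwd_missing"
}

check_workshop_dirs "$HOME/workshop/2025/S1W1/metagenomics" \
    || echo "Some folders are missing; re-run the mkdir commands."
```

Any line marked MISSING points to a `mkdir` command that did not run as expected.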

Copy scripts for exercises

Let’s now copy the scripts for today’s workshop:

cp /work/training/2025/S1W1/session3_metagenomics/scripts/* $HOME/workshop/2025/S1W1/metagenomics/scripts

Check the list of scripts:

ls -l $HOME/workshop/2025/S1W1/metagenomics/scripts
├── create_samplesheet_nfcore_ampliseq.py
├── launch_nfcore_ampliseq_illumina.pbs
├── launch_nfcore_ampliseq_test.pbs
└── samplesheet.tsv

Copy public data

Now let’s copy previously downloaded Illumina amplicon data:

cp /work/training/2025/S1W1/session3_metagenomics/data/illumina/* $HOME/workshop/2025/S1W1/metagenomics/data/illumina

This can take a couple of minutes. To check that you have copied the data you can do the following:

ls $HOME/workshop/2025/S1W1/metagenomics/data/illumina
create_samplesheet_nfcore_ampliseq.py*  Illumina24.fastq.gz  Illumina39.fastq.gz  Illumina53.fastq.gz
Illumina10.fastq.gz                     Illumina25.fastq.gz  Illumina3.fastq.gz   Illumina54.fastq.gz
Illumina11.fastq.gz                     Illumina26.fastq.gz  Illumina40.fastq.gz  Illumina55.fastq.gz
Illumina12.fastq.gz                     Illumina27.fastq.gz  Illumina41.fastq.gz  Illumina56.fastq.gz
Illumina13.fastq.gz                     Illumina28.fastq.gz  Illumina42.fastq.gz  Illumina57.fastq.gz
Illumina14.fastq.gz                     Illumina29.fastq.gz  Illumina43.fastq.gz  Illumina58.fastq.gz
Illumina15.fastq.gz                     Illumina2.fastq.gz   Illumina44.fastq.gz  Illumina59.fastq.gz
Illumina16.fastq.gz                     Illumina30.fastq.gz  Illumina45.fastq.gz  Illumina5.fastq.gz
Illumina17.fastq.gz                     Illumina31.fastq.gz  Illumina46.fastq.gz  Illumina6.fastq.gz
Illumina18.fastq.gz                     Illumina32.fastq.gz  Illumina47.fastq.gz  Illumina7.fastq.gz
Illumina19.fastq.gz                     Illumina33.fastq.gz  Illumina48.fastq.gz  Illumina8.fastq.gz
Illumina1.fastq.gz                      Illumina34.fastq.gz  Illumina49.fastq.gz  Illumina9.fastq.gz
Illumina20.fastq.gz                     Illumina35.fastq.gz  Illumina4.fastq.gz   samplesheet.tsv
Illumina21.fastq.gz                     Illumina36.fastq.gz  Illumina50.fastq.gz
Illumina22.fastq.gz                     Illumina37.fastq.gz  Illumina51.fastq.gz
Illumina23.fastq.gz                     Illumina38.fastq.gz  Illumina52.fastq.gz
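Rather than comparing the listing by eye, you can count the FASTQ files. The listing above contains 59 files named Illumina1.fastq.gz to Illumina59.fastq.gz, so a complete copy should give the same number. This is a minimal sketch; the function name `count_fastq` is made up for this example.

```shell
# Hypothetical helper: count files named Illumina*.fastq.gz directly inside directory $1.
count_fastq() {
    find "$1" -maxdepth 1 -type f -name 'Illumina*.fastq.gz' | wc -l | tr -d ' '
}

data_dir="$HOME/workshop/2025/S1W1/metagenomics/data/illumina"
if [ -d "$data_dir" ]; then
    echo "FASTQ files found: $(count_fastq "$data_dir") (expected 59)"
else
    echo "Data folder not found: $data_dir"
fi
```

A lower count suggests the copy was interrupted, in which case running the `cp` command again should fill in the missing files.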

Let’s move to the data folder:

cd $HOME/workshop/2025/S1W1/metagenomics/data

Now we are ready for the next exercise: downloading public data from the European Nucleotide Archive (ENA) using the ENA Browser.

Next page

25S1W1 - 3. Download public metagenomics data from ENA
