2024-S2 eResearch - Session 3: Introduction to Nextflow
This instructional material was originally developed by Maely Gauthier in 2024 as part of the QUT eResearch infrastructure. It is free to distribute, but we ask that you acknowledge eResearch in any outputs (e.g. training, presentation slides, publications) that result from using this training material.
Some sections of this course were adapted from the Carpentries Incubator course: https://carpentries-incubator.github.io/workflows-nextflow/.
Aims
Learn what Nextflow is
Install and configure Nextflow
Find pipelines on repositories (e.g. nf-core and epi2me)
Run pipelines using either the command line or a PBS script
Understand input and parameter specifications
Understand the concept of caching and the resume function
Understand how Nextflow pipelines output results
What will be covered during the workshop
1. Getting started with Nextflow
What is Nextflow?
Installing Nextflow
Nextflow’s base configuration
2. Nextflow pipeline repositories
nf-core
What is nf-core?
What are nf-core pipelines?
Searching for available nf-core pipelines
nf-core support
epi2me workflows
3. Running pipelines
Fetching pipeline code
Software requirements for pipelines
Install and test that the pipeline installed successfully
From the command line
Launching Nextflow using a PBS script
4. Input specifications
Samplesheet input
Examples of samplesheets
Exercise 1
Exercise 2
Input folder
5. Parameters
Finding list of parameters available
Exercise 1
Specifying parameters on the command line
6. Nextflow caching
Resume option
Structure of work folder
Task execution directory
Specifying another work directory
Clean the work directory
7. Nextflow pipeline outputs
Results folder
Nextflow log, metrics and reports
8. Where to from now?
Prerequisites
You will need a basic knowledge of Linux/Unix commands to participate effectively in this workshop. We assume that participants have attended the first two workshops, have reviewed the materials from those workshops (if unable to attend) and are comfortable with them, or are already using the HPC.
You can watch some videos that go over the basics: https://mediahub.qut.edu.au/media/t/0_d0bsv333
Initial requirements
To be able to run these exercises, you’ll need:
An HPC account
PuTTY installed on your local computer
Access to your HPC home directory from your PC
Instructions for getting an HPC account are here: https://qutvirtual4.qut.edu.au/group/staff/research/conducting/facilities/advanced-research-computing-storage/supercomputing/getting-started-with-hpc
You’ll need PuTTY on your PC to access the HPC.
You can download PuTTY from here: https://the.earth.li/~sgtatham/putty/latest/w64/putty.exe
Then open PuTTY, enter the HPC (Lyra) address lyra.qut.edu.au as the host name, and click ‘Open’.
Set up Windows File Explorer to access your HPC home directory. Follow the instructions here: