Attached file: Intro_To_NextflowQUT20240929.pptx

This instructional material was originally developed by Maely Gauthier in 2024 as part of the QUT eResearch infrastructure. It is free to distribute, but we require that you acknowledge eResearch in any outputs (e.g. training, presentation slides, publications) that result from using this training material.

Some sections of this course were adapted from the Carpentry course: https://carpentries-incubator.github.io/workflows-nextflow/.

Aims

  • Learn what Nextflow is

  • Install and configure Nextflow

  • Find pipelines on repositories (e.g. nf-core and epi2me)

  • Run pipelines using either the command line or a PBS script

  • Understand input and parameter specifications

  • Understand the concept of caching and the resume function

  • Understand how Nextflow pipelines output results

What will be covered during the workshop

...

  • 1. Getting started with Nextflow

    • What is Nextflow?

    • Installing Nextflow

    • Nextflow’s base configuration

  • 2. Nextflow pipeline repositories

    • nf-core

      • What is nf-core?

      • What are nf-core pipelines?

      • Searching for available nf-core pipelines

      • nf-core support

    • epi2me workflows

  • 3. Running pipelines

    • Fetching pipeline code

    • Software requirements for pipelines

    • Install and test that the pipeline installed successfully

      • From the command line

      • Launching Nextflow using a PBS script

  • 4. Input specifications

    • Samplesheet input

      • Examples of samplesheets

      • Exercise 1

      • Exercise 2

    • Input folder

  • 5. Parameters

    • Finding list of parameters available

      • Exercise 1

    • Specifying parameters on the command line

  • 6. Nextflow caching

    • Resume option

    • Structure of work folder

      • Task execution directory

      • Specifying another work directory

      • Clean the work directory

  • 7. Nextflow pipeline outputs and PBS outputs

    • Results folder

    • Nextflow log, metrics and reports

    • PBS output

  • 8. Where to from now?

Prerequisites

You will need a basic knowledge of Linux/Unix commands to participate effectively in this workshop. If you do not have this knowledge, please attend the following training first: [Introduction to HPC].

7. Nextflow pipeline outputs and PBS outputs

Results folder

The results are written to the folder specified by the outdir parameter in the nextflow.config file. It is generally set to results.

Code Block
// nextflow.config
params {
  outdir = 'results'
}
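
Once a run has finished, it can help to check what was actually written to the output folder. The following is a minimal sketch rather than anything provided by Nextflow: nf_results_summary is a hypothetical helper name, and it assumes the folder is called results (as in the configuration above) unless another path is given.

```shell
# Hypothetical helper (not a Nextflow command): summarise a results folder.
# Usage: nf_results_summary [folder]   (defaults to "results")
nf_results_summary() {
  outdir="${1:-results}"
  if [ ! -d "$outdir" ]; then
    echo "Results folder '$outdir' not found" >&2
    return 1
  fi
  # Disk usage of each top-level subfolder
  du -sh "$outdir"/*/ 2>/dev/null
  # Total number of files anywhere under the folder
  echo "Total files: $(find "$outdir" -type f | wc -l | tr -d ' ')"
}
```

Because outdir is a pipeline parameter, it can also be overridden for a single run with --outdir on the command line, for example --outdir results_test (an illustrative folder name). The helper would then be called as nf_results_summary results_test.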

Nextflow log, metrics and reports

By default, Nextflow creates a log file called .nextflow.log in the directory from which the pipeline is launched. This file is hidden, but you can see it using the command:

Code Block
ls -a

If you rerun the pipeline in the same folder, the previous .nextflow.log will be renamed .nextflow.log.1 and a new .nextflow.log will be generated.
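
When a run fails, the log is usually the first place to look. As a rough sketch, the hypothetical helper below (nf_log_problems is an illustrative name, not a Nextflow command) prints only the lines logged at the WARN or ERROR level. It assumes the log is in the current folder unless a path is given.

```shell
# Hypothetical helper (not a Nextflow command): print WARN and ERROR lines
# from a Nextflow log. Usage: nf_log_problems [logfile]
nf_log_problems() {
  log_file="${1:-.nextflow.log}"
  if [ ! -f "$log_file" ]; then
    echo "No log file found at $log_file" >&2
    return 1
  fi
  # Nextflow log lines carry a level such as DEBUG, INFO, WARN or ERROR
  grep -E 'WARN|ERROR' "$log_file" || echo "No warnings or errors in $log_file"
}
```

Calling nf_log_problems with no argument checks the most recent run, while nf_log_problems .nextflow.log.1 checks the previous one.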

You can change the default location with the -log option, which must be placed before run and followed by the path of the log file:

Code Block
nextflow -log ~/code/nextflow.log run  
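
If you want to keep a separate log for each run instead of relying on the numbered backups, one possible approach is to wrap the command in a small shell function. This is only a sketch: nf_run_logged, the NF_LOG_DIR variable and the logs folder are hypothetical names chosen for illustration, not Nextflow defaults.

```shell
# Hypothetical helper: launch a pipeline with a timestamped log file so that
# logs from successive runs are kept side by side.
# Usage: nf_run_logged <pipeline> [pipeline options...]
nf_run_logged() {
  log_dir="${NF_LOG_DIR:-logs}"   # assumed folder name, not a Nextflow default
  mkdir -p "$log_dir"
  log_file="$log_dir/nextflow_$(date +%Y%m%d_%H%M%S).log"
  echo "Writing Nextflow log to $log_file" >&2
  # -log is a top-level Nextflow option, so it goes before "run"
  nextflow -log "$log_file" run "$@"
}
```

Everything after the function name is passed to nextflow run unchanged, so pipeline parameters such as --outdir are written exactly as they would be on a normal command line.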

PBS output

8. Where to from now?

Nextflow offers free Fundamentals Training: https://training.nextflow.io/basic_training/

...

For this workshop we assume that participants have attended the first 2 workshops, have reviewed the materials provided in those workshops (if unable to attend) and are comfortable with them, or are already using the HPC.

You can watch some videos that go over the basics: https://mediahub.qut.edu.au/media/t/0_d0bsv333

Initial requirements

To be able to run these exercises, you’ll need:

  1. An HPC account

  2. PuTTY installed on your local computer

  3. Access to your HPC home directory from your PC

Instructions for getting an HPC account are here: https://qutvirtual4.qut.edu.au/group/staff/research/conducting/facilities/advanced-research-computing-storage/supercomputing/getting-started-with-hpc

You’ll need PuTTY on your PC to access the HPC.

You can download PuTTY from here: https://the.earth.li/~sgtatham/putty/latest/w64/putty.exe

Open PuTTY, enter the HPC (Lyra) address lyra.qut.edu.au in the Host Name field, and then click ‘Open’.

...

 

Set up Windows File Explorer to access your HPC home directory. Follow the instructions here:

https://qutvirtual4.qut.edu.au/group/staff/research/conducting/facilities/advanced-research-computing-storage/supercomputing/using-hpc-filesystems