
...

  • Nextflow is free and open-source pipeline management software that enables scalable and reproducible scientific workflows. It allows the adaptation of pipelines written in the most common scripting languages.

  • Key features of Nextflow:

    • Reproducible → version control and use of containers ensure the reproducibility of Nextflow pipelines

    • Portable → compute agnostic (e.g., HPC, cloud, desktop)

    • Scalable → run from a single to thousands of samples

    • Minimal digital literacy → accessible to anyone

    • Active global community → more and more Nextflow pipelines are available (e.g., https://nf-co.re/pipelines )

...

To install Nextflow, copy and paste the following block of code into your terminal (e.g., a PuTTY session that is already connected) and press 'enter':

Code Block
module load java
curl -s https://get.nextflow.io | bash
mv nextflow $HOME/bin
  • Line 1: The module load command is necessary to ensure Java is available.

  • Line 2: This command downloads and assembles the parts of Nextflow. This step might take some time.

  • Line 3: When finished, the nextflow binary will be in the current folder, so it should be moved to your "bin" folder so it can be found later.

  • Line 5: Make a temporary folder for Nextflow to create files when it runs.

  • Line 6: Verify Nextflow is working.

  • Lines 7 and 8: Clean up
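
Line 3 of the install block assumes that $HOME/bin already exists and is searched by your shell. If the folder is missing, mv would instead rename the binary to a file called "bin". The following is a minimal sketch of a check you could run beforehand; BIN_DIR and path_contains are illustrative names, not part of the original instructions.

```shell
# Sketch: make sure the target folder exists and is on PATH before moving the binary.
# BIN_DIR is an illustrative variable; it defaults to the folder used in this guide.
BIN_DIR="${BIN_DIR:-$HOME/bin}"

path_contains() {
    # Succeeds when directory $1 is one of the colon-separated entries of $PATH.
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Create the folder if it is missing (does nothing when it already exists).
mkdir -p "$BIN_DIR"

if path_contains "$BIN_DIR"; then
    echo "$BIN_DIR is on PATH"
else
    echo "$BIN_DIR is not on PATH; add it in your shell startup file"
fi
```

If the second message is printed, the nextflow command will not be found until the folder is added to PATH.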

To verify that Nextflow is installed properly, you can run a simple Nextflow pipeline called Hello locally:

...

Code Block
[[ -d $HOME/.nextflow ]] || mkdir -p $HOME/.nextflow
cat <<EOF > $HOME/.nextflow/config
singularity {
    cacheDir = '$HOME/.nextflow/NXF_SINGULARITY_CACHEDIR'
    autoMounts = true
}
conda {
    cacheDir = '$HOME/.nextflow/NXF_CONDA_CACHEDIR'
}
process {
  executor = 'pbspro'
  scratch = false
  cleanup = false
}
EOF
  • Line 1: Check whether the .nextflow folder already exists in your home directory and create it if it does not.

  • Lines 2-15: Using the cat command, write the text between the two EOF markers to the .nextflow/config file, replacing any existing content. This text specifies the cache locations for Singularity images and Conda environments.

  • What are the parameters you are setting?

  • Lines 3-6 set the directory where remote Singularity images are stored and direct Nextflow to automatically mount host paths in the executed container.

  • Lines 7-9 set the directory where Conda environments are stored.

  • Lines 10-14 set default directives for processes in your pipeline. Note that the executor is set to pbspro on line 11.
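
One detail of the block above is easy to miss: because the EOF marker after cat is not quoted, the shell replaces $HOME with your actual home directory before the text is written. The sketch below shows this on a throwaway file, so your real config is not touched; demo_config is an illustrative name only.

```shell
# Sketch: write a small config fragment to a temporary file to observe heredoc expansion.
demo_config="$(mktemp)"

# EOF is unquoted, so the shell expands $HOME while writing the file.
cat <<EOF > "$demo_config"
singularity {
    cacheDir = '$HOME/.nextflow/NXF_SINGULARITY_CACHEDIR'
}
EOF

# The printed file contains an absolute path instead of the literal text $HOME.
cat "$demo_config"
```

This matters because the path sits inside single quotes in the config file, so it must already be an absolute path by the time Nextflow reads it.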

More in-depth information on Nextflow configuration is available here: https://www.nextflow.io/docs/latest/config.html.

...

The pull command allows you to download the latest version of a project from a GitHub repository, or to update it if that repository has previously been downloaded to your home directory.

Code Block
nextflow pull nf-core/<pipeline>

Please do not run the command below, but note that Nextflow also fetches the pipeline code automatically the first time you run it:

...

Downloaded pipeline projects are stored in the folder $HOME/.nextflow/assets on your computer.
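
To see what has already been downloaded, you can inspect that folder. The sketch below assumes the projects are laid out as one folder per organisation with one subfolder per pipeline (for example nf-core/smrnaseq); list_pipelines is a hypothetical helper name, not a Nextflow command.

```shell
# Sketch: print "<organisation>/<pipeline>" for each downloaded project.
# Assumes a two-level layout under the assets folder, e.g. assets/nf-core/smrnaseq.
list_pipelines() {
    assets_dir="$1"
    # Nothing to list if the folder has not been created yet.
    [ -d "$assets_dir" ] || return 0
    for project in "$assets_dir"/*/*/; do
        # Skip the unexpanded pattern when no project folders exist.
        [ -d "$project" ] || continue
        project="${project%/}"
        org="${project%/*}"
        echo "${org##*/}/${project##*/}"
    done
}

list_pipelines "$HOME/.nextflow/assets"
```

Nextflow also provides a built-in nextflow list command that reports the downloaded projects.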

Software requirements for pipelines

Software dependencies for Nextflow pipelines are provided through Docker, Singularity or Conda. Nextflow itself downloads the containers and creates the Conda environments. You choose which of the three to use with the -profile {docker,singularity,conda} parameter when you run Nextflow.

At QUT, we use Singularity, so we would specify: -profile singularity.
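
Before launching a pipeline with this profile, it can help to confirm that the tools mentioned in this guide are visible in your session. This is a minimal sketch; have_cmd is a hypothetical helper, and on some systems a module may need to be loaded first (as with Java above) before a tool is found.

```shell
# Sketch: report whether each tool used in this guide can be found on PATH.
have_cmd() {
    # Succeeds when the command named in $1 is available in the current shell.
    command -v "$1" >/dev/null 2>&1
}

for tool in java singularity nextflow; do
    if have_cmd "$tool"; then
        echo "found: $tool"
    else
        echo "missing: $tool"
    fi
done
```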

...

Code Block
cd
nextflow run nf-core/smrnaseq -profile test,singularity --outdir results -r 2.1.0

This will download the smrnaseq pipeline and then run the test code. It should take ~20-30 minutes to run to completion.

It will first display the version of the pipeline that was downloaded: version 2.1.0

It will then list all the parameters that differ from the pipeline default.

...

Before running a process, it will download the required Singularity image.

When you run the Nextflow pipeline on the command line, the progress of the analysis is displayed in real time.

...