Aims:

  • Install Nextflow in your HPC environment

  • Run a test Nextflow task to verify the installation process

Background

What is Nextflow?

  • Nextflow is free and open-source workflow management software that enables scalable and reproducible scientific workflows. It lets you adapt pipelines written in the most common scripting languages.

  • Key features of Nextflow:

    • Reproducible → version control and the use of containers ensure the reproducibility of Nextflow pipelines

    • Portable → compute agnostic (e.g., HPC, cloud, desktop)

    • Scalable → run from a single sample to thousands of samples

    • Minimal digital literacy required → accessible to anyone

    • Active global community → more and more Nextflow pipelines are available (e.g., https://nf-co.re/pipelines)

...

Nextflow can be installed with a few commands. Copy and paste the following block of code into your terminal (e.g., a PuTTY session that is already connected to the HPC) and press 'Enter'.

Code Block
module load java
curl -s https://get.nextflow.io | bash
mkdir -p "$HOME/bin" && mv nextflow "$HOME/bin/"
#verify Nextflow is installed
mkdir $HOME/nftemp && cd $HOME/nftemp
nextflow run hello
#check for output of running the short nextflow hello pipeline
cd $HOME
rm -rf nftemp
  • Line 1: The module load command is necessary to ensure Java is available.

  • Line 2: This command downloads and assembles the parts of Nextflow. This step might take some time.

  • Line 3: When finished, the nextflow executable will be in the current folder, so it should be moved to your “bin” folder ($HOME/bin) so it can be found later.

  • Line 5: Make a temporary folder where Nextflow can create files when it runs.

  • Line 6: Verify Nextflow is working by running the short "hello" pipeline.

  • Lines 8 and 9: Clean up.
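If the `nextflow` command is not found in a later session, a likely cause is that `$HOME/bin` is not on your `PATH`. The following is a minimal sketch of a check, assuming a bash shell; `path_contains` is a hypothetical helper written for this example and is not part of Nextflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if directory $1 appears in the
# colon-separated list $2 (e.g. the value of PATH).
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if path_contains "$HOME/bin" "$PATH"; then
    echo "$HOME/bin is on PATH"
else
    echo "$HOME/bin is not on PATH; consider adding this line to ~/.bashrc:"
    echo 'export PATH="$HOME/bin:$PATH"'
fi

# Only ask for the version if the launcher can actually be found.
if command -v nextflow >/dev/null 2>&1; then
    nextflow -version
fi
```

Whether `~/.bashrc` is the right start-up file depends on how your HPC login shell is configured, so treat that suggestion as an assumption.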

You should see something like this:

...

Code Block
[[ -d $HOME/.nextflow ]] || mkdir -p $HOME/.nextflow
cat <<EOF > $HOME/.nextflow/config
singularity {
    cacheDir = '$HOME/.nextflow/NXF_SINGULARITY_CACHEDIR'
    autoMounts = true
    enabled = true
    runOptions = '-B /data1'
}

conda {
    cacheDir = '$HOME/.nextflow/NXF_CONDA_CACHEDIR'
}

process {
    executor = 'pbspro'
    scratch = true
    cleanup = false
}

includeConfig '/work/datasets/reference/nextflow/qutgenome.config' 

plugins {
    id 'nf-validation@1.1.1' // Validation of pipeline parameters and creation of an input channel from a samplesheet
    id 'nf-prov@1.2.1'       // Provenance reports for pipeline runs
}
EOF
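
Because the `EOF` delimiter in the block above is unquoted, the shell expands `$HOME` while it writes the file, so the saved config contains an absolute path rather than the text `$HOME`. The sketch below illustrates the difference using a temporary folder; the file names and the `example_cache` path are made up for this example.

```shell
#!/usr/bin/env bash
# Work in a throwaway folder so the real ~/.nextflow/config is untouched.
tmpdir=$(mktemp -d)

# Unquoted delimiter: the shell substitutes $HOME before the text is written.
cat <<EOF > "$tmpdir/expanded.config"
cacheDir = '$HOME/example_cache'
EOF

# Quoted delimiter: the text is written verbatim, leaving a literal $HOME.
cat <<'EOF' > "$tmpdir/literal.config"
cacheDir = '$HOME/example_cache'
EOF

# Show both results side by side.
cat "$tmpdir/expanded.config" "$tmpdir/literal.config"
```

With the quoted form, the single-quoted config value would keep the literal `$HOME`, which Nextflow would likely treat as a plain string, so the unquoted form is the one that fits here. Remove the temporary folder with `rm -rf "$tmpdir"` when you are done.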

Check that the folders were created:

...