...
For more information about Nextflow, please visit Nextflow - A DSL for parallel and scalable computational pipelines
Installing Nextflow
Nextflow is meant to run from your home folder on a Linux machine like the HPC.
...
Line 1: The module load command is necessary to ensure Java is available.
Line 2: This command downloads and assembles the parts of Nextflow. This step might take some time.
Line 3: When finished, the nextflow binary will be in the current folder, so it should be moved to your "bin" folder so it can be found later.
Line 5: Make a temporary folder for Nextflow to create files in when it runs.
Line 6: Verify Nextflow is working.
Lines 7 and 8: Clean up.
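If the verification step fails, the usual cause is that Java or the nextflow binary does not resolve on your PATH. The following is a minimal sketch of a check for that, assuming your "bin" folder is `$HOME/bin`; the helper names `check_on_path` and `path_contains` are made up for this example.

```shell
# Report whether a command resolves on PATH; returns non-zero if it does not.
check_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        printf 'ok: %s -> %s\n' "$1" "$(command -v "$1")"
    else
        printf 'missing: %s is not on PATH\n' "$1" >&2
        return 1
    fi
}

# Return 0 if the given directory is one of the PATH entries.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (run on the HPC after installing; $HOME/bin is an assumption):
# path_contains "$HOME/bin" || echo "add $HOME/bin to PATH"
# check_on_path java && check_on_path nextflow && nextflow -version
```

If `path_contains "$HOME/bin"` fails, adding that folder to PATH in your shell profile is typically enough for `nextflow` to be found in later sessions.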
Nextflow’s Default Configuration
Once Nextflow is installed, there are some settings that should be applied to take advantage of the HPC environment. Nextflow has a hierarchy of configuration files, where the base configuration that is applied to every workflow you run is here:
...
    [[ -d $HOME/.nextflow ]] || mkdir -p $HOME/.nextflow
    cat <<EOF > $HOME/.nextflow/config
    singularity {
        cacheDir = '$HOME/.nextflow/NXF_SINGULARITY_CACHEDIR'
        autoMounts = true
    }
    conda {
        cacheDir = '$HOME/.nextflow/NXF_CONDA_CACHEDIR'
    }
    process {
        executor = 'pbspro'
        scratch = false
        cleanup = false
    }
    EOF
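To illustrate the configuration hierarchy: a `nextflow.config` file in the folder you launch from is read with higher priority than `$HOME/.nextflow/config`, so it can override the base settings for a single run. The sketch below writes such a file; the folder name `example_run` and the `cpus` and `memory` values are placeholders chosen for this example, not recommended settings.

```shell
# Hypothetical run folder; adjust to wherever you launch the pipeline from.
RUN_DIR="${RUN_DIR:-$PWD/example_run}"
mkdir -p "$RUN_DIR"

# A nextflow.config in the launch folder takes precedence over
# $HOME/.nextflow/config, so values set here apply to this run only.
# The heredoc delimiter is quoted so nothing inside is expanded by the shell.
cat > "$RUN_DIR/nextflow.config" <<'EOF'
process {
    cpus   = 2        // illustrative value
    memory = '4 GB'   // illustrative value
}
EOF
```

Running `nextflow config` from inside that folder prints the merged configuration, which is a convenient way to confirm which values won.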
Preparing Data
Most pipelines expect their input data to be prepared in a particular way. Check the pipeline's help output or website for a guide. Accessing help is typically:
...
Some pipelines may need file names, and others may want a CSV file with file names, paths to raw data, and other information.
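As an illustration of the CSV case, the sketch below builds a sample sheet from paired-end FASTQ files. The file naming scheme (`<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz`), the column names, and the function name `make_samplesheet` are all assumptions for this example; the pipeline's own documentation defines the columns it actually expects.

```shell
# Write a CSV sample sheet for paired-end FASTQ files found in a directory.
# Assumes files are named <sample>_R1.fastq.gz / <sample>_R2.fastq.gz
# (a hypothetical naming scheme); the header names are also illustrative.
make_samplesheet() {
    data_dir=$1
    out_csv=$2
    printf 'sample,fastq_1,fastq_2\n' > "$out_csv"
    for r1 in "$data_dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue            # no matches: the glob stays literal
        r2=${r1%_R1.fastq.gz}_R2.fastq.gz   # derive the mate file name
        [ -e "$r2" ] || { printf 'skipping %s: no R2 mate\n' "$r1" >&2; continue; }
        sample=$(basename "$r1" _R1.fastq.gz)
        printf '%s,%s,%s\n' "$sample" "$r1" "$r2" >> "$out_csv"
    done
}

# Example: make_samplesheet "$HOME/raw_data" samplesheet.csv
```

Passing an absolute path for the data folder keeps the paths in the CSV valid regardless of which folder the pipeline is launched from.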
Running Nextflow
When you run Nextflow, it is a good idea to create a folder for the run - this keeps all the files separate and easy to manage.
...
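The folder-per-run idea can be sketched as a small helper that creates a dated folder and writes a PBS job script into it. This is only an illustration under stated assumptions: the function name `new_run`, the base folder `$HOME/nextflow_runs`, the `PIPELINE` placeholder, the resource requests, and the exact `module load java` module name are not taken from this guide and will need adjusting for your site.

```shell
# Create a dated folder for one run and write a PBS job script into it.
# PIPELINE and the resource requests below are placeholders, not site defaults.
new_run() {
    base_dir=${1:-$HOME/nextflow_runs}
    pipeline=${2:-PIPELINE}
    run_dir="$base_dir/run_$(date +%Y%m%d_%H%M%S)"
    mkdir -p "$run_dir" || return 1
    # $pipeline is expanded now; \$PBS_O_WORKDIR is left for the job to expand.
    cat > "$run_dir/launch.pbs" <<EOF
#!/bin/bash
#PBS -N nextflow_head
#PBS -l select=1:ncpus=1:mem=4gb
#PBS -l walltime=24:00:00
cd "\$PBS_O_WORKDIR"
module load java
nextflow run $pipeline -resume
EOF
    printf '%s\n' "$run_dir"
}

# Example:
# run_dir=$(new_run "$HOME/nextflow_runs" PIPELINE) && cd "$run_dir" && qsub launch.pbs
```

Because the job changes into the folder it was submitted from, the work directory, logs, and reports for each run stay together in that run's folder.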
To see the output of Nextflow while running as a job, you can use the Nextflow Tower.
Using the Nextflow Tower
Nextflow Tower allows monitoring of Nextflow runs. To use NFTower, please visit
...
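Once you have an access token, Nextflow reads it from the `TOWER_ACCESS_TOKEN` environment variable, and the `-with-tower` option enables reporting for a run. The sketch below adds that option only when a token is set, so the same launch command works with or without Tower; the helper name `tower_flag` and the `PIPELINE` placeholder are assumptions for this example.

```shell
# Print "-with-tower" only when an access token is present in the environment.
tower_flag() {
    if [ -n "${TOWER_ACCESS_TOKEN:-}" ]; then
        printf '%s\n' '-with-tower'
    fi
}

# Example (PIPELINE is a placeholder):
# export TOWER_ACCESS_TOKEN=<your token>
# nextflow run PIPELINE $(tower_flag)
```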