Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Overview

Nextflow is a pipeline engine that can take advantage of the batch nature of the HPC environment to efficiently and quickly run Bioinformatic workflows.

For more information about Nextflow, please visit Nextflow - A DSL for parallel and scalable computational pipelines

The ampliseq pipeline is a bioinformatics analysis pipeline used for 16S rRNA amplicon sequencing data.

https://nf-co.re/ampliseq

https://github.com/nf-core/ampliseq

  1. Install NextFlow locally

  2. PBS session: qsub -I -S /bin/bash -l walltime=168:00:00 -l select=1:ncpus=4:mem=8gb

  3. tmux

  4. module load java

  5. Test NextFlow: nextflow run nf-core/ampliseq -profile test,singularity --metadata "Metadata.tsv"

params {
classifier = "GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-gg_13_8-85-qiime2_2019.7-classifier.qza"
metadata = "Metadata.tsv"
readPaths = [
['1_S103', ['1_S103_L001_R1_001.fastq.gz', '1_S103_L001_R2_001.fastq.gz']],
['1a_S103', ['1a_S103_L001_R1_001.fastq.gz', '1a_S103_L001_R2_001.fastq.gz']],
['2_S115', ['2_S115_L001_R1_001.fastq.gz', '2_S115_L001_R2_001.fastq.gz']],
['2a_S115', ['2a_S115_L001_R1_001.fastq.gz', '2a_S115_L001_R2_001.fastq.gz']]
]
}

6. Run Nextflow tower

7.

8 Stepping past test dataset to Mahsa’s data:

Q for Craig:

Do we need to add any of this to .nextflow/config file? Perhaps just for Tower?

process {
executor = 'pbspro'
scratch = 'true'
beforeScript = {
"""
mkdir -p /data1/whatmorp/singularity/mnt/session
source $HOME/.bashrc
source $HOME/.profile
"""
}
}

singularity {
cacheDir = '/home/whatmorp/NXF_SINGULARITY_CACHEDIR'
autoMounts = true
}
conda {
cacheDir = '/home/whatmorp/NXF_CONDA_CACHEDIR'
}

tower {
accessToken = 'c7f8cc62b24155c0150a6ce4b6db15946bfc19ef'
endpoint = 'https://nftower.qut.edu.au/api'
enabled = true
}

Installing NextFlow

Follow the eResearch wiki entry for installing NextFlow:

...

If you haven’t been set up or have used the HPC previously, click on this link for information on how to get access to and use the HPC:

Need a link here for HPC access and usage 

Creating a shared workspace on the HPC

...

To request a node using PBS, submit a shell script containing your RAM/CPU/analysis time requirements and the code needed to run your analysis. For an overview of submitting a PBS job, see here:

Need a link here for creating PBS jobs

Alternatively, you can start up an ‘interactive’ node, using the following:

...