Overview
Nextflow is a pipeline engine that can take advantage of the batch nature of the HPC environment to run bioinformatics workflows quickly and efficiently.
...
8. On to testing Mahsa’s data…
Downloaded the SILVA database from: https://www.arb-silva.de/fileadmin/silva_databases/qiime/Silva_132_release.zip
nextflow.config params:
params {
    max_cpus           = 32
    max_memory         = 512.GB
    max_time           = 48.h
    input              = "/work/rumen_16S/nextflow/16S/fastq"
    extension          = "/*.fastq.gz"
    FW_primer          = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG"
    RV_primer          = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC"
    metadata           = "metadata.txt"
    reference_database = "/home/whatmorp/Annette/16S/Silva_132_release.zip"
    retain_untrimmed   = true
}
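Nextflow automatically loads a nextflow.config found in the launch directory, so none of these params need to be repeated on the command line. Equivalently, any of them can be overridden at run time with double-dash flags (a sketch only; the values shown are the ones from the params block above):

```shell
# Command-line params take precedence over nextflow.config
nextflow run nf-core/ampliseq -profile singularity \
    --metadata metadata.txt \
    --max_cpus 32 \
    --retain_untrimmed
```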
reference_database points to the place the SILVA database was originally downloaded to.
input points to the directory containing Mahsa’s fastq files.
The metadata.txt and manifest.txt files have already been created (need to add details here).
Running with:
nextflow run nf-core/ampliseq -profile singularity --manifest manifest.txt
Q for Craig:
Do we need to add any of this to .nextflow/config file? Perhaps just for Tower?
process {
    executor = 'pbspro'
    scratch = true
    // triple single quotes avoid Groovy interpolation of $HOME
    beforeScript = '''
        mkdir -p /data1/whatmorp/singularity/mnt/session
        source $HOME/.bashrc
        source $HOME/.profile
    '''
}
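If the aim is only Tower integration, Nextflow configures that in its own tower scope rather than the process scope. A hedged sketch of what that could look like in ~/.nextflow/config (the access token value is a placeholder, not a real token):

```groovy
tower {
    enabled     = true
    accessToken = 'PASTE_YOUR_TOKEN_HERE'   // placeholder; generate one at tower.nf
}
```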
...
If you haven’t been set up on the HPC, or haven’t used it previously, click on this link for information on how to get access to and use the HPC:
Need a link here for HPC access and usage
Creating a shared workspace on the HPC
...
To request a node using PBS, submit a shell script containing your RAM/CPU/walltime requirements and the commands needed to run your analysis. For an overview of submitting a PBS job, see here:
Need a link here for creating PBS jobs
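As a rough sketch of the shape such a script takes (the job name, resource values, and walltime below are illustrative assumptions, not site-specific requirements):

```shell
#!/bin/bash
#PBS -N my_analysis                  # job name (illustrative)
#PBS -l select=1:ncpus=8:mem=64gb   # CPUs and RAM requested
#PBS -l walltime=12:00:00           # maximum analysis time

# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

# ...commands needed to run your analysis go here...
```

Submit the script with `qsub myjob.sh` and check its status with `qstat`.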
Alternatively, you can start up an ‘interactive’ node using the following:
...