...
The nf-core/bactmap workflow requires Nextflow to be installed in your account on the HPC. See here for details on how to install and test Nextflow, prepare a nextflow.config file, and run a PBS Pro submission script for Nextflow pipelines.
...
Usage: https://nf-co.re/bactmap/1.0.0/usage
...
Interactive session on the HPC
Code Block |
---|
qsub -I -S /bin/bash -l walltime=10:00:00 -l select=1:ncpus=2:mem=4gb |
SRA TOOLKIT
Use a Singularity container to fetch public data on the HPC:
1. One file at a time:
Code Block |
---|
singularity run docker://ncbi/sra-tools:latest prefetch SRR1198667 |
Code Block |
---|
singularity run docker://ncbi/sra-tools:latest fastq-dump -X 1000000 -I --split-files SRR1198667 |
2. Use a list of accessions:
Code Block |
---|
singularity run docker://ncbi/sra-tools:latest prefetch --option-file SraAccList.txt |
Code Block |
---|
singularity run docker://ncbi/sra-tools:latest fastq-dump -X 1000000 -I --split-files SRR1198667 |
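The fastq-dump command above names a single accession, so after prefetching a list it has to be repeated for each entry. A minimal sketch of that loop follows, assuming SraAccList.txt holds one accession per line. The function name `dump_accessions` and the `DRY_RUN` switch are hypothetical helpers, not part of SRA Toolkit. By default the function only prints the commands so they can be checked before anything is run.

```shell
# Hypothetical helper: run fastq-dump once per accession in a list file.
# DRY_RUN defaults to 1, which only prints each command; set DRY_RUN=0 to execute.
dump_accessions() {
    list_file="$1"
    image="docker://ncbi/sra-tools:latest"
    while IFS= read -r acc || [ -n "$acc" ]; do
        # Drop Windows carriage returns and skip blank lines.
        acc=$(printf '%s' "$acc" | tr -d '\r')
        if [ -z "$acc" ]; then
            continue
        fi
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "singularity run $image fastq-dump -X 1000000 -I --split-files $acc"
        else
            # Redirect stdin so the container cannot consume the accession list.
            singularity run "$image" fastq-dump -X 1000000 -I --split-files "$acc" < /dev/null
        fi
    done < "$list_file"
}
```

Calling `dump_accessions SraAccList.txt` prints one command per accession; `DRY_RUN=0 dump_accessions SraAccList.txt` runs them.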
3.a Compress the FASTQ files:
Code Block |
---|
gzip -c filename.fastq > filename.fastq.gz |
3.b Alternatively, run a loop to compress all FASTQ files in the folder:
Code Block |
---|
for file in *.fastq; do echo "$file"; gzip -c "$file" > "${file}.gz"; done |
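Because `gzip -c` writes to a new file and leaves the original in place, the folder ends up holding both copies. The sketch below shows one way to check each archive with `gzip -t` and remove the uncompressed file only when the check passes. The function name `compress_and_verify` is a hypothetical helper, and deleting the originals is an assumption about what you want, so skip the `rm` line if you prefer to keep them.

```shell
# Hypothetical helper: compress every .fastq file in a directory,
# test each archive, and delete the original only if the test passes.
compress_and_verify() {
    dir="$1"
    for file in "$dir"/*.fastq; do
        # If nothing matches, the glob stays literal, so skip it.
        [ -e "$file" ] || continue
        gzip -c "$file" > "${file}.gz"
        if gzip -t "${file}.gz"; then
            rm "$file"
        else
            echo "integrity check failed: ${file}.gz" >&2
            return 1
        fi
    done
}
```

For example, `compress_and_verify .` processes the current folder.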
Getting Started
Download and run the workflow using the minimal test dataset provided by nf-core/bactmap. We recommend using singularity as the profile on QUT’s HPC. Note: the ‘docker’ profile option is not available on the HPC.
...
Note: at this time, the test profile will fail to run
Running the test - create a 'launch.pbs' script:
Code Block |
---|
#!/bin/bash -l |
#PBS -N nf-bactmap |
#PBS -l walltime=24:00:00 |
#PBS -l select=1:ncpus=1:mem=5gb |
cd $PBS_O_WORKDIR |
export NXF_OPTS='-Xms1g -Xmx4g' |
module load java |
#run test for bactmap |
nextflow run nf-core/bactmap -profile test,singularity |
submit the job:
Code Block |
---|
qsub launch.pbs |
check the job:
Code Block |
---|
qjobs |
Running the pipeline using custom data
...
When specifying the path to the data files, it is more portable to use absolute paths rather than relative paths.
Check whether hidden characters, such as Windows carriage returns (shown as ^M at the end of a line), were added to your samplesheet.csv file:
Code Block |
---|
cat -A samplesheet.csv |
Creating the samplesheet.csv file in Excel can add hidden characters such as Windows line endings. Run the following command to remove them:
Code Block |
---|
dos2unix samplesheet.csv |
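The two checks described in this section (absolute paths and hidden characters) can be combined into one pass over the samplesheet. The sketch below assumes the column layout `sample,fastq_1,fastq_2` with a header row, and that `fastq_2` may be empty for single-end samples. The function name `check_samplesheet` is a hypothetical helper, not part of the pipeline.

```shell
# Hypothetical helper: report Windows line endings and non-absolute FASTQ paths.
# Assumes columns sample,fastq_1,fastq_2 with a header on the first line.
check_samplesheet() {
    sheet="$1"
    status=0
    # Carriage returns are what `cat -A` displays as ^M at the end of a line.
    if grep -q "$(printf '\r')" "$sheet"; then
        echo "carriage returns found: run dos2unix $sheet"
        status=1
    fi
    # Skip the header, strip carriage returns, then list samples whose
    # fastq_1 (or non-empty fastq_2) does not start with "/".
    bad=$(tail -n +2 "$sheet" | tr -d '\r' |
        awk -F',' 'NF && (($2 !~ /^\//) || ($3 != "" && $3 !~ /^\//)) { print $1 }')
    if [ -n "$bad" ]; then
        echo "relative or missing paths for samples:" $bad
        status=1
    fi
    return "$status"
}
```

Running `check_samplesheet samplesheet.csv` prints nothing and returns 0 when neither problem is found.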
Preparing to run on the HPC
...
Code Block |
---|
#!/bin/bash -l |
#PBS -N bactmap02 |
#PBS -l select=1:ncpus=2:mem=4gb |
#PBS -l walltime=24:00:00 |
cd $PBS_O_WORKDIR |
module load java |
export NXF_OPTS='-Xms1g -Xmx4g' |
# Options used below (a comment after a trailing backslash would break the command): |
# --trim: trim reads |
# --remove_recombination: remove recombination using Gubbins |
# --rapidnj: build a RapidNJ tree |
# --fasttree: build a FastTree tree |
# --iqtree: build an IQ-TREE tree |
# --raxmlng: build a RAxML-NG tree |
nextflow run nf-core/bactmap \ |
--input samplesheet.csv \ |
--reference chromosome.fasta \ |
-profile singularity \ |
--trim \ |
--remove_recombination \ |
--rapidnj \ |
--fasttree \ |
--iqtree \ |
--raxmlng |
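A job that starts after a long queue wait and then fails on a missing input wastes the wait. The sketch below checks, before submission, that the files named in the script exist and are not empty. The function name `preflight` is a hypothetical helper, and the file names in the usage comment are the ones used in the script above.

```shell
# Hypothetical helper: confirm each file given as an argument exists and is not empty.
preflight() {
    missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage sketch: submit only when both inputs are present.
# preflight samplesheet.csv chromosome.fasta && qsub launch.pbs
```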
...