Table of Contents
...
You should see that several output directories and files have been created in your ‘ampliseq_test’ directory. These contain the test analysis results. Have a look through these, as they are similar to the output from a full ampliseq run (i.e. one run on your own dataset).
Need instructions on setting up Nextflow Tower
Q for Craig:
Do we need to add any of this to the .nextflow/config file? Perhaps just for Tower?
process {
    executor = 'pbspro'
    scratch = true
    beforeScript = """
        mkdir -p /data1/whatmorp/singularity/mnt/session
        source \$HOME/.bashrc
        source \$HOME/.profile
        """
}
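For Tower specifically, Nextflow normally only needs a `tower` scope in nextflow.config (or the `TOWER_ACCESS_TOKEN` environment variable) plus the `-with-tower` flag at launch. A minimal sketch, assuming you have generated an access token from your Tower account (the token value below is a placeholder, not a real one):

// Sketch of a Tower section for nextflow.config
// (the accessToken value is a placeholder - generate your own in Tower)
tower {
    enabled     = true
    accessToken = 'YOUR_TOWER_ACCESS_TOKEN'
}

Alternatively, export the token in your shell (`export TOWER_ACCESS_TOKEN=...`) and launch with `nextflow run ... -with-tower`.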
...
Code Block:
nextflow run nf-core/ampliseq -profile singularity --manifest manifest.txt
Problem: the classifier training step takes a very long time and often fails.
Solution: run the classifier training manually with QIIME 2, then point to the output file in nextflow.config
https://docs.qiime2.org/2020.11/tutorials/feature-classifier/
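One way to point the pipeline at a pre-trained classifier is via an ampliseq classifier parameter set in nextflow.config. A sketch, assuming the parameter name matches your ampliseq version (check the pipeline docs) and with a placeholder path for wherever you saved the trained .qza:

// Sketch for nextflow.config: supply a pre-trained classifier to
// nf-core/ampliseq (path is a placeholder; parameter name should be
// checked against your ampliseq version's documentation)
params {
    classifier = "/path/to/my-trained-classifier.qza"
}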
This is the Nextflow script for the job:
Code Block:
export HOME="${PWD}/HOME"
unzip -qq Silva_132_release.zip
fasta="SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna"
taxonomy="SILVA_132_QIIME_release/taxonomy/16S_only/99/consensus_taxonomy_7_levels.txt"
# (templated condition generated by the pipeline; "false" here means hash removal was disabled)
if [ "false" = "true" ]; then
    sed 's/#//g' $taxonomy >taxonomy-99_removeHash.txt
    taxonomy="taxonomy-99_removeHash.txt"
    echo "
######## WARNING! The taxonomy file was altered by removing all hash signs!"
fi
### Import
qiime tools import --type 'FeatureData[Sequence]' --input-path $fasta --output-path ref-seq-99.qza
qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat --input-path $taxonomy --output-path ref-taxonomy-99.qza
#Extract sequences based on primers
qiime feature-classifier extract-reads --i-sequences ref-seq-99.qza --p-f-primer TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG --p-r-primer GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC --o-reads TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-99-ref-seq.qza --quiet
#Train classifier
qiime feature-classifier fit-classifier-naive-bayes --i-reference-reads TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-99-ref-seq.qza --i-reference-taxonomy ref-taxonomy-99.qza --o-classifier TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-99-classifier.qza --quiet
The input fasta and taxonomy files that the above script points to are contained in the reference database archive, Silva_132_release.zip
So to generate the classifier file manually:
Code Block:
# Install QIIME2 using conda (only need to do this once per user)
# https://docs.qiime2.org/2019.10/install/native/
wget https://data.qiime2.org/distro/core/qiime2-2019.10-py36-linux-conda.yml
conda env create -n qiime2-2019.10 --file qiime2-2019.10-py36-linux-conda.yml
conda activate qiime2-2019.10
# You can test the QIIME2 installation by running: qiime --help
# Make sure you are in an interactive PBS session with sufficient RAM (lots of RAM needed)
qsub -I -S /bin/bash -l walltime=72:00:00 -l select=1:ncpus=16:mem=256gb
# Unzip Silva_132_release.zip to access the required databases
unzip Silva_132_release.zip
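With QIIME 2 installed and the archive unzipped, the same commands the pipeline would have run can be run by hand. A sketch, reusing the file paths and primers from the Nextflow script above (substitute your own primers and output filenames if they differ; filenames like extracted-ref-seq.qza are placeholders):

```shell
# Paths inside the unzipped Silva_132_release archive (as in the script above)
fasta="SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna"
taxonomy="SILVA_132_QIIME_release/taxonomy/16S_only/99/consensus_taxonomy_7_levels.txt"

# Import the reference sequences and taxonomy into QIIME 2 artifacts
qiime tools import --type 'FeatureData[Sequence]' \
    --input-path $fasta --output-path ref-seq-99.qza
qiime tools import --type 'FeatureData[Taxonomy]' \
    --input-format HeaderlessTSVTaxonomyFormat \
    --input-path $taxonomy --output-path ref-taxonomy-99.qza

# Extract the reads matching your primers (primers here are the ones from
# the script above - replace with yours if they differ)
qiime feature-classifier extract-reads --i-sequences ref-seq-99.qza \
    --p-f-primer TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG \
    --p-r-primer GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC \
    --o-reads extracted-ref-seq.qza

# Train the naive Bayes classifier (this is the slow, RAM-hungry step)
qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads extracted-ref-seq.qza \
    --i-reference-taxonomy ref-taxonomy-99.qza \
    --o-classifier silva-132-99-classifier.qza
```

The resulting silva-132-99-classifier.qza is the file to point to in nextflow.config.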
Running analysis on QUT's HPC
...
If you haven’t been set up on the HPC, or haven’t used it previously, click on this link for information on how to get access to and use the HPC:
Need a link here for HPC access and usage
Creating a shared workspace on the HPC
...
To request a node using PBS, submit a shell script containing your RAM/CPU/analysis time requirements and the code needed to run your analysis. For an overview of submitting a PBS job, see here:
Need a link here for creating PBS jobs
Alternatively, you can start up an ‘interactive’ node, using the following:
...