Aims:
Implement an end-to-end bioinformatics workflow that is reproducible, robust, scalable and compute infrastructure agnostic
Leverage the host plant antiviral response pathway to increase the sensitivity and specificity of pathogen detection
Prevent or minimise the reporting of cross-sample contamination caused by index hopping (false-positive detections)
Pre-requisites
conda3 or miniconda3 installed (https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html)
Basic Unix command-line knowledge (for example: https://researchcomputing.princeton.edu/education/external-online-resources/linux ; https://swcarpentry.github.io/shell-novice/)
Familiarity with a Unix text editor (for example, Vi/Vim or Nano)
Have an HPC account on QUT’s lyra. Apply for a new HPC account here.
Install Nextflow
Database
A custom virus database is provided; please do not distribute it to third parties. Location:
/work/img/databases/
Creating a local blast database
makeblastdb -in test.fasta -parse_seqids -dbtype nucl
Method
The open-source VirReport code is available at https://github.com/eresearchqut/VirReport
1. Fetch a copy of VirReport
Get a copy of the tool:
git clone https://github.com/eresearchqut/VirReport.git
To run VirReport, you must create an 'index_samples.csv' file that specifies, for each sample, the sample ID, the path to the raw data, and the minimum and maximum length of reads to be used for diagnosis. For example:
sampleid,samplepath,minlen,maxlen
MT212,/work/hia_mt18005/diagnostics/2021/14_RAMACIOTTI_LEL9742-LEL9751/results/06_usable_reads/MT212_21-22bp.fastq,21,22
MT213,/work/hia_mt18005/diagnostics/2021/14_RAMACIOTTI_LEL9742-LEL9751/results/06_usable_reads/MT213_21-22bp.fastq,21,22
You can modify the above template with your own samples.
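Because a malformed index file is an easy mistake to make when editing the template by hand, it may help to check it before launching the workflow. The following is a minimal sketch only: check_index is a hypothetical helper and not part of VirReport. It assumes the four-column header shown above and that minlen and maxlen are whole numbers, and it does not check that the FASTQ paths exist.

```shell
# Hypothetical helper, not part of VirReport: sanity-check an index file.
# Prints one message per problem and returns non-zero if any were found.
check_index() {
    awk -F',' '
        BEGIN { bad = 0 }
        # Line 1 must be the header shown in the template.
        NR == 1 {
            if ($0 != "sampleid,samplepath,minlen,maxlen") {
                print "line 1: unexpected header"
                bad = 1
            }
            next
        }
        # Ignore blank lines.
        /^[[:space:]]*$/ { next }
        # Every sample row needs exactly four comma-separated fields.
        NF != 4 {
            print "line " NR ": expected 4 fields, found " NF
            bad = 1
            next
        }
        # minlen and maxlen must be whole numbers.
        $3 !~ /^[0-9]+$/ || $4 !~ /^[0-9]+$/ {
            print "line " NR ": minlen and maxlen must be integers"
            bad = 1
            next
        }
        # The read-length window must not be inverted.
        $3 + 0 > $4 + 0 {
            print "line " NR ": minlen is greater than maxlen"
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

After defining the function in your shell, calling check_index with the path to your index file prints nothing and returns success when the file passes these checks; otherwise it lists the offending lines.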
2. Run VirReport test
As an alternative to cloning a copy of VirReport (above), run the following command, which both downloads the VirReport tool and runs a test:
nextflow run eresearchqut/VirReport -profile singularity --indexfile index_example.csv