...

Aim:

This pipeline uses raw Oxford Nanopore (ONT) data to run the following processes:

De novo genome assembly of ONT reads using Flye https://github.com/fenderglass/Flye
Sequence comparison of the assembled genome to a provided reference genome
Nano-Q: conservatively cleans ONT reads from BAM files and estimates variant frequencies https://github.com/PrestonLeung/Nano-Q

Prerequisites:

Install Nextflow using the following User Guide: Nextflow
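
Before going further, it can help to confirm that the tools the pipeline relies on are visible in the current shell. The following is a minimal sketch; `require_cmd` is a hypothetical helper name, not something provided by Nextflow or the pipeline.

```shell
# Hypothetical helper: fail early if a required tool is not on PATH.
require_cmd() {
    if ! command -v "$1" > /dev/null 2>&1; then
        echo "missing required command: $1" >&2
        return 1
    fi
}
```

For example, after loading Java on the cluster, `require_cmd java && require_cmd nextflow && nextflow -version` should print the installed Nextflow version if the installation succeeded.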

Inputs:

ONT raw data in compressed FASTQ format (FASTQ.gz). If multiple FASTQ.gz files are available for the same sample, they all need to be in the same folder. DO NOT place raw files for different samples in the same folder.
Index file that provides information about the ONT data, including: sample ID, location of the ONT raw files and a reference genome:

Code Block
sampleid,sample_files,reference
NC483,/mnt/work/phylo/OxfordNanopore/NC483_barcode96/*.fastq.gz,/mnt/work/phylo/OxfordNanopore/NC483_NC001477_reference_sequence.fasta
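
Note that the `*.fastq.gz` pattern is written literally in the index file rather than expanded by the shell. As an illustration only, a row in this format could be produced with a small function such as the one below; `samplesheet_row` is a hypothetical name, and the sketch assumes all reads for a sample end in `.fastq.gz`.

```shell
# Hypothetical helper: print one index-file row.
# $1 = sample ID
# $2 = folder holding that sample's FASTQ.gz files
# $3 = reference genome FASTA
samplesheet_row() {
    # "${2%/}" drops one trailing slash so the path does not contain "//".
    # The glob sits inside the printf format string, so it is never expanded here.
    printf '%s,%s/*.fastq.gz,%s\n' "$1" "${2%/}" "$3"
}
```

Printing the header line `sampleid,sample_files,reference` first and then appending one `samplesheet_row` call per sample would give a file in the layout shown above.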

The index file (i.e., index.csv) can contain information for one or multiple samples, one per line:

Code Block

sampleid,sample_files,reference
ET300,/mnt/work/phylo/OxfordNanopore/ET300_barcode95/*.fastq.gz,/mnt/work/phylo/OxfordNanopore/ET300_MT921572_reference_sequence.fasta
NC483,/mnt/work/phylo/OxfordNanopore/NC483_barcode96/*.fastq.gz,/mnt/work/phylo/OxfordNanopore/NC483_NC001477_reference_sequence.fasta
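
A typo in a path is easier to catch before the job is queued than after. The sketch below checks that the header matches the three columns shown above, that each `sample_files` pattern matches at least one file, and that each reference file exists. `check_samplesheet` is a hypothetical helper, not part of the pipeline, and it assumes paths do not contain spaces.

```shell
# Hypothetical helper, not part of the ONTprocessing pipeline.
# The body runs in a subshell "( ... )" so its variables and positional
# parameters do not leak into the calling shell.
check_samplesheet() (
    sheet="$1"
    [ -r "$sheet" ] || { echo "cannot read $sheet" >&2; exit 1; }
    status=0
    line=0
    cr=$(printf '\r')
    while IFS=, read -r sampleid sample_files reference || [ -n "$sampleid" ]; do
        line=$((line + 1))
        # Tolerate Windows line endings
        sampleid=${sampleid%"$cr"}
        reference=${reference%"$cr"}
        if [ "$line" -eq 1 ]; then
            if [ "$sampleid,$sample_files,$reference" != "sampleid,sample_files,reference" ]; then
                echo "line 1: expected header sampleid,sample_files,reference" >&2
                status=1
            fi
            continue
        fi
        [ -n "$sampleid" ] || continue   # skip blank lines
        # Unquoted on purpose so the shell expands the glob.
        # If nothing matches, the pattern stays literal and the -e test fails.
        set -- $sample_files
        if [ ! -e "$1" ]; then
            echo "$sampleid: no files match $sample_files" >&2
            status=1
        fi
        if [ ! -f "$reference" ]; then
            echo "$sampleid: reference not found: $reference" >&2
            status=1
        fi
    done < "$sheet"
    exit "$status"   # exits only the subshell, giving the function's return status
)
```

Running `check_samplesheet index.csv` prints one message per problem to standard error and returns a non-zero status if any were found.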

Running the ONTprocessing nextflow pipeline:

Prepare a PBS Pro submission script to run the ONTprocessing pipeline. An example launch.pbs script is shown below:

Code Block

#!/bin/bash -l
#PBS -N ontprocessing
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=24:00:00

# Run the workflow from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

module load java
# Export so the JVM memory settings are visible to the nextflow process
export NXF_OPTS='-Xms1g -Xmx4g'

nextflow run eresearchqut/ontprocessing --samplesheet index.csv

Create a folder where your analyses will be run, and place a copy of both launch.pbs and index.csv in that folder. Then submit the job to the HPC cluster:

...

Monitor progress of the job:

Code Block
qjobs

Outputs:

A subfolder called ‘work’ contains all the intermediate files generated while running the ONTprocessing workflow. This folder is not typically used to assess the outputs but rather to debug any issues with the pipeline. Key intermediate files or outputs for specific ‘processes’ (analysis tasks) are saved to a subfolder called ‘results’, where data for the following analyses are presented:

...
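
Once a run has finished, a quick overview of what was written under ‘results’ can be obtained with a few lines of shell. This is only a sketch: `summarise_results` is a hypothetical helper, and it makes no assumption about the names of the subfolders the pipeline creates.

```shell
# Hypothetical helper: print each subfolder of the results directory
# together with the number of files it contains (tab-separated).
summarise_results() (
    results_dir="${1:-results}"
    for sub in "$results_dir"/*/; do
        [ -d "$sub" ] || continue        # nothing matched, or not a folder
        # Arithmetic expansion strips any padding that wc adds on some systems
        n=$(($(find "$sub" -type f | wc -l)))
        printf '%s\t%d\n' "$(basename "$sub")" "$n"
    done
)
```

Called without arguments from the run folder, `summarise_results` looks at `./results`; a subfolder reporting zero files may indicate a process that did not complete, in which case the ‘work’ folder is the place to look.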
