ONT de novo genome assembly

Aim:

Implement a de novo assembly pipeline for Oxford Nanopore Technologies (ONT) sequencing data.

Sources:

https://nanoporetech.com/sites/default/files/s3/literature/Bacterial-assembly-workflow.pdf

https://github.com/fenderglass/Flye/blob/flye/docs/INSTALL.md

Methodology

a) Create a conda environment with Flye and its dependencies

Prepare a file called ‘environment.yml’ that contains the following information:

name: flye
channels:
  - defaults
  - anaconda
  - bioconda
  - conda-forge
dependencies:
  - python=3.7
  - flye=2.9.1
  - seqkit=2.3.1
  - blast=2.13.0
  - nanoq=0.9.0
  - minimap2
  - samtools

Run the following command to generate the ‘flye’ conda environment:

conda env create -f environment.yml

Activate the environment to access the installed tools:

conda activate flye
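Before going further, it can help to confirm that the tools from environment.yml are actually on the PATH. The sketch below is one possible check; the function name check_tools is made up for illustration, and blastn is assumed to be the executable of interest from the blast package.

```shell
# Report which of the given tools are available on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call (run inside the activated 'flye' environment):
# check_tools flye seqkit blastn nanoq minimap2 samtools
```

A missing tool usually means the environment was not activated or its creation did not finish cleanly.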

Merge the FASTQ files into a single file.

Get the absolute path of the merged file.

Modify the launch*.pbs file so that it uses the merged file.
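The merge and path steps can be sketched as a small shell function. This is a minimal example that assumes the reads are gzip-compressed files ending in .fastq.gz and sitting in one directory; the function name merge_fastq and the example paths are placeholders, not part of the original pipeline.

```shell
# Concatenate all *.fastq.gz files in a directory into one file and print
# the absolute path of the result. Concatenated gzip streams remain valid gzip.
merge_fastq() {
    in_dir=$1      # directory holding the individual FASTQ files (assumed layout)
    out_file=$2    # merged output file; keep it outside in_dir
    cat "$in_dir"/*.fastq.gz > "$out_file"
    realpath "$out_file"
}

# Example call with placeholder paths:
# merge_fastq fastq_pass merged.fastq.gz
```

The printed path is the value to place in the launch file. For uncompressed reads, the same approach works with the pattern changed to *.fastq.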

 

b) Run a de novo assembly and sequence comparison against a reference genome

Submit the job.
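The contents of the launch*.pbs file are not shown here, so the following is only a hypothetical sketch of what a minimal Flye launch script could look like and how it might be submitted. The job name, resource values, PBS Pro style "select" line, file names and the choice of --nano-raw are all assumptions; adjust them to the local scheduler and read type.

```shell
# Write a minimal PBS launch script for Flye to stdout.
# All resource values and names below are placeholders, not the pipeline's real settings.
write_launch_script() {
    reads=$1     # absolute path of the merged FASTQ file
    out_dir=$2   # directory Flye should write results to
    threads=$3   # number of CPU threads to request and use
    cat <<EOF
#!/bin/bash
#PBS -N flye_assembly
#PBS -l select=1:ncpus=${threads}:mem=32gb
#PBS -l walltime=24:00:00

cd "\$PBS_O_WORKDIR"
source "\$(conda info --base)/etc/profile.d/conda.sh"
conda activate flye

flye --nano-raw "${reads}" --out-dir "${out_dir}" --threads ${threads}
EOF
}

# Example (placeholder paths), followed by submission:
# write_launch_script /path/to/merged.fastq.gz flye_out 8 > launch_example.pbs
# qsub launch_example.pbs
```

qsub prints a job identifier on submission, which is useful for checking the job later.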

 

Monitor the progress of the assembly:
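One way to follow a running job is to check the queue and read the end of Flye's log, which Flye writes as flye.log inside its output directory. The helper below is a sketch; the function name show_progress and the directory name flye_out are placeholders.

```shell
# Print the most recent lines of Flye's log for a given output directory.
show_progress() {
    out_dir=$1          # the --out-dir passed to Flye
    n_lines=${2:-10}    # how many trailing log lines to show (default 10)
    log="$out_dir/flye.log"
    if [ -f "$log" ]; then
        tail -n "$n_lines" "$log"
    else
        echo "no log yet: $log"
    fi
}

# Example: check the queue, then the log (placeholder directory name):
# qstat -u "$USER"
# show_progress flye_out 20
```

If the log does not exist yet, the job is most likely still queued or has not reached the point where Flye starts.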