ONT de novo genome assembly
Aim:
Implement a de novo genome assembly pipeline for Oxford Nanopore Technologies (ONT) sequencing data
Sources:
https://nanoporetech.com/sites/default/files/s3/literature/Bacterial-assembly-workflow.pdf
https://github.com/fenderglass/Flye/blob/flye/docs/INSTALL.md
Methodology
a) Create a conda environment with Flye and its dependencies
Prepare a file called ‘environment.yml’ that contains the following information:
name: flye
channels:
  - defaults
  - anaconda
  - bioconda
  - conda-forge
dependencies:
  - python=3.7
  - flye=2.9.1
  - seqkit=2.3.1
  - blast=2.13.0
  - nanoq=0.9.0
  - minimap2
  - samtools
Run the following command to generate the ‘flye’ conda environment:
conda env create -f environment.yml
Activate the environment to access the installed tools:
conda activate flye
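Before continuing, it can help to confirm that the tools listed in environment.yml are on the PATH of the activated environment. The sketch below is not part of the original protocol: check_tools is a hypothetical helper name, and it assumes the blast package provides the blastn executable.

```shell
# Hypothetical helper: report which of the named executables are missing
# from PATH. Returns 0 when all are found, 1 otherwise.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Executables expected from the packages in environment.yml
# (the blast package is assumed to supply blastn).
if check_tools flye seqkit blastn nanoq minimap2 samtools; then
    echo "all tools found"
else
    echo "some tools are missing; is the 'flye' environment active?"
fi
```

If any tool is reported as missing, re-check that the environment was created without errors and that it is the active one.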
Merge the FASTQ files into a single file.
Get the absolute path of the merged file.
Modify the launch*.pbs file so that it uses the merged file as input.
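These preparation steps can be sketched as follows. This is a minimal illustration, not the original scripts. It assumes the sequencing run produced gzipped FASTQ chunks in a single directory (concatenated gzip files remain valid gzip), and that the PBS script stores the input path in a variable named READS. The function names, the READS variable, and the file names in the usage note are all hypothetical.

```shell
# Merge every *.fastq.gz chunk in a directory into one file and print the
# absolute path of the result. Write the output outside the input directory
# so that it is not picked up by the glob on a later run.
merge_fastq() {
    local in_dir="$1" out_file="$2"
    cat "$in_dir"/*.fastq.gz > "$out_file"
    realpath "$out_file"
}

# Point a PBS script at the merged file by rewriting its READS= line.
# READS is an assumed variable name; adapt the pattern to the real script.
# Paths containing '|' or '&' would need escaping for sed.
set_reads_path() {
    local pbs_file="$1" reads_path="$2" tmp
    tmp="$(mktemp)"
    sed "s|^READS=.*|READS=\"$reads_path\"|" "$pbs_file" > "$tmp" \
        && cat "$tmp" > "$pbs_file"   # overwrite in place, keeping file permissions
    rm -f "$tmp"
}
```

For example, reads=$(merge_fastq fastq_pass merged.fastq.gz) stores the absolute path, and set_reads_path launch_example.pbs "$reads" writes it into the script (both file names are placeholders).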
b) Run a de novo assembly and sequence comparison against a reference genome
Submit the job.
Monitor the progress of the assembly:
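One possible way to carry out these two steps is sketched here, assuming a PBS scheduler (as the .pbs script suggests) and a known Flye output directory. Flye writes a flye.log file into its output directory. The helper assumes that stage changes are logged on lines containing ">>>STAGE:", which should be checked against a log from the installed version. The name current_stage and the file and directory names in the comments are hypothetical.

```shell
# Print the most recent pipeline stage recorded in a Flye log file.
# Assumes stage changes appear on lines containing ">>>STAGE: <name>".
current_stage() {
    local log_file="$1"
    grep -o '>>>STAGE: .*' "$log_file" | tail -n 1 | sed 's/^>>>STAGE: //'
}

# Typical usage on a PBS cluster (file and directory names are placeholders):
#   qsub launch_example.pbs            # submit the job
#   qstat -u "$USER"                   # check whether it is queued or running
#   tail -f flye_out/flye.log          # follow the assembler's log
#   current_stage flye_out/flye.log    # show the latest stage only
```

If the log contains no stage lines yet, current_stage prints nothing.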