Aim:
Implement a de novo assembly pipeline for ONT data
Sources:
https://nanoporetech.com/sites/default/files/s3/literature/Bacterial-assembly-workflow.pdf
https://github.com/fenderglass/Flye/blob/flye/docs/INSTALL.md
Methodology:
a) Create a conda environment with Flye and its dependencies
Create a Python 3.7 environment called 'flye':
conda create --name flye python=3.7
Activate the conda environment:
conda activate flye
Install Flye from the bioconda channel:
conda install -c bioconda -c conda-forge flye
b) Install SeqKit, which is used to convert FASTQ to FASTA
conda install -c bioconda seqkit
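Once SeqKit is installed, its fq2fa subcommand performs the conversion. The following is a minimal sketch, not part of the original workflow: the wrapper function fastq_to_fasta and the file names are hypothetical, and the awk fallback assumes strict four-line FASTQ records.

```shell
#!/bin/bash
# Convert FASTQ to FASTA. Uses seqkit when it is on PATH; otherwise falls
# back to awk, which assumes every record is exactly four lines.
fastq_to_fasta() {
    local in_fq="$1" out_fa="$2"
    if command -v seqkit >/dev/null 2>&1; then
        seqkit fq2fa "$in_fq" -o "$out_fa"
    else
        awk 'NR % 4 == 1 { print ">" substr($0, 2) }
             NR % 4 == 2 { print }' "$in_fq" > "$out_fa"
    fi
}

# Example (hypothetical file names):
#   fastq_to_fasta reads.fastq reads.fasta
```

Flye accepts FASTQ input directly, so this conversion is only needed for downstream tools that require FASTA.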
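c) Install BLAST
conda install -c bioconda blast
Alternatively, create the whole environment in one step from an environment.yml file with the following contents: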
name: flye
channels:
  - defaults
  - anaconda
  - bioconda
  - conda-forge
dependencies:
  - python=3.7
  - flye=2.9.1
  - seqkit=2.3.1
  - blast=2.13.0
Run the following command to create the 'flye' conda environment from environment.yml:
conda env create -f environment.yml
Activate the environment to access the installed tools:
conda activate flye
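Before submitting a job it can help to confirm that the activated environment actually exposes the expected executables. The following is a minimal sketch; the helper name missing_tools is hypothetical, and blastn is assumed to be the BLAST+ program of interest.

```shell
#!/bin/bash
# Print the name of every listed tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example, run after 'conda activate flye':
#   missing_tools flye seqkit blastn || echo "environment is incomplete"
```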
d) Run a test de novo assembly
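#!/bin/bash -l
#PBS -N FlyeAssembly
#PBS -l walltime=24:00:00
#PBS -l mem=16gb
#PBS -l ncpus=8

# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

# Input reads: replace the placeholder with the path to the ONT read file(s)
ONTDATA='/path/to/ONT/reads'
# Reference sequence (not used by the assembly step itself)
REF='ET300_MT921572_reference_sequence.fasta'

conda activate flye

# Run the assembly; --threads matches the ncpus requested above
flye --nano-raw "$ONTDATA" --out-dir out_nano --threads 8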
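Flye writes the final contigs to assembly.fasta inside the output directory. As a quick sanity check on a test run, the number of contigs and the total assembled length can be counted. The following is a minimal sketch; the helper name fasta_stats is hypothetical and is not part of Flye.

```shell
#!/bin/bash
# Print "<number of sequences> <total bases>" for a FASTA file.
fasta_stats() {
    awk '/^>/ { n++; next }
         { gsub(/[ \t\r]/, ""); total += length($0) }
         END { print n + 0, total + 0 }' "$1"
}

# Example, after the job above has finished:
#   fasta_stats out_nano/assembly.fasta
```

For a bacterial or viral genome, a small number of contigs with a total length close to the expected genome size suggests the run completed sensibly.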