Deepvariant analysis using ONT data

Aim:

Identify sequence variants using the outputs of the nf-eresearch/ONTprocessing pipeline (NextFlow pipeline for Oxford Nanopore de novo assembly and reference-guided consensus). The minimap2 alignments generated by that pipeline are processed with PEPPER (https://github.com/kishwarshafin/pepper) to identify highly reliable sequence variants (i.e., SNPs).

Preparing a samplesheet.csv file

The Nextflow ‘eresearch/deepvariant’ pipeline requires a sample metadata file (samplesheet) with four comma-separated columns: sample ID, BAM alignment, BAI index, and genome reference. For example:

sampleid,sample_files,sample_files_index,reference
NC483,/ontprocessing/NC483/run1/results/samtools/NC483_aln.sorted.bam,/ontprocessing/NC483/run1/results/samtools/NC483_aln.sorted.bam.bai,/data/ref/NC483_NC001477_reference_sequence.fasta

Use the following script (named run_create_deepvariant_index.sh) to generate the index file. Note: modify the sample ID and the reference sequence location as appropriate.

#!/bin/bash
# eResearch, QUT
# Usage: ./run_create_deepvariant_index.sh
########################################################################################
SAMPLEID='NC483'
# Absolute paths to the alignment and its index (assumes one BAM in ./results/samtools)
BAM=$(readlink -f ./results/samtools/*.bam)
BAI=$(readlink -f ./results/samtools/*.bam.bai)
REF='/mnt/work/phylo/OxfordNanopore/NC483_NC001477_reference_sequence.fasta'
########################################################################################
# header for index file
echo "sampleid,sample_files,sample_files_index,reference" > header
# create sample metadata (awk prints one row per line of 'header', i.e. exactly one row)
awk -v sampleid2="$SAMPLEID" -v bam2="$BAM" -v bai2="$BAI" -v ref2="$REF" '{print sampleid2 "," bam2 "," bai2 "," ref2}' header > index_deepvariant
# merge header and location of files
cat header index_deepvariant > index_deepvariant.csv
# remove intermediate files
rm header index_deepvariant

Run the above script from within the ‘ONTprocessing’ folder for the sample of interest, just outside the ‘results’ and ‘work’ folders. Once all the variables have been adjusted, run the following command:

./run_create_deepvariant_index.sh

Check that the index file has been properly generated.
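One way to make this check repeatable is a small shell function. The sketch below is not part of the pipeline; the function name check_samplesheet is hypothetical. It assumes the four-column layout shown in the example samplesheet and that no file path contains a comma.

```shell
#!/bin/bash
# Hypothetical helper (not part of the pipeline): sanity-check a samplesheet.
# Usage: check_samplesheet index_deepvariant.csv
check_samplesheet() {
    local sheet="$1"
    local expected="sampleid,sample_files,sample_files_index,reference"

    # The file must exist and be non-empty
    if [ ! -s "$sheet" ]; then
        echo "ERROR: $sheet is missing or empty" >&2
        return 1
    fi

    # The first line must match the expected header exactly
    if [ "$(head -n 1 "$sheet")" != "$expected" ]; then
        echo "ERROR: unexpected header in $sheet" >&2
        return 1
    fi

    # Every line must have exactly four comma-separated fields
    if ! awk -F',' 'NF != 4 { bad = 1 } END { exit bad + 0 }' "$sheet"; then
        echo "ERROR: a row in $sheet does not have 4 fields" >&2
        return 1
    fi

    # There must be at least one sample row after the header
    if ! awk 'END { exit (NR >= 2 ? 0 : 1) }' "$sheet"; then
        echo "ERROR: $sheet has no sample rows" >&2
        return 1
    fi

    echo "OK: $sheet looks well formed"
}
```

A row with the wrong number of fields usually means the *.bam glob in the script matched more than one file, so the paths were written across several lines.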

Running the ‘deepvariant’ analysis

Create a folder for the deepvariant analysis and copy the ‘index_deepvariant.csv’ file to it.

Prepare the following PBS Pro script to run the ‘deepvariant’ analysis using the minimap2 BAM files produced by the ‘ONTprocessing’ pipeline.

Create a folder where your analyses will be run, and place a copy of both launch.pbs and index.csv in the same folder. Then submit the job to the HPC cluster:
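The submission command itself is not reproduced on this page. On a PBS Pro cluster, jobs are normally submitted with qsub, so a minimal sketch could look like the following. The function name submit_deepvariant_job is hypothetical, and the samplesheet file name is left as an argument because this page refers to both index.csv and index_deepvariant.csv.

```shell
#!/bin/bash
# Sketch only: check that the expected files are present, then submit launch.pbs.
# Usage: submit_deepvariant_job [analysis_folder] [samplesheet_name]
submit_deepvariant_job() {
    local dir="${1:-.}"
    local sheet="${2:-index.csv}"   # assumed default, taken from the text above
    local f

    # Both files must be in the analysis folder before submitting
    for f in launch.pbs "$sheet"; do
        if [ ! -f "$dir/$f" ]; then
            echo "ERROR: $dir/$f not found" >&2
            return 1
        fi
    done

    # qsub prints the job identifier on success
    (cd "$dir" && qsub launch.pbs)
}
```

Keep the job identifier that qsub prints; it is needed to query the job later.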

Monitor progress of the job:
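The monitoring command is likewise not reproduced here. With PBS Pro, qstat -u "$USER" lists your queued and running jobs. For unattended waiting, the sketch below polls qstat for a single job. The function name wait_for_job and the 60-second default are hypothetical, and the loop assumes that qstat returns a non-zero exit status once the job is no longer listed, which may depend on how the cluster is configured.

```shell
#!/bin/bash
# Sketch only: poll qstat until the given job is no longer listed.
# Usage: wait_for_job <job_id> [seconds_between_polls]
wait_for_job() {
    local jobid="$1"
    local interval="${2:-60}"   # arbitrary default polling interval

    # Assumption: qstat exits non-zero when the job has left the queue
    while qstat "$jobid" > /dev/null 2>&1; do
        sleep "$interval"
    done

    echo "Job $jobid is no longer in the queue"
}
```

The loop only reports that the job has left the queue, not whether it succeeded, so check the job's output and error files afterwards.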
