Visualising VCF files in IGV

Goal:

Merge the VCF files generated by the Nextflow consgenome pipeline and visualise the merged file in IGV.

The VCF files produced by the consgenome pipeline are located in the following directory:

cd /results/variantcalling/

Merging multiple VCF files

We will use 'bcftools' to merge the VCF files. First, we need to create a list of the VCF files that we plan to merge. For example, while in the above directory, run the following:

for i in *.vcf; do echo "$i"; done | tee file_list.txt

If you do not yet have 'bcftools' installed, run the following command to install it using conda:

conda install -c bioconda bcftools

Now let’s create an index file for each VCF file:
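A minimal sketch of this step, assuming the VCF files are uncompressed *.vcf files in the current directory. 'bcftools index' only works on bgzip-compressed files, so each file is first compressed to NAME.vcf.gz. The function name compress_and_index_vcfs is illustrative and not part of the pipeline.

```shell
# Compress every *.vcf in the current directory with bgzip and index it.
# Assumption: inputs are plain-text VCFs; outputs are written next to them as NAME.vcf.gz.
compress_and_index_vcfs() {
    for vcf in *.vcf; do
        [ -e "$vcf" ] || continue    # skip if the glob matched nothing
        # Write a bgzip-compressed copy of the VCF
        bcftools view --output-type z --output "$vcf.gz" "$vcf" || return 1
        # Create a tabix index (NAME.vcf.gz.tbi) for the compressed copy
        bcftools index --tbi "$vcf.gz" || return 1
    done
}
```

Run compress_and_index_vcfs from the directory that holds the VCF files. The original *.vcf files are left untouched.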

Finally, we can now merge all the VCF files by:
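One way to run the merge, sketched under the assumption that every file named in file_list.txt has been bgzip-compressed and indexed as NAME.vcf.gz, which 'bcftools merge' requires. The output name merged.vcf.gz, the helper list file_list_gz.txt and the function name merge_vcfs are illustrative choices.

```shell
# Merge the VCFs named in a list file into one bgzip-compressed, indexed VCF.
merge_vcfs() {
    list="${1:-file_list.txt}"
    out="${2:-merged.vcf.gz}"
    # Point each NAME.vcf entry at its compressed copy; entries already ending in .gz are kept.
    sed 's/\.vcf$/.vcf.gz/' "$list" > file_list_gz.txt || return 1
    # Merge all listed files into a single compressed VCF
    bcftools merge --file-list file_list_gz.txt --output-type z --output "$out" || return 1
    # Index the merged file so that it can be queried and loaded efficiently
    bcftools index --tbi "$out"
}
```

Calling merge_vcfs with no arguments reads file_list.txt and writes merged.vcf.gz in the current directory.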

Create an uncompressed version of the merged VCF:
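A short sketch of the decompression, assuming the merged file is bgzip-compressed and called merged.vcf.gz (a hypothetical name). bgzip output is gzip-compatible, so plain gzip can read it. The function name uncompress_vcf is illustrative.

```shell
# Write a plain-text copy of a compressed VCF next to it, keeping the compressed file.
uncompress_vcf() {
    src="${1:-merged.vcf.gz}"
    # Strip the .gz suffix to get the output name, e.g. merged.vcf
    gzip -dc "$src" > "${src%.gz}"
}
```

Calling uncompress_vcf with no arguments writes merged.vcf in the current directory.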

Visualising the merged VCF file in IGV

Transfer/copy the following files from the HPC to your laptop/desktop.

  • Merged VCF file

  • Reference template used in the mapping stage (for example a reference genome, scaffold, genomic locus or transcript). The location of the reference template is specified in the nextflow.config file used to run the consgenome workflow.
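The transfer can be sketched with scp, run on the laptop/desktop. Everything passed to the function is a placeholder: the login, the remote paths and the destination directory depend on your own HPC account and on the reference path given in nextflow.config. The function name fetch_for_igv is illustrative.

```shell
# Copy the merged VCF and the reference from the HPC to a local directory.
# All arguments are placeholders: supply your own login, remote paths and destination.
fetch_for_igv() {
    remote="$1"        # HPC login, e.g. user@hpc.example.org (hypothetical)
    merged_vcf="$2"    # remote path to the merged VCF
    reference="$3"     # remote path to the reference, as given in nextflow.config
    dest="${4:-.}"     # local destination directory (default: current directory)
    mkdir -p "$dest" || return 1
    scp "$remote:$merged_vcf" "$remote:$reference" "$dest/"
}
```

If you transfer the compressed merged VCF rather than the uncompressed one, copy its .tbi index file as well.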

On your laptop/desktop, open IGV and then:

  • Genomes → Load Genome from File → select the relevant template

  • File → Load from File → select the merged VCF file

It is possible to add attributes (metadata) to IGV as a separate file, for example to assess whether variants are associated with control or unhealthy groups. To do this, create a tab-delimited file that contains, for example, the following information:
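As an illustration, the commands below write a small attributes file. The sample names, column names and values are made-up placeholders. The first row names the attributes, and the first column should contain the sample names as they appear in the merged VCF (they can be listed with 'bcftools query -l').

```shell
# Write a tab-delimited attributes file: header row first, then one row per sample.
# Sample names and values below are hypothetical placeholders.
printf 'Sample\tGroup\tSex\n'       >  attributes.txt
printf 'sample_01\tcontrol\tF\n'    >> attributes.txt
printf 'sample_02\tunhealthy\tM\n'  >> attributes.txt
```

Each further column adds one attribute that IGV can display alongside the samples.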

Load the attributes (metadata) file into IGV in the same way as a data file:

  • File → Load from File → select the attributes file

Then, to show the attributes as a separate track, do the following:

  • View → Show Attribute Display

To select which attributes to show in the IGV browser, do the following:

  • View → Select Attributes to Show

Another useful feature of IGV is the ability to mark regions of interest (ROIs) for display:

  • Regions → Region Navigator

The above will open a small window where one can specify the start and end coordinates of a feature of interest, for example an exon or exons.