Goal:
Merge multiple VCF files generated by the consgenome workflow for visualisation in IGV (https://software.broadinstitute.org/software/igv/)
Use IGV to inspect variants that are common or unique to the group(s) of processed samples
Merging VCF files generated by the nextflow consgenome pipeline
The VCF files produced by the consgenome pipeline are located at:
cd /results/variantcalling/
Merging multiple VCF files
We will use 'bcftools' to merge the VCF files. First, create a list of the VCF files to merge. For example, while in the above directory, run the following (the list is overwritten on each run, so rerunning it does not create duplicate entries):
for i in *.vcf; do echo "$i"; done | tee file_list.txt
If you do not yet have 'bcftools' installed, run the following command to install it using conda:
conda install -c bioconda bcftools
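Before continuing, it may help to confirm that bcftools is actually reachable from the current shell. The following is a minimal sketch; the variable name `status` is only illustrative.

```shell
# Check whether bcftools is on the PATH and, if so, print its version line.
if command -v bcftools >/dev/null 2>&1; then
    status=found
    bcftools --version | head -n 1
else
    status=missing
    echo "bcftools not found on PATH; activate the conda environment it was installed into"
fi
```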
Now let’s create an index file for each VCF file:
while read -r i; do echo "$i"; bcftools index "$i"; done < file_list.txt
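Note that `bcftools index` only accepts BGZF-compressed input, and `bcftools merge` expects compressed, indexed files. If the pipeline writes plain-text `*.vcf` files, they need to be compressed first. The sketch below assumes that situation; the list name `file_list_gz.txt` is a hypothetical choice, not something the consgenome pipeline produces.

```shell
# Sketch: compress each plain-text VCF to BGZF, index it, and record the
# compressed file names for merging.
# file_list_gz.txt is a hypothetical name chosen for this example.
: > file_list_gz.txt                        # start from an empty list
for vcf in *.vcf; do
    [ -e "$vcf" ] || continue               # skip if the glob matched nothing
    bcftools view -Oz -o "$vcf.gz" "$vcf"   # write a BGZF-compressed copy
    bcftools index "$vcf.gz"                # index the compressed copy
    echo "$vcf.gz" >> file_list_gz.txt
done
```

If you take this route, pass `-l file_list_gz.txt` to the merge command in place of `file_list.txt`.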
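Finally, merge all the VCF files. Here -l reads the input file names from the list, -0 sets genotypes at sites missing from a file to reference (0/0), and -Oz writes bgzip-compressed VCF output: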
bcftools merge -l file_list.txt -0 -Oz -o merged_ALL.vcf.gz
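IGV typically expects an index file next to a bgzip-compressed VCF, so it is worth indexing the merged file and checking which samples it contains before copying it. A minimal sketch, assuming the merged file name used above:

```shell
merged=merged_ALL.vcf.gz
if [ -f "$merged" ]; then
    bcftools index -t "$merged"   # write a tabix (.tbi) index next to the file
    bcftools query -l "$merged"   # list the sample names, one per line
else
    echo "$merged not found; run the merge step first" >&2
fi
```

If an index is created, copy it together with the merged VCF so that both sit in the same folder on your laptop/desktop.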
Visualising the merged VCF file in IGV
Transfer/copy the following files from the HPC to your laptop/desktop:
Merged VCF file
Reference template used in the mapping stage (for example a reference genome, scaffold, genomic locus or transcript). Its location is specified in the nextflow.config file used to run the consgenome workflow.
On your laptop/desktop, open IGV and then:
Genomes → Load Genome from File → select the relevant template
File → Load from File → select the merged VCF file
It is possible to add attributes (metadata) to IGV as a separate file, for example to assess whether variants are associated with the control or unhealthy group. To do this, create a tab-delimited file containing, for example, the following information:
sampleID	group
AT001	control
AT002	control
AT003	control
AT004	control
AT005	unhealthy
AT006	unhealthy
AT007	unhealthy
AT008	unhealthy
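Text editors sometimes replace tabs with spaces, so one option is to generate the attributes file from the shell. The sketch below writes the example table to metadata.txt using the sample IDs and groups shown above; it assumes the IDs match the sample names in the merged VCF.

```shell
# Write the example attributes table with real tab characters.
{
    printf 'sampleID\tgroup\n'
    for id in AT001 AT002 AT003 AT004; do printf '%s\tcontrol\n' "$id"; done
    for id in AT005 AT006 AT007 AT008; do printf '%s\tunhealthy\n' "$id"; done
} > metadata.txt

# Sanity check: every line should have exactly two tab-separated fields.
awk -F'\t' 'NF != 2 { bad = 1 } END { exit bad }' metadata.txt && echo "metadata.txt looks OK"
```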
Load the attributes (metadata) file to IGV as follows:
File → Load from File → select the metadata file (e.g., metadata.txt)
Then, to show the attributes as a separate track, do the following:
View → Show Attribute Display
To select which attributes to show in the IGV browser do the following:
View → Select Attributes to Show
IGV also lets you mark regions of interest (ROIs) so that they are highlighted in the browser:
Regions → Region Navigator
This opens a small window where you can specify the start and end coordinates of a feature of interest, for example an exon or exons.
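When there are many regions, typing coordinates by hand becomes tedious. To my understanding, IGV can also read regions from a BED file (Regions → Import Regions), so a small BED file can be prepared in advance. In the sketch below the sequence name, coordinates, labels and the file name regions.bed are all hypothetical placeholders; the sequence name must match one in your reference template, and BED start positions are 0-based.

```shell
# Hypothetical regions: replace chr1 and the coordinates with values
# that exist in your own reference template.
printf 'chr1\t1000\t1500\texon1\n'  > regions.bed
printf 'chr1\t2000\t2300\texon2\n' >> regions.bed
cat regions.bed
```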