For this exercise we will use the epi2me-labs/wf-human-variation pipeline. Find information on the pipeline at https://labs.epi2me.io/workflows/wf-human-variation/

...

Approximate run time: variable, depending on whether the data come from targeted or whole genome sequencing, and on the coverage and the individual analyses requested. For instance, a 90X human sample run (options: --snp --sv --mod --str --cnv --phased --sex male) takes less than 8h with recommended resources.

NOTE: in contrast to the nf-core/sarek pipeline that we used in session 2, the epi2me-labs/wf-human-variation pipeline runs in ‘local’ mode, so the job needs a large number of CPUs and a large amount of RAM. The nf-core/sarek pipeline uses ‘pbspro’ mode, where the pipeline submits individual jobs to the HPC cluster and defines the CPUs and memory for each task individually.

...

Copy the script for the exercise:

Code Block
cp /work/training/ONTvariants/scripts/launch_ONTvariants_epi2me-labs_WFHV.pbs .

Print the content of the script:

...

Once the pipeline has completed you will see the following set of output files in the ‘results’ folder:

Code Block
.
├── execution
│   ├── report.html
│   ├── timeline.html
│   └── trace.txt
├── jbrowse.json
├── OPTIONAL_FILE
├── SRR17138639.flagstat.tsv
├── SRR17138639.mosdepth.global.dist.txt
├── SRR17138639.mosdepth.summary.txt
├── SRR17138639.readstats.tsv.gz
├── SRR17138639.regions.bed.gz
├── SRR17138639.stats.json
├── SRR17138639.thresholds.bed.gz
├── SRR17138639.wf-human-alignment-report.html
├── SRR17138639.wf-human-snp-report.html
├── SRR17138639.wf-human-sv-report.html
├── SRR17138639.wf_snp_clinvar.vcf
├── SRR17138639.wf_snp.vcf.gz
├── SRR17138639.wf_snp.vcf.gz.tbi
├── SRR17138639.wf_sv.vcf.gz
└── SRR17138639.wf_sv.vcf.gz.tbi
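
Before opening the reports, it can be useful to confirm that the main outputs listed above were actually written. The following is a minimal sketch, not part of the pipeline: the function name check_outputs is hypothetical, and it assumes the results folder and sample name (SRR17138639) used in this exercise. It only checks a handful of the files from the listing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of wf-human-variation):
# report which expected outputs are missing from a results folder.
# Usage: check_outputs <results_dir> <sample_name>
check_outputs() {
    local results_dir="$1" sample="$2" missing=0 suffix
    # Suffixes taken from the file listing above; extend as needed.
    for suffix in \
        wf-human-alignment-report.html \
        wf-human-snp-report.html \
        wf-human-sv-report.html \
        wf_snp.vcf.gz \
        wf_sv.vcf.gz; do
        if [ ! -e "${results_dir}/${sample}.${suffix}" ]; then
            echo "MISSING: ${sample}.${suffix}"
            missing=$((missing + 1))
        fi
    done
    echo "missing files: ${missing}"
    # Return non-zero when anything is missing so the check can be scripted.
    [ "${missing}" -eq 0 ]
}
```

For this exercise the call would be: check_outputs results SRR17138639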

Let’s inspect the HTML reports: wf-human-alignment-report.html, wf-human-snp-report.html and wf-human-sv-report.html.

NOTE: To proceed, you need to be on QUT’s WiFi network or signed in via VPN.

To browse the working folder on the HPC, type the following in the file finder:

...

Now browse to the runs run3_variant_calling folder → results folder and open the HTML reports.

Code Block
├── SRR17138639.wf-human-alignment-report.html
├── SRR17138639.wf-human-snp-report.html
├── SRR17138639.wf-human-sv-report.html

Notes:

  • Transitions (Ti or Ts) vs transversions (Tv): a Whole Genome Sequencing (WGS) study typically finds a Ti/Tv ratio of about 2.1, while exome studies typically report a Ti/Tv ratio of about 2.8

...

  • ClinVar: the pipeline reports variants that overlap known clinical variants of interest (see: wf-human-snp-report.html)

  • Structural variants: the dataset used in this workshop does not contain real SVs; instead, the pipeline reports insertions or deletions in regions where there are “N” bases on chromosome 20. For example:

Code Block
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  SRR17138639
chr20   71803   Sniffles2.DEL.10B2S0    TGAAAAGCTAAATTAAACTAATTAAGCTAAAG        N       39.0    PASS    PRECISE;SVTYPE=DEL;SVLEN=-32;END=7183
chr20   72147   Sniffles2.INS.7S0       N       AAAAGTCAAAAAATAACAGACACTGGTATACAGAAGAAAAGGAACACTTATACAC 40.0    PASS    PRECISE;SVTYPE=INS;SV
chr20   97733   Sniffles2.DEL.10B8S0    TAAGTCCCGCATGCATTAGCTATTTGTCTTAATGCTCTG N       39.0    PASS    PRECISE;SVTYPE=DEL;SVLEN=-39;END=9777
chr20   105784  Sniffles2.DEL.10BBS0    TGTTGGGCCTGGAGAGAGTGGGACCACCTTTGCCATGGGACTGGGGTGCATCTGTTCTGCAGGCCCTCCTACCTGTAGCCCCTCCGAAGGCCCCTGCCTAG
chr20   161017  Sniffles2.DEL.10C3S0    ATCCATTTCTTCTAGATTTTCTAGTTTATTTGTGTAGAGGTGTTTGTAGTATTCTCTGATGGTAGTTTGTATTTCTGTGGGATCGGTGGTGATATCCCCTT
chr20   173685  Sniffles2.INS.28S0      N       GGCAAGTACGGCACTGGGGGGCAGAACCCCCAACTTCTCCATGTCTCTACCCCTTCTCTTTTCTTGGGGAGACTGGCTTTTCCCAACCCCTTC
chr20   173712  Sniffles2.INS.27S0      N       TGTCTCTGCCCCTTCTCCACTTTTCTGGGGGCGAGAACCCCCAACCCCTTCTCCTTCACCCTTAGTGGCAATTACCGCTTTTCTGAGGGGCAA
chr20   174783  Sniffles2.INS.2BS0      N       GGAGCTTGCTACAAGCGCCAGAAATCTGGCCACCAGGCCAAGAATGTCCGCAGCCTGGGATTCCTCCTAAGCCGCGTCCCATCTGTGAAGGAC
chr20   175777  Sniffles2.INS.2DS0      N       GGATACTTTTTGACTTCGAAACCTGGTTTTGCCATCCTAATAAAACCATTATATAAACTCACAAAAAGGAAACCTAGCTGACCCCATAGATCC
chr20   176457  Sniffles2.INS.30S0      N       AATTGACTTTACTCACATGCCCCGGATCAGAAAACTAAAATACCTCTTAGTCTAGGTAGACACTTTCACTGGATAGGTAGGGCCTTTCCCACA
chr20   176457  Sniffles2.INS.2FS0      N       TGAGATGCTACAGGAGTGGTCCATTTGAACTTTTATATGGACACTTTCTTGCTTGGCCCCAACCTCATCCCAGACACCAGCCCTCTAGGTGAC
chr20   177062  Sniffles2.DEL.10CAS0    GCCCAACTACACACATCACTGAAACAATAGGAGCCTTCCAGCTACATATTACAGACAAGCCCTCTATCAATACTGGCAAACTTAAAAACATTAGCTGTAAT
chr20   178476  Sniffles2.DEL.10CDS0    GAAGTAACTGAAGAATCACCAAAGAAGTGAAAGTGGCCT N       59.0    PASS    PRECISE;SVTYPE=DEL;SVLEN=-39;END=1785
chr20   178495  Sniffles2.INS.37S0      N       AAAAGAATGAATATGCCCTGCCCCACCTTAACTGATGACATTCCACCACAAAGAAGTGTAAATGGCCGGTCATGCACCTTAACTGATGACATT
chr20   183223  Sniffles2.INS.38S0      N       ATCAAAAAGCCATTCAAATGGATTCACAGCTGAATTCTACCAGATGTATAAAGAACTGATACCAACTTATTGAAACTATTCCAAAATACGGAG
chr20   184725  Sniffles2.INS.3DS0      N       AAAGCATTGAGATGTTTATGTGTATGCATATCCAAAAAGCACAGCATAATCCTTTACATTGTCTATGATGCCAAGACCTTTGTTCACGTGTTT
chr20   185082  Sniffles2.INS.3BS0      N       AAGGAAGAAAACCAGGCTGGGCACAACGGCTCATGCCTCAAATCTCAATACATTGGCAAGCCAAGTAGAGGATCATTTGTTTCTCAGTTGTTC
chr20   190692  Sniffles2.DUP.2114S0    N       <DUP>   53.0    PASS    PRECISE;SVTYPE=DUP;SVLEN=25400936;END=25591628;SUPPORT=8;RNAMES=SRR17
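
To get an overview of the calls shown above, the records can be tallied by SV type. The following is a minimal sketch: the function name count_svtypes is hypothetical, and it assumes standard tab-separated VCF text on stdin with the SVTYPE key in the INFO column (column 8), as in the excerpt.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count records per SVTYPE in VCF text read from stdin.
# Assumes tab-separated VCF with INFO in column 8.
count_svtypes() {
    awk -F'\t' '
        /^#/ { next }                      # skip header lines
        {
            n = split($8, fields, ";")     # break INFO into key=value items
            for (i = 1; i <= n; i++) {
                if (fields[i] ~ /^SVTYPE=/) {
                    sub(/^SVTYPE=/, "", fields[i])
                    counts[fields[i]]++
                }
            }
        }
        END { for (t in counts) print t "\t" counts[t] }
    ' | sort
}
```

Because the SV calls are written as a compressed file, one way to run it is: gzip -dc results/SRR17138639.wf_sv.vcf.gz | count_svtypes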