For this exercise we will use the epi2me-labs/wf-human-variation pipeline. Find information on the pipeline at https://labs.epi2me.io/workflows/wf-human-variation/
...
Approximate run time: variable, depending on whether the run is targeted sequencing or whole genome sequencing, as well as on coverage and the individual analyses requested. For instance, a 90X human sample run (options: --snp --sv --mod --str --cnv --phased --sex male) takes less than 8h with recommended resources.
NOTE: in contrast to the nf-core/sarek pipeline that we used in session 2, the epi2me-labs/wf-human-variation pipeline runs in ‘local’ mode, so all of its tasks run inside the one submitted job, which therefore needs a large number of CPUs and a large amount of RAM. The nf-core/sarek pipeline instead uses ‘pbspro’ mode, where the pipeline submits individual jobs to the HPC cluster and defines the CPUs and memory for each task individually.
...
Line 1: Defines that the script is a bash script.
Lines 2-5: Begin with “#”, so bash treats them as comments and ignores them. However, these PBS lines tell the scheduler (PBS Pro) the name of the job (line 2), the number of CPUs and the amount of RAM to use (line 3), the time allowed to run the script (line 4) and to report any errors (line 5).
Line 7: Loads Java, which is required to run Nextflow pipelines.
Line 8: Assigns up to 4GB of memory for the initial Nextflow script to use.
Line 9: Tells the job to run in the current directory.
Lines 11-22: Parameters to run the epi2me-labs/wf-human-variation pipeline (see above for details on each parameter).
...
Code Block |
---|
qjobs |
Once the pipeline has completed, you will see the following set of output files in the ‘results’ folder:
...
Let’s inspect the HTML reports wf-human-alignment-report.html, wf-human-snp-report.html and wf-human-sv-report.html.
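Before opening the reports, it can help to confirm that all three files are actually present. The sketch below is a minimal shell helper and is not part of the pipeline. The function name `check_reports` is hypothetical, and the leading wildcard is an assumption that the report names may carry a prefix (for example a sample name) in your run.

```shell
# check_reports DIR: print "found" or "missing" for each expected HTML report.
# DIR defaults to "results". Returns 1 if any report is missing, 0 otherwise.
check_reports() {
    cr_dir="${1:-results}"
    cr_status=0
    for cr_name in wf-human-alignment-report.html \
                   wf-human-snp-report.html \
                   wf-human-sv-report.html; do
        cr_found=""
        # The leading * tolerates an optional file-name prefix (assumption).
        for cr_file in "$cr_dir"/*"$cr_name"; do
            if [ -f "$cr_file" ]; then
                cr_found="$cr_file"
                break
            fi
        done
        if [ -n "$cr_found" ]; then
            echo "found: $cr_found"
        else
            echo "missing: $cr_name"
            cr_status=1
        fi
    done
    return "$cr_status"
}
```

After defining the function in your shell session, run `check_reports results` from the directory that contains the ‘results’ folder. A "missing" line may simply mean that the corresponding analysis option was not requested for that run.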
NOTE: To proceed, you need to be on QUT’s WiFi network or signed in via VPN.
To browse the working folder on the HPC, type the following into the file finder:
...