Create working folder and copy data
...
Install tools using conda
Approach 1: Create a conda environment and install tools one at a time
Create a conda environment called ONTvariants_QC
...
Let’s activate the conda environment:
Code Block |
---|
conda activate ONTvariants_QC |
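To confirm that the activation worked, one option is to look at the `CONDA_DEFAULT_ENV` variable, which conda sets to the name of the active environment. The sketch below is only a convenience check; `env_is_active` is a hypothetical helper name, not part of conda.

```shell
# Hypothetical helper: succeeds when the named conda environment is active.
# Relies on CONDA_DEFAULT_ENV, which conda sets on activation.
env_is_active() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if env_is_active ONTvariants_QC; then
    echo "ONTvariants_QC is active"
else
    echo "ONTvariants_QC is not active; run: conda activate ONTvariants_QC"
fi
```

If the second message appears, activate the environment again before installing anything, otherwise the tools may end up in a different environment.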
Next, we need to install a few tools for today’s exercises. Go to https://anaconda.org, search for the following tools, and follow the instructions on how to install them:
...
Code Block |
---|
conda install bioconda::seqkit |
...
Approach 2: Create environment and install tools all at once
...
This is a slower option, but it is convenient when installing many tools.
Prepare the following environment_QC.yml file:
Code Block |
---|
name: ONTvariants_QC |
channels: |
  - conda-forge |
  - defaults |
  - bioconda |
dependencies: |
  - nanoplot=1.42.0 |
  - porechop=0.2.4 |
  - porechop_abi=0.5.0 |
  - chopper=0.8.0 |
Create a new environment:
Code Block |
---|
cd $HOME/workshop/ONTvariants/scripts |
conda env create -f environment_QC.yml |
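Whichever approach was used, it can be worth confirming that the tools are reachable once the environment is activated. The following is a minimal sketch: `check_tools` is a hypothetical helper name, and the executable names are assumed from the QC script in the Running QC section. Note that seqkit does not appear in the environment file shown above, so under Approach 2 this check may report it as missing.

```shell
# Hypothetical helper: report any tool from the argument list
# that cannot be found on the PATH. Returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "MISSING: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names assumed from the QC script on this page.
check_tools NanoPlot porechop_abi chopper seqkit \
    || echo "Install the missing tools before submitting the QC job."
```

Running this before submitting the job avoids waiting in the queue only to have the script stop at the first missing command.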
Running QC
Now that we have installed all the tools needed for the QC of Nanopore reads, let’s run the preprocessing of reads.
...
Code Block |
---|
#!/bin/bash -l |
#PBS -N run1_QC |
#PBS -l select=1:ncpus=8:mem=16gb |
#PBS -l walltime=48:00:00 |
#PBS -m abe |
cd $PBS_O_WORKDIR |
conda activate ONTvariants_QC |
############################################################### |
# Variables |
############################################################### |
FASTQ='/work/training/ONTvariants/data/SRR17138639_1.fastq.gz' |
GENOME='/work/training/ONTvariants/data/chr20.fasta' |
SAMPLEID='SRR17138639' |
############################################################### |
# STEP1: NanoPlot - overall QC report |
NanoPlot -t 8 --fastq $FASTQ --prefix ${SAMPLEID}_QC_ --plots dot --N50 --tsv_stats |
# STEP2: porechop_abi - remove adapters |
porechop_abi -abi -t 8 --input $FASTQ --discard_middle --output ${SAMPLEID}_trimmed.fastq |
# STEP3: chopper - retain reads with minimum quality Q10 and minimum length 300 bases |
chopper -q 10 -l 300 -i ${SAMPLEID}_trimmed.fastq > ${SAMPLEID}_trimmed_q10.fastq |
# STEP4: get stats of trimmed FASTQ files |
seqkit stats *.fastq > Report_trimmed_FASTQ_stats.txt |
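The script sets `FASTQ` and `SAMPLEID` by hand, so the two have to be kept in sync when switching samples. As an optional variation, the sample ID could be derived from the file name with shell parameter expansion. This sketch assumes file names of the form `<sampleID>_<number>.fastq.gz` with no underscore inside the sample ID; the `FILENAME` variable is introduced here for illustration only.

```shell
FASTQ='/work/training/ONTvariants/data/SRR17138639_1.fastq.gz'

# Strip the directory, then drop everything from the first underscore onwards.
# Assumes the sample ID itself contains no underscore.
FILENAME=$(basename "$FASTQ")
SAMPLEID=${FILENAME%%_*}

echo "Sample ID:      $SAMPLEID"
echo "Trimmed reads:  ${SAMPLEID}_trimmed.fastq"
echo "Filtered reads: ${SAMPLEID}_trimmed_q10.fastq"
```

With this in place, only the `FASTQ` line needs editing to process another sample that follows the same naming pattern.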
...
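Once the job has finished, a quick sanity check is to compare read counts before and after filtering: chopper can only remove reads, so the filtered file should never contain more reads than the trimmed one. The sketch below counts records in an uncompressed FASTQ file. It assumes standard four-line records (no wrapped sequence lines), and `count_reads` is a hypothetical helper name; the file names are the ones the script above writes for this sample.

```shell
# Hypothetical helper: count reads in an uncompressed FASTQ file.
# Assumes each record occupies exactly four lines.
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

for f in SRR17138639_trimmed.fastq SRR17138639_trimmed_q10.fastq; do
    if [ -f "$f" ]; then
        echo "$f: $(count_reads "$f") reads"
    else
        echo "$f: not found (has the job finished?)"
    fi
done
```

The seqkit report written in STEP4 of the script gives the same counts along with length statistics, so this is only a lightweight cross-check.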
The outputs are the porechop_abi-processed file (SRR17138639_trimmed.fastq) and the chopper output (SRR17138639_trimmed_q10.fastq). To visualise the QC reports, let’s connect to the HPC via the file finder (see below).
NOTE: To proceed, you need to be on QUT’s WiFi network or signed in via VPN.
To browse the working folder on the HPC, type the following in the file finder:
...
Next, let’s inspect the “SRR17138639_QC_LengthvsQualityScatterPlot_dot.png” file. Alternatively, for a high-resolution image, open “SRR17138639_QC_LengthvsQualityScatterPlot_dot.html” instead.
...
Next: ONTvariants - mapping