Let’s create a new interactive session on the HPC:
Code Block |
qsub -I -S /bin/bash -l walltime=10:00:00 -l select=1:ncpus=2:mem=4gb |
We will now create a new conda environment to install the tools needed for mapping. We need a separate environment because the QC and mapping tools have incompatible dependencies.
Alternative approach (for your information only; we are not doing this today): create the conda environment and install all tools at once from a YAML file. This is the slower option.
Prepare the following environment.yml file:
Code Block |
name: ONTvariants_mapping
channels:
  - conda-forge
  - defaults
  - bioconda
dependencies:
  - samtools=1.20
  - minimap2=2.28
|
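As a minimal sketch of how such a file could be used, the commands below build the environment with `conda env create -f`. This assumes the YAML was saved as `environment.yml` in the current directory; `env_name_from_yaml` is a hypothetical helper written only for this illustration, not part of conda.

```shell
# Assumption: the YAML shown above was saved as environment.yml in the current directory
ENV_FILE="environment.yml"

# Hypothetical helper: read the environment name from the "name:" line of a YAML file
env_name_from_yaml() {
    awk '$1 == "name:" { print $2; exit }' "$1"
}

if [ -f "$ENV_FILE" ] && command -v conda >/dev/null 2>&1; then
    # Create the environment and install every listed tool in one step
    conda env create -f "$ENV_FILE"
    echo "Activate with: conda activate $(env_name_from_yaml "$ENV_FILE")"
else
    echo "No $ENV_FILE here, or conda is not available; nothing was created"
fi
```

The guard simply avoids an error message when the file or conda is missing; on the HPC, with the file in place, only the `conda env create` line is strictly needed.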
As you have seen, we can search anaconda.org for other tools that we might be interested in using.
Remember, if you run into compatibility issues or errors, you can always create a new conda environment for the tool of interest. NOTE: you can switch between conda environments as follows:
Code Block |
conda activate ONTvariants_QC
...
...
...
conda deactivate
conda activate ONTvariants_mapping
...
...
...
|
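When switching back and forth, it is easy to lose track of which environment is active. A small sketch for checking this follows; conda sets the `CONDA_DEFAULT_ENV` variable when an environment is activated, while `active_env` is a hypothetical helper name used only for this example.

```shell
# Hypothetical helper: print the active conda environment, or "none" if nothing is active
active_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

echo "Active environment: $(active_env)"

# If conda is available, list all environments; the active one is marked with an asterisk
if command -v conda >/dev/null 2>&1; then
    conda env list
fi
```

Running this before starting a tool is a quick way to confirm that, for example, `ONTvariants_mapping` rather than `ONTvariants_QC` is active.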
Code Block |
#!/bin/bash -l
#PBS -N run2_mapping
#PBS -l select=1:ncpus=8:mem=16gb
#PBS -l walltime=72:00:00
#PBS -m abe

# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

# Activate the mapping environment (not the QC one)
#conda activate ONTvariants_QC
conda activate ONTvariants_mapping

###############################################################
# Variables
###############################################################
FASTQ='/work/training/ONTvariants/data/runs/run1_QC/SRR17138639_1_porechop_abi_chopper_q10_300b.fastq'
GENOME='/work/training/ONTvariants/data/chr20.fasta'
SAMPLEID='SRR17138639'
GENOMEID='chr20'
###############################################################

#STEP1: Mapping preprocessed reads with minimap2 onto reference genome
# Optional variant that keeps only mapped reads (column 3 is the reference name):
#minimap2 -t 8 -a $GENOME $FASTQ | awk '$3!="*"' > ${SAMPLEID}_mapped_${GENOMEID}.sam
minimap2 -t 8 -a "$GENOME" "$FASTQ" > ${SAMPLEID}_mapped_${GENOMEID}.sam

##STEP2: samtools - SAM to sorted BAM
# Convert SAM to BAM, sort by coordinate, then index the sorted BAM
samtools view -bS ${SAMPLEID}_mapped_${GENOMEID}.sam > ${SAMPLEID}_mapped_${GENOMEID}.bam
samtools sort -o ${SAMPLEID}_mapped_${GENOMEID}.sorted.bam ${SAMPLEID}_mapped_${GENOMEID}.bam
samtools index ${SAMPLEID}_mapped_${GENOMEID}.sorted.bam
|
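Once the job has finished, it can be useful to confirm that the sorted BAM file exists and to see how many reads were mapped before opening it in IGV. The sketch below assumes it is run in the job's working directory with the mapping environment active; it rebuilds the output file name from the same `SAMPLEID` and `GENOMEID` values used in the script and uses `samtools flagstat` and `samtools idxstats` for the summary.

```shell
# Same values as in the mapping script
SAMPLEID='SRR17138639'
GENOMEID='chr20'

# File name produced by the samtools sort step
SORTED_BAM="${SAMPLEID}_mapped_${GENOMEID}.sorted.bam"

if [ -f "$SORTED_BAM" ] && command -v samtools >/dev/null 2>&1; then
    # Overall counts: total, mapped, secondary and supplementary alignments
    samtools flagstat "$SORTED_BAM"
    # Per-reference counts read from the BAM index:
    # reference name, length, mapped reads, unmapped reads
    samtools idxstats "$SORTED_BAM"
else
    echo "Expected output not found (or samtools not available): $SORTED_BAM"
fi
```

If the file is missing, the job's output and error files in the same folder are the first place to look.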
To visualise the mapped reads (sorted BAM file) using IGV (see below), first we need to connect to the HPC using FileFinder:
NOTE: To proceed, you need to be on QUT’s WiFi network or signed in via VPN.
To browse the working folder on the HPC, type in the file finder:
Zoom in and visualise the alignments: