Let’s create a new interactive session on the HPC:
Code Block |
---|
qsub -I -S /bin/bash -l walltime=10:00:00 -l select=1:ncpus=2:mem=4gb |
We will now create a new conda environment to install the tools needed for mapping. We need a separate environment because the QC and mapping tools have incompatible dependencies.
...
Next, we need to install a few tools for today’s exercises. Go to https://anaconda.org and search for the following tools and the instructions on how to install them:
Code Block |
---|
samtools, sniffles, minimap2 |
For example, search for samtools:
...
Click on the link to the tool of interest and you will be presented with the conda command line to run in your system to install the tool:
...
Copy and paste the first command shown above into the terminal session where you have activated the ‘ONTvariants_mapping’ conda environment. Install samtools (version 1.20):
Code Block |
---|
conda install bioconda::samtools |
Next, let’s install minimap2 (version 2.28):
Code Block |
---|
conda install bioconda::minimap2 |
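After installing, it can help to confirm that each tool is reachable from the active environment. The snippet below is a minimal sketch: `check_tool` is a hypothetical helper name (it is not part of conda), and it only checks that a command is on the `PATH` before asking it for its version.

```shell
#!/bin/bash
# check_tool is a hypothetical helper (not part of conda): it reports whether
# a command is available on the PATH of the currently active environment.
check_tool() {
    local tool="$1"
    if command -v "$tool" > /dev/null 2>&1; then
        echo "$tool: found at $(command -v "$tool")"
        return 0
    else
        echo "$tool: NOT found - is the ONTvariants_mapping environment active?"
        return 1
    fi
}

# Check the two tools installed above and print their versions if present.
for tool in samtools minimap2; do
    if check_tool "$tool"; then
        "$tool" --version | head -n 1
    fi
done
```

If a tool is reported as not found, activating the ‘ONTvariants_mapping’ environment and repeating the install command is usually the first thing to try.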
...
Alternative approach to create a conda environment and install the tools (we are not doing this; it is just for your information): installing all tools at once (slower option!)
Prepare the following environment.yml file:
Code Block |
---|
name: ONTvariants_mapping
channels:
  - conda-forge
  - defaults
  - bioconda
dependencies:
  - samtools=1.20
  - minimap2=2.28 |
Create a new environment:
...
As you have seen, we can search at anaconda.org for other tools that we might be interested to use.
Remember, if you run into compatibility issues or errors, you can always create a new conda environment for the tool of interest. NOTE: you can switch between conda environments as follows:
Code Block |
---|
conda activate ONTvariants_QC
...
...
...
conda deactivate
conda activate ONTvariants_mapping
...
...
... |
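When switching back and forth, it is easy to lose track of which environment is active. The sketch below relies on the `CONDA_DEFAULT_ENV` variable, which conda sets to the name of the active environment; `current_env` is a hypothetical helper name used only for illustration.

```shell
#!/bin/bash
# current_env is a hypothetical helper: conda sets CONDA_DEFAULT_ENV to the
# name of the active environment, and leaves it unset when none is active.
current_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

echo "Active conda environment: $(current_env)"

# To see every environment available on your account, you can also run:
#   conda env list
```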
...
Code Block |
---|
#!/bin/bash -l
#PBS -N run2_mapping
#PBS -l select=1:ncpus=8:mem=16gb
#PBS -l walltime=72:00:00
#PBS -m abe
# Run from the folder the job was submitted from
cd $PBS_O_WORKDIR
# Activate the environment that contains minimap2 and samtools
conda activate ONTvariants_mapping
###############################################################
# Variables
###############################################################
FASTQ='/work/training/ONTvariants/data/runs/run1_QC/SRR17138639_1_porechop_abi_chopper_q10_300b.fastq'
GENOME='/work/training/ONTvariants/data/chr20.fasta'
SAMPLEID='SRR17138639'
GENOMEID='chr20'
###############################################################
#STEP1: Mapping preprocessed reads with minimap2 onto reference genome
# Optional variant that keeps only mapped reads:
#minimap2 -t 8 -a $GENOME $FASTQ | awk '$3!="*"' > ${SAMPLEID}_mapped_${GENOMEID}.sam
minimap2 -t 8 -a $GENOME $FASTQ > ${SAMPLEID}_mapped_${GENOMEID}.sam
#STEP2: samtools - SAM to sorted BAM
samtools view -bS ${SAMPLEID}_mapped_${GENOMEID}.sam > ${SAMPLEID}_mapped_${GENOMEID}.bam
samtools sort -o ${SAMPLEID}_mapped_${GENOMEID}.sorted.bam ${SAMPLEID}_mapped_${GENOMEID}.bam
samtools index ${SAMPLEID}_mapped_${GENOMEID}.sorted.bam |
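As a variation on the PBS script above, the intermediate SAM and unsorted BAM files can be skipped by piping the minimap2 output straight into `samtools sort`. This is a sketch of an alternative, not the workflow used in this session. It reuses the variable names from the script, introduces `SORTED_BAM` as an illustrative name, and only runs the pipeline when both tools and both input files are available.

```shell
#!/bin/bash
# Same variables as in the PBS script above.
FASTQ='/work/training/ONTvariants/data/runs/run1_QC/SRR17138639_1_porechop_abi_chopper_q10_300b.fastq'
GENOME='/work/training/ONTvariants/data/chr20.fasta'
SAMPLEID='SRR17138639'
GENOMEID='chr20'
# SORTED_BAM is an illustrative variable name, not part of the original script.
SORTED_BAM="${SAMPLEID}_mapped_${GENOMEID}.sorted.bam"

# Only run when the tools and input files exist, so the sketch fails safely.
if command -v minimap2 > /dev/null 2>&1 && command -v samtools > /dev/null 2>&1 \
   && [ -r "$FASTQ" ] && [ -r "$GENOME" ]; then
    # minimap2 writes SAM to stdout; samtools sort reads it from stdin ("-")
    # and writes the coordinate-sorted BAM directly.
    minimap2 -t 8 -a "$GENOME" "$FASTQ" | samtools sort -o "$SORTED_BAM" -
    samtools index "$SORTED_BAM"
else
    echo "Skipping: minimap2, samtools or the input files are not available."
fi
```

The trade-off is that no SAM file is kept for inspection, which saves disk space but removes a human-readable copy of the alignments.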
...
Monitor the progress of the job:
Code Block |
---|
qjobs |
Once the run has completed, the following files will be in the “run2_mapping” folder:
Code Block |
---|
.
├── launch_ONTvariants_mapping.pbs
├── SRR17138639_mapped_chr20.bam
├── SRR17138639_mapped_chr20.sam
├── SRR17138639_mapped_chr20.sorted.bam
└── SRR17138639_mapped_chr20.sorted.bam.bai |
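Before moving on to IGV, a quick sanity check of the outputs can be useful. The sketch below assumes it is run from inside the “run2_mapping” folder. `check_outputs` is a hypothetical helper that only tests that each expected file exists and is non-empty; `samtools flagstat` is then used to summarise the alignment, including how many reads mapped.

```shell
#!/bin/bash
# check_outputs is a hypothetical helper: returns 0 only if every file given
# as an argument exists and is non-empty.
check_outputs() {
    local status=0
    local f
    for f in "$@"; do
        if [ -s "$f" ]; then
            echo "OK       $f"
        else
            echo "MISSING  $f"
            status=1
        fi
    done
    return "$status"
}

PREFIX='SRR17138639_mapped_chr20'
if check_outputs "${PREFIX}.sorted.bam" "${PREFIX}.sorted.bam.bai" \
   && command -v samtools > /dev/null 2>&1; then
    # flagstat prints read counts, including the number of mapped reads.
    samtools flagstat "${PREFIX}.sorted.bam"
fi
```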
To visualise the mapped reads (sorted BAM file) using IGV (see below), first we need to connect to the HPC using FileFinder:
NOTE: To proceed, you need to be on QUT’s WiFi network or signed in via VPN.
To browse the working folder on the HPC, type the following path into the file finder:
Windows PC
Code Block |
---|
\\hpc-fs\work\training\ONTvariants\runs\run2_mapping |
Mac
Code Block |
---|
smb://hpc-fs/work/training/ONTvariants/runs/run2_mapping |
Visualisation of the alignment using IGV
...
Zoom in and visualise the alignments:
...