...

...

Installing BLAST

Once miniconda3 has been installed, you may need to restart your terminal session the first time to make the ‘conda’ command available.

...

Run the above command to install BLAST. Conda will check whether the tool and its dependencies are already available and will automatically install everything needed to run, in this case, BLAST.

Note: Follow a similar process as above to install other tools.

Sample Data

Demo sample data for comparing DNA sequences generated by an RNA-seq approach against a reference Miscanthus sinensis mosaic virus (MsiMV) genome can be found at:

Code Block
/work/eresearch_bio/sandpit/blast
Code Block
-rw-rw----  1 barrero  36K Jan 20 12:43 query_sample.fa
-rw-rw----  1 barrero 9.7K Jan 20 12:44 MsiMV_genome.fasta
-rw-rw----+ 1 barrero  419 Jan 20 12:50 launch_blastN.pbs

We want to compare the similarity (from 0 to 100%) of the sequences (also called ‘reads’) inside the query_sample.fa file against the reference MsiMV_genome.fasta sequence. Note: RNA/DNA (and protein) sequences can be stored in ‘fasta format’. Each record starts with a header row, marked by a “>” symbol followed by a sequence identifier. The rows after the header contain the DNA/RNA (or protein) sequence itself, up to the next “>” header.
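The fasta layout described above can be checked quickly from the command line. The following is a minimal sketch, assuming standard `grep` and `awk` are available; the two sequences and their identifiers (`read_1`, `read_2`) are made up for illustration and are not taken from query_sample.fa.

```shell
# Write a tiny, made-up FASTA file (illustrative sequences only)
tmp_fasta=$(mktemp)
printf '>read_1\nACGTACGTAC\n>read_2\nGGCCTTAAGG\nTTAA\n' > "$tmp_fasta"

# Each record starts with a ">" header row, so counting those rows counts the sequences
n_seqs=$(grep -c '^>' "$tmp_fasta")
echo "Number of sequences: $n_seqs"

# List the identifiers: the text after ">" up to the first whitespace
seq_ids=$(awk '/^>/ {sub(/^>/, "", $1); print $1}' "$tmp_fasta")
echo "$seq_ids"

rm -f "$tmp_fasta"
```

The same `grep -c '^>'` call can be pointed at query_sample.fa to see how many reads it contains.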

Running blast on the HPC

We use a PBS Pro submission script to submit jobs to the HPC cluster. The file ‘launch_blastN.pbs’ listed above can be used to submit the job. This file contains the following:

Code Block
#!/bin/bash -l
#PBS -N blastN
#PBS -l walltime=10:00:00
#PBS -l mem=8gb
#PBS -l ncpus=4
#PBS -q testvm
#PBS -m bae
###PBS -M email@host
#PBS -j oe

cd $PBS_O_WORKDIR

#define variables, for example the names of the fasta files to use. Note: the suffix can be .fa, .fasta or something else.
QUERY=query_sample.fa
REFERENCE=MsiMV_genome.fasta
EVALUE=1e-10

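#run blastn search; -outfmt 6 writes tab-separated tabular output
#note: BLAST+ may warn that -num_threads is ignored when -subject is used instead of -db
blastn -query "$QUERY" \
       -subject "$REFERENCE" \
       -out "blastN_${QUERY}_vs_${REFERENCE}.out" \
       -outfmt 6 \
       -evalue "$EVALUE" \
       -num_threads 4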
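
Submitting the script might look like the sketch below. It assumes the standard PBS Pro `qsub` command is on your PATH and that you are in the directory containing launch_blastN.pbs; the availability check and the messages are additions for illustration only. It also prints the output file name that the `-out` pattern in the script produces.

```shell
# Same variable values as in launch_blastN.pbs
QUERY=query_sample.fa
REFERENCE=MsiMV_genome.fasta

# File name the script writes, given its -out pattern
OUT="blastN_${QUERY}_vs_${REFERENCE}.out"
echo "Expected output file: $OUT"

# Submit only if the PBS client and the script are both present here
if command -v qsub >/dev/null 2>&1 && [ -f launch_blastN.pbs ]; then
    qsub launch_blastN.pbs
else
    echo "qsub or launch_blastN.pbs not found; run this from the job directory on the HPC"
fi
```

Because the file extensions are kept in the name, the result file is called blastN_query_sample.fa_vs_MsiMV_genome.fasta.out.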

Checking the progress of the submitted job:

Code Block
qjobs

#alternatively use the standard PBS command:
qstat -u $USER

How to interpret the results? Check this tutorial.
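
As a rough starting point, the `-outfmt 6` output is a tab-separated table whose default twelve columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore. The sketch below filters such a table with `awk`. The two rows, the subject name `MsiMV` and the thresholds (95% identity, 50 bases) are invented for illustration and are not real results from this dataset.

```shell
# Made-up example rows in the 12-column tabular layout (not real results)
tmp_out=$(mktemp)
printf 'read_1\tMsiMV\t98.50\t100\t1\t0\t1\t100\t200\t299\t1e-45\t180\n' > "$tmp_out"
printf 'read_2\tMsiMV\t82.00\t60\t10\t1\t1\t60\t500\t559\t1e-12\t70.1\n' >> "$tmp_out"

# Keep hits with percent identity (column 3) >= 95 and alignment length (column 4) >= 50,
# then print the query ID, percent identity and alignment length
good_hits=$(awk -F'\t' '$3 >= 95 && $4 >= 50 {print $1, $3, $4}' "$tmp_out")
echo "$good_hits"

rm -f "$tmp_out"
```

To apply the same filter to the real job output, replace the temporary file with the .out file written by the job and adjust the thresholds to suit your analysis.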