Background and external resources

...

You can also use SRA Explorer to view all files in a project and download all or some of them: https://www.biostars.org/p/385930/

Goal

Download public data deposited in NCBI’s Sequence Read Archive (SRA) database.

Prerequisites (if not already installed)

Install Miniconda

Download the Linux installer listed at https://docs.conda.io/en/latest/miniconda.html#linux-installers and run it:

Code Block
bash Miniconda3-latest-Linux-x86_64.sh

Install sra-tools

Once conda is installed, go to https://anaconda.org and search for sra-tools. Copy the install command shown there and run it in your HPC account:

Code Block
conda install -c bioconda sra-tools 
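Before submitting a job, it can help to confirm that the sra-tools commands are actually on your PATH. The snippet below is a minimal sketch: `check_tools` is a hypothetical helper name, not part of sra-tools, and it only tests whether each command can be found.

```bash
#!/bin/bash
# Report which of the given commands are missing from PATH.
# Returns 0 if all are found, 1 otherwise.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# prefetch and fastq-dump are the two sra-tools commands used on this page.
check_tools prefetch fastq-dump || echo "Activate the conda environment where sra-tools was installed, then retry."
```

If either command is reported as missing, the conda environment is probably not active in the current shell.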

Download SRA files

Example: PBS script (launch_fetch_SRAfiles.pbs) to fetch multiple files from the SRA database

Code Block
#!/bin/bash -l
#PBS -N SRAfiles
#PBS -l walltime=2:00:00
#PBS -l mem=4gb
#PBS -l ncpus=2
#PBS -m bae
###PBS -M email@host
#PBS -j oe

cd $PBS_O_WORKDIR

### User-defined variables: SRA accessions (space-separated)
ACCESSIONS="SRR1002659 SRR1002660 SRR1002661 SRR1002662"

### Pipeline

#Step1: Download the SRA files
prefetch ${ACCESSIONS}

#Step2: Extract FASTQ file(s) from each SRA file
fastq-dump --split-files ${ACCESSIONS}
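For longer lists, the accessions could be kept in a text file instead of being typed into the script. The following is a sketch under stated assumptions: `accessions.txt`, `LIST`, `DRY_RUN` and `read_accessions` are hypothetical names introduced here for illustration, and the check only accepts run accessions of the form SRR, ERR or DRR followed by digits. By default it prints the commands instead of running them.

```bash
#!/bin/bash
# Print the valid run accessions found in a text file, one per line.
# Blank lines and lines starting with '#' are skipped; anything that does not
# look like a run accession (SRR/ERR/DRR followed by digits) is reported on stderr.
read_accessions() {
    local file=$1
    local line
    while IFS= read -r line || [ -n "$line" ]; do
        line=${line//[[:space:]]/}          # strip stray whitespace and carriage returns
        case "$line" in
            ''|'#'*) continue ;;
        esac
        if [[ "$line" =~ ^[SED]RR[0-9]+$ ]]; then
            echo "$line"
        else
            echo "skipping invalid accession: $line" >&2
        fi
    done < "$file"
}

# Hypothetical file name; DRY_RUN=1 (the default here) only prints the commands.
LIST=${LIST:-accessions.txt}
DRY_RUN=${DRY_RUN:-1}

if [ -f "$LIST" ]; then
    for ACC in $(read_accessions "$LIST"); do
        if [ "$DRY_RUN" = "1" ]; then
            echo "prefetch $ACC && fastq-dump --split-files $ACC"
        else
            prefetch "$ACC" && fastq-dump --split-files "$ACC"
        fi
    done
fi
```

Setting `DRY_RUN=0` in the environment would run the real commands, one accession at a time, so a failed download does not stop the extraction of the others.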

Submit the PBS script to the HPC cluster

Code Block
qsub launch_fetch_SRAfiles.pbs

Monitor job progress

Code Block
qjobs
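Once the job has finished, a quick check that every accession produced a FASTQ file can catch partial downloads. This is a minimal sketch, assuming fastq-dump wrote its output to the directory the job was submitted from and used its usual `_1.fastq` suffix for the first read file; `check_fastq_outputs` is a hypothetical helper name.

```bash
#!/bin/bash
# For each accession, report whether a non-empty first-read file
# (<accession>_1.fastq) exists in the given directory.
# Returns 1 if any accession has no output.
check_fastq_outputs() {
    local dir=$1
    shift
    local status=0
    local acc
    for acc in "$@"; do
        if [ -s "$dir/${acc}_1.fastq" ]; then
            echo "ok: $acc"
        else
            echo "no FASTQ found: $acc"
            status=1
        fi
    done
    return "$status"
}

# Same accessions as in the PBS script; run from the job's working directory.
check_fastq_outputs . SRR1002659 SRR1002660 SRR1002661 SRR1002662 || echo "Some accessions are incomplete; check the job's output log."
```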