Content Comparison

Background and external resources

“Sequence Read Archive (SRA) data, available through multiple cloud providers and NCBI servers, is the largest publicly available repository of high throughput sequencing data.”

...

To install the SRA Toolkit, follow these instructions: https://github.com/ncbi/sra-tools/wiki/02.-Installing-SRA-Toolkit

To download sequence data as FASTQ files using project or biosample accession numbers: https://github.com/ncbi/sra-tools/wiki/08.-prefetch-and-fasterq-dump

You can also use SRA Explorer to view all files in a project and download all or some of them: https://www.biostars.org/p/385930/

Goal

Download public data deposited in NCBI’s Sequence Read Archive (SRA) database.

Pre-requisites

conda3 or miniconda3 installed (https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html)
Basic Unix command-line knowledge (examples: https://researchcomputing.princeton.edu/education/external-online-resources/linux ; https://swcarpentry.github.io/shell-novice/)
Familiarity with at least one Unix text editor (for example, Vi/Vim or Nano):
- Vim (https://bioinformatics.uconn.edu/vim-guide/ ; https://missing.csail.mit.edu/2020/editors/)
- Nano (https://engineering.purdue.edu/ECN/Support/KB/Docs/BasictutorialforNanou ; https://www.howtogeek.com/howto/42980/the-beginners-guide-to-nano-the-linux-command-line-text-editor/)

Installing miniconda

Download the Linux installer from https://docs.conda.io/en/latest/miniconda.html#linux-installers to your HPC account, then run it and follow the prompts:

Code Block
bash Miniconda3-latest-Linux-x86_64.sh

Install sra tools

Once conda is installed, go to https://anaconda.org and search for sra-tools. Copy the install command shown there and run it in your HPC account:

Code Block
conda install -c bioconda sra-tools
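
Before moving on, it can help to confirm that the tools are visible in the current shell. The following is a minimal sketch, not part of the original instructions: `check_tools` is a hypothetical helper name, and it relies only on the shell built-in `command -v`.

```shell
#!/bin/bash
# Report which of the given commands can be found on PATH.
# Returns 0 if all are found, 1 if at least one is missing.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The two tools used later in this guide; one status line is printed per tool.
check_tools prefetch fastq-dump || echo "Install or activate sra-tools before continuing."
```

If either tool is reported as missing, re-run the conda install command above or check that the conda environment holding sra-tools is active.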

Download SRA files

Submit a PBS script to fetch SRA files

Code Block

#!/bin/bash -l
#PBS -N SRAfiles
#PBS -l walltime=2:00:00
#PBS -l mem=4gb
#PBS -l ncpus=2
#PBS -m bae
###PBS -M email@host
#PBS -j oe

cd $PBS_O_WORKDIR

### User defined variables
SRAID=SRR1002659

### Pipeline

#Step1: Download SRA file
prefetch ${SRAID}

#Step2: Extract FASTQ file(s) from SRA file
fastq-dump --split-files ${SRAID}
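
To fetch several runs in one job, the two pipeline steps can be wrapped in a loop. This is a minimal sketch under assumptions: `fetch_accessions`, the `DRY_RUN` variable, and the one-accession-per-line list file are illustrative names that do not appear in the original script. By default it only prints the commands so the list can be reviewed; setting `DRY_RUN=0` runs them.

```shell
#!/bin/bash
# Process every accession listed in a text file (one accession per line).
# Blank lines and lines starting with '#' are skipped.
# DRY_RUN=1 (the default here) prints the commands; DRY_RUN=0 executes them.
fetch_accessions() {
    local list_file="$1"
    local dry_run="${DRY_RUN:-1}"
    local sra_id
    # The '|| [ -n ... ]' part also handles a last line without a trailing newline
    while IFS= read -r sra_id || [ -n "$sra_id" ]; do
        case "$sra_id" in
            ''|'#'*) continue ;;
        esac
        if [ "$dry_run" = "1" ]; then
            echo "prefetch $sra_id"
            echo "fastq-dump --split-files $sra_id"
        else
            # Same two steps as the single-accession script above
            prefetch "$sra_id" && fastq-dump --split-files "$sra_id"
        fi
    done < "$list_file"
}
```

Inside the PBS script, a call such as `DRY_RUN=0 fetch_accessions accessions.txt` (where `accessions.txt` is a hypothetical file name) could take the place of Steps 1 and 2. The walltime and memory requests may need to be raised when many runs are fetched in one job.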

Version	Old Version 1	New Version 2
Changes made by	Paul Whatmore (Deactivated)	Roberto Barrero Gumiel
Saved on	Sept 09, 2021	Oct 25, 2021
