Airway smooth muscle cells

Himes BE, Jiang X, Wagner P, Hu R, Wang Q, Klanderman B, Whitaker RM, Duan Q, Lasky-Su J, Nikolos C, Jester W, Johnson M, Panettieri R Jr, Tantisira KG, Weiss ST, Lu Q. “RNA-Seq Transcriptome Profiling Identifies CRISPLD2 as a Glucocorticoid Responsive Gene that Modulates Cytokine Function in Airway Smooth Muscle Cells.” PLoS One. 2014 Jun 13;9(6):e99625. PMID: 24926665. GEO: GSE52778.

From the abstract, a brief description of the RNA-Seq experiment on airway smooth muscle (ASM) cell lines: “Using RNA-Seq, a high-throughput sequencing method, we characterized transcriptomic changes in four primary human ASM cell lines that were treated with dexamethasone - a potent synthetic glucocorticoid (1 micromolar for 18 hours).”

Data source:

NCBI - Sequence Read Archive (SRA)

https://www.ncbi.nlm.nih.gov/bioproject/PRJNA229998

Sample information

##            SampleName                                     cell   dex        Run
## SRR1039508 GSM1275862 tissue: human airway smooth muscle cells untrt SRR1039508
## SRR1039509 GSM1275863 tissue: human airway smooth muscle cells untrt SRR1039509
## SRR1039510 GSM1275864 tissue: human airway smooth muscle cells untrt SRR1039510
## SRR1039511 GSM1275865 tissue: human airway smooth muscle cells untrt SRR1039511
## SRR1039512 GSM1275866 tissue: human airway smooth muscle cells untrt SRR1039512
## SRR1039513 GSM1275867 tissue: human airway smooth muscle cells untrt SRR1039513
## SRR1039514 GSM1275868 tissue: human airway smooth muscle cells untrt SRR1039514
## SRR1039515 GSM1275869 tissue: human airway smooth muscle cells untrt SRR1039515
## SRR1039516 GSM1275870 tissue: human airway smooth muscle cells untrt SRR1039516
## SRR1039517 GSM1275871 tissue: human airway smooth muscle cells untrt SRR1039517
## SRR1039518 GSM1275872 tissue: human airway smooth muscle cells untrt SRR1039518
## SRR1039519 GSM1275873 tissue: human airway smooth muscle cells untrt SRR1039519
## SRR1039520 GSM1275874 tissue: human airway smooth muscle cells untrt SRR1039520
## SRR1039521 GSM1275875 tissue: human airway smooth muscle cells untrt SRR1039521
## SRR1039522 GSM1275876 tissue: human airway smooth muscle cells untrt SRR1039522
## SRR1039523 GSM1275877 tissue: human airway smooth muscle cells untrt SRR1039523
##            avgLength Experiment    Sample    BioSample
## SRR1039508       126  SRX384345 SRS508568 SAMN02422669
## SRR1039509       126  SRX384346 SRS508567 SAMN02422675
## SRR1039510       126  SRX384347 SRS508570 SAMN02422668
## SRR1039511       126  SRX384348 SRS508569 SAMN02422667
## SRR1039512       126  SRX384349 SRS508571 SAMN02422678
## SRR1039513        87  SRX384350 SRS508572 SAMN02422670
## SRR1039514       126  SRX384351 SRS508574 SAMN02422681
## SRR1039515       114  SRX384352 SRS508573 SAMN02422671
## SRR1039516       120  SRX384353 SRS508575 SAMN02422682
## SRR1039517       126  SRX384354 SRS508576 SAMN02422673
## SRR1039518       126  SRX384355 SRS508578 SAMN02422679
## SRR1039519       107  SRX384356 SRS508577 SAMN02422672
## SRR1039520       101  SRX384357 SRS508579 SAMN02422683
## SRR1039521        98  SRX384358 SRS508580 SAMN02422677
## SRR1039522       125  SRX384359 SRS508582 SAMN02422680
## SRR1039523       126  SRX384360 SRS508581 SAMN02422674

metadata information (metadata.txt):

Run	SampleName	cell	group
SRR1039508	GSM1275862	N61311	control
SRR1039509	GSM1275863	N61311	dex
SRR1039512	GSM1275866	N052611	control
SRR1039513	GSM1275867	N052611	dex
SRR1039516	GSM1275870	N080611	control
SRR1039517	GSM1275871	N080611	dex
SRR1039520	GSM1275874	N061011	control
SRR1039521	GSM1275875	N061011	dex

dex = dexamethasone treatment
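The control/dex pairing in metadata.txt can be queried directly from the command line. The following is a minimal sketch, assuming metadata.txt is tab-delimited with Unix line endings and that the group label sits in the fourth column, as in the listing above; the function name runs_in_group is hypothetical and not part of the original workflow.

```shell
#!/bin/bash
# Hypothetical helper (an illustration, not part of the original workflow):
# print the Run accessions whose "group" column (4th field) equals the label.
# Assumes a tab-delimited file with a single header line.
runs_in_group() {
  local meta="$1" group="$2"
  awk -F'\t' -v g="$group" 'NR > 1 && $4 == g { print $1 }' "$meta"
}

# Usage: runs_in_group metadata.txt dex
```

Calling it once with control and once with dex gives the two run lists needed later when comparing treated and untreated samples.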

Downloading data

Before downloading the data, install NCBI’s sra-tools using conda:

conda install -c bioconda sra-tools

Then prepare a list of the SRR accession numbers of interest, which is used to fetch the FASTQ data:

awk 'NR > 1 {print $1}' metadata.txt > SraAccList.txt

Check SraAccList.txt (i.e., cat SraAccList.txt):

SRR1039508
SRR1039509
SRR1039512
SRR1039513
SRR1039516
SRR1039517
SRR1039520
SRR1039521
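Before submitting the download job, it can help to confirm that the list contains only well-formed accessions (for example, that the header line was dropped and no blank lines crept in). This is a minimal sketch; the function name check_acc_list is hypothetical, and the pattern assumes every ID is "SRR" followed by digits, as in the list above.

```shell
#!/bin/bash
# Hypothetical helper (an illustration, not part of the original workflow):
# fail if any line is not "SRR" followed by digits; otherwise print the
# number of valid accessions found.
check_acc_list() {
  local list="$1"
  local bad
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
  bad=$(grep -cvE '^SRR[0-9]+$' "$list" || true)
  if [ "$bad" -ne 0 ]; then
    echo "ERROR: $bad malformed line(s) in $list" >&2
    return 1
  fi
  grep -cE '^SRR[0-9]+$' "$list" || true
}

# Usage: check_acc_list SraAccList.txt
```

For the list shown above, the reported count should equal the number of sample rows in metadata.txt.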

Once the list of wanted SRA accession IDs is ready, use a PBS Pro submission script to fetch all the sequences. Note that the data will be downloaded to the folder from which the job is submitted. Example script (fetch_SraAccList.pbs):

#!/bin/bash -l
#PBS -N sra_fetch
#PBS -l walltime=8:00:00
#PBS -l mem=8gb
#PBS -l ncpus=4
#PBS -m bae
###PBS -M email@host
#PBS -j oe

#Usage: qsub fetch_SraAccList.pbs

cd "$PBS_O_WORKDIR"

# Download each accession, then convert it to paired FASTQ files (_1 and _2)
for i in $(cat SraAccList.txt); do
  echo "$i"
  prefetch "$i"
  fastq-dump --split-files "$i"
done
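After the job finishes, each accession should have a pair of FASTQ files named with the _1.fastq and _2.fastq suffixes used in the next section. The following is a minimal sketch for spotting incomplete downloads; the function name missing_fastq is hypothetical, and it only checks that both files exist and are non-empty, not that they are complete or uncorrupted.

```shell
#!/bin/bash
# Hypothetical helper (an illustration, not part of the original workflow):
# print every accession in the list whose _1.fastq or _2.fastq file is
# missing or empty in the given directory (default: current directory).
missing_fastq() {
  local list="$1" dir="${2:-.}"
  local acc
  # "|| [ -n "$acc" ]" keeps a final line that lacks a trailing newline
  while read -r acc || [ -n "$acc" ]; do
    [ -z "$acc" ] && continue
    if [ ! -s "$dir/${acc}_1.fastq" ] || [ ! -s "$dir/${acc}_2.fastq" ]; then
      echo "$acc"
    fi
  done < "$list"
}

# Usage: missing_fastq SraAccList.txt .
```

An empty result means every accession has both read files; any accession printed can be fetched again.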

Pre-processing of public data

The public data downloaded for the airway smooth muscle project show size differences between the ‘Read 1’ and ‘Read 2’ FASTQ files. Before running the Nextflow nf-core/rnaseq pipeline, the downloaded raw data are adaptor- and quality-trimmed with Trim Galore, using a minimal quality score of 20 and a minimal read length of 50 bases, and FastQC is run on the trimmed reads:

#!/bin/bash -l
#PBS -N QC_P1-6
#PBS -l walltime=10:00:00
#PBS -l mem=8gb
#PBS -l ncpus=4
#PBS -m bae
#PBS -M email@host
#PBS -j oe


#User-defined parameters:
SAMPLEID=SRR1039513
READ1=SRR1039513_1.fastq
READ2=SRR1039513_2.fastq

#Pipeline:

cd $PBS_O_WORKDIR

#make output folder
mkdir -p trimgalore

# Remove adaptors and poor-quality bases/reads using Trim Galore. Minimal quality score of 20 (-q 20) and minimal length of 50 bases (--length 50); --fastqc runs FastQC on the trimmed reads
trim_galore --length 50 --cores 4 --paired -q 20 --fastqc -o ./trimgalore "${READ1}" "${READ2}"
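The script above handles one sample per submission. As a sketch of how the same command could be generated for every accession in SraAccList.txt, the helper below prints one trim_galore command per sample without running anything (a dry run). The function name trim_cmds is hypothetical; the flags are copied from the script above, and the input file names assume the _1.fastq/_2.fastq naming produced by fastq-dump --split-files.

```shell
#!/bin/bash
# Hypothetical helper (an illustration, not part of the original workflow):
# print, but do not execute, one trim_galore command per accession, using
# the same options as the single-sample script.
trim_cmds() {
  local list="$1" outdir="${2:-./trimgalore}"
  local acc
  # "|| [ -n "$acc" ]" keeps a final line that lacks a trailing newline
  while read -r acc || [ -n "$acc" ]; do
    [ -z "$acc" ] && continue
    echo "trim_galore --length 50 --cores 4 --paired -q 20 --fastqc -o $outdir ${acc}_1.fastq ${acc}_2.fastq"
  done < "$list"
}

# Usage: trim_cmds SraAccList.txt            # inspect the commands
#        trim_cmds SraAccList.txt | bash     # run them one after another
```

Printing the commands first makes it easy to review them before execution, or to paste them into separate PBS jobs if the samples should run in parallel.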