Save a copy of the DESeq2.R script into the run3_MirGeneDB folder and edit it as below…
Exercises for you to try:
We have also analysed this dataset against a different microRNA database, MirGeneDB. MirGeneDB is a database of manually curated microRNA genes that have been validated and annotated as initially described in Fromm et al. 2015 and Fromm et al. 2020. MirGeneDB 2.1 includes more than 16,000 microRNA gene entries representing more than 1,500 miRNA families from 75 metazoan species, and was published in the 2022 NAR database issue.
The output of the MirGeneDB analysis can be found at /work/training/2024/smallRNAseq/runs/run3_MirGeneDB. Use it to practise editing the R scripts we’ve given you so that you reproduce the plots above for this analysis, in preparation for doing the same with your own data.
Precomputed results from session 6:
We ran the small RNA seq samples against the MirGeneDB database and the results can be found at:
Code Block |
---|
/work/training/2024/smallRNAseq/runs/run3_MirGeneDB/results/mirna_quant/edger_qc/mature_counts.csv |
/work/training/2024/smallRNAseq/data/human_disease/metadata_microRNA.txt |
Let’s create a “DESeq2” folder and copy the files needed for the statistical analysis:
Code Block |
---|
mkdir -p $HOME/workshop/2024-2/session6_smallRNAseq/runs/run2_human_MirGeneDB/DESeq2 |
cp $HOME/workshop/2024-2/session6_smallRNAseq/scripts/transpose_csv.py $HOME/workshop/2024-2/session6_smallRNAseq/runs/run2_human_MirGeneDB/DESeq2 |
cp $HOME/workshop/2024-2/session6_smallRNAseq/data/metadata_microRNA.txt $HOME/workshop/2024-2/session6_smallRNAseq/runs/run2_human_MirGeneDB/DESeq2 |
cp /work/training/2024/smallRNAseq/runs/run3_MirGeneDB/results/mirna_quant/edger_qc/mature_counts.csv $HOME/workshop/2024-2/session6_smallRNAseq/runs/run2_human_MirGeneDB/DESeq2 |
cd $HOME/workshop/2024-2/session6_smallRNAseq/runs/run2_human_MirGeneDB/DESeq2 |
To transpose the initial “mature_counts.csv” file, run the following:
Code Block |
---|
python transpose_csv.py --input mature_counts.csv --out mature_counts.txt |
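For readers curious what the transposition does, here is a minimal sketch of swapping the rows and columns of a CSV table. It is not the workshop's transpose_csv.py, whose internals are not shown here. The function names, the toy sample and miRNA labels, and the tab-delimited output are all assumptions made for illustration.

```python
import csv
import io


def transpose_table(rows):
    """Swap rows and columns of a rectangular table given as a list of lists."""
    return [list(column) for column in zip(*rows)]


def transpose_csv_text(csv_text, out_delimiter="\t"):
    """Read comma-separated text and return the transposed table as delimited text.

    The tab output delimiter is an assumption; the real script may differ.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    # zip() would silently truncate ragged rows, so reject them up front.
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all rows must have the same number of fields")
    out = io.StringIO()
    writer = csv.writer(out, delimiter=out_delimiter, lineterminator="\n")
    writer.writerows(transpose_table(rows))
    return out.getvalue()


# Toy input (hypothetical labels): one row per sample, one column per miRNA.
example = "sample,mirA,mirB\ns1,10,0\ns2,25,3\n"
result = transpose_csv_text(example)
print(result)
```

After transposition, each miRNA occupies one row and each sample one column, which is the orientation that count-based tools such as DESeq2 expect for a count matrix.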
Differential expression analysis using RStudio
Run analysis script in RStudio
Pre-steps: Open RStudio, create a new R script (‘File’ -> ‘New File’ -> ‘R script’), then hit the save button and save this file in the working directory you created above (H:\workshop\2024-2\session6_smallRNAseq\runs\run2_human_MirGeneDB\DESeq2). Name the R script ‘DESeq2.R’.
Step 1: LOAD PACKAGES
Step 2: IMPORT DATA
Step 3: OUTLIERS AND BATCH EFFECTS - TRANSFORM DATA: remove low-coverage transcripts below 20
Step 4: OUTLIERS AND BATCH EFFECTS - VISUALISE DATA (PCA)
Step 5: OUTLIERS AND BATCH EFFECTS - VISUALISE DATA (HEATMAP)
Step 6: DIFFERENTIAL EXPRESSION ANALYSIS
Step 7: DIFFERENTIAL EXPRESSION ANALYSIS - VISUALISATION (VOLCANO PLOT)
Step 8: DIFFERENTIAL EXPRESSION ANALYSIS - VISUALISATION (HEATMAP)
Step 9: DIFFERENTIAL EXPRESSION ANALYSIS - OUTLIER REMOVAL AND REPEATING OF CODE ABOVE
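Step 3 removes low-coverage transcripts below 20. The rule can be illustrated on its own with a minimal sketch, written in Python purely for illustration (the workshop analysis itself runs in R). It assumes the threshold applies to the total count summed across all samples, which may not match how the DESeq2.R script applies it. The function name and toy counts are hypothetical.

```python
def filter_low_counts(counts, min_total=20):
    """Keep features whose summed count across samples is at least min_total.

    counts maps a feature name to a list of per-sample raw counts.
    Applying the threshold to the row sum is an assumption.
    """
    return {
        name: values
        for name, values in counts.items()
        if sum(values) >= min_total
    }


# Toy counts (hypothetical): three miRNAs measured in three samples.
toy_counts = {
    "mirA": [10, 25, 4],  # total 39, kept
    "mirB": [0, 3, 1],    # total 4, removed
    "mirC": [7, 6, 7],    # total 20, kept (not below 20)
}
kept = filter_low_counts(toy_counts)
print(sorted(kept))
```

Filtering before the PCA and heatmap steps keeps very sparsely detected miRNAs from adding noise to the sample-level comparisons.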