2024 eResearch - Session 6: Small RNAs: A regulatory network of a broad range of biological processes
Aims
Learn to process small RNA-seq data to identify differentially expressed genes (DEGs) using the nf-core/smrnaseq pipeline (version 2.3.1)
Run the Nextflow nf-core/smrnaseq pipeline on the HPC cluster. Exercises include:
Human - Huntington Disease:
Download public data from the European Nucleotide Archive (ENA) database
Run the full nf-core/smrnaseq pipeline using the human reference genome (GRCh38) provided by nf-core (iGenomes) and the miRBase reference database
Run the full nf-core/smrnaseq pipeline using the human reference genome (GRCh38) provided by nf-core (iGenomes) and the MirGeneDB reference database
Elucidating the role of the Drosha Proline-Rich Disordered (PRD) domain:
We will use a basic-science project to assess how mutants of the PRD domain influence the expression of miRNAs.
Today’s overview
Introduction to nf-core/smrnaseq
Preparing working directory
Fetch public RNA-seq data
Prefrontal cortex - Huntington Disease vs. controls - small RNA-seq pipeline (miRBase)
Prefrontal cortex - Huntington Disease vs. controls - small RNA-seq pipeline (MirGeneDB)
Drosha controls vs. Proline-Rich Disordered (PRD) domain mutant
Before we start using the HPC, let’s start an interactive session:
qsub -I -S /bin/bash -l walltime=10:00:00 -l select=1:ncpus=2:mem=4gb
where:
‘walltime’ is the amount of time requested for the interactive session (here, 10 hours)
‘select’ is the number of resource chunks requested (here, 1)
‘ncpus’ is the number of CPUs assigned to the interactive session
‘mem’ is the amount of memory assigned to the interactive session
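If you need different resources, one option is to keep the three values in shell variables and assemble the request from them. The snippet below is a minimal sketch: the variable names (WALLTIME, NCPUS, MEM, QSUB_CMD) are illustrative and not part of PBS, and the values are the ones used in the command above. It only prints the command so you can check it before submitting.

```shell
#!/bin/bash
# Illustrative variable names; values copied from the qsub command above.
WALLTIME="10:00:00"   # hh:mm:ss requested for the session
NCPUS=2               # CPUs in the single chunk (select=1)
MEM="4gb"             # memory for the chunk

# Assemble the same interactive request from the variables.
QSUB_CMD="qsub -I -S /bin/bash -l walltime=${WALLTIME} -l select=1:ncpus=${NCPUS}:mem=${MEM}"

# Print the command for inspection; nothing is submitted here.
echo "${QSUB_CMD}"

# To start the session on the cluster, uncomment the next line:
# ${QSUB_CMD}
```

Requesting only what you need usually shortens the time the job waits in the queue; the right values depend on your cluster's limits.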
nf-core/smrnaseq pipeline reference: https://nf-co.re/smrnaseq/2.3.1