Session 3: Advanced RNA-seq pipeline

Overview of today’s session:

In Session 2 we ran a basic, generic RNA-seq pipeline without specifying the additional parameters that help remove sequence biases and so give a more precise estimate of gene expression (feature counts). In this session, we will carry out the following tasks:

inspect the results from Session 2
run an advanced RNA-seq pipeline to measure the expression of genes
(optional) run statistical analysis to identify differentially expressed genes

Task 1: Evaluating the RNA-seq results from the basic (generic) Nextflow pipeline

The nextflow/RNA-seq pipeline automatically generates two output folders:

results - contains the main outputs generated by each of the pipeline steps. Most users only need to look at the files contained in this folder.
work - contains all the files generated while the pipeline runs, including intermediate files. Most users do not need to look at the contents of this folder unless the pipeline did not run properly.
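
If a run did fail, the work folder is where to look. The following is a minimal sketch, not part of the course material: it assumes you are in the run folder, that the work folder sits next to results, and that each Nextflow task directory holds a hidden .exitcode file recording the exit status of that task. The function name list_failed_tasks is hypothetical.

```shell
# Hypothetical helper: list task directories under work/ whose recorded
# exit code is non-zero. Assumes Nextflow wrote a .exitcode file per task.
list_failed_tasks() {
    work_dir="${1:-work}"
    if [ ! -d "$work_dir" ]; then
        echo "no work folder at $work_dir"
        return 0
    fi
    find "$work_dir" -type f -name '.exitcode' | while IFS= read -r code_file; do
        # Strip whitespace so a trailing newline does not affect the comparison
        code=$(tr -d '[:space:]' < "$code_file")
        if [ "$code" != "0" ]; then
            echo "exit ${code}: $(dirname "$code_file")"
        fi
    done
}

# Run from the run folder (assumption: work/ is located here)
list_failed_tasks work
```

Each directory reported this way should also contain the hidden files .command.log and .command.err, which are usually the quickest way to see why a task stopped.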

To view the results from the completed pipeline, enter the run folder (e.g., run1_star_salmon):

#access the run folder for your samples. For example:
cd run1_star_salmon

#then access the results folder
cd results

In the results folder you will find the following sub-folders:

fastqc/
trimgalore/
multiqc/
star_salmon/
pipeline_info/
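
To confirm that all five sub-folders were actually produced, a quick check along these lines may help. It is only a sketch: it assumes you are already inside the results folder, and the function name check_results_folders is hypothetical.

```shell
# Hypothetical helper: report which of the expected sub-folders exist.
check_results_folders() {
    results_dir="${1:-.}"
    for sub in fastqc trimgalore multiqc star_salmon pipeline_info; do
        if [ -d "${results_dir}/${sub}" ]; then
            echo "found: ${sub}/"
        else
            echo "missing: ${sub}/"
        fi
    done
}

# Run from inside the results folder
check_results_folders .
```

A sub-folder reported as missing usually means the corresponding pipeline step did not finish, in which case the work folder is the place to investigate.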

FastQC - assessing the quality of input reads
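
FastQC writes one HTML report per input FASTQ file, named after the input with the suffix _fastqc.html. As a rough sketch, the reports can be listed as shown here. It assumes you are inside the results folder and that the reports sit somewhere under fastqc/ (the exact layout may differ between pipeline versions, which is why the search is recursive). The function name list_fastqc_reports is hypothetical.

```shell
# Hypothetical helper: list the FastQC HTML reports under a folder.
list_fastqc_reports() {
    fastqc_dir="${1:-fastqc}"
    if [ ! -d "$fastqc_dir" ]; then
        echo "no FastQC folder at $fastqc_dir"
        return 0
    fi
    # Search recursively, since reports may be nested in sub-folders
    find "$fastqc_dir" -type f -name '*_fastqc.html' | sort
}

# Run from inside the results folder
list_fastqc_reports fastqc
```

Each report can then be copied to your own computer and opened in a web browser.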

For example, Read 1:

Read 2: