Overview of today’s session:
In Session 2 we ran a basic, generic RNA-seq pipeline without the additional parameters that help remove sequence biases and so give a more precise estimate of gene expression (feature counts). In this session, we will do the following tasks:
inspect the results from Session 2
run an advanced RNA-seq pipeline to measure the expression of genes
(optional) run statistical analysis to identify differentially expressed genes
Task 1: Evaluation of RNA-seq results using a basic (generic) nextflow pipeline
The nextflow/RNA-seq pipeline automatically generates two output folders:
results - contains the main outputs generated by each of the pipeline steps. Most users only need to look at the files contained in this folder.
work - contains all the files generated while the pipeline runs, including intermediate files. Most users do not need to look in this folder unless the pipeline did not run properly.
To view the results from the completed pipeline, enter the run folder (e.g., run1_star_salmon):
# access the run folder for your samples. For example:
cd run1_star_salmon
# then access the results folder
cd results
In the results folder you will find the following sub-folders:
fastqc/ trimgalore/ multiqc/ star_salmon/ pipeline_info/
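As a quick sanity check, the sub-folders listed above can be tested for from the command line. The following is only a minimal sketch: check_results is a hypothetical helper name (it is not part of the pipeline), and the folder names are simply the five listed above, so a run with different settings may produce a different set.

```shell
# check_results is a hypothetical helper, not part of the pipeline.
# It reports which of the expected sub-folders exist in a results folder.
# Argument 1: path to the results folder (defaults to the current directory).
check_results() {
    results_dir="${1:-.}"
    missing=0
    for sub in fastqc trimgalore multiqc star_salmon pipeline_info; do
        if [ -d "$results_dir/$sub" ]; then
            echo "found:   $sub/"
        else
            echo "missing: $sub/"
            missing=$((missing + 1))
        fi
    done
    # Succeed only if every expected sub-folder is present.
    [ "$missing" -eq 0 ]
}
```

From inside the results folder, it could be called as: check_results .
A "missing" line may point to a step that did not finish, in which case the work folder is the place to look next.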
FastQC - assessing the quality of input reads
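FastQC writes one HTML report per input FASTQ file, named after the input with a _fastqc.html suffix, so a paired-end sample normally has one report for Read 1 and one for Read 2. A minimal sketch for locating them is shown here. It assumes the reports sit somewhere under the fastqc/ sub-folder, and list_fastqc_reports is a hypothetical helper name; the exact layout can differ between pipeline versions.

```shell
# list_fastqc_reports is a hypothetical helper, not part of the pipeline.
# It lists every FastQC HTML report found under a folder, searching recursively.
# Argument 1: folder to search (defaults to fastqc in the current directory).
list_fastqc_reports() {
    fastqc_dir="${1:-fastqc}"
    find "$fastqc_dir" -type f -name '*_fastqc.html' | sort
}
```

From inside the results folder, it could be called as: list_fastqc_reports fastqc
Each listed file can then be opened in a web browser.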
For example, Read 1:
Read 2: