To access this count table:

Go to the W:\training\rnaseq\runs\run3_RNAseq\results folder, which contains the results from running the nf-core/rnaseq pipeline. The output folders from task 3 look like this:


The count table can be found in the star_salmon folder. A list of files and folders in the star_salmon folder will look like this:


a. First, create a new folder in H:\workshop\RNAseq. Call it something suitable, such as ‘DE_analysis_workshop’.


We will use the pheatmap ("pretty heatmap") R package to accomplish this.
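
The heatmap code below takes a `distance_matrix` object as input, which is not defined in this section. As a reminder of what such an object looks like, here is a minimal sketch that assumes Euclidean distances between samples. The matrix `toy_counts`, its values and its sample names are made up for illustration and are not part of the workshop data.

```r
# Toy expression matrix (hypothetical values): rows are genes, columns are samples
set.seed(1)
toy_counts <- matrix(rnorm(40), nrow = 10,
                     dimnames = list(paste0("gene", 1:10),
                                     c("ctrl_1", "ctrl_2", "treat_1", "treat_2")))

# dist() measures distances between rows, so transpose to put samples in rows
sample_dist <- dist(t(toy_counts), method = "euclidean")

# Convert to a square, symmetric matrix with sample IDs as row and column names
toy_distance_matrix <- as.matrix(sample_dist)
```

Each cell holds the distance between one pair of samples, so the diagonal is zero and similar samples have small values.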

Code Block
#### 5b. Samples heatmap and dendrogram ####

# This section plots a heatmap and dendrogram of pairwise relationships between samples. In this way you can see if samples cluster by treatment group.

# See here:

# Define annotation column
annot_columns <- data.frame(meta$group[meta$group %in% plotgroups])
# Make the row names the sample IDs
row.names(annot_columns) <- meta$sample_ID[meta$group %in% plotgroups]
colnames(annot_columns) <- "Treatment groups"
# Need to factorise it
annot_columns[[1]] <- factor(annot_columns[[1]])

# Generate dendrogram and heatmap
pheatmap(distance_matrix,
         color = colorRampPalette(c("white", "#9999FF", "#990000"))(50),
         cluster_rows = TRUE, show_rownames = TRUE,
         treeheight_row = 0, treeheight_col = 70,
         fontsize_col = 12,
         annotation_names_col = FALSE, annotation_col = annot_columns,
         filename = paste0("./figures/Pairwise_sample_heatmap_", paste(plotgroups, collapse = "_Vs_"), ".tiff"))

# Notes about heatmap colours.
# You can change the colours used in the heatmap itself by changing the colour names (color=colorRampPalette....)
# If you want to change the annotation colours, see here:
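
One way to set the annotation colours is the `annotation_colors` argument of `pheatmap()`, which takes a named list. The following is a minimal sketch that assumes two hypothetical group labels ("control" and "treated") and four made-up sample IDs; substitute your own group names and colours.

```r
# Hypothetical annotation data frame, built the same way as annot_columns
toy_annot <- data.frame(factor(c("control", "control", "treated", "treated")))
row.names(toy_annot) <- c("S1", "S2", "S3", "S4")
colnames(toy_annot) <- "Treatment groups"

# The list name must match the annotation column name, and the names of the
# colour vector must match the factor levels
toy_annot_colours <- list(
  "Treatment groups" = c(control = "#999999", treated = "#E69F00")
)

# Pass the list to pheatmap alongside annotation_col, for example:
# pheatmap(distance_matrix, annotation_col = toy_annot,
#          annotation_colors = toy_annot_colours)
```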


This section plots a heatmap and dendrogram of DE gene expression per sample. Expression counts are scaled and centered so that groupwise relationships can be examined.
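
To make the scaling step concrete, here is a small sketch on made-up numbers (the matrix `toy` is hypothetical). `scale()` works column by column, so the matrix is transposed before scaling and transposed back afterwards. This leaves every gene (row) with a mean of 0 and a standard deviation of 1, regardless of how highly it is expressed.

```r
# Two hypothetical genes with very different expression levels
toy <- matrix(c(10, 12, 30, 32,
                1000, 1100, 500, 520),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("geneA", "geneB"), c("c1", "c2", "t1", "t2")))

# Transpose so genes are columns, scale each gene, then transpose back
toy_scaled <- t(scale(t(toy)))

# Check the result: each row should have mean 0 and standard deviation 1
row_means <- rowMeans(toy_scaled)
row_sds <- apply(toy_scaled, 1, sd)
```

After scaling, geneA and geneB are on the same scale, so the heatmap colours reflect differences between samples rather than absolute expression.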

Code Block
#### 6c. DE genes heatmaps and dendrograms ####

# Make the row names gene symbols.
DE_genes <- na.omit(DE_genes)
row.names(DE_genes) <- make.unique(DE_genes$SYMBOL)

# Sort by adjusted p-value (padj), most significant first
DE_genes <- DE_genes[order(DE_genes$padj), ]

# Keep only the normalised count columns (those whose names match the columns of expdata)
siggc <- DE_genes[, colnames(DE_genes) %in% colnames(expdata)]

# Scale and center each row. This is important to visualise relative differences between groups and not have row-wise colouration dominated by high or low gene expression.
xts <- scale(t(siggc))
xtst <- t(xts)

# Define annotation column
annot_columns <- data.frame(meta$group[meta$group %in% degroups])
# Make the row names the sample IDs
row.names(annot_columns) <- meta$sample_ID[meta$group %in% degroups]
colnames(annot_columns) <- "Treatment groups"
# Need to factorise it
annot_columns[[1]] <- factor(annot_columns[[1]])

# Generate dendrogram and heatmap for ALL DE genes
pheatmap(xtst,
         color = colorRampPalette(c("#D55E00", "white", "#0072B2"))(100),
         annotation_col = annot_columns, annotation_names_col = FALSE,
         fontsize_col = 12, fontsize_row = 7,
         labels_row = row.names(siggc), show_rownames = FALSE,
         filename = paste0("./figures/All_DEG_Heatmap_", paste(plotgroups, collapse = "_Vs_"), ".tiff"))

# Generate dendrogram and heatmap for top 20 DE genes
pheatmap(xtst[1:20,],
         color = colorRampPalette(c("#D55E00", "white", "#0072B2"))(100),
         annotation_col = annot_columns, annotation_names_col = FALSE,
         fontsize_col = 14, fontsize_row = 12,
         labels_row = row.names(siggc), show_rownames = TRUE,
         filename = paste0("./figures/Top_DEG_Heatmap_", paste(plotgroups, collapse = "_Vs_"), ".tiff"))
# NOTE: you can plot more than 20 top genes by adjusting 'xtst[1:20,]'. If you wanted to plot the top 50 genes you'd change this to 'xtst[1:50,]'
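
If fewer than 20 genes are in the DE table, `xtst[1:20,]` will stop with a "subscript out of bounds" error. Below is a minimal sketch of one possible guard. The matrix `toy_xtst` is made up, and `n_top` and `top_xtst` are hypothetical variable names that do not appear in the workshop script.

```r
# Hypothetical scaled matrix with only 15 genes (fewer than the 20 requested)
set.seed(1)
toy_xtst <- matrix(rnorm(15 * 4), nrow = 15,
                   dimnames = list(paste0("gene", 1:15), paste0("S", 1:4)))

# Never ask for more rows than the matrix has
n_top <- min(20, nrow(toy_xtst))

# drop = FALSE keeps the result a matrix even if only one gene remains
top_xtst <- toy_xtst[seq_len(n_top), , drop = FALSE]
```

In the heatmap call, `xtst[1:20,]` would then be replaced by `xtst[seq_len(n_top), , drop = FALSE]`.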