Table of Contents | ||
---|---|---|
|
Overview
This section examines the community structure - i.e. the proportion, variance and abundance of taxonomic groups found in each sample and treatment group.
Amplicon sequence variants
Taxonomic groups in this study are based on Amplicon sequence variants (ASV), inferred using the DADA2 software package by matching the sample sequences to the Silva ribosomal RNA sequence database.
DADA2 infers sample sequences exactly and resolves differences of as little as 1 nucleotide.
SILVA provides comprehensive, quality checked and regularly updated datasets of aligned small (16S/18S, SSU) and large subunit (23S/28S, LSU) ribosomal RNA (rRNA) sequences for all three domains of life (Bacteria, Archaea and Eukarya).
6a. Results summary
You can see a summary of your data by typing the name of your ampvis2 object:
Code Block |
---|
ampvisdata |
This will provide you a summary of sequence reads per sample (minimum, maximum, average, etc) and the number and percentage of ASVs resolved to each taxonomic level.
6b. Taxonomic abundance bar plots
This section generates stacked bar plots showing the proportion of taxa per sample, for each taxonomic level.
First, you can output the raw ASV abundance data as a csv file. You can add this as supplemental data to a manuscript or use it in other plotting or statistical programs:
Code Block |
---|
amp_export_otutable(ampvisdata, filename = "ASVtable", normalise = T, sep = ",") |
Now create the bar plots.
Choose the taxonomic level you want to plot (Choose from Phylum, Class, Order, Family, Genus, Species)
Code Block |
---|
taxgroup <- "Class" |
Generate the data for plotting. > 1% relative abundance only
Code Block |
---|
# Run amp_boxplot to generate aggregated data for the chosen taxonomic level, but output as a data object rather than plotting
x <- amp_boxplot(ampvisdata, tax_aggregate = taxgroup, tax_empty = "remove")
# Filter out any taxa with > 3% relative abundance
# First calculate mean on aggregated taxa
x2 <- aggregate(x$data$Abundance, list(x$data$Display), mean)
# Select taxa only with abundance > 1%
x3 <- x2[x2$x > 1,]
# Pull out just those taxa in x$data
x4 <- x$data[x$data$Display %in% x3$Group.1, ]
# Remove duplicate rows (means already calculated, duplicates only cause bias)
x5 <- x4[!duplicated(x4), ] |
Generate a stacked bar plot.
As with previous plots, you can modify colours using scale_fill_manual()
.
You can also change axis labels, text size and angle, default theme, etc.
Feel free to add additional modifications to this or any other plot in this Notebook. Here is a good guide for doing this: http://r-statistics.co/Complete-Ggplot2-Tutorial-Part2-Customizing-Theme-With-R-Code.html
Code Block |
---|
p <- ggplot(x5, aes(x = Sample, y = Abundance, fill = Display))
p <- p + geom_bar(position="fill", stat="identity") +
labs(y = "Abundance (%)") +
theme_classic() +
scale_y_continuous(label = label_percent()) +
scale_fill_manual(values = c("red", "blue", "green")) +
theme(text = element_text(size = 18), axis.text.x = element_text(angle = 90, size=12), axis.text.y = element_text(size=14)) +
labs(fill = taxgroup)
p |
To plot a different taxonomic level, You can change taxgroup <- "...."
above to another taxonomic level, then re-run the code from that point.
Save your plot as a 300dpi (i.e. publication quality) tiff or pdf file
Code Block |
---|
tiff_exp <- paste0("community_tax_abund_bar_plot_", taxgroup, "_samples_.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0("community_tax_abund_bar_plot_", taxgroup, "_samples_.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm") |
6c. Taxonomic abundance box and whisker plots
This section contains box and whisker plots for each taxonomic level (phylum … species).
As with the bar plots in the previous section, these B&W plots show the relative abundance (in %) of taxa. But whereas heatmaps show the mean %, box and whisker plots show the interquartile range between samples (max, min, median, quartiles), thus providing additional statistical information.
Choose the taxonomic level you want to plot (Choose from Phylum, Class, Order, Family, Genus, Species)
Code Block |
---|
taxgroup <- "Order" |
Generate the plot (and modify the attributes as you like)
Note the tax_show = 10
attribute. This says to show the top 10 taxa. Change this as desired.
Code Block |
---|
p <- amp_boxplot(ampvisdata, tax_aggregate = taxgroup, tax_empty = "remove", tax_show = 10) +
theme_bw() +
theme(text = element_text(size = 18), axis.text.x = element_text(size=16), axis.text.y = element_text(size=14)) +
labs(x = taxgroup)
p |
Export the plots as tiff and pdf:
Code Block |
---|
tiff_exp <- paste0("community_tax_abund_BW_plot_", taxgroup, "_samples_.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0("community_tax_abund_BW_plot_", taxgroup, "_samples_.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm") |
6d. Choosing a variable to analyse
In the Alpha Diversity section, you already imported your nfcore/ampliseq results and converted them to an ampvis2 object. To have another quick look at your ampvis object:
Code Block |
---|
ampvisdata |
As in the Alpha Diversity section, you need to choose a variable to work with.
You can view your variables as column names in your samples_table:
Code Block |
---|
colnames(samples_table) |
Now enter the variable you want to analyse:
Code Block |
---|
group <- "Nose_size" |
And once again you’ll need to order the groups in your variable. See the ‘Ordering your variable’ section in the Alpha diversity section for more details.
Choose how you want to order your groups:
Code Block |
---|
lev <- c("Small", "Medium", "Big") |
Then run the following to apply the levels to your data:
Code Block |
---|
ampvisdata$metadata[[group]] <- factor(ampvisdata$metadata[[group]], levels = lev) |
6e. Community structure by variable
In this section you can generate heatmaps and box and whisker plots for the variable you selected above, and for each taxonomic level.
If you want to analyse another variable here, go back to the previous section, choose another variable and re-run the code from that point.
Choose the taxonomic level you want to plot (Choose from Phylum, Class, Order, Family, Genus, Species). This applies to both the heatmaps and B&W plots below.
Code Block |
---|
taxgroup <- "Order" |
Heatmaps
This section contains heatmaps for each taxonomic level (class - species) for your selected variable. The number in each cell is the relative proportion (in %) of taxa per group.
There are a variety of attributes you can modify:
Note the tax_show = 10
attribute. This says to show the top 10 taxa.
The plot_values_size =
defines the size of the text in the heatmap cells.
The tax_add = NULL
adds an additional taxonomic group to the taxa names (e.g. changing this to tax_add = "Phylum"
and plotting Genus will name the taxa as 'phylum:genus' but leaving it as tax_add = NULL
will just give the genus name.
showRemainingTaxa = T
will aggregate, in a single row, the remaining taxa not shown on this heatmap. Change to F
if you don't want to see this.
color_vector = NULL
uses the default colour range (orange -> blue). You can change this by providing your own colour range, e.g. color_vector = c("red", "green")
. You can choose from the huge number of R colours here: http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf
Generate the plot:
Code Block |
---|
p <- amp_heatmap(ampvisdata, tax_show = 10, plot_values_size = 4, tax_add = NULL, showRemainingTaxa = T, color_vector = NULL, group_by = group, tax_aggregate = taxgroup, tax_empty = "remove")
p |
Add some additional plot modifications. Change or remove these as desired (or add your own modifications)
Code Block |
---|
p <- p +
theme_bw() +
theme(text = element_text(size = 18), axis.text.x = element_text(angle = 90, size=16), axis.text.y = element_text(size=14)) +
labs(y = taxgroup)
p |
Export as pdf and tiff
Code Block |
---|
tiff_exp <- paste0("community_tax_abund_heatmap_", group, "_", taxgroup, "_samples_.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0("community_tax_abund_heatmap_", group, "_", taxgroup, "_samples_.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm") |
Box and whisker plots
This section contains B&W plots for each taxonomic level (class - species) for your selected variable. This is similar to the heatmaps you generated in section 6, but separates the B&W into groups per variable, thus enabling an examination of how the variable groups differ in terms of taxonomic abundance.
Again, there are a variety of attributes you can modify:
Note the tax_show = 7
attribute. This says to show the top 7 taxa (Tip: to look at specific taxa you can provide a vector of taxa names, e.g. tax_show = c("Bacteroidales", "Fibrobacterales")
).
The tax_add = NULL
adds an additional taxonomic group to the taxa names (e.g. changing this to tax_add = "Phylum"
and plotting Genus will name the taxa as 'phylum:genus' but leaving it as tax_add = NULL
will just give the genus name.
Generate the plot:
Code Block |
---|
p <- amp_boxplot(ampvisdata, tax_show = 7, tax_add = NULL, tax_aggregate = taxgroup, tax_empty = "remove", group_by = group)
p |
You can change additional properties of the plot here:
Code Block |
---|
p$mapping$fill <- as.name(".Group")
p <- p + theme_bw() +
scale_color_manual(values=c("Red", "Green", "Blue")) + scale_fill_manual(values=c("Red", "Green", "Blue")) +
labs(fill = group, x = taxgroup) +
theme(text = element_text(size = 18), axis.text.x = element_text(size=18), axis.text.y = element_text(size=16)) +
guides(color = "none")
p |
Export as pdf and tiff
Code Block |
---|
tiff_exp <- paste0("community_tax_abund_BW_", group, "_", taxgroup, "_samples_.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0("community_tax_abund_BW_", group, "_", taxgroup, "_samples_.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm") |