Overview
This section examines the community structure - i.e. the proportion, variance and abundance of taxonomic groups found in each sample and treatment group.
Amplicon sequence variants
Taxonomic groups in this study are based on Amplicon sequence variants (ASV), inferred using the DADA2 software package by matching the sample sequences to the Silva ribosomal RNA sequence database.
DADA2 infers sample sequences exactly and resolves differences of as little as 1 nucleotide.
SILVA provides comprehensive, quality checked and regularly updated datasets of aligned small (16S/18S, SSU) and large subunit (23S/28S, LSU) ribosomal RNA (rRNA) sequences for all three domains of life (Bacteria, Archaea and Eukarya).
6a. Results summary
You can see a summary of your data by typing the name of your ampvis2 object:
ampvisdata
This will provide you a summary of sequence reads per sample (minimum, maximum, average, etc) and the number and percentage of ASVs resolved to each taxonomic level.
6b. Taxonomic abundance bar plots
This section generates stacked bar plots showing the proportion of taxa per sample, for each taxonomic level
First, you can output the raw ASV abundance data as a csv file. You can add this as supplemental data to a manuscript or use it in other plotting or statistical programs:
amp_export_otutable(ampvisdata, filename = "ASVtable", normalise = T, sep = ",")
Now create the bar plots.
Choose the taxonomic level you want to plot (Choose from Phylum, Class, Order, Family, Genus, Species)
taxgroup <- "Class"
Generate the data for plotting. > 1% relative abundance only
# Run amp_boxplot to generate aggregated data for the chosen taxonomic level, but output as a data object rather than plotting x <- amp_boxplot(ampvisdata, tax_aggregate = taxgroup, tax_empty = "remove") # Filter out any taxa with > 3% relative abundance # First calculate mean on aggregated taxa x2 <- aggregate(x$data$Abundance, list(x$data$Display), mean) # Select taxa only with abundance > 1% x3 <- x2[x2$x > 1,] # Pull out just those taxa in x$data x4 <- x$data[x$data$Display %in% x3$Group.1, ] # Remove duplicate rows (means already calculated, duplicates only cause bias) x5 <- x4[!duplicated(x4), ]
Generate a stacked bar plot.
As with previous plots, you can modify colours using scale_fill_manual()
.
You can also change axis labels, text size and angle, default theme, etc.
Feel free to add additional modifications to this or any other plot in this Notebook. Here is a good guide for doing this: http://r-statistics.co/Complete-Ggplot2-Tutorial-Part2-Customizing-Theme-With-R-Code.html
p <- ggplot(x5, aes(x = Sample, y = Abundance, fill = Display)) p <- p + geom_bar(position="fill", stat="identity") + labs(y = "Abundance (%)") + theme_classic() + scale_y_continuous(label = label_percent()) + scale_fill_manual(values = c("red", "blue", "green")) + theme(text = element_text(size = 18), axis.text.x = element_text(angle = 90, size=12), axis.text.y = element_text(size=14)) + labs(fill = taxgroup) p
To plot a different taxonomic level, You can change taxgroup <- "...."
above to another taxonomic level, then re-run the code from that point.
Save your plot as a 300dpi (i.e. publication quality) tiff or pdf file
tiff_exp <- paste0("community_tax_abund_bar_plot_", taxgroup, "_samples_.tiff") ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm") pdf_exp <- paste0("community_tax_abund_bar_plot_", taxgroup, "_samples_.pdf") ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")