Overview
While alpha diversity examines differences within treatment groups, beta diversity measures the similarity (or dissimilarity) between samples and groups.
Each variable is plotted on principal coordinates analysis (PCoA) plots, to examine the variance between samples based on a dissimilarity matrix. A detailed explanation of PCoA and other ordination methods can be seen here: http://albertsenlab.org/ampvis2-ordination/
Sample distance can be measured using 3 distance-based ordination methods. These methods are:
Bray–Curtis dissimilarity measures the fraction of overabundant counts.
Sorenson, T. (1948) “A method of establishing groups of equal amplitude in plant sociology based on similarity of species content.” Kongelige Danske Videnskabernes Selskab 5.1-34: 4-7.
Cao index is a minimally biased index for high beta diversity and variable sampling intensity. Chao index tries to take into account the number of unseen species pairs.
Cao, Y., Bark, A. W., & Williams, W. P. (1997). Analysing benthic macroinvertebrate community changes along a pollution gradient: a framework for the development of biotic indices. Water Research, 31(4), 884-892.
Jaccard similarity index measures the fraction of unique features, regardless of abundance..
Jaccard, P. (1908). “Nouvellesrecherches sur la distribution florale.” Bull. Soc. V and. Sci. Nat., (44):223-270.
5a. Choosing a variable to analyse
In the Alpha Diversity section, you already imported your nfcore/ampliseq results and converted them to an ampvis2 object. To have another quick look at your ampvis object:
ampvisdata
As in the Alpha Diversity section, you need to choose a variable to work with.
You can view your variables as column names in your samples_table:
colnames(samples_table)
Now enter the column name of the variable you want to analyse:
group <- "Nose_size"
Ordering your variable
And once again you’ll need to order the groups in your variable. See the ‘Ordering your variable’ section in the Alpha diversity section for more details.
Choose how you want to order your groups:
lev <- c("Small", "Medium", "Big")
Then run the following to apply the levels to your data:
ampvisdata$metadata[[group]] <- factor(ampvisdata$metadata[[group]], levels = lev)
5b. PCoA plots and statistics
The overview section described (with links and references) the ordination methods that can be used to estimate and plot beta diversity.
Briefly, these are: Bray–Curtis dissimilarity, Cao index, and Jaccard similarity index. Each of these has strengths and weaknesses. It's up to you, the researcher, to explore the literature and decide which is the best index to use for your data.
Choose the ordination method you want to use to estimate and plot beta diversity.
Bray–Curtis dissimilarity is used by default ("bray"
) Change this to "cao"
for Cao index, or "jaccard"
for Jaccard similarity index.
index <- "bray"
Now create the PCoA plot.
p <- amp_ordinate(ampvisdata, type = "pcoa", transform = "none", distmeasure = index, sample_color_by = group, sample_point_size = 3, sample_colorframe = TRUE) + scale_color_manual(values=c("Red", "Green", "Blue")) + scale_fill_manual(values=c("Red", "Green", "Blue")) + theme(text = element_text(size = 16)) p
Save your plot as a 300dpi (i.e. publication quality) tiff or pdf file:
tiff_exp <- paste0("PCoA_beta_div_", group, "", index, "_samples.tif") ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm") pdf_exp <- paste0("PCoA_beta_div_", group, "_", index,, "_samples.pdf") ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")
Statistical analysis
To compare the overall differences between groups within your chosen variable, a PERMANOVA test can be performed and similarly a pairwise Permutational Multivariate Analysis of Variance (PERMANOVA) test can be performed to compare differences between each group.
The R-squared value represents the percentage of variance explained by the examined groups. E.g. if R = 0.23 then 23% of the total diversity is explained by groupwise differences. PERMANOVA is based on groupwise differences, thus cannot be applied to continuous data.
PERMANOVA:
# Need to remove rows (from ASV abundance table) with all 0 counts first asvmatrix <- ampvisdata$abund asvmatrix <- asvmatrix[rowSums(asvmatrix) > 0, ] # Also need to transpose (samples need to be as rows, asv's as columns) asvmatrix <- t(asvmatrix) # Then generate pairwise distance matrix sampdist <- vegdist(asvmatrix, method="bray") # Subset samples table by samples in ampvis2 object (in case some were filtered out) samples_table_filt <- samples_table[samples_table$ID %in% ampvisdata$metadata$ID, ] # Use adonis function (vegan package: "Permutational Multivariate Analysis of Variance Using Distance Matrices") to run PERMANOVA on distances pathotype.adonis <- adonis2(sampdist ~ get(group), data = samples_table_filt) # Output the r squared and p values as variables r2 <- pathotype.adonis$R2[1] pval <- pathotype.adonis$`Pr(>F)`[1]
PERMANOVA R squared =
round(r2, 2)
PERMANOVA significance (p) =
pval
Pairwise PERMANOVA:
Calculate the pairwise PERMANOVA. This is a bit complex, as each group within the variable has to be compared to each other group in a variety of ways. Code comments (#) explain what each line of code does.
# The combn function creates every combination of provided elements # Below it takes all group names and combines them pairwise (2) # Creates a matrix where each column = a combination comb_pair <- data.frame(combn(unique(samples_table_filt[[group]]),2)) # Convert scores to relative abundance # Use sweep function to divide ("/") each column (2) by its total (colSums) comm <- sweep(ampvisdata$abund,2,colSums(ampvisdata$abund),"/") # Using adonis function (vegan package: "Permutational Multivariate Analysis of Variance Using Distance Matrices") tabstat_adonis <- c() # Loop through each pair (i.e. column in 'comb_pair') for (i in 1:ncol(comb_pair)) { # Pull out pair data # From samples table samples_table_SB_pair <- samples_table_filt[samples_table_filt[[group]] %in% comb_pair[[i]], ] # From ASV matrix asvmatrix_pair <- comm[samples_table_SB_pair$ID] # Transpose asvmatrix_pair <- asvmatrix_pair[rowSums(asvmatrix_pair) > 0, ] # Also need to transpose (samples need to be as rows, asv's as columns) asvmatrix_pair <- t(asvmatrix_pair) # Then generate pairwise distance matrix sampdist_pair <- vegdist(asvmatrix_pair, method="bray") # Use anosim (vegan): Analysis of Similarities x2 <- adonis2(sampdist_pair ~ get(group), data = samples_table_SB_pair) # Pull out just r squared and p value r2_adonis <- x2$R2[1] pval_adonis <- x2$`Pr(>F)`[1] # Combine into data frame tabstat_adonis <- cbind(tabstat_adonis, c(r2_adonis, pval_adonis)) # Name vector with group combinations colnames(tabstat_adonis)[i] <- paste(comb_pair[[i]], collapse=' Vs ') } row.names(tabstat_adonis) <- c("R squared", "p")
Now output these results as a table.
tabstat_adonis <- t(tabstat_adonis) tabstat_adonis <- data.frame(tabstat_adonis) tabstat_adonis
You can export this table as a csv file:
write_csv(tabstat_adonis, paste0("beta_div_pairwise_PERMANOVA", group, "_", index, "_samples_.csv"))
Go to the next section: Community structure