Overview
While alpha diversity describes the diversity within individual samples, beta diversity measures the similarity (or dissimilarity) between samples and groups.
For each variable, samples are plotted on a principal coordinates analysis (PCoA) plot to examine the variation between samples based on a dissimilarity matrix. A detailed explanation of PCoA and other ordination methods is available here: http://albertsenlab.org/ampvis2-ordination/
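As a rough illustration of what PCoA does with a dissimilarity matrix, the sketch below runs base R's cmdscale() on a small made-up count table. The counts and sample names are invented for illustration, and this is not necessarily how ampvis2 computes its ordination internally.

```r
# Toy count table (invented values): rows are samples, columns are taxa.
counts <- rbind(
  A1 = c(30, 20, 5, 0),
  A2 = c(28, 22, 4, 1),
  A3 = c(32, 18, 6, 0),
  B1 = c(5, 2, 30, 25),
  B2 = c(4, 3, 28, 27),
  B3 = c(6, 1, 31, 24)
)

# Bray-Curtis dissimilarity: summed absolute differences divided by
# the combined total counts of each pair of samples.
totals <- rowSums(counts)
bray <- as.matrix(dist(counts, method = "manhattan")) / outer(totals, totals, "+")

# Classical PCoA (metric multidimensional scaling) on the dissimilarities.
pcoa <- cmdscale(as.dist(bray), k = 2, eig = TRUE)
coords <- pcoa$points

# Share of the positive eigenvalue sum captured by each of the two axes.
positive_eig <- pcoa$eig[pcoa$eig > 0]
axis_share <- pcoa$eig[1:2] / sum(positive_eig)

print(round(coords, 3))
print(round(axis_share, 3))
```

Each row of coords gives one sample's position on the first two principal coordinates. Samples with similar communities end up close together.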
Sample distance can be measured using three distance measures, each of which can be used as the basis for the ordination. These measures are:
Bray–Curtis dissimilarity is an abundance-based measure: the summed absolute differences in counts between two samples, divided by their combined total counts.
Sørensen, T. (1948) “A method of establishing groups of equal amplitude in plant sociology based on similarity of species content.” Kongelige Danske Videnskabernes Selskab 5.1-34: 4-7.
Cao index is a minimally biased index for high beta diversity and variable sampling intensity. It should not be confused with the similarly named Chao index, which tries to take into account the number of unseen species pairs.
Cao, Y., Bark, A. W., & Williams, W. P. (1997). Analysing benthic macroinvertebrate community changes along a pollution gradient: a framework for the development of biotic indices. Water Research, 31(4), 884-892.
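Jaccard similarity index compares samples by the presence or absence of features, regardless of abundance. The corresponding distance is the fraction of features found in only one of the two samples.
Jaccard, P. (1908). “Nouvelles recherches sur la distribution florale.” Bull. Soc. Vaud. Sci. Nat., (44):223-270.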
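To make two of these measures concrete, here is a minimal base R sketch that computes the Bray-Curtis dissimilarity and the presence/absence Jaccard distance for two made-up samples. The function names and counts are illustrative only. The Cao index is left out because its formula is more involved. Packages such as vegan provide all three measures through vegdist().

```r
# Two toy samples (invented counts) over the same four features.
sample_a <- c(10, 0, 5, 5)
sample_b <- c(10, 0, 5, 0)

# Bray-Curtis: summed absolute differences over combined total counts.
bray_curtis <- function(x, y) {
  sum(abs(x - y)) / sum(x + y)
}

# Jaccard distance on presence/absence: 1 minus shared features
# divided by all features seen in either sample.
jaccard_distance <- function(x, y) {
  present_x <- x > 0
  present_y <- y > 0
  shared <- sum(present_x & present_y)
  total <- sum(present_x | present_y)
  1 - shared / total
}

bc <- bray_curtis(sample_a, sample_b)       # 5 / 35
jd <- jaccard_distance(sample_a, sample_b)  # 1 - 2 / 3
print(c(bray_curtis = bc, jaccard = jd))
```

The two samples differ only in the abundance of the last feature, so the abundance-based Bray-Curtis value is small, while the Jaccard distance reflects that one of three observed features is missing from sample_b.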
Significance tests
For each beta diversity measure, both overall and pairwise significance can be calculated using Permutational Multivariate Analysis of Variance (PERMANOVA), a non-parametric multivariate statistical test. The R-squared value represents the proportion of variance explained by the examined groups, e.g. if R² = 0.23 then 23% of the total variation between samples is explained by groupwise differences. PERMANOVA as used here compares groups, so it is applied to categorical variables rather than continuous data.
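The following base R sketch shows where the R-squared and p-value of a one-factor PERMANOVA come from: squared distances are partitioned into within-group and between-group sums of squares, and the group labels are shuffled to build a null distribution. The count table, group labels and the helper permanova_stats() are invented for illustration. In practice a package function such as adonis2() in vegan would normally be used instead.

```r
# Toy count table (invented values): rows are samples, columns are taxa.
counts <- rbind(
  S1 = c(30, 20, 5, 0),  S2 = c(28, 22, 4, 1),
  S3 = c(32, 18, 6, 0),  S4 = c(29, 21, 5, 2),
  B1 = c(5, 2, 30, 25),  B2 = c(4, 3, 28, 27),
  B3 = c(6, 1, 31, 24),  B4 = c(5, 4, 29, 26)
)
groups <- factor(rep(c("Small", "Big"), each = 4))

# Bray-Curtis dissimilarity matrix.
totals <- rowSums(counts)
d <- as.matrix(dist(counts, method = "manhattan")) / outer(totals, totals, "+")

# Hypothetical helper: partition squared distances into total and
# within-group sums of squares, then derive R-squared and pseudo-F.
permanova_stats <- function(d, g) {
  n <- nrow(d)
  a <- nlevels(g)
  ss_total <- sum(d[upper.tri(d)]^2) / n
  ss_within <- sum(sapply(levels(g), function(l) {
    idx <- which(g == l)
    sub <- d[idx, idx, drop = FALSE]
    sum(sub[upper.tri(sub)]^2) / length(idx)
  }))
  ss_between <- ss_total - ss_within
  c(R2 = ss_between / ss_total,
    F = (ss_between / (a - 1)) / (ss_within / (n - a)))
}

observed <- permanova_stats(d, groups)

# Null distribution: recompute pseudo-F after shuffling the group labels.
set.seed(1)
n_perm <- 999
perm_F <- replicate(n_perm, permanova_stats(d, sample(groups))[["F"]])
p_value <- (sum(perm_F >= observed[["F"]]) + 1) / (n_perm + 1)

print(round(observed, 3))
print(p_value)
```

With only four samples per group there are few distinct ways to shuffle the labels, which limits how small the p-value can get. This is one reason small groups give weak significance tests.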
4a. Choosing a variable to analyse
In the Alpha Diversity section, you already imported your da
You can view your variables as column names in your samples_table:
colnames(samples_table)
Enter the column name of the variable you want to analyse, i.e. change group <- "Myvariable" in the cell below so that it holds your chosen variable's column name. This has to be exactly the same as the column name, including capitalisation and characters such as underscores.
group <- "Nose_size"
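Because the name must match exactly, it can help to check it before going further. The sketch below is one possible check. The samples_table here is a small stand-in with invented values so the example runs on its own, and your real table will have different columns.

```r
# Stand-in for the samples_table imported earlier (invented values).
samples_table <- data.frame(
  SampleID = c("S1", "S2", "S3"),
  Nose_size = c("Small", "Medium", "Big"),
  stringsAsFactors = FALSE
)
group <- "Nose_size"

# Exact, case-sensitive match against the column names.
group_found <- group %in% colnames(samples_table)
if (!group_found) {
  stop("Column '", group, "' not found. Available columns: ",
       paste(colnames(samples_table), collapse = ", "))
}
```

A name such as "nose_size" or "Nose size" would fail this check, since the comparison is case-sensitive and treats underscores and spaces as different characters.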
Ordering your variable
Plotting in ampvis2 is handled by the ggplot2 package. ggplot2 converts categorical variables to factors and automatically orders their levels alphabetically on the plot.
This can cause your groups to be ordered incorrectly on the plot axes (e.g. a time series may not be plotted sequentially).
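A small example of the problem, using invented time point labels: when a character vector is converted to a factor without explicit levels, the levels are sorted as text, so "Week10" lands before "Week2".

```r
# Invented time points, listed in the order they were sampled.
timepoint <- c("Week1", "Week2", "Week10")

# Default factor levels are sorted as text, not by sampling order.
default_levels <- levels(factor(timepoint))
print(default_levels)  # "Week1" "Week10" "Week2"
```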
You can manually set the order of your variable here.
First have a look at how ggplot will order your variable.
levels(factor(ampvisdata$metadata[[group]]))
If these are in the order you want to see them on your plot axes, nothing needs to be done. If they are in the wrong order you need to order them manually by setting the levels.
Choose how you want to order your groups here:
lev <- c("Small", "Medium", "Big")
To order your variable, put all of its levels into lev <- c(...) in the order you want them plotted. Make sure each level is in double quotes and separated by a comma.
Then run the following to apply the levels to your data:
ampvisdata$metadata[[group]] <- factor(ampvisdata$metadata[[group]], levels = lev)
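One pitfall worth checking for: factor() turns any value that is not listed in levels into NA without a warning, so a misspelled or missing level silently drops samples from the plot. The sketch below shows the effect and a simple guard. It uses a small stand-in data frame named metadata with invented values in place of ampvisdata$metadata.

```r
# Stand-in for ampvisdata$metadata (invented values).
metadata <- data.frame(
  Nose_size = c("Small", "Big", "Medium", "Small"),
  stringsAsFactors = FALSE
)
group <- "Nose_size"
lev <- c("Small", "Medium", "Big")

# What goes wrong: "small" does not match "Small", so those values become NA.
typo_result <- factor(metadata[[group]], levels = c("small", "Medium", "Big"))
print(anyNA(typo_result))  # TRUE

# Guard: every value in the data must appear in lev before applying it.
unlisted <- setdiff(unique(metadata[[group]]), lev)
stopifnot(length(unlisted) == 0)

metadata[[group]] <- factor(metadata[[group]], levels = lev)
print(levels(metadata[[group]]))  # "Small" "Medium" "Big"
```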