...
Combined Shannon and Observed ASV. An additional plot combining Shannon’s index and the Observed ASV index has been included, so that similarities and differences between the two sets of results can be compared. As each index uses different units, the results for both have been normalised to the range 0 to 1.
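As an illustration of the normalisation described above, the sketch below rescales two indices to the 0 to 1 range using min-max scaling. It assumes min-max scaling is the method used, which the text does not confirm. The values and the helper name `rescale01` are hypothetical.

```r
# Hypothetical alpha diversity values for five samples (illustration only)
shannon  <- c(2.1, 3.4, 2.8, 4.0, 3.1)
observed <- c(120, 310, 205, 450, 260)

# Min-max scaling: maps the smallest value to 0 and the largest to 1
# (assumed method; rescale01 is a hypothetical helper, not an ampvis2 function)
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

normalised <- data.frame(
  shannon_norm  = rescale01(shannon),
  observed_norm = rescale01(observed)
)
normalised
```

Note that this helper divides by zero if every sample has the same value, so a constant index would need separate handling.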
The Kruskal-Wallis rank sum test is a rank-based nonparametric test that can be used to determine whether there are statistically significant differences between two or more groups. This test is provided for each plot, to estimate whether there is a significant difference (q.value < 0.05) among the groups overall.
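A minimal sketch of this test using base R's `kruskal.test()` is shown below. The data frame `alpha` and its values are hypothetical. The sketch reports the unadjusted p-value only; the q.value mentioned above is presumably a multiple-testing adjusted value, and that adjustment is not shown here.

```r
# Hypothetical Shannon index values for three groups (illustration only)
alpha <- data.frame(
  shannon = c(2.1, 2.4, 2.2, 3.0, 3.3, 3.1, 4.0, 4.2, 3.9),
  group   = factor(rep(c("T1", "T2", "T3"), each = 3))
)

# Kruskal-Wallis rank sum test across all groups (stats package, base R)
kw <- kruskal.test(shannon ~ group, data = alpha)

kw$statistic  # test statistic (Kruskal-Wallis chi-squared)
kw$p.value    # unadjusted p-value for an overall difference among groups
```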
The pairwise Wilcoxon rank sum test (AKA the Mann-Whitney test) is the two-group counterpart of the Kruskal-Wallis test, applied to each pair of groups in turn (technically, the Kruskal-Wallis test is the generalisation of the Wilcoxon rank sum test to more than two groups).
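The pairwise comparison can be sketched with base R's `pairwise.wilcox.test()`. The data are hypothetical, and the Benjamini-Hochberg adjustment (`"BH"`) is an assumption chosen for illustration; the notebook may use a different adjustment method.

```r
# Hypothetical Shannon index values for three groups (illustration only)
alpha <- data.frame(
  shannon = c(2.1, 2.4, 2.2, 3.0, 3.3, 3.1, 4.0, 4.2, 3.9),
  group   = factor(rep(c("T1", "T2", "T3"), each = 3))
)

# Wilcoxon rank sum test for every pair of groups, with adjusted p-values
# ("BH" is an assumed choice of adjustment method)
pw <- pairwise.wilcox.test(alpha$shannon, alpha$group,
                           p.adjust.method = "BH")

# Lower-triangular matrix of adjusted p-values, one cell per pair of groups
pw$p.value
```

With only three samples per group, as in this toy data, the test has very little power, so non-significant pairwise results are expected even when the groups are clearly separated.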
4a. Preparing your data
The ampvis2 package requires 3 components:
...
Have a look at your samples table and variables (metadata):
Code Block |
---|
samples_table |
Import the ASV abundance table
First import the unfiltered ASV table (ampvis2 does internal filtering).
...
Code Block |
---|
subgroup <- "all" |
Import the taxonomy table
Read in the taxonomy data - i.e. the taxonomic assignments for each ASV
...
Code Block |
---|
asv_table$Kingdom <- gsub("D_0", "k", asv_table$Kingdom); asv_table$Phylum <- gsub("D_1", "p", asv_table$Phylum); asv_table$Class <- gsub("D_2", "c", asv_table$Class); asv_table$Order <- gsub("D_3", "o", asv_table$Order); asv_table$Family <- gsub("D_4", "f", asv_table$Family); asv_table$Genus <- gsub("D_5", "g", asv_table$Genus); asv_table$Species <- gsub("D_6", "s", asv_table$Species) |
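The same prefix substitution can also be written as a loop over the rank columns. The sketch below runs on a toy table; the data frame `tax`, its values and the `prefixes` lookup vector are hypothetical. Unlike the `gsub()` calls above, it uses `sub()` anchored to the start of the string, so only a leading prefix is replaced.

```r
# Toy taxonomy table with D_0, D_1, ... rank prefixes (hypothetical data)
tax <- data.frame(
  Kingdom = c("D_0__Bacteria", "D_0__Bacteria"),
  Phylum  = c("D_1__Firmicutes", "D_1__Proteobacteria"),
  stringsAsFactors = FALSE
)

# New single-letter prefix for each rank, in rank order (D_0 to D_6)
prefixes <- c(Kingdom = "k", Phylum = "p", Class = "c", Order = "o",
              Family = "f", Genus = "g", Species = "s")

for (i in seq_along(prefixes)) {
  rank <- names(prefixes)[i]
  # Skip ranks that are not present as columns in this table
  if (rank %in% colnames(tax)) {
    # Replace a leading "D_<n>" with the new prefix, e.g. D_0 -> k
    tax[[rank]] <- sub(paste0("^D_", i - 1), prefixes[[i]], tax[[rank]])
  }
}
tax
```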
Convert the imported data to ampvis2 format
Now combine the samples data with the ASV table using amp_load(). This creates an ampvis2 object that the other ampvis2 functions can use.
...
You can see information about the ampvis2 object you just created by typing its name:
Code Block |
---|
ampvisdata |
4b. Choosing a categorical variable to analyse
In your metadata you'll usually have multiple variables. These need to be analysed individually: select one variable in this section, then run the remaining analysis sections on it. To analyse another variable, return to this section, change the variable name, then re-run the remaining analysis sections.
NOTE: This section is for choosing categorical variables only. See section 8 onward for analysis of continuous (i.e. numeric) variables.
You can view your variables as column names in your samples_table:
...
Code Block |
---|
group <- "Time" |
Ordering your variable
Plotting in ampvis2 is done by the ggplot2 package. ggplot2 converts the variable to a factor and, by default, orders its levels alphabetically on the plot.
...
Code Block |
---|
ampvisdata$metadata[[group]] <- factor(ampvisdata$metadata[[group]], levels = lev) |
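To illustrate why explicit ordering matters, the sketch below applies the same `factor(..., levels = ...)` call to a toy metadata table. The `metadata` data frame, its values and the level names in `lev` are hypothetical.

```r
# Hypothetical metadata with a time variable stored as text
metadata <- data.frame(
  SampleID = paste0("S", 1:6),
  Time     = c("Week10", "Week2", "Week1", "Week10", "Week2", "Week1"),
  stringsAsFactors = FALSE
)

group <- "Time"

# Default factor levels are sorted as text, so Week10 comes before Week2
levels(factor(metadata[[group]]))

# Set the order explicitly so plots follow the chronological sequence
lev <- c("Week1", "Week2", "Week10")
metadata[[group]] <- factor(metadata[[group]], levels = lev)
levels(metadata[[group]])
```

Any value of the variable that is missing from `lev` becomes `NA`, so it is worth checking that every level is spelled exactly as it appears in the metadata.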
4c. Rarefaction curve
This section is for plotting rarefaction curves for your samples, coloured by your chosen variable (if you want to change variables, go back and re-run section 4b, choosing a different variable).
...
You can print this out as-is simply by:
Code Block |
---|
p |
Modifying plot attributes
You can make additional modifications to the plot colours, axis labels, font size, theme, etc:
...
Once your rarefaction plot looks the way you want, you can export it to a file.
Exporting your plot as a file
You can save your plot as a 300dpi (i.e. publication quality) tiff or pdf file. These files can be found in your working directory.
...
You can now find these files in your working directory (which you originally defined in the 'Setting up your analysis environment' section).
4d. Diversity index plots and statistics - single categorical variable
The overview section outlined (with links and references) the alpha diversity indices that can be examined in this Notebook.
Briefly, these are: Shannon’s index, Simpson's index, Chao1 richness and Observed ASVs. Each of these has strengths and weaknesses. It's up to you, the researcher, to explore the literature and decide which is the best index to use for your data.
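As a rough guide to what each index measures, the sketch below computes all four from a single vector of ASV counts using base R only. The counts are hypothetical, and the formulas are common textbook forms (natural-log Shannon, the 1 - sum(p^2) form of Simpson's index, bias-corrected Chao1); the implementation used by the notebook may define them differently.

```r
# Hypothetical ASV counts for a single sample (illustration only)
counts <- c(ASV1 = 50, ASV2 = 30, ASV3 = 10, ASV4 = 5, ASV5 = 2,
            ASV6 = 1, ASV7 = 1, ASV8 = 1, ASV9 = 0)

present   <- counts[counts > 0]       # ASVs actually detected in the sample
rel_abund <- present / sum(present)   # relative abundances

observed <- length(present)                  # Observed ASVs (richness)
shannon  <- -sum(rel_abund * log(rel_abund)) # Shannon's index (natural log)
simpson  <- 1 - sum(rel_abund^2)             # Simpson's index, 1 - sum(p^2) form

# Chao1 (bias-corrected form): observed richness plus an estimate of
# undetected ASVs based on singletons (f1) and doubletons (f2)
f1 <- sum(present == 1)
f2 <- sum(present == 2)
chao1 <- observed + f1 * (f1 - 1) / (2 * (f2 + 1))

c(observed = observed, shannon = shannon, simpson = simpson, chao1 = chao1)
```

Observed ASVs and Chao1 describe richness only, whereas Shannon's and Simpson's indices also take evenness into account, which is one reason the indices can rank the same samples differently.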
...