Table of Contents | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
|
...
Code Block |
---|
head(asv_table) |
The taxonomy data from ampliseq is prefaced by D_0_
D_1_
etc but ampvis2 expects taxonomy assignments to be k_
p_
, etc (kingdom, phylum, etc). So we need to convert these using gsub()
Code Block |
---|
asv_table$Kingdom <- gsub("D_0", "k", asv_table$Kingdom) asv_table$Phylum <- gsub("D_1", "p", asv_table$Phylum) asv_table$Class <- gsub("D_2", "c", asv_table$Class) asv_table$Order <- gsub("D_3", "o", asv_table$Order) asv_table$Family <- gsub("D_4", "f", asv_table$Family) asv_table$Genus <- gsub("D_5", "g", asv_table$Genus) asv_table$Species <- gsub("D_6", "s", asv_table$Species) |
...
This is based on the variable you chose in the previous sections. If you’ve forgotten what variable group you’ve selected, type:
Code Block |
---|
group |
Calculate alpha diversity
First you need to calculate the alpha diversity index scores using ampvis2 function amp_alphadiv()
. This will calculate all 4 indices.
...
Code Block |
---|
write.csv(div_indices, "diversity_indices_raw_scores.csv") |
Choose the index you want to plot
Choose the diversity index scores you want to plot.
...
Shannon's index is used by default ("Shannon"
) Change this to "Simpson"
to plot Simpson's index scores, "Chao1"
for Chao1 richness or "ObservedOTUs"
for Observed ASVs.
Box and whisker plot
Create a box and whisker plot showing diversity per group. You can view the basic plot like so:
...
Code Block |
---|
pdf_exp <- paste0("alpha_div_box_plot_", group, "_", indicname,"_samples.pdf") ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm") |
Statistical analysis
To compare the differences between groups within your variable, a Kruskal-Wallis test (one-way analysis of variance) can be performed to test for overall differences between all groups, and a Wilcoxon rank sum to test pairwise differences between each group.
...
To see the pairwise results (p values).
Code Block |
---|
wt_pair |
4e. Diversity index plots and statistics - multiple categorical variables
In the previous section you examined a single variable.
In this section you can examine multiple variables by estimating diversity for your primary variable, then splitting the resulting plots into 'facets', based on another variable. This often allows for a better examination of associations and trends between variables.
To refresh your memory, the variable you've chosen to analyse is:
Code Block |
---|
group |
The first few steps are the same as the previous section. You generate your diversity indices and then select which index you want to plot:
Code Block |
---|
div_indices <- amp_alphadiv(ampvisdata, richness = T) |
Code Block |
---|
indicname <- "Shannon" |
Shannon's index is used by default ("Shannon"
) Change this to "Simpson"
to plot Simpson's index scores, "Chao1"
for Chao1 richness and "ObservedOTUs"
for Observed ASVs.
Now you need to choose a secondary variable.
You can see all the available variables you can choose from by looking at your samples_table column names
Code Block |
---|
colnames(samples_table) |
Select the secondary variable.
Code Block |
---|
var2 <- "Batch" |
IMPORTANT: your plots will be split into as many facets as there are unique subcategories in your secondary variable. To see how many subcategories:
Code Block |
---|
unique(samples_table[[var2]]) |
Rarefaction curvegeneralised linear model is applied to examine statistically significant correlations.In addition to the scatter plot, glm (t statistic, p value) and correlation (correlation, p value) statistics can be generated.
NOTE: This section, as with the previous plotting sections, requites that you've run the '3. Preparing your data' section and chosen the samples you want to work with. If you want to change your samples, go back to that section and re-run it with new parameters.
To refresh your memory, the samples you've chosen to analyse are:
[ ]:
Code Block |
---|
samples_table$sample.id |
Now choose the continuous variable you want to analyse:
Enter the column name of the variable you want to analyse (i.e. change group <- "Myvariable"
in the below cell to your chosen variable's column name). This has to be exactly the same as the column name, including capitalisation, characters such as underscores, etc.
[ ]:
Code Block |
---|
group <- "DM_digestibility_perc" |
You can see view your variable names as column names in your samples table:
[ ]:
Code Block |
---|
colnames(samples_table) |
4f. Diversity index plots and statistics - continuous variable Generalised linear model t value =
[ ]:
Code Block |
---|
round(glm_sum$coefficients[2,3], 4) |
[ ]:
Code Block |
---|
round(cor_stat$estimate, 4) |
Pearson's correlation p value =
[ ]:
Code Block |
---|
round(cor_stat$p.value, 4) |
Adding another variablefacets based on an additional categorical variable.First select which categorical variable you want to examine (remember, these are the column names of the samples table):
[ ]:
Code Block |
---|
var3 <- "Phase" |
Make sure you've run the plotting code in the previous section. This will create an object called 'p' which you can now split into facets based on the variable you chose in the above code cell"
[ ]:
Code Block |
---|
p <- p + facet_grid(get(var3) ~ .) |
Code Block |
---|
p |
And once again, you can export the plot as a 300dpi TIFF and pdf file
[ ]:
Code Block |
---|
tiff_exp <- paste0("alpha_div_scatter_glm_", group, "_", var2, "_", var3, "_", indicname, "_", subgroup, "_samples.tiff") |
Code Block |
---|
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm") |
[ ]:
Code Block |
---|
pdf_exp <- paste0("alpha_div_scatter_glm_", group, "_", var2, "_", var3, "_", indicname, "_", subgroup, "_samples.pdf") |
Code Block |
---|
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm") |
Click here to go to the next section: 5. Beta diversityGo to the next section: Beta Diversity