...
Code Block |
---|
head(asv_table) |
The taxonomy data from ampliseq is prefixed with D_0_, D_1_, etc., but ampvis2 expects taxonomy assignments to be prefixed with k_, p_, etc. (kingdom, phylum, etc.), so we need to convert these using gsub():
Code Block |
---|
asv_table$Kingdom <- gsub("D_0", "k", asv_table$Kingdom)
asv_table$Phylum <- gsub("D_1", "p", asv_table$Phylum)
asv_table$Class <- gsub("D_2", "c", asv_table$Class)
asv_table$Order <- gsub("D_3", "o", asv_table$Order)
asv_table$Family <- gsub("D_4", "f", asv_table$Family)
asv_table$Genus <- gsub("D_5", "g", asv_table$Genus)
asv_table$Species <- gsub("D_6", "s", asv_table$Species) |
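The same conversion can be written as a loop over a lookup table, which avoids repeating one line per rank. The following is only a minimal sketch on a toy data frame: `toy`, `prefix_map` and the example taxon strings (including the double underscore after the prefix) are assumptions made for illustration, not values taken from the ampliseq output.

```r
# Toy stand-in for asv_table; the real table comes from the earlier import step.
toy <- data.frame(
  Kingdom = c("D_0__Bacteria", "D_0__Archaea"),
  Phylum  = c("D_1__Firmicutes", "D_1__Euryarchaeota"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: taxonomy column -> c(ampliseq prefix, ampvis2 letter)
prefix_map <- list(
  Kingdom = c("D_0", "k"),
  Phylum  = c("D_1", "p")
)

for (rank in names(prefix_map)) {
  old <- prefix_map[[rank]][1]
  new <- prefix_map[[rank]][2]
  # "^" anchors the match so only a leading prefix is replaced
  toy[[rank]] <- sub(paste0("^", old), new, toy[[rank]])
}

toy
```

Anchoring the pattern with `^` is a design choice: unlike the plain `gsub()` calls above, it cannot alter a taxon name that happens to contain the prefix text further along the string.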
...
This section will generate the diversity index scores and then plot them.
This is based on the samples and variable you chose in the previous sections. To confirm what these are, you can run the following two code cells:
Code Block |
---|
group |
samples_table$sample.id |
Calculate alpha diversity
...
Code Block |
---|
write.csv(div_indices, paste0(group, "_diversity_indices_raw_scores.csv")) |
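To make the exported scores easier to interpret, here is a rough sketch of how two common indices, observed richness and Shannon diversity, can be computed by hand for one sample. The counts are made up, and the use of the natural logarithm is an assumption; the function used in the elided step above may apply a different base or additional processing.

```r
# Hypothetical ASV counts for a single sample (not from the tutorial data)
counts <- c(10, 10, 10, 10)

# Observed richness: number of ASVs with a non-zero count
observed <- sum(counts > 0)

# Shannon index: H = -sum(p_i * log(p_i)) over the non-zero proportions
props <- counts[counts > 0] / sum(counts)
shannon <- -sum(props * log(props))

observed
shannon
```

With four equally abundant ASVs the Shannon index equals `log(4)`, the maximum possible for a richness of four; uneven abundances give a lower value.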
Choose the index you want to plot
...
Code Block |
---|
tiff_exp <- paste0("alpha_div_box_plot_", group, "_", indicname, "_", subgroup, "_samples.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm") |
...
Code Block |
---|
pdf_exp <- paste0("alpha_div_box_plot_", group, "_", indicname, "_", subgroup, "_samples.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm") |
...
To see the pairwise results (p values):
Code Block |
---|
wt_pair |
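The step that creates `wt_pair` is not shown here. As an illustration only, and assuming it comes from a pairwise Wilcoxon rank-sum test (the name suggests this, but it is an assumption), the sketch below shows how such an object is built and where its p values are stored. `scores`, `grp` and `wt_demo` are hypothetical names with made-up data.

```r
# Made-up diversity scores for three groups of three samples each
scores <- c(1.1, 1.3, 1.2, 2.4, 2.6, 2.5, 3.9, 4.1, 4.0)
grp    <- factor(rep(c("A", "B", "C"), each = 3))

# Pairwise Wilcoxon rank-sum tests with Benjamini-Hochberg correction
wt_demo <- pairwise.wilcox.test(scores, grp, p.adjust.method = "BH")

# Adjusted p values are stored as a lower-triangular matrix:
# rows are the 2nd..last groups, columns are the 1st..second-to-last groups
wt_demo$p.value
```

Each cell of the matrix is the adjusted p value for one pair of groups; the upper triangle is `NA` because each pair is tested only once.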
Combining diversity plots
You can combine two diversity box and whisker plots, for a side-by-side comparison of results.
However, as the diversity indices are measured on different scales, you first need to rescale both so that they range between 0 and 1.
Code Block |
---|
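# rescale() is assumed here to be scales::rescale() and gather() to be tidyr::gather();
# both packages need to be loaded before running this cell
div_indices$Shannon_scaled <- rescale(div_indices$Shannon)
div_indices$ObservedOTUs_scaled <- rescale(div_indices$ObservedOTUs)
# These need to also be converted to long format for ggplot
div_indices_long <- gather(div_indices, Indices, scaled_score, Shannon_scaled:ObservedOTUs_scaled) |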
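For readers who want to see what the rescaling step does, the following sketch reproduces min-max scaling to the range 0 to 1 in base R. It assumes the `rescale()` call above uses its default target range of 0 to 1; `rescale01` is a hypothetical helper name and the input values are made up.

```r
# Min-max rescaling to [0, 1], written out in base R.
# rescale01 is a hypothetical helper, not part of the tutorial.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

shannon_demo  <- c(2.0, 3.0, 4.0)   # made-up Shannon scores
observed_demo <- c(100, 250, 400)   # made-up observed ASV counts

shannon_scaled_demo  <- rescale01(shannon_demo)
observed_scaled_demo <- rescale01(observed_demo)

shannon_scaled_demo
observed_scaled_demo
```

After rescaling, the lowest value of each index becomes 0 and the highest becomes 1, so the two indices can share one axis even though their raw units differ.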
Code Block |
---|
p <- ggplot(div_indices_long, aes_string(x=group, y="scaled_score", fill = "Indices")) +
geom_boxplot(outlier.size = 2.5, lwd=0.7)
p |
Modify the plot like so:
Code Block |
---|
p <- p + theme_bw() +
ylab("Normalised score") +
xlab("") +
coord_flip() +
theme(text = element_text(size = 17)) +
scale_fill_manual(labels = c("Observed ASVs", "Shannon diversity"), values = c("dodgerblue", "firebrick"))
p |
You can save your plot as a 300dpi (i.e. publication quality) tiff or pdf file. These files can be found in your working directory.
Export as a 300dpi tiff
Code Block |
---|
tiff_exp <- paste0("alpha_div_Shannon_obsASV_box_plot_", group, "_", subgroup, "_samples.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm") |
Export as a pdf
Code Block |
---|
pdf_exp <- paste0("alpha_div_Shannon_obsASV_box_plot_", group, "_", subgroup, "_samples.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm") |
4e. Diversity index plots and statistics - multiple categorical variables
In the previous section you examined a single variable.
In this section you can examine multiple variables by estimating diversity for your primary variable, then splitting the resulting plots into 'facets', based on another variable. This often allows for a better examination of associations and trends between variables.
...
In the previous section you examined weight as a single variable.
In this section you could include fat content as a secondary variable and split your diversity index scores into high fat, medium fat and low fat facets on the same plot.
This section, as with the previous section, requires that you've run the '3. Preparing your data' section and chosen both the samples and main variable you want to work with. If you want to change your samples or variables, go back to that section and re-run it with new parameters.
To refresh your memory, the samples you've chosen to analyse are:
Code Block |
---|
samples_table$sample.id |
And the main variable you want to analyse is:
Code Block |
---|
group |
...
Now you need to choose a secondary variable to split your plots by (change "Phase" to your secondary variable name).
Code Block |
---|
var2 <- "Phase" |
You can see all the available variables you can choose from by looking at your samples_table column names:
Code Block |
---|
colnames(samples_table) |
IMPORTANT: your plots will be split into as many facets as there are unique subcategories in your secondary variable. To see how many subcategories:
Code Block |
---|
unique(samples_table[[var2]]) |
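It can also help to count the subcategories and see how many samples fall into each one before plotting. The sketch below does this on a toy table; `samples_demo` and `var2_demo` are hypothetical stand-ins for `samples_table` and `var2`, and the category names are made up.

```r
# Toy samples table; "Phase" is an illustrative column name
samples_demo <- data.frame(
  sample.id = paste0("S", 1:6),
  Phase     = c("early", "early", "mid", "mid", "late", "late"),
  stringsAsFactors = FALSE
)
var2_demo <- "Phase"

# Number of facets the plot would be split into
n_facets <- length(unique(samples_demo[[var2_demo]]))
n_facets

# Samples per subcategory; very small groups give uninformative box plots
per_facet <- table(samples_demo[[var2_demo]])
per_facet
```

If the number of facets is large, or some subcategories contain only one or two samples, a different secondary variable may give a more readable plot.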
...
) |
...
Rarefaction curve
A generalised linear model is applied to examine statistically significant correlations. In addition to the scatter plot, glm (t statistic, p value) and correlation (correlation coefficient, p value) statistics can be generated.
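As a rough illustration of where these statistics come from, the sketch below fits a Gaussian GLM and runs a Pearson correlation test on simulated data using base R. The data, the variable names (`x`, `y`, `fit`, `ct`) and the choice of a Gaussian family and Pearson correlation are all assumptions; the tutorial's own code for this step is not shown here and may differ.

```r
# Simulated data: a continuous variable and a diversity-like score
set.seed(1)
x <- 1:20
y <- 0.5 * x + rnorm(20)

# Gaussian GLM; the coefficient table holds the t statistic and p value
fit <- glm(y ~ x)
glm_stats <- summary(fit)$coefficients["x", c("t value", "Pr(>|t|)")]
glm_stats

# Pearson correlation coefficient and its p value
ct <- cor.test(x, y)
ct$estimate
ct$p.value
```

The row for `x` in the coefficient table gives the t statistic and p value for the slope, while `cor.test()` reports the strength of the linear association and its own p value.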
...