...

Code Block
head(asv_table)

The taxonomy assignments from ampliseq are prefixed with D_0_, D_1_, etc., but ampvis2 expects the prefixes k_, p_, etc. (kingdom, phylum, etc.), so convert them using gsub():

Code Block
asv_table$Kingdom <- gsub("D_0", "k", asv_table$Kingdom)
asv_table$Phylum <- gsub("D_1", "p", asv_table$Phylum)
asv_table$Class <- gsub("D_2", "c", asv_table$Class)
asv_table$Order <- gsub("D_3", "o", asv_table$Order)
asv_table$Family <- gsub("D_4", "f", asv_table$Family)
asv_table$Genus <- gsub("D_5", "g", asv_table$Genus)
asv_table$Species <- gsub("D_6", "s", asv_table$Species)
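
The seven substitutions above follow one pattern, so they could also be written as a loop. The following is a minimal sketch, not part of the original workflow: the toy data frame `asv_demo`, its values (including the double-underscore separator) and the `prefix_map` vector are hypothetical and used only for illustration.

```r
# Toy table standing in for asv_table (hypothetical values)
asv_demo <- data.frame(
  Kingdom = c("D_0__Bacteria", "D_0__Archaea"),
  Phylum  = c("D_1__Firmicutes", "D_1__Euryarchaeota"),
  stringsAsFactors = FALSE
)

# Rank columns in order, each paired with the prefix ampvis2 expects
prefix_map <- c(Kingdom = "k", Phylum = "p", Class = "c", Order = "o",
                Family = "f", Genus = "g", Species = "s")

for (i in seq_along(prefix_map)) {
  rank <- names(prefix_map)[i]
  # Skip ranks that are not present in the table
  if (rank %in% colnames(asv_demo)) {
    # D_0 -> k, D_1 -> p, ... (i - 1 gives the D_ number)
    asv_demo[[rank]] <- gsub(paste0("D_", i - 1), prefix_map[[i]], asv_demo[[rank]])
  }
}
asv_demo
```

Running head() on the table afterwards is a quick way to confirm that every rank column now carries the new prefix.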

...

This section will generate the diversity index scores and then plot them.

This is based on the samples and variable you chose in the previous sections. If you've forgotten which samples and variable group you selected, run the following:

Code Block
group
samples_table$sample.id

Calculate alpha diversity

...

Code Block
write.csv(div_indices, paste0(group, "_diversity_indices_raw_scores.csv"))

Choose the index you want to plot

...

Code Block
tiff_exp <- paste0("alpha_div_box_plot_", group, "_", indicname, "_", subgroup, "_samples.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")

...

Code Block
pdf_exp <- paste0("alpha_div_box_plot_", group, "_", indicname, "_", subgroup, "_samples.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")

...

To see the pairwise results (p values):

Code Block
wt_pair

Combining diversity plots

You can combine two diversity box and whisker plots, for a side-by-side comparison of results.

However, as the diversity indices are measured on different scales, you first need to rescale both so that they range between 0 and 1.

Code Block
# rescale() is from the scales package; gather() is from tidyr
div_indices$Shannon_scaled <- rescale(div_indices$Shannon)
div_indices$ObservedOTUs_scaled <- rescale(div_indices$ObservedOTUs)
# Convert the two scaled columns to long format for ggplot
div_indices_long <- gather(div_indices, Indices, scaled_score, Shannon_scaled:ObservedOTUs_scaled)
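
For reference, rescale() from the scales package, called with its default arguments as above, performs min-max scaling: the smallest value maps to 0 and the largest to 1. Below is a base R sketch of the same calculation. The helper `minmax()` and the scores are made up for illustration and are not part of the workflow.

```r
# Hypothetical helper: min-max scaling to the range 0 to 1
minmax <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

shannon_demo  <- c(2.0, 3.0, 4.0)   # made-up Shannon scores
observed_demo <- c(100, 150, 300)   # made-up observed ASV counts

minmax(shannon_demo)    # 0.00 0.50 1.00
minmax(observed_demo)   # 0.00 0.25 1.00
```

After scaling, both indices share the same axis, which is why they can be drawn side by side in one box plot. The scaled values show relative position within each index only, not the original units.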

Code Block
p <- ggplot(div_indices_long, aes_string(x=group, y="scaled_score", fill = "Indices")) +
  geom_boxplot(outlier.size = 2.5, lwd=0.7)
p

Modify the plot like so:

Code Block
p <- p + theme_bw() + 
ylab("Normalised score") + 
xlab("") + 
coord_flip() + 
theme(text = element_text(size = 17)) +
scale_fill_manual(labels = c("Observed ASVs", "Shannon diversity"), values = c("dodgerblue", "firebrick"))
p

You can save your plot as a 300dpi (i.e. publication quality) tiff or pdf file. These files can be found in your working directory.

Export as a 300dpi tiff

Code Block
tiff_exp <- paste0("alpha_div_Shannon_obsASV_box_plot_", group, "_", subgroup, "_samples.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")

Export as a pdf

Code Block
pdf_exp <- paste0("alpha_div_Shannon_obsASV_box_plot_", group, "_", subgroup, "_samples.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")
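
To check what an exported file will be called and where it will be written, you can build the name and print it before saving. This is a small sketch: `group_demo` and `subgroup_demo` are hypothetical stand-ins for your own `group` and `subgroup` values.

```r
group_demo    <- "Weight"   # hypothetical variable name
subgroup_demo <- "all"      # hypothetical subgroup label

# paste0() joins the pieces with no separator, so each "_" must be supplied explicitly
file_demo <- paste0("alpha_div_Shannon_obsASV_box_plot_", group_demo, "_", subgroup_demo, "_samples.pdf")
file_demo

# Full path of the file inside the current working directory
file.path(getwd(), file_demo)
```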

4e. Diversity index plots and statistics - multiple categorical variables

In the previous section you examined a single variable.

In this section you can examine multiple variables by estimating diversity for your primary variable, then splitting the resulting plots into 'facets', based on another variable. This often allows for a better examination of associations and trends between variables.

...

For example, if you examined weight as a single variable in the previous section, you could now include fat content as a secondary variable and split your diversity index scores into high fat, medium fat and low fat facets on the same plot.

This section, as with the previous section, requires that you've run the '3. Preparing your data' section and chosen both the samples and main variable you want to work with. If you want to change your samples or variables, go back to that section and re-run it with new parameters.

To refresh your memory, the samples you've chosen to analyse are:

Code Block
samples_table$sample.id

And the main variable you want to analyse is:

Code Block
group

...

Now you need to choose a secondary variable to split your plots by (change "Phase" to your secondary variable name).

Code Block
var2 <- "Phase"

You can see all the available variables you can choose from by looking at your samples_table column names:

Code Block
colnames(samples_table)

IMPORTANT: your plots will be split into as many facets as there are unique subcategories in your secondary variable. To see how many subcategories:

Code Block
unique(samples_table[[var2]])
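
To get the number of facets directly, and to see how many samples fall into each one, you can wrap the same column in length() and table(). This is a sketch on a toy table: `samples_demo`, its "Phase" values and `var2_demo` are hypothetical.

```r
# Toy stand-in for samples_table (hypothetical values)
samples_demo <- data.frame(
  sample.id = paste0("S", 1:6),
  Phase     = c("early", "early", "mid", "mid", "late", "late"),
  stringsAsFactors = FALSE
)
var2_demo <- "Phase"

# Number of unique subcategories, i.e. the number of facets
n_facets <- length(unique(samples_demo[[var2_demo]]))
n_facets

# Number of samples in each subcategory
facet_sizes <- table(samples_demo[[var2_demo]])
facet_sizes
```

A secondary variable with many subcategories, or with very few samples in some of them, tends to give crowded or sparse facets, so it is worth checking these counts before plotting.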

...


Rarefaction curve

A generalised linear model is applied to examine statistically significant correlations. In addition to the scatter plot, glm (t statistic, p value) and correlation (correlation, p value) statistics can be generated.
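
As a rough illustration of the statistics named above, the sketch below fits a glm and runs a correlation test in base R on simulated data. It is not the tutorial's own code. The data frame `demo`, the variables `ph` and `shannon`, and the choice of a gaussian glm with a Pearson correlation are all assumptions made for this example.

```r
set.seed(1)

# Simulated data: a continuous variable and a diversity score that rises with it
demo <- data.frame(ph = seq(5, 8, length.out = 20))
demo$shannon <- 1 + 0.5 * demo$ph + rnorm(20, sd = 0.2)

# glm() uses the gaussian family by default
fit <- glm(shannon ~ ph, data = demo)

# t statistic and p value for the slope
glm_stats <- summary(fit)$coefficients["ph", c("t value", "Pr(>|t|)")]
glm_stats

# cor.test() uses Pearson correlation by default
ct <- cor.test(demo$ph, demo$shannon)
c(correlation = unname(ct$estimate), p.value = ct$p.value)
```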

...