Content Comparison

Table of Contents

minLevel	1
maxLevel	6
outline	false
style	none
type	list
printable	true

...

Code Block
head(asv_table)

The taxonomy data from ampliseq is prefaced by D_0_D_1_ etc but ampvis2 expects taxonomy assignments to be k_p_, etc (kingdom, phylum, etc). So we need to convert these using gsub()

Code Block

asv_table$Kingdom <- gsub("D_0", "k", asv_table$Kingdom)
asv_table$Phylum <- gsub("D_1", "p", asv_table$Phylum)
asv_table$Class <- gsub("D_2", "c", asv_table$Class)
asv_table$Order <- gsub("D_3", "o", asv_table$Order)
asv_table$Family <- gsub("D_4", "f", asv_table$Family)
asv_table$Genus <- gsub("D_5", "g", asv_table$Genus)
asv_table$Species <- gsub("D_6", "s", asv_table$Species)

...

This is based on the variable you chose in the previous sections. If you’ve forgotten what variable group you’ve selected, type:

Code Block
group

Calculate alpha diversity

First you need to calculate the alpha diversity index scores using ampvis2 function amp_alphadiv(). This will calculate all 4 indices.

...

Code Block
write.csv(div_indices, "diversity_indices_raw_scores.csv")

Choose the index you want to plot

Choose the diversity index scores you want to plot.

...

Shannon's index is used by default ("Shannon") Change this to "Simpson" to plot Simpson's index scores, "Chao1" for Chao1 richness or "ObservedOTUs" for Observed ASVs.

Box and whisker plot

Create a box and whisker plot showing diversity per group. You can view the basic plot like so:

...

Code Block
pdf_exp <- paste0("alpha_div_box_plot_", group, "_", indicname,"_samples.pdf") ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")

Statistical analysis

To compare the differences between groups within your variable, a Kruskal-Wallis test (one-way analysis of variance) can be performed to test for overall differences between all groups, and a Wilcoxon rank sum to test pairwise differences between each group.

...

To see the pairwise results (p values).

Code Block
wt_pair

4e. Diversity index plots and statistics - multiple categorical variables

In the previous section you examined a single variable.

In this section you can examine multiple variables by estimating diversity for your primary variable, then splitting the resulting plots into 'facets', based on another variable. This often allows for a better examination of associations and trends between variables.

To refresh your memory, the variable you've chosen to analyse is:

Code Block
group

The first few steps are the same as the previous section. You generate your diversity indices and then select which index you want to plot:

Code Block
div_indices <- amp_alphadiv(ampvisdata, richness = T)

Code Block
indicname <- "Shannon"

Shannon's index is used by default ("Shannon") Change this to "Simpson" to plot Simpson's index scores, "Chao1" for Chao1 richness and "ObservedOTUs" for Observed ASVs.

Now you need to choose a secondary variable.

You can see all the available variables you can choose from by looking at your samples_table column names

Code Block
colnames(samples_table)

Select the secondary variable.

Code Block
var2 <- "Batch"

IMPORTANT: your plots will be split into as many facets as there are unique subcategories in your secondary variable. To see how many subcategories:

Code Block
unique(samples_table[[var2]])

Rarefaction curvegeneralised linear model is applied to examine statistically significant correlations.In addition to the scatter plot, glm (t statistic, p value) and correlation (correlation, p value) statistics can be generated.

NOTE: This section, as with the previous plotting sections, requites that you've run the '3. Preparing your data' section and chosen the samples you want to work with. If you want to change your samples, go back to that section and re-run it with new parameters.

To refresh your memory, the samples you've chosen to analyse are:

[ ]:

Code Block
samples_table$sample.id

Now choose the continuous variable you want to analyse:

Enter the column name of the variable you want to analyse (i.e. change group <- "Myvariable" in the below cell to your chosen variable's column name). This has to be exactly the same as the column name, including capitalisation, characters such as underscores, etc.

[ ]:

Code Block
group <- "DM_digestibility_perc"

You can see view your variable names as column names in your samples table:

[ ]:

Code Block
colnames(samples_table)

4f. Diversity index plots and statistics - continuous variable Generalised linear model t value =

[ ]:

Code Block
round(glm_sum$coefficients[2,3], 4)

Pearson's correlation =

[ ]:

Code Block
round(cor_stat$estimate, 4)

Pearson's correlation p value =

[ ]:

Code Block
round(cor_stat$p.value, 4)

Adding another variablefacets based on an additional categorical variable.First select which categorical variable you want to examine (remember, these are the column names of the samples table):

[ ]:

Code Block
var3 <- "Phase"

Make sure you've run the plotting code in the previous section. This will create an object called 'p' which you can now split into facets based on the variable you chose in the above code cell"

[ ]:

Code Block
p <- p + facet_grid(get(var3) ~ .)

Code Block
p

And once again, you can export the plot as a 300dpi TIFF and pdf file

[ ]:

Code Block
tiff_exp <- paste0("alpha_div_scatter_glm_", group, "_", var2, "_", var3, "_", indicname, "_", subgroup, "_samples.tiff")

Code Block
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")

[ ]:

Code Block
pdf_exp <- paste0("alpha_div_scatter_glm_", group, "_", var2, "_", var3, "_", indicname, "_", subgroup, "_samples.pdf")

Code Block
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")

Click here to go to the next section: 5. Beta diversityGo to the next section: Beta Diversity

Version	Old Version 17	New Version 18
Changes made by	Paul Whatmore (Deactivated)	Paul Whatmore (Deactivated)
Saved on	Jun 02, 2024	Jun 02, 2024

Versions Compared

Key

Calculate alpha diversity

Choose the index you want to plot

Box and whisker plot

Statistical analysis

4e. Diversity index plots and statistics - multiple categorical variables

Rarefaction curvegeneralised linear model is applied to examine statistically significant correlations.In addition to the scatter plot, glm (t statistic, p value) and correlation (correlation, p value) statistics can be generated.

4f. Diversity index plots and statistics - continuous variable Generalised linear model t value =

Adding another variablefacets based on an additional categorical variable.First select which categorical variable you want to examine (remember, these are the column names of the samples table):