
...

Combined Shannon and Observed ASV. An additional plot combining Shannon’s index and the Observed ASV index has been included, so you can compare the similarities and differences between the two. As each index uses different units, results for both have been normalised to the range 0 to 1.
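The text above does not say how the values were normalised. As an illustration only, here is a minimal sketch assuming min-max scaling; the function name `normalise01` and the diversity values are hypothetical and not part of the original workflow.

```r
# Min-max normalisation: rescale a numeric vector to the range [0, 1].
# ASSUMPTION: min-max scaling is used; the source does not state the method.
normalise01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # All values identical: return zeros to avoid dividing by zero
    return(rep(0, length(x)))
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

# Hypothetical per-sample diversity values (illustrative only)
shannon  <- c(2.1, 3.4, 2.8, 4.0)
observed <- c(120, 310, 205, 450)

shannon_norm  <- normalise01(shannon)
observed_norm <- normalise01(observed)

# Both indices now share the same 0 to 1 scale and can be plotted together
data.frame(shannon_norm, observed_norm)
```

After scaling, the lowest sample in each index sits at 0 and the highest at 1, so the two indices can share one axis even though their original units differ.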
Kruskal-Wallis rank sum test is a rank-based nonparametric test that can be used to determine whether there are statistically significant differences between two or more groups. This statistical analysis is provided for each plot, to estimate whether there is a significant difference (q.value < 0.05) across the groups taken together.
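To show what this test looks like in practice, here is a minimal sketch using base R's `kruskal.test`. The data frame `alpha_div`, its group names and its values are made up for illustration. The sketch reports a raw p-value; how the q.value shown on the plots is derived is not stated here.

```r
# Hypothetical Shannon values for three groups (illustrative only)
alpha_div <- data.frame(
  shannon = c(2.31, 2.54, 2.47, 2.62, 2.18, 2.40,   # control
              2.95, 3.10, 2.88, 3.21, 3.02, 2.79,   # low_fat
              3.75, 3.90, 3.62, 4.05, 3.81, 3.70),  # high_fat
  group = factor(rep(c("control", "low_fat", "high_fat"), each = 6))
)

# Kruskal-Wallis rank sum test: is there a difference among the groups overall?
kw <- kruskal.test(shannon ~ group, data = alpha_div)

kw$statistic  # the Kruskal-Wallis chi-squared statistic
kw$parameter  # degrees of freedom (number of groups minus 1)
kw$p.value    # raw (unadjusted) p-value
```

A small p-value only says that at least one group differs; it does not say which one.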
Pairwise Wilcoxon rank sum test (also known as the Mann-Whitney test) asks the same question as the Kruskal-Wallis test, but is applied to each pair of groups in turn (technically, the Kruskal-Wallis test is the generalization of the Wilcoxon rank sum test to more than two groups).
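A pairwise comparison can be sketched with base R's `pairwise.wilcox.test`. The values and group names below are hypothetical, and the Benjamini-Hochberg adjustment (`"BH"`) is an assumption chosen for illustration; the original analysis may use a different correction.

```r
# Hypothetical Shannon values for three groups (illustrative only)
shannon <- c(2.31, 2.54, 2.47, 2.62, 2.18, 2.40,   # control
             2.95, 3.10, 2.88, 3.21, 3.02, 2.79,   # low_fat
             3.75, 3.90, 3.62, 4.05, 3.81, 3.70)   # high_fat
group <- factor(rep(c("control", "low_fat", "high_fat"), each = 6))

# Wilcoxon rank sum test for every pair of groups, with p-values
# adjusted for multiple testing (ASSUMPTION: Benjamini-Hochberg)
pw <- pairwise.wilcox.test(shannon, group, p.adjust.method = "BH")

# Lower-triangular matrix of adjusted p-values, one cell per pair of groups
pw$p.value
```

Each cell of the matrix corresponds to one pair of groups, which tells you where the differences flagged by the Kruskal-Wallis test lie.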

3. Preparing your data

Import the samples table

When you ran your sequences through the ampliseq pipeline, you submitted the samples with a metadata file. This file contains information on your samples and variables. We need to import this metadata file to run our analysis on selected variables.

Code Block
samples_table <- read.table("metadata.tsv", sep = "\t", header = TRUE)

Have a look at your samples table and variables (metadata):

Code Block
samples_table

Import the ASV abundance table

First import the unfiltered ASV table (ampvis2 does internal filtering).

Code Block

asvtable <- read.table("./results/abundance_table/unfiltered/feature-table.tsv", check.names = FALSE, sep = "\t", skip = 1, stringsAsFactors = FALSE, comment.char = "", header = TRUE)
colnames(asvtable)[1] <- "ASV"

Have a look at the top few rows of the ASV table, to see if it looks right. The sample IDs should be the column names and the ASV IDs in the first column. All the other columns should contain numbers (i.e. the count of the number of times each ASV was found in each sample).

Code Block
head(asvtable)
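The checks described above can also be done in code rather than by eye. The sketch below uses small toy tables in place of `samples_table` and `asvtable`, and it assumes the sample IDs are in a metadata column called `ID`; that column name is hypothetical, so substitute whatever your metadata file uses.

```r
# Toy stand-ins for samples_table and asvtable (hypothetical values)
samples_toy <- data.frame(
  ID   = c("S1", "S2", "S3"),
  diet = c("high_fat", "low_fat", "high_fat")
)
asv_toy <- data.frame(
  ASV = c("asv_a", "asv_b"),
  S1  = c(10, 0),
  S2  = c(3, 7),
  S3  = c(0, 12),
  check.names = FALSE
)

# Sample IDs in the ASV table are every column name except the first ("ASV")
asv_samples <- colnames(asv_toy)[-1]

# Samples present in one table but not the other (both should be empty)
missing_in_asv  <- setdiff(samples_toy$ID, asv_samples)
missing_in_meta <- setdiff(asv_samples, samples_toy$ID)

# Every column after the first should hold numeric counts
all_numeric <- all(vapply(asv_toy[-1], is.numeric, logical(1)))
```

If either `missing_in_asv` or `missing_in_meta` is non-empty, or `all_numeric` is `FALSE`, it is worth fixing the input files before continuing.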

Important: always run this next code cell, even if you haven't subset your data. If you're working with a subset of your data (see section 3), choose a name for your subgroup (e.g. subgroup <- "high_fat"). If you haven't subset your data (i.e. you're working with all your samples), use subgroup <- "all". This name is used in the output plot and file names, so you don't overwrite earlier results each time you run a different subset of your data.

Code Block
subgroup <- "all"
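As a rough illustration of how the subgroup name keeps outputs separate, the sketch below builds file names from it. The file name pattern and the `plots` directory are hypothetical; the names used later in the workflow may differ.

```r
subgroup <- "all"

# Build output file names that include the subgroup name
# (hypothetical naming pattern, for illustration only)
plot_file  <- paste0("alpha_diversity_", subgroup, ".pdf")
table_file <- paste0("alpha_diversity_", subgroup, ".tsv")

# Optionally place the outputs in a directory (hypothetical "plots" folder)
plot_path <- file.path("plots", plot_file)
```

Re-running with subgroup <- "high_fat" would produce a different set of file names, leaving the "all" outputs untouched.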

Convert the imported data to ampvis2 format

The following code cells manipulate the data in a variety of ways (see the in-code #comments for explanations) to prepare it for conversion to an ampvis2 object.

Read in the taxonomy data - i.e. the taxonomic assignments for each ASV

Code Block
mytax <- read.table("./results/taxonomy/taxonomy.tsv", sep = "\t", header = TRUE, quote = "")

Each taxonomic level needs to be in a separate column, but in taxonomy.tsv the levels are combined in a single Taxon column, separated by ';', so we need to split on that delimiter. Use the tidyr::separate function for this.

Code Block
mytaxsplit <- tidyr::separate(data = mytax, col = Taxon, into = c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"), sep = ";")
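Not every ASV is necessarily classified down to species, so some Taxon strings may contain fewer than seven levels. In that case tidyr::separate, with its default settings, emits a warning and fills the missing columns with NA. A quick base R check, shown here on made-up Taxon strings, counts how many levels each entry has:

```r
# Hypothetical Taxon strings (illustrative only)
taxon_toy <- c(
  "Bacteria;Firmicutes;Bacilli;Lactobacillales;Lactobacillaceae;Lactobacillus;Lactobacillus_sp",
  "Bacteria;Proteobacteria;Gammaproteobacteria",
  "Unassigned"
)

# Number of ';'-separated levels in each string
n_levels <- lengths(strsplit(taxon_toy, ";", fixed = TRUE))

# How many entries have each number of levels
table(n_levels)
```

Running the same check on mytax$Taxon shows how many ASVs will end up with NA in the lower taxonomic ranks after splitting.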

Version	Old Version 3	New Version 4
Changes made by	Paul Whatmore (Deactivated)	Paul Whatmore (Deactivated)
Saved on	May 28, 2024	May 28, 2024
