...

Code Block
mytaxsplit <- tidyr::separate(data = mytax, col = Taxon, into = c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"), sep = ";")

Combine the taxonomy and ASV tables

Code Block
asv_table <- cbind(asvtable, mytaxsplit)
# Also remove the Feature.ID and Confidence columns, as they are not needed
asv_table <- subset(asv_table, select=-c(Feature.ID, Confidence))

The taxonomy strings from ampliseq are prefixed with D_0_, D_1_, etc., but ampvis2 expects the prefixes k_, p_, etc. (kingdom, phylum, and so on). We therefore convert the prefixes using gsub().

Code Block
asv_table$Kingdom <- gsub("D_0", "k", asv_table$Kingdom)
asv_table$Phylum <- gsub("D_1", "p", asv_table$Phylum)
asv_table$Class <- gsub("D_2", "c", asv_table$Class)
asv_table$Order <- gsub("D_3", "o", asv_table$Order)
asv_table$Family <- gsub("D_4", "f", asv_table$Family)
asv_table$Genus <- gsub("D_5", "g", asv_table$Genus)
asv_table$Species <- gsub("D_6", "s", asv_table$Species)
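To see what these substitutions do, here is a minimal sketch on a made-up taxonomy table. The data frame toy, its values, and the double underscore after each prefix are assumptions for illustration only; check how the prefixes look in your own data. The loop is simply a compact equivalent of the per-column gsub() calls above.

```r
# Toy taxonomy table; all values are invented for illustration only.
toy <- data.frame(
  Kingdom = c("D_0__Bacteria", "D_0__Archaea"),
  Phylum  = c("D_1__Firmicutes", "D_1__Euryarchaeota"),
  stringsAsFactors = FALSE
)

# Map each rank column to the single-letter prefix to substitute in.
prefixes <- c(Kingdom = "k", Phylum = "p")

for (i in seq_along(prefixes)) {
  rank <- names(prefixes)[i]
  # D_0 belongs to the first rank, D_1 to the second, and so on.
  old <- paste0("D_", i - 1)
  toy[[rank]] <- gsub(old, prefixes[[rank]], toy[[rank]])
}

toy$Kingdom
# "k__Bacteria" "k__Archaea"
```

The same loop would extend to all seven ranks by adding the remaining column names and letters to prefixes.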

Now combine the samples data with the ASV table using amp_load(). This creates an ampvis2 object that the other ampvis2 functions take as input.

Code Block
ampvisdata <- amp_load(otutable = asv_table,
              metadata = samples_table)
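Before loading, it can help to confirm that the sample IDs in the two tables agree. The sketch below uses invented toy tables and rests on two assumptions: that the first column of the metadata holds the sample IDs (which is what the ampvis2 documentation describes), and that every non-taxonomy column of the ASV table is a sample. Adjust it if your tables are laid out differently.

```r
# Toy stand-ins for asv_table and samples_table; all values are invented.
toy_asv <- data.frame(
  S1 = c(10, 0),
  S2 = c(3, 7),
  Kingdom = c("k__Bacteria", "k__Bacteria"),
  stringsAsFactors = FALSE
)
toy_meta <- data.frame(
  SampleID = c("S1", "S2", "S3"),
  Time = c("0h", "2h", "4h"),
  stringsAsFactors = FALSE
)

tax_cols <- c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species")

# Assumption: whatever is left after excluding taxonomy columns is a sample.
sample_cols <- setdiff(colnames(toy_asv), tax_cols)

# Samples listed in the metadata but absent from the ASV table, and vice versa.
missing_in_asv  <- setdiff(toy_meta[[1]], sample_cols)
missing_in_meta <- setdiff(sample_cols, toy_meta[[1]])

missing_in_asv
# "S3"
missing_in_meta
# character(0)
```

Both results being empty suggests the sample IDs line up; anything listed is worth resolving before calling amp_load().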

4. Choosing a categorical variable to analyse

In your metadata you'll usually have multiple variables. Analyse them one at a time: select a variable in this section, then run the remaining analysis sections on it. To analyse another variable, return to this section, change the variable name, and re-run the remaining analysis sections.

NOTE: This section is for choosing categorical variables only. See section 8 onward for analysis of continuous (i.e. numeric) variables.

You can view your variables as column names in your samples_table:

Code Block
colnames(samples_table)

Enter the column name of the variable you want to analyse (i.e. replace "Time" in the cell below with your chosen variable's column name). The name must match the column name exactly, including capitalisation and characters such as underscores.

Code Block
group <- "Time"
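Because the name must match exactly, a quick check can catch typos before any plots are made. This is a minimal sketch on an invented metadata table; toy_samples and toy_group are hypothetical stand-ins for samples_table and group.

```r
# Toy metadata; the column names and values are invented for illustration.
toy_samples <- data.frame(
  SampleID = c("S1", "S2"),
  Time = c("0h", "2h"),
  stringsAsFactors = FALSE
)

toy_group <- "Time"

# TRUE only on an exact match, so "time" or "Time " would give FALSE.
toy_group %in% colnames(toy_samples)

# Stop early with a readable message if the name is wrong.
if (!toy_group %in% colnames(toy_samples)) {
  stop("Column '", toy_group, "' not found in the metadata")
}
```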

Ordering your variable

ampvis2 draws its plots with the ggplot2 package. ggplot2 converts categorical variables to factors and, by default, orders the groups alphabetically on the plot.

This can cause your groups to be ordered incorrectly on the plot axes (e.g. a time series may not be plotted sequentially).
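As a small illustration of the problem, the sketch below builds a factor from a few invented time points. The levels are sorted as text, so "120h" lands before "16h" and "2h".

```r
# Invented time points, listed here in chronological order.
times <- c("0h", "2h", "16h", "120h")

# factor() sorts the levels as text, not as numbers.
levels(factor(times))
# "0h" "120h" "16h" "2h"
```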

You can manually set the order of your variable here.

First have a look at how ggplot will order your variable.

Code Block
levels(factor(ampvisdata$metadata[[group]]))

If these are in the order you want to see them on your plot axes, nothing needs to be done. If they are in the wrong order you need to order them manually by setting the levels.

Choose how you want to order your groups here:

Code Block
lev <- c("0h", "2h", "4h", "6h", "8h", "16h", "24h", "32h", "36h", "48h", "60h", "72h", "96h", "120h")

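To order your variable, put all of its levels inside lev <- c(...), in the order you want them plotted. Put each level in double quotes and separate the levels with commas. Each level must match the value in your metadata exactly; any value not listed in lev becomes NA when the levels are applied.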

Then run the following to apply the levels to your data:

Code Block
ampvisdata$metadata[[group]] <- factor(ampvisdata$metadata[[group]], levels = lev)
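A misspelled or missing level silently turns the affected samples into NA, so it is worth checking the result. The sketch below shows one way to do that on invented values; toy_values and toy_lev are hypothetical stand-ins for your metadata column and lev.

```r
# Invented example values; "2hr" is a deliberate mismatch with toy_lev.
toy_values <- c("0h", "2hr", "4h")
toy_lev <- c("0h", "2h", "4h")

releveled <- factor(toy_values, levels = toy_lev)

# Values present in the data but not listed in the levels (these became NA).
setdiff(unique(toy_values), toy_lev)
# "2hr"

# Number of NA entries after applying the levels.
sum(is.na(releveled))
# 1
```

On real data, the same two checks could be run against the metadata column (ideally the setdiff() before overwriting it) to confirm that no samples were lost.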

5. Rarefaction curve