This function converts a list of hierarchies for individual species into a single species-by-taxonomic-level matrix, then calculates a distance matrix based on taxonomy alone, and outputs either a phylo or dist object. See Details for more information.

Usage

class2tree(input, varstep = TRUE, check = TRUE, remove_shared = FALSE, ...)

# S3 method for class 'classtree'
plot(x, ...)

# S3 method for class 'classtree'
print(x, ...)

Arguments

input

List of classification data.frames, as returned by the function classification()

varstep

Vary step lengths between successive levels relative to proportional loss of the number of distinct classes.

check

If TRUE, remove all redundant levels which are different for all rows or constant for all rows and regard each row as a different basal taxon (species). If FALSE all levels are retained and basal taxa (species) also must be coded as variables (columns). You will get a warning if species are not coded, but you can ignore this if that was your intention.

remove_shared

If TRUE, remove any taxon that is a coarser rank already present in the lineage of another taxon, for example a genus when a species in that genus is also in the same tree.

...

Further arguments passed on to hclust.

x

Input object to print or plot - output from class2tree function.
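
As a rough illustration of what check = TRUE means by "redundant levels", the base R sketch below drops columns that are constant for all rows or different for all rows. The data frame and its values are made up for illustration, and this is not the code used by class2tree or vegan::taxa2dist.

```r
# Toy species-by-level table; all values are hypothetical.
tax <- data.frame(
  kingdom = c("Plantae", "Plantae", "Plantae"),    # constant for all rows
  family  = c("Fabaceae", "Fabaceae", "Pinaceae"), # informative
  species = c("sp_a", "sp_b", "sp_c"),             # different for all rows
  stringsAsFactors = FALSE
)

# Count the distinct classes in each level (column).
n_distinct <- vapply(tax, function(col) length(unique(col)), integer(1))

# Keep only levels that are neither constant nor unique per row.
keep <- n_distinct > 1 & n_distinct < nrow(tax)
reduced <- tax[, keep, drop = FALSE]
```

In this toy table only the family column survives, because it is the only level that groups some rows together while separating others.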

Value

An object of class "classtree" with slots:

  • phylo - The resulting object, a phylo object

  • classification - The classification data.frame, with taxa as rows, and different classification levels as columns

  • distmat - Distance matrix

  • names - The names of the tips of the phylogeny

Note that printing the resulting object shows only the phylo object. You can access the other three slots by calling them directly, e.g. output$names.

Details

See vegan::taxa2dist(). Thanks to Jari Oksanen for writing the taxa2dist function and pointing it out, and to Clarke & Warwick (1998, 2001), on whose work taxa2dist is based.

The taxonomy tree is based not only on clustering of the taxonomic ranks (e.g. strain, species, genus, ...), but also on the actual taxon clades (e.g. mammals, plants, or reptiles).

The function proceeds as follows. First, all possible taxonomic ranks and their corresponding IDs are collected from the input for each taxon. Then the rank vectors of all taxa are aligned into a matrix whose columns are the ordered taxonomic ranks and whose rows are the rank vectors of the individual taxa. Next, the rank matrix is converted into a taxonomy ID matrix, in which any missing rank receives a pseudo ID derived from the previous rank. Finally, this taxonomy ID matrix is used to cluster together taxa that have similar taxonomic hierarchies.
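
The steps above can be sketched in base R on toy data. This is a minimal illustration under stated assumptions, not the internal code of class2tree: the taxon names and IDs are made up, the rank order is assumed to be known in advance, the "_pseudo" suffix is a hypothetical way of building a pseudo ID, and a simple mismatch fraction stands in for the distance that vegan::taxa2dist() computes.

```r
# Toy tables shaped like classification() output (columns name, rank, id).
# All names and IDs are hypothetical.
toy <- list(
  sp_a = data.frame(name = c("Plantae", "Fabaceae", "Lupinus", "sp_a"),
                    rank = c("kingdom", "family", "genus", "species"),
                    id   = c("1", "10", "100", "1000"),
                    stringsAsFactors = FALSE),
  sp_b = data.frame(name = c("Plantae", "Fabaceae", "Arachis", "sp_b"),
                    rank = c("kingdom", "family", "genus", "species"),
                    id   = c("1", "10", "101", "1001"),
                    stringsAsFactors = FALSE),
  sp_c = data.frame(name = c("Plantae", "Pinus", "sp_c"),  # no family rank
                    rank = c("kingdom", "genus", "species"),
                    id   = c("1", "200", "2000"),
                    stringsAsFactors = FALSE)
)

# Steps 1-2: align each taxon's IDs against an ordered set of ranks
# (assumed known here), giving a taxa-by-rank ID matrix.
ranks <- c("kingdom", "family", "genus", "species")
id_mat <- t(sapply(toy, function(df) df$id[match(ranks, df$rank)]))
colnames(id_mat) <- ranks

# Step 3: give each missing rank a pseudo ID built from the previous rank.
for (j in seq_along(ranks)[-1]) {
  miss <- is.na(id_mat[, j])
  id_mat[miss, j] <- paste0(id_mat[miss, j - 1], "_pseudo")
}

# Step 4: distance = fraction of ranks at which two taxa differ
# (a stand-in for taxa2dist), followed by hierarchical clustering.
n <- nrow(id_mat)
d <- matrix(0, n, n, dimnames = list(rownames(id_mat), rownames(id_mat)))
for (i in seq_len(n)) {
  for (k in seq_len(n)) {
    d[i, k] <- mean(id_mat[i, ] != id_mat[k, ])
  }
}
hc <- hclust(as.dist(d), method = "average")
```

Here sp_a and sp_b share a family and are merged first, while sp_c, whose missing family was filled with a pseudo ID, joins them at a greater distance.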

Examples

if (FALSE) { # \dontrun{
spnames <- c('Quercus robur', 'Iris oratoria', 'Arachis paraguariensis',
 'Helianthus annuus','Madia elegans','Lupinus albicaulis',
 'Pinus lambertiana')
out <- classification(spnames, db='itis')
tr <- class2tree(out)
plot(tr)

spnames <- c('Klattia flava', 'Trollius sibiricus',
 'Arachis paraguariensis',
 'Tanacetum boreale', 'Gentiana yakushimensis','Sesamum schinzianum',
 'Pilea verrucosa','Tibouchina striphnocalyx','Lycium dasystemum',
 'Berkheya echinacea','Androcymbium villosum',
 'Helianthus annuus','Madia elegans','Lupinus albicaulis',
 'Pinus lambertiana')
out <- classification(spnames, db='ncbi')
tr <- class2tree(out)
plot(tr)
} # }