The taxa package is intended to:
Provide a set of classes to store taxonomic data and any user-specific data associated with it
Provide functions to convert commonly used formats to these classes
Provide a common foundation for other packages to build on to enable an ecosystem of compatible packages dealing with taxonomic data.
Provide generally useful functionality, such as filtering and mapping functions
These are the classes users would typically interact with:
taxon: A class used to define a single taxon. Many other classes in the `taxa` package include one or more objects of this class.
taxonomy: A taxonomy composed of taxon objects organized in a tree structure. This differs from the hierarchies class in how the taxon objects are stored. Unlike a hierarchies object, each unique taxon is stored only once and the relationships between taxa are stored in an edgelist.
taxmap: A class designed to store a taxonomy and associated user-defined data. This class builds on the taxonomy class. User-defined data can be stored in the list obj$data, where obj is a taxmap object. Any number of user-defined lists, vectors, or tables mapped to taxa can be manipulated in a cohesive way such that relationships between taxa and data are preserved.
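To make the edge list idea concrete, here is a minimal base R sketch. It does not use the taxa classes and is not the package's internal format; the lookup vector, the data frame layout, and the helper name subtaxa_of are assumptions made only for illustration. Each taxon appears once, and the parent-child relationships live in a separate two-column table.

```r
# Illustrative only: a hand-rolled edge list, not the taxa package's internals.
# Each unique taxon is stored once, keyed by an arbitrary ID.
taxon_names_by_id <- c(a = "Mammalia", b = "Felidae", c = "Panthera", d = "Felis")

# One row per parent -> child relationship; NA marks a root taxon.
edge_list <- data.frame(
  from = c(NA, "a", "b", "b"),
  to   = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: names of the immediate children of a taxon ID.
subtaxa_of <- function(id) {
  child_ids <- edge_list$to[!is.na(edge_list$from) & edge_list$from == id]
  unname(taxon_names_by_id[child_ids])
}

subtaxa_of("b")
```

Because names are stored once and only IDs appear in the table, renaming a taxon touches a single entry no matter how many relationships it takes part in.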
These classes are mostly components for the larger classes above and would not typically be used on their own.
taxon_database: Used to store information about taxonomy databases.
taxon_id: Used to store taxon IDs, either arbitrary or from a particular taxonomy database.
taxon_name: Used to store taxon names, either arbitrary or from a particular taxonomy database.
taxon_rank: Used to store taxon ranks (e.g. species, family), either arbitrary or from a particular taxonomy database.
filter_taxa: Filter taxa in a taxonomy or taxmap object with a series of conditions. Relationships between remaining taxa and user-defined data are preserved (there are many options controlling this).
There are lots of functions for getting information for each taxon.
Note, this is mostly of interest to developers and advanced users.
The classes in the taxa package are mostly R6 classes (R6Class). A few of the simpler ones (taxa and hierarchies) are S3 instead. R6 classes are different from most R objects because they are mutable (e.g. a function can change its input without returning it). In this, they are more similar to class systems in languages like Python. As in other object-oriented class systems, functions are thought to "belong" to classes (i.e. the data), rather than existing independently of the data. In S3, by comparison, a function such as print exists apart from the data it prints, and a custom print method for a class is an ordinary function called print.myclassname. In contrast, the functions that operate on R6 objects are "packaged" with the data they operate on. For example, a print method of an object for an R6 class might be called like my_data$print() instead of print(my_data).
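The difference between the two calling styles can be sketched in base R. This is only an illustration: the class name myclassname, the field n_taxa, and the describe function are made up, and a plain environment stands in for an R6 object so that the example needs no packages.

```r
# S3 style: the function lives apart from the data and dispatches on class.
my_data <- structure(list(n_taxa = 3), class = "myclassname")
print.myclassname <- function(x, ...) {
  cat("<myclassname with", x$n_taxa, "taxa>\n")
  invisible(x)
}
print(my_data)  # dispatches to print.myclassname

# Packaged style: the function is stored with the data it works on.
# An environment stands in for an R6 object here (assumption for illustration).
my_obj <- new.env()
my_obj$n_taxa <- 3
my_obj$describe <- function() {
  paste("<object with", my_obj$n_taxa, "taxa>")
}
my_obj$describe()
```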
Note, you will need to read the previous section to fully understand this one.
Since the R6 function syntax (e.g. my_data$print()) might be confusing to many R users, all functions in taxa also have S3 versions. For example, the filter_taxa() function can be called on a taxmap object called my_obj as my_obj$filter_taxa(...) (the R6 syntax) or filter_taxa(my_obj, ...) (the S3 syntax). For some functions, these two ways of calling the function can have different effects. For functions that do not return a modified version of the input (e.g. subtaxa()), the two ways have identical behavior. However, functions like filter_taxa(), which modify their inputs, change the object they are called on, as well as returning it, when the R6 syntax is used. For example,
my_obj <- filter_taxa(my_obj, ...)
and
new_obj <- my_obj$filter_taxa(...)
both replace my_obj with the filtered result, but
new_obj <- filter_taxa(my_obj, ...)
will not modify my_obj.
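The behavior described above can be imitated in base R with environments, which, like R6 objects, are modified in place. This is a sketch of one way such a difference could arise, not the taxa implementation; make_obj, filter_in_place, and filter_copy are hypothetical helpers.

```r
# Stand-in for a mutable object: an environment holding a vector of names.
# make_obj, filter_in_place and filter_copy are hypothetical helpers.
make_obj <- function(names) {
  obj <- new.env()
  obj$names <- names
  obj
}

# Mimics the R6-style call: changes the object it is given and returns it.
filter_in_place <- function(obj, keep) {
  obj$names <- obj$names[obj$names %in% keep]
  invisible(obj)
}

# Mimics the S3-style call: works on a copy, so the input is left untouched.
filter_copy <- function(obj, keep) {
  copy <- make_obj(obj$names)
  filter_in_place(copy, keep)
}

my_obj <- make_obj(c("Bacteria", "Archaea"))
new_obj <- filter_copy(my_obj, "Bacteria")
my_obj$names   # still both names
filter_in_place(my_obj, "Bacteria")
my_obj$names   # now only "Bacteria"
```

If the original object is still needed afterwards, the copying style is the safer choice.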
This is a rather advanced topic.
Like packages such as ggplot2 and dplyr, the taxa package uses non-standard evaluation to allow code to be more readable and shorter. In effect, there are variables that only "exist" inside a function call and depend on what is passed to that function as the first parameter (usually a class object). For example, in the dplyr function filter(), column names can be used as if they were independent variables. See ?dplyr::filter for examples of this. The taxa package builds on this idea.
For many functions that work on taxonomy or taxmap objects (e.g. filter_taxa()), some functions that return per-taxon information (e.g. taxon_names()) can be referred to by just the name of the function. When one of these functions is referred to by name, the function is run on the relevant object and its value replaces the function name. For example,
new_obj <- filter_taxa(my_obj, taxon_names == "Bacteria")
is identical to:
new_obj <- filter_taxa(my_obj, taxon_names(my_obj) == "Bacteria")
which is identical to:
new_obj <- filter_taxa(my_obj, my_obj$taxon_names() == "Bacteria")
which is identical to:
my_names <- taxon_names(my_obj)
new_obj <- filter_taxa(my_obj, my_names == "Bacteria")
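For readers curious how a bare function name can stand for its value, here is a minimal base R sketch of the general technique. It is not how taxa implements it: filter_sketch, known_funcs, and the list-based my_obj are stand-ins invented for this example, and the taxon_names defined here is a toy version.

```r
# Hypothetical stand-ins: a plain list as the "object" and one per-taxon function.
my_obj <- list(names = c("Bacteria", "Archaea", "Eukaryota"))
taxon_names <- function(obj) obj$names

# Functions that may be referred to by bare name inside the condition.
known_funcs <- list(taxon_names = taxon_names)

filter_sketch <- function(obj, condition) {
  expr <- substitute(condition)                 # capture the unevaluated expression
  used <- intersect(all.vars(expr), names(known_funcs))
  # Run each referenced function on the object; its value replaces the name.
  values <- lapply(known_funcs[used], function(f) f(obj))
  keep <- eval(expr, values, parent.frame())    # other names resolve in the caller
  obj$names[keep]
}

filter_sketch(my_obj, taxon_names == "Bacteria")
filter_sketch(my_obj, taxon_names(my_obj) == "Bacteria")
```

Both calls select the same element, mirroring the equivalent forms listed above.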
For taxmap objects, you can also use names of user-defined lists, vectors, and the names of columns in user-defined tables that are stored in the obj$data list. See filter_taxa() for examples. You can even add your own
functions that are called by name by adding them to the
For any object with functions that use non-standard evaluation, you can see
what values can be used with
Various elements of the
taxa package were inspired by the dplyr and
taxize packages. This package started as parts of the
binomen packages. There are also many dependencies that make
Find a problem? Have a suggestion? Have a question? Please submit an issue at our GitHub repository:
A GitHub account is free and easy to set up. We welcome feedback! If you don't want to use GitHub for some reason, feel free to email us. We do prefer posting to GitHub since it allows others who might have the same issue to see our conversation. It also helps us keep track of what problems we need to address.
Check out the vignette (browseVignettes("taxa")) for a detailed introduction and examples.
Scott Chamberlain
Zachary Foster