This package provides programmatic access to the Chromosome Counts Database (CCDB) API. The CCDB is a community resource for plant chromosome numbers. For more details on the database, see the associated publication by Rice et al. (2014) in New Phytologist (doi:10.1111/nph.13191).
This package is maintained by Karl Broman and was formerly maintained by Paula Andrea Martinez and Matthew Pennell, none of whom are affiliated with the CCDB group. The URL for Chromer docs is https://docs.ropensci.org/chromer/.
The package is available on CRAN, but the CRAN release is currently outdated. Please install the latest version directly from GitHub using remotes:
## install.packages("remotes")
remotes::install_github("ropensci/chromer")
It is possible to query the database in four ways: by species, genus, family, or majorGroup. For example, if we are interested in the genus Solanum (Solanaceae), which contains the potato, tomato, and eggplant, we would query the database as follows:
library(chromer)
sol_gen <- chrom_counts(taxa = "Solanum", rank = "genus")
head(sol_gen)
nrow(sol_gen)
There are over 3000 records for Solanum alone! If we are interested in a particular species, such as the tomato, we can search for the species directly by setting rank = "species" and giving the full name as taxa = "Solanum_lycopersicum".
taxa = "Solanum lycopersicum" (with a space between the genus and species name) will also work here.
If we wanted to get data on the whole family, we would set taxa = "Solanaceae" and rank = "family".
Or we can expand the scope much further and get all angiosperms with taxa = "Angiosperms" and rank = "majorGroup" (this will take some time).
There are two options for returning data. The first (default) is to return only the species name information (including taxonomic resolutions made by Taxonome) and the haploid and diploid counts. Setting the argument full = TRUE, as in
sol_gen_full <- chrom_counts("Solanum", rank = "genus", full = TRUE)
returns considerably more information on each record.
The Chromosome Counts Database is a fantastic resource, but because it compiles a large number of sources and studies, the data are somewhat messy and challenging to work with. We have written a small function that does some post-processing to make the data easier to handle. The function summarize_counts() does the following:
Aggregates multiple records for the same species.
Infers the gametophytic (haploid) number of chromosomes when only the sporophytic (diploid) counts are available.
Parses the records for numeric values. In some cases chromosomal counts also include text characters (e.g., #-#; c.#; #,#,#; and many other varieties). As there are many possible ways that chromosomal counts may be listed in the database, the function takes the naive approach and simply searches the strings for integers. In most cases this is sensible, but it may produce odd results on occasion. Some degree of manual curation will probably be necessary, and the output of the summary should be used with caution in downstream analyses.
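To make these three steps concrete, here is a minimal base R sketch of the same idea. It is not the code used inside summarize_counts(). The records data frame, its column names, the species labels, and the helper extract_integers() are all made up for illustration, and halving the sporophytic count to obtain a haploid number is our assumption about how the inference could work.

```r
# Hypothetical raw records in the styles mentioned above (not real CCDB data).
records <- data.frame(
  species      = c("Species A", "Species A", "Species B"),
  gametophytic = c("12", "c.12", NA),
  sporophytic  = c(NA, NA, "24,48"),
  stringsAsFactors = FALSE
)

# Naive parsing: pull every run of digits out of each string.
extract_integers <- function(x) {
  x <- ifelse(is.na(x), "", x)  # treat missing entries as empty strings
  lapply(regmatches(x, gregexpr("[0-9]+", x)), as.integer)
}

gam <- extract_integers(records$gametophytic)
spo <- extract_integers(records$sporophytic)

# Use the haploid counts when present; otherwise halve the diploid counts
# (assumed rule, for illustration only).
haploid <- Map(function(g, s) if (length(g) > 0) g else s / 2, gam, spo)

# Aggregate the records of each species into one set of distinct counts.
species <- unique(records$species)
by_species <- lapply(species, function(sp) {
  sort(unique(unlist(haploid[records$species == sp])))
})
names(by_species) <- species

by_species
```

Note how a string such as "12-14" would yield both 12 and 14 under this approach, whether it denotes a range or two separate counts. This is the kind of ambiguity that calls for manual curation.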
To summarize and clean the count data obtained from chrom_counts(), simply pass the result to summarize_counts(), for example summarize_counts(sol_gen).
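If the raw records are not needed, the query and the clean-up step can be wrapped together. The helper below is a hypothetical sketch and not part of chromer. It assumes only the two functions described above, and that summarize_counts() accepts the default output of chrom_counts().

```r
# Hypothetical convenience wrapper (not part of chromer): query the CCDB and
# summarize the result in one step. Calling it requires the chromer package
# to be installed and network access to the CCDB API.
summarize_taxon <- function(taxa, rank) {
  counts <- chromer::chrom_counts(taxa = taxa, rank = rank)
  chromer::summarize_counts(counts)
}
```

For the genus queried earlier, this would be called as summarize_taxon("Solanum", rank = "genus").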