suppdata is an R [@R2018] package that provides easy, reproducible access to the supplemental materials of published papers from within R. Thus
suppdata facilitates open, reproducible research workflows: scientists re-analyzing published datasets can work with them as easily as if they were stored on their own computer, and others can track their analysis workflow painlessly.
For example, imagine you were conducting an analysis of the evolution of body mass in mammals. Without
suppdata, such an analysis would require manually downloading body mass and phylogenetic data from published manuscripts. This is time-consuming, difficult (if not impossible) to make truly reproducible without re-distributing the data, and hard to follow. With
suppdata, such an analysis is straightforward and reproducible, and the sources of the data [@Fritz2009; @Jones2009] are clear because their identifiers (a DOI and an ESA Archives number) are embedded within the code:

```r
# Load phylogenetics packages
library(ape)
library(caper)
library(phytools)

# Load suppdata
library(suppdata)

# Load two published datasets
# read.nexus() may return several trees; [[1]] selects the first
tree <- read.nexus(suppdata("10.1111/j.1461-0248.2009.01307.x", 1))[[1]]
traits <- read.delim(suppdata(
  "E090-184", "PanTHERIA_1-0_WR05_Aug2008.txt", "esa_archives"))

# Merge datasets
traits <- with(traits, data.frame(
  body.mass = log10(X5.1_AdultBodyMass_g),
  species = gsub(" ", "_", MSW05_Binomial)))
c.data <- comparative.data(tree, traits, species)

# Calculate phylogenetic signal
phylosig(c.data$phy, c.data$data$body.mass)
```

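
The merge step in the example above is dense, so here is a minimal sketch of what it does, run on a toy table rather than the real download. The data frame `toy` and its values are hypothetical stand-ins; only the two column names are taken from the example. Replacing spaces with underscores presumably matches the species names to the tree's tip labels, which conventionally use underscores.

```r
# Toy stand-in for the trait table (hypothetical values, not real data)
toy <- data.frame(
  MSW05_Binomial = c("Mus musculus", "Homo sapiens"),
  X5.1_AdultBodyMass_g = c(10, 1000),
  stringsAsFactors = FALSE
)

# Same transformation as in the example above:
# log10-transform body mass and replace spaces in species names with underscores
toy_traits <- with(toy, data.frame(
  body.mass = log10(X5.1_AdultBodyMass_g),
  species = gsub(" ", "_", MSW05_Binomial)
))

print(toy_traits)
```

The result has one row per species, with a `species` column that can be passed to `comparative.data()` as the column of names to match against the tree.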
The above example makes use of code from the packages ape, caper [@Orme2013], and phytools.

suppdata was originally part of fulltext [@Chamberlain2018], and it is already being used in a number of research projects. One such project is
natdb, a package that builds a database of functional traits from published sources. suppdata is currently available on GitHub (https://github.com/ropensci/suppdata), and we plan to distribute it through rOpenSci and CRAN.