citesdb is an R package to conveniently analyze the full CITES shipment-level wildlife trade database, available at https://trade.cites.org/. This data consists of over 40 years and 20 million records of reported shipments of wildlife and wildlife products subject to oversight under the Convention on International Trade in Endangered Species of Wild Fauna and Flora. The source data are maintained by the UN Environment World Conservation Monitoring Centre. Harfoot et al. (2018) provide a recent overview of broad temporal and spatial trends in the data.
Install the citesdb package with this command:
Note that since citesdb installs a source dependency from GitHub, you will need package build tools.
When you first load the package you will see a message like this:
Not to worry, just do as it says and run
cites_db_download(). This will fetch the most recent1 database from online, an approximately 158 MB download. It will expand to over 1 GB in the local database. During the download and database building, up to 3.5 GB of disk space may be used temporarily.
Once you fetch the data you can connect to the database with the
cites_db() command. You can use the
cites_shipments() command to load a remote
tibble that is backed by the database but not loaded into R. You can use this to analyze CITES data without ever loading it into memory, then gather your results with the
collect(). For example, as demonstrated in the package README file, one could compute the number of shipment records per year like so:
library(citesdb) library(dplyr) #> #> Attaching package: 'dplyr' #> The following objects are masked from 'package:stats': #> #> filter, lag #> The following objects are masked from 'package:base': #> #> intersect, setdiff, setequal, union cites_shipments() %>% group_by(Year) %>% summarize(n_records = n()) %>% arrange(desc(Year)) %>% collect() #> # A tibble: 44 x 2 #> Year n_records #> <int> <dbl> #> 1 2018 1326 #> 2 2017 1015719 #> 3 2016 1262632 #> 4 2015 1296532 #> 5 2014 1109872 #> 6 2013 1127363 #> 7 2012 1096645 #> 8 2011 950144 #> 9 2010 894011 #> 10 2009 908669 #> # … with 34 more rows
library(ggplot2) breaks <- seq(from = 1975, to = 2020, by = 5) cites_shipments() %>% group_by(Year) %>% summarize(n_records = n()) %>% mutate(log10_n_records = log10(n_records)) %>% ggplot(aes(x = Year, y = n_records)) + geom_point() + geom_line(linetype = "dashed", size = 0.2) + ylab("Number of Records") + theme_minimal() + scale_x_continuous(breaks = breaks)
cites_shipments() %>% group_by(Year) %>% summarize(n_records = n()) %>% mutate(log10_n_records = log10(n_records)) %>% ggplot(aes(x = Year, y = log10_n_records)) + geom_point() + geom_line(linetype = "dashed", size = 0.2) + ylab("log10(Number of Records)") + ylim(1.5, 6.5) + theme_minimal() + scale_x_continuous(breaks = breaks)
The package database also contains tables of field metadata, codes used, and CITES countries. This information comes from “A guide to using the CITES Trade Database”, on the CITES website. Convenience functions
cites_parties() access this information:
head(cites_metadata()) #> # A tibble: 6 x 2 #> variable description #> <chr> <chr> #> 1 Year year in which trade occurred #> 2 Appendix CITES Appendix of taxon concerned #> 3 Taxon scientific name of animal or plant concerned #> 4 Class scientific name of animal or plant concerned #> 5 Order scientific name of animal or plant concerned #> 6 Family scientific name of animal or plant concerned head(cites_codes()) #> # A tibble: 6 x 3 #> field code description #> <chr> <chr> <chr> #> 1 Purpose B Breeding in captivity or artificial propagation #> 2 Purpose E Educational #> 3 Purpose G Botanical garden #> 4 Purpose H Hunting trophy #> 5 Purpose L Law enforcement / judicial / forensic #> 6 Purpose M Medical (including biomedical research) head(cites_parties()) #> # A tibble: 6 x 6 #> country code former_code non_ISO_code date data_source #> <chr> <chr> <lgl> <lgl> <chr> <chr> #> 1 Afghanistan AF FALSE FALSE 1986-01-28 'A guide to using the CITES Trade Database', Version 8, Anne… #> 2 Africa XF FALSE TRUE <NA> 'A guide to using the CITES Trade Database', Version 8, Anne… #> 3 Åland Islands AX FALSE FALSE <NA> 'A guide to using the CITES Trade Database', Version 8, Anne… #> 4 Albania AL FALSE FALSE 2003-09-25 'A guide to using the CITES Trade Database', Version 8, Anne… #> 5 Algeria DZ FALSE FALSE 1984-02-21 'A guide to using the CITES Trade Database', Version 8, Anne… #> 6 American Samoa AS FALSE FALSE <NA> 'A guide to using the CITES Trade Database', Version 8, Anne…
More information on the release of shipment-level data can be found in the
?guidance help file.
con <- cites_db() DBI::dbListTables(con) #>  "cites_codes" "cites_metadata" "cites_parties" "cites_shipments" "cites_status"
Note that MonetDB, as implemented by the MonetDBLite package, currently has a limitation of only one connection to the database at a time. Therefore, if you are running R in multiple sessions (say, one in a console and separately knitting an R Markdown document), you will receive this error:
Error: Local citesdb database is locked by another R session. Try closing or running cites_disconnect() in that session.
cites_disconnect() shuts down the current connection, freeing up your other session to connect.
If you are using a recent version of RStudio interactively, loading the
citesdb package also brings up a browsable pane in the “Connections” tab that lets you explore and preview the database tables.
Click on the arrows to see the data types in each table, or the table icon to the right for a preview of the table. The SQL button (which appears in RStudio >=1.2) opens an SQL document to write and preview SQL queries directly.
Suppose we were interested in CITES data on sharks from the order Lamniformes. We might begin an analysis by visualizing shipments of these organisms and their derived products over time:
cites_shipments() %>% filter(Order == "Lamniformes") %>% group_by(Year, Taxon) %>% summarize(n_records = n()) %>% ggplot(aes(x = Year, y = n_records, color = Taxon)) + geom_point() + geom_line(linetype = "dashed", size = 0.2) + ylab("Number of Records") + theme_minimal() + scale_x_continuous(breaks = breaks, limits = c(1990, max(breaks)))
What accounts for the temporal differences in the number of CITES records we observe? Why are there no data prior to 2001? It would be helpful to know more about the history of these particular species within CITES.
Fortunately, the rcites package provides access to the Speciesplus/CITES Checklist API, which includes metadata about species and their protected status through time. To use the functions within this package, users will first need to sign up for a Speciesplus API account in order to generate a personal access token. This token can then be set for the current R session using
rcites::set_token() or can be stored permanently if written to the user’s
.Renviron file (the approach taken in this tutorial). See here for more information on use of the API access tokens.
As an initial step in
rcites workflows, it will typically be most useful to call
spp_taxonconcept() on some taxa of interest. Using the specific example of one of our Lamniformes sharks, the great white (Carcharodon carcharias), as a query taxon, we can see that this function returns a variety of information:
#>  "11210"
Importantly, we can collect the ID for our taxon of interest from this output and store it as an object for ease of reference:
great_white_id <- spp_taxonconcept("Carcharodon carcharias")$general$id
Using our stored ID variable, we can query other
rcites functions like
spp_cites_legislation(), which returns the CITES listing and reservation status for the query taxon. Note that reservations are essentially exemptions declared by particular CITES Parties in reference to specific taxa.
#> # A tibble: 4 x 6 #> id taxon_concept_id is_current appendix change_type effective_at #> <chr> <chr> <lgl> <chr> <chr> <chr> #> 1 904 11210 TRUE II + 2005-01-12 #> 2 906 11210 TRUE II R+ 2005-01-12 #> 3 907 11210 TRUE II R+ 2005-01-12 #> 4 905 11210 TRUE II R+ 2005-01-12
So this function query reveals that the great white is listed under CITES Appendix II (as of 12 January 2005) and that three CITES Parties have active reservations for this species.
To reveal the full listing and reservation history, rather than just current information, we use the
scope = "all" argument:
spp_cites_legislation(great_white_id, scope = "all")$cites_listings
#> # A tibble: 13 x 6 #> id taxon_concept_id is_current appendix change_type effective_at #> <chr> <chr> <lgl> <chr> <chr> <chr> #> 1 899 11210 FALSE III + 2001-10-29 #> 2 904 11210 TRUE II + 2005-01-12 #> 3 898 11210 FALSE III R+ 2001-10-22 #> 4 900 11210 FALSE III R+ 2002-03-26 #> 5 901 11210 FALSE II R+ 2004-07-15 #> 6 902 11210 FALSE III R+ 2004-07-15 #> 7 906 11210 TRUE II R+ 2005-01-12 #> 8 907 11210 TRUE II R+ 2005-01-12 #> 9 905 11210 TRUE II R+ 2005-01-12 #> 10 22005 11210 FALSE III R- 2005-01-12 #> 11 22003 11210 FALSE III R- 2005-01-12 #> 12 22004 11210 FALSE III R- 2005-01-12 #> 13 23783 11210 FALSE II R- 2016-11-21
Here, we see that the species was in fact listed under CITES Appendix III as early as 29 October 2001. This may account for the CITES trade records for this species that date prior to its uplisting to Appendix II, a status which affords more stringent protection and monitoring.
In general, the historical listing information provides critical context for CITES data interpretation. In this case, for instance, it is clear that the lack of CITES data for great white sharks prior to the year 2002 should not be taken as an indication of a lack of trade in this species during that time frame but rather a lack of intergovernmental oversight and data collection.
rcites also offers functions that return other useful metadata such as species distribution information:
#> # A tibble: 96 x 5 #> id iso_code2 name type tags #> <chr> <chr> <chr> <chr> <chr> #> 1 14 BS Bahamas COUNTRY "" #> 2 101 TW Taiwan, Province of China TERRITORY "" #> 3 1315 TR Turkey COUNTRY "" #> 4 1704 GH Ghana COUNTRY "" #> 5 5041 CV Cabo Verde COUNTRY "" #> 6 5393 HR Croatia COUNTRY "" #> 7 6460 GA Gabon COUNTRY "" #> 8 8056 MA Morocco COUNTRY "" #> 9 8181 ES Spain COUNTRY "" #> 10 8286 SD Sudan COUNTRY "" #> # … with 86 more rows
The CITES shipment-level data is a valuable resource for understanding part of the global wildlife trade. However, care must be taken in working with this data. Incomplete or ambiguous data can lead to misinterpretation and different approaches to handling the data can lead to different conclusions. Harrington (2015), Berec, Vršecká, and Šetlíková (2018), Robinson and Sinovas (2018), and Eskew et al. (2019) all provide greater detail on challenges related to analyzing the CITES Trade Database. Also note the shipment-level guidance provided by CITES, which can be found in the
?guidance help file.
As an example, single shipments may be recorded multiple times, as they may be reported by both exporting and importing countries. Here we examine a set of shipments of orchids from a US exporter to the Netherlands:
cites_shipments() %>% filter(Year == 2016, Export.permit.RandomID == "ce63001ff5", Genus == "Paphiopedilum") %>% collect() %>% head() %>% knitr::kable()
See that each pair of shipments is identical except for
Reporter.type, indicating that the shipment was reported both from the US and from the Netherlands. Note that while the importer reported both import and export permit numbers, the exporter reported only the export permit numbers.
How to deal with repeat records? If you are aggregating records to calculate total shipments it is best to remove them, but not every record is recorded twice. This code filters to use only the importer’s records, should all fields other than
Import.permit.RandomID be the same:
cites_shipments() %>% filter(Year == 2016, Export.permit.RandomID == "ce63001ff5", Genus == "Paphiopedilum") %>% collect() %>% group_by_at(vars(-Reporter.type, -Import.permit.RandomID)) %>% mutate(n = n()) %>% filter((n > 1 & Reporter.type == "I") | n == 1) %>% ungroup() %>% head() %>% knitr::kable()
If you use citesdb in a publication, please cite both the package and source data:
Ross, Noam, Evan A. Eskew, and Nicolas Ray. 2019. citesdb: An R package to support analysis of CITES Trade Database shipment-level data. Journal of Open Source Software, 4(37), 1483, https://doi.org/10.21105/joss.01483
UNEP-WCMC (Comps.) 2019. Full CITES Trade Database Download. Version 2019.2. CITES Secretariat, Geneva, Switzerland. Compiled by UNEP-WCMC, Cambridge, UK. Available at: https://trade.cites.org.
Have feedback or want to contribute? Great! Please take a look at the contributing guidelines before filing an issue or pull request.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
Berec, Michal, Lucie Vršecká, and Irena Šetlíková. 2018. “What Is the Reality of Wildlife Trade Volume? CITES Trade Database Limitations.” Biological Conservation 224: 111–16. https://doi.org/10.1016/j.biocon.2018.05.025.
Eskew, Evan A., Noam Ross, Carlos Zambrana-Torrelio, and William B. Karesh. 2019. “The CITES Trade Database Is Not a "Global Snapshot" of Legal Wildlife Trade: Response to Can et al., 2019.” Global Ecology and Conservation 18: e00631. https://doi.org/10.1016/j.gecco.2019.e00631.
Harfoot, Michael, Satu A. M. Glaser, Derek P. Tittensor, Gregory L. Britten, Claire McLardy, Kelly Malsch, and Neil D. Burgess. 2018. “Unveiling the Patterns and Trends in 40 Years of Global Trade in CITES-Listed Wildlife.” Biological Conservation 223: 47–57. https://doi.org/10.1016/j.biocon.2018.04.017.
Harrington, Lauren A. 2015. “International Commercial Trade in Live Carnivores and Primates 2006-2012: Response to Bush et al. 2014.” Conservation Biology 29: 293–96. https://doi.org/10.1111/cobi.12448.
Robinson, Janine E., and Pablo Sinovas. 2018. “Challenges of Analyzing the Global Trade in CITES-Listed Wildlife.” Conservation Biology 32: 1203–6. https://doi.org/10.1111/cobi.13095.
When data is updated on the CITES website, we check for changes in structure, metadata, or documentation that we should pass on to the user (e.g., in the
?guidance help file) and re-package the data for distribution in our own repository. Data updates are expected twice per year, and we expect our repository to update within 2 weeks of the official release.↩