
To use GBIF-mediated data effectively, you will often need to match a scientific name to the GBIF Backbone Taxonomy.

The goal of name matching is to get back an unambiguous taxonomic key (a number) of the scientific name you are interested in. Having a key makes it easy for GBIF to know what you mean.

name_backbone and name_backbone_checklist are the best ways to go from a scientific name to a GBIF taxonkey.

library(rgbif)

name_backbone(name="Calopteryx splendens")
# name_backbone(name="Calopteryx splendens", verbose=TRUE)

This will return a data.frame of the single best match for the name you supplied.

The most interesting columns are:

  • usageKey: Another name for the GBIF taxonkey.
  • status : The taxonomic status of the matched name, for example “ACCEPTED”, “SYNONYM”, or “DOUBTFUL”.
  • matchType : “EXACT”, “HIGHERRANK”, “FUZZY”, or “NONE” (see below).
  • verbatim_name : The name you supplied to GBIF. Useful for matching back to your original data.

A matchType of “EXACT” means the binomial name appears in the GBIF backbone exactly as you spelled it (authorship information is ignored). A matchType of “FUZZY” means the name you supplied may be misspelled or is a variant not in the backbone. A matchType of “HIGHERRANK” usually means the name is not in the GBIF backbone or is not a species-level name (a genus, family, order …). A matchType of “NONE” means no match was found.
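As a rough sketch of how matchType might be used in practice, the snippet below splits a result table into exact matches and rows that deserve a second look. The helper triage_matches is hypothetical (it is not part of rgbif), and the table is a hand-built stand-in with placeholder names and keys, so nothing here calls the GBIF API.

```r
# Split a match table into rows usable as-is and rows to review.
# `matches` is assumed to carry the matchType and verbatim_name columns
# described above. triage_matches() is a hypothetical helper, not an
# rgbif function.
triage_matches <- function(matches) {
  ok <- matches$matchType == "EXACT"
  list(
    accepted = matches[ok, , drop = FALSE],
    review   = matches[!ok, , drop = FALSE]
  )
}

# Hand-built stand-in for a result table. The names and usageKey values
# are placeholders, not real GBIF records.
matches <- data.frame(
  verbatim_name = c("Name A", "Name B", "Name C", "Name D"),
  usageKey      = c(1L, 2L, 3L, NA),
  matchType     = c("EXACT", "FUZZY", "HIGHERRANK", "NONE"),
  stringsAsFactors = FALSE
)

triaged <- triage_matches(matches)
triaged$accepted$verbatim_name  # names that matched exactly
triaged$review$verbatim_name    # names to check by hand
```

Whether “FUZZY” matches should be accepted automatically depends on your data; treating everything other than “EXACT” as needing review is simply the most cautious choice.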

If you have multiple names to match, you can use name_backbone_checklist.

# This requires the newest version of rgbif
name_list <- c(
"Cirsium arvense (L.) Scop.", 
"Calopteryx splendens", 
"Puma concolor (Linnaeus, 1771)", 
"Ceylonosticta alwisi", 
"Fake species (John Waller 2021)", 
"Calopteryx")

name_backbone_checklist(name_list)

name_backbone_checklist also works with a data.frame of name information, also known as a checklist.

name_data <- data.frame(
scientificName = c(
  "Cirsium arvense (L.) Scop.", # a plant
  "Calopteryx splendens (Harris, 1780)", # an insect
  "Puma concolor (Linnaeus, 1771)", # a big cat
  "Ceylonosticta alwisi (Priyadarshana & Wijewardhane, 2016)", # newly discovered insect 
  "Puma concuolor (Linnaeus, 1771)", # a mis-spelled big cat
  "Fake species (John Waller 2021)", # a fake species
  "Calopteryx" # Just a Genus   
), 
kingdom = c(
  "Plantae",
  "Animalia",
  "Animalia",
  "Animalia",
  "Animalia",
  "Johnlia",
  "Animalia"
))

name_backbone_checklist(name_data)
# To return more than just the 'best' results, run
# name_backbone_checklist(name_data,verbose=TRUE) 

When using name_backbone_checklist with a data.frame, you can include higher taxonomic information (genus, family, order, phylum, kingdom, rank) as columns. The ‘name’ column can also be one of several commonly used aliases (scientificName, sci_name, names, species, species_name, sp_name).

name_data <- data.frame(
  species = c(
    "Cirsium arvense (L.) Scop.", # a plant
    "Calopteryx splendens (Harris, 1780)", # an insect
    "Puma concolor (Linnaeus, 1771)" # a big cat
  ),
  kingdom = c(
    "Plantae",
    "Animalia",
    "Animalia"
  ))

name_backbone_checklist(name_data)
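If your name column uses a header that is not among the aliases listed above, one option is to rename it before matching. The following is a minimal base-R sketch; the column name "taxon" is a hypothetical example, and the final call is left commented out because it needs rgbif and network access.

```r
# A checklist whose name column uses a non-standard header.
# "taxon" is a hypothetical column name chosen for this example.
my_checklist <- data.frame(
  taxon   = c("Cirsium arvense (L.) Scop.", "Puma concolor (Linnaeus, 1771)"),
  kingdom = c("Plantae", "Animalia"),
  stringsAsFactors = FALSE
)

# Rename the column to one of the aliases mentioned above.
names(my_checklist)[names(my_checklist) == "taxon"] <- "scientificName"

names(my_checklist)

# The renamed data.frame could then be matched (requires rgbif):
# name_backbone_checklist(my_checklist)
```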

Too many choices problem

When two or more names in the GBIF Backbone Taxonomy share the same spelling but have different authorship (homonyms), supplying just the binomial name will result in matchType : "HIGHERRANK".

For example, the binomial name “Glocianus punctiger” has two entries in the backbone taxonomy. Using verbose=TRUE returns both names.

name_backbone("Glocianus punctiger",verbose=TRUE) # returns more names
# "Glocianus punctiger (C.R.Sahlberg, 1835)"
# "Glocianus punctiger (Gyllenhal, 1837)"

However, giving just the binomial name will return the genus Glocianus, since GBIF doesn’t know which one to choose.

name_backbone("Glocianus punctiger") # matchType : "HIGHERRANK"

Since name_backbone is designed to give back a single best match, it cannot choose between the two names from the binomial alone.
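One way to resolve such a case by hand is to inspect the verbose candidates and keep the one whose authorship you intend. The sketch below assumes the candidate table has a scientificName column containing the full name with authorship; pick_by_author is a hypothetical helper, not an rgbif function, and the table is built by hand from the two names shown above rather than fetched from GBIF.

```r
# Keep only candidates whose full name contains the given author string.
# pick_by_author() is a hypothetical helper, not part of rgbif.
pick_by_author <- function(candidates, author) {
  hit <- grepl(author, candidates$scientificName, fixed = TRUE)
  candidates[hit, , drop = FALSE]
}

# Hand-built stand-in for the verbose result, using the two names above.
candidates <- data.frame(
  scientificName = c(
    "Glocianus punctiger (C.R.Sahlberg, 1835)",
    "Glocianus punctiger (Gyllenhal, 1837)"
  ),
  stringsAsFactors = FALSE
)

chosen <- pick_by_author(candidates, "Gyllenhal")
chosen$scientificName

# With rgbif loaded, the candidates would come from the API instead:
# candidates <- name_backbone("Glocianus punctiger", verbose = TRUE)
```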

Other name_* functions

There are several functions for finding taxonomic information. Typically, the function you want to use is name_backbone or name_backbone_checklist, but these other functions can also be useful in certain situations.

name_suggest can be useful for looking up subspecies or partial names. It is the same service that lets gbif.org guess which name you are typing in the occurrence search.

name_suggest("Calopteryx splendens")

name_lookup can sometimes be useful for seeing what is available in other checklists.

name_lookup("Calopteryx splendens")$data

name_usage is a catch-all function that does a lot. See ?name_usage for more examples.

For example, name_usage can be used for looking up all of the orders, families, or genera in a higher-rank group.

library(dplyr)
# all bird genera, families, and orders
name_usage(212, data="children", limit=200)$data %>%
  filter(!is.na(nubKey)) %>% # only things with a GBIF backbone nubKey
  glimpse()

name_usage can be used for looking up the common names or vernacular names.

name_usage(key=212, data="vernacularNames")$data # the common names for birds

Further reading

Read more about how the GBIF backbone is made here.