You have probably, or will, run into problems with taxonomic names. For example, you may think you know how a taxonomic name is spelled, but then GBIF will not agree with you. Or, perhaps GBIF will have multiple versions of the taxon, spelled in slightly different ways. Or, the version of the name that they think is the right one does not match what yo think is the right one.

This isn’t really anyone’s fault. It’s a result of there not being one accepted taxonomic source of truth across the globe. There are many different taxonomic databases. GBIF makes their own backbone taxonomy that they use as a source of internal truth for taxonomic names. The accepted names in the backbone taxonomy match those in the database of occurrences - so do try to figure out what backbone taxonomy version of the name you want.

Another source of problems stems from the fact that names are constantly changing. Sometimes epithets change, sometimes generic names, and sometimes higher names like family or tribe. These changes can take a while to work their way in to GBIF’s data.

The following are some examples of confusing name bits. We’ll update these if GBIF’s name’s change. The difference between each pair of names is highlighted in bold.

Load rgbif

Helper function

To reduce code duplication, we’ll use a little helper function to make a call to name_backbone() for each input name, then rbind them together:

name_rbind <- function(..., rank = "species") {
  columns <- c('usageKey', 'scientificName', 'canonicalName', 'rank',
    'status', 'confidence', 'matchType', 'synonym')
  df <- lapply(list(...), function(w) {
    rgbif::name_backbone(w, rank = rank)[, columns]
  })
  data.frame(do.call(rbind, df))
}

And another function to get the taxonomic data provider

taxon_provider <- function(x) {
  tt <- name_usage(key = x)$data
  datasets(uuid = tt$constituentKey)$data$title
}

We use taxon_provider() below to get the taxonomy provider in the bulleted list of details for each taxon (even though you don’t see it called, we use it, but the code isn’t shown :)).

Pinus sylvestris vs. P. silvestris

(c1 <- name_rbind("Pinus sylvestris", "Pinus silvestris"))
#>   usageKey      scientificName    canonicalName    rank   status confidence
#> 1  5285637 Pinus sylvestris L. Pinus sylvestris SPECIES ACCEPTED         97
#> 2  5285637 Pinus sylvestris L. Pinus sylvestris SPECIES ACCEPTED         94
#>   matchType synonym
#> 1     EXACT   FALSE
#> 2     FUZZY   FALSE
  • P. sylvestris w/ occurrences count from the data provider
occ_count(c1$usageKey[[1]])
#> [1] 642901
taxon_provider(c1$usageKey[[1]])
#> [1] "Catalogue of Life - May 2021"
  • P. silvestris w/ occurrences count from the data provider
occ_count(c1$usageKey[[2]])
#> [1] 642901
taxon_provider(c1$usageKey[[2]])
#> [1] "Catalogue of Life - May 2021"

Macrozamia platyrachis vs. M. platyrhachis

(c2 <- name_rbind("Macrozamia platyrachis", "Macrozamia platyrhachis"))
#>   usageKey                     scientificName           canonicalName    rank
#> 1  2683551 Macrozamia platyrhachis F.M.Bailey Macrozamia platyrhachis SPECIES
#> 2  2683551 Macrozamia platyrhachis F.M.Bailey Macrozamia platyrhachis SPECIES
#>     status confidence matchType synonym
#> 1 ACCEPTED         96     FUZZY   FALSE
#> 2 ACCEPTED         98     EXACT   FALSE
  • M. platyrachis w/ occurrences count from the data provider
occ_count(c2$usageKey[[1]])
#> [1] 61
taxon_provider(c2$usageKey[[1]])
#> [1] "Catalogue of Life - May 2021"
  • M. platyrhachis w/ occurrences count from the data provider
occ_count(c2$usageKey[[2]])
#> [1] 61
taxon_provider(c2$usageKey[[2]])
#> [1] "Catalogue of Life - May 2021"

Cycas circinalis vs. C. circinnalis

(c3 <- name_rbind("Cycas circinalis", "Cycas circinnalis"))
#>   usageKey      scientificName    canonicalName    rank   status confidence
#> 1  2683264 Cycas circinalis L. Cycas circinalis SPECIES ACCEPTED         98
#> 2  2683264 Cycas circinalis L. Cycas circinalis SPECIES ACCEPTED         95
#>   matchType synonym
#> 1     EXACT   FALSE
#> 2     FUZZY   FALSE
  • C. circinalis w/ occurrences count from the data provider
occ_count(c3$usageKey[[1]])
#> [1] 724
taxon_provider(c3$usageKey[[1]])
#> [1] "Catalogue of Life - May 2021"
  • C. circinnalis w/ occurrences count from the data provider
occ_count(c3$usageKey[[2]])
#> [1] 724
taxon_provider(c3$usageKey[[2]])
#> [1] "Catalogue of Life - May 2021"

Isolona perrieri vs. I. perrierii

(c4 <- name_rbind("Isolona perrieri", "Isolona perrierii"))
#>   usageKey          scientificName     canonicalName    rank   status
#> 1  6308376 Isolona perrierii Diels Isolona perrierii SPECIES ACCEPTED
#> 2  6308376 Isolona perrierii Diels Isolona perrierii SPECIES ACCEPTED
#>   confidence matchType synonym
#> 1         96     FUZZY   FALSE
#> 2         98     EXACT   FALSE
  • I. perrieri w/ occurrences count from the data provider
occ_count(c4$usageKey[[1]])
#> [1] 92
taxon_provider(c4$usageKey[[1]])
#> [1] "Catalogue of Life - May 2021"
  • I. perrierii w/ occurrences count from the data provider
occ_count(c4$usageKey[[2]])
#> [1] 92
taxon_provider(c4$usageKey[[2]])
#> [1] "Catalogue of Life - May 2021"

Wiesneria vs. Wisneria

(c5 <- name_rbind("Wiesneria", "Wisneria", rank = "genus"))
#>   usageKey         scientificName canonicalName  rank   status confidence
#> 1  2864604      Wiesneria Micheli     Wiesneria GENUS ACCEPTED         96
#> 2  7327444 Wisneria Micheli, 1881      Wisneria GENUS  SYNONYM         95
#>   matchType synonym
#> 1     EXACT   FALSE
#> 2     EXACT    TRUE
  • Wiesneria w/ occurrences count from the data provider
occ_count(c5$usageKey[[1]])
#> [1] 120
taxon_provider(c5$usageKey[[1]])
#> [1] "Catalogue of Life - May 2021"
  • Wisneria w/ occurrences count from the data provider
occ_count(c5$usageKey[[2]])
#> [1] 3
taxon_provider(c5$usageKey[[2]])
#> [1] "The Interim Register of Marine and Nonmarine Genera"

The take away messages from this vignette

  • Make sure you are using the name you think you’re using
  • Realize that GBIF’s backbone taxonomy is used for occurrence data
  • Searching for occurrences by name matches against backbone names, not other names (e.g., synonyms)
  • GBIF may at some points in time have multiple version of the same name in their own backbone taxonomy - These can usually be separated by data provider (e.g., Catalogue of Life vs. International Plant Names Index)
  • There are different ways to search for names - make sure are familiar with the four different name search functions, all starting with name_