Background

The Netherlands Biodiversity API

The Netherlands Biodiversity API (NBA) facilitates access to the Natural History Collection at Naturalis Biodiversity Center. In addition to museum specimen records and metadata, it provides access to taxonomic classification and nomenclature, geographical information, and multimedia files. Built on the Elasticsearch engine, the NBA supports searching collection and biodiversity data in near real-time. Furthermore, by incorporating information from taxonomic databases, the NBA can be used for taxonomic name resolution. Persistent Uniform Resource Locators (PURLs) ensure that each specimen accessible via the NBA is represented by a citable, unambiguous web reference. Access to our data is provided via a RESTful interface and several clients such as the BioPortal, a web application for browsing biodiversity data that is served by the NBA. For more information about the NBA, please see our detailed documentation.

R access

The R programming language is a well-established tool in scientific research and is increasingly used by biodiversity researchers. To make the NBA easier to access for these researchers, we developed this R client.

Full client vs wrapper functions

nbaR aims to be a full client of the NBA API, meaning that it implements all endpoints and the entire NBA object model. The client thus supports every query the API allows. Complex objects returned by the API, such as Specimen or Taxon objects, are implemented as R6 classes. This also applies to the objects used for querying, QuerySpec and QueryCondition (see also here).

For many queries, the full functionality of the NBA won’t be required. The package therefore offers a wrapper function for each endpoint that does not use R6 classes but common R data structures, such as list and data.frame. However, querying capabilities are limited for these wrappers. Below we will show how to set up some simple queries using the wrapper functions.

Quick start: querying the NBA using wrapper functions

The data in the NBA consists of four main data types (see NBA docs):

  • Specimen
  • Taxon
  • Multimedia
  • Geo

Wrapper function names start with the data type in lower case, followed by an underscore (specimen_*, taxon_*, etc.). There is a wrapper function for each endpoint (see here for all endpoints); camelCase naming is replaced by snake_case. The NBA endpoint getDistinctValues for specimen data, for instance, is called by the function specimen_get_distinct_values.
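The naming rule can be sketched in a few lines of base R. The helper endpoint_to_wrapper() is hypothetical and not part of nbaR; it only illustrates how an endpoint name would map to a wrapper name under the convention described above. The endpoint name getPaths is an assumption inferred from the wrapper specimen_get_paths.

```r
# Hypothetical helper (not part of nbaR): derive the wrapper function name
# from a data type and a camelCase endpoint name.
endpoint_to_wrapper <- function(data_type, endpoint) {
  # put an underscore between a lower-case letter or digit and an upper-case letter
  snake <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", endpoint)
  paste0(tolower(data_type), "_", tolower(snake))
}

endpoint_to_wrapper("specimen", "getDistinctValues")
## [1] "specimen_get_distinct_values"

# endpoint name assumed from the wrapper specimen_get_paths
endpoint_to_wrapper("taxon", "getPaths")
## [1] "taxon_get_paths"
```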

Specimen services provide the interface to the Naturalis collection and to species occurrences (see here), whereas Taxon services provide data from taxonomic checklists (see here). Multimedia services give access to photos, videos and sound data (see here); Geo services store polygon data for geographical regions and nature reserves (see here).

Querying specimen records

Suppose we want to look up specimens of the genus Mola (sunfish). To find out which fields of the NBA we can query, we can use the function specimen_get_paths() (see ?specimen_get_paths for documentation).

library('nbaR')

all_paths <- specimen_get_paths()
head(all_paths)
## [1] "sourceSystem.code" "sourceSystem.name" "sourceSystemId"   
## [4] "recordURI"         "unitID"            "unitGUID"
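Because the paths come back as a plain character vector, ordinary pattern matching is enough to narrow them down. The sketch below uses a hard-coded sample vector (the six paths printed above plus the genus field discussed in this section) instead of a live call, so the sample itself is an assumption about what the full vector contains.

```r
# Sample of field paths; in practice this vector would come from
# specimen_get_paths().
paths <- c("sourceSystem.code", "sourceSystem.name", "sourceSystemId",
           "recordURI", "unitID", "unitGUID",
           "identifications.scientificName.genusOrMonomial")

# case-insensitive search for paths that mention "genus"
genus_paths <- grep("genus", paths, ignore.case = TRUE, value = TRUE)
genus_paths
## [1] "identifications.scientificName.genusOrMonomial"

# top-level objects: everything before the first dot
top_level <- unique(sub("\\..*$", "", paths))
```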

Note that paths of nested objects are separated by a dot (.). To search for a specific genus, we can query the field identifications.scientificName.genusOrMonomial. The specimen_query function lets us query specific fields; the query parameters are given as a named list (a named vector also works):

queryParams <- list("identifications.scientificName.genusOrMonomial" =
                        "Mola")
sp_data <- specimen_query(queryParams)

## how many specimens are found?
nrow(sp_data)
## [1] 10
## which fields are available?
colnames(sp_data)
##  [1] "sourceSystem"             "sourceSystemId"          
##  [3] "id"                       "unitID"                  
##  [5] "unitGUID"                 "sourceInstitutionID"     
##  [7] "sourceID"                 "owner"                   
##  [9] "licenseType"              "license"                 
## [11] "recordBasis"              "collectionType"          
## [13] "preparationType"          "numberOfSpecimen"        
## [15] "fromCaptivity"            "objectPublic"            
## [17] "multiMediaPublic"         "gatheringEvent"          
## [19] "identifications"          "kindOfUnit"              
## [21] "phaseOrStage"             "title"                   
## [23] "associatedMultiMediaUris" "theme"

The return type can be either a list or a data.frame (the default). Note that nested structures in the data frame are represented as list columns. One example is the field associatedMultiMediaUris, which lists, if given, all links to multimedia resources for the specimens:

sp_data$associatedMultiMediaUris
## [[1]]
## NULL
## 
## [[2]]
## NULL
## 
## [[3]]
## NULL
## 
## [[4]]
## NULL
## 
## [[5]]
## NULL
## 
## [[6]]
## NULL
## 
## [[7]]
##                                                              accessUri
## 1 https://medialib.naturalis.nl/file/id/RMNH.PISC.17807_1/format/large
##       format        variant
## 1 image/jpeg ac:GoodQuality
## 
## [[8]]
## NULL
## 
## [[9]]
## NULL
## 
## [[10]]
## NULL
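List columns such as this one are easiest to handle with vapply() and lapply(). The following sketch builds a small mock data frame (the unitID values are made up; the media entry is copied from the output above) and pulls out the records that have media attached. With real data, mock_sp would be replaced by the result of specimen_query().

```r
# Mock stand-in for sp_data: three records, one of which has a media entry.
media <- data.frame(
  accessUri = "https://medialib.naturalis.nl/file/id/RMNH.PISC.17807_1/format/large",
  format = "image/jpeg",
  variant = "ac:GoodQuality",
  stringsAsFactors = FALSE
)
mock_sp <- data.frame(unitID = c("A", "B", "C"), stringsAsFactors = FALSE)  # made-up IDs
mock_sp$associatedMultiMediaUris <- list(NULL, media, NULL)

# TRUE for records that carry at least one media entry
has_media <- !vapply(mock_sp$associatedMultiMediaUris, is.null, logical(1))

# collect the access URIs of those records into one character vector
uris <- unlist(lapply(mock_sp$associatedMultiMediaUris[has_media],
                      function(m) m$accessUri))

mock_sp$unitID[has_media]
## [1] "B"
```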

Querying taxon records

Taxonomic information can be retrieved using the taxon_ functions. Taxon records come from two sources, the Dutch species register (Nederlands Soortregister, NSR) and the Catalogue of Life (COL).

To see how many records come from each source, we can query for all distinct values (and their counts) of a specific field (see taxon_get_paths for all fields in the taxon data):

taxon_get_distinct_values("sourceSystem.name")
## $`Species 2000 - Catalogue Of Life`
## [1] 1998431
## 
## $`Naturalis - Dutch Species Register`
## [1] 50010
## alternatively, show for sourceSystem.code
taxon_get_distinct_values('sourceSystem.code')
## $COL
## [1] 1998431
## 
## $NSR
## [1] 50010
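The named list returned here converts readily into a small summary table. As a sketch, the counts below are copied by hand from the output above rather than fetched, so a live call may give different numbers.

```r
# Counts copied from the output above (not a live query).
counts <- list("Species 2000 - Catalogue Of Life" = 1998431,
               "Naturalis - Dutch Species Register" = 50010)

by_source <- data.frame(source = names(counts),
                        records = unlist(counts, use.names = FALSE),
                        stringsAsFactors = FALSE)

# share of each source in percent, rounded to one decimal place
by_source$percent <- round(100 * by_source$records / sum(by_source$records), 1)
by_source
```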

To query, for instance, all species listed in the Catalogue of Life for the genus Mola, we can use the wrapper function taxon_query:

## specify query parameters
queryParams <- list("sourceSystem.code"="COL",
                    "defaultClassification.genus"="Mola")

## do the query
tax_data <- taxon_query(queryParams)

## access nested field 'acceptedName' -> 'specificEpithet'
tax_data$acceptedName$specificEpithet
## [1] "tecta"   "ramsayi" "mola"
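The position of a species in this vector is also its row index in the result, which matters when picking entries from list columns. A minimal sketch with a mock of the nested acceptedName data frame follows; the column genusOrMonomial is assumed to be present in the real result, and the epithets are copied from the output above.

```r
# Mock stand-in for tax_data$acceptedName.
accepted <- data.frame(genusOrMonomial = "Mola",   # assumed column name
                       specificEpithet = c("tecta", "ramsayi", "mola"),
                       stringsAsFactors = FALSE)

# full binomial names, in result order
binomials <- paste(accepted$genusOrMonomial, accepted$specificEpithet)
binomials
## [1] "Mola tecta"   "Mola ramsayi" "Mola mola"

# row index of a given species
idx <- which(binomials == "Mola mola")
idx
## [1] 3
```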

Let’s see if we can find vernacular (common) names for the species Mola mola, the third record in the result:

tax_data$vernacularNames[[3]]
##                          name                    language
## 1                        Hana                    Albanian
## 2                         Sol                  Portuguese
## 3                        Mula                    Corsican
## 4                        Mola                      Arabic
## 5                        Mola                     Italian
## 6                        Mola                     Maltese
## 7                        Mola                     Spanish
## 8                        Mola                     Spanish
## 9                    Moonfish                     English
## 10                    Pez sol                     Spanish
## 11                    Pez sol                     Spanish
## 12                  Peixe lua                  Portuguese
## 13                  Peixe lua                  Portuguese
## 14                  Peixe lua                     Spanish
## 15                       蜇鲂            Mandarin Chinese
## 16                       蜇鱼            Mandarin Chinese
## 17                  Peixe-lua                  Portuguese
## 18                  Peixe-lua                  Portuguese
## 19                  Peixe-lua                  Portuguese
## 20                       蜇魴            Mandarin Chinese
## 21                    Sunfish                     English
## 22                    Sunfish                     English
## 23                    Sunfish                     English
## 24                    Sunfish                     English
## 25                      Rolim                  Portuguese
## 26                    Orelhão                  Portuguese
## 27               Poisson lune                      Arabic
## 28               Poisson lune                      French
## 29                 Peixe-roda                  Portuguese
## 30                Pixxitambur                     Maltese
## 31                  Pixxiluna                     Maltese
## 32              Ocean sunfish                     English
## 33              Ocean sunfish                     English
## 34              Ocean sunfish                     English
## 35              Ocean sunfish                     English
## 36              Ocean sunfish                     English
## 37              Ocean sunfish                     English
## 38              Ocean sunfish                     English
## 39              Ocean sunfish                     English
## 40              Ocean sunfish                     English
## 41              Ocean sunfish                     English
## 42              Ocean sunfish                     English
## 43              Ocean sunfish                     English
## 44                   Matahari Malay (individual language)
## 45                   Sun-fish                     English
## 46                   Sun-fish                     English
## 47           Samaket el-shams                      Arabic
## 48               Poisson-lune                      French
## 49                  Maanvisch                       Dutch
## 50                  Klumpfisk                      Danish
## 51                  Klumpfisk                   Norwegian
## 52                  Klumpfisk                     Swedish
## 53                  Hout kmar                      Arabic
## 54                     Bucanj                    Croatian
## 55                     Bucanj                     Serbian
## 56                     Bucanj                     Serbian
## 57              Dag hashemesh                      Hebrew
## 58                    Bezedor                  Portuguese
## 59             Poisson soleil                      Arabic
## 60                     개복치                      Korean
## 61                Môle commun                      French
## 62                Môle commun                      French
## 63                   Boloublè                     Fon GBE
## 64                   Samoglów                      Polish
## 65          Veliki pešibarila                    Croatian
## 66                   Pešeluna                    Croatian
## 67                      Misec                    Croatian
## 68                        Mih                    Croatian
## 69                      Bačva                    Croatian
## 70                     翻车鲀            Mandarin Chinese
## 71                 海虫(澎湖)            Mandarin Chinese
## 72                 曼波(成功)            Mandarin Chinese
## 73               Raataahuihui                       Maori
## 74                      Manbô                    Japanese
## 75               Cá Mặt trăng                  Vietnamese
## 76                 Pez cabeza                     Spanish
## 77                        Lua                  Portuguese
## 78                        Lua                  Portuguese
## 79                       Môle                      French
## 80                       Môle                      French
## 81       Almindelig klumpfisk                      Danish
## 82                 Mánafiskur                     Faroese
## 83              Pysgodyn haul                       Welsh
## 84                 Pesce mola                     Italian
## 85                  луна-рыба                     Russian
## 86   Голова-рыба обыкновенная                     Russian
## 87  Korshid-mahi-e-oghyanoosi                     Persian
## 88                   Headfish                     English
## 89                  Mondfisch                      German
## 90              Opesee-sonvis                   Afrikaans
## 91             Pervane balığı                     Turkish
## 92                  Ay balığı                     Turkish
## 93                Φεγγαρόψαρο        Modern Greek (1453-)
## 94                Fegaropsaro        Modern Greek (1453-)
## 95                       蜇魚            Mandarin Chinese
## 96                 Peste luna                    Romanian
## 97                      Mambo                    Romanian
## 98                        Bot                     Catalan
## 99                      Qamar                     Maltese
## 100                Pixxi mola                     Maltese
## 101                  Pez luna                     Spanish
## 102                  Pez luna                     Spanish
## 103                   Pervane                     Turkish
## 104                 Möhkäkala                     Finnish
## 105                  Månefisk                   Norwegian
## 106                   Maanvis                       Dutch
## 107                  Niffâkha                      Arabic
## 108                    翻車魨            Mandarin Chinese
## 109                    翻車魨            Mandarin Chinese
## 110                海蟲(澎湖)            Mandarin Chinese
## 111                曼波(成功)            Mandarin Chinese
## 112              Mbamba kubwa                    Comorian
## 113                 Takabatra                    Malagasy
## 114                    Mbamba                    Comorian
## 115              Pesciu tondu                    Corsican
## 116                   Προπέλα        Modern Greek (1453-)
## 117               Peshku hënë                    Albanian
## 118                Peshkahana                    Albanian
## 119              Morski mesec                   Slovenian
## 120        Mola ocean sunfish                     English
## 121                Pisci mola                     Italian
## 122                Pisci luna                     Italian
## 123               Pesciu meua                     Italian
## 124               Pesciu luna                     Italian
## 125               Pescio meua                     Italian
## 126                Pesce luna                     Italian
## 127                      Luna                     Italian
## 128           Mola cocciulara                     Italian
## 129                Pesce bala                     Italian
## 130                 Mulacchia                     Italian
## 131                  Girasole                     Italian
## 132               Tunglfiskur                   Icelandic
## 133             Giant sunfish                     English
## 134           Harilik kuukala                    Estonian
## 135                     Bucan                    Croatian

Geo queries

The Geo data type in the NBA holds polygon data for countries, Dutch municipalities, Dutch nature reserves, etc. For more information, please refer to the API documentation. To retrieve, for example, the polygon of a country encoded in the GeoJSON format, we can query as follows:

geo_json <- geo_get_geo_json_for_locality('Nigeria')

Multimedia queries

Multimedia items accessible via the NBA include items captured from physical specimens (e.g. photos and videos) but also from human observations (e.g. recordings of bird sounds).

As an example, we will retrieve records that represent sounds that were recorded in the country Cape Verde. The sound data accessible via the NBA is stored in the Xeno-Canto database, hosted at the Naturalis Biodiversity Center. The field sourceSystem.code for these records is XC; the country of occurrence is stored in the field gatheringEvents.country.

queryParams <- list("sourceSystem.code"="XC",
                    "gatheringEvents.country"="Cape Verde")

mm_data <- multimedia_query(queryParams)

## Access link to Xeno-Canto database for each record:
mm_data$recordURI
##  [1] "https://data.biodiversitydata.nl/xeno-canto/observation/XC456747"
##  [2] "https://data.biodiversitydata.nl/xeno-canto/observation/XC456850"
##  [3] "https://data.biodiversitydata.nl/xeno-canto/observation/XC164033"
##  [4] "https://data.biodiversitydata.nl/xeno-canto/observation/XC156912"
##  [5] "https://data.biodiversitydata.nl/xeno-canto/observation/XC456724"
##  [6] "https://data.biodiversitydata.nl/xeno-canto/observation/XC456740"
##  [7] "https://data.biodiversitydata.nl/xeno-canto/observation/XC405554"
##  [8] "https://data.biodiversitydata.nl/xeno-canto/observation/XC405139"
##  [9] "https://data.biodiversitydata.nl/xeno-canto/observation/XC456707"
## [10] "https://data.biodiversitydata.nl/xeno-canto/observation/XC456710"
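Each record URI ends in an identifier of the form XC followed by digits, so base R's basename() is enough to extract it. The two URIs below are copied from the output above; reading the last path component as the Xeno-Canto identifier is an assumption based on the URL pattern.

```r
# Two record URIs copied from the output above.
record_uris <- c(
  "https://data.biodiversitydata.nl/xeno-canto/observation/XC456747",
  "https://data.biodiversitydata.nl/xeno-canto/observation/XC456850"
)

# the last path component of each URI
xc_ids <- basename(record_uris)
xc_ids
## [1] "XC456747" "XC456850"
```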

Limitations of wrapper functions

It is important to note that the querying power of the wrapper functions is limited. They correspond to the basic, human-readable NBA queries (see here).

  • Size of result set: As per the NBA default, wrapper functions only return the first 10 hits of a query.
  • Operators: Only full matches (operator EQUALS) are considered in wrapper query functions. Partial matching is only available in the full API client.
  • Logical conjunctions: If multiple query conditions are given, wrapper functions only allow a simple AND conjunction. For more complex logical query constructs including OR operators or negations, the full API client must be used.
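To make these three constraints concrete, the sketch below emulates them on an in-memory data frame. This is a local illustration only: emulate_wrapper_query() is a hypothetical function, the records are made up, and nothing here reflects how nbaR or the NBA is implemented.

```r
# Local emulation (NOT nbaR code): exact matches only, conditions combined
# with AND, and at most `size` hits returned.
emulate_wrapper_query <- function(df, query_params, size = 10) {
  keep <- rep(TRUE, nrow(df))
  for (field in names(query_params)) {
    # EQUALS: the whole value must match; AND: every condition must hold
    keep <- keep & !is.na(df[[field]]) & df[[field]] == query_params[[field]]
  }
  head(df[keep, , drop = FALSE], size)
}

# made-up records for illustration
records <- data.frame(
  genus   = c(rep("Mola", 12), "Molanna", "Masturus"),
  country = c(rep("Netherlands", 11), "Portugal", "Netherlands", "Netherlands"),
  stringsAsFactors = FALSE
)

# 11 records satisfy both conditions, but only the first 10 are returned
hits <- emulate_wrapper_query(records,
                              list(genus = "Mola", country = "Netherlands"))
nrow(hits)
## [1] 10

# a partial value such as "Mol" matches nothing
partial <- emulate_wrapper_query(records, list(genus = "Mol"))
nrow(partial)
## [1] 0
```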

The wrappers are thus designed to give easy access for simple queries. In many situations it might be necessary to use the full API client, which offers (almost) the entire functionality of the NBA API. Detailed documentation for the full client can be found here.