rdpla: R client for Digital Public Library of America

Digital Public Library of America brings together metadata from libraries, archives, and museums in the US, and makes it freely available via their web portal as well as an API. DPLA’s portal and API don’t provide the items themselves from contributing institutions, but they provide links to make it easy to find things. The kinds of things DPLA holds metadata for include images of works held in museums, photographs from various photographic collections, texts, sounds, and moving images.

DPLA has a great API with good documentation - a rare thing in this world. Further documentation on their API can be found on their search fields and examples of queries. Metadata schema information here.

DPLA data data can be used for a variety of use cases in various academic and non-academic fields. Here are some examples (vignettes to come soon showing examples):

  • Search for all photos of churches and make vizualization of their metadata through time
  • Visualize data from individual collections - a maintainer of a collection could gain insight from via DPLA data
  • Search for all works within a spatial area, map results

DPLA API has two main services (quoting from their API docs):

  • items: A reference to the digital representation of a single piece of content indexed by the DPLA. The piece of content can be, for example, a book, a photograph, a video, etc. The content is digitized from its original, physical source and uploaded to an online repository.
  • collections: A collection is a little more abstract than an item. Where an item is a reference to the digital representation of a physical object, a collection is a representation of the grouping of a set of items.

rdpla also has an interface (dpla_bulk) to download bulk and compressed JSON data.

Note that you can only run examples/vignette/tests if you have an API key. See ?dpla_get_key to get an API key.

Tutorials

There are two vignettes. After installation check them out. If installing from GitHub, do devtools::install_github("ropensci/rdpla", build_vignettes = TRUE)

  • Introduction to rdpla
  • rdpla use case: vizualize churches across DPLA holdings

Installation

Stable version from CRAN

Dev version from GitHub:

install.packages("devtools")
devtools::install_github("ropensci/rdpla")
library('rdpla')

Authentication

You need an API key to use the DPLA API. Use dpla_get_key() to request a key, which will then be emailed to you. Pass in the key in the key parameter in functions in this package or you can store the key in your .Renviron as DPLA_API_KEY or in your .Rprofile file under the name dpla_api_key.

Search - items

Note: limiting fields returned for readme brevity.

Basic search

Limit fields returned

Limit records returned

Search by date

Search on specific fields

Spatial search, across all spatial fields

Spatial search, by states

Faceted search

Visualize

Visualize metadata from the DPLA - histogram of number of records per state (includes states outside the US)

out <- dpla_items(facets="sourceResource.spatial.state", page_size=0, facet_size=25)
library("ggplot2")
library("scales")
ggplot(out$facets$sourceResource.spatial.state$data, aes(reorder(term, count), count)) +
  geom_bar(stat="identity") +
  coord_flip() +
  theme_grey(base_size = 16) +
  scale_y_continuous(labels = comma) +
  labs(x="State", y="Records")

Meta

ropensci