The rdefra package allows users to retrieve air pollution data from the Air Information Resource (UK-AIR) of the Department for Environment, Food and Rural Affairs in the United Kingdom. UK-AIR does not provide a public API for programmatic access to data, so this package scrapes the HTML pages to get the relevant information.
This package follows a logic similar to other packages such as waterData and rnrfa: sites are first identified through a catalogue, data are imported via the station identification number, then data are visualised and/or used in analyses. The metadata related to the monitoring stations are accessible through the function ukair_catalogue(), missing stations’ coordinates can be obtained using the function ukair_get_coordinates(), and time series data related to different pollutants can be obtained using a dedicated data retrieval function.
DEFRA’s servers can handle multiple data requests, so concurrent calls can be sent using the parallel package. Although the rate limit depends on the maximum number of concurrent calls, traffic and available infrastructure, data retrieval is very efficient: multiple years of data for hundreds of sites can be downloaded in only a few minutes.
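As a rough illustration of concurrent retrieval, the sketch below applies a download function to several sites using a small local cluster from the parallel package. The helper name `fetch_sites_in_parallel`, its arguments and the placeholder `my_site_ids` are hypothetical. The example call also assumes that the package's hourly-data function is `ukair_get_hourly_data()` and that it accepts a site ID and a `years` argument.

```r
library(parallel)

# Hypothetical helper: apply `fetch_fun` to each site ID on a local cluster.
# `fetch_fun` must take a site ID as its first argument and accept `years`.
fetch_sites_in_parallel <- function(site_ids, years, fetch_fun, n_workers = 2L) {
  cl <- makeCluster(n_workers)
  # Always release the workers, even if a download fails
  on.exit(stopCluster(cl), add = TRUE)
  results <- parLapply(cl, site_ids, fetch_fun, years = years)
  # Name each result after the site it belongs to
  setNames(results, site_ids)
}

# Example call (assumes rdefra is installed and an internet connection;
# `my_site_ids` is a placeholder for a character vector of site IDs):
# series <- fetch_sites_in_parallel(my_site_ids, 2014:2015,
#                                   rdefra::ukair_get_hourly_data)
```

Keeping the number of workers small is a conservative choice, since the sustainable request rate depends on server traffic.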
For similar functionality, see also the openair package, which relies on a local copy of the data on servers at King’s College (UK), and the ropenaq package, which provides UK-AIR latest measured levels (see https://uk-air.defra.gov.uk/latest/currentlevels) as well as data from other countries.
The rdefra package depends on two things:
The Geospatial Data Abstraction Library (GDAL).
Some additional CRAN packages. Check for missing dependencies and install them using the commands below:
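A minimal sketch of such a check is given here. The package names in `packs` are illustrative placeholders, not the package's actual dependency list, and `find_missing_packages` is a hypothetical helper; replace the vector with the dependencies declared in the package's DESCRIPTION file.

```r
# Hypothetical helper: return the packages in `packs` that are not installed
find_missing_packages <- function(packs) {
  setdiff(packs, rownames(installed.packages()))
}

# Placeholder list (an assumption, not the real dependency list)
packs <- c("httr", "xml2")
new_packages <- find_missing_packages(packs)

# Install only in an interactive session, and only what is missing
if (interactive() && length(new_packages) > 0) {
  install.packages(new_packages)
}
```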
The package logic assumes that users access the UK-AIR database in the following steps:
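Based on the workflow described earlier (catalogue, coordinates, time series), the steps might be sketched as below. This is an illustration under assumptions, not the package's documented procedure: the wrapper `retrieve_site_data` is hypothetical, `ukair_get_hourly_data()` is assumed to be the hourly-data function with `site_id` and `years` arguments, and the site ID in the example call is a placeholder.

```r
# Hypothetical wrapper around the three stages of the workflow
retrieve_site_data <- function(site_id, years) {
  # 1. Browse the catalogue of monitoring stations
  stations <- rdefra::ukair_catalogue()
  # 2. Fill in missing station coordinates
  stations <- rdefra::ukair_get_coordinates(stations)
  # 3. Retrieve time series for one station (assumed function and arguments)
  series <- rdefra::ukair_get_hourly_data(site_id = site_id, years = years)
  list(stations = stations, series = series)
}

# Example call (requires rdefra and an internet connection;
# "ABD" is a placeholder site ID):
# result <- retrieve_site_data("ABD", years = 2014:2015)
```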
For an in-depth description of the various functionalities and example applications, please refer to the package vignette.
Get citation information for rdefra in R by running:
citation(package = "rdefra")