Downloads data from Environment and Climate Change Canada (ECCC) for one or more stations. For details and units, see the glossary vignette (vignette("glossary", package = "weathercan")) or the online glossary at https://climate.weather.gc.ca/glossary_e.html.

Usage

weather_dl(
  station_ids,
  start = NULL,
  end = NULL,
  interval = "hour",
  trim = TRUE,
  format = TRUE,
  string_as = NA,
  time_disp = "none",
  stn = NULL,
  encoding = "UTF-8",
  list_col = FALSE,
  verbose = FALSE,
  quiet = FALSE
)

Arguments

station_ids

Numeric/Character. A vector containing the ID(s) of the station(s) you wish to download data from. See the stations data frame or the stations_search function to find IDs.

start

Date/Character. The start date of the data in YYYY-MM-DD format (applies to all station_ids). Defaults to start of range.

end

Date/Character. The end date of the data in YYYY-MM-DD format (applies to all station_ids). Defaults to end of range.

interval

Character. Interval of the data, one of "hour", "day", "month".

trim

Logical. Trim missing values from the start and end of the weather data frame. Only applies if format = TRUE.

format

Logical. If TRUE, formats data for immediate use. If FALSE, returns data exactly as downloaded from Environment and Climate Change Canada. Useful for dealing with changes by Environment Canada to the format of data downloads.

string_as

Character. What value to replace character strings in a numeric measurement with. See Details.

time_disp

Character. Either "none" (default) or "UTC". See Details.

stn

DEFUNCT. Now use stations_dl() to update internal data and stations_meta() to check the date it was last updated.

encoding

Character. Text encoding for download.

list_col

Logical. Return data as a nested data set? Defaults to FALSE. Only applies if format = TRUE.

verbose

Logical. Include progress messages.

quiet

Logical. Suppress all messages (including messages regarding missing data, etc.)

Value

A tibble with station ID, name and weather data.

Details

Data can be returned 'raw' (format = FALSE) or can be formatted. Formatting transforms dates/times to date/time class, renames columns, and converts data to numeric where possible. If character strings are contained in traditionally numeric fields (e.g., wind speed may have values such as "< 30"), they can be replaced with the value specified by string_as. The default is NA. Formatting also replaces data associated with certain flags with NA (M = Missing).
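The replacement of character strings can be illustrated in base R. This is only a minimal sketch of the idea, not the package's internal code: clean_numeric() and the raw_speed values are hypothetical and exist purely for illustration.

```r
# Hypothetical raw values, as they might appear in a wind speed column
raw_speed <- c("32", "45", "< 30", "")

# Hypothetical helper (not part of weathercan): replace strings that
# cannot be parsed as numbers with `string_as`, then convert to numeric
clean_numeric <- function(x, string_as = NA) {
  # TRUE for non-empty, non-missing values that are not valid numbers
  not_numeric <- !is.na(x) & x != "" &
    is.na(suppressWarnings(as.numeric(x)))
  x[not_numeric] <- string_as
  suppressWarnings(as.numeric(x))
}

# Default: "< 30" becomes NA (the empty string is also NA once numeric)
speed_default <- clean_numeric(raw_speed)

# Alternative: mark such strings with a sentinel value instead
speed_flagged <- clean_numeric(raw_speed, string_as = -999)
```

Using NA keeps the column numeric without inventing a measurement; a sentinel such as -999 keeps those records distinguishable but must be filtered out before any summary statistics are computed.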

Start and end dates can be specified. If they are not, they default to the start and end dates of the available range (this could result in downloading a lot of data!).

For hourly data, timezones are always "UTC", but the actual times are either local time (default; time_disp = "none"), or UTC (time_disp = "UTC"). When time_disp = "none", times reflect the local time without daylight saving time. This means that relative measures of time, such as "nighttime", "daytime", "dawn", and "dusk" are comparable among stations in different timezones. This is useful for comparing daily cycles. When time_disp = "UTC" the times are transformed into UTC timezone. Thus midnight in Kamloops would register as 08:00:00 (Pacific time is 8 hours behind UTC). This is useful for tracking weather events through time, but will result in odd 'daily' measures of weather (e.g., data collected at or after 16:00 on Sept 1 in Kamloops will be recorded as being collected on Sept 2 in UTC).
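The Kamloops arithmetic can be checked in base R without downloading anything. This is a sketch under the assumption that Kamloops local standard time is a fixed UTC-8 offset; the variable names are illustrative, and the code mimics the two displays rather than calling weather_dl().

```r
# Local standard time for Kamloops as a fixed offset (no daylight saving).
# POSIX sign convention: "Etc/GMT+8" means 8 hours *behind* UTC.
local_midnight  <- as.POSIXct("2016-09-01 00:00:00", tz = "Etc/GMT+8")
local_afternoon <- as.POSIXct("2016-09-01 17:00:00", tz = "Etc/GMT+8")

# Like time_disp = "UTC": the same instants expressed in UTC
utc_midnight  <- format(local_midnight, "%Y-%m-%d %H:%M:%S", tz = "UTC")
utc_afternoon <- format(local_afternoon, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Like time_disp = "none": the local clock time is kept, but the
# timezone attribute is labelled "UTC"
none_midnight <- as.POSIXct(
  format(local_midnight, "%Y-%m-%d %H:%M:%S"), tz = "UTC"
)

print(utc_midnight)
print(utc_afternoon)
print(format(none_midnight, "%Y-%m-%d %H:%M:%S", usetz = TRUE))
```

Midnight local time maps to 08:00:00 UTC on the same date, while 17:00 local time lands on the following UTC date, which is why daily summaries computed on UTC times can look odd.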

Files are downloaded from the URL stored in getOption("weathercan.urls.weather"). To change this location use options(weathercan.urls.weather = "your_new_url").
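When changing the download location temporarily, it may help to save and restore the previous value. The following is a minimal sketch using only base R options(); "your_new_url" is the same placeholder as above, not a real address.

```r
# Remember the current value so it can be restored afterwards
# (this is NULL if weathercan has not been loaded in this session)
old_url <- getOption("weathercan.urls.weather")

# Point downloads at a different location ("your_new_url" is a placeholder)
options(weathercan.urls.weather = "your_new_url")
current_url <- getOption("weathercan.urls.weather")

# ... calls to weather_dl() would go here ...

# Restore the previous value (setting an option to NULL removes it)
options(weathercan.urls.weather = old_url)
```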

Data is downloaded from ECCC as a series of files which are then bound together. Each file corresponds to a different month, or year, depending on the interval. Metadata (station name, lat, lon, elevation, etc.) is extracted from the start of the most recent file (i.e. most recent dates) for a given station. Note that important data (e.g., station name, lat, lon) is unlikely to change between files (i.e. dates), but some data may or may not be available depending on the date of the file (e.g., station operator was added as of April 1st 2018, so it will be in all data which includes dates on or after April 1st 2018).

Examples

if (FALSE) { # check_eccc()

# Download hourly data (the default interval) for a single station
kam <- weather_dl(station_ids = 51423,
                  start = "2016-01-01", end = "2016-02-15")

# Look up station IDs by name
stations_search("Kamloops A$", interval = "hour")
stations_search("Prince George Airport", interval = "hour")

# Download data for two stations at once
kam.pg <- weather_dl(station_ids = c(48248, 51423),
                     start = "2016-01-01", end = "2016-02-15")

library(ggplot2)

# Plot temperature over time, one line per station
ggplot(data = kam.pg, aes(x = time, y = temp,
                          group = station_name,
                          colour = station_name)) +
       geom_line()
}