Convenience function to define and synchronize a bowerbird data collection
Source: R/get.R
bb_get.Rd
This is a convenience function that provides a shorthand method for synchronizing a small number of data sources. The call bb_get(...) is roughly equivalent to bb_sync(bb_add(bb_config(...), ...), ...). The dots are argument placeholders only and should not be taken literally.
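The equivalence described above can be sketched as follows. This is an illustration rather than the function's actual implementation: it assumes that the configuration-level arguments (local_file_root, clobber, http_proxy, ftp_proxy) are handled by bb_config and the synchronization-level arguments (create_root, verbose, confirm_downloads_larger_than, dry_run) by bb_sync. The example source is the one used in the Examples section; running the code requires the bowerbird package and network access.

```r
library(bowerbird)

## an example data source shipped with bowerbird
my_source <- bb_example_sources("Australian Election 2016 House of Representatives data")
root <- tempdir()

## shorthand form: define and synchronize in one call
status <- bb_get(my_source, local_file_root = root, clobber = 1, verbose = TRUE)

## roughly equivalent longhand form (assumed split of arguments, see above):
## 1. create a configuration pointing at the local data root
cf <- bb_config(local_file_root = root, clobber = 1)
## 2. add the data source to the configuration
cf <- bb_add(cf, my_source)
## 3. run the synchronization
status <- bb_sync(cf, verbose = TRUE)
```

The longhand form is the better choice when the configuration needs to be kept, inspected, or extended with further sources before synchronizing.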
Usage
bb_get(
  data_sources,
  local_file_root,
  clobber = 1,
  http_proxy = NULL,
  ftp_proxy = NULL,
  create_root = FALSE,
  verbose = FALSE,
  confirm_downloads_larger_than = 0.1,
  dry_run = FALSE,
  ...
)
Arguments
- data_sources
tibble: one or more data sources to download, as returned by e.g. bb_example_sources
- local_file_root
string: location of data repository on local file system
- clobber
numeric: 0=do not overwrite existing files, 1=overwrite if the remote file is newer than the local copy, 2=always overwrite existing files
- http_proxy
string: URL of HTTP proxy to use e.g. 'http://your.proxy:8080' (NULL for no proxy)
- ftp_proxy
string: URL of FTP proxy to use e.g. 'http://your.proxy:21' (NULL for no proxy)
- create_root
logical: should the data root directory be created if it does not exist? If this is FALSE (default) and the data root directory does not exist, an error will be generated
- verbose
logical: if TRUE, provide additional progress output
- confirm_downloads_larger_than
numeric or NULL: if non-negative, bb_sync will ask the user for confirmation to download any data source of size greater than this number (in GB). A value of zero will trigger confirmation on every data source. A negative or NULL value will not prompt for confirmation. Note that this only applies when R is being used interactively. The expected download size is taken from the collection_size parameter of the data source, and so its accuracy is dependent on the accuracy of the data source definition
- dry_run
logical: if TRUE, bb_sync will do a dry run of the synchronization process without actually downloading files
- ...
additional parameters passed through to bb_config or bb_sync
Value
a tibble, as for bb_sync
Examples
if (FALSE) { # \dontrun{
my_source <- bb_example_sources("Australian Election 2016 House of Representatives data")
status <- bb_get(local_file_root = tempdir(), data_sources = my_source, verbose = TRUE)
## the files that have been downloaded:
status$files[[1]]
## Define a new source: Geelong bicycle paths from data.gov.au
my_source <- bb_source(
  name = "Bike Paths - Greater Geelong",
  id = "http://data.gov.au/dataset/7af9cf59-a4ea-47b2-8652-5e5eeed19611",
  doc_url = "https://data.gov.au/dataset/geelong-bike-paths",
  citation = "See https://data.gov.au/dataset/geelong-bike-paths",
  source_url = "https://data.gov.au/dataset/7af9cf59-a4ea-47b2-8652-5e5eeed19611",
  license = "CC-BY",
  method = list("bb_handler_rget", accept_download = "\\.zip$", level = 1),
  postprocess = list("bb_unzip"))
## get the data
status <- bb_get(data_sources = my_source, local_file_root = tempdir(), verbose = TRUE)
## find the .shp file amongst the files, and plot it
shpfile <- status$files[[1]]$file[grepl("shp$", status$files[[1]]$file)]
library(sf)
bx <- st_read(shpfile)
plot(bx)
} # }