Make a DHS API client
Methods
dhs_api_request
Makes a call to the DHS website's API. You can make requests to any of its declared API endpoints (see
vignette(rdhs)
for more on these). API queries can be filtered by providing query terms, and you can control how many search results are returned. Setting num_results = "ALL" returns all of the results, and the response is formatted into a data.frame for you. N.B. This is now easier to do with the bespoke functions included in the package. These take the form dhs_<endpoint>, e.g. dhs_data. These functions can also take your client as an argument, which will cache the response for you.
Usage:
dhs_api_request(api_endpoint, query = list(), api_key = private$api_key, num_results = 100, just_results = TRUE)
Arguments:
api_endpoint
: API endpoint. Must be one of the 12 possible endpoints.
query
: List of query filters. To see the possible query filter terms for each endpoint, see the DHS API website.
api_key
: DHS API key. The default uses the key provided when the client was created.
num_results
: The number of results to return. Default = 100. Pass "ALL" to loop through all the API search result pages when there are more results than the API allows you to fetch in one page. If you specify a number, that many results will be returned.
just_results
: Boolean indicating whether to return just the results or the full HTTP API response. Default = TRUE (probably best to leave as is).
Value: Data.frame with search results if just_results=TRUE, otherwise a nested list with all the API responses for each page required.
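The page-looping behaviour described for num_results = "ALL" can be sketched in base R. This is only an illustration of the pattern, not the package's code: fetch_page, fetch_all, and the TotalPages and Data fields are hypothetical stand-ins for a paged API response, and no network call is made.

```r
# Hypothetical stand-in for one paged API call (not part of rdhs):
# returns one page of a 250-row result set plus the total page count.
fetch_page <- function(page, per_page = 100) {
  all_rows <- data.frame(id = seq_len(250), value = seq_len(250) * 2)
  total_pages <- ceiling(nrow(all_rows) / per_page)
  first <- (page - 1) * per_page + 1
  last <- min(page * per_page, nrow(all_rows))
  list(
    TotalPages = total_pages,
    Data = all_rows[first:last, , drop = FALSE]
  )
}

# Request pages until the reported total is reached, then bind the
# per-page data frames into a single data.frame.
fetch_all <- function(per_page = 100) {
  first_page <- fetch_page(1, per_page)
  pages <- list(first_page$Data)
  if (first_page$TotalPages > 1) {
    for (p in 2:first_page$TotalPages) {
      pages[[p]] <- fetch_page(p, per_page)$Data
    }
  }
  do.call(rbind, pages)
}

results <- fetch_all()
nrow(results)
```

Collecting the pages in a list and binding once at the end avoids repeatedly growing a data frame inside the loop.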
available_datasets
Searches the DHS website for all the datasets that you can download. The results of this function are cached in the client. If you have recently requested new datasets from the DHS website then you can specify to clear the cache first so that you get the new set of datasets available to you.
Usage:
available_datasets(clear_cache_first = FALSE)
Arguments:
clear_cache_first
: Boolean detailing if you would like to clear the cached available datasets first. The default is set to FALSE. This option is available so that you can make sure your client fetches any new datasets that you have recently been given access to.
Value: Data.frame object with 14 variables that detail the surveys you can download, their download URLs, and the country, survey, year, etc. for each link.
get_datasets
Gets datasets from your cache or downloads them from the DHS website. By providing the filenames, as specified in one of the returned fields from
dhs_datasets
, the client will log in for you and download all the files you have requested. If any of the requested files are unavailable for your login, these will be flagged first in a message so you can make a note and request them through the DHS website. You also have the option to control whether the downloaded zip file is then extracted and converted into a more convenient R data.frame. This converted object is then saved as a ".rds" object within the datasets folder of the client root directory, from where it can be loaded more quickly when needed with readRDS. You also have the option to reformat the dataset, which will ensure that a suitable parser is used to preserve the meta information in your dataset, such as what different survey response codes mean.
Usage:
get_datasets(dataset_filenames, download_option = "rds", reformat = FALSE, all_lower = TRUE, output_dir_root = file.path(private$root, "datasets"), clear_cache = FALSE, ...)
Arguments:
dataset_filenames
: The desired filenames to be downloaded. These can be found as one of the returned fields from dhs_datasets. Alternatively, you can pass the desired rows from dhs_datasets.
download_option
: Character specifying whether the dataset should be just downloaded ("zip"), imported and saved as an .rds object ("rds"), or both downloaded and saved as .rds ("both"). Conveniently, you can just specify any letter from these options.
reformat
: Boolean concerning whether to reformat read-in datasets by removing all factors and labels. Default = FALSE.
all_lower
: Logical indicating whether all value labels should be lower case. Default = `TRUE`.
output_dir_root
: Root directory in which the datasets will be stored. The default downloads datasets to a subfolder of the client root called "datasets".
clear_cache
: Whether your available datasets cache should be cleared first. This allows newly accessed datasets to become available. Default = `FALSE`.
...
: Any other arguments to be passed to read_dhs_dataset
Value: Depends on the download_option requested, but ultimately it is a file path to where the dataset was downloaded to, so that you can interact with it accordingly.
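The "save once as .rds, then load quickly with readRDS" idea can be illustrated with a small base R sketch. The helper load_or_cache, the dataset name "EXAMPLE", and the toy columns are hypothetical and are not part of the client; the sketch only shows why a second request for the same dataset is cheap.

```r
# Hypothetical helper (not part of rdhs): return the cached object if its
# .rds file exists, otherwise build it with `loader()`, save it, and read it.
load_or_cache <- function(name, loader, cache_dir) {
  path <- file.path(cache_dir, paste0(name, ".rds"))
  if (!file.exists(path)) {
    dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
    saveRDS(loader(), path)
  }
  readRDS(path)
}

# Use a fresh temporary folder as a stand-in for the client's datasets folder.
cache_dir <- tempfile("datasets")

calls <- 0
slow_loader <- function() {
  calls <<- calls + 1  # count how often the slow path runs
  data.frame(v001 = 1:3, v012 = c(21, 34, 45))
}

first  <- load_or_cache("EXAMPLE", slow_loader, cache_dir)  # builds and saves
second <- load_or_cache("EXAMPLE", slow_loader, cache_dir)  # read from disk only
```

Here slow_loader stands in for the download and parsing step, and it runs only on the first call.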
survey_questions
Use this function after download_survey to query downloaded surveys for what questions they asked. This function will look for the downloaded and imported survey datasets from the cache, and will download them if not previously downloaded.
Usage:
survey_questions(dataset_filenames, search_terms = NULL, essential_terms = NULL, regex = NULL, rm_na = TRUE, ...)
Arguments:
dataset_filenames
: The desired filenames to be downloaded. These can be found as one of the returned fields from dhs_datasets.
search_terms
: Character vector of search terms. If any of these terms are found within the surveys' question descriptions, the corresponding code and description will be returned.
essential_terms
: Character pattern that has to be in the description of survey questions, i.e. the function will first find all survey questions that contain your search terms (or regex) OR essential_terms. It will then remove any questions that do not contain your essential_terms. Default = NULL.
regex
: Regex character pattern for matching. Specify this argument if you want to supply your own regex search pattern. N.B. If both search_terms and regex are supplied, regex will be ignored.
rm_na
: Should NAs be removed. Default is `TRUE`.
...
: Any other arguments to be passed to download_datasets
Value: Data frame of the surveys where matches were found and then all the resultant codes and descriptions.
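The interaction between search_terms and essential_terms described in the arguments can be sketched on a toy table. The function match_questions and the question descriptions below are made up for illustration, and the case-insensitive matching is an assumption of this sketch rather than documented behaviour.

```r
# Toy table of question codes and descriptions (made-up values).
questions <- data.frame(
  variable = c("q1", "q2", "q3", "q4"),
  description = c(
    "Result of malaria rapid test",
    "Type of mosquito bed net",
    "Fever in last two weeks",
    "Malaria measurement result from blood smear"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical re-implementation of the rule: keep questions matching any
# search term OR the essential term, then drop those lacking the essential term.
match_questions <- function(df, search_terms = NULL, essential_terms = NULL) {
  desc <- tolower(df$description)
  pattern <- paste(tolower(c(search_terms, essential_terms)), collapse = "|")
  keep <- grepl(pattern, desc)
  if (!is.null(essential_terms)) {
    keep <- keep & grepl(tolower(essential_terms), desc)
  }
  df[keep, , drop = FALSE]
}

hits <- match_questions(questions,
                        search_terms = c("malaria", "fever"),
                        essential_terms = "result")
hits$variable
```

In this toy case "Fever in last two weeks" matches a search term but is dropped because it does not contain the essential term.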
survey_variables
Use this function after download_survey to look up all the surveys that have the provided codes.
Usage:
survey_variables(dataset_filenames, variables, essential_variables = NULL, rm_na = TRUE, ...)
Arguments:
dataset_filenames
: The desired filenames to be downloaded. These can be found as one of the returned fields from dhs_datasets.
variables
: Character vector of survey variables to be looked up.
essential_variables
: Character vector of variables that need to be present. If any of these codes are not present in a survey, that survey will not be returned by this function. Default = NULL.
rm_na
: Should NAs be removed. Default is `TRUE`.
...
: Any other arguments to be passed to download_datasets
Value: Data frame of the surveys where matches were found and then all the resultant codes and descriptions.
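The effect of essential_variables can be sketched in the same spirit. The survey names, the lookup list survey_codes, and the helper find_variables are hypothetical. The sketch only shows the "drop a survey unless every essential code is present" rule.

```r
# Toy lookup of which variable codes each survey file contains
# (survey names and contents are invented for illustration).
survey_codes <- list(
  SURVEY_A = c("hv024", "hml35", "hv005"),
  SURVEY_B = c("hv024", "hv005"),
  SURVEY_C = c("hml35")
)

# Hypothetical helper: for each survey, skip it unless all essential
# variables are present, then report which requested variables it has.
find_variables <- function(codes, variables, essential_variables = NULL) {
  out <- lapply(names(codes), function(s) {
    if (!all(essential_variables %in% codes[[s]])) return(NULL)
    found <- intersect(variables, codes[[s]])
    if (length(found) == 0) return(NULL)
    data.frame(survey = s, variable = found, stringsAsFactors = FALSE)
  })
  do.call(rbind, Filter(Negate(is.null), out))
}

res <- find_variables(survey_codes,
                      variables = c("hv024", "hml35"),
                      essential_variables = "hml35")
res
```

SURVEY_B contains one of the requested variables but is excluded because it lacks the essential one.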
extract
Function to extract datasets using a set of survey questions as taken from the output from
survey_questions
Usage:
extract(questions, add_geo = FALSE)
Arguments:
questions
: Questions to be queried, in the format returned by survey_questions
add_geo
: Add geographic information to the extract. Default = FALSE
get_variable_labels
Returns information about a dataset's survey variables and definitions.
Usage:
get_variable_labels(dataset_filenames = NULL, dataset_paths = NULL, rm_na = FALSE)
Arguments:
dataset_filenames
: Vector of dataset filenames to look up
dataset_paths
: Vector of file paths to where datasets have been saved
rm_na
: Should variables and labels with NAs be removed. Default = FALSE
Value: Data frame of survey variable names and definitions
get_cache_date
Returns the private member variable cache-date, which is the date the client was last created/validated against the DHS API.
Usage:
get_cache_date()
Value: POSIXct and POSIXt time
get_root
Returns the file path to the client's root directory
Usage:
get_root()
Value: Character string file path
get_config
Returns the client's configuration
Usage:
get_config()
Value: Config data.frame
get_downloaded_datasets
Returns a named list of all downloaded datasets and their file paths
Usage:
get_downloaded_datasets()
Value: List of dataset names and file paths.
set_cache_date
Sets the private member variable cache-date, which is the date the client was last created/validated against the DHS API. This should never really be needed but is included to demonstrate the cache clearing properties of the client in the vignette.
Usage:
set_cache_date(date)
Arguments:
date
: POSIXct and POSIXt time to update cache time to.
save_client
Internally save the client object as an .rds file within the root directory for the client.
Usage:
save_client()
clear_namespace
Clears the keys and values associated with a cache context. The DHS client caches a number of different tasks and places these within specific contexts using
storr::storr_rds
.
Usage:
clear_namespace(namespace)
Arguments:
namespace
: Character string for the namespace to be cleared.
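As a rough illustration of what clearing one namespace means in a storr cache, the sketch below uses storr directly rather than the client. The namespace names "available_datasets" and "api_calls" and the keys are hypothetical, and whether the client's method maps onto these exact calls is an assumption. The code is skipped if the storr package is not installed.

```r
if (requireNamespace("storr", quietly = TRUE)) {
  # A throwaway rds-backed store in a temporary directory.
  st <- storr::storr_rds(tempfile("storr_"))

  # Store values under two separate (hypothetical) namespaces.
  st$set("datasets_a", 1:3, namespace = "available_datasets")
  st$set("api_call_a", "cached response", namespace = "api_calls")

  # Clearing one namespace removes its keys and leaves the other intact.
  st$clear(namespace = "available_datasets")

  cleared   <- st$list(namespace = "available_datasets")
  remaining <- st$list(namespace = "api_calls")
}
```

Keeping each kind of cached task in its own namespace is what lets one group of results be invalidated without discarding the rest.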
Examples
if (FALSE) { # \dontrun{
# create an rdhs config file at "rdhs.json"
conf <- set_rdhs_config(
  config_path = "rdhs.json", global = FALSE, prompt = FALSE
)
td <- tempdir()
cli <- rdhs::client_dhs(api_key = "DEMO_1234", config = conf, root = td)
} # }