Secrets often turn up in API work. A common example is an API key. vcr saves responses from APIs as YAML files, and these will include your secrets unless you tell vcr what they are and how to protect them. The vcr_configure function has the filter_sensitive_data argument for just this situation. The filter_sensitive_data argument takes a named list in which each name is the string that will be used in the recorded cassettes instead of the secret, and each value is the secret itself. vcr will manage the replacement for you, so all you need to do is edit your helper-vcr.R file, which might start out like this:
library("vcr") # *Required* as vcr is set up on loading
invisible(vcr::vcr_configure(
  dir = "../fixtures"
))
vcr::check_cassette_names()
Use the filter_sensitive_data argument in the vcr_configure function to show vcr how to keep your secret. The best way to store secret information is to have it in a .Renviron file. Assuming that is already in place, supply a named list to the filter_sensitive_data argument.
library("vcr")
invisible(vcr::vcr_configure(
  filter_sensitive_data = list("<<<my_api_key>>>" = Sys.getenv('APIKEY')), # add this
  dir = "../fixtures"
))
vcr::check_cassette_names()
Notice we wrote Sys.getenv('APIKEY') and not the API key directly; otherwise you’d have written your API key to a file that might end up in a public repo.
This will get your secret information from the environment and make sure that whenever vcr records a new cassette, it replaces the secret information with <<<my_api_key>>>. You can find out more about this in the HTTP testing book chapter on security.
The addition of the line above instructs vcr to replace any string in the cassettes it records that matches the value stored in the APIKEY environment variable with the masking string <<<my_api_key>>>.
In practice, you might get a YAML file that looks a little like this:
http_interactions:
- request:
    method: post
    ...
    headers:
      Accept: application/json, text/xml, application/xml, */*
      Content-Type: application/json
      api-key: <<<my_api_key>>>
    ...
Here, the value of my APIKEY environment variable would have been stored as the api-key value, but vcr has realised this and recorded the string <<<my_api_key>>> instead.
Once the cassette is recorded, vcr no longer needs the API key, as no real requests will be made. Furthermore, because request matching does not include the API key by default, replaying the cassette still works.
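As a sanity check on the masking described above, one could scan the recorded cassettes for the raw secret before committing them. The following is a minimal base R sketch, not part of vcr: the helper name find_leaks, the demo directory and the fake secret value are all hypothetical.

```r
# Hypothetical helper (not a vcr function): returns the cassette files
# under `dir` that contain `secret` verbatim.
find_leaks <- function(dir, secret) {
  stopifnot(nzchar(secret)) # an empty string would match every line
  files <- list.files(dir, pattern = "\\.ya?ml$", full.names = TRUE,
                      recursive = TRUE)
  leaked <- vapply(files, function(f) {
    any(grepl(secret, readLines(f, warn = FALSE), fixed = TRUE))
  }, logical(1))
  files[leaked]
}

# Demonstration on a throwaway directory holding two fake cassettes.
fixtures <- file.path(tempdir(), "fixtures-demo")
dir.create(fixtures, showWarnings = FALSE)
writeLines("api-key: <<<my_api_key>>>", file.path(fixtures, "masked.yml"))
writeLines("api-key: s3cr3t-value", file.path(fixtures, "leaky.yml"))

leaks <- find_leaks(fixtures, "s3cr3t-value")
basename(leaks)
```

In a real project the call would presumably look like find_leaks("tests/fixtures", Sys.getenv("APIKEY")), with the directory adjusted to wherever your cassettes live.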
Now, how do you ensure tests work in the absence of a real API key, e.g. so that tests pass on continuous integration for external pull requests to your code repository?
- vcr does not need an actual API key for requests once the cassettes are created, as no real requests will be made.
- You still need to fool your package into believing there is an API key, as it will construct requests with it. So add the following lines to a testthat setup file (e.g. tests/testthat/helper-vcr.R):
if (!nzchar(Sys.getenv("APIKEY"))) {
  Sys.setenv("APIKEY" = "foobar")
}
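To see why the placeholder key matters, consider a package that refuses to build a request when no key is set. The sketch below assumes a hypothetical accessor, get_api_key(), standing in for whatever your package actually does. It simulates a machine without the key and then applies the fallback shown above.

```r
# Hypothetical accessor, of the kind a package might use to build requests.
get_api_key <- function() {
  key <- Sys.getenv("APIKEY")
  if (!nzchar(key)) {
    stop("No API key found: set the APIKEY environment variable.",
         call. = FALSE)
  }
  key
}

# Remember the current value so the demonstration can be undone.
old <- Sys.getenv("APIKEY", unset = NA)

# Simulate a CI machine with no key: the accessor fails.
Sys.unsetenv("APIKEY")
failed_without_key <- inherits(try(get_api_key(), silent = TRUE), "try-error")

# Apply the fallback from the setup file: the accessor now succeeds.
if (!nzchar(Sys.getenv("APIKEY"))) {
  Sys.setenv("APIKEY" = "foobar")
}
key_with_fallback <- get_api_key()

# Restore whatever was set before the demonstration.
if (is.na(old)) Sys.unsetenv("APIKEY") else Sys.setenv(APIKEY = old)
```

The fake value never reaches the API, because with a recorded cassette no real request is made.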
Using an .Renviron
A simple way to manage local environment variables is to use an .Renviron file, which holds one NAME=value pair per line.
You can have this set at a project or user level, and usethis has the usethis::edit_r_environ() function to help edit the file.
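For illustration, the sketch below writes a throwaway file in .Renviron format and loads it with base R's readRenviron(). The variable name APIKEY_DEMO and its value are placeholders chosen for this example, so that a real APIKEY in your session is left alone.

```r
# Write a throwaway file in .Renviron format: one NAME=value pair per line.
# APIKEY_DEMO and its value are placeholders, not a real key.
renviron <- tempfile(fileext = ".Renviron")
writeLines("APIKEY_DEMO=not-a-real-key", renviron)

# readRenviron() loads the pairs into the current session's environment.
readRenviron(renviron)
demo_key <- Sys.getenv("APIKEY_DEMO")

# Clean up the demonstration variable and file.
Sys.unsetenv("APIKEY_DEMO")
unlink(renviron)
```

R normally reads the user-level or project-level .Renviron at startup, so an explicit readRenviron() call is only needed after editing the file in a running session.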
httr2 request headers
vcr automatically redacts request headers that are marked (via attributes) as redacted when using the httr2 package. This includes the following functions:
There is no way to avoid this behavior other than not redacting the request headers.
Configuration
filter_sensitive_data
A named list of values to replace. Sometimes your package or script is working with sensitive tokens/keys, which you do not want to accidentally share with the world.
Before recording (writing to a cassette) we do the replacement and then when reading from the cassette we do the reverse replacement to get back to the real data.
vcr_configure(
  filter_sensitive_data = list("<some_api_key>" = Sys.getenv('MY_API_KEY'))
)
Before recording to disk, the env var MY_API_KEY is retrieved from your machine, and we find instances of it and replace them with <some_api_key>. When replaying to create the HTTP response object, we put the real value of the env var back in place.
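The replace-then-restore round trip can be pictured with plain string substitution. This is a minimal base R sketch of the idea, not vcr's actual implementation. The function names mask_secrets and unmask_secrets and the value "abc123" are hypothetical.

```r
# Replace each secret (the list values) with its placeholder (the list names).
mask_secrets <- function(text, filters) {
  for (placeholder in names(filters)) {
    secret <- filters[[placeholder]]
    # Skip unset secrets: an empty pattern cannot be matched meaningfully.
    if (nzchar(secret)) {
      text <- gsub(secret, placeholder, text, fixed = TRUE)
    }
  }
  text
}

# Reverse direction: put the real values back in place of the placeholders.
unmask_secrets <- function(text, filters) {
  for (placeholder in names(filters)) {
    text <- gsub(placeholder, filters[[placeholder]], text, fixed = TRUE)
  }
  text
}

filters <- list("<some_api_key>" = "abc123") # stand-in for Sys.getenv("MY_API_KEY")
recorded <- mask_secrets("token=abc123&page=2", filters) # what goes to disk
replayed <- unmask_secrets(recorded, filters)            # what the code sees
```

Using fixed = TRUE treats the secret as a literal string, so characters that are special in regular expressions do not change what gets matched.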
To target specific request or response headers, see filter_request_headers and filter_response_headers.
filter_request_headers
Expects a character vector or a named list. If a character vector, or any unnamed element in a list, the request header is removed before being written to the cassette.
If a named list is passed, the name is the header and the value is the value with which to replace the real value.
A request header you set to remove or replace is only removed or replaced in the cassette and in any requests that use the cassette; it will still be present in your crul, httr or httr2 response objects on the real request that creates the cassette.
Note that, for the httr2 package only, we automatically redact request headers that are marked (via attributes) as redacted.
Examples:
vcr_configure(
  filter_request_headers = "Authorization"
)
vcr_configure(
  filter_request_headers = c("Authorization", "User-Agent")
)
vcr_configure(
  filter_request_headers = list(Authorization = "<<<not-my-bearer-token>>>")
)
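The difference between removing and replacing a header can be sketched in base R on a plain named list. This illustrates the rules described for filter_request_headers and is not vcr's own code. The function apply_header_filter is hypothetical, and it matches header names exactly, which may differ from how vcr compares them.

```r
# Hypothetical illustration: unnamed filter elements remove a header,
# named elements replace its value.
apply_header_filter <- function(headers, filter) {
  filter <- as.list(filter)
  nms <- names(filter)
  if (is.null(nms)) nms <- rep("", length(filter))
  for (i in seq_along(filter)) {
    if (nzchar(nms[i])) {
      # Named element: the name is the header, the value is the replacement.
      if (nms[i] %in% names(headers)) headers[[nms[i]]] <- filter[[i]]
    } else {
      # Unnamed element: the value is the name of the header to drop.
      headers[[filter[[i]]]] <- NULL
    }
  }
  headers
}

headers <- list(
  Authorization = "Bearer real-token",
  `User-Agent` = "demo-client",
  Accept = "application/json"
)

# Drop User-Agent and mask Authorization in one mixed list.
filtered <- apply_header_filter(
  headers,
  list("User-Agent", Authorization = "<<<not-my-bearer-token>>>")
)
```

The original headers list is left unchanged, mirroring the point above that the real request still carries the real values.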
filter_response_headers
Expects a character vector or a named list. If a character vector, or any unnamed element in a list, the response header is removed before being written to the cassette.
If a named list is passed, the name is the header and the value is the value with which to replace the real value.
A response header you set to remove or replace is only removed or replaced in the cassette and in any requests that use the cassette; it will still be present in your crul, httr or httr2 response objects on the real request that creates the cassette.
Examples:
vcr_configure(
  filter_response_headers = "server"
)
vcr_configure(
  filter_response_headers = c("server", "date")
)
vcr_configure(
  filter_response_headers = list(server = "fake-server")
)
filter_query_parameters
Expects a character vector or a named list. If a character vector, or any unnamed element in a list, the query parameter is removed (both parameter name and value) before being written to the cassette.
If a named list is passed, the name is the query parameter name and the value is the value with which to replace the real value.
A query parameter you set to remove or replace is only removed or replaced in the cassette and in any requests that use the cassette; it will still be present in your crul, httr or httr2 response objects on the real request that creates the cassette.
Beware of your match_requests_on option when using this filter. If you filter out a query parameter, it’s probably a bad idea to match on query, given that there is no way for vcr to restore the exact HTTP request from your cassette after one or more query parameters are removed or changed. One way you could filter a query parameter and still match on query, or at least on the complete uri, is to use replacement behavior (a named list), but instead of list(a="b") use two values, list(a=c("b","c")), where “c” is the string to be stored in the cassette. You could of course replace those values with values from environment variables so that you obscure the real values if your code is public.
Examples:
# completely drop parameter "user"
vcr_configure(
  filter_query_parameters = "user"
)
# completely drop parameters "user" and "api_key"
vcr_configure(
  filter_query_parameters = c("user", "api_key")
)
# replace the value of parameter "api_key" with "fake-api-key"
# NOTE: in this case there's no way to put back any value on
# subsequent requests, so we have to match by dropping this
# parameter value before comparing URIs
vcr_configure(
  filter_query_parameters = list(api_key = "fake-api-key")
)
# replace the value found at Sys.getenv("MY_API_KEY") of parameter
# "api_key" with the value "foo". When using a cassette on subsequent
# requests, we can replace "foo" with the value at Sys.getenv("MY_API_KEY")
# before doing the URI comparison
vcr_configure(
  filter_query_parameters = list(api_key = c(Sys.getenv("MY_API_KEY"), "foo"))
)