This vignette demonstrates the easiest way to use autotest, which is to
apply it continuously through the entire process of package development.
The best way to understand the process is to obtain a local copy of the
vignette itself from this link, and step through the code. We begin by
constructing a simple package in the local tempdir().
Package Construction
To create a package in one simple line, we use usethis::create_package(),
and name our package "demo".
path <- file.path (tempdir (), "demo")
usethis::create_package (path, check_name = FALSE, open = FALSE)
#> ✔ Creating /tmp/Rtmp5zQuGD/demo/.
#> ✔ Setting active project to "/tmp/Rtmp5zQuGD/demo".
#> ✔ Creating R/.
#> ✔ Writing DESCRIPTION.
#> Package: demo
#> Title: What the Package Does (One Line, Title Case)
#> Version: 0.0.0.9000
#> Authors@R (parsed):
#> * First Last <first.last@example.com> [aut, cre] (YOUR-ORCID-ID)
#> Description: What the package does (one paragraph).
#> License: `use_mit_license()`, `use_gpl3_license()` or friends to
#> pick a license
#> Encoding: UTF-8
#> Roxygen: list(markdown = TRUE)
#> RoxygenNote: 7.3.2
#> ✔ Writing NAMESPACE.
#> ✔ Setting active project to "<no active project>".
The structure looks like this:
fs::dir_tree (path)
#> /tmp/Rtmp5zQuGD/demo
#> ├── DESCRIPTION
#> ├── NAMESPACE
#> └── R
Having constructed a minimal package structure, we can then insert
some code in the R/ directory, including initial roxygen2 documentation
lines, and use the roxygenise() function to create the corresponding man
files. autotest works by parsing and running “example” code from function
documentation, so our code needs to include at least one example line.
code <- c ("#' my_function",
"#'",
"#' @param x An input",
"#' @return Something else",
"#' @examples",
"#' y <- my_function (x = 1)",
"#' @export",
"my_function <- function (x) {",
" return (x + 1)",
"}")
writeLines (code, file.path (path, "R", "myfn.R"))
roxygen2::roxygenise (path)
#> ℹ Loading demo
#> Writing NAMESPACE
#> Writing my_function.Rd
Our package now looks like this:
fs::dir_tree (path)
#> /tmp/Rtmp5zQuGD/demo
#> ├── DESCRIPTION
#> ├── NAMESPACE
#> ├── R
#> │ └── myfn.R
#> └── man
#> └── my_function.Rd
We can already apply autotest to that package to see what happens, first
ensuring that we’ve loaded the package ready to use.
library (autotest)
x0 <- autotest_package (path)
#> ℹ Loading autotest
#>
#>
#> ── autotesting demo ──
#>
#>
#>
#> ✔ [1 / 1]: my_function
We use the DT package to display the results here. The first thing to
notice is the first column, which has test_type = "dummy" for all rows.
The autotest_package() function has a parameter test with a default value
of FALSE, so the default call demonstrated above does not actually run
the tests. Rather, it returns an object listing all tests that would be
performed, without actually performing them. Applying the tests by
setting test = TRUE gives the following result.
x1 <- autotest_package (path, test = TRUE)
#> ── autotesting demo ──
#>
#> ✔ [1 / 1]: my_function
DT::datatable (x1, options = list (dom = "t"))
Of the 9 tests which were performed, only 2 yielded unexpected
behaviour. The first indicates that the parameter x has only been used as
an integer, yet was not specified as such. The second states that the
parameter x is “assumed to be a single numeric”. autotest does its best
to figure out what types of inputs are expected for each parameter, and
with the example only demonstrating x = 1, assumes that x is always
expected to be a single value. We can resolve the first of these by
replacing x = 1 with x = 1. to clearly indicate that it is not an
integer, and the second by asserting that length(x) == 1, as follows:
code <- c ("#' my_function",
"#'",
"#' @param x An input",
"#' @return Something else",
"#' @examples",
"#' y <- my_function (x = 1.)",
"#' @export",
"my_function <- function (x) {",
" if (length(x) > 1) {",
" warning(\"only the first value of x will be used\")",
" x <- x [1]",
" }",
" return (x + 1)",
"}")
writeLines (code, file.path (path, "R", "myfn.R"))
roxygen2::roxygenise (path)
#> ℹ Loading demo
#> Writing my_function.Rd
This is then sufficient to pass all autotest tests and so return NULL.
autotest_package (path, test = TRUE)
#>
#> ── autotesting demo ──
#>
#> ✔ [1 / 1]: my_function
#> NULL
Integer input
Note that autotest distinguishes integer and non-integer types by their
storage.mode of "integer" and "double", and not by their respective
classes of "integer" and "numeric", because "numeric" is ambiguous in R,
and is.numeric(1L) is TRUE, even though storage.mode(1L) is "integer",
and not "numeric". Replacing x = 1 with x = 1. explicitly identifies that
parameter as a "double" parameter, and allowed the preceding tests to
pass. Note what happens if we instead specify that parameter as an
integer (x = 1L).
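The distinction between storage mode and class can be checked directly in
base R. The following lines are only an illustration of the behaviour
described above, and are not part of the demo package.

```r
# A literal without the "L" suffix is stored as a double, even when whole
storage.mode (1)  # "double"
storage.mode (1.) # "double"
storage.mode (1L) # "integer"

# Classes draw a different line: doubles have class "numeric", yet
# is.numeric() is TRUE for both storage modes
class (1.)      # "numeric"
class (1L)      # "integer"
is.numeric (1.) # TRUE
is.numeric (1L) # TRUE
```

The following lines then switch the example in the demo package to
x = 1L.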
code [6] <- gsub ("1\\.", "1L", code [6])
writeLines (code, file.path (path, "R", "myfn.R"))
roxygen2::roxygenise (path)
#> ℹ Loading demo
#> Writing my_function.Rd
x2 <- autotest_package (path, test = TRUE)
#>
#> ── autotesting demo ──
#>
#> ✔ [1 / 1]: my_function
DT::datatable (x2, options = list (dom = "t"))
That then generates two additional messages, the second of which
reflects an expectation that parameters assumed to be integer-valued
should assert that, for example by converting with as.integer(). The
following suffices to remove that message.
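A sketch of such a conversion is given here as a standalone function,
rather than as the package source strings used elsewhere in this
vignette. The is.numeric() / as.integer() pattern is an assumption taken
from the later revisions of my_function in this vignette; in the package
itself, the same lines would be written to R/myfn.R and followed by
roxygen2::roxygenise (path).

```r
# Standalone copy of my_function with an explicit integer conversion
my_function <- function (x) {
    if (length (x) > 1) {
        warning ("only the first value of x will be used")
        x <- x [1]
    }
    # Assert the integer type by converting any numeric input
    if (is.numeric (x)) {
        x <- as.integer (x)
    }
    return (x + 1L)
}

y <- my_function (x = 1L)
storage.mode (y) # "integer"
```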
The remaining message concerns integer ranges. For any parameters
which autotest identifies as single integers, routines will try a full
range of values between +/- .Machine$integer.max, to ensure that all
values are appropriately handled. Many routines may sensibly allow
unrestricted ranges, while many others may not implement explicit control
over permissible ranges, yet may error on, for example, unexpectedly
large positive or negative values. The content of the diagnostic message
indicates one way to resolve this issue, which is simply by describing
the input as "unrestricted".
code [3] <- gsub ("An input", "An unrestricted input", code [3])
writeLines (code, file.path (path, "R", "myfn.R"))
roxygen2::roxygenise (path)
#> ℹ Loading demo
#> Writing my_function.Rd
autotest_package (path, test = TRUE)
#>
#> ── autotesting demo ──
#>
#> ✔ [1 / 1]: my_function
#> NULL
An alternative, and frequently better way, is to ensure and document specific control over permissible ranges, as in the following revision of our function.
code <- c ("#' my_function",
"#'",
"#' @param x An input between 0 and 10",
"#' @return Something else",
"#' @examples",
"#' y <- my_function (x = 1L)",
"#' @export",
"my_function <- function (x) {",
" if (length(x) > 1) {",
" warning(\"only the first value of x will be used\")",
" x <- x [1]",
" }",
" if (is.numeric (x))",
" x <- as.integer (x)",
" if (x < 0 | x > 10) {",
" stop (\"x must be between 0 and 10\")",
" }",
" return (x + 1L)",
"}")
writeLines (code, file.path (path, "R", "myfn.R"))
roxygen2::roxygenise (path)
#> ℹ Loading demo
#> Writing my_function.Rd
autotest_package (path, test = TRUE)
#>
#> ── autotesting demo ──
#>
#> ✔ [1 / 1]: my_function
#> NULL
Respective limits of ranges may be specified with any of the following words:
- Lower limits: “more”, “greater”, “larger than”, “lower limit of”, “above”
- Upper limits: “less”, “lower”, “smaller than”, “upper limit of”, “below”
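The range control in the revised function above can be exercised outside
the package. The sketch below defines a standalone copy of that function
(using || in place of |, since x is a single value by that point) and
passes it values at the extremes of the integer range. It is only an
illustration, and not a reproduction of the tests autotest itself runs.

```r
my_function <- function (x) {
    if (length (x) > 1) {
        warning ("only the first value of x will be used")
        x <- x [1]
    }
    if (is.numeric (x)) {
        x <- as.integer (x)
    }
    if (x < 0 || x > 10) {
        stop ("x must be between 0 and 10")
    }
    return (x + 1L)
}

# A value inside the documented range is processed as usual
res_ok <- my_function (x = 1L)

# Values at either extreme of the integer range give a clear error
catch_msg <- function (expr) {
    tryCatch (expr, error = function (e) conditionMessage (e))
}
res_hi <- catch_msg (my_function (x = .Machine$integer.max))
res_lo <- catch_msg (my_function (x = -.Machine$integer.max))

# Without the range check, x + 1L would overflow to NA with a warning
overflow <- suppressWarnings (.Machine$integer.max + 1L)
```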
Vector input
The initial test results above suggested that the input was assumed to be of length one. Let us now revert our function to its original format which accepted vectors of length > 1, and include an example demonstrating such input.
code <- c ("#' my_function",
"#'",
"#' @param x An input",
"#' @return Something else",
"#' @examples",
"#' y <- my_function (x = 1)",
"#' y <- my_function (x = 1:2)",
"#' @export",
"my_function <- function (x) {",
" if (is.numeric (x)) {",
" x <- as.integer (x)",
" }",
" return (x + 1L)",
"}")
writeLines (code, file.path (path, "R", "myfn.R"))
roxygen2::roxygenise (path)
#> ℹ Loading demo
#> Writing my_function.Rd
Note that the first example no longer has x = 1L. This is because vector
inputs are identified as integer by examining all individual values, and
presuming integer representations for any parameters for which all values
are whole numbers, regardless of storage.mode.
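The idea of identifying whole-number values independently of storage mode
can be sketched as follows. The helper is_whole() is a hypothetical name
introduced for illustration only, and the tolerance-based comparison is
an assumption; autotest's own implementation is not shown in this
vignette and may differ.

```r
# Hypothetical helper: TRUE if every value of x is a whole number
is_whole <- function (x, tol = sqrt (.Machine$double.eps)) {
    is.numeric (x) && all (abs (x - round (x)) < tol)
}

storage.mode (c (1, 2)) # "double", yet all values are whole numbers
is_whole (c (1, 2))     # TRUE
is_whole (1:2)          # TRUE
is_whole (c (1, 2.5))   # FALSE
```

Applying autotest to the revised package gives the following.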
x3 <- autotest_package (path, test = TRUE)
#>
#> ── autotesting demo ──
#>
#> ✔ [1 / 1]: my_function
DT::datatable (x3, options = list (dom = "t"))
List-column conversion
The above result reflects one of the standard tests, which is to
determine whether list-column formats are appropriately processed.
List-columns commonly arise when using (either directly or indirectly)
the tidyr::nest() function, or equivalently in base R with the I()
function, which marks its argument with the class "AsIs". They look like
this:
dat <- data.frame (x = 1:3, y = 4:6)
dat$x <- I (as.list (dat$x)) # base R
dat <- tidyr::nest (dat, y = y)
print (dat)
#> # A tibble: 3 × 2
#> x y
#> <I<list>> <list>
#> 1 <int [1]> <tibble [1 × 1]>
#> 2 <int [1]> <tibble [1 × 1]>
#> 3 <int [1]> <tibble [1 × 1]>
The use of packages like tidyr and purrr quite often leads to
tibble-class inputs which contain list-columns. Any functions which fail
to identify and appropriately respond to such inputs may generate
unexpected errors, and this autotest is intended to enforce appropriate
handling of these kinds of inputs. The following lines demonstrate the
kinds of results that can arise without such checks.
m <- mtcars
head (m, n = 2L)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
m$mpg <- I (as.list (m$mpg))
head (m, n = 2L) # looks exactly the same
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
cor (m)
#> Error in cor(m): 'x' must be numeric
In contrast, many functions either assume inputs to be lists, and
convert when not, or implicitly unlist. Either way, such functions may
respond entirely consistently regardless of the presence of list-columns,
like this:
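Two base R cases are sketched here as an illustration, chosen for this
purpose rather than taken from the original vignette: unlist(), which
flattens the list-column, and vapply(), which treats its input as a list
either way.

```r
m <- mtcars
m$mpg <- I (as.list (m$mpg)) # convert mpg to a list-column

# unlist() recovers the original numeric vector
identical (unlist (m$mpg), mtcars$mpg) # TRUE

# vapply() iterates over list elements and vector elements alike
double_it <- function (i) i * 2
from_list <- vapply (m$mpg, double_it, numeric (1))
from_vec <- vapply (mtcars$mpg, double_it, numeric (1))
identical (from_list, from_vec) # TRUE
```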
The list-column autotest is intended to enforce consistent behaviour in
response to list-column inputs. One way to identify list-column formats
is to check the value of class(unclass(.)) of each column. The unclass
function is necessary to first remove any additional class attributes,
such as I in dat$x above. A modified version of our function which
identifies and responds to list-column inputs might look like this:
code <- c ("#' my_function",
"#'",
"#' @param x An input",
"#' @return Something else",
"#' @examples",
"#' y <- my_function (x = 1)",
"#' y <- my_function (x = 1:2)",
"#' @export",
"my_function <- function (x) {",
" if (methods::is (unclass (x), \"list\")) {",
" x <- unlist (x)",
" }",
" if (is.numeric (x)) {",
" x <- as.integer (x)",
" }",
" return (x + 1L)",
"}")
writeLines (code, file.path (path, "R", "myfn.R"))
roxygen2::roxygenise (path)
#> ℹ Loading demo
That change once again leads to clean autotest results:
autotest_package (path, test = TRUE)
#>
#> ── autotesting demo ──
#>
#> ✔ [1 / 1]: my_function
#> NULL
Of course simply attempting to unlist a complex list-column may be
dangerous, and it may be preferable to issue some kind of message or
warning, or even either simply remove any list-columns entirely or
generate an error. Replacing the above, potentially dangerous, line,
x <- unlist (x) with a simple stop("list-columns are not allowed") will
also produce clean autotest results.
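The class(unclass(.)) check described in this section can be tried
directly on a small data.frame. The following sketch uses the same
methods::is() test as the function above; the name is_list_col is
introduced here for illustration only.

```r
dat <- data.frame (x = 1:3, y = 4:6)
dat$x <- I (as.list (dat$x))

# The class attribute added by I() hides the underlying list type
class (dat$x)           # "AsIs"
class (unclass (dat$x)) # "list"

# Flag every column which is a list once extra classes are removed
is_list_col <- vapply (dat, function (col) {
    methods::is (unclass (col), "list")
}, logical (1))
names (dat) [is_list_col] # "x"
```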
Return results and documentation
Functions which return complicated results, such as objects with
specific classes, need to document those class types, and autotest
compares return objects with documentation to ensure that this is done.
The following code constructs a new function to demonstrate some of the
ways autotest inspects return objects, demonstrating a vector input
(length(x) > 1) in the example to avoid messages regarding length checks
and integer ranges.
code <- c ("#' my_function3",
"#'",
"#' @param x An input",
"#' @examples",
"#' y <- my_function3 (x = 1:2)",
"#' @export",
"my_function3 <- function (x) {",
" return (datasets::iris)",
"}")
writeLines (code, file.path (path, "R", "myfn3.R"))
roxygen2::roxygenise (path) # regenerate the man/ files
#> ℹ Loading demo
#> Writing NAMESPACE
#> Writing my_function3.Rd
x4 <- autotest_package (path, test = TRUE)
#>
#> ── autotesting demo ──
#>
#> ✔ [1 / 2]: my_function
#> ✔ [2 / 2]: my_function3
DT::datatable (x4, options = list (dom = "t"))
Several new diagnostic messages are then issued regarding the description of the returned value. Let’s insert a description to see the effect.
code <- c (code [1:3],
"#' @return The iris data set as dataframe",
code [4:length (code)])
writeLines (code, file.path (path, "R", "myfn3.R"))
roxygen2::roxygenise (path) # regenerate the man/ files
#> ℹ Loading demo
#> Writing my_function3.Rd
x5 <- autotest_package (path, test = TRUE)
#>
#> ── autotesting demo ──
#>
#> ✔ [1 / 2]: my_function
#> ✔ [2 / 2]: my_function3
DT::datatable (x5, options = list (dom = "t"))
That result still contains a couple of diagnostic messages, but it is
now pretty clear what we need to do, which is to be precise with our
specification of the class of return object. The following then suffices
to once again generate clean autotest results.
code [4] <- "#' @return The iris data set as data.frame"
writeLines (code, file.path (path, "R", "myfn3.R"))
roxygen2::roxygenise (path) # regenerate the man/ files
#> ℹ Loading demo
#> Writing my_function3.Rd
autotest_package (path, test = TRUE)
#>
#> ── autotesting demo ──
#>
#> ✔ [1 / 2]: my_function
#> ✔ [2 / 2]: my_function3
#> NULL
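The difference between the two @return descriptions used in this section
comes down to whether the documented text names the actual class of the
returned object. The sketch below illustrates that with simple
fixed-string matching. That matching is an assumption used as a stand-in;
the logic autotest actually applies is not shown in this vignette.

```r
# The class which the documentation is expected to name
ret_class <- class (datasets::iris) # "data.frame"

return_docs <- c (
    "The iris data set as dataframe",
    "The iris data set as data.frame"
)

# Only the second description contains the exact class name
doc_names_class <- grepl (ret_class, return_docs, fixed = TRUE)
doc_names_class # FALSE TRUE
```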
Documentation of input parameters
Similar checks are performed on the documentation of input parameters, as demonstrated by the following modified version of the preceding function.
code <- c ("#' my_function3",
"#'",
"#' @param x An input",
"#' @return The iris data set as data.frame",
"#' @examples",
"#' y <- my_function3 (x = datasets::iris)",
"#' @export",
"my_function3 <- function (x) {",
" return (x)",
"}")
writeLines (code, file.path (path, "R", "myfn3.R"))
roxygen2::roxygenise (path) # regenerate the man/ files
#> ℹ Loading demo
#> Writing my_function3.Rd
x6 <- autotest_package (path, test = TRUE)
#>
#> ── autotesting demo ──
#>
#> ✔ [1 / 2]: my_function
#> ✔ [2 / 2]: my_function3
DT::datatable (x6, options = list (dom = "t"))
This warning again indicates precisely how it can be rectified, for example by replacing the third line with
code [3] <- "#' @param x An input which can be a data.frame"
General Procedure
The demonstrations above hopefully suffice to indicate the general
procedure which autotest attempts to make as simple as possible. This
procedure consists of the following single point:
- From the moment you develop your first function, and every single
time you modify your code, do whatever steps are necessary to ensure
autotest_package() returns NULL.
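One way to make that point routine is to wrap the check in a small
assertion which fails whenever anything other than NULL is returned. The
helper assert_autotest_clean() below is a hypothetical name, and is not
part of autotest; it is a sketch which only inspects the value returned
by autotest_package().

```r
# Hypothetical helper: error unless the autotest result is NULL
assert_autotest_clean <- function (res) {
    if (!is.null (res)) {
        stop ("autotest returned ", NROW (res), " diagnostic row(s)",
              call. = FALSE)
    }
    invisible (TRUE)
}

# Typical use, with `path` pointing at the package directory:
# assert_autotest_clean (autotest_package (path, test = TRUE))
```

Calling such a helper after every change to the code turns any new
diagnostic message into an immediate failure.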
This vignette has only demonstrated a few of the tests included in
the package, but as long as you use autotest throughout the entire
process of package development, any additional diagnostic messages should
include sufficient information for you to be able to restructure your
code to avoid them.