What’s it for?
Building an R package is a great way of encapsulating code, documentation and data in a single testable and easily distributable unit.
For a package to be distributed via CRAN, it needs to pass a set of checks implemented in R CMD check, such as: Is there minimal documentation, e.g., are all arguments of exported functions documented? Are all dependencies declared?
These checks are helpful in developing a solid R package, but they don’t cover several other good practices. For example, a package does not need to contain any tests, but it is good practice to include them. Following a coding standard helps readability. Avoiding overly complex functions reduces the risk of bugs. Including a URL for bug reports makes it easier for people to report any bugs they find.
Tools for automatically checking several of these aspects already exist. The goodpractice package bundles the checks from rcmdcheck with code coverage through the covr package, source code linting via the lintr package and cyclomatic complexity via the cyclocomp package, and augments them with some further checks on good practice for R package development, such as avoiding T and F in favour of TRUE and FALSE. It provides advice on which practices to follow and which to avoid.
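The reason for preferring TRUE and FALSE over T and F can be seen in a few lines of base R. This is a small illustration that does not use goodpractice; the variable name t_still_true is made up for the example.

```r
# T and F are ordinary variables bound to TRUE and FALSE in the base
# package, so an assignment can mask them.
T <- 0

# TRUE is a reserved word and cannot be reassigned; T now refers to 0.
t_still_true <- isTRUE(T)
t_still_true
#> [1] FALSE

# Remove the masking variable to get the default binding back.
rm(T)
isTRUE(T)
#> [1] TRUE
```

Code that tests a condition with T or F silently changes behaviour once such a variable exists, which is why the check recommends the reserved words.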
You can use goodpractice checks as a reminder for you and your colleagues - and if you have custom checks to run, you can make goodpractice run those as well! Please see the vignette “Custom Checks” for more details.
Good practice out of the box
Main function
The main function is goodpractice(), with gp() as an alias. It takes the path to the source code of a package as its first argument. The goodpractice package contains the source for a simple package which violates some good practices. We’ll use this for the examples.
library(goodpractice)
# get path to example package
pkg_path <- system.file("bad1", package = "goodpractice")
# run gp() on it
g <- gp(pkg_path)
#> ℹ Preparing: covr
#> Warning in MYPREPS[[prep]](state, quiet = quiet): Prep step for test coverage
#> failed.
#> ℹ Preparing: cyclocomp
#> ── R CMD build ─────────────────────────────────────────────────────────────────
#> * checking for file ‘/tmp/RtmpaOKMC1/remotes91b3b8c6c59/badpackage/DESCRIPTION’ ... OK
#> * preparing ‘badpackage’:
#> * checking DESCRIPTION meta-information ... OK
#> * checking vignette meta-information ... OK
#> * checking for LF line-endings in source and make files and shell scripts
#> * checking for empty or unneeded directories
#> * building ‘badpackage_1.0.0.tar.gz’
#> ℹ Preparing: description
#> ℹ Preparing: lintr
#> ℹ Preparing: namespace
#> ℹ Preparing: rcmdcheck
# show the result
g
#> ── GP badpackage ───────────────────────────────────────────────────────────────
#>
#> It is good practice to
#>
#> ✖ not use "Depends" in DESCRIPTION, as it can cause name clashes, and poor
#> interaction with other packages. Use "Imports" instead.
#> ✖ omit "Date" in DESCRIPTION. It is not required and it gets invalid quite
#> often. A build date will be added to the package when you perform `R CMD
#> build` on it.
#> ✖ add a "URL" field to DESCRIPTION. It helps users find information about
#> your package online. If your package does not have a homepage, add an URL
#> to GitHub, or the CRAN package package page.
#> ✖ add a "BugReports" field to DESCRIPTION, and point it to a bug tracker.
#> Many online code hosting services provide bug trackers for free,
#> https://github.com, https://gitlab.com, etc.
#> ✖ omit trailing semicolons from code lines. They are not needed and most R
#> coding standards forbid them
#>
#> R/semicolons.R:4:30
#> R/semicolons.R:5:29
#> R/semicolons.R:9:38
#>
#> ✖ not import packages as a whole, as this can cause name clashes between the
#> imported packages, especially over time as packages change. Instead, import
#> only the specific functions you need.
#> ✖ fix this R CMD check ERROR: VignetteBuilder package not declared: ‘knitr’
#> See section ‘The DESCRIPTION file’ in the ‘Writing R Extensions’ manual.
#> ✖ avoid 'T' and 'F', as they are just variables which are set to the logicals
#> 'TRUE' and 'FALSE' by default, but are not reserved words and hence can be
#> overwritten by the user. Hence, one should always use 'TRUE' and 'FALSE'
#> for the logicals.
#>
#> R/tf.R
#> R/tf.R
#> R/tf.R
#> R/tf.R
#> R/tf.R
#> ... and 4 more lines
#>
#> ────────────────────────────────────────────────────────────────────────────────
So with this package, we’ve done a few things in the DESCRIPTION file that there are good reasons to avoid, left unnecessary trailing semicolons in the code, and used T and F for TRUE and FALSE. The output of gp() tells you what you did that isn’t considered good practice and, if it’s in the R code, points you to the location of your faux pas. In general, the messages are meant to point out not only what you might want to avoid but also why.
The above example tries to run all 230 checks available. To see the full list, use all_checks(). If you only want to run a subset of the checks, e.g., the one on the URL field in the DESCRIPTION, you can specify the checks by name:
# what is the name of the check?
grep("url", all_checks(), value = TRUE)
#> [1] "description_url"
# run only this check
g_url <- gp(pkg_path, checks = "description_url")
#> ℹ Preparing: description
g_url
#> ── GP badpackage ───────────────────────────────────────────────────────────────
#>
#> It is good practice to
#>
#> ✖ add a "URL" field to DESCRIPTION. It helps users find information about
#> your package online. If your package does not have a homepage, add an URL
#> to GitHub, or the CRAN package package page.
#> ────────────────────────────────────────────────────────────────────────────────
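The same approach extends to a group of checks. The sketch below selects every check whose name mentions the DESCRIPTION file and runs only those; the pattern "description" and the names desc_checks and g_desc are chosen for illustration.

```r
library(goodpractice)

# path to the example package shipped with goodpractice
pkg_path <- system.file("bad1", package = "goodpractice")

# collect the names of all checks that mention "description"
desc_checks <- grep("description", all_checks(), value = TRUE)

# run only those checks
g_desc <- gp(pkg_path, checks = desc_checks)

# list the ones that failed
failed_checks(g_desc)
```

Restricting the run this way skips the preparation steps for checks that are not needed, which keeps the feedback loop short while you work on one part of the package.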
Doing more than just printing
Printing a goodPractice object, as returned by gp(), shows the advice. You can also query which checks were carried out and which of those failed:
# which checks were carried out?
checks(g_url)
#> [1] "description_url"
# which checks failed?
failed_checks(g)
#> [1] "no_description_depends"
#> [2] "no_description_date"
#> [3] "description_url"
#> [4] "description_bugreports"
#> [5] "lintr_semicolon_linter"
#> [6] "no_import_package_as_a_whole"
#> [7] "rcmdcheck_package_dependencies_present"
#> [8] "truefalse_not_tf"
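Because failed_checks() returns a plain character vector, it can drive a script, for instance to stop a build when anything fails. A minimal sketch, in which stop_if_failed() is a hypothetical helper and not part of goodpractice:

```r
# Hypothetical helper: raise an error if any check names are passed in.
stop_if_failed <- function(failed) {
  if (length(failed) > 0) {
    stop(
      length(failed), " goodpractice check(s) failed: ",
      paste(failed, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# With a goodPractice object `g` you would call:
#   stop_if_failed(failed_checks(g))
# Here two of the names from the output above show the behaviour.
msg <- tryCatch(
  stop_if_failed(c("description_url", "truefalse_not_tf")),
  error = function(e) conditionMessage(e)
)
msg
#> [1] "2 goodpractice check(s) failed: description_url, truefalse_not_tf"
```

The helper takes the vector of names and not the goodPractice object, so it can be tried out without running any checks.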
To access all the checks carried out with their results in a data frame, use results() on your goodPractice object.
# show the first 5 checks carried out and their results
results(g)[1:5,]
#> check result
#> 1 covr NA
#> 2 cyclocomp TRUE
#> 3 no_description_depends FALSE
#> 4 no_description_date FALSE
#> 5 description_url FALSE
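Since results() returns an ordinary data frame, base R subsetting applies. The sketch below rebuilds the five rows shown above by hand, so it runs without goodpractice, and splits them by outcome. With a real object you would start from results(g) instead; the names res, failed and not_run are illustrative.

```r
# Hand-built copy of the five rows printed above; stands in for results(g).
res <- data.frame(
  check = c("covr", "cyclocomp", "no_description_depends",
            "no_description_date", "description_url"),
  result = c(NA, TRUE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# checks that ran and failed (%in% treats NA as a non-match)
failed <- res$check[res$result %in% FALSE]
failed

# checks with no result
not_run <- res$check[is.na(res$result)]
not_run
#> [1] "covr"
```

Using %in% FALSE instead of == FALSE avoids NA entries in the index, which would otherwise show up as NA elements in the subset.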
Note that the code coverage could not be calculated. The corresponding check does not show up in the failed checks (because it was not carried out) and its result is NA. It is also possible to export the results to a JSON file with export_json().
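A minimal sketch of the export, assuming export_json() takes the goodPractice object followed by a file path and an optional pretty flag; please check ?export_json for the exact arguments. It runs a single check to keep the example quick and writes to a temporary file.

```r
library(goodpractice)

# path to the example package shipped with goodpractice
pkg_path <- system.file("bad1", package = "goodpractice")
g_url <- gp(pkg_path, checks = "description_url")

# write the results to a temporary JSON file
# (argument names `file` and `pretty` are assumed, see ?export_json)
json_file <- tempfile(fileext = ".json")
export_json(g_url, file = json_file, pretty = TRUE)

# inspect the first lines of the exported file
head(readLines(json_file))
```

A JSON file is convenient when another tool, such as a dashboard or a build script written in a different language, needs to consume the results.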