Bayesian data analysis usually incurs long runtimes and cumbersome custom code, and the process of prototyping and deploying custom Stan models can become a daunting software engineering challenge. To ease this burden, the stantargets R package creates Stan pipelines that are concise, efficient, scalable, and tailored to the needs of Bayesian statisticians. Leveraging targets, stantargets pipelines automatically parallelize the computation and skip expensive steps when the results are already up to date. Minimal custom user-side code is required, and there is no need to manually configure branching, so stantargets is easier to use than targets and CmdStanR directly. stantargets can access all of cmdstanr’s major algorithms (MCMC, variational Bayes, and optimization), and it supports both single-fit workflows and multi-rep simulation studies.
Prerequisites
- The prerequisites of the targets R package.
- Basic familiarity with targets: watch minutes 7 through 40 of this video, then read this chapter of the user manual.
- Familiarity with Bayesian statistics and Stan. Prior knowledge of cmdstanr helps.
How to get started
Read the stantargets introduction and simulation vignettes, and use https://docs.ropensci.org/stantargets/ as a reference while constructing your own workflows. Visit https://github.com/wlandau/stantargets-example-validation for an example project based on the simulation vignette. The example has an RStudio Cloud workspace that lets you run the project in a web browser.
Example projects
Description | Link |
---|---|
Validating a minimal Stan model | https://github.com/wlandau/targets-stan |
Using Target Markdown and stantargets to validate a Bayesian longitudinal model for clinical trial data analysis | https://github.com/wlandau/rmedicine2021-pipeline |
Installation
Install the GitHub development version to access the latest features and patches.
remotes::install_github("ropensci/stantargets")
The CmdStan command line interface is also required.
cmdstanr::install_cmdstan()
If you have problems installing CmdStan, please consult the installation guide of cmdstanr and the installation guide of CmdStan. Alternatively, the Stan discourse is a friendly place to ask Stan experts for help.
Usage
First, write a _targets.R file that loads your packages, defines a function to generate Stan data, and lists a pipeline of targets. The target list can call target factories like tar_stan_mcmc() as well as ordinary targets with tar_target(). The following minimal example is simple enough to contain entirely within the _targets.R file, but for larger projects, you may wish to store functions in separate files as in the targets-stan example.
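# _targets.R
library(targets)
library(stantargets)

# Simulate n observations from a one-coefficient regression.
# The sample size must be an argument (or otherwise defined),
# because the body uses n before returning it.
generate_data <- function(n = 10) {
  true_beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * true_beta, 1)
  list(n = n, x = x, y = y, true_beta = true_beta)
}

list(
  # Target factory: expands into several targets named example_*.
  tar_stan_mcmc(
    name = example,
    stan_files = "x.stan",
    data = generate_data()
  )
)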
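The pipeline above expects a Stan model file named x.stan in the project root, which this README does not show. As a minimal sketch, the file could be written from R as follows. The model itself (a one-coefficient regression with a standard normal prior on beta) is an assumption chosen to match the fields returned by generate_data(); it is not necessarily the model used in the vignette.

```r
# Assumed Stan program, written line by line to x.stan.
# The data block declares every field that generate_data() returns.
stan_code <- c(
  "data {",
  "  int<lower=1> n;",
  "  vector[n] x;",
  "  vector[n] y;",
  "  real true_beta; // supplied by generate_data(), not used in the model",
  "}",
  "parameters {",
  "  real beta;",
  "}",
  "model {",
  "  y ~ normal(x * beta, 1);",
  "  beta ~ normal(0, 1);",
  "}"
)

# Write the model next to _targets.R so stan_files = "x.stan" resolves.
writeLines(stan_code, "x.stan")
```

In a real project the model would more likely be kept as a hand-edited .stan file under version control; generating it from R is only a convenience for a self-contained demonstration.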
Run tar_visnetwork() to check _targets.R for correctness, then call tar_make() to run the pipeline. Access the results using tar_read(), e.g., tar_read(example_summary_x). Visit the introductory vignette to read more about this example.
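Before running the full pipeline, one inexpensive sanity check is to call the data function by itself in an ordinary R session. The sketch below uses only base R and assumes a default sample size of n = 10, which is a hypothetical choice for illustration.

```r
# Stand-alone copy of the data function with an assumed default of n = 10.
generate_data <- function(n = 10) {
  true_beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * true_beta, 1)
  list(n = n, x = x, y = y, true_beta = true_beta)
}

set.seed(1)  # arbitrary seed so the check is repeatable
data <- generate_data()
str(data)    # inspect the list that would be passed to Stan

# The vector lengths must agree with n, or Stan will reject the data.
stopifnot(
  length(data$x) == data$n,
  length(data$y) == data$n,
  length(data$true_beta) == 1
)
```

If this check fails, the problem lies in the data function and not in targets, stantargets, or Stan.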
How it works behind the scenes
stantargets supports specialized target factories that create ensembles of target objects for cmdstanr workflows. These target factories abstract away the details of targets and cmdstanr and make both packages easier to use. For details, please read the introductory vignette.
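To make the idea of target factories more concrete, here is a rough sketch of a _targets.R file that uses two other factories from the package: tar_stan_vb() for a single variational fit and tar_stan_mcmc_rep_summary() for a small simulation study. The target names (vb_fit, mcmc_sim), the file x.stan, the default n = 10, and the batch and rep counts are all illustrative assumptions.

```r
# _targets.R (sketch; names and sizes are illustrative assumptions)
library(targets)
library(stantargets)

generate_data <- function(n = 10) {
  true_beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * true_beta, 1)
  list(n = n, x = x, y = y, true_beta = true_beta)
}

list(
  # Single-fit workflow using variational Bayes.
  tar_stan_vb(
    name = vb_fit,
    stan_files = "x.stan",
    data = generate_data()
  ),
  # Multi-rep simulation study: the data code runs once per rep,
  # here 2 batches of 3 reps each (6 simulated datasets in total).
  tar_stan_mcmc_rep_summary(
    name = mcmc_sim,
    stan_files = "x.stan",
    data = generate_data(),
    batches = 2,
    reps = 3
  )
)
```

With a file like this in place, targets::tar_manifest() lists the individual targets that each factory expands to, which is a quick way to see the ensemble without running any models. The exact set of generated targets may differ between package versions.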
Help
Please first read the help guide to learn how best to ask for help.
If you have trouble using stantargets, you can ask for help in the GitHub discussions forum. Because the purpose of stantargets is to combine targets and cmdstanr, your issue may have something to do with one of the latter two packages, a dependency of targets, or Stan itself. When you troubleshoot, peel back as many layers as possible to isolate the problem. For example, if the issue comes from cmdstanr, create a reproducible example that directly invokes cmdstanr without invoking stantargets. The GitHub discussion and issue forums of those packages, as well as the Stan discourse, are great resources.
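As one possible way to peel back a layer, the sketch below fits a model with cmdstanr alone. The helper name fit_directly() is hypothetical; the cmdstanr calls it wraps (cmdstan_model(), the $sample() method, and the $summary() method) are part of the cmdstanr interface.

```r
# Hypothetical helper: fit a Stan model with cmdstanr only,
# bypassing targets and stantargets entirely.
fit_directly <- function(stan_file, data) {
  model <- cmdstanr::cmdstan_model(stan_file)  # compile the Stan program
  fit <- model$sample(data = data)             # run MCMC with default settings
  fit$summary()                                # posterior summary table
}

# Example call (requires cmdstanr, CmdStan, and an existing model file):
# fit_directly("x.stan", generate_data())
```

If the error reproduces here, it is likely a cmdstanr, CmdStan, or model issue; if it only appears inside the pipeline, the problem is more likely in targets or stantargets.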
Participation
Development is a community effort, and we welcome discussion and contribution. Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Citation
citation("stantargets")
#>
#> To cite stantargets in publications use:
#>
#> Landau, W. M., (2021). The stantargets R package: a workflow
#> framework for efficient reproducible Stan-powered Bayesian data
#> analysis pipelines. Journal of Open Source Software, 6(60), 3193,
#> https://doi.org/10.21105/joss.03193
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Article{,
#> title = {The stantargets {R} package: a workflow framework for efficient reproducible {S}tan-powered {B}ayesian data analysis pipelines},
#> author = {William Michael Landau},
#> journal = {Journal of Open Source Software},
#> year = {2021},
#> volume = {6},
#> number = {60},
#> pages = {3193},
#> url = {https://doi.org/10.21105/joss.03193},
#> }