The targets package is a Make-like pipeline tool for statistics and data science in R. With targets, you can maintain a reproducible workflow without repeating yourself. targets skips costly runtime for tasks that are already up to date, orchestrates the necessary computation with implicit parallel computing, and abstracts files as R objects. An up-to-date targets pipeline is tangible evidence that the output aligns with the code and data, which substantiates trust in the results.
Philosophy
A pipeline is a computational workflow that does statistics, analytics, or data science. Examples include forecasting customer behavior, simulating a clinical trial, and detecting differential expression from genomics data. A pipeline contains tasks to prepare datasets, run models, and summarize results for a business deliverable or research paper. The methods behind these tasks are user-defined R functions that live in R scripts, ideally in a folder called "R/" in the project. The tasks themselves are called “targets”, and they run the functions and return R objects. The targets package orchestrates the targets and stores the output objects to make your pipeline efficient, painless, and reproducible.
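As a rough illustration of the "user-defined R functions" described above, a script such as R/functions.R might look like the sketch below. The file name, the function names (prepare_data, fit_model, summarize_fit), and the choice of the built-in airquality dataset are all assumptions made for this example, not part of the package.

```r
# R/functions.R (hypothetical file name; all function names are illustrative)

# Drop rows with a missing response so the model sees complete cases.
prepare_data <- function(raw) {
  raw[!is.na(raw$Ozone), , drop = FALSE]
}

# Fit a simple linear regression and return its coefficients.
fit_model <- function(data) {
  coef(lm(Ozone ~ Temp, data = data))
}

# Summarize the fit as a one-row data frame.
summarize_fit <- function(coefficients) {
  data.frame(
    intercept = unname(coefficients["(Intercept)"]),
    slope = unname(coefficients["Temp"])
  )
}
```

Each function takes ordinary R objects and returns an ordinary R object, which is what lets a target run the function and store its return value.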
Prerequisites
- Familiarity with the R programming language, covered in R for Data Science.
- Data science workflow management techniques.
- How to write functions to prepare data, analyze data, and summarize results in a data analysis project.
Installation
| Type | Source | Command |
|---|---|---|
| Release | CRAN | install.packages("targets") |
| Development | GitHub | remotes::install_github("ropensci/targets") |
| Development | rOpenSci | install.packages("targets", repos = "https://dev.ropensci.org") |
Get started in 4 minutes
The 4-minute video at https://vimeo.com/700982360 demonstrates the example pipeline used in the walkthrough and functions chapters of the user manual. Visit https://github.com/wlandau/targets-four-minutes for the code and https://rstudio.cloud/project/3946303 to try out the code in a browser (no download or installation required).
Usage
To create a pipeline of your own:
- Write R functions for a pipeline and save them to R scripts (ideally in the "R/" folder of your project).
- Call use_targets() to write key files, including the vital _targets.R file which configures and defines the pipeline.
- Follow the comments in _targets.R to fill in the details of your specific pipeline.
- Check the pipeline with tar_visnetwork(), run it with tar_make(), and read output with tar_read(). More functions are available.
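The steps above can be sketched as a small _targets.R file. This is a minimal sketch, not the file that use_targets() generates: the target names (raw, data, model) are hypothetical, and it assumes that scripts in the "R/" folder define hypothetical functions prepare_data() and fit_model().

```r
# _targets.R (a minimal sketch; target and function names are hypothetical)
library(targets)

# Load the user-defined functions from the scripts in the "R/" folder.
# Assumption: those scripts define prepare_data() and fit_model().
tar_source()

# The file ends with a list of targets. Each target names an R object
# and gives the command that produces it.
list(
  tar_target(raw, datasets::airquality),
  tar_target(data, prepare_data(raw)),
  tar_target(model, fit_model(data))
)
```

With a file like this in the project root, tar_visnetwork() would draw the dependency graph raw → data → model, tar_make() would run whichever targets are out of date, and tar_read(model) would return the stored value of the model target.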
Documentation
- User manual: in-depth discussion about how to use targets.
- Reference website: formal documentation of all user-side functions, the statement of need, and multiple design documents of the internal architecture.
- Developer documentation: software design documents for developers contributing to the deep internal architecture of targets.
Apps
- tar_watch(): a built-in Shiny app to visualize progress while a pipeline is running. Available as a Shiny module via tar_watch_ui() and tar_watch_server().
- targetsketch: a Shiny app to help sketch pipelines (app, source).
Deployment
- https://solutions.rstudio.com/r/workflows/ explains how to deploy a pipeline to RStudio Connect (example code).
- tar_github_actions() sets up a pipeline to run on GitHub Actions. The minimal example demonstrates this approach.
Extending and customizing targets
- R Targetopia: a collection of R packages that extend targets. These packages simplify pipeline construction for specific fields of statistics and data science.
- Target factories: a programming technique to write specialized interfaces for custom pipelines. Posts here and here describe how.
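To make the target factory idea concrete, here is a small sketch of a function that returns a list of target objects. The factory name tar_ozone_model(), its arguments, and the model it fits are hypothetical and chosen only for illustration. The sketch relies on tar_target_raw(), which takes the target name as a character string and the command as an unevaluated expression.

```r
library(targets)

# Hypothetical target factory: given a name prefix and a quoted command
# that produces a data frame, return one target for the data and one
# target that fits a model to it.
tar_ozone_model <- function(name, data_command) {
  name_data <- paste0(name, "_data")
  name_model <- paste0(name, "_model")
  list(
    tar_target_raw(name_data, data_command),
    tar_target_raw(
      name_model,
      # Insert the data target's name into the model command as a symbol
      # so the model target depends on the data target.
      substitute(
        coef(lm(Ozone ~ Temp, data = d)),
        list(d = as.symbol(name_data))
      )
    )
  )
}

# Example call, as it might appear inside the list in _targets.R.
air_targets <- tar_ozone_model("air", quote(na.omit(datasets::airquality)))
```

A factory like this keeps _targets.R short when the same group of targets recurs for several datasets, because each call expands to the full set of targets.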
Help
- Post to the GitHub discussion forum to ask questions. To get the best help about a specific issue, create a reproducible example with targets::tar_reprex() or reprex::reprex().
- The RStudio Community forum is full of friendly enthusiasts of R and the tidyverse. Use the targets tag.
- Stack Overflow broadcasts to the entire open source community. Use the targets-r-package tag.
Code of conduct
Please note that this package is released with a Contributor Code of Conduct.
Citation
citation("targets")
To cite targets in publications use:
Landau, W. M., (2021). The targets R package: a dynamic Make-like
function-oriented pipeline toolkit for reproducibility and
high-performance computing. Journal of Open Source Software, 6(57),
2959, https://doi.org/10.21105/joss.02959
A BibTeX entry for LaTeX users is
@Article{,
title = {The targets R package: a dynamic Make-like function-oriented pipeline toolkit for reproducibility and high-performance computing},
author = {William Michael Landau},
journal = {Journal of Open Source Software},
year = {2021},
volume = {6},
number = {57},
pages = {2959},
url = {https://doi.org/10.21105/joss.02959},
}