Run a pipeline of targets in parallel with transient future workers.
Source: R/tar_make_future.R
tar_make_future.Rd
This function is like tar_make() except that targets run in parallel with transient future workers. It requires that you declare your future::plan() inside the target script file (default: _targets.R). future is not a strict dependency of targets, so you must install future yourself.
Usage
tar_make_future(
  names = NULL,
  shortcut = targets::tar_config_get("shortcut"),
  reporter = targets::tar_config_get("reporter_make"),
  seconds_interval = targets::tar_config_get("seconds_interval"),
  workers = targets::tar_config_get("workers"),
  callr_function = callr::r,
  callr_arguments = targets::tar_callr_args_default(callr_function, reporter),
  envir = parent.frame(),
  script = targets::tar_config_get("script"),
  store = targets::tar_config_get("store"),
  garbage_collection = targets::tar_config_get("garbage_collection")
)
Arguments
- names
Names of the targets to build or check. Set to NULL to check/build all the targets (default). Otherwise, you can supply tidyselect helpers like any_of() and starts_with(). Because tar_make() and friends run the pipeline in a new R session, if you pass a character vector to a tidyselect helper, you will need to evaluate that character vector early with !!, e.g. tar_make(names = any_of(!!your_vector)). Applies to ordinary targets (stems) and whole dynamic branching targets (patterns) but not to individual dynamic branches.
- shortcut
Logical of length 1, how to interpret the names argument. If shortcut is FALSE (default), then the function checks all targets upstream of names as far back as the dependency graph goes. shortcut = TRUE increases speed if there are a lot of up-to-date targets, but it assumes all the dependencies are up to date, so please use with caution. It relies on stored metadata for information about upstream dependencies. shortcut = TRUE only works if you set names.
- reporter
Character of length 1, name of the reporter to use. Controls how messages are printed as targets run in the pipeline. Defaults to tar_config_get("reporter_make"). Choices:
  - "silent": print nothing.
  - "summary": print a running total of the number of targets in each status category (queued, started, skipped, built, canceled, or errored). Also show a timestamp ("%H:%M %OS2" strptime() format) of the last time the progress changed and printed to the screen.
  - "timestamp": same as the "verbose" reporter except that each message begins with a time stamp.
  - "timestamp_positives": same as the "timestamp" reporter except without messages for skipped targets.
  - "verbose": print messages for individual targets as they start, finish, or are skipped. Each individual target-specific time (e.g. "3.487 seconds") is strictly the elapsed runtime of the target and does not include steps like data retrieval and output storage.
  - "verbose_positives": same as the "verbose" reporter except without messages for skipped targets.
- seconds_interval
Positive numeric of length 1 with the minimum number of seconds between saves to the metadata and progress data. Also controls how often the reporter prints progress messages. Higher values generally make the pipeline run faster, but unsaved work (in the event of a crash) is not up to date. When a target starts or the pipeline ends, everything is saved/printed immediately, regardless of seconds_interval.
- workers
Positive integer, maximum number of transient future workers allowed to run at any given time.
- callr_function
A function from callr to start a fresh clean R process to do the work. Set to NULL to run in the current session instead of an external process (but restart your R session just before you do in order to clear debris out of the global environment). callr_function needs to be NULL for interactive debugging, e.g. tar_option_set(debug = "your_target"). However, callr_function should not be NULL for serious reproducible work.
- callr_arguments
A list of arguments to callr_function.
- envir
An environment, where to run the target R script (default: _targets.R) if callr_function is NULL. Ignored if callr_function is anything other than NULL. callr_function should only be NULL for debugging and testing purposes, not for serious runs of a pipeline, etc.

The envir argument of tar_make() and related functions always overrides the current value of tar_option_get("envir") in the current R session just before running the target script file, so whenever you need to set an alternative envir, you should always set it with tar_option_set() from within the target script file. In other words, if you call tar_option_set(envir = envir1) in an interactive session and then tar_make(envir = envir2, callr_function = NULL), then envir2 will be used.
- script
Character of length 1, path to the target script file. Defaults to tar_config_get("script"), which in turn defaults to _targets.R. When you set this argument, the value of tar_config_get("script") is temporarily changed for the current function call. See tar_script(), tar_config_get(), and tar_config_set() for details about the target script file and how to set it persistently for a project.
- store
Character of length 1, path to the targets data store. Defaults to tar_config_get("store"), which in turn defaults to _targets/. When you set this argument, the value of tar_config_get("store") is temporarily changed for the current function call. See tar_config_get() and tar_config_set() for details about how to set the data store path persistently for a project.
- garbage_collection
Logical of length 1, whether to run garbage collection on the main process before sending a target to a worker. Independent from the garbage_collection argument of tar_target(), which controls garbage collection on the worker.
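As a rough illustration of the names argument described above, the sketch below selects targets with tidyselect helpers and unquotes a character vector with !!. The target names (data_a, data_b, summary_ab) and the vector wanted are invented for this example. It assumes the targets and future packages are installed and does nothing otherwise.

```r
# Hypothetical vector of target names to select.
wanted <- c("data_a", "data_b")

if (requireNamespace("targets", quietly = TRUE) &&
    requireNamespace("future", quietly = TRUE)) {
  library(targets)
  tar_dir({ # run in a temporary directory so no project files are touched
    tar_script({
      future::plan(future::multisession, workers = 2)
      list(
        tar_target(data_a, 1 + 1),
        tar_target(data_b, 2 + 2),
        tar_target(summary_ab, data_a + data_b)
      )
    }, ask = FALSE)
    # Unquote the vector with !! so it is evaluated before the pipeline
    # starts in a new R session.
    tar_make_future(names = any_of(!!wanted), workers = 2)
    # A helper that does not depend on local variables needs no unquoting.
    tar_make_future(names = starts_with("summary"), reporter = "silent")
    print(tar_read(summary_ab))
  })
}
```

The first call should build only data_a and data_b. The second call checks summary_ab and its upstream targets, which are expected to be up to date by then.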
Value
NULL except if callr_function = callr::r_bg, in which case a handle to the callr background process is returned. Either way, the value is invisibly returned.
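To make the return value concrete, here is a minimal sketch of running the pipeline in a background process and waiting on the returned handle. The variable name handle and the target x are illustrative only. The example assumes targets and future are installed, and it relies on the handle being a processx-style process object with wait() and is_alive() methods, as callr::r_bg normally returns.

```r
handle <- NULL

if (requireNamespace("targets", quietly = TRUE) &&
    requireNamespace("future", quietly = TRUE)) {
  library(targets)
  tar_dir({ # run in a temporary directory so no project files are touched
    tar_script({
      future::plan(future::multisession, workers = 2)
      list(tar_target(x, 1 + 1))
    }, ask = FALSE)
    # Pass the function object itself (no parentheses). The result is
    # returned invisibly, so assign it to keep the handle.
    handle <- tar_make_future(callr_function = callr::r_bg)
    handle$wait()             # block until the background pipeline exits
    print(handle$is_alive())  # check that the process is no longer running
    print(tar_read(x))        # read the stored result from the data store
  })
}
```

With the default callr_function = callr::r, the same call blocks until the pipeline finishes and the assigned value is simply NULL.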
Details
To configure tar_make_future() with a computing cluster, see the future.batchtools package documentation.
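As one possible starting point, the sketch below writes a target script whose future plan submits each target as a SLURM job through future.batchtools. It is an assumption-laden illustration, not a tested cluster setup: the template file name slurm.tmpl is hypothetical, the choice of SLURM is arbitrary, and the script is only written and printed here, never run. Consult the future.batchtools documentation for the backend and template that match your scheduler.

```r
if (requireNamespace("targets", quietly = TRUE)) {
  library(targets)
  tar_dir({ # temporary directory, so no real project is modified
    # tar_script() only writes this code to _targets.R; it does not run it,
    # so no scheduler or future.batchtools installation is needed here.
    tar_script({
      future::plan(
        future.batchtools::batchtools_slurm,
        template = "slurm.tmpl" # hypothetical batchtools template file
      )
      list(
        tar_target(x, 1 + 1),
        tar_target(y, 1 + 1)
      )
    }, ask = FALSE)
    writeLines(readLines("_targets.R")) # show the generated target script
    # On a machine with access to the scheduler, one would then call
    # something like: tar_make_future(workers = 2)
  })
}
```

The workers argument then caps how many jobs are in flight at once, since each transient worker corresponds to one submitted future.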
See also
Other pipeline: tar_make_clustermq(), tar_make()
Examples
if (identical(Sys.getenv("TAR_EXAMPLES"), "true")) { # for CRAN
  tar_dir({ # tar_dir() runs code from a temp dir for CRAN.
    tar_script({
      future::plan(future::multisession, workers = 2)
      list(
        tar_target(x, 1 + 1),
        tar_target(y, 1 + 1)
      )
    }, ask = FALSE)
    tar_make_future()
  })
}