Remove target values from _targets/objects/ and the cloud, and remove target metadata from _targets/meta/meta, for targets that are no longer part of the pipeline.
Usage
tar_prune(
  cloud = TRUE,
  batch_size = 1000L,
  verbose = TRUE,
  callr_function = callr::r,
  callr_arguments = targets::tar_callr_args_default(callr_function),
  envir = parent.frame(),
  script = targets::tar_config_get("script"),
  store = targets::tar_config_get("store")
)
Arguments
- cloud
Logical of length 1, whether to delete objects from the cloud if applicable (e.g. AWS, GCP). If FALSE, files are not deleted from the cloud.
- batch_size
Positive integer between 1 and 1000, number of target objects to delete from the cloud with each HTTP API request. Currently only supported for AWS. Cannot be more than 1000.
- verbose
Logical of length 1, whether to print console messages to show progress when deleting each batch of targets from each cloud bucket. Batched deletion with verbosity is currently only supported for AWS.
- callr_function
A function from callr to start a fresh clean R process to do the work. Set to NULL to run in the current session instead of an external process (but restart your R session just before you do in order to clear debris out of the global environment). callr_function needs to be NULL for interactive debugging, e.g. tar_option_set(debug = "your_target"). However, callr_function should not be NULL for serious reproducible work.
- callr_arguments
A list of arguments to callr_function.
- envir
An environment, where to run the target R script (default: _targets.R) if callr_function is NULL. Ignored if callr_function is anything other than NULL. callr_function should only be NULL for debugging and testing purposes, not for serious runs of a pipeline, etc.
The envir argument of tar_make() and related functions always overrides the current value of tar_option_get("envir") in the current R session just before running the target script file, so whenever you need to set an alternative envir, you should always set it with tar_option_set() from within the target script file. In other words, if you call tar_option_set(envir = envir1) in an interactive session and then tar_make(envir = envir2, callr_function = NULL), then envir2 will be used.
- script
Character of length 1, path to the target script file. Defaults to tar_config_get("script"), which in turn defaults to _targets.R. When you set this argument, the value of tar_config_get("script") is temporarily changed for the current function call. See tar_script(), tar_config_get(), and tar_config_set() for details about the target script file and how to set it persistently for a project.
- store
Character of length 1, path to the targets data store. Defaults to tar_config_get("store"), which in turn defaults to _targets/. When you set this argument, the value of tar_config_get("store") is temporarily changed for the current function call. See tar_config_get() and tar_config_set() for details about how to set the data store path persistently for a project.
Value
NULL except if callr_function is callr::r_bg, in which case a handle to the callr background process is returned. Either way, the value is invisibly returned.
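As a rough illustration of the background-process case, the sketch below prunes with callr_function = callr::r_bg and waits on the returned handle. The target names a and b are made up for the example, and it assumes the targets and callr packages are installed and that no cloud storage is configured.

```r
library(targets)

remaining <- tar_dir({ # Work in a temporary directory.
  tar_script({
    library(targets)
    list(
      tar_target(a, 1 + 1),
      tar_target(b, a + 1)
    )
  }, ask = FALSE)
  tar_make()
  # Drop target b from the pipeline definition.
  tar_script({
    library(targets)
    list(tar_target(a, 1 + 1))
  }, ask = FALSE)
  # With callr::r_bg, the work starts in a background R process and
  # tar_prune() invisibly returns a handle to that process.
  handle <- tar_prune(callr_function = callr::r_bg)
  handle$wait() # Block until the background process has finished.
  tar_objects() # Names of targets still stored in _targets/objects/.
})
```

Waiting on the handle before inspecting the data store matters here, because the files are removed asynchronously.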
Details
tar_prune() is useful if you recently worked through multiple changes to your project and are now trying to discard irrelevant data while keeping the results that still matter. Global objects and local files with format = "file" outside the data store are unaffected. tar_prune() also removes _targets/scratch/, which is only needed while tar_make(), tar_make_clustermq(), or tar_make_future() is running. To list the targets that will be pruned without actually removing anything, use tar_prune_list().
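One possible workflow, sketched below, is to preview the prune with tar_prune_list() and only then call tar_prune(). The pipeline mirrors the one in the Examples section. The sketch assumes the targets package is installed and no cloud storage is configured, and the variable names to_prune and result are illustrative only.

```r
library(targets)

result <- tar_dir({ # Work in a temporary directory.
  tar_script({
    library(targets)
    list(
      tar_target(y1, 1 + 1),
      tar_target(y2, 1 + 1),
      tar_target(z, y1 + y2)
    )
  }, ask = FALSE)
  tar_make()
  # Shrink the pipeline so that y2 and z are no longer part of it.
  tar_script({
    library(targets)
    list(tar_target(y1, 1 + 1))
  }, ask = FALSE)
  # Preview: names of targets that tar_prune() would remove.
  to_prune <- tar_prune_list()
  # Actually remove the stale values and metadata.
  tar_prune()
  list(to_prune = to_prune, kept = tar_objects())
})
```

Checking the preview first is a cheap safeguard when the data store holds results that were expensive to compute.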
Storage access
Several functions like tar_make(), tar_read(), tar_load(), tar_meta(), and tar_progress() read or modify the local data store of the pipeline. The local data store is in flux while a pipeline is running, and depending on how distributed computing or cloud computing is set up, not all targets can even reach it. So please do not call these functions from inside a target as part of a running pipeline. The only exception is literate programming target factories in the tarchetypes package such as tar_render() and tar_quarto().
Cloud target data versioning
Some buckets in Amazon S3 or Google Cloud Storage are "versioned", which means they track historical versions of each data object. If you use targets with cloud storage (https://books.ropensci.org/targets/cloud-storage.html) and versioning is turned on, then targets will record each version of each target in its metadata.
Functions like tar_read() and tar_load() load the version recorded in the local metadata, which may not be the same as the "current" version of the object in the bucket. Likewise, functions tar_delete() and tar_destroy() only remove the version ID of each target as recorded in the local metadata.
If you want to interact with the latest version of an object instead of the version ID recorded in the local metadata, then you will need to delete the object from the metadata.
1. Make sure your local copy of the metadata is current and up to date. You may need to run tar_meta_download() or tar_meta_sync() first.
2. Run tar_unversion() to remove the recorded version IDs of your targets in the local metadata.
3. With the version IDs gone from the local metadata, functions like tar_read() and tar_destroy() will use the latest version of each target data object.
4. Optional: to back up the local metadata file with the version IDs deleted, use tar_meta_upload().
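The steps above could be wrapped up roughly as follows. The helper use_latest_cloud_versions() and its upload_metadata argument are hypothetical names invented for this sketch, not part of targets. Calling it only makes sense in a project whose pipeline is already configured for versioned cloud storage, so the sketch defines the helper without running it.

```r
library(targets)

# Hypothetical helper, not part of the targets package.
use_latest_cloud_versions <- function(upload_metadata = FALSE) {
  # Step 1: bring the local metadata up to date
  # (tar_meta_sync() is an alternative).
  tar_meta_download()
  # Step 2: remove the recorded version IDs from the local metadata.
  tar_unversion()
  # Step 3 needs no code: tar_read(), tar_destroy(), etc. now use
  # the latest version of each target data object.
  # Step 4 (optional): back up the metadata with version IDs removed.
  if (upload_metadata) {
    tar_meta_upload()
  }
  invisible(NULL)
}
```

Keeping the upload optional mirrors the text: overwriting the cloud copy of the metadata is a deliberate choice rather than a default.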
See also
tar_prune_inspect
Other clean: tar_delete(), tar_destroy(), tar_invalidate(), tar_prune_list(), tar_unversion()
Examples
if (identical(Sys.getenv("TAR_EXAMPLES"), "true")) { # for CRAN
  tar_dir({ # tar_dir() runs code from a temp dir for CRAN.
    tar_script({
      library(targets)
      library(tarchetypes)
      list(
        tar_target(y1, 1 + 1),
        tar_target(y2, 1 + 1),
        tar_target(z, y1 + y2)
      )
    }, ask = FALSE)
    tar_make()
    # Remove some targets from the pipeline.
    tar_script(list(tar_target(y1, 1 + 1)), ask = FALSE)
    # Keep only the remaining targets in the data store.
    tar_prune()
  })
}