Snapshot the Git data repository of a targets project.
Usage
tar_git_snapshot(
  message = NULL,
  ref = "HEAD",
  code = getwd(),
  script = targets::tar_config_get("script"),
  store = targets::tar_config_get("store"),
  stash_gitignore = TRUE,
  reporter = targets::tar_config_get("reporter_outdated"),
  envir = parent.frame(),
  callr_function = callr::r,
  callr_arguments = NULL,
  status = interactive(),
  force = FALSE,
  pack_refs = TRUE,
  verbose = TRUE
)
Arguments
- message
Optional Git commit message of the data snapshot. If NULL, then the message is the Git commit message of the matching code commit.
- ref
Character of length 1, reference (branch name, Git SHA1 hash, etc.) of the code commit that will map to the new data snapshot. Defaults to the commit checked out right now.
- code
Character of length 1, directory path to the code repository, usually the root of the targets project.
- script
Character of length 1, path to the target script file. Defaults to tar_config_get("script"), which in turn defaults to _targets.R. When you set this argument, the value of tar_config_get("script") is temporarily changed for the current function call. See tar_script(), tar_config_get(), and tar_config_set() for details about the target script file and how to set it persistently for a project.
- store
Character of length 1, path to the data store of the pipeline. If NULL, the store setting is left unchanged in the YAML configuration file (default: _targets.yaml). Usually, the data store lives at _targets. Set store to a custom directory to specify a path other than _targets/. The path need not exist before the pipeline begins, and it need not end with "_targets", but it must be writeable. For optimal performance, choose a storage location with fast read/write access. If the argument is NULL, the setting is not modified. Use tar_config_unset() to delete a setting.
- stash_gitignore
Logical of length 1, whether to temporarily stash the .gitignore file of the data store. See the "Stashing .gitignore" section for details.
- reporter
Character of length 1, name of the reporter to use. Controls how messages are printed as targets are checked. Choices:
  - "silent": print nothing.
  - "forecast": print running totals of the checked and outdated targets found so far.
- envir
An environment, where to run the target R script (default: _targets.R) if callr_function is NULL. Ignored if callr_function is anything other than NULL. callr_function should only be NULL for debugging and testing purposes, not for serious runs of a pipeline, etc.

The envir argument of tar_make() and related functions always overrides the current value of tar_option_get("envir") in the current R session just before running the target script file, so whenever you need to set an alternative envir, you should always set it with tar_option_set() from within the target script file. In other words, if you call tar_option_set(envir = envir1) in an interactive session and then tar_make(envir = envir2, callr_function = NULL), then envir2 will be used.
- callr_function
A function from callr to start a fresh clean R process to do the work. Set to NULL to run in the current session instead of an external process (but restart your R session just before you do in order to clear debris out of the global environment). callr_function needs to be NULL for interactive debugging, e.g. tar_option_set(debug = "your_target"). However, callr_function should not be NULL for serious reproducible work.
- callr_arguments
A list of arguments to callr_function.
- status
Logical of length 1, whether to print the project status with tar_git_status() and ask whether a snapshot should be created.
- force
Logical of length 1. Force checkout the data branch of an existing data snapshot of the current code commit?
- pack_refs
Logical of length 1, whether to run git pack-refs --all in the data store after taking the snapshot. Packing references improves efficiency when the number of snapshots is large. Learn more at https://git-scm.com/docs/git-pack-refs.
- verbose
Logical of length 1, whether to print R console messages confirming that a snapshot was created.
Details
A Git-backed gittargets data snapshot is a special kind of Git commit. Every data commit is part of a branch specific to the current code commit. When you switch branches or commits in the code, tar_git_checkout() checks out the latest data snapshot that matches the code in your workspace. That way, your targets can stay up to date even as you transition among multiple branches.
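The branch-per-code-commit behavior described above can be sketched as follows. This is a minimal illustration, not an official workflow: it assumes the targets, gert, and gittargets packages are installed and that Git is configured, and the helper name snapshot_across_branches() and the branch name "feature" are hypothetical choices made only for this example.

```r
# Hypothetical helper (not part of gittargets): take one data snapshot per
# code branch, then return to the first branch and restore its data.
snapshot_across_branches <- function() {
  targets::tar_dir({ # Work in a temporary directory, as in the Examples.
    # First code commit and its data snapshot.
    targets::tar_script(tar_target(data, 1), ask = FALSE)
    targets::tar_make()
    gert::git_init()
    gert::git_add("_targets.R")
    gert::git_commit("First commit")
    original <- gert::git_branch() # Name of the branch currently checked out.
    gittargets::tar_git_init()
    gittargets::tar_git_snapshot(status = FALSE)

    # Second code commit on an illustrative branch, with its own snapshot.
    gert::git_branch_create("feature") # Creates and checks out the branch.
    targets::tar_script(tar_target(data, 2), ask = FALSE)
    targets::tar_make()
    gert::git_add("_targets.R")
    gert::git_commit("Second commit")
    gittargets::tar_git_snapshot(status = FALSE)

    # Switch the code back, then check out the data snapshot matching HEAD.
    gert::git_branch_checkout(original)
    gittargets::tar_git_checkout()
  })
}

# Same guard as the Examples section: run only when explicitly enabled.
if (Sys.getenv("TAR_EXAMPLES") == "true" &&
    requireNamespace("gittargets", quietly = TRUE) &&
    gittargets::tar_git_ok(verbose = FALSE)) {
  snapshot_across_branches()
}
```

Recording the original branch name with gert::git_branch() avoids assuming a default branch name such as "main" or "master".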
Stashing .gitignore
The targets package writes a .gitignore file to new data stores in order to prevent accidental commits to the code Git repository. Unfortunately, for gittargets, this automatic .gitignore file interferes with proper data versioning. So by default, gittargets temporarily stashes it to a hidden file called .gittargets_gitignore inside the data store. If your R program crashes while the stash is active, you can simply move it manually back to .gitignore or run tar_git_status_data() to restore the stash automatically if no .gitignore already exists.
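The manual recovery step described above can be sketched in base R. The helper restore_gitignore() is hypothetical and not part of gittargets; it only assumes the file names given in this section (.gittargets_gitignore and .gitignore inside the data store) and refuses to overwrite an existing .gitignore.

```r
# Hypothetical helper (not part of gittargets): move a leftover stash file
# back to .gitignore inside the data store after a crash.
restore_gitignore <- function(store = "_targets") {
  stash <- file.path(store, ".gittargets_gitignore")
  gitignore <- file.path(store, ".gitignore")
  # Nothing to restore if no stash file is present.
  if (!file.exists(stash)) {
    return(invisible(FALSE))
  }
  # Do not overwrite a .gitignore file that already exists.
  if (file.exists(gitignore)) {
    return(invisible(FALSE))
  }
  # file.rename() returns TRUE on success.
  invisible(file.rename(stash, gitignore))
}
```

Calling restore_gitignore() from the project root acts on the default _targets store; pass the store argument if your data store lives elsewhere.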
See also
Other git: tar_git_checkout(), tar_git_init(), tar_git_log(), tar_git_ok(), tar_git_status_code(), tar_git_status_data(), tar_git_status_targets(), tar_git_status()
Examples
if (Sys.getenv("TAR_EXAMPLES") == "true" && tar_git_ok(verbose = FALSE)) {
  targets::tar_dir({ # Containing code does not modify the user's filespace.
    # Write a one-target pipeline to _targets.R and run it.
    targets::tar_script(tar_target(data, 1))
    targets::tar_make()
    # Put the code under Git version control.
    gert::git_init()
    gert::git_add("_targets.R")
    gert::git_commit("First commit")
    # Initialize the data repository and snapshot it without prompting.
    tar_git_init()
    tar_git_snapshot(status = FALSE)
  })
}