Get the fingerprints of all the targets in a data frame.
This functionality is like make(..., cache_log_file = TRUE), but separated and more customizable. Hopefully, this functionality is a step toward better data versioning tools.
Usage
drake_cache_log(
  path = NULL,
  search = NULL,
  cache = drake::drake_cache(path = path),
  verbose = 1L,
  jobs = 1,
  targets_only = FALSE
)
Arguments
- path
Path to a drake cache (usually a hidden .drake/ folder) or NULL.
- search
Deprecated.
- cache
drake cache. See new_cache(). If supplied, path is ignored.
- verbose
Deprecated on 2019-09-11.
- jobs
Number of jobs/workers for parallel processing.
- targets_only
Logical, whether to output information only on the targets in your workflow plan data frame. If targets_only is FALSE, the output will include the hashes of both targets and imports.
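As a rough illustration of the targets_only argument, the sketch below builds a tiny throwaway project and requests the log twice. The plan, the double_it() function, and the temporary cache path are hypothetical names chosen for this example, and the name column used in the last line is an assumption about the returned data frame, not something stated above.

```r
library(drake)

# Hypothetical throwaway cache so the example does not touch a real project.
cache <- new_cache(path = tempfile())

# A user-defined function that the plan depends on (tracked as an import).
double_it <- function(x) 2 * x

# A one-target plan, invented for illustration.
plan <- drake_plan(small = double_it(1))
make(plan, cache = cache)

# Hashes of targets and imports together (the default).
full_log <- drake_cache_log(cache = cache)

# Hashes of the targets only.
targets_log <- drake_cache_log(cache = cache, targets_only = TRUE)

# Assumed column: `name` holds the target or import name.
print(targets_log$name)
```

With targets_only = TRUE the log should contain no more rows than the full log, since the imports are left out.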
Details
A hash is a fingerprint of an object's value.
Together, the hash keys of all your targets and imports
represent the state of your project.
Use drake_cache_log() to generate a data frame with the hash keys of all the targets and imports stored in your cache.
This function is particularly useful if you are
storing your drake project in a version control repository.
The cache has a lot of tiny files, so you should not put it under version control. Instead, save the output of drake_cache_log() as a text file after each make(), and put the text file under version control. That way, you have a changelog of your project's results.
See the examples below for details.
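To make the changelog idea concrete, here is a minimal base R sketch that compares two saved logs and reports which entries changed. The helper changed_entries() is hypothetical and not part of drake, the two small data frames are made-up stand-ins for logs read back from disk, and the column names hash, type, and name are an assumption about the log layout.

```r
# Hypothetical helper (not part of drake): list entries whose hash differs
# between two cache logs, including entries present in only one of them.
changed_entries <- function(old_log, new_log) {
  merged <- merge(
    old_log, new_log,
    by = c("name", "type"),
    all = TRUE,
    suffixes = c("_old", "_new")
  )
  differs <- is.na(merged$hash_old) |
    is.na(merged$hash_new) |
    merged$hash_old != merged$hash_new
  merged[differs, , drop = FALSE]
}

# Made-up logs standing in for two versions of a saved log file.
old_log <- data.frame(
  hash = c("aaa", "bbb"),
  type = c("target", "import"),
  name = c("fit", "my_function"),
  stringsAsFactors = FALSE
)
new_log <- data.frame(
  hash = c("ccc", "bbb", "ddd"),
  type = c("target", "import", "target"),
  name = c("fit", "my_function", "report"),
  stringsAsFactors = FALSE
)

changed <- changed_entries(old_log, new_log)
print(changed)
```

In a real project, the two inputs could come from read.table(file, header = TRUE, stringsAsFactors = FALSE) applied to two committed versions of the log file.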
Depending on your project's
history, the targets may be different than the ones
in your workflow plan data frame.
Also, the keys depend on the hash algorithm of your cache. To define your own hash algorithm, you can create your own storr cache and give it a hash algorithm (e.g. storr_rds(hash_algorithm = "murmur32")).
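A minimal sketch of the custom hash algorithm route follows. It assumes the storr package is installed, that make() accepts a storr cache through its cache argument, and that a temporary directory is an acceptable cache location. The one-target plan and the hash column name are likewise assumptions made for illustration.

```r
library(drake)
library(storr)

# Create a storr cache with a non-default hash algorithm.
# storr_rds() needs a path; a temporary directory is used here.
cache <- storr_rds(tempfile(), hash_algorithm = "murmur32")

# A one-target plan, invented for illustration.
plan <- drake_plan(x = 1 + 1)
make(plan, cache = cache)

# The hash keys in the log come from the cache's hash algorithm.
cache_log <- drake_cache_log(cache = cache)
print(cache_log)
```

Because the keys depend on the algorithm, logs produced with different hash algorithms are not comparable with one another.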
Examples
if (FALSE) { # \dontrun{
isolate_example("Quarantine side effects.", {
  if (suppressWarnings(require("knitr"))) {
    # Load drake's canonical example.
    load_mtcars_example() # Get the code with drake_example()
    # Run the project, build all the targets.
    make(my_plan)
    # Get a data frame of all the hash keys.
    # If you want a changelog, be sure to do this after every make().
    cache_log <- drake_cache_log()
    head(cache_log)
    # Suppress partial arg match warnings.
    suppressWarnings(
      # Save the hash log as a flat text file.
      write.table(
        x = cache_log,
        file = "drake_cache.log",
        quote = FALSE,
        row.names = FALSE
      )
    )
    # At this point, put drake_cache.log under version control
    # (e.g. with 'git add drake_cache.log') alongside your code.
    # Now, every time you run your project, your commit history
    # of drake_cache.log is a changelog of the project's results.
    # It shows which targets and imports changed on every commit.
    # It is extremely difficult to track your results this way
    # by putting the raw '.drake/' cache itself under version control.
  }
})
} # }