Retrieve the most recent commit that added or updated a file or git2rdata object. This does not imply that the file still exists at the current HEAD, because the function ignores the deletion of files.
Use this information to document the current version of a file or git2rdata
object in an analysis. Since it refers to the most recent change of this
file, it remains unchanged by committing changes to other files. You can
also use it to track if data got updated, requiring an analysis to
be rerun. See vignette("workflow", package = "git2rdata").
Arguments
- file
The name of the git2rdata object. Git2rdata objects cannot have dots in their name. The name may include a relative path.
file is a path relative to the root. Note that file must point to a location within root.
- root
The root of a project. Can be a file path or a git-repository.
- data
Does file refer to a data object (TRUE) or to a file (FALSE)? Defaults to FALSE.
Value
A data.frame with commit, author and when for the most recent
commit that adds or updates the file.
See also
Other version_control:
commit(),
pull(),
push(),
repository(),
status()
Examples
# initialise a git repo using git2r
repo_path <- tempfile("git2rdata-repo")
dir.create(repo_path)
repo <- git2r::init(repo_path)
git2r::config(repo, user.name = "Alice", user.email = "alice@example.org")
# write and commit a first dataframe
# store the output of write_vc() to minimize screen output
junk <- write_vc(
iris[1:6, ], "iris", repo, sorting = "Sepal.Length", stage = TRUE,
digits = 6
)
commit(repo, "important analysis", session = TRUE)
#> [5cfd1c6] 2025-10-26: important analysis
list.files(repo_path)
#> [1] "iris.tsv" "iris.yml"
Sys.sleep(1.1) # required because git doesn't handle subsecond timings
# write and commit a second dataframe
junk <- write_vc(
iris[7:12, ], "iris2", repo, sorting = "Sepal.Length", stage = TRUE,
digits = 6
)
commit(repo, "important analysis", session = TRUE)
#> [2e6d21c] 2025-10-26: important analysis
list.files(repo_path)
#> [1] "iris.tsv" "iris.yml" "iris2.tsv" "iris2.yml"
Sys.sleep(1.1) # required because git doesn't handle subsecond timings
# write and commit a new version of the first dataframe
junk <- write_vc(iris[7:12, ], "iris", repo, stage = TRUE)
list.files(repo_path)
#> [1] "iris.tsv" "iris.yml" "iris2.tsv" "iris2.yml"
commit(repo, "important analysis", session = TRUE)
#> [2b6d66d] 2025-10-26: important analysis
# find out in which commit a file was last changed
# "iris.tsv" was last updated in the third commit
recent_commit("iris.tsv", repo)
#> commit author when
#> 1 2b6d66d14360bc43d96b27b17d9e79d4e1d00b42 Alice 2025-10-26 05:23:39
# "iris.yml" was also last updated in the third commit
recent_commit("iris.yml", repo)
#> commit author when
#> 1 2b6d66d14360bc43d96b27b17d9e79d4e1d00b42 Alice 2025-10-26 05:23:39
# "iris2.yml" was last updated in the second commit
recent_commit("iris2.yml", repo)
#> commit author when
#> 1 2e6d21c3dcbf9265b3b9b8af8a585fd68be63452 Alice 2025-10-26 05:23:38
# the git2rdata object "iris" was last updated in the third commit
recent_commit("iris", repo, data = TRUE)
#> commit author when
#> 1 2b6d66d14360bc43d96b27b17d9e79d4e1d00b42 Alice 2025-10-26 05:23:39
# remove a dataframe and commit it to see what happens with deleted files
file.remove(file.path(repo_path, "iris.tsv"))
#> [1] TRUE
prune_meta(repo, ".")
commit(repo, message = "remove iris", all = TRUE, session = TRUE)
#> [19637f3] 2025-10-26: remove iris
list.files(repo_path)
#> [1] "iris2.tsv" "iris2.yml"
# still points to the third commit as this is the latest commit in which the
# data was present
recent_commit("iris", repo, data = TRUE)
#> commit author when
#> 1 2b6d66d14360bc43d96b27b17d9e79d4e1d00b42 Alice 2025-10-26 05:23:39
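The description mentions using the result to detect whether data got updated, so that an analysis needs to be rerun. The following is a minimal sketch of that idea, not a prescribed workflow. It assumes that the analysis keeps the commit hash it was based on. The variables analysed, current, rerun_before and rerun_after are hypothetical names introduced for this illustration.

```r
library(git2rdata)

# throwaway repository, set up in the same way as in the examples above
repo_path <- tempfile("git2rdata-rerun")
dir.create(repo_path)
repo <- git2r::init(repo_path)
git2r::config(repo, user.name = "Alice", user.email = "alice@example.org")

# write and commit a first version of the data
junk <- write_vc(
  iris[1:6, ], "iris", repo, sorting = "Sepal.Length", stage = TRUE,
  digits = 6
)
junk <- commit(repo, "first version")

# hypothetical bookkeeping: remember the commit the analysis was based on
analysed <- as.character(recent_commit("iris", repo, data = TRUE)$commit)

# the data did not change since the analysis, so the hashes match
current <- as.character(recent_commit("iris", repo, data = TRUE)$commit)
rerun_before <- !identical(current, analysed)

Sys.sleep(1.1) # required because git doesn't handle subsecond timings
# write and commit a new version of the same git2rdata object
junk <- write_vc(iris[7:12, ], "iris", repo, stage = TRUE)
junk <- commit(repo, "second version")

# the most recent commit for "iris" now differs from the stored hash
current <- as.character(recent_commit("iris", repo, data = TRUE)$commit)
rerun_after <- !identical(current, analysed)
```

Comparing hashes rather than timestamps avoids the one-second resolution of git timestamps. Committing changes to other files leaves the hash for "iris" untouched, so only changes to this object trigger a rerun.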