Remove the data (.tsv) file from all valid git2rdata objects at the path. The metadata remains untouched. A warning lists any git2rdata object with invalid metadata. The function keeps any .tsv file with invalid metadata or from non-git2rdata objects.

Use this function with caution: it removes all valid data files without asking for confirmation. We strongly recommend using it only on files under version control. See vignette("workflow", package = "git2rdata") for examples of how to use it.
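One cautious pattern, sketched here under assumptions, is to list the git2rdata objects at the path with list_data() before calling rm_data(), and to capture the invisible return value afterwards. The temporary directory and the variable names (root, candidates, removed) are illustrative only and not part of the package.

```r
library(git2rdata)

# illustrative project root in a temporary directory
root <- tempfile("git2rdata-preview-")
dir.create(root)
junk <- write_vc(iris[1:6, ], "iris", root, sorting = "Sepal.Length")

# preview: which git2rdata objects live under the path?
candidates <- list_data(root, path = ".")
print(candidates)

# remove the data files and capture the invisible return value
removed <- rm_data(root, path = ".")
print(removed)

# the metadata file should remain, the data file should be gone
file.exists(file.path(root, c("iris.yml", "iris.tsv")))
```

Because the return value is invisible, nothing is printed unless it is assigned or wrapped in print().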

Usage

rm_data(root = ".", path = NULL, recursive = TRUE, ...)

# S3 method for git_repository
rm_data(
  root,
  path = NULL,
  recursive = TRUE,
  ...,
  stage = FALSE,
  type = c("unmodified", "modified", "ignored", "all")
)



Arguments

root
The root of a project. Can be a file path or a git-repository. Defaults to the current working directory (".").

path
The directory in which to clean all the data files. The directory is relative to root.

recursive
Remove files in subdirectories too.

...
Parameters used in some methods.

stage
Stage the changes after removing the files. Defaults to FALSE.

type
Defines the classes of files to remove. unmodified are files in the git history and unchanged since the last commit. modified are files in the git history and changed since the last commit. ignored refers to files listed in a .gitignore file. Selecting modified will remove both unmodified and modified data files. Selecting ignored will remove unmodified, modified and ignored data files. all refers to all visible data files, including untracked files.

Value

Returns invisibly a vector of removed file names. The paths are relative to root.
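The cumulative meaning of the type argument can be summarised in a small lookup. The helper below is purely hypothetical (classes_covered() is not part of git2rdata, and the label "untracked" is taken from the description of all); it only restates the documented rules in base R.

```r
# hypothetical helper: which classes of data files does each `type` cover?
classes_covered <- function(
  type = c("unmodified", "modified", "ignored", "all")
) {
  type <- match.arg(type)
  switch(
    type,
    unmodified = "unmodified",
    modified = c("unmodified", "modified"),
    ignored = c("unmodified", "modified", "ignored"),
    all = c("unmodified", "modified", "ignored", "untracked")
  )
}

classes_covered("modified")
classes_covered("all")
```

Each option includes everything covered by the options listed before it, so the default, unmodified, is the most conservative choice.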

See also

Other storage: list_data(), prune_meta(), read_vc(), relabel(), write_vc()


Examples

library(git2rdata)

## on file system

# create a directory
root <- tempfile("git2rdata-")
dir.create(root)

# store a dataframe as git2rdata object. Capture the result to minimise
# screen output
junk <- write_vc(iris[1:6, ], "iris", root, sorting = "Sepal.Length")
# write a standard tab separated file (non git2rdata object)
write.table(iris, file = file.path(root, "standard.tsv"), sep = "\t")
# write a YAML file
yml <- list(
  authors = list(
    "Research Institute for Nature and Forest" = list(
      href = "")))
yaml::write_yaml(yml, file = file.path(root, "_pkgdown.yml"))

# list the git2rdata objects
list_data(root)
#> [1] "iris"
# list the files
list.files(root, recursive = TRUE)
#> [1] "_pkgdown.yml" "iris.tsv" "iris.yml" "standard.tsv"

# remove all .tsv files from valid git2rdata objects
rm_data(root, path = ".")
# check the removal of the .tsv file
list.files(root, recursive = TRUE)
#> [1] "_pkgdown.yml" "iris.yml" "standard.tsv"
list_data(root)
#> character(0)

# remove dangling git2rdata metadata files
prune_meta(root, path = ".")
#> Warning: Invalid metadata files found. See ?is_git2rmeta():
#> _pkgdown.yml
# check the removal of the metadata
list.files(root, recursive = TRUE)
#> [1] "_pkgdown.yml" "standard.tsv"
list_data(root)
#> character(0)
## on git repo

# initialise a git repo using git2r
repo_path <- tempfile("git2rdata-repo-")
dir.create(repo_path)
repo <- git2r::init(repo_path)
git2r::config(repo, user.name = "Alice", user.email = "[email protected]")

# store a dataframe
write_vc(iris[1:6, ], "iris", repo, sorting = "Sepal.Length", stage = TRUE)
#> 09d5bfd6a65e682a4ca030c766348180861568c8
#> "iris.tsv"
#> 0d434e56d22a710c99c5b912e8624d52abd41aaf
#> "iris.yml"
# check that the dataframe is stored
status(repo)
#> Staged changes:
#> New: iris.tsv
#> New: iris.yml
#>
list_data(repo)
#> [1] "iris"

# commit the current version and check the git repo
commit(repo, "add iris data", session = TRUE)
#> [14482fb] 2020-10-20: add iris data
status(repo)
#> working directory clean

# remove the data files from the repo
rm_data(repo, path = ".")
# check the removal
list_data(repo)
#> character(0)
status(repo)
#> Unstaged changes:
#> Deleted: iris.tsv
#>

# remove dangling metadata
prune_meta(repo, path = ".")
# check the removal
list_data(repo)
#> character(0)
status(repo)
#> Unstaged changes:
#> Deleted: iris.tsv
#> Deleted: iris.yml
#>

# clean up
junk <- file.remove(
  list.files(root, full.names = TRUE, recursive = TRUE), root)
junk <- file.remove(
  rev(list.files(repo_path, full.names = TRUE, recursive = TRUE,
                 include.dirs = TRUE, all.files = TRUE)),
  repo_path)
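The git examples above leave the deletions unstaged. As a minimal sketch, assuming a fresh temporary repository and a placeholder identity (the e-mail address below is made up for illustration), setting stage = TRUE should stage the removal in the same call. The variable names stage_path and stage_repo are illustrative only.

```r
library(git2rdata)

# illustrative repository in a temporary directory
stage_path <- tempfile("git2rdata-stage-")
dir.create(stage_path)
stage_repo <- git2r::init(stage_path)
# placeholder identity, assumed for this sketch only
git2r::config(
  stage_repo, user.name = "Alice", user.email = "alice@example.org"
)

# store and commit a dataframe so the data file is "unmodified"
junk <- write_vc(
  iris[1:6, ], "iris", stage_repo, sorting = "Sepal.Length", stage = TRUE
)
junk <- git2r::commit(stage_repo, "add iris data")

# remove the data file and stage the deletion in one step
removed <- rm_data(stage_repo, path = ".", stage = TRUE)

# inspect the result: the deletion is expected under the staged changes
git2r::status(stage_repo)
```

With the default stage = FALSE the same call would leave the deletion in the working directory only, which makes it easier to restore the file with git before committing.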