Creates a data frame from information encoded in directory paths and file names. It searches the given directories to collect the path names of the files, then parses each path name using either a template or a regular expression, optionally combined with column names.

dirdf(paths, template = NULL, regexp = NULL, colnames = NULL,
  missing = NA_character_, recursive = TRUE, ...)

dirdf2(paths, template = NULL, regexp = NULL, colnames = NULL,
  missing = NA_character_, recursive = TRUE, ...)

Arguments

paths

character vector with zero or more paths that will be searched.

template

template character string, e.g. "Country/Province/City/StationID_Date.ext".

regexp

regular expression used to parse the file names. Specify either regexp or template, not both: only one of them can be non-NULL.

colnames

character vector containing the names of the columns in the data frame. Not required if using template or if regexp uses named capturing groups (see examples), but may still be used to override column names.

missing

value to use for unmatched optional template elements or regexp capturing groups.

recursive

if TRUE, directories are searched recursively.

...

Additional arguments passed to base::dir().
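
To make the recursive and ... arguments more concrete, the sketch below uses only base::dir() on a throwaway directory tree. It does not call dirdf(), and the directory and file names are hypothetical; it only illustrates the kind of relative path names a recursive search produces and how an extra base::dir() argument such as pattern narrows the listing.

```r
# Build a small throwaway directory tree (all names are hypothetical).
root <- tempfile("dirdf_sketch_")
dir.create(file.path(root, "Canada", "Ontario"), recursive = TRUE)
invisible(file.create(file.path(root, "Canada", "Ontario", "ST01_2020-01-01.csv")))
invisible(file.create(file.path(root, "README.txt")))

# Recursive search: path names of files at any depth, relative to root.
all_files <- dir(root, recursive = TRUE)

# Non-recursive search: only the top-level entries of root.
top_level <- dir(root, recursive = FALSE)

# 'pattern' is a base::dir() argument; here it keeps only .csv files.
csv_files <- dir(root, recursive = TRUE, pattern = "\\.csv$")

print(all_files)
print(top_level)
print(csv_files)

# Clean up the temporary tree.
unlink(root, recursive = TRUE)
```

Relative path names of this shape are what a template such as "Country/Province/City/StationID_Date.ext" is written against.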

Examples

path1 <- system.file(package = "dirdf", "examples", "dataset_1")
template1 <- "Year-Month-Day_Assay_Plasmid-Type-Fraction_WellNumber?.extension"
files1 <- dirdf(path1, template = template1)
print(files1)
#>   Year Month Day          Assay Plasmid     Type            Fraction WellNumber
#> 1 2013    06  26 BRAFWTNEGASSAY Plasmid Cellline 100-1MutantFraction        A01
#> 2 2013    06  26 BRAFWTNEGASSAY Plasmid Cellline 100-1MutantFraction        A02
#> 3 2014    02  26 BRAFWTNEGASSAY FFPEDNA      CRC                1-41        D08
#> 4 2014    03  05 BRAFWTNEGASSAY FFPEDNA      CRC              REPEAT  platefile
#> 5 2016    04  01 BRAFWTNEGASSAY FFPEDNA      CRC                1-41       <NA>
#>   extension
#> 1       csv
#> 2       csv
#> 3       csv
#> 4       csv
#> 5       csv
#>                                                                 pathname
#> 1 2013-06-26_BRAFWTNEGASSAY_Plasmid-Cellline-100-1MutantFraction_A01.csv
#> 2 2013-06-26_BRAFWTNEGASSAY_Plasmid-Cellline-100-1MutantFraction_A02.csv
#> 3                     2014-02-26_BRAFWTNEGASSAY_FFPEDNA-CRC-1-41_D08.csv
#> 4             2014-03-05_BRAFWTNEGASSAY_FFPEDNA-CRC-REPEAT_platefile.csv
#> 5                         2016-04-01_BRAFWTNEGASSAY_FFPEDNA-CRC-1-41.csv

path2 <- system.file(package = "dirdf", "examples", "dataset_2")
template2 <- "Date_Assay_Experiment_WellNumber?.extension"
files2 <- dirdf(path2, template = template2)
print(files2)
#>         Date      Assay                           Experiment WellNumber
#> 1 2011-12-16 OTHERASSAY                     FFPEDNA-CRC-1-41        D08
#> 2 2013-06-26 OTHERASSAY Plasmid-Cellline-100-1MutantFraction        B02
#> 3 2014-03-05 OTHERASSAY                   FFPEDNA-CRC-REPEAT  platefile
#> 4 2014-07-06 OTHERASSAY Plasmid-Cellline-100-1MutantFraction        B01
#> 5 2016-01-11 OTHERASSAY                     FFPEDNA-CRC-2-41       <NA>
#>   extension                                                           pathname
#> 1       csv                     2011-12-16_OTHERASSAY_FFPEDNA-CRC-1-41_D08.csv
#> 2       csv 2013-06-26_OTHERASSAY_Plasmid-Cellline-100-1MutantFraction_B02.csv
#> 3       csv             2014-03-05_OTHERASSAY_FFPEDNA-CRC-REPEAT_platefile.csv
#> 4       csv 2014-07-06_OTHERASSAY_Plasmid-Cellline-100-1MutantFraction_B01.csv
#> 5       csv                         2016-01-11_OTHERASSAY_FFPEDNA-CRC-2-41.csv
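
The colnames argument mentions regular expressions with named capturing groups. As a rough illustration of that idea, the base R sketch below parses two hypothetical file names with a Perl-style regular expression whose named groups become column names, and fills an unmatched optional group with NA. It does not call dirdf() and is not necessarily how dirdf() works internally; the regular expression and file names are assumptions made for this example.

```r
# Hypothetical file names following the naming scheme of the first example.
pathnames <- c(
  "2014-02-26_BRAFWTNEGASSAY_FFPEDNA-CRC-1-41_D08.csv",
  "2016-04-01_BRAFWTNEGASSAY_FFPEDNA-CRC-1-41.csv"
)

# Perl-style named capturing groups; the WellNumber group is optional.
regexp <- paste0(
  "^(?<Date>\\d{4}-\\d{2}-\\d{2})_(?<Assay>[^_]+)_(?<Experiment>[^_]+?)",
  "(?:_(?<WellNumber>[^_.]+))?\\.(?<extension>[^.]+)$"
)

m <- regexpr(regexp, pathnames, perl = TRUE)
stopifnot(all(m > 0))  # every file name must match the expression

# Extract each group's text from its start position and length.
starts <- attr(m, "capture.start")
lens <- attr(m, "capture.length")
fields <- matrix(
  substring(pathnames, starts, starts + lens - 1),
  nrow = length(pathnames),
  dimnames = list(NULL, attr(m, "capture.names"))
)

# Groups that did not take part in the match come back empty; mark them missing.
fields[!nzchar(fields)] <- NA_character_

files <- as.data.frame(fields, stringsAsFactors = FALSE)
files$pathname <- pathnames
print(files)
```

Because the group names supply the column names, no separate colnames vector is needed in this sketch.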