The goal of the dataspice package is to make it easier for researchers to create basic, lightweight and concise metadata files for their datasets. These basic files can then be used to:

  • make useful information available during analysis.
  • Create a helpful dataset README webpage.
  • Produce more complex metadata formats to aid dataset discovery. [future goal]

Our metadata fields are based on schema.org and other metadata standards.

Step 1 - Start with one or more files

To start the user will have one or more datafiles in a common directory. We currently support rectagular data with headers in a .csv file or spatial data with attributes.

Step 2 - Fill in Templates

create_spice() reads in the files from that directory and creates a set of template metadata files for the user to populate. These metadata templates will include some data extracted from the users datafiles, including things like file names and measured variable names to aid the user in populating the metadata files.

Once these are created, the user needs to open each and fill in the missing data, as completely as possible and read the files back into R.

Metadata Files

  • creators.csv: one row for each creator, and gives their affiliation and contact email

  • attributes.csv: this is where most of the user data entry will take place, and will be where the measured variable, its units and a written description are provided for every measured variable

  • biblio.csv: citation information about the project, as much or as little data as possible can be included, but if things like bounding box coordinates are not included, then when the website is generated there will not be a bounding box map generated

  • access.csv: includes a row for each file that was read in, and documents the name of each file and its format

Step 3 - Save metadata in JSON

Once the populated the metadata files are read back R they can then be fed into write_JSON(), which converts those tabular metadata files into JSON files.

Step 4 - Create a datadown website

A dataset README website is an interactive representation of the JSON information about the data. Assuming sufficient information is provided in the JSON it will include a map of the points and a bounding box of the area of study.

The output from write_JSON() is fed into build_site() to create the website.

serve_site() ?

[example JSON website image]