
The deposits package is a universal client for depositing and accessing research data anywhere. Currently supported services are zenodo and figshare. These two systems have fundamentally different interfaces (APIs, or Application Programming Interfaces), and access to these, and indeed to all deposition services, has traditionally required a separate software client for each. The deposits package instead offers access to a variety of deposition services through a single client, without users having to know the specific details of each service’s API. This vignette demonstrates the primary functionality of the deposits package.

The deposits client

The deposits package uses an R6 client to interface with the individual deposition services. The following sub-section explains the properties of a deposits client for those unfamiliar with R6 objects.

R6 methods

The R6 package used to construct deposits clients here allows for structured class objects in R. The objects include elements (such as variables) and methods, which in R are generally functions. A new client can be constructed with the $new() method, which for deposits requires specifying the service for which the client is to be constructed:

cli <- depositsClient$new (service = "figshare")

Additional functions are called in a similar way, using the notation cli$deposit_function(). The deposits package is designed so that functions called in this way will “automatically” update the object itself, and so their results generally do not need to be assigned to a return value. For example, the function deposits_list() updates the list of deposits on the associated service. In conventional R packages, calling this function would require assigning a return value like this:

cli_updated <- cli$deposits_list ()

R6 objects are, however, always updated internally, so the client itself, cli, will already include the updated list of deposits without any need for assigning the return value to cli_updated. That is, rather than the above line, all deposits functions may be called simply as,

cli$deposits_list ()

(The single exception to this is the deposit_download_file() function, which returns the path to the locally downloaded file, and so should always be assigned to a return value.)
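The in-place updating described above can be illustrated with a minimal sketch that does not use deposits at all. The ToyClient class below is hypothetical and exists only for this illustration; its deposits_list() method merely pretends to query a service. It assumes the R6 package is installed.

```r
library (R6)

# Hypothetical toy class for illustration only; NOT part of the deposits package.
ToyClient <- R6Class ("ToyClient",
    public = list (
        deposits = NULL,
        deposits_list = function () {
            # Pretend to query a remote service, and store the result
            # in a field of the object itself.
            self$deposits <- data.frame (id = c (1L, 2L))
            invisible (self)
        }
    )
)

toy <- ToyClient$new ()
before <- is.null (toy$deposits) # nothing has been listed yet

toy$deposits_list () # called without assigning any return value

after <- nrow (toy$deposits) # the object itself now holds the listed rows
```

Because R6 objects have reference semantics, the call to toy$deposits_list() changes toy directly, which is the same pattern the deposits client relies on.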

Initialising a deposits client

An empty client can be constructed by naming the desired service. An additional sandbox parameter constructs a client to the zenodo sandbox environment intended for testing their API. Actual use of the zenodo API can then be enabled with the default sandbox = FALSE.

cli <- depositsClient$new ("zenodo", sandbox = TRUE)
print (cli)
#> <deposits client>
#> deposits service : zenodo
#>           sandbox: TRUE
#>         url_base :
#> Current deposits : <none>
#>   hostdata : <none>
#>   metadata : <none>

Client construction requires personal access or authentication tokens for deposits services to be stored as local environment variables, as described in the main README document. Authentication tokens are checked when new clients are constructed, so the $new() function will only succeed with valid tokens.
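As a rough sketch of what storing a token as an environment variable can look like, the following uses base R only. The variable name ZENODO_SANDBOX_TOKEN is an assumption made for illustration; the README is the authority on the names that deposits actually looks for. The value shown is a placeholder, so a client constructed with it would not pass the token check.

```r
# Assumed variable name, for illustration only; see the README for the
# names that the deposits package actually recognises.
Sys.setenv (ZENODO_SANDBOX_TOKEN = "<my-token>")

# Confirm that the variable is visible to the current R session.
token_is_set <- nzchar (Sys.getenv ("ZENODO_SANDBOX_TOKEN"))
token_is_set
```

Sys.setenv() only affects the current session. For a permanent setting, the same name and value can be placed in a ~/.Renviron file, which R reads at start-up.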

As also described in that README, all methods of a deposits client can be seen with the deposits_methods() method:

cli$deposits_methods ()
#> List of methods for a deposits client:
#>    - deposit_delete
#>    - deposit_download_file
#>    - deposit_fill_metadata
#>    - deposit_new
#>    - deposit_retrieve
#>    - deposit_service
#>    - deposit_update
#>    - deposit_upload_file
#>    - deposits_list
#>    - deposits_methods
#>    - deposits_search
#>  see `?depositsClient` for full details of all methods.

The client constructed above is mostly empty, but nevertheless demonstrates the two primary fields or elements of a deposits client:

  1. hostdata holding all metadata from a “host” or external deposits service for a particular deposit; and
  2. metadata holding a consistently structured representation of the key components of the hostdata.

The hostdata structures are generally lists, but differ for different services, whereas the metadata structures remain consistent between services, and allow data to be transformed from one format to another, and, in future functionality, will allow data to be transferred between different services.

Both of these elements represent the “metadata” of a deposit, with the data itself referred to as “files”, which can be uploaded and downloaded. Thus all deposits begin with metadata, with the actual data upload only possible once the initial metadata has been specified and uploaded.


A new deposit is initially constructed by filling the metadata field with a local representation of metadata. The hostdata field is filled only after this initial deposit metadata has been uploaded to the external service. The best way to understand the distinction between metadata and hostdata is through a practical demonstration.

Metadata as a list

There are several ways of defining metadata for a deposits entity, perhaps the easiest of which is as a simple list:

metadata <- list (
    title = "New Title",
    abstract = "This is the abstract",
    creator = list ("A. Person", "B. Person")
)

A new deposits client can be filled with this metadata by passing it as the metadata parameter:

cli <- depositsClient$new (service = "zenodo", sandbox = TRUE, metadata = metadata)
print (cli)
#> <deposits client>
#>  deposits service : zenodo
#>            sandbox: TRUE
#>          url_base :
#>  Current deposits : <none>
#>     hostdata : <none>
#>     metadata : 4 terms (see 'metadata' element for details)

The summary produced by calling print() (or, equivalently, just typing cli in the console) says that the object now includes four metadata terms. They can be seen by viewing cli$metadata:

cli$metadata
#> $abstract
#> [1] "This is the abstract"
#> $created
#> [1] "2023-01-01"
#> $creator
#> $creator[[1]]
#> [1] "A. Person"
#> $creator[[2]]
#> [1] "B. Person"
#> $title
#> [1] "New Title"

Metadata in deposits objects are stored as named lists. These metadata are primarily intended for internal use within the deposits package, and shouldn’t generally need to be manipulated directly by users of this package (although they certainly can be, as illustrated below).
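Since metadata are plain named lists, they can be modified with ordinary list operations before being passed to a client. The following sketch works on a local list only and involves no deposits functions; whether a given term is accepted by a particular service is not checked here.

```r
metadata <- list (
    title = "New Title",
    abstract = "This is the abstract",
    creator = list ("A. Person", "B. Person")
)

# Replace a single term
metadata$title <- "Updated Title"

# Append a further creator to the existing list of creators
metadata$creator <- c (metadata$creator, list ("C. Person"))

# Inspect the resulting structure
str (metadata)
```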

Metadata from a local file

Another convenient way to specify metadata is to use the deposits_metadata_template() function to write a local “.json” representation of metadata. This local file includes all metadata fields recognised by a deposits client. The function also has an optional metadata parameter, which accepts a named list of values used to pre-populate entries in the resultant file.

meta_file <- tempfile (pattern = "meta-", fileext = ".json")
deposits_metadata_template (filename = meta_file, metadata = metadata)
head (readLines (meta_file))
#> [1] "{"                                                                                                 
#> [2] "  \"_comment1\": \"Fields starting with underscores will be ignored (and can safely be deleted)\","
#> [3] "  \"abstract\": \"This is the abstract\","                                                         
#> [4] "  \"accessRights\": \"\","                                                                         
#> [5] "  \"accrualMethod\": \"\","                                                                        
#> [6] "  \"accrualPeriodicity\": \"\","

Those metadata can then be directly edited using any text file editor. The name of the file can then also be passed as the metadata parameter of a new deposits client. The following code thus produces the same results as above:

cli <- depositsClient$new (service = "zenodo", sandbox = TRUE, metadata = meta_file)
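A metadata file can also be edited programmatically rather than in a text editor. The sketch below assumes the jsonlite package is installed, and uses a small stand-in file written by hand instead of a real template, so that it runs without deposits. The field names are taken from the template output shown above.

```r
library (jsonlite)

meta_file <- tempfile (pattern = "meta-", fileext = ".json")

# Stand-in for a file written by deposits_metadata_template();
# a real template contains many more fields.
template <- list (
    `_comment1` = "Fields starting with underscores will be ignored",
    abstract = "",
    title = "New Title"
)
write_json (template, meta_file, auto_unbox = TRUE, pretty = TRUE)

# Read the file, fill in one field, and drop the underscore-prefixed
# comment fields, which the template says can safely be deleted.
meta <- read_json (meta_file)
meta$abstract <- "This is the abstract"
meta <- meta [!grepl ("^_", names (meta))]

write_json (meta, meta_file, auto_unbox = TRUE, pretty = TRUE)
```

The edited meta_file could then be passed as the metadata parameter of a new client in the same way as a hand-edited file.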

Creating a new deposit

Once filled with metadata, a deposits client can be used to initiate a new deposit on the associated external service with the $deposit_new() function. This is not to be confused with the $new() function which creates a new client. The $deposit_new() function uses an existing client to create a new deposit on the external service. Using the client constructed above with our sample metadata gives the following result:

cli$deposit_new ()
#> ID of new deposit: 1064327
print (cli)
#> <deposits client>
#>  deposits service : zenodo
#>            sandbox: TRUE
#>          url_base :
#>  Current deposits : 1 (see 'deposits' element for details)
#>  url_deposit :
#>   deposit id : 1064327
#>     hostdata : list with 14  elements
#>     metadata : 4 terms (see 'metadata' element for details)

The client now lists one current deposit, additional fields for the URL and “id” of the deposit, and has a hostdata field with 14 elements. Importantly, the id field holds a unique integer value used to identify a particular deposit, both on the external service and as the deposit_id parameter of deposits client functions.

If we now construct a new, empty client, we see the following result:

cli <- depositsClient$new (service = "zenodo", sandbox = TRUE)
print (cli)
#> <deposits client>
#>  deposits service : zenodo
#>            sandbox: TRUE
#>          url_base :
#>  Current deposits : 1 (see 'deposits' element for details)
#>   hostdata : <none>
#>   metadata : <none>

This differs from our initial client in that it now lists one “current deposit”. We can examine that to get the associated “id” value:

cli$deposits$deposit_id
#> [1] 1064327

We can then retrieve the metadata we previously uploaded with the deposit_retrieve() function:

cli$deposit_retrieve (deposit_id = cli$deposits$deposit_id)

The local client then holds identical information to the previous client immediately after calling deposit_new(). That is, deposit_retrieve() has filled the local client with all of the metadata from the previously-created deposit.

Uploading files to deposits

The deposits clients thus far have only been used to construct and upload metadata. The main point of a deposit is of course to store actual data in any arbitrary format alongside these structured metadata. This is achieved with the deposit_upload_file() function, demonstrated in the following code which uses our deposit retrieved directly above.

path <- file.path (tempdir (), "data.csv")
write.csv (datasets::Orange, path, row.names = FALSE)
cli$deposit_upload_file (path = path)

Although the print output of our cli object does not change after uploading, the details of the files are contained in the hostdata$files element:

cli$hostdata$files
#>                           checksum         filename filesize                                   id
#> 1 cc624d72ede85ef061afa494d9951f6f         data.csv      625 56c44dd6-5f84-4212-9a65-d37f64ca886f
#> 2 eaeb7c4f8a931c99e662172299a0b17f datapackage.json      812 32d556ef-5b65-4b9d-a8a8-2e7bed11da5d
#> (remaining link columns, including links.self, not shown)

The list of files includes a “datapackage.json” file generated by the frictionless package, as described in the main README. Files can be downloaded with the converse deposit_download_file() function, demonstrated here by first removing the local copy, and then downloading it from the deposits service:

file.remove (path)
file <- cli$deposit_download_file (filename = "data.csv", path = tempdir ())
#> [1] /tmp/RtmpcO59N8/data.csv
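One way to confirm that a downloaded file matches what was uploaded is to compare checksums and contents. The sketch below runs entirely locally: the variable file simply points at a freshly written copy, standing in for the path that deposit_download_file() would return. The use of MD5 is an assumption based on the 32-character values in the checksum column above.

```r
# Re-create the data that were uploaded, and record an MD5 checksum.
path <- file.path (tempdir (), "data.csv")
write.csv (datasets::Orange, path, row.names = FALSE)
original_md5 <- unname (tools::md5sum (path))

# Stand-in for the path returned by cli$deposit_download_file();
# with a live client this would be the downloaded file instead.
file <- path

# Compare checksums, and check that the contents have the expected shape.
same_checksum <- identical (unname (tools::md5sum (file)), original_md5)
downloaded <- read.csv (file)
same_dims <- identical (dim (downloaded), dim (datasets::Orange))
```

With a live client, original_md5 could also be compared against the values in cli$hostdata$files$checksum, assuming the service reports MD5 hashes.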