nodbi is an R package that provides a single interface for several NoSQL databases and databases with JSON functionality, with the same function parameters and return values across all database backends. Last updated 2026-01-17.
Currently, nodbi supportsas database backends |
for an R object of anyof these data types |
for these operations |
|---|---|---|
| MongoDB | data.frame | List, Exists |
| SQLite | list | Create |
| PostgreSQL | JSON string | Get |
| DuckDB | file name of NDJSON records | Query |
| Elasticsearch | URL of NDJSON records | Update |
| CouchDB | Delete |
For speed comparisons of database backends, see benchmark and testing below.
Plans for developing package nodbi in 2026 include to add MariaDB and MySQL as database backends.
API overview
Parameters for docdb_*() functions are the same across all database backends. See walk-through below and the canonical testing in core-nodbi.R. “Container” is used as term to indicate where conceptually the backend holds the data, see Database connections below. The key parameter holds the name of a container.
| Purpose | Function call |
|---|---|
| Create database connection (see below) | src <- nodbi::src_{duckdb, postgres, mongo, sqlite, couchdb, elastic}(<see below for parameters>) |
Load my_data (a data frame, list, JSON string, or file name or URL pointing to NDJSON records) into database, container my_container
|
nodbi::docdb_create(src = src, key = "my_container", value = my_data) |
| Get all documents back into a data frame | nodbi::docdb_get(src = src, key = "my_container") |
| Get documents selected with query (as MongoDB-compatible JSON) into a data frame | nodbi::docdb_query(src = src, key = "my_container", query = '{"age": 20}') |
| Get selected fields (in MongoDB compatible JSON) from documents selected by query into a data frame | nodbi::docdb_query(src = src, key = "my_container", query = '{"age": {"$gt": 20}}', fields = '{"friends.name": 1, "_id": 0, "age": 1}', limit = 2L) |
Update (patch) documents selected by query with new data my_data (in a data frame, list, JSON string, or file name or URL pointing to NDJSON records) |
nodbi::docdb_update(src = src, key = "my_container", value = my_data, query = '{"age": 20}') |
| Check if container exists | nodbi::docdb_exists(src = src, key = "my_container") |
| List all containers in database | nodbi::docdb_list(src = src) |
| Delete document(s) in container | nodbi::docdb_delete(src = src, key = "my_container", query = '{"age": 20}') |
| Delete container | nodbi::docdb_delete(src = src, key = "my_container") |
Close and remove database connection manually (when restarting R, connections are automatically closed and removed by nodbi) |
rm(src) |
Install
CRAN version
install.packages("nodbi")Development version
remotes::install_github("ropensci/nodbi")Load package from library
Database connections
Overview on parameters and aspects that are specific to the database backend. These are only needed once, for for src_*() to create a connection object. Any such connection object is subsequently used similarly across the docdb_* functions.
“Container” refers to how conceptually the backend holds the data. Data types are mapped from JSON to R objects by jsonlite. Any root-level _id is extracted from the document(s) and used for an index column _id, otherwise a UUID is created as _id.
DuckDB
See also https://CRAN.R-project.org/package=duckdb. “Container” refers to a DuckDB table, with columns _id and json created and used by package nodbi, applying SQL functions and functions as per https://duckdb.org/docs/extensions/json to the json column. Each row in the table represents a JSON document.
src <- nodbi::src_duckdb(dbdir = ":memory:", ...)MongoDB
“Container” refers to a MongoDB collection, in which nodbi creates JSON documents. See also https://jeroen.github.io/mongolite/. MongoDB but none of the other databases require to specify the container name already in the src_*() function; use the collection name for parameter key in docdb_* functions.
src <- nodbi::src_mongo(
collection = "my_container", db = "my_database",
url = "mongodb://localhost", ...
)SQLite
“Container” refers to an SQLite table, with columns _id and json created and used by package nodbi, applying SQL functions and functions as per https://www.sqlite.org/json1.html to the json column. Each row in the table represents a JSON document. The table is indexed on _id. See also https://CRAN.R-project.org/package=RSQLite.
src <- nodbi::src_sqlite(dbname = ":memory:", ...)PostgreSQL
“Container” refers to a PostgreSQL table, with columns _id and json created and used by package nodbi, applying SQL functions and functions as per https://www.postgresql.org/docs/current/functions-json.html to the json column. With PostgreSQL, a custom plpgsql function jsonb_merge_patch() is used for docdb_update(). The order of variables in data frames returned by docdb_get() and docdb_query() can differ from their order the input to docdb_create().
src <- nodbi::src_postgres(
dbname = "my_database", host = "127.0.0.1", port = 5432L, ...
)CouchDB
“Container” refers to a CouchDB database, in which nodbi creates JSON documents. See also https://CRAN.R-project.org/package=sofa. With CouchDB, function docdb_update() uses jqr to implement patching JSON, in analogy to functions available for the other databases.
src <- nodbi::src_couchdb(
host = "127.0.0.1", port = 5984L, path = NULL,
transport = "http", user = NULL, pwd = NULL, headers = NULL
)Elasticsearch
“Container” refers to an Elasticsearch index, in which nodbi creates JSON documents. Opensearch can equally be used. Only lowercase is accepted for container names (in parameter key of docdb_* functions).
src <- nodbi::src_elastic(
host = "127.0.0.1", port = 9200L, path = NULL,
transport_schema = "http", user = NULL, pwd = NULL, ...
)As of 2026-01-17, the future availability and development of package elastic is not clear after it was recently archived, see also https://github.com/opensci-archive/elastic; at this time, it can still be installed as follows.
remotes::install_github("ropensci-archive/elastic")Walk-through
This example is to show how functional nodbi is at this time: With any of the six database backends, the functions work in the same way and return the same values.
# load nodbi
library(nodbi)
# name of container
key <- "my_container"
# connect any of these database backends
src <- src_duckdb()
src <- src_mongo(collection = key)
src <- src_sqlite()
src <- src_postgres()
src <- src_elastic()
src <- src_couchdb(
user = Sys.getenv("COUCHDB_TEST_USER"),
pwd = Sys.getenv("COUCHDB_TEST_PWD")
)
# check if container already exists
docdb_exists(src, key)
# [1] FALSE
# load data (here data frame, alternatively a list, JSON or file with NSJSON)
# into the container "my_container" specified in "key" parameter
docdb_create(src, key, value = mtcars)
# [1] 32
# load additionally 98 NDJSON records
docdb_create(src, key, "https://httpbin.org/stream/98")
# Note: container 'my_container' already exists
# [1] 98
# load additionally contacts JSON data, from package nodbi
docdb_create(src, key, contacts)
# Note: container 'my_container' already exists
# [1] 5
# get all documents, irrespective of schema
dplyr::tibble(docdb_get(src, key))
# # A tibble: 135 × 27
# `_id` isActive balance age eyeColor name email about registered tags friends
# <chr> <lgl> <chr> <int> <chr> <chr> <chr> <chr> <chr> <list> <list>
# 1 5cd6… TRUE $2,412… 20 blue Kris… kris… "Sin… 2017-07-1… <chr> <df>
# 2 5cd6… FALSE $3,400… 20 brown Rae … raec… "Nis… 2018-12-1… <chr> <df>
# 3 5cd6… TRUE $1,161… 22 brown Pace… pace… "Eiu… 2018-08-1… <chr> <df>
# 4 5cd6… FALSE $2,579… 30 brown Will… will… "Nul… 2018-02-1… <chr> <df>
# 5 5cd6… FALSE $3,808… 23 green Lacy… lacy… "Sun… 2014-08-0… <chr> <df>
# 6 69bc… NA NA NA NA NA NA NA NA <NULL> <NULL>
# 7 69bc… NA NA NA NA NA NA NA NA <NULL> <NULL>
# 8 69bc… NA NA NA NA NA NA NA NA <NULL> <NULL>
# 9 69bc… NA NA NA NA NA NA NA NA <NULL> <NULL>
# 10 69bc… NA NA NA NA NA NA NA NA <NULL> <NULL>
# # ℹ 125 more rows
# # ℹ 16 more variables: url <chr>, args <df[,0]>, headers <df[,4]>, origin <chr>,
# # id <int>, mpg <dbl>, cyl <int>, disp <dbl>, hp <int>, drat <dbl>, wt <dbl>,
# # qsec <dbl>, vs <int>, am <int>, gear <int>, carb <int>
# # ℹ Use `print(n = ...)` to see more rows
# query some documents
docdb_query(src, key, query = '{"mpg": {"$gte": 30}}')
# _id mpg cyl disp hp drat wt qsec vs am gear carb
# 1 Fiat 128 32 4 79 66 4.1 2.2 19 1 1 4 1
# 2 Honda Civic 30 4 76 52 4.9 1.6 19 1 1 4 2
# 3 Toyota Corolla 34 4 71 65 4.2 1.8 20 1 1 4 1
# 4 Lotus Europa 30 4 95 113 3.8 1.5 17 1 1 5 2
# query some fields from some documents; 'query' is a mandatory
# parameter and is used here in its position in the signature
docdb_query(src, key, '{"mpg": {"$gte": 30}}', fields = '{"wt": 1, "mpg": 1}')
# _id wt mpg
# 1 Fiat 128 2.2 32
# 2 Honda Civic 1.6 30
# 3 Lotus Europa 1.5 30
# 4 Toyota Corolla 1.8 34
# query some subitem fields from some documents
str(docdb_query(
src, key,
query = '{"$or": [{"age": {"$gt": 21}},
{"friends.name": {"$regex": "^B[a-z]{3,9}.*"}}]}',
fields = '{"age": 1, "friends.name": 1}'
))
# 'data.frame': 3 obs. of 3 variables:
# $ _id : chr "5cd6785325ce3a94dfc54096" "5cd6785335b63cb19dfa8347" "5cd67853f841025e65ce0ce2"
# $ age : int 22 30 23
# $ friends.name:List of 3
# ..$ : chr "Baird Keller" "Francesca Reese" "Dona Bartlett"
# ..$ : chr "Coleen Dunn" "Doris Phillips" "Concetta Turner"
# ..$ : chr "Wooten Goodwin" "Brandie Woodward" "Angelique Britt"
# such queries can also be used for updating (patching) selected documents
# with a new 'value'(s) from a JSON string, a data frame a list or a file with NSJSON)
docdb_update(src, key, value = '{"vs": 9, "xy": [1, 2]}', query = '{"carb": 3}')
# [1] 3
docdb_query(src, key, '{"carb": {"$in": [1,3]}}', fields = '{"vs": 1, "_id": 0}')[[1]]
# [1] 1 1 1 9 9 9 1 1 1 1
docdb_get(src, key)[c(3, 109, 130, 101), c("_id", "xy", "url", "email")]
# _id xy url email
# 3 5cd6785325ce3a94dfc54096 NULL <NA> pacebell@conjurica.com
# 109 Dodge Challenger NULL <NA> <NA>
# 130 Pontiac Firebird NULL <NA> <NA>
# 101 69bcd195-a59c-11ee-bfb9-acbc328130bb NULL https://httpbin.org/stream/98 <NA>
# use with dplyr
# *note* that dplyr includes a (deprecated) function src_sqlite
# which would mask nodbi's src_sqlite, so it is excluded here
library("dplyr", exclude = c("src_sqlite", "src_postgres"))
#
docdb_get(src, key) %>%
group_by(gear) %>%
summarise(mean_mpg = mean(mpg))
# # A tibble: 4 × 2
# gear mean_mpg
# <int> <dbl>
# 1 3 16.1
# 2 4 24.5
# 3 5 21.4
# 4 NA NA
# delete documents; query is optional parameter and has to be
# specified for deleting documents instead of deleting the container
dim(docdb_query(src, key, query = '{"$or": [{"age": {"$lte": 20}}, {"age": {"$gte": 25}}]}'))
# [1] 3 11
docdb_delete(src, key, query = '{"$or": [{"age": {"$lte": 20}}, {"age": {"$gte": 25}}]}')
# TRUE
nrow(docdb_get(src, key))
# [1] 132
# delete container from database
docdb_delete(src, key)
# [1] TRUE
#
# shutdown
DBI::dbDisconnect(src$con, shutdown = TRUE)
rm(src)Benchmark
library("nodbi")
srcMongo <- src_mongo()
srcSqlite <- src_sqlite()
srcPostgres <- src_postgres()
srcDuckdb <- src_duckdb()
srcElastic <- src_elastic()
srcCouchdb <- src_couchdb(
user = Sys.getenv("COUCHDB_TEST_USER"),
pwd = Sys.getenv("COUCHDB_TEST_PWD")
)
key <- "test2"
query <- '{"clarity": {"$in": ["NOTME", "VS1"]}}'
fields <- '{"cut": 1, "_id": 1, "clarity": 1}'
value <- '{"clarity": "XYZ", "new": ["ABC", "DEF"]}'
data <- diamonds[1:1000, ]
ndjs <- tempfile()
jsonlite::stream_out(diamonds[1:10000, ], con = file(ndjs), verbose = FALSE)
testFunction <- function(src) {
try(docdb_delete(src, key), silent = TRUE)
on.exit(docdb_delete(src, key))
suppressMessages(docdb_create(src, key, data))
suppressMessages(docdb_create(src, key, ndjs))
head(docdb_get(src, key))
docdb_query(src, key, query = query, fields = fields)
docdb_query(src, key, query = query, listfields = TRUE)
docdb_update(src, key, value = value, query = query)
}
result <- rbenchmark::benchmark(
CouchDB = testFunction(src = srcCouchdb),
DuckDB = testFunction(src = srcDuckdb),
Elastic = testFunction(src = srcElastic),
MongoDB = testFunction(src = srcMongo),
PostgreSQL = testFunction(src = srcPostgres),
SQLite = testFunction(src = srcSqlite),
replications = 3L,
order = "elapsed"
)
# 2026-01-17 with M3 hardware, databases via homebrew
result[ , c("test", "replications", "elapsed")]
# test replications elapsed
# 4 CouchDB 3 36.78
# 3 Elastic 3 25.60
# 1 MongoDB 3 8.61
# 5 PostgreSQL 3 1.12
# 2 SQLite 3 0.66
# 6 DuckDB 3 0.57
message(R.version$version.string)
# R Under development (unstable) (2026-01-16 r89305)
pkgs <- c("nodbi", "RSQLite", "duckdb", "RPostgres", "mongolite", "elastic", "sofa")
for (pkg in pkgs) message(pkg, ": ", packageVersion(pkg))
# nodbi: 0.14.0
# RSQLite: 2.4.4
# duckdb: 1.4.3
# RPostgres: 1.4.8
# mongolite: 4.0.0
# elastic: 1.2.1.91
# sofa: 0.4.0Testing
Every database backend is subject to identical tests, see core-nodbi.R.
# 2026-01-17
suppressMessages(testthat::test_local())
# ✔ | F W S OK | Context
# ✔ | 2 184 | couchdb [72.7s]
# ✔ | 1 183 | duckdb [1.8s]
# ✔ | 2 181 | elastic [57.4s]
# ✔ | 2 182 | mongodb [2.6s]
# ✔ | 185 | postgres [3.5s]
# ✔ | 186 | sqlite [1.9s]
#
# ══ Results ══════════════════════
# Duration: 140.0 s
#
# ── Skipped tests (7) ────────────
# • Testing for auto disconnect and shutdown not relevant (3):
# test-couchdb.R:26:3, test-elastic.R:21:3, test-mongodb.R:24:3
# • Testing for parallel writes not possible or implemented (4):
# test-couchdb.R:26:3, test-duckdb.R:22:3, test-elastic.R:21:3,
# test-mongodb.R:24:3
#
# [ FAIL 0 | WARN 0 | SKIP 7 | PASS 1101 ]
covr::package_coverage(path = ".", type = "tests")
# nodbi Coverage: 94.01%
# R/src_postgres.R: 82.43%
# R/src_duckdb.R: 84.00%
# R/zzz.R: 85.56%
# R/update.R: 94.46%
# R/query.R: 94.67%
# R/src_mongo.R: 95.00%
# R/get.R: 95.79%
# R/create.R: 96.25%
# R/delete.R: 98.96%
# R/exists.R: 100.00%
# R/list.R: 100.00%
# R/src_couchdb.R: 100.00%
# R/src_elastic.R: 100.00%
# R/src_sqlite.R: 100.00%Notes
- Please report any issues or bugs.
- License: MIT
- Get citation information for
nodbiin R doingcitation(package = 'nodbi') - Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
- Support for redis has been removed since version 0.5.
