While triples can be added one at a time via SPARQL queries, Virtuoso's bulk importer is by far the fastest way to load large sets of triples into the database.
Arguments
- con
an ODBC connection to Virtuoso, from vos_connect()
- files
paths to files to be imported
- wd
Alternatively, specify a directory and globbing pattern to import. Note that in this case, wd must be in (or a subdirectory of) the AllowedDirs list of the virtuoso.ini file created by vos_configure(). By default, this includes the working directory where you called vos_start() or vos_configure().
- glob
A wildcard, also known as a globbing pattern (e.g. `"*.nq"`).
- graph
Name (technically a URI) for a graph in the database. Can be left at its default. If a graph is already specified by the import file (e.g. in N-Quads), that graph will be used instead.
- n_cores
the number of available cores to use for parallel loading. Particularly useful when importing large numbers of files.
Value
(Invisibly) returns the status table of the bulk loader, indicating file loading time or errors.
Details
The bulk importer imports all files matching a pattern in a given directory. If given a list of files, these are temporarily symlinked (or copied on Windows machines) into a subdirectory of the Virtuoso app cache dir, and the entire subdirectory is loaded (filtered by the globbing pattern). If files are not specified, the load is called directly on the specified directory and pattern, which is particularly useful for loading large numbers of files.
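The two ways of calling the importer described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: it assumes the argument names listed under Arguments (files, wd, glob), assumes the returned status table is a data frame, and is skipped entirely unless the virtuoso package and a local Virtuoso install are available.

```r
# Sketch only: needs the virtuoso package, a local Virtuoso install,
# and a working ODBC driver; otherwise the guarded block is skipped.
status <- NULL
if (requireNamespace("virtuoso", quietly = TRUE) && virtuoso::has_virtuoso()) {
  library(virtuoso)
  vos_start()
  con <- vos_connect()

  # Mode 1: explicit file paths. These are symlinked (or copied on Windows)
  # into the Virtuoso app cache dir before loading.
  example <- system.file("extdata", "person.nq", package = "virtuoso")
  status <- vos_import(con, files = example)

  # Mode 2: directory plus globbing pattern. The directory must be covered
  # by AllowedDirs; by default that includes the working directory.
  # Only attempted if the working directory actually holds .nq files.
  if (length(Sys.glob("*.nq")) > 0) {
    status <- vos_import(con, wd = getwd(), glob = "*.nq")
  }
}
```

Passing explicit files is convenient for a handful of paths anywhere on disk, while the directory form avoids the symlink or copy step when many files already sit in an allowed directory.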
Note that Virtuoso recommends breaking large files into multiple smaller ones, which can improve loading time (particularly when using multiple cores).
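As an illustration of splitting a large file, the sketch below chunks a line-based RDF file (N-Triples or N-Quads, where each line is a complete statement) into smaller files using base R. The helper split_rdf_lines() and its arguments are hypothetical and not part of the package; it reads the whole file into memory, so a streaming tool may be a better fit for very large inputs. It does not apply to formats such as Turtle or RDF/XML, which cannot be split safely by line.

```r
# Hypothetical helper (not part of the virtuoso package): split a line-based
# RDF file into chunks of at most `lines_per_chunk` lines each.
split_rdf_lines <- function(path, lines_per_chunk = 100000L,
                            out_dir = tempfile("chunks")) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  ext <- tools::file_ext(path)
  lines <- readLines(path)
  if (length(lines) == 0L) return(character(0))

  # Assign each line to a chunk index: 1, 1, ..., 2, 2, ...
  chunk_id <- ceiling(seq_along(lines) / lines_per_chunk)
  pieces <- split(lines, chunk_id)

  # Write each chunk to its own file, keeping the original extension
  out <- file.path(out_dir, sprintf("chunk_%04d.%s", seq_along(pieces), ext))
  for (i in seq_along(pieces)) writeLines(pieces[[i]], out[i])
  out
}

# Usage on a small synthetic N-Triples file (10 statements, 4 per chunk)
src <- tempfile(fileext = ".nt")
writeLines(
  sprintf("<http://example.org/s%d> <http://example.org/p> \"%d\" .", 1:10, 1:10),
  src
)
chunks <- split_rdf_lines(src, lines_per_chunk = 4L)
```

The resulting paths could then be passed as the files argument, with n_cores set above 1 to load the chunks in parallel.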
Virtuoso Bulk Importer recognizes the following file formats:
.grdf
.nq
.owl
.nt
.rdf
.trig
.ttl
.xml
Any of these can optionally be gzipped (with a .gz extension).
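To check ahead of time which files in a set the importer is likely to pick up, the extension list above can be turned into a simple filter. The helper is_bulk_loadable() is a hypothetical name introduced here for illustration; it only inspects file names against the listed extensions and does not validate file contents.

```r
# Hypothetical helper (not part of the virtuoso package): TRUE for paths whose
# extension is one of the formats listed above, optionally followed by .gz.
is_bulk_loadable <- function(paths) {
  pattern <- "\\.(grdf|nq|owl|nt|rdf|trig|ttl|xml)(\\.gz)?$"
  grepl(pattern, paths)
}

candidates <- c("a.nq", "b.ttl.gz", "c.csv", "d.nt.gz", "e.json")
loadable <- candidates[is_bulk_loadable(candidates)]
```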
Examples
vos_status()
#> virtuoso isn't running.
# \donttest{
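if (has_virtuoso()) {
  # Start the server and open an ODBC connection
  vos_start()
  con <- vos_connect()

  # Bulk-import the example N-Quads file shipped with the package
  example <- system.file("extdata", "person.nq", package = "virtuoso")
  vos_import(con, example)
}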
# }