If output_dir is specified, files will have the .xml file extension.
Arguments
- input
Character vector describing the paths and/or URLs to the input documents.
- ...
Other parameters to be sent to tika().
Value
A character vector in the same order and with the same length as input, of unparsed XHTML. Unprocessed files are as.character(NA).
Examples
# \donttest{
batch <- c(
  system.file("extdata", "jsonlite.pdf", package = "rtika"),
  system.file("extdata", "curl.pdf", package = "rtika"),
  system.file("extdata", "table.docx", package = "rtika"),
  system.file("extdata", "xml2.pdf", package = "rtika"),
  system.file("extdata", "R-FAQ.html", package = "rtika"),
  system.file("extdata", "calculator.jpg", package = "rtika"),
  system.file("extdata", "tika.apache.org.zip", package = "rtika")
)
xml <- tika_xml(batch)
# }
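Because the result is unparsed XHTML with NA for unprocessed files, a common next step is to separate the failures and parse the rest. The following is a minimal sketch, not part of the package's own examples. It uses a made-up XHTML string as a stand-in for the value of tika_xml(batch) so that it runs without Java, and it assumes the xml2 package is a suitable parser; the variable names failed, n_ok, docs and titles are illustrative.

```r
# Stand-in for the value of tika_xml(batch): one processed document and one
# unprocessed file. The XHTML string is made up for illustration.
xml <- c(
  "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><title>demo</title></head><body><p>Hello</p></body></html>",
  NA_character_
)

# Unprocessed files are NA, so is.na() separates them from the rest.
failed <- is.na(xml)
n_ok <- sum(!failed)

# Parse each processed document, if the xml2 package is installed.
# read_html() is used here so that XPath queries need no namespace prefix.
if (requireNamespace("xml2", quietly = TRUE)) {
  docs <- lapply(xml[!failed], xml2::read_html)
  titles <- vapply(
    docs,
    function(doc) xml2::xml_text(xml2::xml_find_first(doc, "//title")),
    character(1)
  )
}
```

Since the output keeps the order and length of input, failed can also be used to index the original batch vector and list the paths that were not processed.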