Extract text from a file
extract_text(file, pages = NULL, area = NULL, password = NULL, encoding = NULL, copy = FALSE)
file  A character string specifying the path or URL to a PDF file. 

pages  An optional integer vector specifying pages to extract from. 
area  An optional list, of length equal to the number of pages specified, where each entry contains a four-element numeric vector of coordinates (top, left, bottom, right) delimiting the region to extract from the corresponding page. As a convenience, a list of length 1 can be used to extract the same area from all (specified) pages. 
password  Optionally, a character string containing a user password to access a secured PDF. 
encoding  Optionally, a character string specifying an encoding for the text, to be passed to the assignment method of 
copy  Specifies whether the original local file(s) should be copied to
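As a rough illustration of how pages and area fit together, the sketch below builds both arguments in base R and checks their shapes before any call to extract_text(). The coordinate values, the file path, and the check_area() helper are hypothetical and chosen only for illustration; none of them come from tabulizer.

```r
# Hypothetical page selection and coordinates, for illustration only
pages <- c(1, 2)
area <- list(c(100, 50, 300, 500))  # (top, left, bottom, right)

# Hypothetical helper (not part of tabulizer): checks the shape described above
check_area <- function(area, pages) {
  stopifnot(is.list(area))
  # one entry per requested page, or a single entry reused for every page
  stopifnot(length(area) %in% c(1L, length(pages)))
  # each entry must be a numeric vector of four coordinates
  ok <- vapply(area, function(a) is.numeric(a) && length(a) == 4L, logical(1))
  stopifnot(all(ok))
  invisible(TRUE)
}

check_area(area, pages)

# With tabulizer installed, the call would then look something like:
# extract_text("path/to/file.pdf", pages = pages, area = area)
```

Because area here has length 1, the same region would apply to both requested pages.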

If pages = NULL (the default), a length 1 character vector; otherwise, a vector of length length(pages).
This function converts the contents of a PDF file into a single unstructured character string.
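Since the result is one unstructured string with embedded newlines, a common next step is to split it into lines. The following is a minimal base R sketch; the txt value is a shortened sample typed in by hand to stand in for real extract_text() output, and the variable names are assumptions.

```r
# Shortened, hand-typed sample of the kind of string extract_text() returns
txt <- "To cite R in publications use:\nR Core Team (2018). R: A language and environment for statistical computing.\n"

# Split on literal newlines to get one element per line of text
lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]

# Inspect the result
length(lines)
lines[1]
```

Splitting with fixed = TRUE treats the separator as a literal string rather than a regular expression.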
Thomas J. Leeper
# \donttest{
# simple demo file
f <- system.file("examples", "text.pdf", package = "tabulizer")

# extract all text
extract_text(f)
#> [1] "To cite R in publications use:\nR Core Team (2018). R: A language and environment for statistical computing.\nR Foundation for Statistical Computing, Vienna, Austria. URL\nhttps://www.R-project.org/.\nA BibTeX entry for LaTeX users is\n@Manual{,\ntitle = {R: A Language and Environment for Statistical Computing},\nauthor = {{R Core Team}},\norganization = {R Foundation for Statistical Computing},\naddress = {Vienna, Austria},\nyear = {2018},\nurl = {https://www.R-project.org/},\n}\nWe have invested a lot of time and effort in creating R, please cite it when using\nit for data analysis. See also ‘citation(“pkgname”)’ for citing R packages.\nTo cite R in publications use:\nR Core Team (2018). R: A language and environment for statistical computing.\nR Foundation for Statistical Computing, Vienna, Austria. URL\nhttps://www.R-project.org/.\nA BibTeX entry for LaTeX users is\n@Manual{,\ntitle = {R: A Language and Environment for Statistical Computing},\nauthor = {{R Core Team}},\norganization = {R Foundation for Statistical Computing},\naddress = {Vienna, Austria},\nyear = {2018},\nurl = {https://www.R-project.org/},\n}\nWe have invested a lot of time and effort in creating R, please cite it when using\nit for data analysis. See also ‘citation(“pkgname”)’ for citing R packages.\n"

# extract all text from page 1 only
extract_text(f, pages = 1)
#> [1] "To cite R in publications use:\nR Core Team (2018). R: A language and environment for statistical computing.\nR Foundation for Statistical Computing, Vienna, Austria. URL\nhttps://www.R-project.org/.\nA BibTeX entry for LaTeX users is\n@Manual{,\ntitle = {R: A Language and Environment for Statistical Computing},\nauthor = {{R Core Team}},\norganization = {R Foundation for Statistical Computing},\naddress = {Vienna, Austria},\nyear = {2018},\nurl = {https://www.R-project.org/},\n}\nWe have invested a lot of time and effort in creating R, please cite it when using\nit for data analysis. See also ‘citation(“pkgname”)’ for citing R packages.\n"

#> [1] "@Manual{,\ntitle = {R: A Language and Environment for Statistical Computing},\nauthor = {{R Core Team}},\norganization = {R Foundation for Statistical Computing},\naddress = {Vienna, Austria},\nyear = {2018},\nurl = {https://www.R-project.org/},\n}\n"
#> [2] "@Manual{,\ntitle = {R: A Language and Environment for Statistical Computing},\nauthor = {{R Core Team}},\norganization = {R Foundation for Statistical Computing},\naddress = {Vienna, Austria},\nyear = {2018},\nurl = {https://www.R-project.org/},\n}\n"
# }