CHANGES TO tabulizer 0.2.2
CRAN release: 2018-06-07
-
extract_tables()
getsoutdir
argument for writing out CSV, TSV and JSON files. - Fixes in vignette.
- avoid using reflection for Java 17 compatibility.
CHANGES TO tabulizer 0.2.1
CRAN release: 2018-05-24
-
make_thumbnails()
andsplit_pdf()
now usetempdir()
as the default output directory. -
extract_
functions getcopy
argument for copying original local files to R session’s temporary directory. - PDF files used do not get copied to temporary directory by default.
- General clean-up for CRAN submission.
CHANGES TO tabulizer 0.2.0
- Upgrade to PDFBox 2/Tabula 1.0.1 (#48)
-
method
argument is changed tooutput
inextract_tables()
. - New
method
argument reflects method of extraction as in Tabula command-line Java utility. -
extract_text()
acceptsarea
as argument.
CHANGES TO tabulizer 0.1.23
- Transferred source code repository to ropensci: https://github.com/ropensci/tabulizer
CHANGES TO tabulizer 0.1.22
- Transferred source code repository to ropenscilabs: https://github.com/ropenscilabs/tabulizer
CHANGES TO tabulizer 0.1.21
- Exposed a new option
widget
inlocate_areas()
to control which widget is used in locating areas.
CHANGES TO tabulizer 0.1.20
- Fixed bug in internal function
try_area_full()
introduced by changes in8.
CHANGES TO tabulizer 0.1.19
- Further expand the
locate_areas()
interface to use a Shiny gadget when working within RStudio, or otherwise rely on the full functionality interface (based on graphics device events) or reduced functionality interface (relying onlocator()
). (#8)
CHANGES TO tabulizer 0.1.18
- Completely rewrite the
locate_areas()
interface to rely on graphics device event handling where possible. This may behave differently across platforms or in RStudio. (#8)
CHANGES TO tabulizer 0.1.17
- Fixed a bug in
extract_tables()
such that when no tables are found, an empty list is returned (formethod
values with list response structures). (h/t Lincoln Mullen)
CHANGES TO tabulizer 0.1.16
-
split_pdfs()
andmake_thumbnails()
gain anoutdir
argument to specify where to save the output. The file numbering of output files is also now zero-padded. - An warning in
merge_pdfs()
has been fixed. -
stop_logging()
is called when the package is attached to the search path. -
get_page_dims()
earns adoc
argument and argument order inget_n_pages()
is reversed.
CHANGES TO tabulizer 0.1.15
- Added better support for specifying character encoding. (#10)
CHANGES TO tabulizer 0.1.14
- Added support for password-protected PDF files. (#11)
CHANGES TO tabulizer 0.1.12
- Improve handling of URLs when using
extract_areas()
by downloading PDF to temporary directory.
CHANGES TO tabulizer 0.1.11
- Exposed new functions
split_pdf()
andmerge_pdfs()
to split and merge PDFs, respectively. (#9) - Exposed a
get_n_pages()
to determine the page length of a PDF document. - Moved tabula .jar file to separate package, tabulizerjars. (#2)
CHANGES TO tabulizer 0.1.10
- Added a new function
extract_metadata()
to extract PDF metadata as a list. - Added a new function
extract_text()
to convert PDF contents to an R character vector. - Changed the internal
localize_file()
function to use PDFBox to natively read from a URL. - Removed illogical default
file
argument value inextract_tables()
.
CHANGES TO tabulizer 0.1.8
- Fixed a bug in
make_areas()
internal whenarea
was specified as a length 1 list for a multi-page document. (#5, h/t Tony Hirst)
CHANGES TO tabulizer 0.1.7
- Added a function,
extract_areas()
, to interactively identify and extract page areas. Another new function,locates_areas()
implements the locator functionality without performing any extraction. - Added a function,
make_thumbnails()
, to convert pages into individual image files. - Added a function,
get_page_dims()
, to extract page dimensions.
CHANGES TO tabulizer 0.1.3
- Added multiple table writing options beyond the default list of matrices. (#1)