Changelog
Source:NEWS.md
skimr 2.1.4
CRAN release: 2022-04-15
NEW FEATURES
- skim() used within a function now prints the data frame name.
- we have improved the interaction between focus() and the print methods.
- columns selected in focus() are shown in the correct order
- some edge cases relating to empty skim types have been improved
- you can control the width rule line for the printed subtables with an option:
skimr_table_header_width
. The default is to use the console width, i.e. the value of thewidth
option.
- we have improved performance when handling large data with many columns.
MINOR IMPROVEMENTS
- Replace the Suppporting Additional Objects vignette with Extending skimr. Remove sf from Suggests.
- Default support for
haven_labelled
columns is now supported. These columns are summarized using skimmers for the underlying data, typically either numeric or character.
BUG FIXES
- A
skim_list
(most commonly generated by thepartition()
function) also inherits from alist
skimr 2.1.2
CRAN release: 2020-07-06
skimr 2.1.0 (2020-01-10)
NEW FEATURES
We’ve made to_long()
generic, supporting a more intuitive interface.
- Called on a
skim_df
, it reshapes the output into the V1 long style. - Called on other tibble-like objects, it first skims then produces the long output. You can pass a custom skim function, like
skim_tee()
Thanks @sethlatimer for suggesting this feature.
skimr 2.0.0 (2019-11-12)
Welcome to skimr V2
V2 is a complete rewrite of skimr
, incorporating all of the great feedback the developers have received over the last year. A big thank you goes to @GShotwell, @akraemer007, @puterleat, @tonyfischetti, @Nowosad, @rgayler, @jrosen48, @randomgambit, @elben10, @koliii, @AndreaPi, @rubenarslan, @GegznaV, @svraka, @dpprdan and to our ROpenSci reviewers @jenniferthompson and @jimhester for all of the great support and feedback over the last year. We couldn’t have done this without you.
For most users using skimr
will not change in terms of visual outputs. However for users who use skimr
outputs as part of a larger workflow the differences are substantial.
Breaking changes
The skim_df
We’ve changed the way data is represented within skimr
to closer match expectations. It is now wide by default. This makes piping statistics much simpler
skim(iris) %>%
dplyr::filter(numeric.sd > 1)
This means that the old reshaping functions skim_to_wide()
and skim_to_list()
are deprecated. The latter is replaced with a reshaping function called partition()
that breaks a skim_df
into a list by data type. Similarly, yank()
gets a specific data type from the skim_df
. to_long()
gets you data that is closest to the format in the old API.
As the above example suggests, columns of summary statistics are prefixed by skim_type
. That is, statistics from numeric columns all begin numeric.
, those for factors all begin factor.
, and so on.
Rendering
We’ve deprecated support for pander()
and our kable()
method. Instead, we now support knitr
through the knit_print()
API. This is much more seamless than before. Having a skim_df
as the final object in a code chunk should produce nice results in the majority of RMarkdown formats.
Customizing and extending
We’ve deprecated the previous approach customization. We no longer use skim_format()
and skim_with()
no longer depends on a global state. Instead skim_with()
is now a function factory. Customization creates a new skimming function.
my_skim <- skim_with(numeric = sfl(mad = mad))
The fundamental tool for customization is the sfl
object, a skimmer function list. It is used within skim_with()
and also within our new API for adding default functions for new data types, the generic get_skimmers()
.
Most of the options set in skim_format
are now either in function arguments or print arguments. The former can be updated using skim_with
, the latter in a call to print()
. In RMarkdown documents, you can change the number of displayed digits by adding the skimr_digits
option to your code chunk.
OTHER NEW FEATURES
- Substantial improvements to
summary()
, and it is now incorporated intoprint()
methods. -
focus()
is likedplyr::select()
, but it keeps around the columnsskim_type
andskim_variable
. - We are also evaluating the behavior of different
dplyr
verbs to make sure that they place nice withskimr
objects. - While
skimr
has never really focused on performance, it should do a better job on big data sets with lots of different columns. - New statistic for character variables counting the number of rows that are completely made up of white space.
- We now export
skim_without_charts()
as a fallback for when unicode support is not possible. - By default,
skimr
removes the tibble metadata when generating output. On some platforms, this can lead to all output getting removed. To disable that behavior, set eitherstrip_metadata = FALSE
when calling print or useoptions(skimr_strip_metadata = FALSE)
.
skimr 1.0.5 (2019-01-05)
CRAN release: 2019-02-25
This is likely to be the last release of skimr version 1. Version 2 has major changes to the API. Users should review and prepare for those changes now.
skimr 1.0.4 (2018-01-12)
CRAN release: 2019-01-13
This is likely to be the last release of skimr version 1. Version 2 has major changes to the API. Users should review and prepare for those changes now.
skimr 1.0.3 (2018-06-06)
CRAN release: 2018-06-07
skimr 1.0.2 (2018-04-04)
CRAN release: 2018-04-02
NEW FEATURES
- You can create skimmers with the formula syntax from
rlang
:skim_with(iqr = ~IQR(.x, na.rm = TRUE))
.
MAJOR CHANGES
- The median label has been changed to p50 for consistency with the previous changes to p0 and p100.
MINOR IMPROVEMENTS
- Improvements and corrections to to README and other documentation.
- New vignette showing defaults for skimmers and formats.
- Vector output match data frame output more closely.
- Add minimum required version for testhat.
- Add minimum required version for knitr.
BUG FIXES
- You can use
skim_with()
to add and remove skimmers at the same time, i.e.skim_with(iqr = IQR, hist = NULL)
works as expected. - Histograms work when Inf or -Inf are present.
- Change seq( ) parameter to length.out to avoid problems with name matching.
- Summary should not display a data frame name of “.” (which occurs when piping begins with the data frame).
skimr 1.0.1 (2018-01-09)
CRAN release: 2018-01-10
MAJOR CHANGES
-
spark_line()
andspark_bar()
are no longer exported - Default statistics for numeric changed from
min(x)
andmax(x)
toquantile(x, probs = 0)
andquantile(x, probs = 1)
. These changes lead to more predictable behaviors when a column is all NA values.
BUG FIXES
- Fix issue where a histogram for data with all
NA
s threw an error - Suppress progress bars from
dplyr::do()
skimr 0.92 (2017-12-19)
skimr 0.91 (2017-10-14)
NEW FEATURES
- Handling of grouped data (generated by
dplyr::group_by()
) - Printing for all column classes
- Add indicator of if a factor is ordered to skim object for factor
- Introduction of flexible formatting
- Easy dropping of individual functions
- Vignettes for basic use and use with specialized object types
- Updated README and added CONTRIBUTING.md and CONDUCT.md
- New public get_skimmers function to access skim functions
- Support for difftime class