
skimr 2.1.5

CRAN release: 2022-12-23

  • Updated to work with a newer version of purrr.

skimr 2.1.4

CRAN release: 2022-04-15

NEW FEATURES

  • skim() used within a function now prints the data frame name.
  • We have improved the interaction between focus() and the print methods.
    • Columns selected in focus() are shown in the correct order.
    • Some edge cases relating to empty skim types have been improved.
    • You can control the width of the rule line for the printed subtables with an option: skimr_table_header_width. The default is the console width, i.e. the value of the width option.
  • We have improved performance when handling large data with many columns.
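
As a minimal sketch of the skimr_table_header_width option mentioned above (the value 60 is an arbitrary choice for illustration, not a recommended setting), the option can be set and restored like any other R option:

```r
library(skimr)

# Set the rule-line width for printed subtables to 60 characters.
# 60 is an illustrative value only.
old <- options(skimr_table_header_width = 60)

skimmed <- skim(iris)
print(skimmed)

# Restore whatever was set before; when the option is unset,
# the console width (the `width` option) is used.
options(old)
```

Restoring the previous value keeps the change local to one chunk or script.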

MINOR IMPROVEMENTS

  • Replace the Supporting Additional Objects vignette with Extending skimr. Remove sf from Suggests.
  • haven_labelled columns are now supported by default. These columns are summarized using skimmers for the underlying data, typically either numeric or character.

BUG FIXES

  • A skim_list (most commonly generated by the partition() function) also inherits from a list.

skimr 2.1.3

CRAN release: 2021-03-07

MINOR IMPROVEMENTS

  • Add support for data tables when dtplyr is used.
  • Improve tests.

skimr 2.1.2

CRAN release: 2020-07-06

MINOR IMPROVEMENTS

  • Add support for lubridate Timespan objects.
  • Improvements to Supporting Additional Objects vignette.

BUG FIXES

  • Update package to work with new version of knitr.

skimr 2.1.1 (2020-04-15)

CRAN release: 2020-04-16

MINOR IMPROVEMENTS

  • Prepare for release of dplyr 1.0 and related packages.
  • 0-length sfls are now permitted.

skimr 2.1.0 (2020-01-10)

NEW FEATURES

We’ve made to_long() generic, supporting a more intuitive interface.

  • Called on a skim_df, it reshapes the output into the V1 long style.
  • Called on other tibble-like objects, it first skims then produces the long output. You can pass a custom skim function, like skim_tee().

Thanks @sethlatimer for suggesting this feature.
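
The two calling styles described above can be sketched as follows. The argument name skim_fun and the custom skimmer my_skim are assumptions made for this illustration; check the to_long() help page for the exact signature.

```r
library(skimr)

# Called on a skim_df: reshape the wide result into the long layout.
skimmed <- skim(iris)
long_from_skim <- to_long(skimmed)

# Called on a plain data frame: skim first, then reshape.
long_from_df <- to_long(iris)

# Assumed usage: pass a custom skimming function through `skim_fun`.
# `my_skim` is a hypothetical skimmer created only for this example.
my_skim <- skim_with(numeric = sfl(mad = mad))
long_custom <- to_long(iris, skim_fun = my_skim)
```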

BUG FIXES

  • Update package to work with new version of tibble.
  • Add more flexibility in the rule width for skimr::summarize().
  • Add more README badges and documentation crosslinks.

skimr 2.0.1 (2019-11-23)

CRAN release: 2019-11-24

BUG FIXES

Address a failed build on CRAN due to lack of UTF-8 support on some platforms.

skimr 2.0.0 (2019-11-12)

Welcome to skimr V2

V2 is a complete rewrite of skimr, incorporating all of the great feedback the developers have received over the last year. A big thank you goes to @GShotwell, @akraemer007, @puterleat, @tonyfischetti, @Nowosad, @rgayler, @jrosen48, @randomgambit, @elben10, @koliii, @AndreaPi, @rubenarslan, @GegznaV, @svraka, @dpprdan and to our rOpenSci reviewers @jenniferthompson and @jimhester for all of their support. We couldn’t have done this without you.

For most users, the visual output of skimr will not change. However, for users who use skimr output as part of a larger workflow, the differences are substantial.

Breaking changes

The skim_df

We’ve changed the way data is represented within skimr to more closely match expectations. It is now wide by default, which makes piping statistics much simpler:

skim(iris) %>%
  dplyr::filter(numeric.sd > 1)

This means that the old reshaping functions skim_to_wide() and skim_to_list() are deprecated. The latter is replaced with a reshaping function called partition() that breaks a skim_df into a list by data type. Similarly, yank() gets a specific data type from the skim_df. to_long() gets you data that is closest to the format in the old API.

As the above example suggests, columns of summary statistics are prefixed by skim_type. That is, statistics from numeric columns all begin with numeric., those for factors all begin with factor., and so on.
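
To illustrate the reshaping helpers named above, here is a minimal sketch using the built-in iris data. It only shows the shape of the calls and makes no claim about the exact columns each helper returns.

```r
library(skimr)

skimmed <- skim(iris)

# Summary-statistic columns in the wide skim_df carry a type prefix.
prefixed <- grep("^numeric\\.", names(skimmed), value = TRUE)

# partition() breaks the skim_df into a list with one element per data type.
parts <- partition(skimmed)

# yank() extracts the summary for a single data type.
numeric_summary <- yank(skimmed, "numeric")
```

For iris, which has four numeric columns and one factor, the partitioned list has one element for each of those two types.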

Rendering

We’ve deprecated support for pander() and our kable() method. Instead, we now support knitr through the knit_print() API. This is much more seamless than before. Having a skim_df as the final object in a code chunk should produce nice results in the majority of RMarkdown formats.

Customizing and extending

We’ve deprecated the previous approach to customization. We no longer use skim_format(), and skim_with() no longer depends on a global state. Instead, skim_with() is now a function factory. Customization creates a new skimming function:

my_skim <- skim_with(numeric = sfl(mad = mad))

The fundamental tool for customization is the sfl object, a skimmer function list. It is used within skim_with() and also within our new API for adding default functions for new data types, the generic get_skimmers().

Most of the options set in skim_format are now either in function arguments or print arguments. The former can be updated using skim_with, the latter in a call to print(). In RMarkdown documents, you can change the number of displayed digits by adding the skimr_digits option to your code chunk.
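
A short sketch of the function-factory workflow described above, assuming the iris data and the same my_skim definition shown earlier. The print argument used here, strip_metadata, is the one named in these notes.

```r
library(skimr)

# skim_with() returns a new skimming function; the default skim() is unchanged.
my_skim <- skim_with(numeric = sfl(mad = mad))

custom <- my_skim(iris)
default <- skim(iris)

# The extra statistic shows up as a prefixed column in the custom result only.
has_mad_custom <- "numeric.mad" %in% names(custom)
has_mad_default <- "numeric.mad" %in% names(default)

# Print-time settings are passed to print() rather than stored globally.
print(custom, strip_metadata = FALSE)
```

Because each customization yields its own function, several differently configured skimmers can coexist in one session.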

OTHER NEW FEATURES

  • Substantial improvements to summary(), and it is now incorporated into print() methods.
  • focus() is like dplyr::select(), but it keeps around the columns skim_type and skim_variable.
  • We are also evaluating the behavior of different dplyr verbs to make sure that they play nicely with skimr objects.
  • While skimr has never really focused on performance, it should do a better job on big data sets with lots of different columns.
  • New statistic for character variables counting the number of rows that are completely made up of white space.
  • We now export skim_without_charts() as a fallback for when Unicode support is not possible.
  • By default, skimr removes the tibble metadata when generating output. On some platforms, this can lead to all output getting removed. To disable that behavior, either set strip_metadata = FALSE when calling print() or use options(skimr_strip_metadata = FALSE).
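
A few of the features listed above can be sketched together. The choice of n_missing and numeric.mean as focused columns is arbitrary and only for illustration.

```r
library(skimr)

skimmed <- skim(iris)

# focus() selects statistics but keeps skim_type and skim_variable.
focused <- focus(skimmed, n_missing, numeric.mean)

# Fallback skimmer for platforms without Unicode support.
plain <- skim_without_charts(iris)

# Keep the tibble metadata for a single print() call.
print(focused, strip_metadata = FALSE)
```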

BUG FIXES

  • Adjust code for several tidyverse soft deprecations.
  • Fix issue where multibyte characters were causing an error.

MINOR IMPROVEMENTS

  • Change top_counts to use useNA = "no".

skimr 1.0.6 (2019-05-27)

CRAN release: 2019-05-27

BUG FIXES

  • Fix issue where skim_tee() was not respecting ... options.
  • Fix issue where all-NA character vectors were not returning NA for max() and min().

skimr 1.0.5 (2019-01-05)

CRAN release: 2019-02-25

This is likely to be the last release of skimr version 1. Version 2 has major changes to the API. Users should review and prepare for those changes now.

BUG FIXES

  • Fix issue where multibyte characters were causing an error.
  • Fix problem in which purrr cannot find mean.default.

skimr 1.0.4 (2018-01-12)

CRAN release: 2019-01-13

This is likely to be the last release of skimr version 1. Version 2 has major changes to the API. Users should review and prepare for those changes now.

BUG FIXES

  • Fix failures in handling dplyr verbs related to upcoming release of dplyr 0.8.0.

skimr 1.0.3 (2018-06-06)

CRAN release: 2018-06-07

NEW FEATURES

  • You can use skim_with() with a nested list of functions: skim_with(.list = mylist) or skim_with(!!!mylist).
  • More polished display of subtables in default printing.

BUG FIXES

  • Fix issue with conflict between knitr and skimr versions of kable() that occurred intermittently.
  • Do not skim a class when the skimmer list is empty for that class.
  • Fix a mistake in a test of skim_print for top counts.

skimr 1.0.2 (2018-04-04)

CRAN release: 2018-04-02

NEW FEATURES

  • You can create skimmers with the formula syntax from rlang: skim_with(iqr = ~IQR(.x, na.rm = TRUE)).

MAJOR CHANGES

  • The median label has been changed to p50 for consistency with the previous changes to p0 and p100.

MINOR IMPROVEMENTS

  • Improvements and corrections to the README and other documentation.
  • New vignette showing defaults for skimmers and formats.
  • Vector output matches data frame output more closely.
  • Add minimum required version for testthat.
  • Add minimum required version for knitr.

BUG FIXES

  • You can use skim_with() to add and remove skimmers at the same time, e.g. skim_with(iqr = IQR, hist = NULL) works as expected.
  • Histograms work when Inf or -Inf are present.
  • Change seq() parameter to length.out to avoid problems with name matching.
  • Summary should not display a data frame name of “.” (which occurs when piping begins with the data frame).

skimr 1.0.1 (2018-01-09)

CRAN release: 2018-01-10

NEW FEATURES

  • Add support for spark plots on Windows

MAJOR CHANGES

  • spark_line() and spark_bar() are no longer exported
  • Default statistics for numeric changed from min(x) and max(x) to quantile(x, probs = 0) and quantile(x, probs = 1). These changes lead to more predictable behaviors when a column is all NA values.
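
The difference in behavior for an all-NA column can be reproduced in base R. This sketch assumes missing values are dropped with na.rm = TRUE before the statistic is computed; that detail is an assumption and is not stated in these notes.

```r
# A column that contains only missing values.
x <- c(NA_real_, NA_real_)

# min() on the emptied vector warns and returns Inf (max() returns -Inf).
old_min <- suppressWarnings(min(x, na.rm = TRUE))
old_max <- suppressWarnings(max(x, na.rm = TRUE))

# quantile() returns NA for the same input, at both ends of the range.
new_p0 <- quantile(x, probs = 0, na.rm = TRUE, names = FALSE)
new_p100 <- quantile(x, probs = 1, na.rm = TRUE, names = FALSE)
```

Returning NA instead of an infinite value with a warning is what makes the quantile-based statistics more predictable here.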

MINOR IMPROVEMENTS

  • Add minimum required version for stringr
  • Improve documentation in general, especially those related to fonts

BUG FIXES

  • Fix issue where a histogram for data with all NAs threw an error
  • Suppress progress bars from dplyr::do()

skimr 0.92 (2017-12-19)

MAJOR CHANGES

  • skim_v() is no longer exported. Vectors are now directly supported via skim.default().
  • Change license to GPL 3

NEW FEATURES

  • Add support for kable() and pander() for skim_df objects.
  • Add summary method for skim_df objects.
  • Add support for tidy select to skim specific columns of a data frame.
  • Add support for skimming individual vectors via skim.default().
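
As a minimal sketch of column selection with tidy select, written against the current skimr API rather than the 0.92 release described here (so details may differ from that version):

```r
library(skimr)

# Skim only the named columns of a data frame.
two_cols <- skim(iris, Sepal.Length, Species)

# Tidy select helpers work as well; starts_with() comes from dplyr here.
sepal_cols <- skim(iris, dplyr::starts_with("Sepal"))
```

Each result contains one row per selected column.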

skimr 0.91 (2017-10-14)

NEW FEATURES

  • Handling of grouped data (generated by dplyr::group_by())
  • Printing for all column classes
  • Add an indicator of whether a factor is ordered to the skim object for factors
  • Introduction of flexible formatting
  • Easy dropping of individual functions
  • Vignettes for basic use and use with specialized object types
  • Updated README and added CONTRIBUTING.md and CONDUCT.md
  • New public get_skimmers function to access skim functions
  • Support for difftime class
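
Skimming grouped data, the first feature in the list above, can be sketched as follows. The example uses the current skimr API and the iris data, so the output layout may not match what version 0.91 produced.

```r
library(skimr)

# Group by Species, then skim: statistics are computed within each group.
grouped <- dplyr::group_by(iris, Species)
by_species <- skim(grouped)

# The grouping column is carried along in the result.
"Species" %in% names(by_species)
```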

MINOR IMPROVEMENTS

  • Add header to print providing summary information about data.

BUG FIXES

  • Change from Colformat to Pillar.

skimr 0.900 (2017-07-16)

BUG FIXES

  • Fix documentation for get_fun_names()
  • Fix test and build errors and notes