Gutenberg metadata about the subject of each work, particularly Library of Congress Classifications (lcc) and Library of Congress Subject Headings (lcsh).

gutenberg_subjects

Format

A tbl_df (see tibble or dplyr) with one row for each pairing of work and subject, with columns:

gutenberg_id

ID of the work, which can be used to join with gutenberg_metadata

subject_type

Either "lcc" (Library of Congress Classification) or "lcsh" (Library of Congress Subject Headings)

subject

The subject value: a classification code for "lcc" rows, or a subject heading for "lcsh" rows
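Because the table holds one row per pairing of work and subject, a join on gutenberg_id repeats a work's metadata once for each of its subjects. The following is a minimal sketch of such a join, assuming the gutenbergr and dplyr packages are installed; the name subjects_with_titles is only illustrative.

```r
library(dplyr)
library(gutenbergr)

# Attach title and author to every (work, subject) pairing.
# A work with several LCSH headings appears once per heading.
subjects_with_titles <- gutenberg_subjects %>%
  filter(subject_type == "lcsh") %>%
  inner_join(gutenberg_metadata, by = "gutenberg_id") %>%
  select(gutenberg_id, title, author, subject)

subjects_with_titles
```

Use semi_join() instead of inner_join() when the goal is only to keep matching works without adding subject columns.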

Details

Find more information about the Library of Congress Classification at https://www.loc.gov/catdir/cpso/lcco/, and about Library of Congress Subject Headings at https://id.loc.gov/authorities/subjects.html.
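Many LCSH values in this dataset consist of a main heading followed by subdivisions separated by " -- ", for example "Conduct of life -- Juvenile fiction". The sketch below splits one such heading with base R; it assumes the " -- " separator is used consistently, which is not guaranteed for every row, and the variable names are illustrative.

```r
# A heading copied from the Examples output on this page.
heading <- "Holmes, Sherlock (Fictitious character) -- Fiction"

# Split on the literal " -- " separator; fixed = TRUE disables regex matching.
parts <- strsplit(heading, " -- ", fixed = TRUE)[[1]]

main_heading <- parts[1]   # text before the first separator
subdivisions <- parts[-1]  # zero or more subdivisions after it
```

A heading with no separator yields a single-element parts vector, so subdivisions is then an empty character vector.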

To find the date on which this metadata was last updated, run attr(gutenberg_subjects, "date_updated").

Examples


library(dplyr)
library(stringr)
library(gutenbergr) # provides gutenberg_subjects, gutenberg_works() and gutenberg_download()

gutenberg_subjects %>%
  filter(subject_type == "lcsh") %>%
  count(subject, sort = TRUE)
#> # A tibble: 24,488 × 2
#>    subject                                  n
#>    <chr>                                <int>
#>  1 Fiction                               1801
#>  2 Short stories                         1604
#>  3 Science fiction                       1308
#>  4 Adventure stories                      727
#>  5 Poetry                                 640
#>  6 Love stories                           622
#>  7 Historical fiction                     607
#>  8 Conduct of life -- Juvenile fiction    554
#>  9 English wit and humor -- Periodicals   544
#> 10 Detective and mystery stories          521
#> # … with 24,478 more rows

sherlock_holmes_subjects <- gutenberg_subjects %>%
  filter(str_detect(subject, "Holmes, Sherlock"))

sherlock_holmes_subjects
#> # A tibble: 47 × 3
#>    gutenberg_id subject_type subject                                           
#>           <int> <chr>        <chr>                                             
#>  1          108 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  2          221 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  3          244 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  4          834 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  5         1661 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  6         2097 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  7         2343 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  8         2344 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  9         2345 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#> 10         2346 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#> # … with 37 more rows

sherlock_holmes_metadata <- gutenberg_works() %>%
  filter(author == "Doyle, Arthur Conan") %>%
  semi_join(sherlock_holmes_subjects, by = "gutenberg_id")

sherlock_holmes_metadata
#> # A tibble: 15 × 8
#>    gutenberg_id title  author  gutenberg_autho… language gutenberg_books… rights
#>           <int> <chr>  <chr>              <int> <chr>    <chr>            <chr> 
#>  1          108 The R… Doyle,…               69 en       Detective Ficti… Publi…
#>  2          244 A Stu… Doyle,…               69 en       Detective Ficti… Publi…
#>  3          834 The M… Doyle,…               69 en       Detective Ficti… Publi…
#>  4         1661 The A… Doyle,…               69 en       Banned Books fr… Publi…
#>  5         2097 The S… Doyle,…               69 en       Detective Ficti… Publi…
#>  6         2343 The A… Doyle,…               69 en       Detective Ficti… Publi…
#>  7         2344 The A… Doyle,…               69 en       Detective Ficti… Publi…
#>  8         2345 The A… Doyle,…               69 en       Detective Ficti… Publi…
#>  9         2346 The A… Doyle,…               69 en       Detective Ficti… Publi…
#> 10         2347 The A… Doyle,…               69 en       Detective Ficti… Publi…
#> 11         2348 The D… Doyle,…               69 en       Detective Ficti… Publi…
#> 12         2349 The A… Doyle,…               69 en       Detective Ficti… Publi…
#> 13         2350 His L… Doyle,…               69 en       Detective Ficti… Publi…
#> 14         2852 The H… Doyle,…               69 en       Detective Ficti… Publi…
#> 15         3289 The V… Doyle,…               69 en       Detective Ficti… Publi…
#> # … with 1 more variable: has_text <lgl>

# Not run: downloads the full text of each matched work
if (FALSE) {
  holmes_books <- gutenberg_download(sherlock_holmes_metadata$gutenberg_id)

  holmes_books
}

# date last updated
attr(gutenberg_subjects, "date_updated")
#> [1] "2016-05-05"