Gutenberg metadata about the subject of each work, particularly Library of Congress Classifications (lcc) and Library of Congress Subject Headings (lcsh).

Usage

gutenberg_subjects

Format

A tbl_df (see tibble or dplyr) with one row for each pairing of work and subject, with columns:

gutenberg_id

ID identifying a work; it can be used to join with gutenberg_metadata

subject_type

Either "lcc" (Library of Congress Classification) or "lcsh" (Library of Congress Subject Headings)

subject

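The subject itself: a classification code when subject_type is "lcc", or a subject heading when it is "lcsh"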
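
As a minimal sketch of the join mentioned for gutenberg_id, the snippet below attaches titles and authors from gutenberg_metadata to the "lcc" rows. It assumes the gutenbergr and dplyr packages are installed; the name lcc_with_titles is hypothetical and chosen only for illustration.

```r
library(dplyr)
library(gutenbergr)

# Keep only Library of Congress Classification rows, then attach
# the title and author of each work via the shared gutenberg_id column.
# `lcc_with_titles` is an illustrative name, not part of the package.
lcc_with_titles <- gutenberg_subjects %>%
  filter(subject_type == "lcc") %>%
  inner_join(
    gutenberg_metadata %>% select(gutenberg_id, title, author),
    by = "gutenberg_id"
  )

lcc_with_titles
```

Because a work can carry several subjects, the result has one row per pairing of work and classification rather than one row per work.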

Details

Find more information about Library of Congress Classifications at https://www.loc.gov/catdir/cpso/lcco/ and about Library of Congress Subject Headings at https://id.loc.gov/authorities/subjects.html.

To find the date on which this metadata was last updated, run attr(gutenberg_subjects, "date_updated").

Examples


library(gutenbergr)
library(dplyr)
library(stringr)

gutenberg_subjects %>%
  filter(subject_type == "lcsh") %>%
  count(subject, sort = TRUE)
#> # A tibble: 37,961 × 2
#>    subject                                 n
#>    <chr>                               <int>
#>  1 Science fiction                      2880
#>  2 Short stories                        2704
#>  3 Fiction                              1978
#>  4 Adventure stories                    1461
#>  5 Historical fiction                    934
#>  6 Conduct of life -- Juvenile fiction   874
#>  7 Love stories                          851
#>  8 Detective and mystery stories         811
#>  9 Man-woman relationships -- Fiction    782
#> 10 Poetry                                681
#> # ℹ 37,951 more rows

sherlock_holmes_subjects <- gutenberg_subjects %>%
  filter(str_detect(subject, "Holmes, Sherlock"))

sherlock_holmes_subjects
#> # A tibble: 54 × 3
#>    gutenberg_id subject_type subject                                           
#>           <int> <chr>        <chr>                                             
#>  1          108 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  2          221 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  3          244 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  4          834 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  5         1661 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  6         2097 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  7         2343 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  8         2344 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#>  9         2345 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#> 10         2346 lcsh         Holmes, Sherlock (Fictitious character) -- Fiction
#> # ℹ 44 more rows

sherlock_holmes_metadata <- gutenberg_works() %>%
  filter(author == "Doyle, Arthur Conan") %>%
  semi_join(sherlock_holmes_subjects, by = "gutenberg_id")

sherlock_holmes_metadata
#> # A tibble: 14 × 8
#>    gutenberg_id title    author gutenberg_author_id language gutenberg_bookshelf
#>           <int> <chr>    <chr>                <int> <chr>    <chr>              
#>  1          108 The Ret… Doyle…                  69 en       Detective Fiction  
#>  2          244 A Study… Doyle…                  69 en       Detective Fiction  
#>  3          834 The Mem… Doyle…                  69 en       Detective Fiction  
#>  4         2097 The Sig… Doyle…                  69 en       Detective Fiction  
#>  5         2343 The Adv… Doyle…                  69 en       Detective Fiction  
#>  6         2344 The Adv… Doyle…                  69 en       Detective Fiction  
#>  7         2345 The Adv… Doyle…                  69 en       Detective Fiction  
#>  8         2346 The Adv… Doyle…                  69 en       Detective Fiction  
#>  9         2347 The Adv… Doyle…                  69 en       Detective Fiction  
#> 10         2348 The Dis… Doyle…                  69 en       Detective Fiction  
#> 11         2349 The Adv… Doyle…                  69 en       Detective Fiction  
#> 12         2350 His Las… Doyle…                  69 en       Detective Fiction  
#> 13         3070 The Hou… Doyle…                  69 en       Bestsellers, Ameri…
#> 14         3289 The Val… Doyle…                  69 en       Detective Fiction  
#> # ℹ 2 more variables: rights <chr>, has_text <lgl>

if (FALSE) {
holmes_books <- gutenberg_download(sherlock_holmes_metadata$gutenberg_id)

holmes_books
}

# date last updated
attr(gutenberg_subjects, "date_updated")
#> [1] "2022-12-19"