Gutenberg metadata about the subject of each work, particularly Library of Congress Classifications (lcc) and Library of Congress Subject Headings (lcsh).
Format
A tbl_df (see tibble or dplyr) with one row for each pairing of work and subject, with columns:
- gutenberg_id
ID of a work, which can be used to join with gutenberg_metadata
- subject_type
Either "lcc" (Library of Congress Classification) or "lcsh" (Library of Congress Subject Headings)
- subject
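The subject itself: a classification code when subject_type is "lcc", or a subject heading when it is "lcsh"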
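Because there is one row per pairing of work and subject, a single work can appear on several rows. The following is a minimal sketch of joining the subjects to gutenberg_metadata on gutenberg_id, assuming the gutenbergr and dplyr packages are installed. The object names lcc_with_titles and subjects_per_work are illustrative only and are not part of the package.

```r
library(dplyr)
library(gutenbergr)

# Keep only the Library of Congress Classification rows, then attach
# the title and author of each work from gutenberg_metadata.
lcc_with_titles <- gutenberg_subjects %>%
  filter(subject_type == "lcc") %>%
  inner_join(
    gutenberg_metadata %>% select(gutenberg_id, title, author),
    by = "gutenberg_id"
  )

# Count how many subject rows (of either type) each work has.
subjects_per_work <- gutenberg_subjects %>%
  count(gutenberg_id, name = "n_subjects", sort = TRUE)

lcc_with_titles
subjects_per_work
```

An inner join is used here so that only works present in both tables are kept; a left_join() would instead keep every subject row, with missing values where no metadata is available.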
Details
For more information about Library of Congress Classification, see https://www.loc.gov/catdir/cpso/lcco/. For Library of Congress Subject Headings, see https://id.loc.gov/authorities/subjects.html.
To find the date on which this metadata was last updated, run attr(gutenberg_subjects, "date_updated").
Examples
library(dplyr)
library(stringr)
gutenberg_subjects %>%
  filter(subject_type == "lcsh") %>%
  count(subject, sort = TRUE)
#> # A tibble: 37,961 × 2
#> subject n
#> <chr> <int>
#> 1 Science fiction 2880
#> 2 Short stories 2704
#> 3 Fiction 1978
#> 4 Adventure stories 1461
#> 5 Historical fiction 934
#> 6 Conduct of life -- Juvenile fiction 874
#> 7 Love stories 851
#> 8 Detective and mystery stories 811
#> 9 Man-woman relationships -- Fiction 782
#> 10 Poetry 681
#> # ℹ 37,951 more rows
sherlock_holmes_subjects <- gutenberg_subjects %>%
  filter(str_detect(subject, "Holmes, Sherlock"))
sherlock_holmes_subjects
#> # A tibble: 54 × 3
#> gutenberg_id subject_type subject
#> <int> <chr> <chr>
#> 1 108 lcsh Holmes, Sherlock (Fictitious character) -- Fiction
#> 2 221 lcsh Holmes, Sherlock (Fictitious character) -- Fiction
#> 3 244 lcsh Holmes, Sherlock (Fictitious character) -- Fiction
#> 4 834 lcsh Holmes, Sherlock (Fictitious character) -- Fiction
#> 5 1661 lcsh Holmes, Sherlock (Fictitious character) -- Fiction
#> 6 2097 lcsh Holmes, Sherlock (Fictitious character) -- Fiction
#> 7 2343 lcsh Holmes, Sherlock (Fictitious character) -- Fiction
#> 8 2344 lcsh Holmes, Sherlock (Fictitious character) -- Fiction
#> 9 2345 lcsh Holmes, Sherlock (Fictitious character) -- Fiction
#> 10 2346 lcsh Holmes, Sherlock (Fictitious character) -- Fiction
#> # ℹ 44 more rows
sherlock_holmes_metadata <- gutenberg_works() %>%
  filter(author == "Doyle, Arthur Conan") %>%
  semi_join(sherlock_holmes_subjects, by = "gutenberg_id")
sherlock_holmes_metadata
#> # A tibble: 14 × 8
#> gutenberg_id title author gutenberg_author_id language gutenberg_bookshelf
#> <int> <chr> <chr> <int> <chr> <chr>
#> 1 108 The Ret… Doyle… 69 en Detective Fiction
#> 2 244 A Study… Doyle… 69 en Detective Fiction
#> 3 834 The Mem… Doyle… 69 en Detective Fiction
#> 4 2097 The Sig… Doyle… 69 en Detective Fiction
#> 5 2343 The Adv… Doyle… 69 en Detective Fiction
#> 6 2344 The Adv… Doyle… 69 en Detective Fiction
#> 7 2345 The Adv… Doyle… 69 en Detective Fiction
#> 8 2346 The Adv… Doyle… 69 en Detective Fiction
#> 9 2347 The Adv… Doyle… 69 en Detective Fiction
#> 10 2348 The Dis… Doyle… 69 en Detective Fiction
#> 11 2349 The Adv… Doyle… 69 en Detective Fiction
#> 12 2350 His Las… Doyle… 69 en Detective Fiction
#> 13 3070 The Hou… Doyle… 69 en Bestsellers, Ameri…
#> 14 3289 The Val… Doyle… 69 en Detective Fiction
#> # ℹ 2 more variables: rights <chr>, has_text <lgl>
# Not run: downloads the full text of each matching work
if (FALSE) {
  holmes_books <- gutenberg_download(sherlock_holmes_metadata$gutenberg_id)
  holmes_books
}
# date last updated
attr(gutenberg_subjects, "date_updated")
#> [1] "2022-12-19"