Get a table of Gutenberg work metadata that has been filtered by some common (settable) defaults, along with the option to add additional filters. This function is a convenience for applying common conditions when pulling a set of books to analyze. For more detailed filtering of the entire Project Gutenberg metadata, use the gutenberg_metadata and related datasets.
gutenberg_works(..., languages = "en", only_text = TRUE, rights = c("Public domain in the USA.", "None"), distinct = TRUE, all_languages = FALSE, only_languages = TRUE)
Argument | Description |
---|---|
`...` | Additional filters, given as expressions using the variables in the `gutenberg_metadata` dataset (e.g. `author == "Shakespeare, William"`) |
`languages` | Vector of languages to include |
`only_text` | Whether the works must have Gutenberg text attached. Works without text (e.g. audiobooks) cannot be downloaded with `gutenberg_download` |
`rights` | Values to allow in the `rights` field |
`distinct` | Whether to return only one distinct combination of each title and `gutenberg_author_id`. If multiple works match (and fulfill the other conditions), the one with the lowest ID is used |
`all_languages` | Whether, if multiple languages are given, all of them need to be present in a work. For example, if `c("en", "es")` is given, whether to return only `en/es` works rather than works that are `en`, `es`, or `en/es` |
`only_languages` | Whether to exclude works that have other languages besides the ones provided. For example, whether to include `en/fr` works when English works are requested |
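The interaction between `languages`, `all_languages`, and `only_languages` can be pictured with a small base-R sketch. This is an illustration of the behaviour described in the table, not the package's implementation: `matches_languages()` is a hypothetical helper, and it assumes the `language` field joins multiple language codes with `/` (as in `en/es`).

```r
# Hypothetical helper: decide which language strings would pass the filter.
# Assumption: multi-language works are encoded as codes joined by "/".
matches_languages <- function(language, languages,
                              all_languages = FALSE,
                              only_languages = TRUE) {
  parts <- strsplit(language, "/", fixed = TRUE)
  vapply(parts, function(p) {
    # Baseline: at least one requested language is present in the work.
    ok <- any(p %in% languages)
    # all_languages: every requested language must be present.
    if (all_languages) ok <- ok && all(languages %in% p)
    # only_languages: no language outside the requested set is allowed.
    if (only_languages) ok <- ok && all(p %in% languages)
    ok
  }, logical(1))
}

langs <- c("en", "es", "en/es", "en/fr", "fr")

# Defaults: keeps "en", "es", "en/es"
langs[matches_languages(langs, c("en", "es"))]

# all_languages = TRUE: keeps only "en/es"
langs[matches_languages(langs, c("en", "es"), all_languages = TRUE)]

# only_languages = FALSE: additionally keeps "en/fr"
langs[matches_languages(langs, c("en", "es"), only_languages = FALSE)]
```
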
A tbl_df (see the tibble or dplyr packages) with one row for each work, in the same format as gutenberg_metadata.
By default, returns:

- English-language works
- That are in text format in Gutenberg (as opposed to audio)
- Whose text is not under copyright
- At most one work for each title/author pair
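The "at most one per title/author pair" default, together with the lowest-ID rule described for `distinct`, can be sketched in base R on a toy table. The rows below are made up for illustration and are not real Project Gutenberg records; the approach (sort by ID, then drop repeated pairs) is an assumption about how one could reproduce the rule, not the package's own code.

```r
# Toy metadata: two entries per title/author pair (illustrative values only).
works <- data.frame(
  gutenberg_id        = c(12, 5, 30, 7),
  title               = c("Hamlet", "Hamlet", "Emma", "Emma"),
  gutenberg_author_id = c(65, 65, 68, 68),
  stringsAsFactors    = FALSE
)

# Sort by ID so the lowest ID of each pair comes first...
works <- works[order(works$gutenberg_id), ]

# ...then keep only the first occurrence of each title/author pair.
deduped <- works[!duplicated(works[c("title", "gutenberg_author_id")]), ]

deduped$gutenberg_id
```
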
#> # A tibble: 40,737 x 8
#>    gutenberg_id title author gutenberg_autho… language gutenberg_books… rights
#>           <int> <chr> <chr>             <int> <chr>    <chr>            <chr>
#>  1            0 NA    NA                   NA en       NA               Publi…
#>  2            1 "The… Jeffe…             1638 en       United States L… Publi…
#>  3            2 "The… Unite…                1 en       American Revolu… Publi…
#>  4            3 "Joh… Kenne…             1666 en       NA               Publi…
#>  5            4 "Lin… Linco…                3 en       US Civil War     Publi…
#>  6            5 "The… Unite…                1 en       American Revolu… Publi…
#>  7            6 "Giv… Henry…                4 en       American Revolu… Publi…
#>  8            7 "The… NA                   NA en       NA               Publi…
#>  9            8 "Abr… Linco…                3 en       US Civil War     Publi…
#> 10            9 "Abr… Linco…                3 en       US Civil War     Publi…
#> # … with 40,727 more rows, and 1 more variable: has_text <lgl>

# filter conditions
gutenberg_works(author == "Shakespeare, William")
#> # A tibble: 79 x 8
#>    gutenberg_id title author gutenberg_autho… language gutenberg_books… rights
#>           <int> <chr> <chr>             <int> <chr>    <chr>            <chr>
#>  1         1041 Shak… Shake…               65 en       NA               Publi…
#>  2         1045 Venu… Shake…               65 en       NA               Publi…
#>  3         1500 King… Shake…               65 en       NA               Publi…
#>  4         1501 Hist… Shake…               65 en       NA               Publi…
#>  5         1502 The … Shake…               65 en       NA               Publi…
#>  6         1503 The … Shake…               65 en       NA               Publi…
#>  7         1504 The … Shake…               65 en       NA               Publi…
#>  8         1505 The … Shake…               65 en       NA               Publi…
#>  9         1507 The … Shake…               65 en       NA               Publi…
#> 10         1508 The … Shake…               65 en       NA               Publi…
#> # … with 69 more rows, and 1 more variable: has_text <lgl>

#> # A tibble: 1 x 2
#>   language     n
#>   <chr>    <int>
#> 1 es         449

#> # A tibble: 3 x 2
#>   language     n
#>   <chr>    <int>
#> 1 en       40736
#> 2 es         447
#> 3 en/es       13

#> # A tibble: 1 x 2
#>   language     n
#>   <chr>    <int>
#> 1 en/es       13

#> # A tibble: 30 x 2
#>    language     n
#>    <chr>    <int>
#>  1 en       40736
#>  2 es         447
#>  3 en/eo       19
#>  4 en/la       19
#>  5 en/es       13
#>  6 en/fr       13
#>  7 de/en        7
#>  8 ang/en       3
#>  9 cy/en        3
#> 10 en/enm       3
#> # … with 20 more rows
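The four language-count tables above are shown without the calls that produced them. As a hedged sketch, calls along the following lines would be expected to give counts of that shape: `language_counts()` is a hypothetical wrapper introduced here, the code assumes the gutenbergr and dplyr packages are installed (it is skipped otherwise), and the exact numbers depend on the version of the bundled metadata.

```r
# Hypothetical wrapper: count works per language for a given set of
# gutenberg_works() arguments. Not part of gutenbergr itself.
language_counts <- function(...) {
  dplyr::count(gutenbergr::gutenberg_works(...), language, sort = TRUE)
}

# Only run the examples when both packages are available.
if (requireNamespace("gutenbergr", quietly = TRUE) &&
    requireNamespace("dplyr", quietly = TRUE)) {
  # Works in Spanish only
  print(language_counts(languages = "es"))

  # Works in English, Spanish, or both
  print(language_counts(languages = c("en", "es")))

  # Works that contain both English and Spanish
  print(language_counts(languages = c("en", "es"), all_languages = TRUE))

  # Works containing English or Spanish, possibly alongside other languages
  print(language_counts(languages = c("en", "es"), only_languages = FALSE))
}
```
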