Author: David Ranzolin

License: MIT


The goal of rperseus is to furnish classicists, textual critics, and R enthusiasts with texts from the Classical World. While English translations of most texts are available through gutenbergr, rperseus returns these works in their original languages: Greek, Latin, and Hebrew.


rperseus provides access to classical texts within the Perseus Digital Library’s CapiTainS environment. A wealth of Greek, Latin, and Hebrew texts is available, from Homer to Cicero to Boethius. The Perseus Digital Library includes English translations in some cases. The base API url is


rperseus is not on CRAN, but can be installed via:



See the vignette to get started.

To obtain a particular text, you must first know its full Uniform Resource Name (URN). URNs can be perused in the perseus_catalog, a data frame lazily loaded into the package. For example, say I want a copy of Virgil’s Aeneid:


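library(dplyr)
library(rperseus)

# Look up the URN of the Latin Aeneid, then fetch the text
aeneid_latin <- perseus_catalog %>% 
  filter(group_name == "Virgil",
         label == "Aeneid",
         language == "lat") %>% 
  pull(urn) %>% 
  get_perseus_text()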

You can also request an English translation for some texts:

# Same lookup, but for the English translation
aeneid_english <- perseus_catalog %>% 
  filter(group_name == "Virgil",
         label == "Aeneid",
         language == "eng") %>% 
  pull(urn) %>% 
  get_perseus_text()

Refer to the language variable in perseus_catalog for translation availability.
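
One way to check availability before requesting a text is to list the languages recorded for a given work. The sketch below uses a small stand-in data frame, `mock_catalog`, with hypothetical rows and placeholder URNs so that it runs without the package; only the column names (`urn`, `group_name`, `label`, `language`) are taken from the examples above. The helper `languages_for` is likewise hypothetical. With rperseus loaded, the same subsetting should carry over to `perseus_catalog`.

```r
# Stand-in for perseus_catalog: hypothetical rows with placeholder URNs.
mock_catalog <- data.frame(
  urn        = c("urn:placeholder:1", "urn:placeholder:2", "urn:placeholder:3"),
  group_name = c("Virgil", "Virgil", "Sophocles"),
  label      = c("Aeneid", "Aeneid", "Philoctetes"),
  language   = c("lat", "eng", "grc"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the languages recorded for one work.
languages_for <- function(catalog, group, work) {
  rows <- catalog$group_name == group & catalog$label == work
  sort(unique(catalog$language[rows]))
}

aeneid_languages <- languages_for(mock_catalog, "Virgil", "Aeneid")
print(aeneid_languages)

# TRUE when an English translation is listed for the work.
has_english <- "eng" %in% aeneid_languages
print(has_english)
```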


You can also specify excerpts:

qoheleth <- get_perseus_text(urn = "urn:cts:ancJewLit:hebBible.ecclesiastes.leningrad-pntd", excerpt = "1.1-1.3")
#> [1] "דִּבְרֵי֙ קֹהֶ֣לֶת בֶּן־ דָּוִ֔ד מֶ֖לֶךְ בִּירוּשָׁלִָֽם : הֲבֵ֤ל הֲבָלִים֙ אָמַ֣ר קֹהֶ֔לֶת הֲבֵ֥ל הֲבָלִ֖ים הַכֹּ֥ל הָֽבֶל : מַה־ יִּתְר֖וֹן לָֽאָדָ֑ם בְּכָל־ עֲמָל֔וֹ שֶֽׁיַּעֲמֹ֖ל תַּ֥חַת הַשָּֽׁמֶשׁ :"
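
To make the two arguments above easier to read, here is a minimal base-R sketch that splits a CTS URN and an excerpt string into their parts. The helper names `split_cts_urn` and `split_excerpt` are hypothetical and not part of rperseus; the sketch only illustrates the shape of the strings and makes no claim about how the package handles them internally.

```r
# Hypothetical helpers (not part of rperseus): they only illustrate how the
# strings passed to get_perseus_text() are structured.

# Split a CTS URN of the form urn:cts:<namespace>:<textgroup>.<work>.<version>
split_cts_urn <- function(urn) {
  fields <- strsplit(urn, ":", fixed = TRUE)[[1]]
  stopifnot(length(fields) >= 4, fields[1] == "urn", fields[2] == "cts")
  ids <- strsplit(fields[4], ".", fixed = TRUE)[[1]]
  list(
    namespace  = fields[3],
    text_group = ids[1],
    work       = ids[2],
    version    = ids[3]
  )
}

# Split an excerpt such as "1.1-1.3" into its first and last reference;
# a single reference such as "1.4" is used for both ends.
split_excerpt <- function(excerpt) {
  bounds <- strsplit(excerpt, "-", fixed = TRUE)[[1]]
  list(start = bounds[1], end = bounds[length(bounds)])
}

urn_parts <- split_cts_urn("urn:cts:ancJewLit:hebBible.ecclesiastes.leningrad-pntd")
range_parts <- split_excerpt("1.1-1.3")

print(urn_parts$work)     # the work identifier within the text group
print(range_parts$start)  # first reference requested
```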

Parsing Excerpts

You can parse any Greek excerpt, returning a data frame with each word’s part of speech, gender, case, mood, voice, tense, person, number, and degree.

parse_excerpt("urn:cts:greekLit:tlg0031.tlg002.perseus-grc2", "5.1-5.2") %>% 
  head(7)
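
| word | form | verse | part_of_speech | person | number | tense | mood | voice | gender | case | degree |
|---|---|---|---|---|---|---|---|---|---|---|---|
| καί | Καὶ | 5.1 | conjunction | NA | NA | NA | NA | NA | NA | NA | NA |
| ἔρχομαι | ἦλθον | 5.1 | verb | third | plural | aorist | indicative | active | NA | NA | NA |
| εἰς | εἰς | 5.1 | preposition | NA | NA | NA | NA | NA | NA | NA | NA |
| | τὸ | 5.1 | article | NA | singular | NA | NA | NA | neuter | accusative | NA |
| πέραν | πέραν | 5.1 | adverb | NA | NA | NA | NA | NA | NA | NA | NA |
| | τῆς | 5.1 | article | NA | singular | NA | NA | NA | feminine | genative | NA |
| θάλασσα | θαλάσσης | 5.1 | noun | NA | singular | NA | NA | NA | feminine | genative | NA |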
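
Because the result is an ordinary data frame, it can be summarised with standard tools. As a minimal sketch, the block below re-enters three columns of the seven rows shown above by hand, so that it runs without a network call, and tallies the parts of speech. `parsed` is a hypothetical name standing in for the value returned by `parse_excerpt()`.

```r
# Hand-copied from the seven rows printed above; `parsed` stands in for the
# data frame returned by parse_excerpt().
parsed <- data.frame(
  form           = c("Καὶ", "ἦλθον", "εἰς", "τὸ", "πέραν", "τῆς", "θαλάσσης"),
  verse          = rep("5.1", 7),
  part_of_speech = c("conjunction", "verb", "preposition", "article",
                     "adverb", "article", "noun"),
  stringsAsFactors = FALSE
)

# Count how often each part of speech occurs, most frequent first.
pos_counts <- sort(table(parsed$part_of_speech), decreasing = TRUE)
print(pos_counts)

# Keep only the rows tagged as articles.
articles <- parsed[parsed$part_of_speech == "article", ]
print(nrow(articles))
```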

tidyverse and tidytext

rperseus plays well with the tidyverse and tidytext. Here I obtain all of Plato’s works that have English translations available:

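# Several URNs come back here, so fetch each text in turn and row-bind them
plato <- perseus_catalog %>% 
  filter(group_name == "Plato",
         language == "eng") %>% 
  pull(urn) %>% 
  purrr::map_df(get_perseus_text)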

And here’s how to retrieve the Greek text from Sophocles’ underrated Philoctetes before unleashing the tidytext toolkit:


philoctetes <- perseus_catalog %>% 
  filter(group_name == "Sophocles",
         label == "Philoctetes",
         language == "grc") %>% 
  pull(urn) %>%
  get_perseus_text()

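library(tidytext)
philoctetes %>% 
  unnest_tokens(word, text) %>% 
  count(word, sort = TRUE) %>% 
  # assumed step: drop stop words (greek_stop_words is assumed to ship with rperseus)
  anti_join(greek_stop_words)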
#> Joining, by = "word"
#> # A tibble: 3,514 x 2
#>           word     n
#>          <chr> <int>
#>  1 νεοπτόλεμος   164
#>  2  φιλοκτήτης   141
#>  3           ὦ   119
#>  4          μʼ    74
#>  5    ὀδυσσεύς    56
#>  6      τέκνον    47
#>  7          τʼ    43
#>  8       χορός    41
#>  9          γʼ    40
#> 10         νῦν    39
#> # ... with 3,504 more rows
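
For readers new to tidytext, the pipeline above tokenizes the text and counts each word, and the join message in its output points to a join against a word list, presumably to drop stop words. A rough base-R sketch of those three steps on a toy English sentence follows. The sentence and the two-word stop list are invented for illustration, and real tokenizers handle case and punctuation with more care.

```r
# Toy input and stop list, invented for illustration only.
text <- "The bow, the island, the wound and the bow"
stop_words <- c("the", "and")

# 1. Tokenize: lower-case the text and split on runs of non-letters.
tokens <- strsplit(tolower(text), "[^[:alpha:]]+")[[1]]
tokens <- tokens[tokens != ""]

# 2. Count each token, most frequent first.
counts <- sort(table(tokens), decreasing = TRUE)

# 3. Drop stop words, the base-R analogue of an anti-join.
content_counts <- counts[!names(counts) %in% stop_words]
print(content_counts)
```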

Rendering Parallels

You can render small parallels with perseus_parallel:

tibble(label = c("Colossians", "1 Thessalonians", "Romans"),
       excerpt = c("1.4", "1.3", "8.35-8.39")) %>%
  left_join(perseus_catalog) %>%
  filter(language == "grc") %>%
  select(urn, excerpt) %>%
  pmap_df(get_perseus_text) %>%
  perseus_parallel(words_per_row = 4)
#> Joining, by = "label"


