The function detect_language() is vectorised and guesses the language of each
string in text, or returns NA if the language could not be reliably determined.
The function detect_language_mixed() is not vectorised and detects all languages
inside the entire character vector as a whole.
Examples
# Vectorized best guess
text <- c("To be or not to be?", "Ce n'est pas grave.",
  "Hij heeft de klok horen luiden maar weet niet waar de klepel hangt.")
detect_language(text)
#> [1] "en" "fr" "nl"
# Multiple languages in one text (doesn't seem to work well)
detect_language_mixed(text)
#>   language probability reliable proportion
#> 1       nl   0.9999962     TRUE          1
#> 2      und   0.0000000    FALSE          0
#> 3      und   0.0000000    FALSE          0
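
Because detect_language() returns one value per input string, and NA where no language could be reliably determined, its result can be lined up with the input and filtered. The following is a minimal sketch only. It assumes the functions come from the cld3 package (the text above does not name the package), and the input strings and the variable names lang, result and undetected are illustrative.

```r
library(cld3)  # assumption: the package that provides detect_language()

# Illustrative input; the last string is deliberately uninformative
text <- c("To be or not to be?", "Ce n'est pas grave.", "12345")

# One language code (or NA) per element of text
lang <- detect_language(text)

# The result has the same length as the input, so it can be stored alongside it
result <- data.frame(text = text, language = lang, stringsAsFactors = FALSE)

# Rows for which no language could be reliably determined
undetected <- result[is.na(result$language), ]

# Counts per detected language, listing NA separately if any occur
table(result$language, useNA = "ifany")
```

Testing with is.na() rather than comparing against a string such as "und" follows the description above, which states that detect_language() signals an unreliable guess with NA.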