High-Performance Stemmer, Tokenizer, and Spell Checker for R
Low-level spell checker and morphological analyzer based on the famous hunspell library (https://hunspell.github.io). The package can analyze or check individual words, and it can tokenize plain text, LaTeX, HTML, or XML documents. For a more user-friendly interface, use the ‘spelling’ package, which builds on this package with utilities to automate checking of files, documentation, and vignettes in all common formats.
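As a rough sketch of document tokenizing, the snippet below passes the same string to hunspell_parse() once as plain text and once as LaTeX, then checks it with hunspell(). The sample string is made up for illustration, and the default English dictionary is assumed to be available; the exact tokens returned depend on the parser and are not shown here.

```r
library(hunspell)

# Illustrative LaTeX fragment (not taken from the package documentation)
tex <- "\\section{Introduction} Spell checkers are \\emph{usefull} for technical writing."

# Tokenize the same string with two different parsers;
# each call returns a list with one character vector per input string
tokens_text  <- hunspell_parse(tex, format = "text")
tokens_latex <- hunspell_parse(tex, format = "latex")
print(tokens_text[[1]])
print(tokens_latex[[1]])

# Spell check the string using the LaTeX parser;
# the result is again a list with one element per input string
bad <- hunspell(tex, format = "latex")
print(bad[[1]])
```

Comparing the two token vectors is a quick way to see which pieces of markup a given format setting treats as text.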
This package includes a bundled version of libhunspell and no longer depends on external system libraries.
    # Check individual words
    words <- c("beer", "wiskey", "wine")
    correct <- hunspell_check(words)
    print(correct)

    # Find suggestions for incorrect words
    hunspell_suggest(words[!correct])

    # Extract incorrect words from a piece of text
    bad <- hunspell("spell checkers are not neccessairy for langauge ninja's")
    print(bad[[1]])
    hunspell_suggest(bad[[1]])

    # Stemming
    words <- c("love", "loving", "lovingly", "loved", "lover", "lovely", "love")
    hunspell_stem(words)
    hunspell_analyze(words)
The spelling package uses this package to spell check R package documentation:
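A minimal sketch of that workflow, assuming the spelling package is installed from CRAN. The sample sentence is reused from the example above, and the package path in the commented-out call is a hypothetical placeholder rather than a real location.

```r
library(spelling)

# Spell check a character vector directly; the result summarizes
# the words that were not found in the dictionary
typos <- spell_check_text("spell checkers are not neccessairy for langauge ninja's")
print(typos)

# Hypothetical path: point this at the root of your own source package
# spell_check_package("path/to/your/package")
```

spell_check_text() is convenient for quick experiments, while spell_check_package() is the entry point for checking the documentation of a whole source package.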