All functions

tokenize_characters() tokenize_words() tokenize_sentences() tokenize_lines() tokenize_paragraphs() tokenize_regex()
Basic tokenizers
chunk_text()
Chunk text into smaller segments
mobydick
The text of Moby Dick
tokenize_ngrams() tokenize_skip_ngrams()
N-gram tokenizers
tokenize_ptb()
Penn Treebank Tokenizer
tokenize_character_shingles()
Character shingle tokenizers
tokenize_word_stems()
Word stem tokenizer
tokenizers-package tokenizers
Tokenizers
count_words() count_characters() count_sentences()
Count words, sentences, characters