
Package index
- tokenize_characters(), tokenize_words(), tokenize_sentences(), tokenize_lines(), tokenize_paragraphs(), tokenize_regex() - Basic tokenizers
- chunk_text() - Chunk text into smaller segments
- mobydick - The text of Moby Dick
- tokenize_ngrams(), tokenize_skip_ngrams() - N-gram tokenizers
- tokenize_ptb() - Penn Treebank Tokenizer
- tokenize_character_shingles() - Character shingle tokenizers
- tokenize_word_stems() - Word stem tokenizer
- tokenizers-package, tokenizers - Tokenizers
- count_words(), count_characters(), count_sentences() - Count words, sentences, characters