This function adds buckets for one or more new documents to an existing lsh_buckets object. Use the same bands value and minhash function that were used to create the original buckets.

Usage

lsh_add(buckets, x, bands, progress = interactive())

Arguments

buckets

An lsh_buckets object created by lsh.

x

A TextReuseCorpus or TextReuseTextDocument with minhashes.

bands

The number of bands to use for locality sensitive hashing. This must match the value used to create buckets, and the number of minhashes in each document in x must be evenly divisible by the number of bands. See lsh_threshold and lsh_probability for guidance on choosing the number of bands and hashes.
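
The divisibility requirement can be checked before any buckets are built. The following is a minimal base R sketch, not part of the package: the hash count h = 240 and the helper approx_threshold are hypothetical names chosen for illustration, and the threshold uses the common approximation (1 / b)^(1 / r), where r = h / b is the number of rows per band. For actual guidance, rely on lsh_threshold and lsh_probability; their results may differ from this sketch.

```r
# Illustrative only: `h` and `approx_threshold` are assumptions,
# not objects provided by the package.
h <- 240  # assumed number of minhashes per document

# Band counts that divide h evenly, i.e. the only valid choices for `bands`
valid_bands <- which(h %% seq_len(h) == 0)

# Approximate Jaccard similarity threshold for h hashes split into b bands,
# using (1 / b)^(1 / r) with r = h / b rows per band
approx_threshold <- function(h, b) {
  stopifnot(h %% b == 0)  # reject band counts that do not divide h
  (1 / b)^(b / h)
}

# Tabulate each valid band count with its rows per band and threshold
data.frame(
  bands = valid_bands,
  rows = h / valid_bands,
  threshold = round(approx_threshold(h, valid_bands), 3)
)
```

More bands with fewer rows each lower the approximate threshold, so more document pairs become candidates. Whichever value is chosen for the original buckets must be reused when adding documents.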

progress

Display a progress bar while the new documents are processed.

Value

An updated lsh_buckets object containing the original buckets plus the buckets for the documents in x.