Usage

pkgmatch_bm25(input, txt = NULL, idfs = NULL, corpus = "ropensci")

Arguments

input

A single character string to match against all input documents given in the second parameter, txt.

txt

An optional list of input documents. If not specified, data will be loaded as specified by the corpus parameter.

idfs

Optional list of Inverse Document Frequency weightings generated by the internal bm25_idf function. If not specified, values for the rOpenSci corpus will be automatically downloaded and used.

corpus

If txt is not specified, data for the nominated corpus will be downloaded to a local cache directory, and BM25 values will be calculated against those data. Must be one of "ropensci", "ropensci-fns", or "cran". Note that the "ropensci-fns" corpus contains entries for every single function of every rOpenSci package, and the resulting BM25 values can be used to determine the best-matching function. The other two corpora are package-based, and the results can be used to find the best-matching package.

Value

A data.frame of package names and 'BM25' measures against text from whole packages both with and without function descriptions.

Examples

if (FALSE) { # \dontrun{
input <- "Download open spatial data from NASA"
bm25 <- pkgmatch_bm25 (input)
# Or pre-load document-frequency weightings:
idfs <- pkgmatch_load_data ("idfs", fns = FALSE)
bm25 <- pkgmatch_bm25 (input, idfs = idfs)
} # }
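
For readers unfamiliar with BM25, the following is a minimal base-R sketch of the standard Okapi BM25 score of one input string against a small list of documents. It is an illustration only, not the internal implementation of pkgmatch_bm25 or bm25_idf: the helpers tokenize and bm25_sketch, the toy documents, and the tuning constants k1 = 1.2 and b = 0.75 are all assumptions introduced here.

```r
# Hypothetical helper: lower-case a string and split it into word tokens.
tokenize <- function (x) {
    tokens <- strsplit (tolower (x), "[^a-z0-9]+") [[1]]
    tokens [nzchar (tokens)]
}

# Hypothetical helper: BM25 score of 'input' against each document in 'txt'.
# 'k1' and 'b' are conventional defaults, not values taken from pkgmatch.
bm25_sketch <- function (input, txt, k1 = 1.2, b = 0.75) {
    docs <- lapply (txt, tokenize)
    query <- unique (tokenize (input))
    n_docs <- length (docs)
    doc_len <- lengths (docs)
    avg_len <- mean (doc_len)

    # Inverse document frequency: terms found in fewer documents weigh more.
    n_containing <- vapply (query, function (q) {
        sum (vapply (docs, function (d) q %in% d, logical (1)))
    }, numeric (1))
    idf <- log ((n_docs - n_containing + 0.5) / (n_containing + 0.5) + 1)

    scores <- vapply (seq_along (docs), function (i) {
        # Term frequency of each query term in document i.
        tf <- vapply (query, function (q) sum (docs [[i]] == q), numeric (1))
        # Length normalisation: longer-than-average documents are penalised.
        len_norm <- k1 * (1 - b + b * doc_len [i] / avg_len)
        sum (idf * tf * (k1 + 1) / (tf + len_norm))
    }, numeric (1))

    data.frame (document = names (txt), bm25 = scores)
}

# Toy corpus, invented for this sketch.
txt <- list (
    nasa = "Download open spatial data from NASA servers",
    stats = "Fit linear regression models to tabular data",
    plots = "Draw maps and plots of spatial objects"
)
res <- bm25_sketch ("Download open spatial data from NASA", txt)
res <- res [order (res$bm25, decreasing = TRUE), ]
```

Sorting by the score in decreasing order puts the best-matching document first, which mirrors how the values returned by pkgmatch_bm25 can be used to find the best-matching package or function.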