Calculate the "BM25" ("Best Matching 25") ranking function between text input and all R packages within a specified corpus.
Source: R/bm25.R
Arguments
- input
A single character string to match against all input documents given in the second parameter, txt.
- txt
An optional list of input documents. If not specified, data will be loaded as specified by the corpus parameter.
- idfs
Optional list of Inverse Document Frequency weightings generated by the internal bm25_idf function. If not specified, values for the rOpenSci corpus will be automatically downloaded and used.
- corpus
If txt is not specified, data for the nominated corpus will be downloaded to a local cache directory, and BM25 values calculated against those. Must be one of "ropensci", "ropensci-fns", or "cran". Note that the "ropensci-fns" corpus contains entries for every single function of every rOpenSci package, and the resulting BM25 values can be used to determine the best-matching function. The other two corpora are package-based, and the results can be used to find the best-matching package.
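To illustrate how input, txt, and idfs relate to one another, the following is a minimal base-R sketch of a standard BM25 calculation. It is not the package's implementation: the names tokenize, bm25_idf_sketch, and bm25_sketch are hypothetical, and the tokenization, the smoothed IDF form, and the values k1 = 1.2 and b = 0.75 are common textbook choices assumed here for illustration only.

```r
# Hypothetical helper: lower-case a string and split it into word tokens.
tokenize <- function (x) {
    tokens <- strsplit (tolower (x), "[^a-z0-9]+") [[1]]
    tokens [nzchar (tokens)]
}

# Hypothetical helper: one IDF weighting per term in a tokenized corpus,
# using the smoothed form log (1 + (N - n + 0.5) / (n + 0.5)), where N is
# the number of documents and n the number of documents containing the term.
bm25_idf_sketch <- function (docs) {
    n_docs <- length (docs)
    doc_freq <- table (unlist (lapply (docs, unique)))
    idf <- log (1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    stats::setNames (as.numeric (idf), names (doc_freq))
}

# Sketch of BM25 between a single `input` string and a list of documents.
# `k1` and `b` are assumed defaults, not values taken from the package.
bm25_sketch <- function (input, txt, idfs = NULL, k1 = 1.2, b = 0.75) {
    docs <- lapply (txt, tokenize)
    if (is.null (idfs)) {
        idfs <- bm25_idf_sketch (docs)
    }
    doc_names <- names (txt)
    if (is.null (doc_names)) {
        doc_names <- as.character (seq_along (txt))
    }
    doc_len <- lengths (docs)
    avg_len <- mean (doc_len)

    # Only query terms with a known IDF weighting contribute to the score.
    query <- unique (tokenize (input))
    query <- query [query %in% names (idfs)]

    scores <- vapply (seq_along (docs), function (i) {
        # Term frequency of each query term within document i:
        tf <- vapply (query, function (q) sum (docs [[i]] == q), numeric (1))
        # Length normalisation relative to the average document length:
        norm <- k1 * (1 - b + b * doc_len [i] / avg_len)
        sum (idfs [query] * tf * (k1 + 1) / (tf + norm))
    }, numeric (1))

    res <- data.frame (
        document = doc_names,
        bm25 = scores,
        stringsAsFactors = FALSE
    )
    res [order (res$bm25, decreasing = TRUE), ]
}

# Toy corpus with made-up package descriptions:
txt <- list (
    pkgA = "Download spatial data from NASA satellites",
    pkgB = "Fit linear models to tabular data",
    pkgC = "Plot time series"
)
res <- bm25_sketch ("Download open spatial data from NASA", txt)
print (res)
```

Passing a pre-computed idfs vector, as in bm25_sketch (input, txt, idfs = idfs), skips the per-call IDF step. This mirrors the reason the idfs argument above can be pre-loaded once and reused across several queries.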
Value
A data.frame of package names and 'BM25' measures against text from whole packages both with and without function descriptions.
See also
Other bm25:
pkgmatch_bm25_fn_calls()
Examples
if (FALSE) { # \dontrun{
input <- "Download open spatial data from NASA"
bm25 <- pkgmatch_bm25 (input)
# Or pre-load document-frequency weightings:
idfs <- pkgmatch_load_data ("idfs", fns = FALSE)
bm25 <- pkgmatch_bm25 (input, idfs = idfs)
} # }