This function accepts a vector of either names of installed packages or paths to local source code directories, and calculates language model (LM) embeddings both for the text descriptions within each package (its documentation, including function documentation) and for the entire code base. Embeddings may also be calculated separately for all function descriptions.

The embeddings are currently retrieved from a local 'ollama' server (https://ollama.com) running Jina AI embeddings (https://ollama.com/jina/jina-embeddings-v2-base-en for text, and https://ollama.com/ordis/jina-embeddings-v2-base-code for code).

Usage

pkgmatch_embeddings_from_pkgs(packages = NULL, functions_only = FALSE)

Arguments

packages

A vector of either names of installed packages, or local paths to directories containing R packages.

functions_only

If TRUE, calculate embeddings for function descriptions only. This is intended to generate a separate set of embeddings which can then be used to match plain-text queries of functions, rather than entire packages.

Value

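If functions_only = FALSE, a list of embedding matrices: two for the text descriptions of the specified packages, one including individual descriptions of all package functions (text_with_fns) and one without them (text_wo_fns), plus one for the entire code base (code). If functions_only = TRUE, a single matrix of embeddings for all function descriptions. In each matrix, columns are named after the packages or functions they describe.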
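
As a rough illustration of how a returned matrix might be used for matching, the sketch below ranks columns by cosine similarity to a query embedding. It does not call pkgmatch or ollama: the matrix emb_fns is toy random data standing in for a functions_only = TRUE result, the function names "pkg::f1" to "pkg::f3" and the helper cosine_sim() are hypothetical, and the only assumption about layout is that each column holds one embedding, as the column names in the examples suggest. In practice the query vector would need to come from the same embedding model, for instance via pkgmatch_embeddings_from_text().

```r
# Toy stand-in for a matrix returned with functions_only = TRUE:
# one column per function, one row per embedding dimension.
set.seed (1)
emb_fns <- matrix (
    rnorm (8 * 3),
    nrow = 8,
    dimnames = list (NULL, c ("pkg::f1", "pkg::f2", "pkg::f3"))
)

# Hypothetical query embedding: a slightly noisy copy of the second column,
# so that "pkg::f2" should come out as the closest match.
query <- emb_fns [, "pkg::f2"] + rnorm (8, sd = 0.01)

# Cosine similarity between a vector `v` and every column of matrix `m`.
cosine_sim <- function (m, v) {
    as.vector (crossprod (m, v)) / (sqrt (colSums (m^2)) * sqrt (sum (v^2)))
}

sims <- cosine_sim (emb_fns, query)
names (sims) <- colnames (emb_fns)

# Columns ranked from most to least similar to the query.
sort (sims, decreasing = TRUE)
```

Cosine similarity is used here only because it is a common choice for comparing embeddings; it is not necessarily the measure that pkgmatch itself applies.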

See also

Other embeddings: pkgmatch_embeddings_from_text()

Examples

packages <- "curl"
emb_fns <- pkgmatch_embeddings_from_pkgs (packages, functions_only = TRUE)
#> Generating text embeddings for function descriptions ...
colnames (emb_fns) # All functions in the package
#>  [1] "curl::curl"           "curl::curl_download"  "curl::curl_echo"     
#>  [4] "curl::curl_escape"    "curl::curl_fetch"     "curl::curl_options"  
#>  [7] "curl::curl_parse_url" "curl::curl_upload"    "curl::file_writer"   
#> [10] "curl::handle"         "curl::handle_cookies" "curl::ie_proxy"      
#> [13] "curl::multi"          "curl::multi_download" "curl::multipart"     
#> [16] "curl::nslookup"       "curl::parse_date"     "curl::parse_headers" 
#> [19] "curl::send_mail"     
emb_pkg <- pkgmatch_embeddings_from_pkgs (packages, functions_only = FALSE)
#> Generating text embeddings [1 / 2] ...
#> Generating text embeddings [2 / 2] ...
#> Generating code embeddings ...
names (emb_pkg)
#> [1] "text_with_fns" "text_wo_fns"   "code"         
colnames (emb_pkg$text_with_fns) # "curl"
#> [1] "curl"