
Return raw embeddings from package text and function definitions.
Source: R/embeddings.R
pkgmatch_embeddings_from_pkgs.Rd
This function accepts a vector of either names of installed packages or paths to local source code directories, and calculates language model (LM) embeddings both for the text descriptions within each package (documentation, including that of functions) and for the entire code base. Embeddings may also be calculated separately for all function descriptions.
The embeddings are currently retrieved from a local 'ollama' server (https://ollama.com) running Jina AI embeddings (https://ollama.com/jina/jina-embeddings-v2-base-en for text, and https://ollama.com/ordis/jina-embeddings-v2-base-code for code).
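Because the embeddings depend on a local 'ollama' server, it can be useful to confirm that one is reachable before calling this function. The following is a minimal sketch only, not part of pkgmatch: the helper name `ollama_is_running` is hypothetical, and it assumes the server listens on ollama's default address (`http://127.0.0.1:11434`) and answers on its `/api/tags` endpoint, which lists locally available models.

```r
# Hypothetical helper (not part of pkgmatch): returns TRUE if an ollama
# server answers at `base_url`, and FALSE otherwise.
# Assumption: the server uses ollama's default address and port.
ollama_is_running <- function (base_url = "http://127.0.0.1:11434") {
    tags_url <- paste0 (base_url, "/api/tags")
    # readLines() opens and closes the URL connection itself; any failure
    # to connect is caught and treated as "not running".
    res <- tryCatch (
        suppressWarnings (readLines (tags_url, warn = FALSE)),
        error = function (e) NULL
    )
    !is.null (res) && length (res) > 0L
}

if (!ollama_is_running ()) {
    message ("No ollama server found at the default address.")
}
```

The check only confirms that a server responds; it does not verify that the two Jina models named above have been pulled.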
Arguments
- packages
A vector of either names of installed packages, or local paths to directories containing R packages.
- functions_only
If TRUE, calculate embeddings for function descriptions only. This is intended to generate a separate set of embeddings which can then be used to match plain-text queries of functions, rather than entire packages.
Value
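If functions_only is FALSE, a list of matrices of embeddings: text embeddings for the descriptions of the specified packages, both with and without the individual descriptions of all package functions (named text_with_fns and text_wo_fns in the example output below), and embeddings for the entire code base (named code). If functions_only is TRUE, a single matrix of embeddings for all function descriptions, with one column per function.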
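The function-level matrix is intended for matching plain-text queries against functions. One common way to do that is cosine similarity between a query embedding and each column of the matrix. The sketch below illustrates the idea under stated assumptions: it uses a small synthetic matrix in place of real embeddings, the query vector is simulated rather than generated by a language model, and `cosine_similarity` is a hypothetical helper, not necessarily the measure pkgmatch itself uses.

```r
set.seed (1)

# Synthetic stand-in for the matrix returned with `functions_only = TRUE`:
# rows are embedding dimensions, columns are functions.
n_dim <- 8L
fn_names <- c ("curl::curl_download", "curl::curl_fetch", "curl::nslookup")
emb_fns <- matrix (
    rnorm (n_dim * length (fn_names)),
    nrow = n_dim,
    dimnames = list (NULL, fn_names)
)

# Simulated query embedding, deliberately close to one of the columns.
# In practice this would be an embedding of the plain-text query.
query <- emb_fns [, "curl::curl_fetch"] + rnorm (n_dim, sd = 0.01)

# Hypothetical helper: cosine similarity between `query` and each column
# of `emb`.
cosine_similarity <- function (emb, query) {
    query <- as.numeric (query)
    stopifnot (nrow (emb) == length (query))
    num <- as.numeric (crossprod (emb, query))
    den <- sqrt (colSums (emb^2)) * sqrt (sum (query^2))
    stats::setNames (num / den, colnames (emb))
}

# Functions ranked from most to least similar to the query.
sims <- sort (cosine_similarity (emb_fns, query), decreasing = TRUE)
print (sims)
```

Keeping one column per function means the column names carry through to the ranked result, so the best matches can be read off directly.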
See also
Other embeddings:
pkgmatch_embeddings_from_text()
Examples
packages <- "curl"
emb_fns <- pkgmatch_embeddings_from_pkgs (packages, functions_only = TRUE)
#> Generating text embeddings for function descriptions ...
colnames (emb_fns) # All functions in the package
#> [1] "curl::curl" "curl::curl_download" "curl::curl_echo"
#> [4] "curl::curl_escape" "curl::curl_fetch" "curl::curl_options"
#> [7] "curl::curl_parse_url" "curl::curl_upload" "curl::file_writer"
#> [10] "curl::handle" "curl::handle_cookies" "curl::ie_proxy"
#> [13] "curl::multi" "curl::multi_download" "curl::multipart"
#> [16] "curl::nslookup" "curl::parse_date" "curl::parse_headers"
#> [19] "curl::send_mail"
emb_pkg <- pkgmatch_embeddings_from_pkgs (packages, functions_only = FALSE)
#> Generating text embeddings [1 / 2] ...
#> Generating text embeddings [2 / 2] ...
#> Generating code embeddings ...
names (emb_pkg)
#> [1] "text_with_fns" "text_wo_fns" "code"
colnames (emb_pkg$text_with_fns) # "curl"
#> [1] "curl"