Search a given NCBI database with a particular query.

Usage

entrez_search(
  db,
  term,
  config = NULL,
  retmode = "xml",
  use_history = FALSE,
  ...
)

Arguments

db

character, name of the database to search.

term

character, the search term. The syntax used in these searches is described in the Details section of this help page, in the package vignette, and in the reference given below.

config

vector, configuration options passed to httr::GET.

retmode

character, one of xml (default) or json. This will make no difference in most cases.

use_history

logical. If TRUE, return a web_history object for use in later calls to the NCBI.

...

character, additional terms to add to the request. See the NCBI documentation linked in the references for a complete list.

Value

ids integer Unique IDs returned by the search

count integer Total number of hits for the search

retmax integer Maximum number of hits returned by the search

web_history A web_history object for use in subsequent calls to NCBI

QueryTranslation character, search term as the NCBI interpreted it

file either an XMLInternalDocument (the XML file resulting from the search, parsed with xmlTreeParse) or, if retmode was set to json, a list resulting from the returned JSON file being parsed with fromJSON.
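
Because count is the total number of hits while ids holds at most retmax of them, it can be useful to check whether a result was truncated. The following is a minimal sketch using a mock list in place of a real search result; the values and the helper name is_truncated are invented for illustration and are not part of rentrez.

```r
# Mock object standing in for an entrez_search() result.
# The values are invented for illustration only.
res <- list(ids = c(101L, 102L, 103L), count = 250L, retmax = 3L)

# Hypothetical helper (not part of rentrez): TRUE when the search matched
# more records than were returned in `ids`.
is_truncated <- function(res) {
  res$count > length(res$ids)
}

if (is_truncated(res)) {
  message("Only ", length(res$ids), " of ", res$count, " hits were returned.")
}

# With a live connection, a larger retmax could be requested through `...`
# (not run here because it needs network access):
# res <- entrez_search(db = "pubmed", term = "APP[GENE]", retmax = 50)
```

For very large result sets, setting use_history = TRUE and working with the web_history object is likely a better fit than raising retmax.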

Details

The NCBI uses a search term syntax where search terms can be associated with a specific search field with square brackets. So, for instance “Homo[ORGN]” denotes a search for Homo in the “Organism” field. The names and definitions of these fields can be identified using entrez_db_searchable.

Searches can make use of several fields by combining them via the boolean operators AND, OR and NOT. So, using the search term “((Homo[ORGN] AND APP[GENE]) NOT Review[PTYP])” in PubMed would identify articles matching the gene APP in humans, and exclude review articles. More examples of the use of these search terms, and of the more specific MeSH terms for precise searching, are given in the package vignette. rentrez handles special characters and URL encoding (e.g. replacing spaces with plus signs) on the client side, so there is no need to include these in the search term.
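
As an illustration of the syntax described above, the PubMed term can be assembled from its parts in base R. This is only a sketch: field_term is a hypothetical helper, not a rentrez function, and the final call is commented out because it needs network access.

```r
# Hypothetical helper (not part of rentrez): attach a field tag to a value,
# e.g. field_term("Homo", "ORGN") gives "Homo[ORGN]".
field_term <- function(value, field) {
  sprintf("%s[%s]", value, field)
}

organism <- field_term("Homo", "ORGN")
gene     <- field_term("APP", "GENE")
reviews  <- field_term("Review", "PTYP")

# Combine the pieces with boolean operators; the parentheses make the
# grouping explicit.
term <- sprintf("((%s AND %s) NOT %s)", organism, gene, reviews)
print(term)

# The term can then be passed unchanged to entrez_search; no manual URL
# encoding is needed (not run here):
# res <- entrez_search(db = "pubmed", term = term)
```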

The rentrez tutorial provides some tips on how to make the most of searches to the NCBI. In particular, the sections on uses of the "Filter" field and MeSH terms may help in formulating precise searches.

See also

config for available httr configurations

entrez_db_searchable to get a set of search fields that can be used in term for any database

Examples

if (FALSE) { # \dontrun{
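   query <- "Gastropoda[Organism] AND COI[Gene]"
   # use_history = TRUE stores the results on the NCBI server and returns
   # a web_history object in place of a long list of IDs
   web_env_search <- entrez_search(db = "nuccore", term = query, use_history = TRUE)
   # pass the web_history object on to fetch the first 10 matching sequences
   snail_coi <- entrez_fetch(db = "nuccore", web_history = web_env_search$web_history,
                             rettype = "fasta", retmax = 10)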
} # }
if (FALSE) { # \dontrun{

fly_id <- entrez_search(db="taxonomy", term="Drosophila")
#Oh, right. There is a genus and a subgenus named Drosophila...
#how can we limit this search?
(tax_fields <- entrez_db_searchable("taxonomy"))
#"RANK" looks promising
tax_fields$RANK
entrez_search(db="taxonomy", term="Drosophila AND Genus[RANK]")
} # }