What is searched?

Europe PMC is a repository of life science literature. Europe PMC ingests all PubMed content and extends its index with other literature and patent sources.

For more background on Europe PMC, see:

https://europepmc.org/About

Levchenko, M., Gou, Y., Graef, F., Hamelers, A., Huang, Z., Ide-Smith, M., … McEntyre, J. (2017). Europe PMC in 2017. Nucleic Acids Research, 46(D1), D1254–D1260. https://doi.org/10.1093/nar/gkx1005

How to search Europe PMC with R?

This client supports the Europe PMC search syntax. If you are unfamiliar with searching Europe PMC, try the Europe PMC query builder, a tool that helps you build queries. To use Europe PMC queries in R, copy and paste the search string into the search functions of this package.

In the following, some examples demonstrate how to search Europe PMC with R.

Managing search results

By default, 100 records are returned, but the number of results can be expanded or limited with the limit parameter.

europepmc::epmc_search('"Human malaria parasites"', limit = 10)
#> # A tibble: 10 x 28
#>    id     source pmid  doi   title authorString journalTitle issue journalVolume
#>    <chr>  <chr>  <chr> <chr> <chr> <chr>        <chr>        <chr> <chr>        
#>  1 33789… MED    3378… 10.1… Addi… Kwon H, Sim… mSphere      2     6            
#>  2 32470… MED    3247… 10.1… C-te… Kimata-Arig… J Biochem    4     168          
#>  3 33797… MED    3379… 10.4… Comp… Mat Salleh … Trop Biomed  1     38           
#>  4 PPR27… PPR    <NA>  10.2… Stoc… Tripathi J,… <NA>         <NA>  <NA>         
#>  5 33452… MED    3345… 10.1… In-d… Hamada S, P… Proteomics   6     21           
#>  6 PPR27… PPR    <NA>  10.2… Five… Hulluka TF,… <NA>         <NA>  <NA>         
#>  7 33710… MED    3371… 10.1… Mole… Oyedeji SI,… Acta Parasi… <NA>  <NA>         
#>  8 33693… MED    3369… 10.1… Non-… Antinori S,… J Travel Med <NA>  <NA>         
#>  9 33888… MED    3388… 10.1… A co… Carlton JM.  BMC Biol     1     19           
#> 10 33906… MED    3390… 10.1… The … Lu B, Liu M… mBio         2     12           
#> # … with 19 more variables: pubYear <chr>, journalIssn <chr>, pubType <chr>,
#> #   isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> #   hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> #   hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> #   hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> #   firstPublicationDate <chr>, pageInfo <chr>, pmcid <chr>

Results are sorted by relevance by default. The sort parameter offers two other options:

  • sort = 'cited' sorts by the number of citations, starting with the most cited publication
  • sort = 'date' sorts by publication date, starting with the most recent publication
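
As a brief illustration of the sort parameter, the sketch below requests the ten most cited records for the phrase query used above. It assumes the europepmc package is installed and the Europe PMC web service is reachable; the column names are those shown in the printed output earlier in this section.

```r
# Ten most cited records matching the phrase query (requires network access)
most_cited <- europepmc::epmc_search(
  '"Human malaria parasites"',
  limit = 10,
  sort = "cited"
)

# Inspect the citation counts alongside the record identifiers
most_cited[, c("id", "source", "citedByCount")]
```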

Search by DOIs

Sometimes you would like to check whether articles are indexed in Europe PMC using DOI names, a widely used identifier for scholarly articles. Use epmc_search_by_doi() for this purpose.

my_dois <- c(
  "10.1159/000479962",
  "10.1002/sctm.17-0081",
  "10.1161/strokeaha.117.018077",
  "10.1007/s12017-017-8447-9"
  )
europepmc::epmc_search_by_doi(doi = my_dois)
#> # A tibble: 4 x 28
#>   id     source pmid   doi   title authorString journalTitle issue journalVolume
#>   <chr>  <chr>  <chr>  <chr> <chr> <chr>        <chr>        <chr> <chr>        
#> 1 28957… MED    28957… 10.1… Clin… Schnieder M… Eur Neurol   5-6   78           
#> 2 28941… MED    28941… 10.1… Conc… Doeppner TR… Stem Cells … 11    6            
#> 3 29018… MED    29018… 10.1… One-… Psychogios … Stroke       11    48           
#> 4 28623… MED    28623… 10.1… Defe… Carboni E, … Neuromolecu… 2-3   19           
#> # … with 19 more variables: pubYear <chr>, journalIssn <chr>, pageInfo <chr>,
#> #   pubType <chr>, isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>,
#> #   hasBook <chr>, hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> #   hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> #   hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> #   firstPublicationDate <chr>, pmcid <chr>
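
To see which DOIs were not matched, the requested DOIs can be compared with the doi column of the result. The following is a minimal sketch: dois_not_found() is a hypothetical helper, not part of the package, and it assumes that DOIs missing from Europe PMC simply produce no row in the result.

```r
# Hypothetical helper: DOIs that were requested but not returned.
# DOIs are case-insensitive, so both sides are lower-cased before comparing.
dois_not_found <- function(requested, found) {
  requested[!tolower(requested) %in% tolower(found)]
}

# Offline illustration with made-up "found" values
not_found <- dois_not_found(
  requested = c("10.1159/000479962", "10.1002/sctm.17-0081"),
  found = c("10.1159/000479962", NA)
)
not_found
```

With a live result, the call would look like dois_not_found(my_dois, europepmc::epmc_search_by_doi(doi = my_dois)$doi).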

Output options

By default, a non-nested data frame, printed as a tibble, is returned. Other formats are output = "id_list", which returns a list of IDs and sources, and output = "raw", which returns the full metadata as a list. Please be aware that these lists can become very large.
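
The two alternative formats could be requested as sketched below. A small limit keeps the returned objects manageable; the query term is only an example, and the calls need network access.

```r
# IDs and sources only (requires network access)
ids <- europepmc::epmc_search("malaria", output = "id_list", limit = 10)

# Full metadata as a list; inspect only the top level of the first element
raw_records <- europepmc::epmc_search("malaria", output = "raw", limit = 10)
str(raw_records[[1]], max.level = 1)
```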

More advanced options to search Europe PMC

Annotations

Europe PMC provides text-mined annotations contained in abstracts and open access full-text articles.

These automatically identified concepts and terms can be retrieved at the article level:

europepmc::epmc_annotations_by_id(c("MED:28585529", "PMC:PMC1664601"))
#> # A tibble: 774 x 13
#>    source ext_id  pmcid  prefix  exact  postfix name  uri   id     type  section
#>    <chr>  <chr>   <chr>  <chr>   <chr>  <chr>   <chr> <chr> <chr>  <chr> <chr>  
#>  1 MED    285855… PMC54… "tive … Beta … " allo… Beta… http… http:… Clin… Title …
#>  2 MED    285855… PMC54… "nomic… genes  ".\nRa… gene  http… http:… Sequ… Title …
#>  3 MED    285855… PMC54… "nomic… genes  " is o… gene  http… http:… Sequ… Abstra…
#>  4 MED    285855… PMC54… " One … genes  " are … gene  http… http:… Sequ… Abstra…
#>  5 MED    285855… PMC54… " iden… beet   " (Bet… Beta… http… http:… Clin… Abstra…
#>  6 MED    285855… PMC54… "ify t… Beta … " ssp.… Beta… http… http:… Clin… Abstra…
#>  7 MED    285855… PMC54… "ulgar… gene   " Rz2 … gene  http… http:… Sequ… Abstra…
#>  8 MED    285855… PMC54… "e gen… genome " sequ… geno… http… http:… Sequ… Abstra…
#>  9 MED    285855… PMC54… "equen… beet   ". Our… Beta… http… http:… Clin… Abstra…
#> 10 MED    285855… PMC54… "disco… genes  " rele… gene  http… http:… Sequ… Abstra…
#> # … with 764 more rows, and 2 more variables: provider <chr>, subType <chr>

To obtain a list of articles for which Europe PMC has text-mined annotations, either subset the resulting data.frame

tt <- europepmc::epmc_search("malaria")
tt[tt$hasTextMinedTerms == "Y" | tt$hasTMAccessionNumbers == "Y",]
#> # A tibble: 97 x 29
#>    id     source pmid  doi   title authorString journalTitle issue journalVolume
#>    <chr>  <chr>  <chr> <chr> <chr> <chr>        <chr>        <chr> <chr>        
#>  1 33530… MED    3353… 10.1… Disc… Hoarau M, V… J Enzyme In… 1     36           
#>  2 33594… MED    3359… 10.1… Mana… Kambale-Kom… Hematology   1     26           
#>  3 33372… MED    3337… 10.1… ATP2… Lamy A, Mac… Emerg Micro… 1     10           
#>  4 33535… MED    3353… 10.3… THE … Damiani E, … Acta Med Hi… 2     18           
#>  5 33095… MED    3309… 10.1… Hydr… Bansal P, G… Ann Med      1     53           
#>  6 33685… MED    3368… 10.1… Eval… Kodama Y, T… Drug Deliv   1     28           
#>  7 33509… MED    3350… 10.1… HIV/… Demartoto A… SAHARA J     1     18           
#>  8 33715… MED    3371… 10.1… Acti… Angeli A, U… J Enzyme In… 1     36           
#>  9 33053… MED    3305… 10.1… Stra… Vieira-Neta… Braz J Biol  4     81           
#> 10 33666… MED    3366… 10.1… Trai… Zhou J, Lv … Emerg Micro… 1     10           
#> # … with 87 more rows, and 20 more variables: pubYear <chr>, journalIssn <chr>,
#> #   pageInfo <chr>, pubType <chr>, isOpenAccess <chr>, inEPMC <chr>,
#> #   inPMC <chr>, hasPDF <chr>, hasBook <chr>, hasSuppl <chr>,
#> #   citedByCount <int>, hasReferences <chr>, hasTextMinedTerms <chr>,
#> #   hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> #   hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> #   firstPublicationDate <chr>, pmcid <chr>, versionNumber <int>

or expand the query choosing an annotation type or provider from the Europe PMC Advanced Search query builder.

europepmc::epmc_search('malaria AND (ANNOTATION_TYPE:"Cell") AND (ANNOTATION_PROVIDER:"Europe PMC")')
#> # A tibble: 100 x 28
#>    id     source pmid   pmcid doi    title    authorString    journalTitle issue
#>    <chr>  <chr>  <chr>  <chr> <chr>  <chr>    <chr>           <chr>        <chr>
#>  1 31782… MED    31782… PMC7… 10.10… Increas… Jongo SA, Chur… Clin Infect… 11   
#>  2 31808… MED    31808… PMC7… 10.10… Retinop… Villaverde C, … J Pediatric… 5    
#>  3 30989… MED    30989… PMC7… 10.10… Clinica… Enane LA, Sull… J Pediatric… 3    
#>  4 31300… MED    31300… PMC7… 10.10… Blackwa… Opoka RO, Wais… Clin Infect… 11   
#>  5 31807… MED    31807… <NA>  10.10… Malaria… Marcombe S, Th… J Med Entom… 3    
#>  6 31505… MED    31505… <NA>  10.10… Acute K… Oshomah-Bello … J Trop Pedi… 2    
#>  7 31693… MED    31693… PMC7… 10.10… Reduced… Kingston HWF, … J Infect Dis 9    
#>  8 31679… MED    31679… <NA>  10.10… A Syste… Thiengsusuk A,… Eur J Drug … 2    
#>  9 31687… MED    31687… <NA>  10.10… Evaluat… Ferdinand DY, … Trans R Soc… 3    
#> 10 30852… MED    30852… <NA>  10.10… An Expe… Woodford J, Co… J Infect Dis 6    
#> # … with 90 more rows, and 19 more variables: journalVolume <chr>,
#> #   pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> #   isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> #   hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> #   hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> #   hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> #   firstPublicationDate <chr>
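
One caveat when subsetting on these flags: comparing with == returns NA where a flag is missing, and indexing a data frame with NA yields rows filled with NA. The sketch below avoids this with %in%. keep_annotated() is a hypothetical helper, and the toy data frame only mimics the two flag columns of a real search result.

```r
# Hypothetical helper: keep records flagged as having text-mined annotations.
# %in% returns FALSE (not NA) for missing flags, so no all-NA rows appear.
keep_annotated <- function(records) {
  flagged <- records$hasTextMinedTerms %in% "Y" |
    records$hasTMAccessionNumbers %in% "Y"
  records[flagged, , drop = FALSE]
}

# Toy data standing in for a search result
toy <- data.frame(
  id = c("a", "b", "c"),
  hasTextMinedTerms = c("Y", "N", NA),
  hasTMAccessionNumbers = c("N", "N", NA),
  stringsAsFactors = FALSE
)
keep_annotated(toy)
```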

Data integrations

Another useful feature of Europe PMC is the ability to search for cross-references between Europe PMC and other databases. For instance, to get publications that are cited by entries in the Protein Data Bank in Europe and were published in 2016:

europepmc::epmc_search('(HAS_PDB:y) AND FIRST_PDATE:2016')
#> # A tibble: 100 x 28
#>    id     source pmid   pmcid doi    title    authorString    journalTitle issue
#>    <chr>  <chr>  <chr>  <chr> <chr>  <chr>    <chr>           <chr>        <chr>
#>  1 28039… MED    28039… PMC5… 10.10… Structu… Su HP, Rickert… Proc Natl A… 3    
#>  2 28036… MED    28036… PMC5… 10.13… Structu… Kovaľ T, Øster… PLoS One     12   
#>  3 27977… MED    27977… <NA>  10.10… Compara… De Deurwaerdèr… ACS Chem Ne… 5    
#>  4 28144… MED    28144… PMC5… 10.37… Biochem… Ulrich V, Brie… Beilstein J… <NA> 
#>  5 28028… MED    28028… <NA>  10.10… Structu… Zhou Z, Liu Y,… Appl Microb… 7    
#>  6 27958… MED    27958… <NA>  10.10… Glycans… Hamark C, Bern… J Am Chem S… 1    
#>  7 27959… MED    27959… PMC6… 10.10… Structu… Reed AJ, Vyas … J Am Chem S… 1    
#>  8 28083… MED    28083… PMC5… 10.33… Conform… Paoletti F, de… Front Mol B… <NA> 
#>  9 28024… MED    28024… <NA>  10.10… Solutio… Bibow S, Polyh… Nat Struct … 2    
#> 10 28031… MED    28031… PMC5… 10.10… Structu… Sevrioukova IF… Proc Natl A… 3    
#> # … with 90 more rows, and 19 more variables: journalVolume <chr>,
#> #   pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> #   isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> #   hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> #   hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> #   hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> #   firstPublicationDate <chr>

The following sources are supported

To retrieve metadata about these external database links, use europepmc::epmc_db().
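
A call might look like the sketch below. It assumes that epmc_db() takes a publication ID followed by db and limit arguments, and that "uniprot" is an accepted value for db; the PMID is only an illustrative choice. Network access is required.

```r
# Cross-references to UniProt for one PubMed record (requires network access).
# The PMID and the database name are illustrative assumptions.
uniprot_links <- europepmc::epmc_db("12368864", db = "uniprot", limit = 150)
head(uniprot_links)
```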

Citations and reference sections

Europe PMC also lets us obtain citation metadata and reference sections. To retrieve citation metadata for an article, use

europepmc::epmc_citations("9338777", limit = 500)
#> # A tibble: 233 x 11
#>    id     source citationType title authorString journalAbbrevia… pubYear volume
#>    <chr>  <chr>  <chr>        <chr> <chr>        <chr>              <int> <chr> 
#>  1 33353… MED    review-arti… Xeno… Galow AM, G… Int J Mol Sci       2020 21    
#>  2 31565… MED    research-ar… Regu… Chung HC, N… J Vet Sci           2019 20    
#>  3 30230… MED    research su… Bioe… Legallais C… Adv Healthc Mat…    2018 7     
#>  4 30264… MED    research su… Porc… Fiebig U, F… Xenotransplanta…    2018 25    
#>  5 29756… MED    historical … Infe… Weiss RA.    Xenotransplanta…    2018 25    
#>  6 29642… MED    research su… Trac… Kawasaki J,… Viruses             2018 10    
#>  7 28768… MED    research su… Pres… Kawasaki J,… J Virol             2017 91    
#>  8 28437… MED    research su… Thre… Colon-Moran… Virology            2017 507   
#>  9 28054… MED    research su… Anti… Inoue Y, Yo… Ann Biomed Eng      2017 45    
#> 10 27832… MED    research-ar… Tran… Kim N, Choi… PLoS One            2016 11    
#> # … with 223 more rows, and 3 more variables: issue <chr>, citedByCount <int>,
#> #   pageInfo <chr>
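
Because the result is an ordinary data frame, it can be summarised with base R. As a small sketch, the hypothetical helper below counts citing publications per year from the pubYear column; the toy data frame stands in for a real epmc_citations() result.

```r
# Hypothetical helper: count citing publications per year.
citations_per_year <- function(citations) {
  table(citations$pubYear)
}

# Toy data standing in for the result of epmc_citations()
toy_citations <- data.frame(pubYear = c(2018L, 2018L, 2017L, 2016L))
counts <- citations_per_year(toy_citations)
counts
```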

To retrieve the reference section of an article:

europepmc::epmc_refs("28632490", limit = 200)
#> # A tibble: 169 x 19
#>    id     source citationType  title authorString journalAbbrevia… issue pubYear
#>    <chr>  <chr>  <chr>         <chr> <chr>        <chr>            <chr>   <int>
#>  1 12002… MED    JOURNAL ARTI… Tric… Adolfsson-E… Chemosphere      9-10     2002
#>  2 18795… MED    JOURNAL ARTI… In v… Ahn KC, Zha… Environ Health … 9        2008
#>  3 18556… MED    JOURNAL ARTI… Effe… Aiello AE, … Am J Public Hea… 8        2008
#>  4 17683… MED    JOURNAL ARTI… Cons… Aiello AE, … Clin Infect Dis  <NA>     2007
#>  5 15273… MED    JOURNAL ARTI… Rela… Aiello AE, … Antimicrob Agen… 8        2004
#>  6 18207… MED    JOURNAL ARTI… The … Allmyr M, H… Sci Total Envir… 1        2008
#>  7 17007… MED    JOURNAL ARTI… Tric… Allmyr M, A… Sci Total Envir… 1        2006
#>  8 26948… MED    JOURNAL ARTI… Pres… Alvarez-Riv… J Chromatogr A   <NA>     2016
#>  9 23192… MED    JOURNAL ARTI… Expo… Anderson SE… Toxicol Sci      1        2012
#> 10 25837… MED    JOURNAL ARTI… Obse… Vladar EK, … Methods Cell Bi… <NA>     2015
#> # … with 159 more rows, and 11 more variables: volume <chr>, pageInfo <chr>,
#> #   citedOrder <int>, match <chr>, essn <chr>, issn <chr>,
#> #   publicationTitle <chr>, publisherLoc <chr>, publisherName <chr>,
#> #   externalLink <chr>, doi <chr>

Fulltext access

Europe PMC gives access not only to metadata, but also to full texts. Adding AND (OPEN_ACCESS:y) to your search query returns only those articles for which Europe PMC also holds the full text.
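
A small sketch of how this filter could be appended to any query string. open_access_only() is a hypothetical helper; it wraps the original query in parentheses so that its own AND/OR logic is preserved.

```r
# Hypothetical helper: restrict any query to open access records.
open_access_only <- function(query) {
  paste0("(", query, ") AND (OPEN_ACCESS:y)")
}

oa_query <- open_access_only("malaria")
oa_query
# The string can then be passed on, e.g. europepmc::epmc_search(oa_query)
```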

The full text can be accessed as an XML document via the PMID or the PubMed Central ID (PMCID):

europepmc::epmc_ftxt("PMC3257301")
#> {xml_document}
#> <article article-type="research-article" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML">
#> [1] <front>\n  <journal-meta>\n    <journal-id journal-id-type="nlm-ta">PLoS  ...
#> [2] <body>\n  <sec id="s1">\n    <title>Introduction</title>\n    <p>Atmosphe ...
#> [3] <back>\n  <ack>\n    <p>We would like to thank Dr. C. Gourlay and Dr. T.  ...
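
The returned object is an xml2 document, so it can be queried with XPath. The sketch below extracts the titles of the top-level sections. section_titles() is a hypothetical helper, and the toy document only imitates the front/body/back layout shown above so that the example runs offline.

```r
library(xml2)

# Hypothetical helper: titles of the top-level sections in the article body.
section_titles <- function(doc) {
  xml_text(xml_find_all(doc, "/article/body/sec/title"))
}

# Small stand-in document with the same front/body/back layout
toy_doc <- read_xml(
  "<article><front/><body>
     <sec id='s1'><title>Introduction</title><p>Text</p></sec>
     <sec id='s2'><title>Methods</title><p>Text</p></sec>
   </body><back/></article>"
)
section_titles(toy_doc)
```

For a real article, the document returned by europepmc::epmc_ftxt("PMC3257301") could be passed to the same helper.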