Split figure, table and supplementary material captions into sentences

Usage

pmc_caption(doc)

Arguments

doc

An xml_document from PubMed Central, for example one returned by pmc_xml() or xml2::read_xml()

Value

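A tibble with one row per caption sentence and four columns: tag (figure, table or supplement), label, sentence (the sentence number within the caption) and text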

Author

Chris Stubben

Examples

# doc <- pmc_xml("PMC2231364") # OR
doc <- xml2::read_xml(system.file("extdata/PMC2231364.xml",
  package = "tidypmc"
))
x <- pmc_caption(doc)
#> Found 5 figures
#> Found 4 tables
#> Found 3 supplements
x
#> # A tibble: 30 × 4
#>    tag    label    sentence text                                                
#>    <chr>  <chr>       <int> <chr>                                               
#>  1 figure Figure 1        1 "Environmental modulation of expression of virulenc…
#>  2 figure Figure 1        2 "Shown in the squares are the putative stages of tr…
#>  3 figure Figure 1        3 "The TreeView charts show the transcriptional chang…
#>  4 figure Figure 1        4 "Color intensities denote log2 ratios as follows: g…
#>  5 figure Figure 2        1 "RT-PCR analysis of potential operons."             
#>  6 figure Figure 2        2 "Shown is the electrophoresis image of an RT-PCR pr…
#>  7 figure Figure 2        3 "Total RNA was used to synthesize cDNA in the prese…
#>  8 figure Figure 2        4 "Genomic DNA was used as a template, and is indicat…
#>  9 figure Figure 2        5 "\"Marker\" represents a DNA size marker (900, 700,…
#> 10 figure Figure 3        1 "Schematic representation of the clustered microarr…
#> # ℹ 20 more rows
dplyr::filter(x, sentence == 1)
#> # A tibble: 12 × 4
#>    tag        label                       sentence text                         
#>    <chr>      <chr>                          <int> <chr>                        
#>  1 figure     Figure 1                           1 Environmental modulation of …
#>  2 figure     Figure 2                           1 RT-PCR analysis of potential…
#>  3 figure     Figure 3                           1 Schematic representation of …
#>  4 figure     Figure 4                           1 Graphical representation of …
#>  5 figure     Figure 5                           1 EMSA analysis of the binding…
#>  6 table      Table 1                            1 Stress-responsive operons in…
#>  7 table      Table 2                            1 Classification of the gene m…
#>  8 table      Table 3                            1 Motif discovery for the clus…
#>  9 table      Table 4                            1 Designs for expression profi…
#> 10 supplement Additional file 1 Figure S1        1 Growth curves of Y. pestis s…
#> 11 supplement Additional file 2 Table S1         1 All the transcriptional chan…
#> 12 supplement Additional file 3 Table S2         1 List of oligonucleotide prim…
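
Because the result has one row per sentence, it can be useful to collapse the sentences back into one row per caption. The following is a minimal sketch, not part of the package's own examples. It assumes dplyr is installed and that the columns are tag, label, sentence and text as shown above; the names captions, n_sentences and caption are illustrative only.

```r
library(dplyr)

# Load the example document bundled with tidypmc (same file as above)
doc <- xml2::read_xml(system.file("extdata/PMC2231364.xml",
  package = "tidypmc"
))
x <- tidypmc::pmc_caption(doc)

# Rebuild one row per caption: order the sentences within each label,
# then paste them together and count them.
captions <- x %>%
  arrange(tag, label, sentence) %>%
  group_by(tag, label) %>%
  summarise(
    n_sentences = n(),                      # sentences in this caption
    caption = paste(text, collapse = " "),  # full caption text
    .groups = "drop"
  )
captions
```

Grouping by both tag and label keeps figure, table and supplement captions separate even if two of them happened to share a label.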