Split figure, table and supplementary material captions into sentences
Examples
# doc <- pmc_xml("PMC2231364") # OR
doc <- xml2::read_xml(system.file("extdata/PMC2231364.xml",
  package = "tidypmc"
))
x <- pmc_caption(doc)
#> Found 5 figures
#> Found 4 tables
#> Found 3 supplements
x
#> # A tibble: 30 × 4
#>    tag    label    sentence text
#>    <chr>  <chr>       <int> <chr>
#>  1 figure Figure 1        1 "Environmental modulation of expression of virulenc…
#>  2 figure Figure 1        2 "Shown in the squares are the putative stages of tr…
#>  3 figure Figure 1        3 "The TreeView charts show the transcriptional chang…
#>  4 figure Figure 1        4 "Color intensities denote log2 ratios as follows: g…
#>  5 figure Figure 2        1 "RT-PCR analysis of potential operons."
#>  6 figure Figure 2        2 "Shown is the electrophoresis image of an RT-PCR pr…
#>  7 figure Figure 2        3 "Total RNA was used to synthesize cDNA in the prese…
#>  8 figure Figure 2        4 "Genomic DNA was used as a template, and is indicat…
#>  9 figure Figure 2        5 "\"Marker\" represents a DNA size marker (900, 700,…
#> 10 figure Figure 3        1 "Schematic representation of the clustered microarr…
#> # ℹ 20 more rows
dplyr::filter(x, sentence == 1)
#> # A tibble: 12 × 4
#>    tag        label                       sentence text
#>    <chr>      <chr>                          <int> <chr>
#>  1 figure     Figure 1                           1 Environmental modulation of …
#>  2 figure     Figure 2                           1 RT-PCR analysis of potential…
#>  3 figure     Figure 3                           1 Schematic representation of …
#>  4 figure     Figure 4                           1 Graphical representation of …
#>  5 figure     Figure 5                           1 EMSA analysis of the binding…
#>  6 table      Table 1                            1 Stress-responsive operons in…
#>  7 table      Table 2                            1 Classification of the gene m…
#>  8 table      Table 3                            1 Motif discovery for the clus…
#>  9 table      Table 4                            1 Designs for expression profi…
#> 10 supplement Additional file 1 Figure S1        1 Growth curves of Y. pestis s…
#> 11 supplement Additional file 2 Table S1         1 All the transcriptional chan…
#> 12 supplement Additional file 3 Table S2         1 List of oligonucleotide prim…
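
Because each row holds one sentence, a full caption can be rebuilt by pasting the sentences for each label back together. The sketch below is an illustration only and is not part of the package examples: `x_demo` is a hypothetical stand-in with the same four columns as the `pmc_caption()` output, and its text values are placeholders rather than captions from PMC2231364. It assumes dplyr and tibble are installed.

```r
library(dplyr)

# Hypothetical stand-in for the pmc_caption() result (placeholder text only).
x_demo <- tibble::tibble(
  tag = c("figure", "figure", "table"),
  label = c("Figure 1", "Figure 1", "Table 1"),
  sentence = c(1L, 2L, 1L),
  text = c("First sentence.", "Second sentence.", "Only sentence.")
)

# Keep sentences in order within each caption, then collapse them
# into a single string per tag and label.
captions <- x_demo %>%
  arrange(tag, label, sentence) %>%
  group_by(tag, label) %>%
  summarise(caption = paste(text, collapse = " "), .groups = "drop")

captions
```

The same pipeline should apply to `x` from the example above, since it relies only on the `tag`, `label`, `sentence` and `text` columns.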