jst_get_chapters() extracts meta-data from JSTOR-XML files for book chapters.

Usage

jst_get_chapters(file_path, authors = FALSE)

Arguments

file_path

The path to a .xml-file for a book or research report.

authors

Whether to extract the authors. Defaults to FALSE, because extracting authors is expensive: it makes the function roughly 3 times slower, depending on the number of chapters and the number of authors. Use authors = TRUE to import the authors too.

Value

A tibble containing the extracted meta-data with the following columns:

  • book_id (chr): The book id of type "jstor", which is not a registered DOI.

  • file_name (chr): The filename of the original .xml-file. Can be used for joining with other data for the same file.

  • part_id (chr): The id of the part.

  • part_label (chr): A label for the part, if specified.

  • part_title (chr): The title of the part.

  • part_subtitle (chr): The subtitle of the part, if specified.

  • authors (list): A list-column with information on the authors. Can be unnested with tidyr::unnest(). See the examples and jst_get_authors().

  • abstract (chr): The abstract of the part.

  • part_first_page (chr): The page where the part begins.
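
As a minimal sketch of the join mentioned for file_name, the chapter-level table can be combined with the book-level table returned by jst_get_book(). It assumes both tables carry a file_name column derived from the same file; any other column present in both tables (probably book_id) receives the default .x/.y suffixes from merge().

```r
library(jstor)

# Path to the example book shipped with the package
path <- jst_example("book.xml")

# One row per part (chapter), and the book-level metadata for the same file
parts <- jst_get_chapters(path)
book <- jst_get_book(path)

# Attach the book-level metadata to every part via the shared file_name column.
# Columns that exist in both tables are disambiguated with .x / .y suffixes.
joined <- merge(parts, book, by = "file_name")

# Each part should still appear exactly once
nrow(joined) == nrow(parts)
```

Base merge() is used here only to avoid extra dependencies; dplyr::left_join() with by = "file_name" would serve the same purpose.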

Details

Currently, jst_get_chapters() is considerably slower than most of the other functions. It is roughly 10 times slower than jst_get_book(), depending on the number of chapters to extract.
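
The relative cost can be checked on your own files with a rough timing comparison. The sketch below uses base system.time() on the bundled example file; it is a single-run measurement, so the ratios are only indicative and will vary with the machine and the file.

```r
library(jstor)

path <- jst_example("book.xml")

# Elapsed wall-clock seconds for a single run of each call
t_book <- system.time(jst_get_book(path))[["elapsed"]]
t_chapters <- system.time(jst_get_chapters(path))[["elapsed"]]
t_authors <- system.time(jst_get_chapters(path, authors = TRUE))[["elapsed"]]

# Collect the timings in a named vector for easy comparison
timings <- c(
  book = t_book,
  chapters = t_chapters,
  chapters_with_authors = t_authors
)
timings
```

For more stable numbers, repeat each call several times and compare medians, since single runs on a small example file can be dominated by noise.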

Examples

# extract parts without authors
jst_get_chapters(jst_example("book.xml"))
#> # A tibble: 36 × 9
#>    book_id     file_name part_id     part_label part_title part_subtitle authors
#>    <chr>       <chr>     <chr>       <chr>      <chr>      <chr>         <chr>  
#>  1 j.ctt24hdz7 book      j.ctt24hdz… NA         Front Mat… NA            NA     
#>  2 j.ctt24hdz7 book      j.ctt24hdz… NA         Table of … NA            NA     
#>  3 j.ctt24hdz7 book      j.ctt24hdz… NA         Acronyms … NA            NA     
#>  4 j.ctt24hdz7 book      j.ctt24hdz… NA         Authors’ … NA            NA     
#>  5 j.ctt24hdz7 book      j.ctt24hdz… 1.         The enigm… NA            NA     
#>  6 j.ctt24hdz7 book      j.ctt24hdz… 2.         ‘Anxiety,… Fiji’s road … NA     
#>  7 j.ctt24hdz7 book      j.ctt24hdz… 3.         Fiji’s De… Who, what, w… NA     
#>  8 j.ctt24hdz7 book      j.ctt24hdz… 4.         ‘This pro… The aftermat… NA     
#>  9 j.ctt24hdz7 book      j.ctt24hdz… 5.         The chang… NA            NA     
#> 10 j.ctt24hdz7 book      j.ctt24hdz… 6.         The Fiji … Analyzing th… NA     
#> # ℹ 26 more rows
#> # ℹ 2 more variables: abstract <chr>, part_first_page <chr>

# import authors too
parts <- jst_get_chapters(jst_example("book.xml"), authors = TRUE)
parts
#> # A tibble: 36 × 9
#>    book_id     file_name part_id    part_label part_title part_subtitle authors 
#>    <chr>       <chr>     <chr>      <chr>      <chr>      <chr>         <list>  
#>  1 j.ctt24hdz7 book      j.ctt24hd… NA         Front Mat… NA            <tibble>
#>  2 j.ctt24hdz7 book      j.ctt24hd… NA         Table of … NA            <tibble>
#>  3 j.ctt24hdz7 book      j.ctt24hd… NA         Acronyms … NA            <tibble>
#>  4 j.ctt24hdz7 book      j.ctt24hd… NA         Authors’ … NA            <tibble>
#>  5 j.ctt24hdz7 book      j.ctt24hd… 1.         The enigm… NA            <tibble>
#>  6 j.ctt24hdz7 book      j.ctt24hd… 2.         ‘Anxiety,… Fiji’s road … <tibble>
#>  7 j.ctt24hdz7 book      j.ctt24hd… 3.         Fiji’s De… Who, what, w… <tibble>
#>  8 j.ctt24hdz7 book      j.ctt24hd… 4.         ‘This pro… The aftermat… <tibble>
#>  9 j.ctt24hdz7 book      j.ctt24hd… 5.         The chang… NA            <tibble>
#> 10 j.ctt24hdz7 book      j.ctt24hd… 6.         The Fiji … Analyzing th… <tibble>
#> # ℹ 26 more rows
#> # ℹ 2 more variables: abstract <chr>, part_first_page <chr>

# unnest the authors list-column; `cols` is required by current tidyr
tidyr::unnest(parts, cols = authors)
#> # A tibble: 39 × 14
#>    book_id     file_name part_id      part_label part_title part_subtitle prefix
#>    <chr>       <chr>     <chr>        <chr>      <chr>      <chr>         <chr> 
#>  1 j.ctt24hdz7 book      j.ctt24hdz7… NA         Front Mat… NA            NA    
#>  2 j.ctt24hdz7 book      j.ctt24hdz7… NA         Table of … NA            NA    
#>  3 j.ctt24hdz7 book      j.ctt24hdz7… NA         Acronyms … NA            NA    
#>  4 j.ctt24hdz7 book      j.ctt24hdz7… NA         Authors’ … NA            NA    
#>  5 j.ctt24hdz7 book      j.ctt24hdz7… 1.         The enigm… NA            NA    
#>  6 j.ctt24hdz7 book      j.ctt24hdz7… 1.         The enigm… NA            NA    
#>  7 j.ctt24hdz7 book      j.ctt24hdz7… 2.         ‘Anxiety,… Fiji’s road … NA    
#>  8 j.ctt24hdz7 book      j.ctt24hdz7… 3.         Fiji’s De… Who, what, w… NA    
#>  9 j.ctt24hdz7 book      j.ctt24hdz7… 4.         ‘This pro… The aftermat… NA    
#> 10 j.ctt24hdz7 book      j.ctt24hdz7… 5.         The chang… NA            NA    
#> # ℹ 29 more rows
#> # ℹ 7 more variables: given_name <chr>, surname <chr>, string_name <chr>,
#> #   suffix <chr>, author_number <dbl>, abstract <chr>, part_first_page <chr>