jst_get_article() extracts meta-data from JSTOR-XML files for journal articles.
Value
A tibble containing the extracted meta-data with the following columns:
file_name (chr): The file_name of the original .xml-file. Can be used for joining with other parts (authors, references, footnotes, full-texts).
journal_doi (chr): A registered identifier for the journal.
journal_jcode (chr): An identifier for the journal, such as "amerjsoci" for the "American Journal of Sociology".
journal_pub_id (chr): Similar to journal_jcode. Most of the time either one is present.
journal_title (chr): The title of the journal.
article_doi (chr): A registered unique identifier for the article.
article_jcode (chr): A unique identifier for the article (not a DOI).
article_pub_id (chr): Infrequent, either part of the DOI or the article_jcode.
article_type (chr): The type of article (research-article, book-review, etc.).
article_title (chr): The title of the article.
volume (chr): The volume the article was published in.
issue (chr): The issue the article was published in.
language (chr): The language of the article.
pub_day (chr): Publication day, if specified.
pub_month (chr): Publication month, if specified.
pub_year (int): Year of publication.
first_page (chr): Page number for the first page of the article.
last_page (chr): Page number for the last page of the article.
page_range (chr): The range of pages for the article.
A note about publication dates: if an article has more than one date, only the first entry is extracted, which should correspond to the oldest date.
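Because pub_day and pub_month are character columns that may be missing while pub_year is an integer, building a proper date takes a small amount of glue. The following is a minimal base-R sketch, not part of the package: the data frame `meta` is hypothetical stand-in data, and falling back to the first month or first day when a value is missing is an assumption made here for illustration. It also assumes that month and day are stored as numeric strings.

```r
# Hypothetical rows mimicking the date columns described above
meta <- data.frame(
  pub_year  = c(1908L, 1999L),
  pub_month = c("12", NA),
  pub_day   = c("1", NA),
  stringsAsFactors = FALSE
)

# Assumption: treat a missing month or day as "1"
month <- ifelse(is.na(meta$pub_month), "1", meta$pub_month)
day   <- ifelse(is.na(meta$pub_day), "1", meta$pub_day)

# Zero-pad the parts and convert to Date
meta$pub_date <- as.Date(
  sprintf("%04d-%02d-%02d", meta$pub_year, as.integer(month), as.integer(day))
)
```

Whether a fallback of "1" is appropriate depends on the analysis; keeping the partial date as NA may be the safer choice when exact dates matter.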
Examples
jst_get_article(jst_example("article_with_references.xml"))
#> # A tibble: 1 × 19
#> file_name journal_doi journal_jcode journal_pub_id journal_title article_doi
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 article_wi… NA tranamermicr… NA Transactions… 10.2307/32…
#> # ℹ 13 more variables: article_pub_id <chr>, article_jcode <chr>,
#> # article_type <chr>, article_title <chr>, volume <chr>, issue <chr>,
#> # language <chr>, pub_day <chr>, pub_month <chr>, pub_year <int>,
#> # first_page <chr>, last_page <chr>, page_range <chr>
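The file_name column is described above as the key for joining with other parts of a file. A minimal sketch of such a join follows, assuming the jstor and dplyr packages are installed; it uses jst_get_authors() on the same example file, and the object names `article`, `authors` and `article_with_authors` are chosen here for illustration only.

```r
library(jstor)
library(dplyr)

# Path to the example file shipped with the package
path <- jst_example("article_with_references.xml")

# Extract article meta-data and author information separately
article <- jst_get_article(path)
authors <- jst_get_authors(path)

# Join on file_name: one row per author, each carrying the article meta-data
article_with_authors <- left_join(article, authors, by = "file_name")
```

A left join keeps the article row even when no author information is found for the file.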