Assign the primary language of a semantically rich dataset object using an
ISO 639 language code or full language name. This sets the language
attribute in the dataset's metadata.
Usage
language(x)
language(x, iso_639_code = "639-3") <- value
Arguments
- x
A dataset object created by dataset_df() or as_dataset_df().
- iso_639_code
A character string indicating the desired return format: either "639-3" (default; terminologic) or "639-1" (2-letter code).
- value
A 2-letter or 3-letter language code (ISO 639-1 or ISO 639-2), or a full language name (case-insensitive).
Value
The dataset with an updated language attribute, typically an ISO
639-2/T code (Alpha_3_T) such as "fra", "eng", "spa", etc.
Details
This function supports recognition of:
- 2-letter codes (ISO 639-1, e.g., "en", "fr")
- 3-letter codes from both:
  - Alpha_3_B (bibliographic, e.g., "fre")
  - Alpha_3_T (terminologic, e.g., "fra")
- Full language names (e.g., "English", "French")
For compatibility with open science repositories and modern metadata
standards, this function returns the terminologic code (Alpha_3_T)
when available. If Alpha_3_T is missing for a language, the legacy
bibliographic code (Alpha_3_B) is used as a fallback.
Full language names (e.g., "English", "Spanish") are matched
case-insensitively against the ISO 639-2 Name field. Exact matches are
attempted first; if none are found, a prefix match is used. For example:
- "English" returns "eng"
- "English, Old" returns "ang"
This means that:
- Both "fra" (terminologic) and "fre" (bibliographic) will be accepted as valid input for French
- The resulting value stored and returned will be "fra"
This behaviour aligns with common repository practices (Zenodo, OSF, Figshare).
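The preference for the terminologic code, with the bibliographic code as a fallback, can be sketched in base R as follows. This is a minimal illustration and not the package's actual implementation: toy_codes is a hand-typed table that only borrows the column names of ISOcodes::ISO_639_2, normalise_code() is a hypothetical helper, and the NA values in the last row are placed there only to exercise the fallback.

```r
# Toy lookup table (assumption): a hand-typed excerpt, not the real ISO 639-2 data.
# The NA entries in the last row are artificial and only demonstrate the fallback.
toy_codes <- data.frame(
  Alpha_3_B = c("fre", "eng", "spa", "afa"),
  Alpha_3_T = c("fra", "eng", "spa", NA),
  Alpha_2   = c("fr",  "en",  "es",  NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map any accepted code to the code that would be stored.
normalise_code <- function(code, table = toy_codes) {
  code <- tolower(code)
  # which() drops NA comparisons, so rows with missing codes are simply skipped.
  hit <- which(table$Alpha_2 == code |
                 table$Alpha_3_B == code |
                 table$Alpha_3_T == code)
  if (length(hit) == 0) stop("Unknown language code: ", code)
  row <- table[hit[1], ]
  # Prefer the terminologic code; fall back to the bibliographic one.
  if (!is.na(row$Alpha_3_T)) row$Alpha_3_T else row$Alpha_3_B
}

normalise_code("fre")  # "fra": bibliographic input, terminologic output
normalise_code("fra")  # "fra"
normalise_code("fr")   # "fra": 2-letter input
normalise_code("afa")  # "afa": no Alpha_3_T in the toy table, so Alpha_3_B is used
```

Keeping the lookup in a single table means that both input variants resolve to the same row, which is why "fre" and "fra" end up as the same stored value.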
If value is NULL, the language is marked as ":unas" (unspecified).
In some cases, especially for historical or moribund languages, multiple
similar names may exist. In such cases, it is safer to use a specific
language code (e.g., "ang" instead of "English, Old", or "enm" instead of
"English, Middle (1100-1500)"). You can also refer directly to the
definitions in ISOcodes::ISO_639_2 for clarity.
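The exact-then-prefix matching of full language names described above, and the ambiguity it can run into, can be sketched in base R as follows. This is an illustrative sketch under assumptions, not the package's code: toy_names is a hand-typed excerpt rather than the real ISOcodes::ISO_639_2 table, match_language_name() is a hypothetical helper, and the choice to warn and take the first hit when several names share a prefix is an assumption made for the example.

```r
# Toy name table (assumption): a hand-typed excerpt, not the full ISO 639-2 list.
toy_names <- data.frame(
  Alpha_3 = c("eng", "ang", "enm", "fra"),
  Name = c("English",
           "English, Old (ca.450-1100)",
           "English, Middle (1100-1500)",
           "French"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: resolve a language name to a 3-letter code.
match_language_name <- function(name, table = toy_names) {
  needle <- tolower(name)
  haystack <- tolower(table$Name)
  # Step 1: exact, case-insensitive match.
  hit <- which(haystack == needle)
  # Step 2: prefix match, tried only when there is no exact match.
  if (length(hit) == 0) hit <- which(startsWith(haystack, needle))
  if (length(hit) == 0) stop("No language name matches: ", name)
  # Assumption of this sketch: warn on ambiguity and keep the first hit.
  if (length(hit) > 1) {
    warning("Ambiguous name '", name, "' matches: ",
            paste(table$Name[hit], collapse = "; "))
  }
  table$Alpha_3[hit[1]]
}

match_language_name("english")       # "eng": exact match wins over the prefix matches
match_language_name("English, Old")  # "ang": prefix match
match_language_name("English, ")     # ambiguous prefix: warns, then returns "ang"
```

The last call shows why a code such as "ang" or "enm" is the safer input: a shortened name can match several entries, and which one is picked then depends on table order.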
See also
Other bibliographic helper functions:
contributor(), creator(), dataset_format(), dataset_title(),
description(), geolocation(), get_bibentry(), publication_year(),
publisher(), relation(), rights(), subject()
Examples
df <- dataset_df(data.frame(x = 1:3))
language(df) <- "English" # Returns "eng"
language(df) <- "fre" # Legacy code; returns "fra"
language(df) <- "fra" # Returns "fra"
language(df, iso_639_code = "639-1") <- "fra" # Returns "fr"
language(df) <- NULL # Sets ":unas"