Get or assign the primary language of a semantically rich dataset object
using an ISO 639 language code or a full language name. The assignment form
sets the language attribute in the dataset's metadata.
Usage
language(x)
language(x, iso_639_code = "639-3") <- value
Arguments
- x
A dataset object created by dataset_df() or as_dataset_df().
- iso_639_code
A character string indicating the desired return format: either "639-3" (default; terminologic) or "639-1" (2-letter code).
- value
A 2-letter or 3-letter language code (ISO 639-1 or ISO 639-2), or a full language name (case-insensitive).
Value
The dataset with an updated language attribute, typically an ISO
639-2/T code (Alpha_3_T) such as "fra", "eng", "spa", etc.
Details
This function supports recognition of:
- 2-letter codes (ISO 639-1, e.g., "en", "fr")
- 3-letter codes from both:
  - Alpha_3_B (bibliographic, e.g., "fre")
  - Alpha_3_T (terminologic, e.g., "fra")
- Full language names (e.g., "English", "French")
For compatibility with open science repositories and modern metadata
standards, this function returns the terminologic code (Alpha_3_T)
when available. If Alpha_3_T is missing for a language, the legacy
bibliographic code (Alpha_3_B) is used as a fallback.
Full language names (e.g., "English", "Spanish") are matched
case-insensitively against the ISO 639-2 Name field. Exact matches are
attempted first; if none are found, a prefix match is used. For example:
- "English" returns "eng"
- "English, Old" returns "ang"
This means that:
- Both "fra" (terminologic) and "fre" (bibliographic) are accepted as valid input for French
- The value stored and returned will be "fra"
This behaviour aligns with common repository practices (Zenodo, OSF, Figshare).
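The code normalisation described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the table iso_tbl is a small hypothetical excerpt laid out like ISO 639-2 (with NA standing in for a missing Alpha_3_T), and normalise_code() is a made-up helper name used only for illustration.

```r
# Hypothetical excerpt in the style of an ISO 639-2 table (illustration only).
iso_tbl <- data.frame(
  Alpha_3_B = c("fre", "eng", "ang"),
  Alpha_3_T = c("fra", NA, NA),   # NA models a missing terminologic code
  Alpha_2   = c("fr", "en", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: accept a 2-letter, bibliographic, or terminologic code
# and return the terminologic code, falling back to the bibliographic one.
normalise_code <- function(code, tbl = iso_tbl) {
  code <- tolower(code)
  # which() silently drops the NA comparisons produced by missing codes
  hit <- which(tbl$Alpha_2 == code |
                 tbl$Alpha_3_B == code |
                 tbl$Alpha_3_T == code)
  if (length(hit) == 0) {
    return(NA_character_)
  }
  row <- tbl[hit[1], ]
  if (!is.na(row$Alpha_3_T)) row$Alpha_3_T else row$Alpha_3_B
}

normalise_code("fre")  # "fra" (bibliographic input, terminologic output)
normalise_code("fr")   # "fra"
normalise_code("eng")  # "eng" (no Alpha_3_T in this toy table, falls back to Alpha_3_B)
```

In this sketch an unknown code yields NA; how language() itself reports unrecognised input is not stated here.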
If value is NULL, the language is marked as ":unas" (unspecified).
In some cases, especially for historical or moribund languages, multiple
similar names may exist. In such cases, it is safer to use a specific
language code (e.g., "ang" instead of "English, Old" and "enm"
for "English, Middle (1100-1500)"). You can also
refer directly to the definitions in ISOcodes::ISO_639_2
for clarity.
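The exact-then-prefix name matching, and the reason codes are safer for similarly named languages, can be sketched in base R as follows. This is an assumption-laden illustration rather than the function's actual code: iso_names is a small hypothetical table, and match_language_name() is an invented helper name.

```r
# Hypothetical name table (illustration only); real entries live in
# ISOcodes::ISO_639_2.
iso_names <- data.frame(
  code = c("eng", "ang", "enm"),
  Name = c("English",
           "English, Old (ca.450-1100)",
           "English, Middle (1100-1500)"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: case-insensitive exact match first, prefix match second.
match_language_name <- function(name, tbl = iso_names) {
  key <- tolower(name)
  names_lc <- tolower(tbl$Name)
  hit <- which(names_lc == key)              # 1. exact match
  if (length(hit) == 0) {
    hit <- which(startsWith(names_lc, key))  # 2. prefix fallback
  }
  if (length(hit) == 0) {
    return(NA_character_)
  }
  tbl$code[hit[1]]                           # first candidate wins in this sketch
}

match_language_name("English")       # "eng" (exact match beats the longer names)
match_language_name("english, old")  # "ang" (prefix match)
match_language_name("English, ")     # "ang" (ambiguous prefix: result depends on row order)
```

The last call shows why an explicit code such as "enm" is the safer input: a partial name can match several rows, and which one is chosen depends on how ties are resolved.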
See also
Other bibliographic helper functions:
contributor(),
creator(),
dataset_format(),
dataset_title(),
description(),
geolocation(),
get_bibentry(),
publication_year(),
publisher(),
relation(),
rights(),
subject()
Examples
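library(dataset)

df <- dataset_df(data.frame(x = 1:3))
language(df) <- "English" # Full name; stored as "eng"
language(df) <- "fre" # Legacy bibliographic code; stored as "fra"
language(df) <- "fra" # Terminologic code; stored as "fra"
language(df, iso_639_code = "639-1") <- "fra" # 2-letter format; stored as "fr"
language(df) <- NULL # Sets ":unas"
language(df) # Retrieve the current language attribute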
