This function predicts the gender of a first name given a year or range of
years in which the person was born. The prediction can use one of several
data sets suitable for different time periods or geographical regions. See
the package vignette for suggestions on using this function with multiple
names and for a discussion of which data set is most suitable for your
research question. When using certain methods, the
package is required; you will be prompted to install it if it is not already
gender(
  names,
  years = c(1932, 2012),
  method = c("ssa", "ipums", "napp", "kantrowitz", "genderize", "demo"),
  countries = c("United States", "Canada", "United Kingdom", "Denmark",
                "Iceland", "Norway", "Sweden")
)
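names: First names as a character vector. Names are case insensitive.
years: The birth year of the person whose name is to be classified. This
argument can be either a single year or a range of years given as a
two-element vector, such as the default c(1932, 2012).
method: This value determines the data set that is used to predict the
gender of the name. The accepted values listed in the usage are "ssa",
"ipums", "napp", "kantrowitz", "genderize", and "demo".
countries: The countries for which data sets are being used. The accepted
values listed in the usage are "United States", "Canada", "United Kingdom",
"Denmark", "Iceland", "Norway", and "Sweden".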
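As a rough illustration of how these arguments fit together, the sketch below passes a character vector of several names and a year range in a single call. It assumes the gender package is installed and uses the "demo" method that appears in the examples on this page. The second name is only illustrative, and whether the demo data covers it is not stated here, so no output is shown.

```r
# Inputs for the call: a character vector of names and a two-element year range.
first_names <- c("madison", "hazel")  # "hazel" is an illustrative choice
year_range <- c(1900, 1985)

# Guard so that the sketch does nothing when the gender package is missing.
if (requireNamespace("gender", quietly = TRUE)) {
  result <- gender::gender(first_names, method = "demo", years = year_range)
  print(result)
}
```

Passing a single number for `years`, such as `years = 1985`, restricts the prediction to that one birth year instead of a range.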
Returns a data frame containing the results of predicting the gender. The exact columns of the returned data frame depend on the specific method used. They include the following:
name: The name for which the gender has been predicted.
proportion_male: The proportion of male names for the given range of years.
proportion_female: The proportion of female names for the given range of years.
gender: The predicted gender based on the proportion of male and female names.
Possible values are "male" or "female" for proportions above 0.5, "either"
for proportions that are exactly 0.5, and NA for combinations of names and
years for which a gender cannot be predicted using the given method.
year_min: The lower bound (inclusive) of the year range used for the prediction.
year_max: The upper bound (inclusive) of the year range used for the prediction.
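The way the predicted gender follows from the proportions can be sketched in base R. This is a minimal illustration, not the package's own code: `classify_gender()` is a hypothetical helper, and it assumes the cutoff between the two labels is 0.5.

```r
# Hypothetical helper (not part of the gender package): label a name from the
# proportion of its uses that are female. Assumes a 0.5 cutoff.
classify_gender <- function(proportion_female) {
  ifelse(
    is.na(proportion_female),
    NA_character_,                      # no prediction is possible
    ifelse(
      proportion_female > 0.5, "female",
      ifelse(proportion_female < 0.5, "male", "either")
    )
  )
}

# The first two proportions come from the "madison" examples on this page.
classify_gender(c(0.786, 0.0972, 0.5, NA))
#> [1] "female" "male"   "either" NA
```

Because the male and female proportions sum to one, checking only the female proportion is enough to pick the label.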
gender("madison", method = "demo", years = 1985)
#> # A tibble: 1 x 6
#>   name    proportion_male proportion_female gender year_min year_max
#>   <chr>             <dbl>             <dbl> <chr>     <dbl>    <dbl>
#> 1 madison           0.214             0.786 female     1985     1985

gender("madison", method = "demo", years = c(1900, 1985))
#> # A tibble: 1 x 6
#>   name    proportion_male proportion_female gender year_min year_max
#>   <chr>             <dbl>             <dbl> <chr>     <dbl>    <dbl>
#> 1 madison           0.903            0.0972 male       1900     1985