charlatan
charlatan makes fake data. It is inspired by, and borrows some code from, Python’s faker (https://github.com/joke2k/faker).
Make fake data for:
- person names
- jobs
- phone numbers
- colors: names, hex, rgb
- credit cards
- DOIs
- numbers in range and from distributions
- gene sequences
- geographic coordinates
- emails
- URIs, URLs, and their parts
- IP addresses
- more coming …
Possible use cases for charlatan:
- Students in a classroom setting learning any task that needs a dataset.
- People doing simulations or modeling who need some fake data
- Generate a fake dataset of users for a database before actual users exist
- Complete missing spots in a dataset
- Generate fake data to replace sensitive real data before public release
- Create a random set of colors for visualization
- Generate random coordinates for a map
- Get a set of randomly generated DOIs (Digital Object Identifiers) to assign to fake scholarly artifacts
- Generate fake taxonomic names for a biological dataset
- Get a set of fake sequences to use to test code/software that uses sequence data
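As an illustration of the "replace sensitive real data" use case above, here is a minimal sketch. The `real` data.frame and its values are hypothetical and made up for this example; only `ch_name()` and `ch_phone_number()` come from charlatan.

```r
library(charlatan)

# Hypothetical "real" records; these values are invented for illustration.
real <- data.frame(
  name  = c("Person A", "Person B", "Person C"),
  phone = c("000-0001", "000-0002", "000-0003"),
  score = c(12.5, 9.1, 14.8),
  stringsAsFactors = FALSE
)

# Copy the data, then overwrite the identifying columns with fake values.
public <- real
public$name  <- ch_name(n = nrow(real))
public$phone <- ch_phone_number(n = nrow(real))

public
```

The non-identifying `score` column is left untouched, so the public copy keeps its analytical value while the names and phone numbers are replaced.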
Reasons to use charlatan:
- Lightweight, with few dependencies
- Relatively comprehensive types of data, with more being added
- Comprehensive set of languages supported, with more being added
- Useful R features, such as creating entire fake data.frames
Installation
CRAN version
install.packages("charlatan")
Development version
remotes::install_github("ropensci/charlatan")
library("charlatan")
set.seed(12345)
high level function
fraudster() is a single entry point for all fake data operations:
x <- fraudster()
x$job()
x$name()
x$color_name()
locale support
More locales are being added over time. For example, job data supports several locales:
ch_job(locale = "en_US", n = 3)
ch_job(locale = "fr_FR", n = 3)
ch_job(locale = "hr_HR", n = 3)
ch_job(locale = "uk_UA", n = 3)
ch_job(locale = "zh_TW", n = 3)
For colors:
ch_color_name(locale = "en_US", n = 3)
ch_color_name(locale = "uk_UA", n = 3)
More coming soon …
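To compare locales side by side, one option is to loop over a vector of locale codes. This is only a sketch: the `locales` and `jobs` names are illustrative, and it uses the job locales shown in this section.

```r
library(charlatan)

# Locale codes taken from the job examples in this section.
locales <- c("en_US", "fr_FR", "hr_HR", "uk_UA", "zh_TW")

# One fake job title per locale, returned as a named character vector.
jobs <- vapply(
  locales,
  function(loc) ch_job(n = 1, locale = loc),
  character(1)
)
jobs
```

Because `vapply()` keeps the input names, each job title is labelled with the locale it came from.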
generate a dataset
ch_generate()
ch_generate("job", "phone_number", n = 30)
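A generated dataset can then be saved for use elsewhere, for example to seed a test database before real users exist. The sketch below assumes that `ch_generate()` returns a data.frame-like object with one column per requested type. The `users`, `path`, and `check` names and the CSV round trip are illustrative and not part of charlatan.

```r
library(charlatan)

# Build a small fake "users" table with three columns.
users <- ch_generate("name", "job", "phone_number", n = 30)

# Write it to a temporary CSV file, e.g. for loading into a test database.
path <- tempfile(fileext = ".csv")
write.csv(users, path, row.names = FALSE)

# Read it back to confirm the shape survived the round trip.
check <- read.csv(path, stringsAsFactors = FALSE)
dim(check)
```

`write.csv()` quotes character fields by default, so values that contain commas should survive the round trip.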
person name
ch_name()
ch_name(10)
phone number
ch_phone_number()
ch_phone_number(10)
job
ch_job()
ch_job(10)
credit cards
ch_credit_card_provider()
ch_credit_card_provider(n = 4)
ch_credit_card_number()
ch_credit_card_number(n = 10)
ch_credit_card_security_code()
ch_credit_card_security_code(10)
Usage in the wild
- eacton/R-Utility-Belt-ggplot2 (https://github.com/eacton/R-Utility-Belt-ggplot2/blob/836a6bd303fbfde4a334d351e0d1c63f71c4ec68/furry_dataset.R)
Contributors
- Roel M. Hogervorst (https://github.com/rmhogervorst)
- Scott Chamberlain (https://github.com/sckott)
- Kyle Voytovich (https://github.com/kylevoyto)
- Martin Pedersen (https://github.com/MartinMSPedersen)
If you would like to contribute, see CONTRIBUTING (on GitHub).
similar art
- wakefield (https://github.com/trinker/wakefield)
- ids (https://github.com/richfitz/ids)
- rcorpora (https://github.com/gaborcsardi/rcorpora)
- synthpop (https://cran.r-project.org/package=synthpop)
Meta
- Please report any issues or bugs.
- License: MIT
- Get citation information for charlatan in R by running citation(package = 'charlatan')
- Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.