Contributing to charlatan
Scott Chamberlain
Roel M. Hogervorst
Source:vignettes/contributing.Rmd
charlatan is a wee bit complex. This vignette aims to help you contribute to the package. For a general introduction on contributing to rOpenSci packages, see our Contributing guide.
Communication
Open an issue if you want to add a new provider or locale to an existing provider; it helps make sure there’s no duplicated effort and we can help make sure you have the knowledge you need.
Let’s continue with some definitions.
Definitions
For the purposes of this package:
- Provider: a type of data that can be generated in charlatan. For example, we have providers for phone numbers, addresses and people’s names. Adding a provider may involve a single file or more than one file, and a single R6 class or many R6 classes.
- Locale: a locale for our purposes is a specific spoken language that’s associated with a specific country. You can have more than one locale for a given language (e.g., en-US, en-GB). Some fakers won’t have any locales, whereas others can have many.
A Localized provider is a provider specific to one locale, e.g. PhoneNumberProvider_en_US. A Parent provider is a provider that is inherited by the Localized providers, e.g. PhoneNumberProvider.
We have made these terms bold in this vignette. We hope the following examples make this a bit clearer.
Example
There are Providers without locales, like CurrencyProvider.
And there are providers with locales: there is a Parent Provider AddressProvider, which you cannot use without a locale, but you can use its Localized provider AddressProvider_en_US. The locale is en_US.
R6
If you aren’t familiar with R6, have a look at the R6 website, in particular the introductory vignette.
Inheritance
At the heart of charlatan is the BareProvider. This class has all the basic number and text substitution that is used throughout the package.
All non-locale providers inherit directly from the BareProvider: for example, NumericsProvider inherits from BareProvider.
For all providers with locales, we have some added logic for locales in the BaseProvider.
Locale specific inheritance
All providers with locales inherit from a common provider (Parent Provider). For example, the English (United States) AddressProvider (AddressProvider_en_US) inherits from AddressProvider, which inherits from BaseProvider, which inherits from BareProvider:
BareProvider > BaseProvider > AddressProvider > AddressProvider_en_US
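The chain above can be sketched with plain R6 classes. This is a minimal illustration and not charlatan's actual code: the Toy* class names, the numerify() helper and the postcode_format field are all hypothetical, chosen only to show how each level builds on the one before it.

```r
library(R6)

# Hypothetical stand-in for BareProvider: basic substitution helpers.
ToyBareProvider <- R6Class(
  "ToyBareProvider",
  public = list(
    # Replace every "#" in a template with a random digit.
    numerify = function(template) {
      chars <- strsplit(template, "")[[1]]
      hashes <- chars == "#"
      chars[hashes] <- as.character(sample(0:9, sum(hashes), replace = TRUE))
      paste(chars, collapse = "")
    }
  )
)

# Hypothetical stand-in for BaseProvider: adds a locale field.
ToyBaseProvider <- R6Class(
  "ToyBaseProvider",
  inherit = ToyBareProvider,
  public = list(
    locale = NULL
  )
)

# Hypothetical Parent provider: logic shared by all locales.
ToyAddressProvider <- R6Class(
  "ToyAddressProvider",
  inherit = ToyBaseProvider,
  public = list(
    postcode = function() {
      self$numerify(private$postcode_format)
    }
  ),
  private = list(
    postcode_format = NULL
  )
)

# Hypothetical Localized provider: only supplies locale-specific data.
ToyAddressProvider_en_US <- R6Class(
  "ToyAddressProvider_en_US",
  inherit = ToyAddressProvider,
  public = list(
    locale = "en_US"
  ),
  private = list(
    postcode_format = "#####"
  )
)

x <- ToyAddressProvider_en_US$new()
x$postcode()  # five random digits, built by the inherited numerify()
class(x)      # every class in the chain, most specific first
```

The localized class contains no logic of its own here; it only overrides data, which is the pattern most locales can follow.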
With inheritance we can define common functionality that works for most locales, while keeping the ability to override that functionality where a specific locale needs something different.
For example:
- the PersonProvider class (Parent Provider) has the methods first_name and last_name; the Japanese locale of PersonProvider (PersonProvider_ja_JP) inherits these methods and also adds new kana methods:
library(charlatan)
set.seed(2000)
en <- PersonProvider_en_US$new() # English
jp <- PersonProvider_ja_JP$new() # Japanese
en$first_name() # Georgia
#> [1] "Georgia"
jp$first_name() # Haruka
#> [1] "Haruka"
jp$first_kana_name() # カオリ
#> [1] "カオリ"
jp$last_kana_name() # コイズミ
#> [1] "コイズミ"
- AddressProvider (Parent Provider) has city and address methods, but the AddressProvider for the New Zealand locale (AddressProvider_en_NZ) has extra components for Maori names.
Adding new providers or locales
Yes, we welcome new contributions. Look in the GitHub issues or scratch your own itch.
Adding a new locale step by step
Yes, we welcome new locales for existing Providers!
- First open an issue and explain your plan.
- Then clone the repository.
- Then open a pull request.
Here is what we want to see in the Pull Request:
Code work:
- Please first look at other locales and see how they are implemented; this helps with uniformity.
- Add the locale to the Parent provider under private - locales.
- Make sure the locale is in the list in available_locales.R (if not, add it).
- See if there is a template file for your provider under the folder inst/. Copy that template to a new file and fill it in.
- Write code to enable your locale.
- Run tests to make sure everything works.
- If you add custom methods to your localized provider, add a test for that functionality in the test folder.
Documentation work:
We want to have great documentation for this package and that means some work for you.
- If you overwrite a method from the Parent provider you have to add a docstring: #' @description what the thing does
- If you add information under public you still have to document that field with a docstring: #' @field name_of_field description of the thing
- If you add new functionality, provide an example under #' @examples above the code.
- Run make doc or Rscript --no-init-file -e "library(methods); devtools::run_examples()" in the terminal, and make sure there are no warnings or errors.
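The documentation requirements above might look like this in practice. This is only a sketch: ToyLoremProvider, ToyLoremProvider_xx, the xx locale and the shout field are invented for illustration and are not part of charlatan. It shows where the roxygen2 tags go when a localized provider overrides a method and adds a public field.

```r
library(R6)

# Hypothetical Parent provider, for illustration only.
ToyLoremProvider <- R6Class(
  "ToyLoremProvider",
  public = list(
    #' @description Return one random word.
    word = function() {
      sample(private$word_list, 1)
    }
  ),
  private = list(
    word_list = c("lorem", "ipsum")
  )
)

#' Toy lorem provider for a made-up locale
#'
#' @examples
#' x <- ToyLoremProvider_xx$new()
#' x$word()
ToyLoremProvider_xx <- R6Class(
  "ToyLoremProvider_xx",
  inherit = ToyLoremProvider,
  public = list(
    #' @field locale (character) the locale
    locale = "xx",
    #' @field shout (logical) return words in upper case?
    shout = TRUE,
    #' @description Return one random word, upper-cased when `shout` is TRUE.
    word = function() {
      w <- super$word()
      if (self$shout) toupper(w) else w
    }
  ),
  private = list(
    word_list = c("foo", "bar", "baz")
  )
)

x <- ToyLoremProvider_xx$new()
x$word()
```

The overridden word() method carries its own @description, and each public field has a @field line, so the generated help page stays complete.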
Example
Here we add a new locale to LoremProvider. The LoremProvider generates random words, letters and paragraphs in a language to be used as placeholder text.
We add the language Klingon (locale: tlh) to this Provider.
- First add the locale to lorem-provider.R.
- Check if the locale exists at all (not yet), so add it to the available_locales.R file too.
- Then create a new file lorem-provider-tlh.R:
lorem_word_list_tlh <- c("'Igh'aDmegh", "DIron", "Da'lar","moQbID")
#' Lorem provider for Klingon (Klingon)
#'
#' Methods for Lorem Ipsum generation
#' Lorem Ipsum is a placeholder text commonly used to demonstrate the visual
#' form of a document or a typeface without relying on meaningful content.
#' @family tlh
#' @export
#' @examples
#' x <- LoremProvider_tlh$new()
#' x$word()
#' x$words(3)
#' x$words(6)
#' x$sentence()
#' x$paragraph()
#' x$paragraphs(3)
#' cat(x$paragraphs(6), sep = "\n")
#' x$text(19)
#' x <- LoremProvider_tlh$new(word_connector = " --- ")
#' x$paragraph(4)
LoremProvider_tlh <- R6::R6Class(
  "LoremProvider_tlh",
  inherit = LoremProvider,
  public = list(
    #' @field locale (character) the locale
    locale = "tlh"
  ),
  private = list(
    # locale-specific data lives in the private component
    word_list = lorem_word_list_tlh
  )
)
Adding a new provider, step by step
Yes, we are open to new providers, but we need a use case: for example, is it something you want to use in your work?
- First open an issue and explain your plan.
- Then clone the repository.
- Then open a pull request.
Here is what we want to see in the Pull Request:
Code work:
- First look at existing providers and try to emulate those.
- Providers are R6 classes. Create an R6 class that inherits from the BareProvider (if your provider has no locales) or from the BaseProvider (if it does have locales).
- Add your Provider to the available_providers list in available_providers.R.
- If you have locales, first create the Parent Provider and then create the localized providers and inherit from the Parent Provider.
- If you have locales: make sure the Parent Provider raises an error on creation (look in other providers for help).
- If you have locales: there should at least be an en_US locale.
- Create a test file under tests/testthat with the name test-PROVIDERNAME.R and add tests that will fail when your functionality does not work.
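A test file for a new provider could start out roughly like the sketch below. ToyHexColorProvider and its hex_color() method are hypothetical and defined inline only to keep the example self-contained; in a real test file the provider would come from the package and would inherit from BareProvider or BaseProvider.

```r
library(R6)
library(testthat)

# Hypothetical provider used only for this sketch.
ToyHexColorProvider <- R6Class(
  "ToyHexColorProvider",
  public = list(
    # Build a random color such as "#1a2b3c".
    hex_color = function() {
      digits <- sample(c(0:9, letters[1:6]), 6, replace = TRUE)
      paste0("#", paste(digits, collapse = ""))
    }
  )
)

# Contents of a hypothetical tests/testthat/test-ToyHexColorProvider.R
test_that("ToyHexColorProvider returns a valid hex color", {
  x <- ToyHexColorProvider$new()
  out <- x$hex_color()
  expect_type(out, "character")
  expect_length(out, 1)
  expect_match(out, "^#[0-9a-f]{6}$")
})
```

Because the output is random, the expectations check the shape of the result (type, length, pattern) rather than one fixed value.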
Documentation work:
We want to have great documentation for this package and that means some work for you.
- Make sure the Providers are described.
- All public fields and methods need a description.
- Add examples of functionality in the docs under #' @examples.
- Run make doc or Rscript --no-init-file -e "library(methods); devtools::run_examples()" in the terminal, and make sure there are no warnings or errors.
Guidelines for providers and locales
There are a few things we enforce in tests:
- All Providers that inherit from BaseProvider are considered Parent providers: they should never be directly initialized.
- Localized providers inherit from Parent Providers and should work.
- Localized providers need at least an en_US locale.
So PhoneNumberProvider should error on instantiation, but PhoneNumberProvider_en_US should work.
But not everything can be tested, so here are some other requisites:
- New providers should go in the available_providers list.
- New locales should be in the available_locales_list.
- Parent Providers should have locale = NULL.
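One way to get a Parent provider that errors on instantiation while its Localized providers work is to check the locale field in the constructor. The sketch below assumes that approach; it is not necessarily how charlatan implements the check, and the ToyPhoneProvider classes are hypothetical.

```r
library(R6)

# Hypothetical Parent provider: locale is NULL, so creation fails.
ToyPhoneProvider <- R6Class(
  "ToyPhoneProvider",
  public = list(
    locale = NULL,
    initialize = function() {
      if (is.null(self$locale)) {
        stop(
          "ToyPhoneProvider is a parent provider; use a localized provider",
          call. = FALSE
        )
      }
    }
  )
)

# Hypothetical Localized provider: sets the locale, so creation succeeds.
ToyPhoneProvider_en_US <- R6Class(
  "ToyPhoneProvider_en_US",
  inherit = ToyPhoneProvider,
  public = list(
    locale = "en_US"
  )
)

ok <- ToyPhoneProvider_en_US$new()                 # works
res <- try(ToyPhoneProvider$new(), silent = TRUE)  # parent refuses
inherits(res, "try-error")                         # TRUE
```

The localized class inherits the constructor unchanged; it passes the check only because it overrides locale.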
Where should I add logic or data?
In general we put new logic and data close to where it is used. If you need something for one locale only, place it there. Are we re-using that logic for multiple locales of one Provider? Consider if the logic should go in the Parent Provider.
Data generally goes into the private component of the R6 class: