
Estimate a Bayesian Dynamic Multivariate Panel Model With Multiple Imputation
Source: R/dynamice.R
Applies multiple imputation using mice::mice() to the supplied data
and fits a dynamic multivariate panel model to each imputed data set using
dynamite(). Posterior samples from each imputation run are
combined. When using wide format imputation, the long format data is
automatically converted to a wide format before imputation to preserve the
longitudinal structure, and then converted back to long format for
estimation.
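The long-to-wide round trip described above can be illustrated with base R. This is only a sketch on a hypothetical toy data set using stats::reshape(); it is not the conversion code used internally by dynamice(), and the column names id, time, and y are assumptions made for the example.

```r
# Hypothetical toy panel in long format: 3 groups, 4 time points each.
long <- data.frame(
  id   = rep(1:3, each = 4),
  time = rep(1:4, times = 3),
  y    = c(1.2, NA, 0.7, 1.9, 0.3, 0.8, NA, 1.1, 2.0, 2.4, 2.2, NA)
)

# Long -> wide: one row per group, one column per time point (y.1, ..., y.4).
# An imputation model now sees the whole trajectory of a group at once.
wide <- reshape(
  long,
  direction = "wide",
  idvar = "id",
  timevar = "time",
  v.names = "y"
)

# Wide -> long: back to one row per (id, time) pair for estimation.
long_again <- reshape(
  wide,
  direction = "long",
  idvar = "id",
  timevar = "time",
  varying = c("y.1", "y.2", "y.3", "y.4"),
  v.names = "y",
  times = 1:4
)
long_again <- long_again[order(long_again$id, long_again$time), ]
```

In the wide layout each missing cell sits in the same row as the other measurements of the same group, which is what lets the imputation use the longitudinal structure.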
Usage
dynamice(
dformula,
data,
time,
group = NULL,
priors = NULL,
backend = "rstan",
verbose = TRUE,
verbose_stan = FALSE,
stanc_options = list("O0"),
threads_per_chain = 1L,
grainsize = NULL,
custom_stan_model = NULL,
interval = 1L,
debug = NULL,
mice_args = list(),
impute_format = "wide",
keep_imputed = FALSE,
stan_csv_dir = tempdir(),
...
)
Arguments
- dformula
[dynamiteformula]
The model formula. See dynamiteformula() and 'Details'.
- data
[data.frame, tibble::tibble, or data.table::data.table]
The data that contains the variables in the model in long format. Supported column types are integer, logical, double, and factor. Columns of type character will be converted to factors. Unused factor levels will be dropped. The data can contain missing values which will simply be ignored in the estimation in a case-wise fashion (per time-point and per channel). Input data is converted to channel-specific matrix representations via stats::model.matrix.lm().
- time
[character(1)]
A column name of data that denotes the time index of observations. If this variable is a factor, the integer representation of its levels is used internally for defining the time indexing.
- group
[character(1)]
A column name of data that denotes the unique groups, or NULL, corresponding to a scenario without any groups. If group is NULL, a new column .group is created with constant value 1L, indicating that all observations belong to the same group. In case of name conflicts with data, see the group_var element of the return object to get the column name of the new variable.
- priors
[data.frame]
An optional data frame with prior definitions. See get_priors() and 'Details'.
- backend
[character(1)]
Defines the backend interface to Stan, should be either "rstan" (the default) or "cmdstanr". Note that cmdstanr needs to be installed separately as it is not on CRAN. It also needs the actual CmdStan software. See https://mc-stan.org/cmdstanr/ for details. Falls back to "rstan" if "cmdstanr" cannot be used.
- verbose
[logical(1)]
All warnings and messages are suppressed if set to FALSE. Defaults to TRUE. Setting this to FALSE will also disable checks for perfect collinearity in the model matrix.
- verbose_stan
[logical(1)]
This is the verbose argument for rstan::sampling(). Defaults to FALSE.
- stanc_options
[list()]
This is the stanc_options argument passed to the compile method of a CmdStanModel object via cmdstan_model() when backend = "cmdstanr". Defaults to list("O0"). To enable level one compiler optimizations, use list("O1"). See https://mc-stan.org/cmdstanr/reference/cmdstan_model.html for details.
- threads_per_chain
[integer(1)]
A positive integer defining the number of parallel threads to use within each chain. Default is 1. See rstan::rstan_options() and https://mc-stan.org/cmdstanr/reference/model-method-sample.html for details.
- grainsize
[integer(1)]
A positive integer defining the suggested size of the partial sums when using within-chain parallelization. Default is the number of time points divided by threads_per_chain. Setting this to 1 leaves the workload division entirely to the internal scheduler. The performance of the within-chain parallelization can be sensitive to the choice of grainsize; see the Stan manual on reduce-sum for details.
- custom_stan_model
[character(1)]
An optional character string that either contains a customized Stan model code or a path to a .stan file that contains the code. Using this will override the generated model code. For expert users only.
- interval
[integer(1)]
This argument acts as an offset for the evaluation of lagged observations when measurements are not available at every time point. For example, if measurements are only available at every second time point, setting interval = 2 means that a lag of order k will instead use the observation at 2 * k time units in the past. The default value is 1, meaning that there is a one-to-one correspondence between the lag order and the time scale. For expert users only.
- debug
[list()]
A named list of the form name = TRUE indicating additional objects in the environment of the dynamite function which are added to the return object. Additionally, values no_compile = TRUE and no_sampling = TRUE can be used to skip the compilation of the Stan code and the sampling step, respectively. This can be useful for debugging when combined with model_code = TRUE, which adds the Stan model code to the return object.
- mice_args
[list()]
Arguments passed to mice::mice() excluding data.
- impute_format
[character(1)]
Format of the data that will be passed to the imputation method. Should be either "wide" (the default) or "long", corresponding to wide format and long format imputation.
- keep_imputed
[logical(1)]
Should the imputed datasets be kept in the return object? The default is FALSE. If TRUE, the imputations will be included in the imputed field in the return object that is otherwise NULL.
- stan_csv_dir
[character(1)]
A directory path to output the Stan .csv files when backend is "cmdstanr". The files are saved here via $save_output_files() to avoid garbage collection between sampling runs with different imputed datasets.
- ...
For dynamite(), additional arguments to rstan::sampling() or the $sample() method of the CmdStanModel object (see https://mc-stan.org/cmdstanr/reference/model-method-sample.html), such as chains and cores (chains and parallel_chains in cmdstanr). For summary(), additional arguments to as.data.frame.dynamitefit(). For print(), further arguments to the print method for tibbles (see tibble::formatting). Not used for formula().
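The arguments above might be combined as in the following sketch. It assumes that the dynamite and mice packages and a working Stan toolchain are installed, and it uses the gaussian_example data set that ships with dynamite; the choice of 50 artificially missing responses, 3 imputations, and 2 chains is arbitrary and made only to keep the illustration small.

```r
# Skipped entirely unless both packages are available.
if (requireNamespace("dynamite", quietly = TRUE) &&
    requireNamespace("mice", quietly = TRUE)) {
  library(dynamite)

  # Blank out 50 responses at random so that there is something to impute.
  set.seed(1)
  d <- gaussian_example
  d$y[sample(nrow(d), 50)] <- NA

  fit <- dynamice(
    obs(y ~ -1 + z + varying(~ x + lag(y)), family = "gaussian") +
      splines(df = 20),
    data = d,
    time = "time",
    group = "id",
    mice_args = list(m = 3),  # forwarded to mice::mice(): 3 imputed data sets
    impute_format = "wide",   # the default, spelled out for clarity
    keep_imputed = TRUE,      # keep the imputations in fit$imputed
    chains = 2,               # passed on to rstan::sampling() through ...
    refresh = 0               # silence Stan progress output
  )
}
```

With keep_imputed = TRUE the imputations are available in the imputed field of the returned object; with the default FALSE that field is NULL.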
See also
Model fitting: dynamite(), get_priors(), update.dynamitefit()