All the tests were done on an Arch Linux x86_64 machine with an Intel(R) Core(TM) i7 CPU (1.90GHz).
Empirical likelihood computation
We show the performance of computing empirical likelihood with el_mean(). We test the computation speed with simulated data sets in two different settings: 1) the number of observations increases with the number of parameters fixed, and 2) the number of parameters increases with the number of observations fixed.
Increasing the number of observations
We fix the number of parameters at \(p = 10\), and simulate the parameter value and \(n \times p\) matrices using rnorm(). In order to ensure convergence with a large \(n\), we set a large threshold value using el_control().
# el_mean() and el_control() are provided by the melt package
library(melt)
library(ggplot2)
library(microbenchmark)
set.seed(3175775)
p <- 10
par <- rnorm(p, sd = 0.1)
ctrl <- el_control(th = 1e+10)
result <- microbenchmark(
n1e2 = el_mean(matrix(rnorm(100 * p), ncol = p), par = par, control = ctrl),
n1e3 = el_mean(matrix(rnorm(1000 * p), ncol = p), par = par, control = ctrl),
n1e4 = el_mean(matrix(rnorm(10000 * p), ncol = p), par = par, control = ctrl),
n1e5 = el_mean(matrix(rnorm(100000 * p), ncol = p), par = par, control = ctrl)
)
Below are the results:
result
#> Unit: microseconds
#> expr        min         lq        mean     median         uq        max neval cld
#> n1e2    437.445    469.045    510.3973    501.801    542.362    875.783   100 a
#> n1e3   1132.022   1334.695   1479.9849   1429.381   1539.426   3978.561   100 a
#> n1e4  10594.912  12492.387  14527.4458  14917.119  15772.925  23214.120   100 b
#> n1e5 179649.830 226199.170 272309.7140 263121.640 316161.041 455073.881   100 c
autoplot(result)
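To read a growth rate off the table above, one option is to regress the log of the median time on the log of the number of observations. The sketch below is not part of the original benchmark: it uses only base R, copies the medians by hand from the printed output, and the names n_obs, median_us, and slope_n are illustrative. A slope near 1 would suggest roughly linear growth in \(n\) over this range, although four points from a single machine give only a rough indication.

```r
# Median timings in microseconds, copied by hand from the table above.
n_obs <- c(1e2, 1e3, 1e4, 1e5)
median_us <- c(501.801, 1429.381, 14917.119, 263121.640)

# Fit log10(time) = a + b * log10(n). The slope b is an empirical
# scaling exponent for this range of n, not a theoretical complexity.
fit_n <- lm(log10(median_us) ~ log10(n_obs))
slope_n <- unname(coef(fit_n)[2])
slope_n
```

The fixed call overhead is a noticeable share of the timings at small \(n\), so the fitted slope is likely to understate the growth rate that applies at large \(n\).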
Increasing the number of parameters
This time we fix the number of observations at \(n = 1000\), and evaluate empirical likelihood at zero vectors of different sizes.
n <- 1000
result2 <- microbenchmark(
p5 = el_mean(matrix(rnorm(n * 5), ncol = 5),
par = rep(0, 5),
control = ctrl
),
p25 = el_mean(matrix(rnorm(n * 25), ncol = 25),
par = rep(0, 25),
control = ctrl
),
p100 = el_mean(matrix(rnorm(n * 100), ncol = 100),
par = rep(0, 100),
control = ctrl
),
p400 = el_mean(matrix(rnorm(n * 400), ncol = 400),
par = rep(0, 400),
control = ctrl
)
)
result2
#> Unit: microseconds
#> expr        min         lq       mean     median         uq        max neval cld
#> p5      707.731    769.170    811.624    808.979    852.550   1044.319   100 a
#> p25    2619.586   2686.616   2887.323   2750.345   2813.819   7487.315   100 a
#> p100  20447.640  22902.308  25113.666  23853.763  27596.148  45707.044   100 b
#> p400 235867.589 259775.978 293469.463 280659.142 316931.664 429653.787   100 c
autoplot(result2)
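Because the values of \(p\) are not evenly spaced, it can help to compare the growth in time with the growth in \(p\) between successive settings. The following base R sketch is an illustration rather than part of the original benchmark: the medians are copied by hand from the printed output, and the names n_par, median_us_p, and local_exponent are hypothetical. Each entry estimates the exponent \(k\) in \(\text{time} \propto p^k\) between two neighbouring settings.

```r
# Median timings in microseconds, copied by hand from the table above.
n_par <- c(5, 25, 100, 400)
median_us_p <- c(808.979, 2750.345, 23853.763, 280659.142)

# Local exponent between neighbouring settings:
# log(t2 / t1) / log(p2 / p1).
local_exponent <- diff(log(median_us_p)) / diff(log(n_par))
names(local_exponent) <- paste0("p", head(n_par, -1), "_to_p", tail(n_par, -1))
local_exponent
```

An exponent that increases from one pair to the next indicates that the cost per added parameter grows with \(p\), which matters when extrapolating beyond the sizes tested here.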
On average, evaluating empirical likelihood with a 100000×10 or 1000×400 matrix at a parameter value satisfying the convex hull constraint takes less than a second: the mean times in the runs above are about 0.27 and 0.29 seconds, respectively.