Performance

All tests were run on an Arch Linux x86_64 machine with an Intel(R) Core(TM) i7 CPU (1.90 GHz).

Empirical likelihood computation

We measure the speed of empirical likelihood computation with el_mean() on simulated data sets in two settings: 1) the number of observations increases with the number of parameters fixed, and 2) the number of parameters increases with the number of observations fixed.

Increasing the number of observations

We fix the number of parameters at $p = 10$, and simulate the parameter value and the $n \times p$ data matrices using rnorm(). To ensure convergence with large $n$, we set a large threshold value using el_control().

library(melt) # provides el_mean() and el_control()
library(ggplot2)
library(microbenchmark)
set.seed(3175775)
p <- 10
par <- rnorm(p, sd = 0.1)
ctrl <- el_control(th = 1e+10)
result <- microbenchmark(
  n1e2 = el_mean(matrix(rnorm(100 * p), ncol = p), par = par, control = ctrl),
  n1e3 = el_mean(matrix(rnorm(1000 * p), ncol = p), par = par, control = ctrl),
  n1e4 = el_mean(matrix(rnorm(10000 * p), ncol = p), par = par, control = ctrl),
  n1e5 = el_mean(matrix(rnorm(100000 * p), ncol = p), par = par, control = ctrl)
)
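Each benchmarked expression above generates its data inside the timed call, so the reported timings include the cost of rnorm() and matrix() as well as el_mean(). As a rough way to gauge that share, the following base R sketch times only the data generation step for the largest setting. The names `x` and `gen_time` are illustrative and not part of the original benchmark, and a single system.time() call is much noisier than the 100 evaluations that microbenchmark() uses.

```r
set.seed(3175775)
p <- 10
n <- 100000

# Time only the data generation used in the largest benchmark expression.
# system.time() evaluates its argument in the calling frame, so `x` is
# available afterwards.
gen_time <- system.time(
  x <- matrix(rnorm(n * p), ncol = p)
)

dim(x)
gen_time[["elapsed"]] # elapsed wall-clock time in seconds
```

Comparing this elapsed time with the n1e5 timings gives an indication of how much of the measured time is attributable to simulation rather than to the likelihood computation itself.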

Below are the results:

result
#> Unit: microseconds
#>  expr        min          lq        mean      median         uq        max
#>  n1e2    443.748    475.7625    580.4861    493.6865    535.454   5076.782
#>  n1e3   1185.071   1396.6205   2311.3019   1482.2000   1632.075  68801.060
#>  n1e4  10687.549  13223.3250  14615.9958  14977.4830  15899.098  21061.054
#>  n1e5 172394.562 204227.0865 243041.4950 233537.2195 269732.809 338287.505
#>  neval cld
#>    100 a  
#>    100 a  
#>    100  b 
#>    100   c
autoplot(result)
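To summarize how the timings grow with $n$, one option is to fit a straight line on the log-log scale, whose slope estimates the scaling exponent. The sketch below uses the median timings copied by hand from the table above; the variable names `median_us`, `fit`, and `slope` are illustrative, and with only four points the estimate is crude.

```r
# Median timings (microseconds) copied from the `result` table above.
n <- c(1e2, 1e3, 1e4, 1e5)
median_us <- c(493.6865, 1482.2000, 14977.4830, 233537.2195)

# Fit log10(time) = a + b * log10(n); the slope b estimates the exponent
# in time ~ n^b.
fit <- lm(log10(median_us) ~ log10(n))
slope <- unname(coef(fit)[2])
slope
```

A slope near 1 would indicate roughly linear growth in the number of observations. For these four points the slope comes out slightly below 1, which probably reflects fixed overhead at small $n$: the last step, from $n = 10^4$ to $n = 10^5$, increases the median time by a factor of about 15.6.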

Increasing the number of parameters

This time we fix the number of observations at $n = 1000$, and evaluate empirical likelihood at zero vectors of different sizes.

n <- 1000
result2 <- microbenchmark(
  p5 = el_mean(matrix(rnorm(n * 5), ncol = 5),
    par = rep(0, 5),
    control = ctrl
  ),
  p25 = el_mean(matrix(rnorm(n * 25), ncol = 25),
    par = rep(0, 25),
    control = ctrl
  ),
  p100 = el_mean(matrix(rnorm(n * 100), ncol = 100),
    par = rep(0, 100),
    control = ctrl
  ),
  p400 = el_mean(matrix(rnorm(n * 400), ncol = 400),
    par = rep(0, 400),
    control = ctrl
  )
)

result2
#> Unit: microseconds
#>  expr        min         lq        mean     median          uq        max neval
#>    p5    722.057    758.079    794.8151    789.222    819.6595    903.726   100
#>   p25   2882.297   2921.035   3048.9510   2939.209   3010.8375   8164.793   100
#>  p100  23323.454  25876.612  28142.0245  26703.791  30895.1805  50356.098   100
#>  p400 268178.995 292311.263 326318.4875 312712.174 341641.2320 573603.999   100
#>  cld
#>  a  
#>  a  
#>   b 
#>    c
autoplot(result2)
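Because the values of $p$ are not equally spaced, it can help to look at the local growth rate between consecutive settings rather than at a single fitted line. The sketch below computes local exponents as the ratio of differences of logs, using the median timings copied by hand from the table above; `median_us` and `local_exponent` are illustrative names.

```r
# Median timings (microseconds) copied from the `result2` table above.
p <- c(5, 25, 100, 400)
median_us <- c(789.222, 2939.209, 26703.791, 312712.174)

# Local exponent between consecutive settings: if time ~ p^b on an interval,
# then b = diff(log(time)) / diff(log(p)) on that interval.
local_exponent <- diff(log(median_us)) / diff(log(p))
names(local_exponent) <- paste(head(p, -1), tail(p, -1), sep = "->")
local_exponent
```

In this data the local exponent rises as $p$ grows, so the cost per additional parameter increases for wider matrices. The figures come from a single machine and a single run of the benchmark, so they should be read as indicative only.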

On average, evaluating empirical likelihood with a 100000×10 or 1000×400 matrix at a parameter value satisfying the convex hull constraint takes less than a second: the mean times in the benchmarks above are about 0.24 and 0.33 seconds, respectively.

Eunseop Kim
