Performance

Eunseop Kim

All tests were run on an Arch Linux x86_64 machine with an Intel(R) Core(TM) i7 CPU (1.90GHz). We first load the necessary packages.

library(melt)
library(microbenchmark)
library(ggplot2)

Empirical likelihood computation

We show the performance of computing empirical likelihood with el_mean(). We test the computation speed with simulated data sets in two different settings: 1) the number of observations increases with the number of parameters fixed, and 2) the number of parameters increases with the number of observations fixed.
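For readers who want a concrete picture of what is being timed, the following is a minimal base-R sketch of empirical likelihood for a mean. It solves the dual problem for the Lagrange multiplier with a damped Newton iteration. It is an illustration under simplifying assumptions, not the implementation behind el_mean(); the function name el_mean_sketch() and its arguments maxit and tol are hypothetical.

```r
# Illustrative base-R sketch; NOT melt's implementation of el_mean().
# el_mean_sketch(), maxit, and tol are hypothetical names for this example.
el_mean_sketch <- function(x, par, maxit = 50, tol = 1e-8) {
  g <- sweep(as.matrix(x), 2, par) # estimating functions g_i = x_i - par
  n <- nrow(g)
  lambda <- numeric(ncol(g)) # Lagrange multiplier, started at zero
  # Dual objective: -sum(log(1 + lambda' g_i)), infinite outside its domain
  obj <- function(l) {
    w <- 1 + drop(g %*% l)
    if (any(w <= 0)) Inf else -sum(log(w))
  }
  converged <- FALSE
  for (iter in seq_len(maxit)) {
    w <- 1 + drop(g %*% lambda)
    grad <- -colSums(g / w) # row i of g is divided by w[i]
    if (sqrt(sum(grad^2)) < tol) {
      converged <- TRUE
      break
    }
    hess <- crossprod(g / w) # sum of g_i g_i' / w_i^2
    step <- -solve(hess, grad) # Newton direction
    t <- 1
    # Halve the step while it would increase the objective or leave the domain
    while (t > 1e-10 && obj(lambda + t * step) > obj(lambda)) t <- t / 2
    lambda <- lambda + t * step
  }
  w <- 1 + drop(g %*% lambda)
  list(
    lambda = lambda,
    weights = 1 / (n * w), # implied probabilities p_i
    statistic = 2 * sum(log(w)), # minus twice the log-likelihood ratio
    converged = converged
  )
}

set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
fit <- el_mean_sketch(x, par = c(0, 0))
fit0 <- el_mean_sketch(x, par = colMeans(x))
```

At the sample mean the multiplier stays at zero and the statistic is zero, while at any other point inside the convex hull of the data the implied weights sum to one and reproduce the hypothesized mean. Each Newton step forms and solves a \(p \times p\) system built from all \(n\) rows, so in this sketch both \(n\) and \(p\) drive the cost.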

Increasing the number of observations

We fix the number of parameters at \(p = 10\) and simulate the parameter value and the \(n \times p\) data matrices using rnorm(). To ensure convergence for large \(n\), we set a large threshold value through the th argument of el_control().

set.seed(3175775)
p <- 10
par <- rnorm(p, sd = 0.1)
ctrl <- el_control(th = 1e+10)
result <- microbenchmark(
  n1e2 = el_mean(matrix(rnorm(100 * p), ncol = p), par = par, control = ctrl),
  n1e3 = el_mean(matrix(rnorm(1000 * p), ncol = p), par = par, control = ctrl),
  n1e4 = el_mean(matrix(rnorm(10000 * p), ncol = p), par = par, control = ctrl),
  n1e5 = el_mean(matrix(rnorm(100000 * p), ncol = p), par = par, control = ctrl)
)

Below are the results:

result
#> Unit: microseconds
#>  expr        min          lq        mean      median          uq        max neval cld
#>  n1e2    472.719    571.7005    662.1916    613.9395    674.1195   1554.036   100  a
#>  n1e3   1408.129   1769.0415   2138.9093   2028.5750   2251.4865   4869.616   100  a
#>  n1e4  13616.880  18666.5195  22279.0953  21573.8380  24695.5150  47163.699   100  a
#>  n1e5 281631.320 365871.8680 463930.8058 450246.9130 516867.2950 884133.441   100   b
autoplot(result)
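
As a rough check on how the run time grows with \(n\), the medians reported above can be regressed on \(n\) on a log-log scale. The snippet below is a sketch that copies the medians from the printed output by hand; the variable names are illustrative, and four points measured on one machine can only suggest a trend.

```r
# Median run times (microseconds) copied by hand from the output above
n_obs <- c(1e2, 1e3, 1e4, 1e5)
median_us_n <- c(613.9395, 2028.5750, 21573.8380, 450246.9130)

# Slope of log(time) on log(n): a value near 1 suggests roughly linear growth
fit_n <- lm(log10(median_us_n) ~ log10(n_obs))
slope_n <- unname(coef(fit_n)[2])
slope_n

# Ratio of consecutive medians for each tenfold increase in n
ratio_n <- median_us_n[-1] / median_us_n[-length(median_us_n)]
round(ratio_n, 1)
```

The fitted slope summarizes the whole range, while the consecutive ratios show that the growth is not uniform: the tenfold ratio is about 3 at the small end and about 21 at the large end.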

Increasing the number of parameters

This time we fix the number of observations at \(n = 1000\), and evaluate empirical likelihood at zero vectors of different sizes.

n <- 1000
result2 <- microbenchmark(
  p5 = el_mean(matrix(rnorm(n * 5), ncol = 5),
    par = rep(0, 5),
    control = ctrl
  ),
  p25 = el_mean(matrix(rnorm(n * 25), ncol = 25),
    par = rep(0, 25),
    control = ctrl
  ),
  p100 = el_mean(matrix(rnorm(n * 100), ncol = 100),
    par = rep(0, 100),
    control = ctrl
  ),
  p400 = el_mean(matrix(rnorm(n * 400), ncol = 400),
    par = rep(0, 400),
    control = ctrl
  )
)
result2
#> Unit: microseconds
#>  expr        min         lq       mean     median         uq        max neval cld
#>    p5    925.732   1056.840   1253.365   1117.012   1308.915   2779.369   100 a
#>   p25   3669.131   4135.859   5436.506   4888.976   6507.479  11720.933   100 a
#>  p100  34312.301  40968.241  47884.391  46110.473  53782.856  81330.025   100  b
#>  p400 460696.228 536479.512 612616.016 587820.986 653332.783 936715.408   100   c
autoplot(result2)
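
The same rough check can be applied to the number of parameters. Again, this is only a sketch: the medians are copied by hand from the output above, the variable names are illustrative, and the estimate rests on four points.

```r
# Median run times (microseconds) copied by hand from the output above
n_par <- c(5, 25, 100, 400)
median_us_p <- c(1117.012, 4888.976, 46110.473, 587820.986)

# Slope of log(time) on log(p): a value above 1 suggests superlinear growth
fit_p <- lm(log10(median_us_p) ~ log10(n_par))
slope_p <- unname(coef(fit_p)[2])
slope_p

# Local growth exponent between consecutive settings
local_exp <- diff(log(median_us_p)) / diff(log(n_par))
round(local_exp, 2)
```

With these numbers the overall slope is above 1 and the local exponents increase with \(p\), which suggests that, at a fixed \(n = 1000\), the cost grows faster than linearly in the number of parameters.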

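On average, evaluating empirical likelihood with a 100000×10 or 1000×400 matrix at a parameter value satisfying the convex hull constraint takes less than a second on this machine: the mean run times above are about 0.46 and 0.61 seconds, respectively, and even the slowest of the 100 runs in each setting finished in under a second.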