Frequently Asked Questions

library(mcmcensemble)

Can estimates go outside the range of the inits?

Yes. The inits only control the range of the initial values; after this initial step, the chains are free to move anywhere, as shown in the following example.

Please refer to the next question to learn how to specify hard limits for the chains.

## a log-pdf to sample from
p.log <- function(x) {
  B <- 0.03 # controls 'bananacity'
  -x[1]^2 / 200 - 1 / 2 * (x[2] + B * x[1]^2 - 100 * B)^2
}

# set the seed before drawing the inits so the whole example is reproducible
set.seed(20201209)

unif_inits <- data.frame(
  a = runif(10, min = -10, max = -5),
  b = runif(10, min = -10, max = -5)
)

res1 <- MCMCEnsemble(
  p.log,
  inits = unif_inits,
  max.iter = 3000, n.walkers = 10,
  method = "stretch",
  coda = TRUE
)
#> Using stretch move with 10 walkers.

summary(res1$samples)
#> 
#> Iterations = 1:300
#> Thinning interval = 1 
#> Number of chains = 10 
#> Sample size per chain = 300 
#> 
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>      Mean    SD Naive SE Time-series SE
#> a -2.4988 9.721   0.1775         1.1050
#> b -0.3725 3.806   0.0695         0.3982
#> 
#> 2. Quantiles for each variable:
#> 
#>     2.5%     25%     50%   75%  97.5%
#> a -20.62 -10.325 -2.0440 4.818 15.037
#> b -10.22  -2.071  0.5956 2.373  4.261
plot(res1$samples)

res2 <- MCMCEnsemble(
  p.log,
  inits = unif_inits,
  max.iter = 3000, n.walkers = 10,
  method = "differential.evolution",
  coda = TRUE
)
#> Using differential.evolution move with 10 walkers.

summary(res2$samples)
#> 
#> Iterations = 1:300
#> Thinning interval = 1 
#> Number of chains = 10 
#> Sample size per chain = 300 
#> 
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>      Mean     SD Naive SE Time-series SE
#> a -1.6026 10.747  0.19621         0.9506
#> b -0.6902  5.133  0.09371         0.6192
#> 
#> 2. Quantiles for each variable:
#> 
#>     2.5%    25%     50%   75%  97.5%
#> a -24.77 -8.064 -0.6607 5.993 16.387
#> b -15.81 -2.009  1.1906 2.597  4.686
plot(res2$samples)

How can I restrict the possible parameter range?

There is no built-in way to define hard limits for the parameters and ensure that they never leave a given range.

The recommended way to address this is to handle these cases in the function f you provide, by returning -Inf whenever a parameter falls outside the allowed range.

For example, to keep parameters in the 0-1 range:

p.log.restricted <- function(x) {
  
  if (any(x < 0, x > 1)) {
    return(-Inf)
  }
  
  B <- 0.03 # controls 'bananacity'
  -x[1]^2 / 200 - 1 / 2 * (x[2] + B * x[1]^2 - 100 * B)^2
}

unif_inits <- data.frame(
  a = runif(10, min = 0, max = 1),
  b = runif(10, min = 0, max = 1)
)

res <- MCMCEnsemble(
  p.log.restricted,
  inits = unif_inits,
  max.iter = 3000, n.walkers = 10,
  method = "stretch",
  coda = TRUE
)
#> Using stretch move with 10 walkers.
summary(res$samples)
#> 
#> Iterations = 1:300
#> Thinning interval = 1 
#> Number of chains = 10 
#> Sample size per chain = 300 
#> 
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>     Mean     SD Naive SE Time-series SE
#> a 0.4893 0.2933 0.005354        0.02960
#> b 0.6585 0.2659 0.004854        0.02577
#> 
#> 2. Quantiles for each variable:
#> 
#>      2.5%    25%    50%    75%  97.5%
#> a 0.01970 0.2284 0.4886 0.7428 0.9728
#> b 0.05949 0.4930 0.7391 0.8683 0.9904
plot(res$samples)
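
The range check above can also be factored out so it can be reused with other log-densities. The following is only a minimal sketch: restrict_to_box() is a hypothetical helper written for this example and is not part of mcmcensemble. It assumes lower and upper are either scalars or vectors with one entry per parameter.

```r
# Hypothetical helper (NOT part of mcmcensemble): wraps a log-density `f`
# so that it returns -Inf whenever a parameter lies outside [lower, upper].
restrict_to_box <- function(f, lower, upper) {
  function(x) {
    if (any(x < lower) || any(x > upper)) {
      return(-Inf)
    }
    f(x)
  }
}

# same unrestricted log-pdf as in the first question
p.log <- function(x) {
  B <- 0.03 # controls 'bananacity'
  -x[1]^2 / 200 - 1 / 2 * (x[2] + B * x[1]^2 - 100 * B)^2
}

p.log.box <- restrict_to_box(p.log, lower = c(0, 0), upper = c(1, 1))

is.finite(p.log.box(c(0.5, 0.5))) # inside the box
#> [1] TRUE
p.log.box(c(1.5, 0.5)) # first parameter is outside the box
#> [1] -Inf
```

The wrapped function p.log.box can then be passed to MCMCEnsemble() in place of p.log.restricted.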

Writing this guard by hand might seem inconvenient, but in most cases users define their log-posterior as the sum of a log-prior and a log-likelihood (that is, the posterior is proportional to the product of the prior and the likelihood). In this situation, the log-prior already evaluates to -Inf for values outside the support of the prior, so the log-posterior is -Inf as well and there is no need to handle these cases explicitly:

prior.log <- function(x) {
  dunif(x, log = TRUE)
}

lkl.log <- function(x) {
  B <- 0.03 # controls 'bananacity'
  -x[1]^2 / 200 - 1 / 2 * (x[2] + B * x[1]^2 - 100 * B)^2
}

posterior.log <- function(x) {
  sum(prior.log(x)) + lkl.log(x)
}

res <- MCMCEnsemble(
  posterior.log,
  inits = unif_inits,
  max.iter = 5000, n.walkers = 10,
  method = "stretch",
  coda = TRUE
)
#> Using stretch move with 10 walkers.
summary(res$samples)
#> 
#> Iterations = 1:500
#> Thinning interval = 1 
#> Number of chains = 10 
#> Sample size per chain = 500 
#> 
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>     Mean     SD Naive SE Time-series SE
#> a 0.5154 0.2902 0.004104        0.02478
#> b 0.6838 0.2356 0.003332        0.02035
#> 
#> 2. Quantiles for each variable:
#> 
#>      2.5%    25%    50%    75%  97.5%
#> a 0.02785 0.2611 0.5357 0.7700 0.9854
#> b 0.12319 0.5347 0.7367 0.8734 0.9893
plot(res$samples)
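
One quick way to check this behaviour without running the sampler is to evaluate the log-posterior directly at a point inside and a point outside the prior range. The sketch below redefines the functions from the example above so that it can be run on its own; the test points are arbitrary values chosen for illustration.

```r
prior.log <- function(x) {
  dunif(x, log = TRUE) # uniform prior on [0, 1] for each parameter
}

lkl.log <- function(x) {
  B <- 0.03 # controls 'bananacity'
  -x[1]^2 / 200 - 1 / 2 * (x[2] + B * x[1]^2 - 100 * B)^2
}

posterior.log <- function(x) {
  sum(prior.log(x)) + lkl.log(x)
}

is.finite(posterior.log(c(0.5, 0.5))) # both parameters inside [0, 1]
#> [1] TRUE
posterior.log(c(1.5, 0.5)) # first parameter outside [0, 1]
#> [1] -Inf
```

Other bounds can be obtained in the same way through the min and max arguments of dunif().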