bayesImageS::swNoData(..)

This is the first in a series of posts describing the functions and algorithms that I have implemented in the R package bayesImageS.

Gibbs sampling was originally designed by Geman & Geman (1984) for drawing updates from the Gibbs distribution, hence the name. However, single-site Gibbs sampling exhibits poor mixing due to the posterior correlation between the pixel labels. Thus it is very slow to converge when the correlation (controlled by the inverse temperature \( \beta \)) is high.

The algorithm of Swendsen & Wang (1987) addresses this problem by forming clusters of neighbouring pixels, then updating all of the labels within a cluster to the same value. When simulating from the prior, such as a Potts model without an external field, this algorithm is very efficient.

The SW function in the PottsUtils package is implemented in a combination of R and C. The swNoData function in bayesImageS is implemented using RcppArmadillo, which gives it a speed advantage. It is worth noting that the intention of bayesImageS is not to replace PottsUtils. Rather, an efficient Swendsen-Wang algorithm is used as a building block for implementations of ABC (Grelaud et al., 2009), path sampling (Gelman & Meng, 1998), and the exchange algorithm (Murray et al., 2006). These other algorithms will be covered in future posts.

There are two things that we want to keep track of in this simulation study: the speed of the algorithm and the distribution of the summary statistic. We will be using system.time(..) to measure both CPU and elapsed (wall clock) time taken for the same number of iterations, for a range of inverse temperatures:

beta <- seq(0, 2, by = 0.1)
tmMx.PU <- tmMx.bIS <- matrix(nrow = length(beta), ncol = 2)
rownames(tmMx.PU) <- rownames(tmMx.bIS) <- beta
colnames(tmMx.PU) <- colnames(tmMx.bIS) <- c("user", "elapsed")

We will discard the first 100 iterations as burn-in and keep the remaining 500.

iter <- 600
burn <- 100
samp.PU <- samp.bIS <- matrix(nrow = length(beta), ncol = iter - burn)

The distribution of pixel labels can be summarised by the sufficient statistic of the Potts model:

\( S(z) = \sum_{i \sim \ell \in \mathscr{N}} \delta(z_i, z_\ell) \)

where \( i \sim \ell \in \mathscr{N} \) are all of the pairs of neighbours in the lattice (ie. the cliques) and \( \delta(u,v) \) is 1 if \( u = v \) and 0 otherwise (the Kronecker delta function). swNoData returns this automatically, but with SW we will need to use the function sufficientStat to calculate the sufficient statistic for the labels.

library(bayesImageS)
## Loading required package: Rcpp
## Loading required package: RcppArmadillo
library(PottsUtils)

mask <- matrix(1, 50, 50)
neigh <- getNeighbors(mask, c(2, 2, 0, 0))
block <- getBlocks(mask, 2)
edges <- getEdges(mask, c(2, 2, 0, 0))

n <- sum(mask)
k <- 2
bcrit <- log(1 + sqrt(k))
maxSS <- nrow(edges)

for (i in 1:length(beta)) {
    # PottsUtils
    tm <- system.time(result <- SW(iter, n, k, edges, beta = beta[i]))
    tmMx.PU[i, "user"] <- tm["user.self"]
    tmMx.PU[i, "elapsed"] <- tm["elapsed"]
    res <- sufficientStat(result, neigh, block, k)
    samp.PU[i, ] <- res$sum[(burn + 1):iter]
    print(paste("PottsUtils::SW", beta[i], tm["elapsed"], median(samp.PU[i, 
        ])))

    # bayesImageS
    tm <- system.time(result <- swNoData(beta[i], k, neigh, block, c(0, n), 
        iter))
    tmMx.bIS[i, "user"] <- tm["user.self"]
    tmMx.bIS[i, "elapsed"] <- tm["elapsed"]
    samp.bIS[i, ] <- result$sum[(burn + 1):iter]
    print(paste("bayesImageS::swNoData", beta[i], tm["elapsed"], median(samp.bIS[i, 
        ])))
}
## [1] "PottsUtils::SW 0 9.07 2454"
## [1] "bayesImageS::swNoData 0 0.27 2455"
## [1] "PottsUtils::SW 0.1 8.48 2575.5"
## [1] "bayesImageS::swNoData 0.1 0.390000000000001 2572"
## [1] "PottsUtils::SW 0.2 8.01 2699"
## [1] "bayesImageS::swNoData 0.2 0.43 2700"
## [1] "PottsUtils::SW 0.3 7.39 2832"
## [1] "bayesImageS::swNoData 0.3 0.439999999999998 2832.5"
## [1] "PottsUtils::SW 0.4 6.88 2980"
## [1] "bayesImageS::swNoData 0.4 0.43 2976"
## [1] "PottsUtils::SW 0.5 6.34 3130"
## [1] "bayesImageS::swNoData 0.5 0.43 3129"
## [1] "PottsUtils::SW 0.6 5.85 3311"
## [1] "bayesImageS::swNoData 0.6 0.450000000000003 3310"
## [1] "PottsUtils::SW 0.7 5.14 3515.5"
## [1] "bayesImageS::swNoData 0.7 0.410000000000004 3516.5"
## [1] "PottsUtils::SW 0.8 4.57 3787"
## [1] "bayesImageS::swNoData 0.8 0.409999999999997 3784"
## [1] "PottsUtils::SW 0.9 3.94999999999999 4177.5"
## [1] "bayesImageS::swNoData 0.9 0.420000000000002 4178"
## [1] "PottsUtils::SW 1 3.41 4516.5"
## [1] "bayesImageS::swNoData 1 0.420000000000002 4524.5"
## [1] "PottsUtils::SW 1.1 3.22999999999999 4672.5"
## [1] "bayesImageS::swNoData 1.1 0.420000000000002 4683"
## [1] "PottsUtils::SW 1.2 3.08 4763"
## [1] "bayesImageS::swNoData 1.2 0.390000000000001 4768"
## [1] "PottsUtils::SW 1.3 3.16 4812"
## [1] "bayesImageS::swNoData 1.3 0.420000000000002 4810"
## [1] "PottsUtils::SW 1.4 3.00999999999999 4842"
## [1] "bayesImageS::swNoData 1.4 0.390000000000001 4841"
## [1] "PottsUtils::SW 1.5 2.98 4862.5"
## [1] "bayesImageS::swNoData 1.5 0.400000000000006 4863"
## [1] "PottsUtils::SW 1.6 3.01000000000001 4876"
## [1] "bayesImageS::swNoData 1.6 0.399999999999991 4876"
## [1] "PottsUtils::SW 1.7 2.98999999999999 4884"
## [1] "bayesImageS::swNoData 1.7 0.390000000000001 4884"
## [1] "PottsUtils::SW 1.8 2.97999999999999 4888"
## [1] "bayesImageS::swNoData 1.8 0.400000000000006 4889"
## [1] "PottsUtils::SW 1.9 2.92999999999999 4893"
## [1] "bayesImageS::swNoData 1.9 0.399999999999991 4893"
## [1] "PottsUtils::SW 2 2.91 4896"
## [1] "bayesImageS::swNoData 2 0.38000000000001 4896"

Here is the comparison of elapsed times between the two algorithms (in seconds):

summary(tmMx.PU)
##       user         elapsed    
##  Min.   :2.92   Min.   :2.91  
##  1st Qu.:3.00   1st Qu.:3.01  
##  Median :3.37   Median :3.41  
##  Mean   :4.72   Mean   :4.73  
##  3rd Qu.:6.34   3rd Qu.:6.34  
##  Max.   :9.05   Max.   :9.07
summary(tmMx.bIS)
##       user          elapsed     
##  Min.   :0.270   Min.   :0.270  
##  1st Qu.:0.390   1st Qu.:0.390  
##  Median :0.400   Median :0.410  
##  Mean   :0.404   Mean   :0.404  
##  3rd Qu.:0.430   3rd Qu.:0.420  
##  Max.   :0.450   Max.   :0.450
boxplot(tmMx.PU[, "elapsed"], tmMx.bIS[, "elapsed"], ylab = "seconds elapsed", 
    names = c("SW", "swNoData"))

plot of chunk unnamed-chunk-4

On average, swNoData using RcppArmadillo is seven times faster than SW.

library(lattice)
s_z <- c(samp.PU, samp.bIS)
s_x <- rep(beta, times = iter - burn)
s_a <- rep(1:2, each = length(beta) * (iter - burn))
s.frame <- data.frame(s_z, c(s_x, s_x), s_a)
names(s.frame) <- c("stat", "beta", "alg")
s.frame$alg <- factor(s_a, labels = c("SW", "swNoData"))
xyplot(stat ~ beta | alg, data = s.frame)

plot of chunk unnamed-chunk-5

plot(c(s_x, s_x), s_z, pch = s_a, xlab = expression(beta), ylab = expression(S(z)))
abline(v = bcrit, col = "red")

plot of chunk unnamed-chunk-5

The overlap between the two distributions is almost complete, although there is a statistically significant difference using a 2-way ANOVA:

s.frame$beta <- factor(c(s_x, s_x))
s.fit <- aov(stat ~ alg + beta, data = s.frame)
summary(s.fit)
##                Df   Sum Sq  Mean Sq  F value Pr(>F)    
## alg             1 5.14e+03 5.14e+03 4.24e+00   0.04 *  
## beta           20 1.74e+10 8.72e+08 7.18e+05 <2e-16 ***
## Residuals   20978 2.55e+07 1.21e+03                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(s.fit, which = "alg")
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = stat ~ alg + beta, data = s.frame)
## 
## $alg
##               diff     lwr   upr  p adj
## swNoData-SW 0.9898 0.04714 1.932 0.0396

References

Gelman, A. & Meng, X-L (1998) Simulating normalizing constants: from importance sampling to bridge sampling to path sampling

Geman, S. and Geman, D. (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images

Grelaud, A., Robert, C.P., Marin, J-M, Rodolphe, F. & Taly, J-F (2009) ABC likelihood-free methods for model choice in Gibbs random fields

Murray, I., Ghahramani, Z. & MacKay, D.J.C. (2006) MCMC for Doubly-intractable Distributions

Swendsen, R.H. & Wang, J-S (1987) Nonuniversal critical dynamics in Monte Carlo simulations