2022-05-24 @Atsushi Kawaguchi

In this vignette, the output is omitted. Please refer to the following book for the output.

Kawaguchi A. (2021). Multivariate Analysis for Neuroimaging Data. CRC Press.

First, we prepare and then we create the data matrix. Install package (as necessary)

`if(!require("mand")) install.packages("mand")`

Load package

`library(mand)`

The `mand`

package has a function to generate simulation
brain data from the base image, the difference image and the standard
deviation image. These basic images are loaded as follows.

```
data(baseimg)
data(diffimg)
data(mask)
data(sdevimg)
```

The number of voxels in the original 3D image is as follows.

`dim(baseimg)`

To understand the result easily, the difference region was restricted to Parahippocampus and Hippocampus.

`diffimg2 = diffimg * (tmpatlas %in% 37:40)`

An image data matrix with subjects in the rows and voxels in the
columns was generated by using the `simbrain`

function.

`img1 = simbrain(baseimg = baseimg, diffimg = diffimg2, sdevimg=sdevimg, mask=mask, n0=20, c1=0.01, sd1=0.05)`

The base image, the difference image and the standard deviation image were specified in the first three arguments. The out-of-brain region was specified by the mask argument, which was the binary image. The remaining arguments are the number of subjects per group, the coefficient multiplied by the difference image and the standard deviation for the noise.

The data matrix dimension was as follows.

`dim(img1$S)`

The `rec`

function creates the 3D image from the
vectorized data (the first subject).

`coat(rec(img1$S[1,], img1$imagedim, mask=img1$brainpos))`

The standard deviation image is created from the resulting data matrix.

```
sdimg = apply(img1$S, 2, sd)
coat(template, rec(sdimg, img1$imagedim, mask=img1$brainpos))
```

If the input is a matrix, a principal component analysis is
implemented by the `msma`

function of the `msma`

package. Principal component analysis with the number of components of 2
for the image data matrix is performed as follows.

`(fit111 = msma(img1$S, comp=2))`

The scatter plots for the score vectors specified by the argument
`v`

. The argument `axes`

is specified by the two
length vectors on which components are displayed.

`plot(fit111, v="score", axes = 1:2, plottype="scatter")`

The weight (loading) vectors can be obtained and reconstructed as follows.

```
midx = 1 ## the index for the modality
vidx = 1 ## the index for the component
Q = fit111$wbX[[midx]][,vidx]
outstat1 = rec(Q, img1$imagedim, mask=img1$brainpos)
```

The reconstructed loadings as image is overlayed on the template.

`coat(template, outstat1)`

The output is unclear; however, this will be improved later.

This is an example of the two-step dimension reduction.

Generate radial basis function.

```
B1 = rbfunc(imagedim=img1$imagedim, seppix=2, hispec=FALSE,
mask=img1$brainpos)
```

Multiplying the basis function to image data matrix.

`SB1 = basisprod(img1$S, B1)`

The original dimension was as follows.

`dim(img1$S)`

The dimension was reduced to as follows.

`dim(SB1)`

The Principal Component Analysis (PCA) is applied to the dimension-reduced image.

`(fit211 = msma(SB1, comp=2))`

The loading is reconstructed to the original space by using the
`rec`

function.

```
Q = fit211$wbX[[1]][,1]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
```

The plotted (sign-flipping) loading is smoother than the one without the dimension reduction by the basis function.

```
outstat2 = -outstat1
coat(template, outstat2)
```

If `lambdaX`

(>0) is specified, a sparse principal
component analysis is implemented.

`(fit112 = msma(SB1, comp=2, lambdaX=0.075))`

The plotted loading is narrower than that from the PCA.

```
Q = fit112$wbX[[midx]][,vidx]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
outstat2 = outstat1
coat(template, outstat2)
```

The region is reported as follows to be compared with the next method.

`atlastable(tmpatlas, outstat2, atlasdataset)`

The `simbrain`

generates the synthetic brain image data
and the binary outcome. The outcome Z is obtained.

`Z = img1$Z`

If the outcome Z is specified in the `msma`

function, a
supervised sparse principal component analysis is implemented.

`(fit113 = msma(SB1, Z=Z, comp=2, lambdaX=0.075, muX=0.5))`

The plotted loading is located differently from the sparse PCA.

```
Q = fit113$wbX[[1]][,1]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
outstat2 = -outstat1
coat(template, outstat2)
```

The region near the hippocampus, which differs from the sparse PCA (without supervision).

`atlastable(tmpatlas, outstat2, atlasdataset)`

The loading for the second component

```
Q = fit113$wbX[[1]][,2]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
outstat2 = -outstat1
coat(template, outstat2)
```

`atlastable(tmpatlas, outstat2, atlasdataset)`

This is similar to the result from the sparse PCA (without supervision).

The following method can be used to plot the weights of several
components simultaneously. It is first reconstructed in three dimensions
with the `multirec`

function and then plotted with the
`multicompplot`

function. It is set to display four columns
per component.

```
ws = multirec(fit113, imagedim=img1$imagedim, B=B1,
mask=img1$brainpos)
multicompplot(ws, template, col4comp=4)
```

In this section, we perform some experiments to show the characteristics of the methods in the previous section.

The basis function is parameterized by the radius, which is the
argument `seppix`

of the `rbfunc`

function. The
`msma`

function is applied.

```
seppixs = 2:7
fit115s = lapply(seppixs, function(sp){
B1 = rbfunc(imagedim=img1$imagedim, seppix=sp,
hispec=FALSE, mask=img1$brainpos)
SB1 = basisprod(img1$S, B1)
fit=msma(SB1, Z=Z, comp=2, lambdaX=0.075, muX=0.5)
list(fit=fit, B1=B1)
})
```

```
par(mfrow=c(2,3), mar=c(1,2,1,2))
for(i in 1:length(seppixs)){
Q = fit115s[[i]]$fit$wbX[[midx]][,vidx]
outstat1 = rec(Q, img1$imagedim, B=fit115s[[i]]$B1,
mask=img1$brainpos)
coat(template, -outstat1, pseq=10,color.bar=FALSE,
paron=FALSE, main=paste("seppix =", seppixs[i]))
}
```

With the larger value it is difficult to determine the region.

The iterative fits with different regularized parameters are
implemented specifying them using the argument `lambdaX`

.

```
lambdaXs = round(seq(0, 0.2, by=0.005), 3)
fit114s = lapply(lambdaXs, function(lam)
msma(SB1, Z=Z, comp=2, lambdaX=lam, muX=0.5, type="lasso") )
```

For some of the parameters, the sparsity can be controlled by \(\lambda\), as shown below.

```
lambdaXs2 = c(0, 0.025, 0.05, 0.075, 0.1, 0.15)
par(mfrow=c(2,3), mar=c(1,2,1,2))
for(i in which(lambdaXs %in% lambdaXs2)){
Q = fit114s[[i]]$wbX[[1]][,1]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
coat(template, -outstat1, pseq=10,color.bar=FALSE,
paron=FALSE, main=paste("lambda =", lambdaXs[i]))
}
```

The optimal value can be determined by the BIC.

```
nzwbXs = unlist(lapply(fit114s, function(x) x$nzwbX[2]))
BICs = unlist(lapply(fit114s, function(x) x$bic[2]))
(optlam = lambdaXs[which.min(BICs)])
(optnzw = nzwbXs[which.min(BICs)])
```

The plot of BIC values against the number of non-zero loadings and the \(\lambda\).

```
par(mfrow=c(1,2))
plot(lambdaXs, BICs, xlab="lambda", ylab="BIC")
abline(v=optlam, col="red", lty=2)
plot(nzwbXs, BICs, xlab="Number of non-zeros", ylab="BIC", log="x")
abline(v=optnzw, col="red", lty=2)
```

The optimal BIC value was \(\lambda\) which was relatively better among reconstructed loading images with different \(\lambda\)’s as shown above.

For the `msma`

function, four types of penalty functions
are currently available. These depend on parameters, “lasso” and “hard”
depend on one parameter, and “scad” and “mcp” depend on two
parameters.

```
penalties2 = c("lasso", "hard", "scad", "mcp")
etas = list(lasso=1, hard=1, scad=c(1, 3.7), mcp=c(2, 3))
```

Using the internal function `sparse`

in the
`mand`

package, we see how each penalty function performs the
transformation.

```
xs = seq(-6, 6, by=0.1)
par(mfrow=c(2,3), mar=c(2,2,3,2))
for(p1 in penalties2){
eta1 = etas[[p1]]
for(e1 in eta1){
sout1 = sparse(xs, 2, type=p1, eta=e1)
plot(xs, sout1, xlab="", ylab="",
main=paste(p1, "(eta =", e1, ")"), type="b")
}}
```

Next, we apply each penalty function in the `msma`

function to the brain image data.

```
par(mfrow=c(2,3), mar=c(1,2,1,2))
for(p1 in penalties2){
eta1 = etas[[p1]]
for(e1 in eta1){
fit = msma(SB1, Z=Z, comp=2, lambdaX=0.025, muX=0.5,
type=p1, eta=e1)
Q = fit$wbX[[midx]][,vidx]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
outstat2 = -outstat1
coat(template, outstat2, pseq=10, color.bar=FALSE,
paron=FALSE, main=paste(p1, "(eta =", e1, ")"))
}}
```

The number of components was set to be 30.

`fit114 = msma(SB1, Z=Z, comp=30, muX=0.5)`

The cumulative percentage of the explained variance (CPEV) is plotted
against the numbers of components. When the argument `v`

is
specified as “cpev”, the CPEVs are plotted.

```
plot(fit114, v="cpev")
abline(h=0.8, lty=2)
```

The function `ncompsearch`

searches for the optimal value
based on the criterion.

```
(ncomp1 = ncompsearch(SB1, Z=Z, muX=0.5,
comps = 50, criterion="BIC"))
```

The criterion can use not only BIC but also CV (Cross Validation).

```
(ncomp2 = ncompsearch(SB1, Z=Z, muX=0.5,
comps = c(1, seq(5, 30, by=5)), criterion="CV"))
```

```
par(mfrow=c(1,2))
plot(ncomp1)
plot(ncomp2)
```

For each component number from 1 to 5, the optimal \(\lambda\) was selected, and the number of non-zero loadings in each component was counted.

```
maxncomp = 5
opts = sapply(1:maxncomp, function(c1){
opt=regparasearch(SB1, Z=Z, comp=c1, muX=0.5)$optlambdaX
fit = msma(SB1, Z=Z, comp=c1, lambdaX=opt, muX=0.5)
nz = rep(NA, maxncomp)
nz[1:c1] = fit$nzwbX
c(c1, round(opt,3), nz)
})
```

```
opts1=t(opts)
colnames(opts1) = c("#comp", "lambda", paste("comp",1:maxncomp))
kable(opts1, "latex", booktabs = T)
```

From this result, it can be seen that when the number of components is large, a small non-zero loadings are selected for each component, and when the number of components is small, a large non-zero loadings are selected for each component.

```
(ncomp3 = ncompsearch(SB1, Z=Z, muX=0.5, lambdaX=0.075, comps = 30, criterion="BIC"))
plot(ncomp3)
```

Thus, there are several strategies for selecting the \(\lambda\) and the number of components. The
`msma`

package has an `optparasearch`

function and
the `search.method`

argument allows you to select one of four
methods.

The `regparaonly`

method searches for the regularized
parameters with a fixed number of components (set to be 5 and 10 as the
default).

`(opt11 = optparasearch(SB1, Z=Z, muX=0.5, comp=5, search.method = "regparaonly", criterion="BIC"))`

The optimized parameters are used in the `msma`

function
as follows.

`(fit311 = msma(SB1, Z=Z, muX=0.5, comp=opt11$optncomp, lambdaX=opt11$optlambdaX))`

The `regpara1st`

identifies the regularized parameters by
fixing the number of components, then searching for the number of
components with the selected regularized parameters. Note that the
default is to search up to 10 components.

```
(opt12 = optparasearch(SB1, Z=Z, muX=0.5, search.method = "regpara1st", criterion="BIC"))
fit312 = msma(SB1, Z=Z, muX=0.5, comp=opt12$optncomp, lambdaX=opt12$optlambdaX)
```

The `ncomp1st`

method identifies the number of components
with a regularized parameter of 0, then searches for the regularized
parameters with the selected number of components.

```
(opt13 = optparasearch(SB1, Z=Z, muX=0.5, search.method = "ncomp1st", criterion="BIC"))
fit313 = msma(SB1, Z=Z, muX=0.5, comp=opt13$optncomp, lambdaX=opt13$optlambdaX)
```

If the argument `criterion4ncomp`

is specified as
`criterion4ncomp="CV"`

, only the criteria for selecting the
number of components can be changed to CV.

The `simultaneous`

method identifies the number of
components by searching the regularized parameters in each
component.

```
(opt14 = optparasearch(SB1, Z=Z, muX=0.5, search.method = "simultaneous", criterion="BIC"))
fit314 = msma(SB1, Z=Z, muX=0.5, comp=opt14$optncomp, lambdaX=opt14$optlambdaX)
```

In this example, the results were the same except for
the`ncomp1st`

.

The NMF can be implemented by invoking the `nmf`

function
of the NMF library.

```
if(!require("NMF")) install.packages("NMF")
library(NMF)
```

NMF can be performed on a brain image data matrix with the number of components as 2 in the following way.

`res = nmf(SB1, 2)`

Using the `coef`

function, we can extract the weights and
plot them on the brain image using the `coat`

function in the
same way as when applying the `msma`

function.

```
Q = t(coef(res))[,1]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
coat(template, outstat1)
```

The ICA can be implemented by invoking the `icaimax`

function of the `ica`

library.

```
if(!require("ica")) install.packages("ica")
library(ica)
```

ICA using the information-maximization (Infomax) approach can be performed on a brain image data matrix with the number of components as 2 in the following way.

`imod = icaimax(SB1,2)`

In this case, the weights are extracted as follows and plotted on the brain image in the same way.

```
Q = imod$M[,1]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
coat(template, outstat1)
```

The package for drawing more informative dendrograms is loaded.

`library(dendextend)`

By inputting fit111, which is the result of PCA being applied to the
`hcmsma`

function by the `msma`

function,
hierarchical clustering with the score as input is executed.

`hcmsma111 = hcmsma(fit111)`

The next step is to plot and draw the dendrogram.

```
dend = as.dendrogram(hcmsma111$hcout)
d1 = color_branches(dend, k=4, groupLabels=TRUE)
labels_colors(d1) = Z[as.numeric(labels(d1))]+1
plot(d1)
```

The data was set to be divided into four clusters. The number at the bottom of the dendrogram is the subject number, where black is the control and red is the case.

It seems that clusters 1 and 2 have many cases, and cluster 4 also seems to have many cases. The number is summarized as the matrix with the case or control in the row and the cluster to belong in the column.

```
clus=cutree(d1, 4, order_clusters_as_data = FALSE)
clus=clus[as.character(1:length(clus))]
table(Z, clus)
```

As can be seen, the PCA score did not yield a cluster that completely bisected case and control.

Similarly, we performed clustering using the scores calculated from the sparse PCA.

```
hcmsma112 = hcmsma(fit112)
dend = as.dendrogram(hcmsma112$hcout)
d1 = color_branches(dend, k=4, groupLabels=TRUE)
labels_colors(d1) = Z[as.numeric(labels(d1))]+1
plot(d1)
clus=cutree(d1, 4, order_clusters_as_data = FALSE)
clus=clus[as.character(1:length(clus))]
table(Z, clus)
```

This result shows that case and control are more sparsely clustered.

Finally, we performed clustering with scores from the supervised sparse PCA.

```
hcmsma113 = hcmsma(fit113)
dend = as.dendrogram(hcmsma113$hcout)
d1 = color_branches(dend, k=4, groupLabels=TRUE)
labels_colors(d1) = Z[as.numeric(labels(d1))]+1
plot(d1)
clus=cutree(d1, 4, order_clusters_as_data = FALSE)
clus=clus[as.character(1:length(clus))]
table(Z, clus)
```

Because they are supervised by this case or control, the case and control are nicely clustered. This analysis indicates that the case is further divided into two clusters. For practical purposes, it may be useful for disease subtype classification.