--- title: "F. Complex multiblock analysis" output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{F. Complex multiblock analysis} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: console --- ```{r setup, include=FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width=6, fig.height=4 ) # Legge denne i YAML på toppen for å skrive ut til tex #output: # pdf_document: # keep_tex: true # Original: # rmarkdown::html_vignette: # toc: true ``` ```{r} # Start the multiblock R package library(multiblock) ``` # Complex data structures The following methods for complex data structures are available in the _multiblock_ package (function names in parentheses): * L-PLS - Partial Least Squares in L configuration (_lpls_) * SO-PLS-PM - Sequential and Orthogonalised PLS Path Modeling (_sopls_pm_) ## L-PLS To showcase L-PLS we will use simulated data specifically made for L-shaped data. Regression using L-PLS can be either outwards from _X1_ to _X2_ and _X3_ or inwards from _X2_ and _X3_ to _X1_. In the former case, prediction can either be of _X2_ or _X3_ given _X1_. Cross-validation is performed either on the rows of _X1_ or the columns of _X1_. ```{} ______N | | | | | X3 | | | K|_______| ______N ________J | | | | | | | | | X1 | | X2 | | | | | I|_______| I|_________| ``` ## Simulated L-shaped data We simulate two latent components in L shape with blocks having dimensions (30x20), (20x5) and (6x20) for blocks _X1_, _X2_ and _X3_, respectively. ```{r} set.seed(42) # Simulate data set sim <- lplsData(I = 30, N = 20, J = 5, K = 6, ncomp = 2) # Split into separate blocks X1 <- sim$X1; X2 <- sim$X2; X3 <- sim$X3 ``` ## Exo-L-PLS The first L-PLS will be outwards. Predictions have to be accompanied by a direction. ```{r fig.width=5, fig.height=5} # exo-L-PLS: lp.exo <- lpls(X1,X2,X3, ncomp = 2) # type = "exo" is default # Predict X1 pred.exo.X2 <- predict(lp.exo, X1new = X1, exo.direction = "X2") # Predict X3 pred.exo.X2 <- predict(lp.exo, X1new = X1, exo.direction = "X3") # Correlation loading plot plot(lp.exo) ``` ## Endo-L-PLS The second L-PLS will be inwards. ```{r} # endo-L-PLS: lp.endo <- lpls(X1,X2,X3, ncomp = 2, type = "endo") # Predict X1 from X2 and X3 (in this case fitted values): pred.endo.X1 <- predict(lp.endo, X2new = X2, X3new = X3) ``` ## L-PLS cross-validation Cross-validation comes with choices of directions when applying this to L-PLS since we have both sample and variable links. The cross-validation routines compute RMSECV values and perform cross-validated predictions. ```{r} # LOO cross-validation horizontally lp.cv1 <- lplsCV(lp.exo, segments1 = as.list(1:dim(X1)[1]), trace = FALSE) # LOO cross-validation vertically lp.cv2 <- lplsCV(lp.exo, segments2 = as.list(1:dim(X1)[2]), trace = FALSE) # Three-fold CV, horizontal lp.cv3 <- lplsCV(lp.exo, segments1 = as.list(1:10, 11:20, 21:30), trace = FALSE) # Three-fold CV, horizontal, inwards model lp.cv4 <- lplsCV(lp.endo, segments1 = as.list(1:10, 11:20, 21:30), trace = FALSE) ``` ## SO-PLS Path Modelling The following example uses the _potato_ data and the _wine_ data to showcase some of the functions available for SO-PLS-PM analyses. ### Single SO-PLS-PM model A model with four blocks having 5 components per input block is fitted. We set _computeAdditional_ to _TRUE_ to turn on computation of additional explained variance per added block in the model. ```{r} # Load potato data data(potato) # Single path pot.pm <- sopls_pm(potato[1:3], potato[['Sensory']], c(5,5,5), computeAdditional=TRUE) # Report of explained variances and optimal number of components . # Bootstrapping can be enabled to assess stability. # (LOO cross-validation is default) pot.pm ``` ### Multiple paths in an SO-PLS-PM model A model containing five blocks is fitted. Explained variances for all sub-paths are estimated. ```{r} # Load wine data data(wine) # All path in the forward direction pot.pm.multiple <- sopls_pm_multiple(wine, ncomp = c(4,2,9,8)) # Report of direct, indirect and total explained variance per sub-path. # Bootstrapping can be enabled to assess stability. pot.pm.multiple ```