This vignette shows how to use the package to sequentially refine a bundle of DGP emulators, each of which emulates an output of a simulator.

```
library(lhs)
library(ggplot2)
library(patchwork)
library(dgpsi)
```

We construct a synthetic simulator that has a one-dimensional input
between `[0, 1]`

and a three-dimensional output.

```
<- function(x) {
f = sin(30*((2*x-1)/2-0.4)^5)*cos(20*((2*x-1)/2-0.4))
y1 = 1/3*sin(2*(2*x - 1))+2/3*exp(-30*(2*(2*x-1))^2)+1/3
y2 = (sin(7.5*x)+1)/2
y3 return(cbind(y1, y2, y3))
}
```

Note that the function is defined in such a way that both its input
`x`

and output are matrices. The following figure shows the
true functional forms of the three outputs of the simulator over
`[0, 1]`

:

```
<- seq(0, 1, length = 200)
dense_x <- f(dense_x)
dense_y <- data.frame('x' = dense_x, 'y' = dense_y[,1])
output1 <- data.frame('x' = dense_x, 'y' = dense_y[,2])
output2 <- data.frame('x' = dense_x, 'y' = dense_y[,3])
output3 <- ggplot(data = output1, aes(x = x, y = y)) + geom_line(color = 'dodgerblue2') + ggtitle('Output 1') + theme(plot.title = element_text(size = 10))
p1 <- ggplot(data = output2, aes(x = x, y = y)) + geom_line(color = '#E31A1C') + ggtitle('Output 2') + theme(plot.title = element_text(size = 10))
p2 <- ggplot(data = output3, aes(x = x, y = y)) + geom_line(color = 'green4') + ggtitle('Output 3') + theme(plot.title = element_text(size = 10))
p3 wrap_plots(list(p1, p2, p3)) + plot_annotation(title = 'Synthetic Simulator')
```

We now specify a seed with `set_seed()`

from the package
for reproducibility

`set_seed(99)`

and generate an initial design with 5 design points using the maximin Latin hypercube sampler:

```
<- maximinLHS(5, 1)
X <- f(X) Y
```

We generate a validation dataset to track and stop the sequential design:

```
<- maximinLHS(200, 1)
validate_x <- f(validate_x) validate_y
```

Before we start the sequential design, we build three DGP emulators
that emulate the three outputs of the simulator `f`

independently:

`<- dgp(X, Y[,1], connect = F) m1 `

```
## Auto-generating a 2-layered DGP structure ... done
## Initializing the DGP emulator ... done
## Training the DGP emulator:
## Iteration 500: Layer 2: 100%|â–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆ| 500/500 [00:02<00:00, 220.72it/s]
## Imputing ... done
```

`<- dgp(X, Y[,2], connect = F) m2 `

```
## Auto-generating a 2-layered DGP structure ... done
## Initializing the DGP emulator ... done
## Training the DGP emulator:
## Iteration 500: Layer 2: 100%|â–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆ| 500/500 [00:02<00:00, 240.45it/s]
## Imputing ... done
```

`<- dgp(X, Y[,3]) m3 `

```
## Auto-generating a 2-layered DGP structure ... done
## Initializing the DGP emulator ... done
## Training the DGP emulator:
## Iteration 500: Layer 2: 100%|â–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆ| 500/500 [00:01<00:00, 413.38it/s]
## Imputing ... done
```

Note that we have the global connection turned off for the first two
DGP emulators because we found that this yields better emulation
performances. We then build a bundle of the three DGP emulators using
`pack()`

:

`<- pack(m1, m2, m3) m `

To start with the sequential design, we specify the limit of the input:

`<- c(0, 1) lim `

and set a target RMSE to stop the sequential design:

`<- 0.01 target `

Here we choose `0.01`

because it is equivalent to
`1%`

normalized error given that the ranges of both outputs
are `[0,1]`

. We can of course set different targets for
different outputs, e.g., by setting
`target <- c(0.005, 0.02, 0.01)`

.

We start the first-wave of the sequential design with 10 steps:

```
# 1st wave of the sequential design with 10 steps
<- design(m, N = 10, limits = lim, f = f, x_test = validate_x, y_test = validate_y, target = target) m
```

```
## Initializing ... done
## * RMSE: 0.521768 0.162249 0.010418
## Iteration 1:
## - Locating ... done
## * Next design point (Emulator1): 0.003421
## * Next design point (Emulator2): 0.003421
## * Next design point (Emulator3): 0.007783
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.476133 0.169318 0.005171
## Iteration 2:
## - Locating ... done
## * Next design point (Emulator1): 0.362879
## * Next design point (Emulator2): 0.995688
## * Next design point (Emulator3): None (target reached)
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.472015 0.201739 0.005171
##
## ...
##
## Iteration 10:
## - Locating ... done
## * Next design point (Emulator1): 0.033959
## * Next design point (Emulator2): 0.281797
## * Next design point (Emulator3): None (target reached)
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.067343 0.011225 0.005171
## Targets are not reached for all emulators at the end of the sequential design.
```

It can be seen that at the second step, the DGP emulator for the
third output has already reached the target, so for the rest of the
steps no further refinements (i.e., additions of design points to the
third DGP emulator) are performed. At the end of the first wave, the DGP
emulators for both the first and second outputs have not reached the
target yet. At this point, we can proceed to a second wave by repeating
the command above, but we show below an alternative way, in which we
define an aggregation function that aggregates criterion scores across
the three outputs such that the same design points are added to the
three emulators at each step (instead of different design points for
each emulator). We define the aggregation function `g`

that
aggregate scores by calculating their weighted average:

```
<- function(x, weight){
g 1] <- x[,1]*weight[1]
x[,2] <- x[,2]*weight[2]
x[,3] <- x[,3]*weight[3]
x[,return(rowSums(x))
}
```

Since the third emulator has already reached the target, we assign zero weights to it and weights of 0.8 and 0.2 to the first and second emulators respectively:

`<- c(0.8, 0.2, 0) weight `

We now pass both the aggregate function `g()`

and its
`weight`

argument to `design()`

for the second
wave of the sequential design with a further 15 steps:

```
# 2nd wave with 15 steps
<- design(m, N = 15, limits = lim, f = f, x_test = validate_x, y_test = validate_y, aggregate = g, target = 0.01, weight = weight) m
```

```
## Initializing ... done
## * RMSE: 0.067343 0.011225 0.005171
## Iteration 1:
## - Locating ... done
## * Next design point (Emulator1): 0.403398
## * Next design point (Emulator2): 0.403398
## * Next design point (Emulator3): None (target reached)
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.114102 0.007900 0.005171
## Iteration 2:
## - Locating ... done
## * Next design point (Emulator1): 0.472785
## * Next design point (Emulator2): None (target reached)
## * Next design point (Emulator3): None (target reached)
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.057033 0.007900 0.005171
##
## ...
##
## Iteration 9:
## - Locating ... done
## * Next design point (Emulator1): 0.049099
## * Next design point (Emulator2): None (target reached)
## * Next design point (Emulator3): None (target reached)
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.007011 0.007900 0.005171
## Target reached! The sequential design stops at step 9.
```

The first and the second emulators reached the target after iteration
9 and 1 of the second wave, respectively. The sequential designs of the
three emulators can be plotted with `draw()`

:

```
draw(m, 1, 'design') + draw(m, 2, 'design') + draw(m, 3, 'design') +
plot_layout() & theme(legend.position = 'bottom')
```

The figure above shows that for the first emulator most of the design points are added below 0.5 whilst for the second emulator most of the design points concentrate around 0.5. For the third emulator, the resulting design is space-filling. It can be seen that these design point distributions are consistent with the functional complexities of the three outputs.

We build three independent DGP emulators for the three outputs with static space-filling Latin hypercube designs (LHD) of size 10, 20, and 30 respectively:

```
# DGP emulators with a LHD of size 10
<- maximinLHS(10, 1)
X1 <- f(X1)
Y1 <- dgp(X1, Y1[,1], connect = F, verb = F)
m11 <- dgp(X1, Y1[,2], connect = F, verb = F)
m12 <- dgp(X1, Y1[,3], verb = F) m13
```

```
# DGP emulator with a LHD of size 20
<- maximinLHS(20, 1)
X2 <- f(X2)
Y2 <- dgp(X2, Y2[,1], connect = F, verb = F)
m21 <- dgp(X2, Y2[,2], connect = F, verb = F)
m22 <- dgp(X2, Y2[,3], verb = F) m23
```

```
# DGP emulator with a LHD of size 30
<- maximinLHS(30, 1)
X3 <- f(X3)
Y3 <- dgp(X3, Y3[,1], connect = F, verb = F)
m31 <- dgp(X3, Y3[,2], connect = F, verb = F)
m32 <- dgp(X3, Y3[,3], verb = F) m33
```

We then extract their RMSEs

```
# validations of the first DGP emulator
<- validate(m11, x_test = validate_x, y_test = validate_y[,1], verb = F)
m11 <- validate(m21, x_test = validate_x, y_test = validate_y[,1], verb = F)
m21 <- validate(m31, x_test = validate_x, y_test = validate_y[,1], verb = F)
m31 <- data.frame('N' = c(10, 20, 30), 'rmse' = c(m11$oos$rmse, m21$oos$rmse, m31$oos$rmse), 'LHD' = c('lhd-10', 'lhd-20', 'lhd-30'))
rmse_static_1 # validations of the second DGP emulator
<- validate(m12, x_test = validate_x, y_test = validate_y[,2], verb = F)
m12 <- validate(m22, x_test = validate_x, y_test = validate_y[,2], verb = F)
m22 <- validate(m32, x_test = validate_x, y_test = validate_y[,2], verb = F)
m32 <- data.frame('N' = c(10, 20, 30), 'rmse' = c(m12$oos$rmse, m22$oos$rmse, m32$oos$rmse), 'LHD' = c('lhd-10', 'lhd-20', 'lhd-30'))
rmse_static_2 # # validations of the third DGP emulator
<- validate(m13, x_test = validate_x, y_test = validate_y[,3], verb = F)
m13 <- validate(m23, x_test = validate_x, y_test = validate_y[,3], verb = F)
m23 <- validate(m33, x_test = validate_x, y_test = validate_y[,3], verb = F)
m33 <- data.frame('N' = c(10, 20, 30), 'rmse' = c(m13$oos$rmse, m23$oos$rmse, m33$oos$rmse), 'LHD' = c('lhd-10', 'lhd-20', 'lhd-30')) rmse_static_3
```

and add them to the sequential design validation plot (in log-scale) for comparisons:

```
<- draw(m, emulator = 1, type = 'rmse', log = T) +
p1 geom_point(data = rmse_static_1, mapping = aes(x = N, y = rmse, group = LHD, shape = LHD), color = '#E69F00', size = 1.5) +
scale_shape_manual(values = c(3, 4, 8))
<- draw(m, emulator = 2, type = 'rmse', log = T) +
p2 geom_point(data = rmse_static_2, mapping = aes(x = N, y = rmse, group = LHD, shape = LHD), color = '#E69F00', size = 1.5) +
scale_shape_manual(values = c(3, 4, 8))
<- draw(m, emulator = 3, type = 'rmse', log = T) +
p3 geom_point(data = rmse_static_3, mapping = aes(x = N, y = rmse, group = LHD, shape = LHD), color = '#E69F00', size = 1.5) +
scale_shape_manual(values = c(3, 4, 8))
+ p2 + p3 + plot_layout(guides = 'collect') & theme(legend.position = 'bottom') p1
```

It can be seen from the plot above that with the sequential design, emulators in the bundle can achieve higher or similar accuracy with smaller number of design points.

See `Sequential Design I`

for the sequential design of a DGP emulator on a 2D simulator.