The purpose of this documentation is to explain how candidates are generated. Before the process is applied, the dataset must be prepared. After that, the process can be started. The end goal is to find the candidates and gather all the necessary information.
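The process is composed of five steps: normalization, SAX encoding, block creation, combination of the series in each block, and motif discovery.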
Normalization is fundamental to ensure that the data are on the same scale. The Z-score method is used for this normalization.
head(STMotif::example_dataset[,1:10])
#> 1 2 3 4 5 6 7 8 9 10
#> 360 737 1350 869 750 1138 758 1006 1095 99 -83
#> 361 283 565 504 317 1849 944 -80 -895 -936 906
#> 362 -118 -375 -564 -803 870 472 -922 -1009 -698 741
#> 363 -696 -844 -654 -1303 -474 -591 -262 1034 1012 376
#> 364 -251 -622 -14 -587 -1108 -1401 404 1545 1696 247
#> 365 645 -10 -4 411 -858 -1261 -574 -329 -367 -680
head(round(STSNormalization(vector = as.matrix(STMotif::example_dataset)),digits = 2)[,1:10])
#> 1 2 3 4 5 6 7 8 9 10
#> 360 0.21 0.39 0.25 0.21 0.33 0.22 0.29 0.32 0.02 -0.04
#> 361 0.07 0.16 0.14 0.08 0.54 0.27 -0.04 -0.28 -0.29 0.26
#> 362 -0.05 -0.12 -0.18 -0.25 0.25 0.13 -0.29 -0.31 -0.22 0.21
#> 363 -0.22 -0.26 -0.21 -0.40 -0.15 -0.19 -0.09 0.30 0.29 0.10
#> 364 -0.09 -0.20 -0.02 -0.19 -0.34 -0.43 0.11 0.45 0.50 0.06
#> 365 0.18 -0.01 -0.01 0.11 -0.27 -0.39 -0.18 -0.11 -0.12 -0.22
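The Z-score computation itself can be sketched in base R. This is a minimal sketch, not the implementation of `STSNormalization`. It assumes a single mean and standard deviation are computed over every cell of the matrix, which appears consistent with the output above but is not confirmed here. The function name `zscore_global` and the toy matrix are made up for illustration.

```r
# Minimal sketch of a global Z-score; not the package's implementation.
# Assumption: one mean and one standard deviation are computed over all
# cells of the matrix, rather than per column.
zscore_global <- function(m) {
  (m - mean(m)) / sd(as.vector(m))
}

# Small made-up matrix: rows are time steps, columns are spatial series.
toy <- matrix(c(2, 4, 4, 4, 5, 5, 7, 9), nrow = 4)
normalized <- zscore_global(toy)

# After normalization the values have mean 0 and standard deviation 1.
print(round(mean(normalized), 10))
print(round(sd(as.vector(normalized)), 10))
```

Under this assumption every series is rescaled by the same factor, so values remain comparable across columns.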
The observations of subsequences tend to be normally distributed. Therefore, the value space is discretized into intervals of equal probability under the Gaussian curve. To encode the values, the number of letters in the alphabet must be specified.
SAX Encoding with 3 letters
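The equal-probability discretization can be illustrated in base R. This is a minimal sketch under the assumption that the input is already Z-score normalized and that each value is mapped directly to one letter. The function name `sax_encode` and its argument `a` (alphabet size) are illustrative and not taken from the package.

```r
# Minimal SAX-style encoding sketch; assumes x is already Z-score normalized.
sax_encode <- function(x, a) {
  # a - 1 interior breakpoints split N(0, 1) into a equiprobable intervals
  breakpoints <- qnorm(seq_len(a - 1) / a)
  # findInterval returns 0..(a - 1); shift by 1 to index into letters
  letters[findInterval(x, breakpoints) + 1]
}

# With 3 letters the breakpoints are roughly -0.43 and 0.43.
encoded <- sax_encode(c(-1.2, -0.1, 0.05, 0.9), a = 3)
print(encoded)
#> [1] "a" "b" "b" "c"
```

Because the breakpoints are Gaussian quantiles, each letter is expected to occur about equally often when the data are close to normally distributed.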
Block creation divides the original dataset into blocks using the parameters Spatial Slice and Time Slice.
Blocks creation
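A block partition of this kind can be sketched in base R. The sketch assumes that rows are time steps and columns are spatial series, and that incomplete blocks at the edges are discarded. Neither assumption is confirmed by the text above. The names `create_blocks`, `tb` (time slice) and `sb` (spatial slice) are used here for illustration only.

```r
# Minimal block-partition sketch. Assumptions: rows are time steps, columns
# are spatial series, and incomplete blocks at the edges are dropped.
create_blocks <- function(m, tb, sb) {
  blocks <- list()
  for (r in seq_len(nrow(m) %/% tb)) {
    for (s in seq_len(ncol(m) %/% sb)) {
      rows <- ((r - 1) * tb + 1):(r * tb)
      cols <- ((s - 1) * sb + 1):(s * sb)
      # drop = FALSE keeps the block as a matrix even if it has one column
      blocks[[length(blocks) + 1]] <- m[rows, cols, drop = FALSE]
    }
  }
  blocks
}

toy <- matrix(1:24, nrow = 6)  # 6 time steps, 4 spatial series
blocks <- create_blocks(toy, tb = 3, sb = 2)
print(length(blocks))  # 2 time slices x 2 spatial slices
#> [1] 4
```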
The goal of the combination step is to create one combined series from each block: the spatial-time series present in the block are transformed into a single series.
Combine the spatial-time series into each block
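One plausible way to build such a combined series is sketched below. It assumes that combining means appending the series of a block one after another, which the text above does not state explicitly. The helper `combine_block` and the `origin` table are hypothetical names.

```r
# Assumption: combining means concatenating the block's series (columns)
# one after another into a single vector.
combine_block <- function(block) {
  as.vector(block)  # column-major order: series 1, then series 2, ...
}

# Made-up block of SAX letters: 3 time steps, 2 spatial series.
block <- matrix(c("a", "b", "c", "c", "b", "a"), nrow = 3)
combined <- combine_block(block)

# Record where each element came from, so that a match in the combined
# series can be traced back to its series and time step.
origin <- data.frame(series = as.vector(col(block)),
                     time = as.vector(row(block)))

print(combined)
#> [1] "a" "b" "c" "c" "b" "a"
```

Keeping the `origin` table alongside the combined series is one way to recover the spatial and temporal position of each occurrence later.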
Now we can run the motif discovery algorithm and find the candidates.
Application of motif discovery algorithm
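The idea of finding candidates in an encoded series can be illustrated with a brute-force count of repeated words. This is a simplified stand-in, not the package's motif discovery algorithm. The function `find_candidates` and its parameters `w` (word length) and `min_count` (occurrence threshold) are hypothetical.

```r
# Brute-force sketch: count identical words of length w in an encoded series
# and keep those that occur at least min_count times.
find_candidates <- function(encoded, w, min_count) {
  starts <- seq_len(max(0, length(encoded) - w + 1))
  # Build the word that begins at every start position
  words <- vapply(starts,
                  function(i) paste(encoded[i:(i + w - 1)], collapse = ""),
                  character(1))
  counts <- table(words)
  frequent <- names(counts)[counts >= min_count]
  # For each candidate word, keep the start positions of its occurrences
  lapply(setNames(frequent, frequent), function(word) starts[words == word])
}

combined <- c("a", "b", "c", "a", "b", "c", "b", "a", "b", "c")
candidates <- find_candidates(combined, w = 3, min_count = 2)
print(candidates)
#> $abc
#> [1] 1 4 8
```

In a series built by concatenation, a window that spans the boundary between two original series does not correspond to a real subsequence, so a fuller implementation would need to filter such windows out.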