Generation of candidates

Amin Bazaz, Heraldo Borges, Eduardo Ogasawara

2018-08-23

This document explains how candidates are generated. The dataset is first prepared, and the process is then applied to it. The end goal is to find the candidates and gather all the necessary information about them.

Presentation of the process

The process is composed of five steps:

  1. Normalization
  2. Symbolic Aggregation ApproXimation (SAX)
  3. Partitioning the spatial-time dataset into blocks
  4. Combination of spatial-time series in each block
  5. Generation of candidates

Description of each step

Normalization

This step is fundamental to ensure that all the data is on the same scale. The Z-score method is used for this normalization: the mean is subtracted from each value and the result is divided by the standard deviation.
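
As a minimal sketch of Z-score normalization, the function below rescales a small dataset stored as a list of rows. The name `zscore_normalize` is hypothetical, and computing one mean and one population standard deviation over the whole dataset (rather than per series) is an assumption, since the text does not specify it.

```python
from statistics import fmean, pstdev


def zscore_normalize(dataset):
    """Rescale every value to (value - mean) / std.

    Assumption: a single mean and population standard deviation are
    computed over the whole dataset, not per series.
    """
    flat = [value for row in dataset for value in row]
    mean = fmean(flat)
    std = pstdev(flat)
    if std == 0:
        # A constant dataset has no spread; return zeros instead of dividing by zero.
        return [[0.0 for _ in row] for row in dataset]
    return [[(value - mean) / std for value in row] for row in dataset]


# Rows are time steps, columns are spatial series (illustrative data only).
dataset = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
normalized = zscore_normalize(dataset)
```

After this step the values have mean 0 and standard deviation 1, which is what the Gaussian-based discretization of the next step expects.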

Symbolic Aggregation ApproXimation (SAX)

The observations of subsequences tend to be normally distributed. The discretization space is therefore defined over the Gaussian curve, which is split into intervals of equal probability. Each interval is assigned a letter, so the number of letters in the alphabet must be given in order to encode the values.
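
The encoding can be sketched as follows, assuming the values are already Z-score normalized and that each value is encoded individually (no averaging over windows is applied here). The names `sax_breakpoints` and `sax_encode` are hypothetical; the breakpoints are the quantiles of the standard normal distribution that cut it into intervals of equal probability.

```python
from bisect import bisect_right
from statistics import NormalDist
from string import ascii_lowercase


def sax_breakpoints(alphabet_size):
    """Cut points splitting the standard normal curve into equiprobable intervals."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return [standard_normal.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def sax_encode(values, alphabet_size):
    """Map each normalized value to the letter of the interval it falls in."""
    if not 2 <= alphabet_size <= len(ascii_lowercase):
        raise ValueError("alphabet_size must be between 2 and 26")
    breakpoints = sax_breakpoints(alphabet_size)
    # The number of breakpoints at or below a value is the index of its letter.
    return "".join(ascii_lowercase[bisect_right(breakpoints, value)] for value in values)


# With 3 letters the breakpoints are roughly -0.43 and 0.43.
encoded = sax_encode([-1.2, -0.1, 0.0, 0.3, 1.5], alphabet_size=3)
print(encoded)  # abbbc
```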

SAX Encoding with 3 letters

Partitioning the spatial-time dataset into blocks

This step divides the original dataset into blocks using the parameters Spatial Slice and Time Slice.
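
A possible way to picture the partitioning is sketched below. It assumes the dataset is a matrix whose rows are time steps and whose columns are spatial series, and that Time Slice and Spatial Slice are block sizes (rows and columns per block). Both points are assumptions, as are the function name `partition_into_blocks` and the choice to keep smaller blocks at the edges.

```python
def partition_into_blocks(dataset, time_slice, spatial_slice):
    """Split a matrix (rows = time, columns = series) into rectangular blocks.

    Assumption: time_slice and spatial_slice are the number of rows and
    columns per block; trailing blocks may be smaller than the others.
    """
    if time_slice <= 0 or spatial_slice <= 0:
        raise ValueError("slice sizes must be positive")
    n_rows = len(dataset)
    n_cols = len(dataset[0]) if dataset else 0
    blocks = []
    for row_start in range(0, n_rows, time_slice):
        for col_start in range(0, n_cols, spatial_slice):
            block = [
                row[col_start:col_start + spatial_slice]
                for row in dataset[row_start:row_start + time_slice]
            ]
            blocks.append(block)
    return blocks


# A 4 x 4 dataset of SAX letters (illustrative data only).
dataset = [list("abca"), list("bcab"), list("cabc"), list("abca")]
blocks = partition_into_blocks(dataset, time_slice=2, spatial_slice=2)
```

With these values the 4 x 4 dataset yields four 2 x 2 blocks, the first one being `[['a', 'b'], ['b', 'c']]`.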

Blocks creation

Combination of spatial-time series in each block

The goal of this step is to create one combined series per block: all the spatial-time series present in a block are merged into a single series.
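
The text does not state how the series are merged. The sketch below assumes the simplest option, concatenating the block's series one after another, each in time order. The function name `combine_block` is hypothetical, and the actual combination rule may differ.

```python
def combine_block(block):
    """Concatenate the series (columns) of a block into one string.

    Assumption: the series are appended one after another, each kept in
    time order; the original method may combine them differently.
    """
    columns = zip(*block)  # transpose: one tuple per spatial series
    return "".join(symbol for column in columns for symbol in column)


# A block with two time steps (rows) and two series (columns).
block = [["a", "b"], ["b", "c"]]
combined = combine_block(block)
print(combined)  # abbc
```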

Combine the spatial-time series in each block

Generation of candidates

Now we can run the motif discovery algorithm and find the candidates.
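
The motif discovery algorithm itself is not detailed here. As a rough illustration only, the sketch below treats every subsequence of a fixed length that occurs often enough in a combined series as a candidate, and records where it occurs. The function name `find_candidates` and the parameters `word_size` and `min_occurrences` are hypothetical, and windows that span the junction between two concatenated series are not filtered out.

```python
from collections import defaultdict


def find_candidates(combined, word_size, min_occurrences):
    """Return {subsequence: [start positions]} for frequent subsequences.

    Assumption: a candidate is any window of length word_size that occurs
    at least min_occurrences times in the combined series.
    """
    positions = defaultdict(list)
    for start in range(len(combined) - word_size + 1):
        positions[combined[start:start + word_size]].append(start)
    return {
        word: starts
        for word, starts in positions.items()
        if len(starts) >= min_occurrences
    }


candidates = find_candidates("abcabcba", word_size=3, min_occurrences=2)
print(candidates)  # {'abc': [0, 3]}
```

Keeping the start positions alongside each candidate is one way to gather the information needed to locate it later in its block.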

Application of motif discovery algorithm