The STSMotifs package searches for motifs in spatial-time series. Its main purpose is to cope with large amounts of data, so the search is designed to be quick and efficient. Motifs are found with the Combined Series Approach (CSA). The process is decomposed into several steps, described below.
The functions of this package need the following inputs. The quality of the output depends strongly on these parameters.
dataset
: Data frame that contains numeric values. Columns represent space and rows represent time.
#> 1 2 3 4 5 6 7 8 9 10
#> 1 737 1350 869 750 1138 758 1006 1095 99 -83
#> 2 283 565 504 317 1849 944 -80 -895 -936 906
#> 3 -118 -375 -564 -803 870 472 -922 -1009 -698 741
#> 4 -696 -844 -654 -1303 -474 -591 -262 1034 1012 376
#> 5 -251 -622 -14 -587 -1108 -1401 404 1545 1696 247
#> 6 645 -10 -4 411 -858 -1261 -574 -329 -367 -680
alpha
: The size of the alphabet used to encode the numerical values into a string with SAX.
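To illustrate what alpha controls, here is a minimal sketch of the discretization step of SAX in base R. The helper `sax_encode` is hypothetical and is not part of the STSMotifs API. It assumes the values are z-normalized over the vector passed in and it skips the piecewise aggregation step of full SAX, so the package may differ in both respects.

```r
# Hypothetical helper for illustration only, not an STSMotifs function.
sax_encode <- function(values, alpha) {
  stopifnot(alpha >= 2, alpha <= 26)
  # z-normalize (assumption: normalization over the given vector)
  z <- (values - mean(values)) / sd(values)
  # Breakpoints that cut the standard normal into alpha equiprobable regions
  breaks <- qnorm(seq_len(alpha - 1) / alpha)
  # findInterval() returns 0..(alpha - 1); shift to index into letters
  letters[findInterval(z, breaks) + 1]
}

# First column of the example dataset above
x <- c(737, 283, -118, -696, -251, 645)
encoded <- sax_encode(x, alpha = 4)
paste(encoded, collapse = "")
```

A larger alpha gives a finer encoding, at the cost of fewer exact matches between subsequences.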
window_size
: The size of the subsequences.
tslice
and sslice
: The size of the blocks.
Part of the process is applied to blocks (subsets of the original dataset). With tslice ("time slice", the number of rows in each block) and sslice ("space slice", the number of columns in each block), the user specifies the size and shape of the blocks.
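The partitioning can be sketched in base R as follows. The helper `split_into_blocks` is hypothetical and is not an STSMotifs function. It assumes that trailing rows or columns that do not fill a complete block are dropped, which may not match the package.

```r
# Hypothetical helper for illustration only, not an STSMotifs function.
split_into_blocks <- function(dataset, tslice, sslice) {
  stopifnot(tslice <= nrow(dataset), sslice <= ncol(dataset))
  # First row and first column of every complete block
  row_starts <- seq(1, nrow(dataset) - tslice + 1, by = tslice)
  col_starts <- seq(1, ncol(dataset) - sslice + 1, by = sslice)
  blocks <- list()
  for (r in row_starts) {
    for (s in col_starts) {
      blocks[[length(blocks) + 1]] <-
        dataset[r:(r + tslice - 1), s:(s + sslice - 1), drop = FALSE]
    }
  }
  blocks
}

# Toy dataset with 6 time steps (rows) and 10 spatial positions (columns)
dataset <- as.data.frame(matrix(seq_len(60), nrow = 6, ncol = 10))
blocks <- split_into_blocks(dataset, tslice = 3, sslice = 5)
length(blocks)    # number of blocks
dim(blocks[[1]])  # rows and columns of the first block
```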
kappa
: Threshold on the minimum number of spatial occurrences of each motif.
sigma
: Threshold on the minimum number of global occurrences of each motif.
In this step, we use the tslice and sslice parameters to create blocks from the original dataset. With CSA, the columns of each block are combined into a single long series. The output of this step is described below.
See more at Generation of candidates
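As a rough sketch of the idea, the code below lays the columns of one SAX-encoded block end to end and slides a window of length window_size over the combined series. The helper `generate_candidates` and the column names of its result are hypothetical. Dropping windows that span two columns is an assumption of this sketch, not something stated here about the package.

```r
# Hypothetical helper for illustration only, not an STSMotifs function.
generate_candidates <- function(block, window_size) {
  block <- as.matrix(block)
  n_rows <- nrow(block)
  stopifnot(window_size <= n_rows)
  # Column-major flattening: the columns are laid end to end
  combined <- as.vector(block)
  starts <- seq_len(length(combined) - window_size + 1)
  # Row of each window start inside its own column
  row_in_col <- (starts - 1) %% n_rows + 1
  # Assumption: discard windows that would cross into the next column
  starts <- starts[row_in_col + window_size - 1 <= n_rows]
  words <- vapply(
    starts,
    function(s) paste(combined[s:(s + window_size - 1)], collapse = ""),
    character(1)
  )
  data.frame(
    word = words,
    column = (starts - 1) %/% n_rows + 1,
    row = (starts - 1) %% n_rows + 1,
    stringsAsFactors = FALSE
  )
}

# Toy block of SAX letters: 4 time steps, 2 spatial positions
block <- matrix(c("a", "b", "a", "b",
                  "b", "a", "b", "a"), nrow = 4)
candidates <- generate_candidates(block, window_size = 2)
candidates
```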
This step requires the candidates and the kappa and sigma thresholds. All the information about the candidates is extracted and processed. In the end, the candidates that satisfy both thresholds are stored in a list of motifs. Each motif carries the following information:
See more at Treatment of candidates
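The thresholding can be sketched as follows, assuming a candidate table with one row per occurrence. The helper `filter_candidates`, the columns `word`, `column` and `row`, and the fields of each motif are all hypothetical. The sketch also assumes that a motif is kept when its counts are greater than or equal to the thresholds.

```r
# Hypothetical helper for illustration only, not an STSMotifs function.
filter_candidates <- function(candidates, kappa, sigma) {
  by_word <- split(candidates, candidates$word)
  motifs <- lapply(by_word, function(occ) {
    list(
      word = occ$word[1],
      global = nrow(occ),                    # occurrences overall
      spatial = length(unique(occ$column)),  # distinct columns involved
      positions = occ[, c("column", "row")]
    )
  })
  keep <- vapply(
    motifs,
    function(m) m$global >= sigma && m$spatial >= kappa,
    logical(1)
  )
  motifs[keep]
}

# Toy candidates: "ab" occurs 3 times across 3 columns
candidates <- data.frame(
  word = c("ab", "ab", "ab", "ba", "cc"),
  column = c(1, 2, 3, 1, 1),
  row = c(1, 1, 2, 2, 3),
  stringsAsFactors = FALSE
)
motifs <- filter_candidates(candidates, kappa = 2, sigma = 3)
names(motifs)
```

Raising kappa favors motifs that spread across space, while raising sigma favors motifs that simply occur often.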
The previous step created a list of motifs with all the information about them. These motifs are now ranked by their global and spatial occurrences. The output has the same structure as that of the previous step, but it is ordered.
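A minimal ranking sketch is shown below. The exact ranking criterion is not specified here, so this example assumes a descending order on global occurrences with spatial occurrences as the tie-breaker. The helper `rank_motifs` and the motif fields are hypothetical.

```r
# Hypothetical helper for illustration only, not an STSMotifs function.
rank_motifs <- function(motifs) {
  global <- vapply(motifs, function(m) m$global, numeric(1))
  spatial <- vapply(motifs, function(m) m$spatial, numeric(1))
  # Assumption: most frequent first, ties broken by spatial spread
  motifs[order(-global, -spatial)]
}

motifs <- list(
  list(word = "ab", global = 3, spatial = 2),
  list(word = "ba", global = 5, spatial = 1),
  list(word = "cc", global = 3, spatial = 3)
)
ranked <- rank_motifs(motifs)
ranked_words <- vapply(ranked, function(m) m$word, character(1))
ranked_words
```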
There are three ways to visualize the result:
Plot the intensity of values and highlight one or the top five motifs.
Plot the spatial-time series by selecting a range of columns in the dataset and highlight one motif.
Run a Shiny application to explore the result through an interactive interface.
For an example of the output, see Output Example.