This vignette creates a detailed link between the methods described in the paper
Herla, F., Horton, S., Mair, P., and Haegeli, P.: Snow profile alignment and similarity assessment for aggregating, clustering, and evaluating of snowpack model output for avalanche forecasting, Geosci. Model Dev., https://doi.org/10.5194/gmd-14-239-2021, 2021.
and this companion R package. While the basic workflow and the use of the high-level functions are described in the vignette Basic workflow, this vignette describes how the (default) workflow of the package can be altered and how the methods could be improved.
The individual steps of aligning snow profiles—taken from the
documentation of the core function
resamplingRate = NA) (cf.,
resampleSPpairs). This approach handles the profiles with
minimum layer storage, but leads to skewed profile alignments since the
algorithm cannot keep track of different layer thicknesses.
dtw (eponymous R package)
simSP) will be returned.
A series of functions exist that manipulate the snow profiles prior
to the alignment or the similarity assessment. All these manipulations
can be controlled in the arguments to
Rescaling and resampling
scaleSnowHeight: Scale the total snow height of a
profile with a uniform scaling factor (commonly determined from the
height of a second ‘reference’ profile). While there certainly are
justified scenarios for this approach, it is no longer used by
default. In most cases, it is advisable to align the profiles on their
native height grids without rescaling them.
resampleSP: Resample an individual profile
reScaleSampleSPx: Both rescale and resample a set of
profiles to identical snow heights and onto a regular grid
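The two manipulations listed above can be illustrated with a minimal base R sketch. It does not call the package functions: `scale_height` and `resample_profile` are hypothetical stand-ins, the profile is represented as a plain data frame of layer-top heights rather than the package's profile objects, and the numbers are made up for illustration.

```r
# Hypothetical profile: top height of each layer (cm above ground) and its hardness
profile <- data.frame(
  height   = c(20, 45, 50, 90),
  hardness = c(4, 3, 1, 2)
)

# Uniform scaling: stretch or compress all layer heights by one common factor
# so that the total snow height matches a reference height
scale_height <- function(p, ref_height) {
  p$height <- p$height * ref_height / max(p$height)
  p
}

# Resampling: place grid points every `rate` cm and let each grid point
# inherit the properties of the layer that contains it
resample_profile <- function(p, rate) {
  grid <- seq(rate, max(p$height), by = rate)
  # index of the layer whose top is the first one at or above each grid point
  idx <- findInterval(grid, p$height, left.open = TRUE) + 1
  data.frame(height = grid, hardness = p$hardness[idx])
}

scaled    <- scale_height(profile, ref_height = 120)
resampled <- resample_profile(scaled, rate = 5)
```

Note how the thin third layer (originally 5 cm) survives as a single grid point after resampling; a coarser grid could drop it entirely, which is one reason the choice of sampling rate matters.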
Reducing the number of layers
mergeIdentLayers: Merge adjacent layers that have
rmZeroThicknessLayers: Remove or reset layers of zero
thickness (i.e., these layers originate from warping one profile onto
Computing a local cost matrix is fundamental to DTW alignments and is
carried out in
Assessing the differences of individual snow layers
Currently, distance functions are implemented for the layer characteristics grain type, hardness, and deposition date. The distance function for categorical grain types relies on a matrix that stores the distances between different categories. Since the similarity requirements are slightly different for aligning profiles versus assessing their similarity, two matrices are implemented:
grainSimilarity_align (Table 1A in the paper)
grainSimilarity_evaluate (Table 1B in the paper)
swissSimilarityMatrix (grain type similarity matrix
defined by Lehning et al., 2001)
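A matrix-based distance function for a categorical variable can be sketched as a simple lookup. The example below is only an illustration: the similarity values are invented and are not those of `grainSimilarity_align` or `grainSimilarity_evaluate`, the helper `grain_distance` is hypothetical, and converting similarity to distance as one minus similarity is an assumption of this sketch.

```r
# Illustrative similarity values only (NOT the values stored in the package)
gt <- c("PP", "DF", "FC", "DH")
sim <- matrix(c(1.0, 0.8, 0.3, 0.1,
                0.8, 1.0, 0.4, 0.2,
                0.3, 0.4, 1.0, 0.8,
                0.1, 0.2, 0.8, 1.0),
              nrow = 4, byrow = TRUE, dimnames = list(gt, gt))

# Look up the similarity for every combination of query layer (rows) and
# reference layer (columns), then convert it into a distance in [0, 1]
grain_distance <- function(query, ref, sim) {
  1 - sim[query, ref, drop = FALSE]
}

query <- c("PP", "FC", "DH")   # grain types of the query profile layers
ref   <- c("DF", "FC")         # grain types of the reference profile layers
d_gt  <- grain_distance(query, ref, sim)
```

Swapping in a different matrix changes which grain types are treated as close, which is why separate matrices for alignment and for similarity assessment can be useful.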
Computing a local cost matrix
First, a distance matrix is computed for each included layer characteristic; it stores the distances between all combinations of individual layers from the two profiles. These distance matrices are then combined into one resulting distance matrix (i.e., the local cost matrix) by weighted averaging. Optionally, a preferential layer matching manipulation can be incorporated into the local cost matrix.
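The weighted averaging step can be sketched in a few lines of base R. The distance matrices, the weights, and the helper `combine_distances` below are hypothetical and chosen purely for illustration; the package's own implementation and default weights may differ.

```r
# Hypothetical distance matrices for two layer characteristics
# (3 query layers in rows, 2 reference layers in columns, values in [0, 1])
d_grain    <- matrix(c(0.2, 0.6, 0.9,
                       0.7, 0.0, 0.2), nrow = 3)
d_hardness <- matrix(c(0.00, 0.50, 1.00,
                       0.25, 0.25, 0.50), nrow = 3)

# Weighted average of several distance matrices of identical dimensions
combine_distances <- function(mats, weights) {
  stopifnot(length(mats) == length(weights))
  weights <- weights / sum(weights)      # normalize so the weights sum to 1
  Reduce(`+`, Map(`*`, mats, weights))   # sum of weight * matrix
}

# Assumed weights: 60 % grain type, 40 % hardness
local_cost <- combine_distances(list(d_grain, d_hardness), c(0.6, 0.4))
```

Because the weights are normalized, the local cost stays within the range of the input distances, so entries remain comparable when the set of included characteristics changes.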
In distMatSP all parameters related to the local cost
matrix can be controlled, e.g.
solely based on grain type information. Note that this approach should
not be applied when the profiles are matched with deposition date
information that is only available for a few select layers
Obtaining the optimal alignment of pairs of snow profiles is the central
task of this package. All functions and controls from Sections 1.1
and 1.2 above can be modified in the call to the core function
dtwSP. Additional controls are, e.g.
Partial alignments (i.e.,
open.end) can be started from
the snow surface downwards (
top.down) or from the ground
bottom.up). When the function is called to align
the profiles with multiple different boundary conditions (global,
top-down, bottom-up), the alignment that yields the highest similarity
While DTW computes the matching between the layers, the actual
alignment is carried out with a warping function
Since that warping is different for bottom-up and top-down alignments,
and it is different for the two profiles,
solutions for most combinations.
While the hyperparameters and alignment approaches can be tuned to optimize the alignment of select pairs of profiles in a supervised manner, the algorithm will likely be applied unsupervised to a large set of profiles, either to obtain similarity results or to aggregate/cluster a large amount of information. While the default alignment settings produce good results in most cases, large and highly diverse data sets will inevitably contain cases that are not handled well. If you align large numbers of profiles, make sure you understand the diversity of conditions in your data set and in which situations the alignment algorithm works well and in which it does not. As a general example, poor layer matches are more likely if the snow depths of the two profiles differ considerably or if their layer sequences show very few common patterns.