This vignette presents a general overview of the *clugen*
algorithm. A complete description of the algorithm’s theoretical
framework is available in the article “Generating
multidimensional clusters with support lines” (an open version is available on arXiv).

*Clugen* is an algorithm for generating multidimensional
clusters. Each cluster is supported by a line segment, the position,
orientation and length of which guide where the respective points are
placed. For brevity, *line segments* will be referred to as
*lines*.
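
To make the idea of a supporting line concrete, the following minimal sketch computes the two endpoints of a line segment from its center, direction and length. The function name `support_line_endpoints` and its arguments are hypothetical and chosen for illustration only; they are not part of the `clugen()` interface.

```python
import math


def support_line_endpoints(center, direction, length):
    """Return the endpoints of a segment given its center, direction and length.

    Illustrative helper only (hypothetical name, not part of clugen).
    """
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0:
        raise ValueError("direction must be a non-zero vector")
    unit = [x / norm for x in direction]  # unit vector along the line
    half = length / 2
    start = [c - half * u for c, u in zip(center, unit)]
    end = [c + half * u for c, u in zip(center, unit)]
    return start, end


# A segment of length 10 centered at the origin and oriented along [1, 1].
start, end = support_line_endpoints([0.0, 0.0], [1.0, 1.0], 10.0)
```

The points of a cluster would then be placed around this segment, which is why its position, orientation and length determine the overall shape of the cluster.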

Given an \(n\)-dimensional direction
vector \(\mathbf{d}\) (and a number of
additional parameters, which will be discussed shortly), the
*clugen* algorithm works as follows (\(^*\) means the algorithm step is
stochastic):

- Normalize \(\mathbf{d}\).
- \(^*\)Determine cluster sizes.
- \(^*\)Determine cluster centers.
- \(^*\)Determine lengths of cluster-supporting lines.
- \(^*\)Determine angles between \(\mathbf{d}\) and cluster-supporting lines.
- For each cluster:
    - \(^*\)Determine direction of the cluster-supporting line.
    - \(^*\)Determine distance of point projections from the center of the cluster-supporting line.
    - Determine coordinates of point projections on the cluster-supporting line.
    - \(^*\)Determine points from their projections on the cluster-supporting line.
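
The steps above can be illustrated with a simplified, two-dimensional sketch that uses only the Python standard library. It is not the *clugen* implementation: the function name `clugen_sketch_2d`, its parameter names, and every distribution choice (normal cluster sizes, uniform centers, normal angles, uniform projections, perpendicular normal offsets) are assumptions made for illustration. The default values mirror the parameters listed for Figure 1.

```python
import math
import random


def clugen_sketch_2d(num_clusters=4, num_points=200, direction=(1.0, 1.0),
                     angle_disp=math.pi / 16, cluster_sep=(10.0, 10.0),
                     llength=10.0, llength_disp=1.5, lateral_disp=1.0,
                     seed=123):
    """Simplified 2D illustration of the listed steps (not the clugen() API)."""
    rng = random.Random(seed)

    # Normalize d.
    norm = math.hypot(direction[0], direction[1])
    d = (direction[0] / norm, direction[1] / norm)

    # Cluster sizes (assumption): normal around the mean size, then nudged
    # one point at a time until the sizes sum to num_points.
    mean_size = num_points / num_clusters
    sizes = [max(1, round(rng.gauss(mean_size, mean_size / 3)))
             for _ in range(num_clusters)]
    while sum(sizes) < num_points:
        sizes[sizes.index(min(sizes))] += 1
    while sum(sizes) > num_points:
        sizes[sizes.index(max(sizes))] -= 1

    # Cluster centers (assumption): uniform offsets scaled by the separation.
    centers = [tuple(num_clusters * s * (rng.random() - 0.5) for s in cluster_sep)
               for _ in range(num_clusters)]

    # Line lengths (assumption): absolute value of a normal draw.
    lengths = [abs(rng.gauss(llength, llength_disp)) for _ in range(num_clusters)]

    # Angles between d and each supporting line (assumption): normal around 0.
    angles = [rng.gauss(0.0, angle_disp) for _ in range(num_clusters)]

    points = []
    for k in range(num_clusters):
        # Direction of the supporting line: rotate d by the cluster's angle.
        ca, sa = math.cos(angles[k]), math.sin(angles[k])
        u = (ca * d[0] - sa * d[1], sa * d[0] + ca * d[1])
        perp = (-u[1], u[0])  # unit vector perpendicular to the line

        for _ in range(sizes[k]):
            # Distance of the projection from the line center (assumption: uniform).
            t = rng.uniform(-lengths[k] / 2, lengths[k] / 2)
            # Coordinates of the projection on the supporting line.
            proj = (centers[k][0] + t * u[0], centers[k][1] + t * u[1])
            # Final point: projection displaced perpendicular to the line.
            off = rng.gauss(0.0, lateral_disp)
            points.append((proj[0] + off * perp[0], proj[1] + off * perp[1], k))
    return points


points = clugen_sketch_2d()
```

Each returned tuple holds the two coordinates of a point and the index of its cluster. Fixing `seed` makes the stochastic steps reproducible, which is convenient when comparing parameter settings.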

Figure 1 provides a stylized overview of the algorithm’s steps.

The example in Figure 1 was generated with the following parameters:

Parameter values | Description |
---|---|
\(n=2\) | Number of dimensions. |
\(c=4\) | Number of clusters. |
\(p=200\) | Total number of points. |
\(\mathbf{d}=\begin{bmatrix}1 & 1\end{bmatrix}^T\) | Average direction. |
\(\theta_\sigma=\pi/16\approx{}11.25^{\circ}\) | Angle dispersion. |
\(\mathbf{s}=\begin{bmatrix}10 & 10\end{bmatrix}^T\) | Average cluster separation. |
\(l=10\) | Average line length. |
\(l_\sigma=1.5\) | Line length dispersion. |
\(f_\sigma=1\) | Cluster lateral dispersion. |

Additionally, all optional parameters (not listed above) were left at
their default values. The complete list of parameters is presented in
the `clugen()` function documentation.
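
As a quick sanity check on the parameter values listed for Figure 1, the short sketch below derives a few quantities from them: the normalized direction, the angle dispersion in degrees, and the mean number of points per cluster. The dictionary keys are hypothetical labels that follow the symbols in the table; they are not the argument names of `clugen()`.

```python
import math

# Parameter values listed for Figure 1 (keys are hypothetical labels).
params = {
    "n": 2,                      # number of dimensions
    "c": 4,                      # number of clusters
    "p": 200,                    # total number of points
    "d": [1.0, 1.0],             # average direction
    "theta_sigma": math.pi / 16, # angle dispersion, in radians
    "s": [10.0, 10.0],           # average cluster separation
    "l": 10.0,                   # average line length
    "l_sigma": 1.5,              # line length dispersion
    "f_sigma": 1.0,              # cluster lateral dispersion
}

# Unit vector along the average direction d.
norm = math.sqrt(sum(x * x for x in params["d"]))
d_unit = [x / norm for x in params["d"]]

# Angle dispersion converted from radians to degrees.
theta_deg = math.degrees(params["theta_sigma"])

# Mean number of points per cluster (actual sizes vary, since that step is stochastic).
mean_cluster_size = params["p"] / params["c"]
```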