A plethora of latent Markov models, including **hidden Markov
models** (HMMs), **hidden semi-Markov models**
(HSMMs), **state space models** (SSMs), as well as
**continuous-time HMMs**, **continuous-time SSMs**, and
**Markov-modulated marked Poisson processes** (MMMPPs) can be formulated
and estimated within the same framework by directly maximizing the
(approximate) likelihood using the so-called **forward algorithm**
(for details, see Zucchini et al. 2016). While these models are immensely
popular for inference on time series driven by latent processes, their great
flexibility means that researchers in applied work often need to build highly
customized models for which standard software implementations are lacking, or
for which model construction in said software is as complicated as writing
fully tailored **R** code. The latter provides great flexibility and control,
but suffers from slow estimation speeds that make custom solutions inconvenient.
This **R** package addresses the above issues in two ways.
Standard blocks of code common to all these model classes, most
importantly the forward algorithm, are implemented as simple-to-use
functions. These can be added like Lego blocks to an otherwise fully
custom likelihood function, making it much easier to build fully custom
models. Moreover, under the hood, these functions are written in
**C++**, allowing for 10-20 times faster evaluation and thus drastically
speeding up estimation by numerical optimizers like `nlm()` or `optim()`.

Current implementations of the forward algorithm are:

- `forward()` for models with **homogeneous** transition probabilities,
- `forward_g()` for general (pre-calculated) **inhomogeneous** transition probabilities (including **continuous-time** HMMs and point processes; see the sketch below), and
- `forward_s()` for fitting **HSMMs**.
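
For orientation, the sketch below shows the kind of object the inhomogeneous variant works with: one transition matrix per time point, pre-calculated and stacked into an array. The array layout and the `forward_g()` call in the last comment are assumptions based on the description above, not documented signatures, so please check `?forward_g` before relying on them; the covariate `z` and all object names are purely illustrative.

```
# sketch: pre-calculating time-varying transition matrices for the inhomogeneous case
# (the N x N x n array layout and the forward_g() call are assumptions -- see ?forward_g)
n = 100; N = 2
z = runif(n) # some covariate driving the transition probabilities
Gamma_array = array(NA, dim = c(N, N, n))
for(t in 1:n){
  g12 = plogis(-2 + 1.5 * z[t]) # covariate-dependent switching probabilities
  g21 = plogis(-1.5 - 0.5 * z[t])
  Gamma_array[,,t] = matrix(c(1 - g12, g12,
                              g21, 1 - g21), nrow = 2, byrow = TRUE)
}
# llk = forward_g(delta, Gamma_array, allprobs) # assumed interface
```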

The functions are built to be included in the **negative
log-likelihood function**, after parameters have been transformed
and the *allprobs* matrix (containing all state-dependent
probabilities) has been calculated.

To serve as a powerful toolbox, this package also includes many auxiliary functions like

- the `tpm` family with
  - `tpm()` for calculating a homogeneous transition probability matrix via the multinomial logistic link,
  - `tpm_g()` for calculating general inhomogeneous transition probability matrices,
  - `tpm_p()` for calculating transition matrices of periodically inhomogeneous HMMs,
  - `tpm_cont()` for calculating the transition probabilities of a continuous-time Markov chain,
  - `tpm_hsmm()` for calculating the transition matrix of an HSMM-approximating HMM,
- the `stationary` family to compute stationary and periodically stationary distributions,
- functions of the `stateprobs` family for local decoding and of the `viterbi` family for global decoding, and
- `trigBasisExp()` for efficient computation of a trigonometric basis expansion.
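
As a quick illustration of the `tpm` family, the sketch below builds a 3-state transition matrix from unconstrained parameters. It assumes that `tpm()` accepts a vector of N(N-1) off-diagonal parameters for N > 2, analogous to the two-state call in the worked example further down; treat the call as an assumption and check `?tpm`, and note that the parameter values are made up for illustration.

```
library(LaMa)

# sketch: 3-state homogeneous t.p.m. via the multinomial logistic link
# (assumes tpm() takes a vector of N*(N-1) = 6 unconstrained parameters; see ?tpm)
eta = c(-2, -2.5, -1.5, -2, -2.5, -1.5) # unconstrained off-diagonal parameters
Gamma3 = tpm(eta)            # 3 x 3 transition probability matrix, rows sum to 1
delta3 = stationary(Gamma3)  # implied stationary distribution
```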

Further functionalities will be added as needed. Have fun!

To aid in building fully custom likelihood functions, this package also contains several vignettes that show how to simulate data from and estimate a wide range of models:

- Introduction to LaMa
- Inhomogeneous HMMs
- Longitudinal data
- Periodic HMMs
- State space models
- Continuous-time HMMs
- Hidden semi-Markov models
- Markov-modulated (marked) Poisson processes

To install and use the package, you need to have a functional C++ compiler. Then you can use:

```
# install.packages("devtools")
devtools::install_github("janoleko/LaMa", build_vignettes = TRUE)
```

Feel free to use `browseVignettes("LaMa")` or `help(package = "LaMa")` for detailed examples on how to use the package.

```
library(LaMa)
```

Here we can use `stationary()` to compute the stationary distribution.

```
# parameters
mu = c(0, 6)
sigma = c(2, 4)
Gamma = matrix(c(0.95, 0.05, 0.15, 0.85), nrow = 2, byrow = TRUE)
delta = stationary(Gamma) # stationary HMM

# simulation
n = 10000 # rather large
set.seed(123)
s = x = rep(NA, n)
s[1] = sample(1:2, 1, prob = delta)
x[1] = rnorm(1, mu[s[1]], sigma[s[1]])
for(t in 2:n){
  s[t] = sample(1:2, 1, prob = Gamma[s[t-1],])
  x[t] = rnorm(1, mu[s[t]], sigma[s[t]])
}

plot(x[1:200], bty = "n", pch = 20, ylab = "x",
     col = c("orange","deepskyblue")[s[1:200]])
```

Here, we build the transition probability matrix using the `tpm()` function, compute the stationary distribution using `stationary()`, and calculate the log-likelihood using `forward()` in the last line.

```
mllk = function(theta.star, x){
  # parameter transformations for unconstrained optimization
  Gamma = tpm(theta.star[1:2])
  delta = stationary(Gamma) # stationary HMM
  mu = theta.star[3:4]
  sigma = exp(theta.star[5:6])
  # calculate all state-dependent probabilities
  allprobs = matrix(1, length(x), 2)
  for(j in 1:2){ allprobs[,j] = stats::dnorm(x, mu[j], sigma[j]) }
  # return negative for minimization
  -forward(delta, Gamma, allprobs)
}
```

```
theta.star = c(-1,-1,1,4,log(1),log(3)) # initial transformed parameters: not chosen too well

s = Sys.time()
mod = nlm(mllk, theta.star, x = x)
Sys.time()-s
#> Time difference of 0.1056249 secs
```

Really fast for 10,000 data points!
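
Since `mllk()` is just an ordinary **R** function, the base-**R** optimizer `optim()` mentioned above works in exactly the same way; a minimal sketch:

```
# fit the same model with optim() instead of nlm()
mod2 = optim(theta.star, mllk, x = x, method = "BFGS")
mod2$par # estimated parameters on the unconstrained scale
```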

Again, we use `tpm()` and `stationary()` to transform the unconstrained parameters back to the natural parameters.

```
# transform parameters to natural scale
Gamma = tpm(mod$estimate[1:2])
delta = stationary(Gamma) # stationary HMM
mu = mod$estimate[3:4]
sigma = exp(mod$estimate[5:6])

hist(x, prob = TRUE, border = "white", breaks = 40, main = "")
curve(delta[1]*dnorm(x, mu[1], sigma[1]), add = TRUE, lwd = 2, col = "orange", n = 500)
curve(delta[2]*dnorm(x, mu[2], sigma[2]), add = TRUE, lwd = 2, col = "deepskyblue", n = 500)
curve(delta[1]*dnorm(x, mu[1], sigma[1]) + delta[2]*dnorm(x, mu[2], sigma[2]),
      add = TRUE, lwd = 2, lty = "dashed", n = 500)
legend("topright", col = c("orange", "deepskyblue", "black"), lwd = 2, bty = "n",
       lty = c(1,1,2), legend = c("state 1", "state 2", "marginal"))
```
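
A natural next step is to decode the most likely state sequence under the fitted model using the `viterbi` family mentioned above. The sketch below assumes the call signature `viterbi(delta, Gamma, allprobs)`, mirroring `forward()`; this is an assumption, so check `?viterbi` before use.

```
# recompute the state-dependent probabilities under the fitted model
allprobs = matrix(1, length(x), 2)
for(j in 1:2){ allprobs[,j] = dnorm(x, mu[j], sigma[j]) }

# globally decode the most likely state sequence
# (viterbi(delta, Gamma, allprobs) is an assumed interface -- see ?viterbi)
states = viterbi(delta, Gamma, allprobs)

# compare decoded states to the simulated ones
mean(states == s)
```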