dynamite: Bayesian Modeling and Causal Inference for Multivariate Longitudinal Data

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

The dynamite R package provides an easy-to-use interface for Bayesian inference of complex panel (time series) data, comprising multiple measurements on multiple individuals observed over time, via dynamic multivariate panel models (DMPM). The main features distinguishing the package and the underlying methodology from many other approaches are:

The dynamite package is developed with the support of Academy of Finland grant 331817 (PREDLIFE). For further information on DMPMs and the dynamite package, see the related arXiv and SocArXiv preprints.

Installation

You can install the most recent stable version of dynamite from CRAN or the development version from R-universe by running one of the following lines:

# Stable release from CRAN
install.packages("dynamite")
# Development version from R-universe
install.packages("dynamite", repos = "https://ropensci.r-universe.dev")

Example

A single-channel model with a time-invariant effect of z, time-varying effects of x and the lagged value of the response variable y, and group-specific random intercepts:

set.seed(1)
library(dynamite)
gaussian_example_fit <- dynamite(
  obs(y ~ -1 + z + varying(~ x + lag(y)) + random(~1), family = "gaussian") +
    splines(df = 20),
  data = gaussian_example, time = "time", group = "id",
  iter = 2000, chains = 2, cores = 2, refresh = 0
)
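
To make the structure of this formula concrete, the following base R sketch simulates data from a model of the same shape: a time-varying intercept, a fixed effect of z, time-varying effects of x and of the lagged response, and a group-specific random intercept. All parameter values and the names `sim`, `alpha`, `delta_x`, `delta_lag`, and `nu` are illustrative assumptions, not the values behind `gaussian_example`, and the simple coefficient paths used here only stand in for the spline-based effects that dynamite estimates.

```r
set.seed(1)
n_id <- 50    # number of groups
n_time <- 30  # number of time points

# Hypothetical parameter values (assumptions for illustration only)
beta_z <- 2                                         # time-invariant effect of z
alpha <- cumsum(rnorm(n_time, sd = 0.1))            # time-varying intercept
delta_x <- cumsum(rnorm(n_time, sd = 0.2))          # time-varying effect of x
delta_lag <- 0.4 + 0.1 * sin(seq_len(n_time) / 5)   # time-varying effect of lag(y)
nu <- rnorm(n_id, sd = 0.1)                         # group-specific intercepts
sigma <- 0.2                                        # residual standard deviation

# Long-format panel: time varies fastest within each id
sim <- expand.grid(time = seq_len(n_time), id = seq_len(n_id))
sim$x <- rnorm(nrow(sim))
sim$z <- rbinom(nrow(sim), size = 1, prob = 0.7)
sim$y <- NA_real_

for (i in seq_len(n_id)) {
  rows <- which(sim$id == i)
  y_prev <- 0  # assumed starting value for the lagged response
  for (t in seq_len(n_time)) {
    r <- rows[t]
    mu <- alpha[t] + nu[i] + beta_z * sim$z[r] +
      delta_x[t] * sim$x[r] + delta_lag[t] * y_prev
    sim$y[r] <- rnorm(1, mean = mu, sd = sigma)
    y_prev <- sim$y[r]
  }
}

head(sim)
```

A data frame like `sim` has the same long layout (one row per group and time point) that the `time` and `group` arguments of the call above refer to.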

Summary of the model:

gaussian_example_fit
#> Model:
#>   Family   Formula                                       
#> y gaussian y ~ -1 + z + varying(~x + lag(y)) + random(~1)
#> 
#> Correlated random effects added for response(s): y
#> 
#> Data: gaussian_example (Number of observations: 1450)
#> Grouping variable: id (Number of groups: 50)
#> Time index variable: time (Number of time points: 30)
#> 
#> Smallest bulk-ESS: 557 (sigma_nu_y_alpha)
#> Smallest tail-ESS: 1032 (sigma_nu_y_alpha)
#> Largest Rhat: 1.006 (alpha_y[28])
#> 
#> Elapsed time (seconds):
#>         warmup sample
#> chain:1  5.169  2.753
#> chain:2  4.897  1.763
#> 
#> Summary statistics of the time- and group-invariant parameters:
#> # A tibble: 6 × 10
#>   variable      mean median      sd     mad     q5   q95  rhat ess_bulk ess_tail
#>   <chr>        <num>  <num>   <num>   <num>  <num> <num> <num>    <num>    <num>
#> 1 beta_y_z    1.97   1.97   0.0121  0.0124  1.95   1.99  1.00     2122.    1385.
#> 2 sigma_nu_y… 0.0944 0.0938 0.0112  0.0113  0.0774 0.114 0.999     557.    1032.
#> 3 sigma_y     0.198  0.198  0.00368 0.00382 0.192  0.204 1.00     2169.    1398.
#> 4 tau_alpha_y 0.209  0.202  0.0497  0.0453  0.143  0.298 1.00     1237.    1419.
#> 5 tau_y_x     0.362  0.353  0.0674  0.0650  0.268  0.485 1.00     2177.    1670.
#> 6 tau_y_y_la… 0.106  0.103  0.0216  0.0206  0.0770 0.146 1.00     1936.    1144.
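
The first six numeric columns of this table are standard posterior summaries. As a rough sketch, assuming they follow the usual definitions (mean, median, standard deviation, scaled median absolute deviation, and the 5% and 95% quantiles), they can be reproduced in base R from a vector of draws. The vector `draws` below is a simulated stand-in, not output from the fitted model, and the multi-chain diagnostics (`rhat`, `ess_bulk`, `ess_tail`) are left out because they need the chain structure.

```r
set.seed(1)
# Hypothetical stand-in for the posterior draws of a single parameter
draws <- rnorm(2000, mean = 1.97, sd = 0.012)

posterior_summary <- c(
  mean   = mean(draws),
  median = median(draws),
  sd     = sd(draws),
  mad    = mad(draws),  # median absolute deviation, scaled by 1.4826
  q5     = quantile(draws, 0.05, names = FALSE),
  q95    = quantile(draws, 0.95, names = FALSE)
)

round(posterior_summary, 4)
```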

Posterior estimates of time-varying effects:

plot_deltas(gaussian_example_fit, scales = "free")
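
The time-varying effects are tied to the `splines(df = 20)` term in the model formula. The sketch below illustrates the general idea of a spline-based coefficient path, assuming a cubic B-spline basis whose coefficients follow a random walk; it uses the `splines` package that ships with R and is not a reproduction of dynamite's internal code. The names `B`, `omega`, and `tau` and their values are assumptions chosen for illustration.

```r
library(splines)  # part of the standard R distribution

set.seed(1)
time_points <- 1:30

# Cubic B-spline basis with 20 basis functions, mirroring df = 20
B <- bs(time_points, df = 20, intercept = TRUE)

# Hypothetical spline coefficients following a random walk;
# a smaller step size tau gives a smoother coefficient path
tau <- 0.3
omega <- cumsum(rnorm(ncol(B), sd = tau))

# Time-varying coefficient: one value per time point
delta <- as.vector(B %*% omega)

head(round(delta, 2))
```

Under this reading, parameters such as `tau_y_x` in the summary above control how much the corresponding effect is allowed to change over time.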

And group-specific intercepts:

plot_nus(gaussian_example_fit, groups = 1:10)

Traceplots and density plots:

plot(gaussian_example_fit, type = "beta")

Posterior predictive samples for the first 4 groups (samples based on the posterior distribution of the model parameters and the observed data at the first time point):

library(ggplot2)
pred <- predict(gaussian_example_fit, n_draws = 100)
pred |>
  dplyr::filter(id < 5) |>
  ggplot(aes(time, y_new, group = .draw)) +
  geom_line(alpha = 0.25) +
  # observed values
  geom_line(aes(y = y), colour = "tomato") +
  facet_wrap(~id) +
  theme_bw()
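
Instead of plotting every draw, the predictive samples can also be condensed into a mean and an interval per group and time point. The following base R sketch assumes a data frame with the columns used in the plot above (`id`, `time`, `.draw`, `y_new`); `toy_pred` is a simulated stand-in and `summarise_pred()` is a hypothetical helper, not a dynamite function.

```r
set.seed(1)
# Hypothetical stand-in with the same columns as the predictions plotted above
toy_pred <- expand.grid(.draw = 1:100, time = 1:30, id = 1:4)
toy_pred$y_new <- rnorm(nrow(toy_pred), mean = toy_pred$time / 10, sd = 0.2)

# Posterior predictive mean and 90% interval per group and time point
summarise_pred <- function(pred) {
  out <- aggregate(
    y_new ~ id + time,
    data = pred,
    FUN = function(v) {
      c(
        mean = mean(v),
        q5 = quantile(v, 0.05, names = FALSE),
        q95 = quantile(v, 0.95, names = FALSE)
      )
    }
  )
  # aggregate() stores the three statistics in a matrix column;
  # flatten it into ordinary columns
  cbind(out[c("id", "time")], as.data.frame(out$y_new))
}

pred_summary <- summarise_pred(toy_pred)
head(pred_summary)
```

Applied to real predictions, the same helper could be called as `summarise_pred(pred)`, provided `pred` contains those columns; rows with missing `y_new` are dropped by `aggregate()`.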

For more examples, see the package vignette and the blog post about dynamite.

Contributing

Contributions are very welcome; see CONTRIBUTING.md for general guidelines.