Longitudinal Models in missingHE

Currently, only longitudinal selection models can be fitted using missingHE, for which the package provides a series of customisation options to allow a flexible specification of the models in terms of modelling assumptions and prior choices. These can be extremely useful for handling the typical features of trial-based CEA data, such as non-normality, clustering, and type of missingness mechanism. This tutorial shows how it is possible to customise different aspects of longitudinal models using the arguments of each type of function in the package. Throughout, we will use the built-in dataset called PBS as a toy example, which is directly available when installing and loading missingHE in your R workspace. See the vignette called Introduction to missingHE for an introductory tutorial of each function in missingHE and a general presentation of the data used in trial-based economics evaluations.

If you would like to have more information on the package, or would like to point out potential issues in the current version, feel free to contact the maintainer at a.gabrio@maastrichtuniversity.nl. Suggestions on how to improve the package are also very welcome.

Changing the distributional assumptions

A general concern when analysing trial-based CEA data is that, in many cases, both effectiveness and costs are characterised by highly skewed distributions, which may cause standard normality modelling assumptions to be difficult to justify, especially for small samples. missingHE allows to chose among a range of parametric distributions for modelling both outcome variables, which were selected based on the most common choices in standard practice and the literature. In the context of a trial-based analysis, health economic outcomes are typically collected at baseline and a series of follow-up time points. Traditionally, the analysis is conducted at the aggregate level using cross-sectional effectiveness and cost outcomes obtained by combining the outcomes collected at different times as this considerably simplify the analysis task which does not require a longitudinal model specification. However, such an approach can be justified only when very limited missing outcome values occur throughout the time period of the study. When this is not true, then focussing at the aggregate level may considerably hinder the validity and reliability of the results in terms of either bias or efficiency. To deal with this problem, then a longitudinal model specification is reuiquired which allows to make full use of all observed outcome values collected in the trial while also being able to specify different missingness assumptions based on the different missingness patterns observed across the trial period.

Different approaches are available for modelling longitudinal or repeated measurement binary outcomes under different missingness assumptions, such as longitudinal selection models, which are implemented in missingHE together with a series of customisation options in terms of model specification. Within the model, the specific type of distributions for the effectiveness or utility (\(u\)) and cost (\(c\)) outcome at each time point of the analysis can be selected by setting the arguments dist_u and dist_c to specific character names. Available choices include: Normal ("norm") and Beta ("beta") distributions for \(u\) and Normal ("norm") and Gamma ("gamma") for \(c\). Distributions for modelling both discrete and binary effectiveness variables are also available, such as Poisson ("pois") and Bernoulli (bern) distributions. The full list of available distributions for each type of outcome can be seen by using the help function on each function of the package.

In the PBS dataset the longitudinal health economic variables are the utilities (from EQ5D-3L questionnaires) and costs (from clinic records) collected at baseline (time = 1) and two follow-up times (6 months, time = 2; and 12 months, time = 3). To have an idea of what the data look like, we can inspect the first entries for each variable at a specific time (e.g. time = 1) by typing

> #first 10 entries of u and c at time 1
> head(PBS$u[PBS$time == 1], n = 10)
 [1]  0.17300001  0.85000002 -0.16599999  0.25800008  0.03800001  1.00000000
 [7]  0.41400003  0.71000004  0.88300002  0.81500006
> head(PBS$c[PBS$time == 1], n = 10)
 [1]  9214.0  2492.5 15283.0  1858.0   797.0   358.0   803.0   336.0   472.0
[10]   959.0

We can check the empirical histograms of the data at different times, for example \(u\) at time \(1\) and \(2\) by treatment group by typing

> par(mfrow=c(2, 2))
> hist(PBS$u[PBS$t==1 & PBS$time == 1], main = "Utility at time 1 - Control")
> hist(PBS$u[PBS$t==2 & PBS$time == 1], main = "Utility at time 1 - Intervention")
> hist(PBS$u[PBS$t==1 & PBS$time == 2], main = "Utility at time 2 - Control")
> hist(PBS$u[PBS$t==2 & PBS$time == 2], main = "Utility at time 2 - Intervention")

We can also see that the proportion of missing values in \(u\) at each time is moderate in both treatment groups.

> #proportions of missing values in the control group 
> sum(is.na(PBS$u[PBS$t==1 & PBS$time == 1])) / length(PBS$u[PBS$t==1 & PBS$time == 1])  
[1] 0.06617647
> sum(is.na(PBS$u[PBS$t==1 & PBS$time == 2])) / length(PBS$u[PBS$t==1 & PBS$time == 2])
[1] 0.125
> sum(is.na(PBS$u[PBS$t==1 & PBS$time == 3])) / length(PBS$u[PBS$t==1 & PBS$time == 3])
[1] 0.08088235
> 
> #proportions of missing values in the intervention group
> sum(is.na(PBS$u[PBS$t==2 & PBS$time == 1])) / length(PBS$u[PBS$t==2 & PBS$time == 1])
[1] 0.0462963
> sum(is.na(PBS$u[PBS$t==2 & PBS$time == 2])) / length(PBS$u[PBS$t==2 & PBS$time == 2])  
[1] 0.05555556
> sum(is.na(PBS$u[PBS$t==2 & PBS$time == 3])) / length(PBS$u[PBS$t==2 & PBS$time == 3])  
[1] 0.0462963

As an example, we fit a longitudinal selection model assuming Normal distributions to handle \(u\), and we choose Gamma distributions to capture the skewness in the costs. We note that, in case some of individuals have costs that are equal to zero (as in the PBS dataset), standard parametric distributions with a positive support are not typically defined at \(0\) (e.g. the Gamma distributions), making their implementation impossible. Thus, in these cases, it is necessary to use a trick to modify the boundary values before fitting the model. A common approach is to add a small constant to the cost variables. These can be done by typing

> PBS$c <- PBS$c + 0.01

We note that, although simple, this strategy has the potential drawback that results may be affected by the choice of the constant added and therefore sensitivity analysis to the value used is typically recommended.

We are now ready to fit our longitudinal selection model to the PBS dataset using the following command

> NN.sel=selection_long(data = PBS, model.eff = u ~ 1, model.cost = c ~ u, 
+   model.mu = mu ~ gender, 
+   model.mc = mc ~ gender, type = "MAR", 
+   n.iter = 500, dist_u = "norm", dist_c = "norm", time_dep = "none")

The arguments dist_u = "norm" and dist_c = "norm" specify the distributions assumed for the outcome variables and, in the model of \(c\), we also take into account the possible association between the two outcomes at each time by uncluding the utility as a covariate (u). According to the type of distributions chosen, missingHE automatically models the dependence between covariates and the mean outcome on a specific scale to reduce the chance of incurring into numerical problems due to the constraints of the distributions. For example, for both Poisson and Gamma distributions means are modelled on the log scale, while for Beta and Bernoulli distirbutions they are modelled on the logit scale. To see the modelling scale used by missingHE according to the type of distribution selected, use the help command on each function of the package. The optional argument time_dep allows to specify the type of time dependence structure assumed by the model, here corresponding to no temporal dependence ("none"). An alternative choice available in missingHE is to set time_dep = "AR1", which assumes an autoregressive of order one time structure.

The model assumes MAR conditional on gender as auxiliary variable for predicting missingness in both outcomes. We can look at how the model generate imputations for the outcomes by treatment group using the generic function plot. For example, we can look at how the missing \(u\) at time \(3\) are imputed by typing

> plot(NN.sel, outcome = "effects", time_plot = 3)

Summary results of our model from a statistical perspective can be inspected using the command coef, which extracts the estimates of the mean regression coefficients for \(u\) and \(c\) by treatment group and time point. By default, the lower and upper bounds provide the \(95\%\) credibile intervals for each estimate (based on the \(0.025\) and \(0.975\) quantiles of the posterior distribution). For example, we can inspect the coefficients in the utility and cost models at time \(2\) by typing

> coef(NN.sel, time = 2)
$Comparator
$Comparator$Effects
             mean    sd lower upper
(Intercept) 0.502 0.032 0.441 0.562

$Comparator$Costs
                 mean      sd     lower     upper
(Intercept)  1443.615 165.982  1123.037  1754.898
u           -2062.772 437.469 -2898.600 -1219.158


$Reference
$Reference$Effects
             mean    sd lower upper
(Intercept) 0.641 0.033 0.575   0.7

$Reference$Costs
                 mean      sd     lower    upper
(Intercept)  2822.054 234.262  2372.573 3285.865
u           -1903.106 684.131 -3346.278 -598.933

The entire posterior distribution for each parameter of the model can also be extracted from the output of the model by typing NN.sel$model_output, which returns a list object containing the posterior estimates for each model parameter. An overall summary of the economic analysis based on the model estimates can be obtained using the summary command

> summary(NN.sel)

 Cost-effectiveness analysis summary 
 
 Comparator intervention: intervention 1 
 Reference intervention: intervention 2 
 
 Parameter estimates under MAR assumption
 
 Comparator intervention 
                         mean      sd       LB       UB
mean effects (t = 1)    0.491   0.019     0.46    0.522
mean costs (t = 1)   2888.862 388.703 2197.963 3488.828

 Reference intervention 
                         mean      sd       LB       UB
mean effects (t = 2)    0.615   0.021     0.58    0.645
mean costs (t = 2)   5677.399 292.016 5204.342 6142.506

 Incremental results 
                   mean      sd       LB       UB
delta effects     0.123   0.029    0.075     0.17
delta costs    2788.537 506.129 1942.413 3661.652
ICER          22631.358                          

which shows summary statistics for the mean effectiveness and costs in each treatment group, for the mean differentials and the estimate of the ICER. It is important to clarify how these results are obtained in that they pertain summary statistics for aggregated health economic outcomes (e.g. QALYs and Total costs) which are retrieved by combining the posterior results from the longitudinal model for each type of outcome and time point of the analysis. More specifically, the function selection_long, after deriving the posterior results from the model, applies common CEA techniques for computing aggregate variables based on the posterior mean results of the effectiveness and cost variables at each time point of the analysis. These include: Area Under the Curve approach for \(u\) and total sum over the follow-up period for \(c\). By default, missingHE assumes a value of \(0.5\) and \(1\) for the weights used to respectively calculate the aggregate mean QALYs and Total costs over the time period of the study. When desired, users can provide their own weights as additional argument named qaly_calc and tcost_calc that can be passed into the function selection_long when fitting the model.

Including random effects terms

For each type of model, missingHE allows the inclusion of random effects terms to handle clustered data. To be precise, the term random effects does not have much meaning within a Bayesian approach since all parameters are in fact random variables. However, this terminology is quite useful to explain the structure of the model.

We show how random effects can be added to the model of \(u\) and \(c\) within a longitudinal selection model fitted to the PBS dataset using the function selection_long. The clustering variable over which the random effects are specified is the factor variable site, representing the centres at which data were collected in the trial. Using the same distributional assumptions of the selection model, we fit the pattern mixture model by typing

> PBS$site <- factor(PBS$site)
> NN.re=selection_long(data = PBS, model.eff = u ~ 1 + (1 | site), 
+                model.cost = c ~ u + (1 | site), model.mu = mu ~ gender, 
+                model.mc = mc ~ gender, type = "MAR", 
+                n.iter = 500, dist_u = "norm", dist_c = "norm", time_dep = "none")

The function fits a random intercept only model for \(u\) and \(c\), as indicated by the notation (1 | site)and (1 | site). In both models, site is the clustering variable over which the random coefficients are estimated. missingHE allows the user to choose among different clustering variables for the model of \(u\) and \(c\) if these are available in the dataset. Random intercept and slope models can be specified using the notation (1 + gender | site), where gender is the name of a covariate which should be inlcuded as fixed effects in the corresponding outcome model. Finally, it is also possible to specify random slope only models using the notation (0 + gender | site), where 0 indicates the removal of the random intercept.

Coefficient estimates for the random effects at each time can be extracted using the coef function and setting the argument random = TRUE (if set to FALSE then the fixed effects estimates are displayed).

> coef(NN.re, time = 3, random = TRUE)
$Comparator
$Comparator$Effects
                mean    sd lower upper
(Intercept) 1  4.777 0.422 4.137 5.344
(Intercept) 2  4.797 0.430 4.159 5.386
(Intercept) 3  4.715 0.429 4.065 5.301
(Intercept) 4  4.839 0.421 4.209 5.440
(Intercept) 5  4.747 0.427 4.102 5.316
(Intercept) 6  4.752 0.427 4.111 5.349
(Intercept) 7  4.699 0.426 4.038 5.273
(Intercept) 8  4.749 0.424 4.091 5.331
(Intercept) 9  4.782 0.430 4.111 5.388
(Intercept) 10 4.916 0.439 4.286 5.601
(Intercept) 11 4.754 0.428 4.103 5.336
(Intercept) 12 4.754 0.422 4.133 5.349

$Comparator$Costs
                  mean     sd    lower   upper
(Intercept) 1   -8.493 74.892 -177.766 131.328
(Intercept) 2   -5.464 75.301 -173.826 144.953
(Intercept) 3   -2.938 74.825 -162.873 157.590
(Intercept) 4   -4.574 76.058 -163.989 159.071
(Intercept) 5    6.555 71.536 -130.067 168.417
(Intercept) 6   -3.768 72.618 -153.317 160.399
(Intercept) 7   -4.558 73.906 -186.342 145.056
(Intercept) 8   -1.918 71.450 -147.770 157.473
(Intercept) 9   -4.665 71.335 -150.948 143.872
(Intercept) 10   1.420 74.571 -144.955 157.894
(Intercept) 11 -11.235 75.023 -186.677 128.839
(Intercept) 12   3.091 70.912 -144.800 142.586


$Reference
$Reference$Effects
                mean    sd  lower upper
(Intercept) 1  0.712 0.422 -0.014 1.294
(Intercept) 2  0.767 0.436 -0.008 1.393
(Intercept) 3  0.702 0.421 -0.017 1.272
(Intercept) 4  0.720 0.423 -0.005 1.318
(Intercept) 5  0.680 0.418 -0.033 1.252
(Intercept) 6  0.678 0.415 -0.026 1.247
(Intercept) 7  0.745 0.431  0.004 1.322
(Intercept) 8  0.759 0.432  0.001 1.344
(Intercept) 9  0.651 0.407 -0.032 1.198
(Intercept) 10 0.668 0.412 -0.035 1.234
(Intercept) 11 0.755 0.432  0.002 1.381

$Reference$Costs
                  mean     sd    lower   upper
(Intercept) 1    2.368 59.647 -122.734 137.984
(Intercept) 2   -6.763 58.955 -130.632 112.049
(Intercept) 3   -3.104 58.607 -124.665 122.053
(Intercept) 4   -9.590 59.754 -146.144 121.782
(Intercept) 5  -14.998 60.737 -158.812 110.097
(Intercept) 6    5.453 62.314 -123.702 138.898
(Intercept) 7  -11.447 62.219 -142.054 131.603
(Intercept) 8   -1.988 63.197 -132.991 134.336
(Intercept) 9   -5.102 63.259 -144.608 129.732
(Intercept) 10   5.547 58.731 -111.926 132.147
(Intercept) 11 -10.979 59.127 -140.742  98.179

For both \(u\) and \(c\) models, summary statistics for the random coefficient estimates are displayed for each treatment group and each of the clusters in site.

Changing the priors

By default, all models in missingHE are fitted using vague prior distributions so that posterior results are essentially derived based on the information from the observed data alone. This ensures a rough approximation to results obtained under a frequentist approach based on the same type of models.

However, in some cases, it may be reasonable to use more informative priors to ensure a better stability of the posterior estimates by restricting the range over which estimates can be obtained. For example if, based on previous evidence or knowledge, the range over which a specific parameter is likely to vary is known, then priors can be specified so to give less weight to values outside that range when deriving the posterior estimates. However, unless the user is familiar with the choice of informative priors, it is generally recommended not to change the default priors of missingHE as the unintended use of informative priors may substantially drive posterior estimates and lead to incorrect results.

For each type of model in missingHE, priors can be modified using the argument prior, which allows to change the hyperprior values for each model parameter. The interpretation of the prior values change according to the type of parameter and model considered. For example, we can fit a longitudinal selection model to the PBS dataset using more informative priors on some parameters.

Prior values can be modified by first creating a list object which, for example, we call my.prior. Within this list, we create a number of elements (vectors of length two) which should be assigned specific names based on the type of parameters which priors we want to change.

> my.prior <- list(
+   "alpha0.prior" = c(0 , 0.0000001),
+   "beta0.prior" = c(0, 0.0000001),
+   "gamma0.prior.c"= c(0, 1),
+   "gamma.prior.c" = c(0, 0.01),
+   "sigma.prior.c" = c(0, 100)
+ )

The names above have the following interpretations in terms of the model parameters:

The values shown above are the default values set in missingHE for each of these parameters. It is possible to change the priors by providing different values, for example by increasing the precision for some of the coefficient estimates or decreasing the upper bound for standard deviation parameters. Different names should be used to indicate for which parameter the prior should be modified, keeping in mind that the full list of names that can be used varies depending on the type of models and modelling assumptions specified. The full list of parameter names for each type of model can be assessed using the helpcommand on each function of missingHE.

We can now fit the hurdle model using our priors by typing

> NN.prior=selection_long(data = PBS, model.eff = u ~ 1 + (1 | site), 
+                model.cost = c ~ u + (1 | site), model.mu = mu ~ gender, 
+                model.mc = mc ~ gender, type = "MAR", 
+                n.iter = 500, dist_u = "norm", dist_c = "norm", time_dep = "none",
+                prior = my.prior)

Finally, we can inspect the statistical results from the model at time \(3\) by typing

> coef(NN.prior, random = FALSE, time = 3)
$Comparator
$Comparator$Effects
             mean    sd lower upper
(Intercept) 2.913 0.382 1.828 3.469

$Comparator$Costs
                 mean       sd     lower    upper
(Intercept)  1403.267  330.903   734.967 2033.578
u           -1412.674 1009.988 -3431.254  679.444


$Reference
$Reference$Effects
              mean    sd  lower upper
(Intercept) -0.305 1.183 -1.673 1.078

$Reference$Costs
                 mean      sd     lower    upper
(Intercept)  2853.720 192.780  2468.538 3242.845
u           -1061.623 539.509 -2146.405  -28.792

and

> coef(NN.prior, random = TRUE, time = 3)
$Comparator
$Comparator$Effects
                 mean    sd  lower  upper
(Intercept) 1  -2.423 0.388 -2.975 -1.343
(Intercept) 2  -2.410 0.392 -2.973 -1.324
(Intercept) 3  -2.488 0.385 -3.039 -1.380
(Intercept) 4  -2.363 0.392 -2.948 -1.328
(Intercept) 5  -2.456 0.385 -2.995 -1.374
(Intercept) 6  -2.446 0.388 -2.999 -1.362
(Intercept) 7  -2.496 0.376 -3.035 -1.407
(Intercept) 8  -2.451 0.384 -3.009 -1.375
(Intercept) 9  -2.425 0.393 -3.002 -1.331
(Intercept) 10 -2.292 0.421 -2.935 -1.252
(Intercept) 11 -2.446 0.384 -2.998 -1.377
(Intercept) 12 -2.445 0.390 -3.009 -1.365

$Comparator$Costs
                 mean     sd    lower   upper
(Intercept) 1   2.214 60.465 -133.780  94.893
(Intercept) 2   2.264 64.379 -125.602 129.749
(Intercept) 3   3.289 58.603 -120.273  99.872
(Intercept) 4   6.973 60.166 -117.305 112.142
(Intercept) 5  10.470 61.587 -107.603 131.094
(Intercept) 6   0.032 65.208 -139.383 113.221
(Intercept) 7   4.713 60.664 -114.812 111.643
(Intercept) 8   3.316 60.012 -123.457 121.502
(Intercept) 9   3.300 63.544 -126.000 106.322
(Intercept) 10  3.555 59.645 -122.852 105.364
(Intercept) 11  1.197 61.502 -125.707 117.566
(Intercept) 12  2.789 60.398 -128.851 111.930


$Reference
$Reference$Effects
                mean    sd  lower upper
(Intercept) 1  0.921 1.187 -0.469 2.309
(Intercept) 2  0.982 1.182 -0.434 2.365
(Intercept) 3  0.917 1.188 -0.514 2.302
(Intercept) 4  0.928 1.182 -0.472 2.319
(Intercept) 5  0.893 1.189 -0.504 2.294
(Intercept) 6  0.884 1.186 -0.516 2.267
(Intercept) 7  0.954 1.189 -0.448 2.322
(Intercept) 8  0.968 1.181 -0.430 2.347
(Intercept) 9  0.861 1.184 -0.576 2.254
(Intercept) 10 0.872 1.182 -0.550 2.257
(Intercept) 11 0.966 1.185 -0.437 2.318

$Reference$Costs
                 mean     sd    lower   upper
(Intercept) 1  16.437 72.968 -117.404 173.488
(Intercept) 2  -6.281 69.757 -153.603 126.842
(Intercept) 3   2.854 70.069 -128.847 141.100
(Intercept) 4   7.369 63.861 -131.735 145.198
(Intercept) 5  -8.363 65.953 -159.498 122.817
(Intercept) 6   5.941 66.065 -123.337 139.964
(Intercept) 7   0.202 71.886 -147.109 138.875
(Intercept) 8   2.823 71.076 -154.977 144.029
(Intercept) 9  -1.760 69.268 -162.141 129.411
(Intercept) 10 16.798 73.703 -132.601 175.583
(Intercept) 11 -0.431 70.695 -149.705 146.676