# Multiple variables with measurement error and missingness

library(inlamemi)

In certain cases, it may be necessary to account for measurement error or missingness in more than one covariate. inlamemi is structured in a way that accommodates this, and in this example we illustrate how to specify the input arguments in such a case.

Many of the arguments to fit_inlamemi are specific to one of the error variables, such as formula_imp (specifying the formula for the imputation model), error_type (specifying the type(s) of error for a given variable), repeated_observations, classical_error_scaling and prior.beta.error (specifies the prior for the coefficient of the variable with error). In order for these to be specified correctly in the case where we have multiple error variables, the input is structured in lists, where the first list element corresponds to the argument for the first error variable, the second list element corresponds to the argument for the second error variable, etc. That means that the list elements may themselves be vectors of multiple elements, for instance if the first error variable has Berkson error and missingness and the second error variable has classical error, the error_type variable would be error_type = list(c("berkson", "missing"), "classical").

Error types Likelihood Response Covariate with error Other covariate(s)
Classical (present in two covariates) Gaussian y x1, x2 z

In this example, we have a simple case with three covariates (x1, x2 and z), where two of these have classical measurement error (x1 and x2). We believe that x1 is correlated with the error free covariate z, but that x2 is independent of the other covariates. Here is our data:

head(two_error_data)
#>           y         x1         x2    x1_true   x2_true          z
#> 1 11.479199  3.9241547  2.0065523  2.9122427 1.0015263  0.9819694
#> 2  7.425331  0.1536308  0.6705511  1.4380422 1.2869254  0.4687150
#> 3  2.337587 -0.7050359  0.1312219 -0.1184743 1.5287945 -0.1079713
#> 4  3.006696 -2.1684821 -1.5747725  0.2022806 0.8315696 -0.2128782
#> 5 12.248170  2.7510710  1.8532884  3.1277636 1.1663660  1.1580985
#> 6 13.478741  0.8219551  2.5649969  2.8480912 1.8619438  1.2923548

In this case, we will need to specify the entire control.family argument manually. This is done like this:

# Multiple error variables
control.family <- list(
# Model of interest
list(hyper = list(prec = list(initial = log(1),
param = c(10, 9),
fixed = FALSE))),
# Classical error model for x1
list(hyper = list(prec = list(initial = log(1),
param = c(10, 9),
fixed = FALSE))),
# Imputation model for x1
list(hyper = list(prec = list(initial = log(1),
param = c(10, 9),
fixed = FALSE))),
# Classical error model for x2
list(hyper = list(prec = list(initial = log(1),
param = c(10, 9),
fixed = FALSE))),
# Imputation model for x2
list(hyper = list(prec = list(initial = log(1),
param = c(10, 9),
fixed = FALSE)))
)

If you are uncertain about the order of the model levels, look at the order in the data stack. fit_inlamemi calls the function make_inlamemi_stacks to create the data stack, so we can view it like this:

stack_two_error <- make_inlamemi_stacks(
formula_moi = y ~ x1 + x2 + z,
formula_imp = list(x1 ~ z, x2 ~ 1),
family_moi = "gaussian",
data = two_error_data,
error_type = list("classical", "classical"))

And then we can see the order of the models by looking at

stack_two_error$data$names
#> $y_moi #> [1] "y_moi" #> #>$x1_classical
#> [1] "x1_classical"
#>
#> $x1_imp #> [1] "x1_imp" #> #>$x2_classical
#> [1] "x2_classical"
#>
#> \$x2_imp
#> [1] "x2_imp"

Then we go on to fit the model:

two_error_model <- fit_inlamemi(formula_moi = y ~ x1 + x2 + z,
formula_imp = list(x1 ~ z, x2 ~ 1),
family_moi = "gaussian",
data = two_error_data,
error_type = list("classical", "classical"),
prior.beta.error = list(c(0, 1/1000), c(0, 1/1000)),
control.family = control.family,
control.predictor = list(compute = TRUE)
)

The summary and plot can be viewed as normal:

summary(two_error_model)
#> Formula for model of interest:
#> y ~ x1 + x2 + z
#>
#> Formula for imputation model:
#> [[1]]
#> x1 ~ z
#>
#> [[2]]
#> x2 ~ 1
#>
#>
#> Error types:
#> [[1]]
#> [1] "classical"
#>
#> [[2]]
#> [1] "classical"
#>
#>
#> Fixed effects for model of interest:
#>            mean        sd 0.025quant 0.5quant 0.975quant     mode
#> beta.0 1.032206 0.2117097  0.6092825 1.035605   1.438392 1.035590
#> beta.z 2.026115 0.6650804  0.9289027 2.065058   3.045674 2.042773
#>
#> Coefficient for variable with measurement error and/or missingness:
#>             mean        sd 0.025quant 0.5quant 0.975quant     mode
#> beta.x1 1.937259 0.3629647   1.231761 1.934183   2.660865 1.921210
#> beta.x2 2.066859 0.3202801   1.428971 2.069365   2.690016 2.079914
#>
#> Fixed effects for imputation model:
#>                 mean         sd 0.025quant  0.5quant 0.975quant      mode
#> alpha.x1.0 1.0361515 0.04400913  0.9498399 1.0361515   1.122463 1.0361515
#> alpha.x1.z 1.9748164 0.04480513  1.8869437 1.9748164   2.062689 1.9748164
#> alpha.x2.0 0.9734143 0.04552568  0.8841303 0.9734143   1.062698 0.9734143
#>
#> Model hyperparameters (apart from beta.x1, beta.x2):
#>                                       mean        sd 0.025quant  0.5quant 0.975quant      mode
#> Precision for model of interest  1.1414627 0.3655011  0.5876136 1.0871141   2.010812 0.9861135
#> Precision for x1 classical model 1.0738557 0.2155365  0.7072146 1.0544437   1.552142 1.0189594
#> Precision for x1 imp model       1.0384566 0.2045764  0.6983236 1.0173976   1.500597 0.9745010
#> Precision for x2 classical model 1.0325493 0.1780454  0.7298622 1.0163835   1.428761 0.9829946
#> Precision for x2 imp model       0.9356804 0.1549813  0.6652513 0.9239377   1.273659 0.9022791
plot(two_error_model)
plot of chunk unnamed-chunk-8