ggpredict() and ggemmeans() compute predicted values for all possible levels or values of a model’s predictors. Basically, ggpredict() wraps the predict() method for the related model, while ggemmeans() wraps the emmeans() function from the emmeans package. Both ggpredict() and ggemmeans() do some data preparation to bring the data into shape for the newdata argument of predict() and the at argument of emmeans(), respectively. If you haven’t done so yet, it is recommended to read the general introduction first.
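To make the role of the newdata and at arguments concrete, the following is a minimal sketch of the two underlying calls made by hand. It is not the internal code of ggpredict() or ggemmeans(); the vector hours and the object names grid, pred and emm are chosen here for illustration only, and the model is the same one used in the first example of this vignette.

```r
library(ggeffects)  # provides the efc example data
library(emmeans)

data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7, data = efc)

# Values of the focal term; picked for illustration only.
hours <- c(0, 20, 45, 65)

# predict(): build the prediction grid by hand and pass it as newdata.
# neg_c_7 is fixed at its mean among the rows used to fit the model.
grid <- data.frame(
  c12hour = hours,
  neg_c_7 = mean(model.frame(fit)$neg_c_7)
)
pred <- predict(fit, newdata = grid, interval = "confidence")

# emmeans(): pass the focal values via at; numeric covariates that are
# not listed in at are set to their mean by default.
emm <- as.data.frame(
  emmeans(fit, specs = "c12hour", at = list(c12hour = hours))
)

cbind(grid, pred)
emm
```

Both calls evaluate the model on the same grid, so the point predictions in the fit column of pred and the emmean column of emm should agree.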
For models without categorical predictors, the results from ggpredict() and ggemmeans() are identical (except for some slight, negligible differences in the associated confidence intervals).
library(ggeffects)
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7, data = efc)
ggpredict(fit, terms = "c12hour")
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted std.error conf.low conf.high
#> 0 75.072 1.077 72.962 77.183
#> 20 70.155 0.895 68.400 71.909
#> 45 64.008 0.818 62.405 65.610
#> 65 59.090 0.902 57.323 60.857
#> 85 54.172 1.087 52.042 56.302
#> 105 49.255 1.331 46.645 51.864
#> 125 44.337 1.609 41.184 47.490
#> 170 33.272 2.289 28.787 37.758
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
ggemmeans(fit, terms = "c12hour")
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted conf.low conf.high
#> 0 75.072 72.959 77.186
#> 20 70.155 68.398 71.912
#> 45 64.008 62.403 65.612
#> 65 59.090 57.320 60.860
#> 85 54.172 52.039 56.305
#> 105 49.255 46.641 51.868
#> 125 44.337 41.180 47.494
#> 170 33.272 28.780 37.764
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
As can be seen, the continuous predictor neg_c_7 is held constant at its mean value, 11.83. For categorical predictors, ggpredict() and ggemmeans() behave differently. While ggpredict() uses the reference level of each categorical predictor to hold it constant, ggemmeans(), like ggeffect(), averages over the proportions of the categories of factors.
library(sjmisc)
data(efc)
efc$e42dep <- to_label(efc$e42dep)
fit <- lm(barthtot ~ c12hour + neg_c_7 + e42dep, data = efc)
ggpredict(fit, terms = "c12hour")
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted std.error conf.low conf.high
#> 0 92.745 2.173 88.485 97.004
#> 20 91.317 2.169 87.067 95.567
#> 45 89.532 2.208 85.206 93.859
#> 65 88.105 2.274 83.649 92.561
#> 85 86.677 2.368 82.037 91.318
#> 105 85.250 2.486 80.376 90.123
#> 125 83.822 2.627 78.674 88.970
#> 170 80.610 3.005 74.721 86.499
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
#> * e42dep = independent
ggemmeans(fit, terms = "c12hour")
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted conf.low conf.high
#> 0 73.515 71.853 75.176
#> 20 72.087 70.646 73.528
#> 45 70.302 68.894 71.711
#> 65 68.875 67.287 70.462
#> 85 67.447 65.550 69.344
#> 105 66.019 63.735 68.304
#> 125 64.592 61.875 67.309
#> 170 61.380 57.608 65.152
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
In this case, ggpredict() and ggemmeans() return the same predicted values again if the condition argument is used to define specific levels at which variables, in our case the factor e42dep, should be held constant.
ggpredict(fit, terms = "c12hour")
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted std.error conf.low conf.high
#> 0 92.745 2.173 88.485 97.004
#> 20 91.317 2.169 87.067 95.567
#> 45 89.532 2.208 85.206 93.859
#> 65 88.105 2.274 83.649 92.561
#> 85 86.677 2.368 82.037 91.318
#> 105 85.250 2.486 80.376 90.123
#> 125 83.822 2.627 78.674 88.970
#> 170 80.610 3.005 74.721 86.499
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
#> * e42dep = independent
ggemmeans(fit, terms = "c12hour", condition = c(e42dep = "independent"))
#>
#> # Predicted values of Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted conf.low conf.high
#> 0 92.745 88.479 97.010
#> 20 91.317 87.061 95.573
#> 45 89.532 85.199 93.865
#> 65 88.105 83.642 92.567
#> 85 86.677 82.030 91.324
#> 105 85.250 80.370 90.130
#> 125 83.822 78.667 88.977
#> 170 80.610 74.712 86.507
#>
#> Adjusted for:
#> * neg_c_7 = 11.83
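The effect of fixing the factor can also be sketched with predict() and emmeans() directly. This is an illustration under stated assumptions, not the code ggemmeans() runs internally: e42dep is converted with base factor() instead of to_label(), so its level names may differ from the labels shown above, and the names hours, ref, averaged, fixed, grid and pred are hypothetical.

```r
library(ggeffects)  # provides the efc example data
library(emmeans)

data(efc)
# Assumption: base factor() is used here instead of to_label(), so the
# level names may be the numeric codes rather than the value labels.
efc$e42dep <- factor(efc$e42dep)
fit <- lm(barthtot ~ c12hour + neg_c_7 + e42dep, data = efc)

hours <- c(0, 20, 45)           # focal values, for illustration only
ref <- levels(efc$e42dep)[1]    # reference level of the factor

# Default: e42dep is not fixed, so results are averaged over its levels.
averaged <- as.data.frame(
  emmeans(fit, specs = "c12hour", at = list(c12hour = hours))
)

# e42dep fixed at its reference level via at.
fixed <- as.data.frame(
  emmeans(fit, specs = "c12hour", at = list(c12hour = hours, e42dep = ref))
)

# Same grid by hand for predict(): covariate at its mean, factor at
# its reference level.
grid <- data.frame(
  c12hour = hours,
  neg_c_7 = mean(model.frame(fit)$neg_c_7),
  e42dep  = factor(ref, levels = levels(efc$e42dep))
)
pred <- predict(fit, newdata = grid)

cbind(hours, averaged = averaged$emmean, fixed = fixed$emmean, predict = pred)
```

Once the factor is fixed at its reference level, the fixed and predict columns should coincide, while the averaged column differs from both.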
Creating plots is as simple as described in the vignette Plotting Marginal Effects.