Marginal Means

In the context of this package, “marginal means” refers to the values obtained by this three-step process:

  1. Construct a “grid” of predictor values with all combinations of categorical variables, and where numeric variables are held at their means.
  2. Calculate adjusted predictions for each cell in that grid.
  3. Take the average of those adjusted predictions across one dimension of the grid to obtain the marginal means.

For example, consider a model with a numeric, a factor, and a logical predictor:

library(marginaleffects)

dat <- mtcars
dat$cyl <- as.factor(dat$cyl)
dat$am <- as.logical(dat$am)
mod <- lm(mpg ~ hp + cyl + am, data = dat)

Using the predictions() function, we set the hp variable at its mean and compute predictions for all combinations of am and cyl:

p <- predictions(
    mod,
    newdata = datagrid(am = dat$am, cyl = dat$cyl))

For illustration purposes, it is useful to reshape the above results:

                          am
cyl                   TRUE   FALSE   Marginal means by cyl
6                     21.0   16.9    19.0
4                     25.0   20.8    22.9
8                     21.4   17.3    19.4
Marginal means by am  22.5   18.3

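The marginal means by am and cyl are obtained by averaging the adjusted predictions across the cells of the other variable: each marginal mean by cyl averages over the levels of am, and each marginal mean by am averages over the levels of cyl. The marginalmeans() function gives us the same results easily: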

marginalmeans(mod)
#>   term value marginalmean std.error conf.low conf.high       p.value statistic
#> 1   am FALSE     18.31987 0.7853925 16.78053  19.85921 2.429618e-120  23.32575
#> 2   am  TRUE     22.47772 0.8343346 20.84246  24.11299 7.291801e-160  26.94090
#> 3  cyl     4     22.88479 1.3566479 20.22581  25.54378  7.654792e-64  16.86863
#> 4  cyl     6     18.96022 1.0729360 16.85730  21.06313  6.972277e-70  17.67134
#> 5  cyl     8     19.35138 1.3770817 16.65235  22.05041  7.440777e-45  14.05246
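
As a cross-check, the three steps can be reproduced by hand with base R alone. This is a minimal sketch rather than the package's implementation: the names grid, tab, mm_cyl, and mm_am are introduced here for illustration, and the model is refit so that the block runs on its own.

```r
# Refit the example model so this block is self-contained.
dat <- mtcars
dat$cyl <- as.factor(dat$cyl)
dat$am <- as.logical(dat$am)
mod <- lm(mpg ~ hp + cyl + am, data = dat)

# Step 1: grid with all combinations of the categorical predictors,
# holding the numeric predictor hp at its mean.
grid <- expand.grid(
    hp = mean(dat$hp),
    cyl = factor(levels(dat$cyl), levels = levels(dat$cyl)),
    am = c(TRUE, FALSE))

# Step 2: adjusted prediction for each cell of the grid.
grid$predicted <- predict(mod, newdata = grid)

# Reshape to a cyl-by-am table (one cell per combination).
tab <- tapply(grid$predicted, list(cyl = grid$cyl, am = grid$am), mean)

# Step 3: average across one dimension of the table.
mm_cyl <- rowMeans(tab)  # averaged over the levels of am
mm_am <- colMeans(tab)   # averaged over the levels of cyl

round(tab, 1)
round(mm_cyl, 1)
round(mm_am, 1)
```

Because every cell of the grid receives equal weight, these averages do not depend on how many observations fall in each cell of the original data.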

The same results can be obtained using the very powerful emmeans package:

library(emmeans)
emmeans(mod, specs = "cyl")
#>  cyl emmean   SE df lower.CL upper.CL
#>  4     22.9 1.36 27     20.1     25.7
#>  6     19.0 1.07 27     16.8     21.2
#>  8     19.4 1.38 27     16.5     22.2
#> 
#> Results are averaged over the levels of: am 
#> Confidence level used: 0.95
emmeans(mod, specs = "am")
#>  am    emmean    SE df lower.CL upper.CL
#>  FALSE   18.3 0.785 27     16.7     19.9
#>   TRUE   22.5 0.834 27     20.8     24.2
#> 
#> Results are averaged over the levels of: cyl 
#> Confidence level used: 0.95

Interactions

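By default, the marginalmeans() function calculates marginal means for each categorical predictor one after the other. We can also compute marginal means for combinations of categories by setting interaction=TRUE. For a model that includes an interaction term, such as the one below, the outputs in this section show combinations being reported without setting the argument explicitly: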

library(lme4)

dat <- "https://vincentarelbundock.github.io/Rdatasets/csv/Stat2Data/Titanic.csv"
dat <- read.csv(dat)
titanic <- glmer(
    Survived ~ Sex * PClass + Age + (1 | PClass),
    family = binomial,
    data = dat)

Regardless of the scale of the predictions (type argument), marginalmeans() always computes standard errors using the Delta Method:

marginalmeans(
    titanic,
    type = "response",
    variables = c("Sex", "PClass"))
#>      Sex PClass marginalmean   conf.low conf.high      p.value
#> 1 female    1st    0.9701725 0.92682949 0.9881687 4.597322e-13
#> 2 female    2nd    0.8803768 0.78990733 0.9350899 5.720827e-09
#> 3 female    3rd    0.3644761 0.27274377 0.4672392 1.030256e-02
#> 4   male    1st    0.4450399 0.34763191 0.5468621 2.898452e-01
#> 5   male    2nd    0.1422606 0.09231109 0.2128986 6.035481e-13
#> 6   male    3rd    0.1189557 0.08321607 0.1672447 4.901659e-23

When the model is linear or the predictions are on the link scale, the output also includes standard errors and test statistics alongside the confidence intervals:

marginalmeans(
    titanic,
    type = "link",
    variables = c("Sex", "PClass"))
#>      Sex PClass marginalmean std.error   conf.low  conf.high      p.value
#> 1 female    1st    3.4820418 0.4811643  2.5389771  4.4251065 4.597322e-13
#> 2 female    2nd    1.9960034 0.3426780  1.3243669  2.6676399 5.720827e-09
#> 3 female    3rd   -0.5559886 0.2167170 -0.9807461 -0.1312311 1.030256e-02
#> 4   male    1st   -0.2207323 0.2085408 -0.6294648  0.1880001 2.898452e-01
#> 5   male    2nd   -1.7966393 0.2495444 -2.2857374 -1.3075413 6.035481e-13
#> 6   male    3rd   -2.0023565 0.2025929 -2.3994313 -1.6052817 4.901659e-23
#>   statistic
#> 1  7.236700
#> 2  5.824720
#> 3 -2.565506
#> 4 -1.058461
#> 5 -7.199678
#> 6 -9.883645

It is easy to transform those link-scale marginal means with arbitrary functions using the transform_post argument:

marginalmeans(
    titanic,
    type = "link",
    transform_post = insight::link_inverse(titanic),
    variables = c("Sex", "PClass"))
#>      Sex PClass marginalmean   conf.low conf.high      p.value
#> 1 female    1st    0.9701725 0.92682949 0.9881687 4.597322e-13
#> 2 female    2nd    0.8803768 0.78990733 0.9350899 5.720827e-09
#> 3 female    3rd    0.3644761 0.27274377 0.4672392 1.030256e-02
#> 4   male    1st    0.4450399 0.34763191 0.5468621 2.898452e-01
#> 5   male    2nd    0.1422606 0.09231109 0.2128986 6.035481e-13
#> 6   male    3rd    0.1189557 0.08321607 0.1672447 4.901659e-23
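
The numbers above are consistent with building a symmetric interval on the link scale and then passing the estimate and both interval endpoints through the inverse link. The following base R sketch illustrates that arithmetic with values copied from the type = "link" output. It assumes a normal approximation for the interval and uses plogis() as the inverse logit, which is an assumption about this binomial model (default logit link), not a description of the package internals.

```r
# Link-scale estimates and standard errors copied from the first three
# rows of the type = "link" output above.
estimate <- c(3.4820418, 1.9960034, -0.5559886)
std.error <- c(0.4811643, 0.3426780, 0.2167170)

# Symmetric 95% interval on the link scale (normal approximation).
z <- qnorm(0.975)
link <- data.frame(
    marginalmean = estimate,
    conf.low = estimate - z * std.error,
    conf.high = estimate + z * std.error)

# Back-transform every column with the inverse logit. plogis() is assumed
# to match the inverse link of this binomial model (default logit link).
response <- as.data.frame(lapply(link, plogis))
round(response, 4)
```

Transforming the endpoints rather than the standard error keeps the interval inside the unit interval, at the cost of no longer being symmetric around the estimate.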

When a model does not include interactions, marginalmeans() defaults to reporting estimated marginal means (EMMs) for each category individually, without interactions:

titanic2 <- glmer(
    Survived ~ Sex + PClass + Age + (1 | PClass),
    family = binomial,
    data = dat)

marginalmeans(
    titanic2,
    variables = c("Sex", "PClass"))
#>     term  value marginalmean  conf.low conf.high      p.value
#> 1 PClass    1st    0.7778338 0.7059595 0.8362154 7.498210e-11
#> 2 PClass    2nd    0.4902824 0.4071513 0.5739545 8.210648e-01
#> 3 PClass    3rd    0.2195429 0.1683384 0.2810591 4.247510e-14
#> 4    Sex female    0.7854373 0.7307995 0.8315419 1.779451e-17
#> 5    Sex   male    0.2085450 0.1709283 0.2519246 1.660045e-26

We can force the interactions by setting interaction = TRUE:

marginalmeans(
    titanic2,
    interaction = TRUE,
    variables = c("Sex", "PClass"))
#>      Sex PClass marginalmean   conf.low conf.high      p.value
#> 1 female    1st   0.92882414 0.89008215 0.9546074 5.036296e-26
#> 2 female    2nd   0.78190514 0.70413410 0.8437691 1.012089e-09
#> 3 female    3rd   0.51183442 0.42257389 0.6003465 7.963418e-01
#> 4   male    1st   0.48435732 0.39415021 0.5755954 7.383894e-01
#> 5   male    2nd   0.20512692 0.15125465 0.2720369 7.462936e-13
#> 6   male    3rd   0.07017461 0.04785084 0.1017996 1.310275e-35

Group averages with the by argument

We can collapse marginal means via averaging using the by argument:

library(marginaleffects)
dat <- mtcars
dat$am <- factor(dat$am)
dat$vs <- factor(dat$vs)
dat$cyl <- factor(dat$cyl)

mod <- glm(gear ~ cyl + vs + am, data = dat, family = poisson)

by <- data.frame(
    by = c("(4 & 6)", "(4 & 6)", "(8)"),
    cyl = c(4, 6, 8))

marginalmeans(mod, by = by)
#>        by marginalmean conf.low conf.high      p.value
#> 1 (4 & 6)     3.861699 2.856543  5.220546 1.587406e-18
#> 2     (8)     3.593085 2.106679  6.128253 2.662085e-06

And we can use the hypothesis argument to compare those new collapsed subgroups:

marginalmeans(mod, by = by, hypothesis = "pairwise")
#>            term marginalmean  conf.low conf.high   p.value
#> 1 (4 & 6) - (8)     1.074758 0.5149323  2.243218 0.8477116
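
When reading these two outputs, it helps to check the arithmetic against the per-level marginal means. The sketch below is only an observation about the printed numbers, under the assumption that the Poisson model uses its default log link; it is not a statement about how the package computes them. The per-level values are copied from the tidy() output shown in the Tidy summaries section.

```r
# Per-level marginal means for cyl, copied from the tidy() output in the
# "Tidy summaries" section (cyl = 4, 6, 8).
mm <- c("4" = 3.826645, "6" = 3.897074, "8" = 3.593085)

# Averaging cyl 4 and 6 on the log scale and then exponentiating ...
collapsed_46 <- exp(mean(log(mm[c("4", "6")])))
# ... is closer to the printed 3.861699 than the plain arithmetic mean.
arithmetic_46 <- mean(mm[c("4", "6")])

# The "pairwise" row matches the ratio of the two collapsed groups,
# i.e. a difference taken on the log scale and then exponentiated.
ratio <- collapsed_46 / mm[["8"]]

round(c(collapsed = collapsed_46, arithmetic = arithmetic_46, ratio = ratio), 6)
```

Read this way, the row labelled (4 & 6) - (8) behaves like a ratio: a value close to 1, with an interval that contains 1, indicates no detectable difference between the collapsed groups, which agrees with the large p-value.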

Custom Contrasts and Linear Combinations

See the vignette on Custom Contrasts and Combinations.

Tidy summaries

The summary, tidy, and glance functions are also available to summarize and manipulate the results:

mm <- marginalmeans(mod)

tidy(mm)
#>   term value estimate      p.value conf.low conf.high
#> 1   am     0 3.271615 3.955749e-15 2.434117  4.397269
#> 2   am     1 4.344309 6.833090e-19 3.141106  6.008399
#> 3  cyl     4 3.826645 2.336920e-08 2.389395  6.128417
#> 4  cyl     6 3.897074 1.573029e-12 2.672537  5.682685
#> 5  cyl     8 3.593085 2.662085e-06 2.106679  6.128253
#> 6   vs     0 3.794921 1.882061e-12 2.618291  5.500313
#> 7   vs     1 3.745245 5.820764e-10 2.466321  5.687361

glance(mm)
#>        aic     bic r2.nagelkerke      rmse nobs         F    logLik
#> 1 113.0034 120.332     0.6720045 0.4368402   32 0.7371939 -51.50168

summary(mm)
#>   Term Value  Mean   Pr(>|z|) 2.5 % 97.5 %
#> 1   am     0 3.272 3.9557e-15 2.434  4.397
#> 2   am     1 4.344 < 2.22e-16 3.141  6.008
#> 3  cyl     4 3.827 2.3369e-08 2.389  6.128
#> 4  cyl     6 3.897 1.5730e-12 2.673  5.683
#> 5  cyl     8 3.593 2.6621e-06 2.107  6.128
#> 6   vs     0 3.795 1.8821e-12 2.618  5.500
#> 7   vs     1 3.745 5.8208e-10 2.466  5.687
#> 
#> Model type:  glm 
#> Prediction type:  link 
#> Results averaged over levels of: cyl, vs, am

Thanks to those tidiers, we can also present the results in the style of a regression table using the modelsummary package. For examples, see the tables and plots vignette.

Case study: Multinomial Logit

This example requires version 0.2.0 of the marginaleffects package.

To begin, we generate data and estimate a large model:

library(nnet)
library(marginaleffects)

set.seed(1839)
n <- 1200
x <- factor(sample(letters[1:3], n, TRUE))
y <- vector(length = n)
y[x == "a"] <- sample(letters[4:6], sum(x == "a"), TRUE)
y[x == "b"] <- sample(letters[4:6], sum(x == "b"), TRUE, c(1 / 4, 2 / 4, 1 / 4))
y[x == "c"] <- sample(letters[4:6], sum(x == "c"), TRUE, c(1 / 5, 3 / 5, 2 / 5))

dat <- data.frame(x = x, y = factor(y))
tmp <- as.data.frame(replicate(20, factor(sample(letters[7:9], n, TRUE))))
dat <- cbind(dat, tmp)
void <- capture.output({
    mod <- multinom(y ~ ., dat)
})

If we try to compute marginal means over all predictors, the call fails because the full prediction grid won’t fit in memory:

marginalmeans(mod, type = "probs")
#> Error: You are trying to create a prediction grid with more than 1 billion rows, which is likely to exceed the memory and computational power available on your local machine. Presumably this is because you are considering many variables with many levels. All of the functions in the `marginaleffects` package include arguments to specify a restricted list of variables over which to create a prediction grid.

Use the variables and variables_grid arguments to compute marginal means over a more reasonably sized grid:

marginalmeans(mod,
              type = "probs",
              variables = c("x", "V1"),
              variables_grid = paste0("V", 2:3))
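
To see why the first call fails and the second does not, it is enough to count grid rows. The following sketch does that count in base R. The names n_levels, full_grid, and restricted_grid are introduced here for illustration, and the restricted count assumes that every predictor outside variables and variables_grid is held at a single value and therefore adds no rows.

```r
# The example model has 21 categorical predictors (x and V1 to V20),
# each with three levels.
n_levels <- c(x = 3, setNames(rep(3, 20), paste0("V", 1:20)))

# Rows in a grid that crosses every level of every predictor.
full_grid <- prod(n_levels)

# Rows when only x, V1, V2 and V3 are crossed (assumption: every other
# predictor is held at a single value, so it adds no rows).
restricted_grid <- prod(n_levels[c("x", "V1", "V2", "V3")])

format(full_grid, big.mark = ",")
restricted_grid
```

The grid size grows multiplicatively with each added predictor, so dropping even a few variables from the grid reduces it by orders of magnitude.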

Plot conditional marginal means

The marginaleffects package offers several functions to plot how some quantities vary as a function of others. There is no analogous function for marginal means. However, it is easy to achieve a similar effect using the predictions() function, its by argument, and standard plotting functions. In the example below, we take these steps:

  1. Estimate a model with one continuous (hp) and one categorical regressor (cyl).
  2. Create a perfectly “balanced” data grid for each combination of hp and cyl. This is specified by the user in the datagrid() call.
  3. Compute fitted values (aka “adjusted predictions”) for each cell of the grid.
  4. Use the by argument to take the average of predicted values for each value of hp, across margins of cyl.
  5. Compute standard errors around the averaged predicted values (i.e., marginal means).
  6. Create symmetric confidence intervals in the usual manner.
  7. Plot the results.

library(ggplot2)

mod <- lm(mpg ~ hp + factor(cyl), data = mtcars)

p <- predictions(mod,
    by = "hp",
    newdata = datagrid(
        model = mod,
        hp = seq(100, 120, length.out = 10),
        cyl = mtcars$cyl))

ggplot(p) +
    geom_ribbon(aes(hp, ymin = conf.low, ymax = conf.high), alpha = .2) +
    geom_line(aes(hp, predicted))
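
For a linear model, steps 2 to 6 can also be written out by hand, which makes explicit where the standard errors come from. The sketch below uses base R only. It is an illustration under stated assumptions, not the package's code: the names X_bar and manual are introduced here, and the interval uses normal quantiles, which is an assumption about the convention rather than something the text above specifies.

```r
mod <- lm(mpg ~ hp + factor(cyl), data = mtcars)

# Balanced grid: every hp value crossed with every observed cyl value.
grid <- expand.grid(
    hp = seq(100, 120, length.out = 10),
    cyl = sort(unique(mtcars$cyl)))

# Design matrix for the grid, built the same way predict.lm() builds it.
tt <- delete.response(terms(mod))
mf <- model.frame(tt, grid, xlev = mod$xlevels)
X <- model.matrix(tt, mf, contrasts.arg = mod$contrasts)

# Average the design rows within each hp value (i.e. across cyl).
hp_values <- sort(unique(grid$hp))
X_bar <- t(sapply(hp_values, function(h) {
    colMeans(X[grid$hp == h, , drop = FALSE])
}))

# Averaged prediction and its standard error: sqrt(diag(X_bar V X_bar')).
predicted <- drop(X_bar %*% coef(mod))
std.error <- sqrt(rowSums((X_bar %*% vcov(mod)) * X_bar))

# Symmetric 95% interval using normal quantiles (an assumption here).
z <- qnorm(0.975)
manual <- data.frame(
    hp = hp_values,
    predicted = predicted,
    conf.low = predicted - z * std.error,
    conf.high = predicted + z * std.error)
manual
```

Because the model is linear in its coefficients, averaging the design rows first and then multiplying by the coefficients gives the same point estimates as averaging the cell-level predictions, while also yielding the variance of the average directly from the coefficient covariance matrix.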