Bayesian Mixed Effects Models

Mark Andrews

When are mixed models needed?

Data are grouped when observations share a common context:

  • Students within schools
  • Patients within hospitals
  • Repeated measures within individuals
  • Plots within regions

Observations within a group are typically correlated. Ignoring this usually leads to underestimated standard errors, and so to overconfident inferences.
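The effect on standard errors can be checked by simulation. The sketch below uses only base R, and every value in it (20 groups, 25 observations each, a group-level predictor, unit variances) is an assumption chosen for illustration, not taken from the course data. It repeatedly fits an ordinary lm that ignores the grouping and compares the standard error that lm reports with the actual spread of the slope estimates across simulations.

```r
# Simulated illustration; all settings are assumptions, not the course data.
set.seed(101)

n_groups <- 20    # e.g. schools
n_per    <- 25    # observations per group
n_sims   <- 500

slope_hat <- numeric(n_sims)
se_naive  <- numeric(n_sims)

for (s in seq_len(n_sims)) {
  group   <- rep(seq_len(n_groups), each = n_per)
  x_group <- rnorm(n_groups)             # predictor measured at group level
  u       <- rnorm(n_groups, sd = 1)     # shared group effect: induces correlation
  x       <- x_group[group]
  y       <- 0.5 * x + u[group] + rnorm(n_groups * n_per, sd = 1)

  fit <- lm(y ~ x)                       # ignores the grouping
  slope_hat[s] <- coef(fit)[["x"]]
  se_naive[s]  <- summary(fit)$coefficients["x", "Std. Error"]
}

actual_sd     <- sd(slope_hat)           # true sampling variability of the slope
mean_naive_se <- mean(se_naive)          # what lm claims on average
c(actual_sd = actual_sd, mean_naive_se = mean_naive_se)
```

Under these settings the reported standard error should come out well below the actual variability of the estimate, which is the sense in which ignoring the grouping overstates precision.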

Fixed and random effects

  • Fixed effects: population-level parameters common to all groups
  • Random effects: group-level deviations, treated as random draws from a population distribution
  • Partial pooling: group estimates are shrunk toward the overall mean
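Partial pooling can be made concrete in the simplest case, a normal model for group means in which the overall mean and both standard deviations are treated as known. That is an assumption made here to get a closed form; a fitted mixed model estimates these quantities. Each group's estimate is then a precision-weighted average of its own sample mean and the overall mean. The names mu, tau and sigma below are illustrative.

```r
# Shrinkage of group means in a normal model with known variances.
# mu, tau, sigma and the group sizes are assumed values for illustration.
set.seed(202)

mu    <- 50                    # overall (population) mean
tau   <- 2                     # between-group standard deviation
sigma <- 5                     # within-group standard deviation
n_j   <- c(3, 10, 50, 200)     # group sizes

# Simulate a true mean and a sample mean for each group
group_mean_true <- rnorm(length(n_j), mean = mu, sd = tau)
ybar <- sapply(seq_along(n_j), function(j) {
  mean(rnorm(n_j[j], mean = group_mean_true[j], sd = sigma))
})

# Weight on the group's own mean: data precision / (data + prior precision)
w <- (n_j / sigma^2) / (n_j / sigma^2 + 1 / tau^2)

partial_pool <- w * ybar + (1 - w) * mu

data.frame(n_j, no_pooling = ybar, weight = w, partial_pooling = partial_pool)
```

The weight grows with group size, so small groups are pulled strongly toward the overall mean while large groups stay close to their own sample mean.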

The varying intercept model

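Each group has its own baseline; the effect of predictors is the same across groups. Allowing the slope to vary by group as well gives a varying intercept and slope model, fitted here with lme4 and with brms: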

# Classical fit with lme4
M_15 <- lmer(mathach ~ ses + (ses | school), data = mathach_df)

# Bayesian fit with brms, using its default priors
M_16 <- brm(mathach ~ ses + (ses | school),
  cores = 4, data = mathach_df)

(ses | school): intercept and slope of ses both vary by school.
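Written out, a model with this structure can be stated as follows. The notation is an assumption introduced here for illustration: \(i\) indexes students, \(j\) indexes schools, \(\beta_0, \beta_1\) are the population-level intercept and slope, and \(u_{0j}, u_{1j}\) are school \(j\)'s deviations from them.

\[
\begin{aligned}
\mathrm{mathach}_{ij} &= (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\,\mathrm{ses}_{ij} + \epsilon_{ij}, \qquad \epsilon_{ij} \sim N(0, \sigma^2),\\
(u_{0j}, u_{1j})^\top &\sim N(0, \Sigma).
\end{aligned}
\]

The covariance matrix \(\Sigma\) holds the variances of the intercept and slope deviations and their correlation. Fixing \(u_{1j} = 0\) for every school recovers the varying intercept model.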

The formula syntax

Formula            Meaning
(1 | group)        Varying intercepts
(x | group)        Varying intercepts and slopes of x
(0 + x | group)    Varying slopes only

The same syntax works in lme4 and brms.

Posterior summaries for mixed models

fixef(M_16)    # population-level effects
ranef(M_16)    # group-level deviations
prior_summary(M_16)    # priors used for each parameter

The posterior for random effects captures uncertainty about each group’s deviation. Classical methods instead report point predictions of these deviations, conditional on the estimated variance components.

Marginalisation

[Diagram: 2D parameter space with joint posterior shown as contours. Collapsing one dimension illustrates marginalisation to a univariate posterior.]

\[\int p(x, y)\, dx = p(y)\]

Marginalising over group-level parameters gives the population-level posterior.
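The integral can be checked numerically on a grid. The sketch below uses base R and an assumed joint density, a standard bivariate normal with correlation 0.6, standing in for a joint posterior. It sums the joint density over x and compares the result with the known marginal of y, which is a standard normal.

```r
# Numerical marginalisation on a grid. The joint density is an assumed
# example (standard bivariate normal, correlation rho), not a fitted posterior.
x   <- seq(-6, 6, length.out = 601)
y   <- seq(-6, 6, length.out = 601)
dx  <- x[2] - x[1]
rho <- 0.6

# joint[i, j] = p(x[i], y[j])
joint <- outer(x, y, function(x, y) {
  exp(-(x^2 - 2 * rho * x * y + y^2) / (2 * (1 - rho^2))) /
    (2 * pi * sqrt(1 - rho^2))
})

# Integrate over x (the rows) separately for each value of y (the columns)
p_y <- colSums(joint) * dx

# Largest discrepancy from the exact marginal, a standard normal density
max_error <- max(abs(p_y - dnorm(y)))
max_error
```

With MCMC draws no explicit integration is needed: keeping the draws of one parameter and ignoring the others gives draws from its marginal posterior.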

Why Bayesian mixed models converge more reliably

Classical ML/REML estimation (lme4) often fails to converge, or returns a singular fit, when there are:

  • Many varying slopes
  • Few groups
  • Binary outcomes with random effects
  • Highly unbalanced designs

brms often handles these cases because MCMC does not optimise; it explores the posterior. Priors on variance components help keep estimates away from the boundary (a variance of exactly zero or a correlation of ±1).
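The boundary problem with few groups can be seen without fitting a mixed model at all. The base R sketch below uses the one-way ANOVA moment estimator of the between-group variance, truncated at zero, as a simpler stand-in for what an optimiser does; it is not the algorithm lme4 uses. The settings (5 groups of 10, a small between-group standard deviation) are assumptions chosen to make the effect visible.

```r
# How often does a point estimate of the between-group variance hit zero?
# Moment (ANOVA) estimator used as a stand-in; all settings are assumed.
set.seed(303)

n_groups <- 5
n_per    <- 10
tau      <- 0.5    # true between-group standard deviation (not zero)
sigma    <- 2      # within-group standard deviation
n_sims   <- 1000

tau2_hat <- replicate(n_sims, {
  group_id <- rep(seq_len(n_groups), each = n_per)
  u <- rnorm(n_groups, sd = tau)
  y <- u[group_id] + rnorm(n_groups * n_per, sd = sigma)

  ms <- anova(lm(y ~ factor(group_id)))[["Mean Sq"]]  # between, within
  max(0, (ms[1] - ms[2]) / n_per)                     # truncate at the boundary
})

prop_boundary <- mean(tau2_hat == 0)
prop_boundary
```

A sizeable share of the simulated data sets give an estimate of exactly zero even though the true variance is positive. A posterior distribution under a proper prior on the standard deviation spreads its mass over positive values instead of collapsing to that point.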

Bayesian logistic mixed models

M_17 <- brm(I(cigs > 0) ~ ses + (1 | school),
  data = mathach_df,
  family = bernoulli(),
  cores = 4
)

glmer frequently gives convergence warnings or singular fits here. brm needs no extra configuration beyond the family argument.

Summary

  • Mixed effects models handle grouped data through partial pooling
  • The lme4 formula syntax carries over unchanged to brms
  • The posterior includes uncertainty over both fixed and random effects
  • Bayesian mixed models tend to fit more reliably than ML/REML for complex random effects structures
  • All the tools from earlier sessions (diagnostics, model comparison, pp_check) apply here too