Normal Random Effects Models

Mark Andrews

Grouped continuous data

Per-capita alcohol consumption (litres/year) in 190 countries, one observation per country per year.

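library(tidyverse)  # data wrangling and plotting helpers used in later code
alcohol_df <- alcohol  # `alcohol` is assumed to be loaded already
alcohol_df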

Multiple observations per country: within-country (year-to-year) variation is different from between-country variation.

The non-multilevel model

One mean per country, estimated independently. Here \(x_i \in \{1, \ldots, J\}\) is the country of observation \(i\).

\[ y_i \sim N(\mu_{[x_i]}, \sigma^2), \quad \mu_1, \ldots, \mu_J \text{ treated as free parameters} \]

This is a one-way ANOVA model. Countries are treated as entirely separate problems.
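A minimal sketch of what "estimated independently" means in practice, using hypothetical toy data rather than the alcohol data: in the flat model, each country's estimated mean is just that country's sample mean, so no information is shared between countries. The names `toy`, `flat_fit` and `group_means` are illustrative only.

```r
# Hypothetical toy data: three groups standing in for countries,
# with unequal numbers of observations.
set.seed(1)
sizes <- c(5, 3, 8)
toy <- data.frame(
  country = factor(rep(c("A", "B", "C"), times = sizes)),
  y = rnorm(sum(sizes), mean = rep(c(2, 6, 10), times = sizes))
)

# Flat (no pooling) model: one free mean per country, no intercept.
flat_fit <- lm(y ~ 0 + country, data = toy)
flat_means <- coef(flat_fit)

# Per-country sample means, computed without any model.
group_means <- tapply(toy$y, toy$country, mean)

# The two sets of estimates agree up to floating-point error.
all.equal(unname(flat_means), as.numeric(group_means))
```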

The multilevel model

The country means are not free parameters: they are drawn from a normal distribution.

\[ \begin{aligned} y_i &\sim N(\mu_{[x_i]}, \sigma^2) \\ \mu_j &\sim N(\phi, \tau^2), \quad j = 1, \ldots, J \end{aligned} \]

Two sources of variance

Writing \(\mu_j = \phi + \xi_j\) with \(\xi_j \sim N(0, \tau^2)\):

\[ y_i = \phi + \xi_{[x_i]} + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2) \]

\(\tau^2\) is the between-country variance. \(\sigma^2\) is the within-country (year-to-year) variance.
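The decomposition can be checked by simulating from the model. This is a sketch under assumed parameter values (`phi_true`, `tau_true`, `sigma_true`, `J` and `n` are all hypothetical and unrelated to the alcohol data): the overall variance should be close to \(\tau^2 + \sigma^2\), and the average within-country variance close to \(\sigma^2\).

```r
set.seed(42)

# Assumed (hypothetical) settings: J countries, n years each.
J <- 2000
n <- 10
phi_true   <- 5   # grand mean
tau_true   <- 2   # between-country SD
sigma_true <- 1   # within-country SD

country <- rep(seq_len(J), each = n)               # country index x_i
xi  <- rnorm(J, mean = 0, sd = tau_true)           # one effect per country
eps <- rnorm(J * n, mean = 0, sd = sigma_true)     # one error per observation
y   <- phi_true + xi[country] + eps

# Total variance: approximately tau^2 + sigma^2.
total_var <- var(y)

# Average within-country variance: approximately sigma^2.
within_var <- mean(tapply(y, country, var))

# Variance of the country sample means: approximately tau^2 + sigma^2 / n.
between_var <- var(tapply(y, country, mean))

c(total = total_var, within = within_var, between = between_var)
```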

Fitting in R

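library(lme4)
# Random-intercept model: a grand mean plus a normally distributed country effect
M6 <- lmer(alcohol ~ 1 + (1 | country), data = alcohol_df)
fixef(M6)  # estimate of phi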

VarCorr(M6)

The Std.Dev. value in the country row is \(\hat\tau\); the value in the Residual row is \(\hat\sigma\).
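To see how these quantities can be read off programmatically, here is a sketch on simulated data with known parameters (the data frame `sim_df` and all `*_true` values are hypothetical, not the alcohol data). It assumes the `as.data.frame()` method for `VarCorr()` output in lme4, which returns one row per variance component with the standard deviation in the `sdcor` column.

```r
library(lme4)

set.seed(101)

# Hypothetical simulation settings.
J <- 200
n <- 10
phi_true   <- 5
tau_true   <- 2
sigma_true <- 1

idx <- rep(seq_len(J), each = n)   # country index for each row
sim_df <- data.frame(
  country = factor(idx),
  y = phi_true + rnorm(J, 0, tau_true)[idx] + rnorm(J * n, 0, sigma_true)
)

# Same model structure as in the slides, fitted to the simulated data.
sim_fit <- lmer(y ~ 1 + (1 | country), data = sim_df)

# One row per variance component; `sdcor` holds the standard deviations.
vc <- as.data.frame(VarCorr(sim_fit))
tau_hat   <- vc$sdcor[vc$grp == "country"]
sigma_hat <- vc$sdcor[vc$grp == "Residual"]
phi_hat   <- unname(fixef(sim_fit))

c(phi = phi_hat, tau = tau_hat, sigma = sigma_hat)
```

The estimates should land near the values used to generate the data, which is a useful sanity check before interpreting the output for real data.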

Intraclass correlation coefficient

Each observation has total variance \(\tau^2 + \sigma^2\). The proportion due to between-country differences is the ICC:

\[ \mathrm{ICC} = \frac{\tau^2}{\tau^2 + \sigma^2} \]

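# Use *_hat names so the variable does not shadow the sigma() function
tau_hat   <- sqrt(as.numeric(VarCorr(M6)$country))  # between-country SD
sigma_hat <- sigma(M6)                               # within-country (residual) SD
tau_hat^2 / (tau_hat^2 + sigma_hat^2)                # ICC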

Two observations from the same country have correlation equal to the ICC.
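A short derivation of this fact, assuming (as the model specifies) that the country effects \(\xi_j\) and the errors \(\epsilon_i\) are mutually independent. For two distinct observations \(i \neq i'\) from the same country \(j\), the only shared random term is \(\xi_j\):

\[ \begin{aligned} \mathrm{Cov}(y_i, y_{i'}) &= \mathrm{Cov}(\phi + \xi_j + \epsilon_i,\; \phi + \xi_j + \epsilon_{i'}) = \mathrm{Var}(\xi_j) = \tau^2 \\ \mathrm{Corr}(y_i, y_{i'}) &= \frac{\mathrm{Cov}(y_i, y_{i'})}{\sqrt{\mathrm{Var}(y_i)\,\mathrm{Var}(y_{i'})}} = \frac{\tau^2}{\tau^2 + \sigma^2} \end{aligned} \]

For observations from different countries there is no shared term, so the covariance is 0.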

Interpretation of ICC

An ICC close to 1: most variation is between countries (countries differ substantially, each is consistent across years).

An ICC close to 0: most variation is within countries (year-to-year fluctuation dominates).

The ICC also tells us how badly the standard independence assumption is violated: the larger the ICC, the more strongly observations from the same country are correlated, so they cannot be treated as independent draws.

Shrinkage

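# Flat estimates: one independently estimated mean per country
flat_ests <- coef(lm(alcohol ~ 0 + country, data = alcohol_df)) |>
  enframe(name = "country", value = "flat") |>
  mutate(country = str_remove(country, "^country"))  # drop the prefix lm() adds
# Multilevel estimates: grand mean plus each country's predicted effect
ml_ests   <- coef(M6)$country |>
  rownames_to_column("country") |> rename(multilevel = `(Intercept)`)
# Dashed line is y = x; points off it have been shrunk
inner_join(flat_ests, ml_ests, by = "country") |>
  ggplot(aes(x = flat, y = multilevel)) +
  geom_point(alpha = 0.5) +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed")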

Multilevel estimates are shrunk toward the grand mean \(\hat\phi\) relative to the non-multilevel estimates.

Why shrinkage is right

Countries with few data points have high-variance MLEs, so an extreme estimate may reflect noise as much as a real difference. The multilevel model regularises these estimates by pulling them toward the grand mean.

The strength of shrinkage depends on \(\tau / \sigma\) and on the number of observations per country: when between-country variation is large relative to within-country noise, there is less shrinkage because each country’s own data are informative about its mean.
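The dependence on \(\tau / \sigma\) can be made concrete with the standard partial-pooling result for this model. Treating \(\phi\), \(\tau\) and \(\sigma\) as known, the conditional mean of \(\mu_j\) is a weighted average of the country's sample mean and the grand mean:

\[ \hat\mu_j = w_j \bar{y}_j + (1 - w_j)\phi, \qquad w_j = \frac{\tau^2}{\tau^2 + \sigma^2 / n_j} \]

The symbols \(\bar{y}_j\) (sample mean of country \(j\)), \(n_j\) (its number of observations) and \(w_j\) are notation introduced here, not in the slides above. The sketch below evaluates \(w_j\) for hypothetical values; `shrink_weight` is an illustrative helper, not a function from any package.

```r
# Weight placed on a country's own sample mean; 1 - weight goes to phi.
shrink_weight <- function(n_j, tau, sigma) {
  tau^2 / (tau^2 + sigma^2 / n_j)
}

n_j <- c(1, 5, 20, 100)   # hypothetical numbers of yearly observations

# Large tau / sigma: countries differ a lot, so little shrinkage.
w_high_ratio <- shrink_weight(n_j, tau = 4, sigma = 1)
# Small tau / sigma: within-country noise dominates, so strong shrinkage.
w_low_ratio  <- shrink_weight(n_j, tau = 0.5, sigma = 1)

round(rbind(n_j, w_high_ratio, w_low_ratio), 3)

# Worked example with hypothetical numbers:
# w = 1 / (1 + 4 / 2) = 1/3, so the estimate is 1/3 * 12 + 2/3 * 6 = 8.
ybar_j <- 12
phi    <- 6
w      <- shrink_weight(n_j = 2, tau = 1, sigma = 2)
mu_hat <- w * ybar_j + (1 - w) * phi
mu_hat
```

The weight rises toward 1 as either \(n_j\) or \(\tau / \sigma\) grows, which is why countries with many observations are barely moved while those with few are pulled strongly toward the grand mean.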