Multilevel Models for Nested Data

Mark Andrews

Three-level nested structure

Children nested within classes, classes nested within schools.

The classroom dataset: 1,190 children in 312 classes in 107 schools.

Identifiers

classroom_df <- classroom
classroom_df

schoolid: unique school identifier. classid: unique class identifier across the whole dataset. classid2: class number within school (1, 2, … per school).

Two-level model: school-level variation

Allow the effect of ses to vary across schools.

M14 <- lmer(mathscore ~ ses + (ses | schoolid), data = classroom_df)
fixef(M14)
VarCorr(M14)

Three-level model

The relationship between ses and mathscore can vary across schools. Furthermore, within each school, there may be variation in the ses effect across classes.

A three-level varying intercepts and slopes model:

\[ \begin{aligned} \text{for } i \in 1\ldots n,\quad y_i &\sim N(\mu_i, \sigma^2),\\ \mu_i &= \gamma_{[c_i]0} + \gamma_{[c_i]1}\, x_i,\\ \text{for } j \in 1\ldots J,\quad \vec{\gamma}_{j} &\sim N(\vec{\beta}_{[s_j]},\, \Sigma_c),\\ \text{for } k \in 1\ldots K,\quad \vec{\beta}_{k} &\sim N(\vec{b},\, \Sigma_s). \end{aligned} \]

\(c_i \in 1\ldots J\) is the class of child \(i\); \(s_j \in 1\ldots K\) is the school of class \(j\). \(\vec{\gamma}_j\) are class-level coefficients; \(\vec{\beta}_k\) are school-level coefficients.

Rewriting: mean plus offset

Since \(\vec{\gamma}_j \sim N(\vec{\beta}_{[s_j]}, \Sigma_c)\), write \(\vec{\gamma}_j = \vec{\beta}_{[s_j]} + \vec{\xi}_j\) where \(\vec{\xi}_j \sim N(0, \Sigma_c)\).

Since \(\vec{\beta}_k \sim N(\vec{b}, \Sigma_s)\), write \(\vec{\beta}_k = \vec{b} + \vec{\zeta}_k\) where \(\vec{\zeta}_k \sim N(0, \Sigma_s)\).

Substituting:

\[ \vec{\gamma}_j = \vec{\beta}_{[s_j]} + \vec{\xi}_j = \vec{b} + \vec{\zeta}_{[s_j]} + \vec{\xi}_j. \]

Fixed effects, school effects, class effects

This allows us to rewrite the original model as:

\[ \begin{aligned} \text{for } i \in 1\ldots n,\quad y_i &\sim N(\mu_i, \sigma^2),\\ \mu_i &= \underbrace{b_{0} + b_{1}\, x_i}_{\text{fixed effects}} + \underbrace{\zeta_{[z_i]0} + \zeta_{[z_i]1}\, x_i}_{\text{school random effects}} + \underbrace{\xi_{[c_i]0} + \xi_{[c_i]1}\, x_i}_{\text{class random effects}},\\ \text{for } j \in 1\ldots J,\quad \vec{\xi}_{j} &\sim N(0, \Sigma_c),\\ \text{for } k \in 1\ldots K,\quad \vec{\zeta}_{k} &\sim N(0, \Sigma_s). \end{aligned} \]

\(z_i = s_{[c_i]}\) is the school of child \(i\).

Singular fit: correlated class effects

M15 <- lmer(mathscore ~ ses + (ses | schoolid) + (ses | classid),
            data = classroom_df)
isSingular(M15)
VarCorr(M15)

The class-level correlation is exactly \(\pm 1\): the \(2 \times 2\) covariance matrix is rank-deficient. Fewer than 4 children per class on average is insufficient to identify intercept and slope variation independently at the class level.

Forcing independence: slope collapses to zero

M16 <- lmer(mathscore ~ ses + (ses | schoolid) + (ses || classid),
            data = classroom_df)
isSingular(M16)
VarCorr(M16)

Removing the correlation does not resolve the problem. The class-level slope SD collapses to zero: the data contain no evidence for class-level variation in the ses slope.

The appropriate model

Restrict the class-level term to a random intercept only.

M17 <- lmer(mathscore ~ ses + (ses | schoolid) + (1 | classid),
            data = classroom_df)
isSingular(M17)
fixef(M17)
VarCorr(M17)

Not singular. School-level variation in both intercepts and slopes is retained; class-level variation is in intercepts only.

Model comparison

anova(M14, M17, M16)

M17 significantly improves over M14: adding a class-level random intercept matters. M16 adds nothing over M17: the class-level slope is at the boundary.

Nested identifiers

classid2 labels classes within each school starting from 1. Passing it directly to lmer would be incorrect: class 1 in school A and class 1 in school B are different classes.

The / syntax creates a nested random effect:

M18 <- lmer(mathscore ~ ses + (1 | schoolid / classid2), data = classroom_df)

This is equivalent to creating the unique identifier explicitly:

M19 <- lmer(mathscore ~ ses + (1 | schoolid) + (1 | schoolid:classid2),
            data = classroom_df)

All three are equivalent

M20 <- lmer(mathscore ~ ses + (1 | schoolid) + (1 | classid), data = classroom_df)
fixef(M18)
fixef(M19)
fixef(M20)

Variance components as a guide

VarCorr(M17)

School-level: variation in both average scores and the ses effect. Class-level: variation in average scores only (no ses slope component). Levels with near-zero variance add little and can usually be dropped.