Multilevel Models for Crossed Data

Mark Andrews

Nested versus crossed

Nested: each lower-level unit belongs to exactly one higher-level unit. Classes belong to one school; pupils belong to one class.

Crossed: each observation is classified by two or more grouping factors, and neither factor is nested within the other. In a lexical decision experiment, participant \(j\) responds to word \(k\): every participant sees many words and every word is seen by many participants, so neither belongs to the other.
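The distinction can be checked directly from a cross-tabulation of the two grouping factors. The following is a minimal sketch in base R; the toy data frames and the helper `is_nested()` are hypothetical names introduced here for illustration and are not part of the BLP analysis.

```r
# Hypothetical toy data, for illustration only (not the BLP data).
nested_df <- data.frame(
  school = c("A", "A", "A", "B", "B", "B"),
  class  = c("a1", "a1", "a2", "b1", "b2", "b2")
)
crossed_df <- expand.grid(
  participant = c("p1", "p2", "p3"),
  word        = c("cat", "dog"),
  stringsAsFactors = FALSE
)

# `inner` is nested in `outer` if every level of `inner`
# co-occurs with exactly one level of `outer`.
is_nested <- function(inner, outer) {
  tab <- table(inner, outer)      # rows: inner levels, columns: outer levels
  all(rowSums(tab > 0) == 1)
}

is_nested(nested_df$class, nested_df$school)         # classes sit in one school each
is_nested(crossed_df$word, crossed_df$participant)   # words are seen by several participants
is_nested(crossed_df$participant, crossed_df$word)   # participants see several words
```

When the check fails in both directions, as it does for participants and words, the factors are crossed.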

Nested and crossed structures

Each response \(r_i\) is linked to exactly one participant \(s_j\) and exactly one word \(w_k\). The two factors are crossed: the same participant appears with many words, and the same word appears with many participants.

The BLP data

58 words, 78 participants, lexical decision response times.

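library(dplyr)
blp_df <- blp_short2  # assumes blp_short2 is already loaded
# Standardise frequency; scale() returns a matrix, so [, 1] extracts a vector
blp_df <- mutate(blp_df, freq2 = scale(freq)[, 1])
blp_df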

The model equations

\[ \begin{aligned} y_i &\sim N(\mu_i, \sigma^2) \\ \mu_i &= \beta_{[s_i]} + \gamma_{[w_i]} \\ \beta_j &\sim N(b_s, \tau_s^2), \quad j = 1, \ldots, J \\ \gamma_k &\sim N(b_w, \tau_w^2), \quad k = 1, \ldots, K \end{aligned} \]

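Here \(s_i\) and \(w_i\) denote the participant and the word of observation \(i\). Participant effects \(\beta_j\) and word effects \(\gamma_k\) are modelled as independent of each other, and neither grouping factor is nested within the other; this is what makes the random effects crossed. Only the sum \(b_s + b_w\) is identified by the data, so in practice the model is fitted with a single intercept and zero-mean participant and word effects.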
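A worked consequence of these assumptions may help. Writing \(y_i = \beta_{[s_i]} + \gamma_{[w_i]} + \epsilon_i\) with \(\epsilon_i \sim N(0, \sigma^2)\) independent of the participant and word effects (the symbol \(\epsilon_i\) is introduced here for illustration), the marginal variance and the covariance between two distinct observations \(i \neq i'\) are:

\[ \begin{aligned} \operatorname{Var}(y_i) &= \tau_s^2 + \tau_w^2 + \sigma^2 \\ \operatorname{Cov}(y_i, y_{i'}) &= \begin{cases} \tau_s^2 & \text{if } s_i = s_{i'},\ w_i \neq w_{i'} \\ \tau_w^2 & \text{if } s_i \neq s_{i'},\ w_i = w_{i'} \\ 0 & \text{if } s_i \neq s_{i'},\ w_i \neq w_{i'} \end{cases} \end{aligned} \]

Two responses are therefore correlated if they share a participant or share a word, and uncorrelated otherwise.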

Participant effects only

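library(lme4)
# Random intercepts and random freq2 slopes by participant
M24 <- lmer(rt ~ freq2 + (freq2 | participant), data = blp_df)
fixef(M24)    # population-level intercept and freq2 slope
VarCorr(M24)  # SDs and correlation of the participant-level effects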

High-frequency words are recognised faster, which corresponds to a negative fixed effect of freq2. The random effects capture how much both the baseline response time and this frequency effect vary across participants.

Adding word-level effects

Some words are consistently faster or slower to recognise, regardless of who is responding.

# Add a random intercept for each word (item), crossed with participant
M25 <- lmer(rt ~ freq2 + (freq2 | participant) + (1 | item), data = blp_df)
VarCorr(M25)

The item-level SD in the VarCorr output quantifies systematic variability across words beyond what freq2 explains.
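Written out, the model with both participant and word effects can be sketched as follows. The notation is an assumption made here for illustration: \(x_i\) is the freq2 value of observation \(i\), \(b_0\) and \(b_1\) are the fixed intercept and slope, \(\Sigma_s\) is the covariance matrix of the participant intercepts and slopes, and all random effects have mean zero.

\[ \begin{aligned} y_i &\sim N(\mu_i, \sigma^2) \\ \mu_i &= (b_0 + \beta_{0[s_i]} + \gamma_{[w_i]}) + (b_1 + \beta_{1[s_i]})\, x_i \\ (\beta_{0j}, \beta_{1j}) &\sim N(\mathbf{0}, \Sigma_s), \quad j = 1, \ldots, J \\ \gamma_k &\sim N(0, \tau_w^2), \quad k = 1, \ldots, K \end{aligned} \]

The item-level SD reported by VarCorr is the estimate of \(\tau_w\) in this notation.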

Model comparison

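anova(M24, M25)  # refits both models by maximum likelihood, then a likelihood ratio test

A significant improvement indicates that words differ in their response times beyond what frequency explains. Because the null value of the item variance lies on the boundary of its parameter space, the reported p-value is conservative.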
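The comparison can be tried end to end on simulated data. The sketch below assumes the lme4 package is installed; the data frame `sim_df`, the effect sizes, and the model names are hypothetical, and the models use random intercepts only to keep the example small. Since the two models share the same fixed effects, the REML fits can also be compared without refitting by passing `refit = FALSE`.

```r
library(lme4)

# Hypothetical simulated crossed design: every participant responds to every item.
set.seed(101)
J <- 30   # participants
K <- 20   # items
sim_df <- expand.grid(participant = factor(1:J), item = factor(1:K))

# freq2 is a property of the item, so it is constant within each item.
freq2_by_item <- rnorm(K)
sim_df$freq2 <- freq2_by_item[as.integer(sim_df$item)]

# Participant and item effects drawn independently (assumed SDs: 60 and 40).
b_part <- rnorm(J, 0, 60)
b_item <- rnorm(K, 0, 40)
sim_df$rt <- 600 - 30 * sim_df$freq2 +
  b_part[as.integer(sim_df$participant)] +
  b_item[as.integer(sim_df$item)] +
  rnorm(nrow(sim_df), 0, 100)

m_part <- lmer(rt ~ freq2 + (1 | participant), data = sim_df)
m_both <- lmer(rt ~ freq2 + (1 | participant) + (1 | item), data = sim_df)

cmp_ml   <- anova(m_part, m_both)                 # refits with ML first
cmp_reml <- anova(m_part, m_both, refit = FALSE)  # compares the REML fits as they are
cmp_ml
```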

Random versus fixed effects for items

An alternative to a word random effect is a word fixed effect: include item as a factor.

The random effects approach treats the words as a sample from the population of English words. The goal is to generalise to that population, not only to the 58 words in this dataset.

Clark (1973) argued that treating items as fixed effects leads to anticonservative inference when the question concerns a population of stimuli. Modelling both participants and items as random effects is now standard in psycholinguistics.
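A practical side effect of the fixed effect alternative can be seen in a small base R sketch. The toy data and names below are hypothetical. Because freq2 is constant within each item, it is a linear combination of the item indicator columns, so a model that includes item as a fixed factor cannot also estimate a separate freq2 coefficient.

```r
# Hypothetical toy data: 8 participants crossed with 5 items.
set.seed(1)
J <- 8
K <- 5
toy_df <- expand.grid(participant = factor(1:J), item = factor(1:K))

# An item-level predictor: one freq2 value per item.
freq2_by_item <- c(-1.2, -0.4, 0.1, 0.6, 0.9)
toy_df$freq2 <- freq2_by_item[as.integer(toy_df$item)]
toy_df$rt <- 600 - 30 * toy_df$freq2 + rnorm(nrow(toy_df), 0, 50)

# Item as a fixed factor: freq2 is aliased with the item indicators,
# so lm() reports NA for its coefficient.
fit_fixed <- lm(rt ~ item + freq2, data = toy_df)
coef(fit_fixed)["freq2"]
```

With items as random effects, by contrast, the freq2 slope remains estimable alongside the item-level variance.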

Specifying the formula

Nested:

(1 | schoolid / classid2)   # classes nested in schools

Crossed:

(freq2 | participant) + (1 | item)  # participants and items crossed

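In nested models, the / syntax encodes the hierarchical constraint: (1 | schoolid / classid2) expands to (1 | schoolid) + (1 | schoolid:classid2), so a class is identified only in combination with its school. In crossed models, each grouping factor gets its own | term. The random effects in separate terms are modelled as independent of each other, and lmer works out from the data which observations share a participant and which share an item.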
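Why the nested syntax matters can be illustrated in base R. In the hypothetical toy data below, class labels are reused across schools, so the label alone does not identify a class. The schoolid:classid2 grouping that the / syntax creates corresponds to the combination of the two labels, built here with `interaction()`.

```r
# Hypothetical toy data: class labels 1 and 2 are reused in both schools.
school_df <- data.frame(
  schoolid = c(1, 1, 1, 1, 2, 2, 2, 2),
  classid2 = c(1, 1, 2, 2, 1, 1, 2, 2)
)

# Combining school and class labels gives one level per actual class.
school_df$class_unique <- interaction(
  school_df$schoolid, school_df$classid2, drop = TRUE
)

nlevels(factor(school_df$classid2))   # number of distinct class labels
nlevels(school_df$class_unique)       # number of distinct classes
```

If the class identifiers were already unique across schools, (1 | schoolid) + (1 | class_unique) would describe the same nested structure using two separate terms.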