Introduction to Multilevel and Mixed Effects Models with R
A Workshop
This workshop provides a comprehensive practical and theoretical introduction to multilevel models using R. Multilevel models, also known as hierarchical models or mixed effects models, are designed for data in which observations are grouped into clusters, and they have become one of the most widely used statistical tools across the social, behavioural, and natural sciences.
The eight topics below move from first principles to a working command of the main model families and their practical extensions. A given delivery will cover a selection depending on length, audience, and emphasis.
Foundations
Random Effects Models
Multilevel models are, in a precise sense, models of models. The clearest way to see this is through the binomial random effects model, which treats a collection of binomial experiments as samples from a shared population distribution. This guide introduces the four progressive model types using the immr package, explains the mechanics of partial pooling and shrinkage, and shows how the multilevel model provides a population-level prediction for new groups.
Normal Random Effects Models
When the grouped outcome is continuous, the random effects model takes a normal form. This guide applies the model to per-capita alcohol consumption data across countries, derives the intraclass correlation coefficient from the estimated variance components, and illustrates how the degree of shrinkage depends on the ratio of between-group to within-group variance.
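As a rough illustration of the two quantities mentioned above, the following base R sketch computes the intraclass correlation and a per-group shrinkage weight from assumed variance components. The values of `tau2`, `sigma2`, the group sizes, and the means are hypothetical, not estimates from the alcohol consumption data, and the weight formula is the standard one for a normal random intercept model with the variances treated as known.

```r
# Hypothetical variance components (not estimated from any real data)
tau2   <- 4.0   # between-group variance
sigma2 <- 1.0   # within-group (residual) variance

# Intraclass correlation: share of the total variance that lies between groups
icc <- tau2 / (tau2 + sigma2)

# Weight given to a group's own sample mean when it has n observations;
# the remaining weight (1 - w) goes to the overall mean
shrink_weight <- function(n, tau2, sigma2) {
  tau2 / (tau2 + sigma2 / n)
}

n_obs   <- c(1, 5, 50)                        # hypothetical group sizes
weights <- shrink_weight(n_obs, tau2, sigma2)

# Partially pooled estimates for a group mean of 12 and an overall mean of 8
group_mean   <- 12
overall_mean <- 8
pooled <- weights * group_mean + (1 - weights) * overall_mean

print(icc)
print(round(weights, 3))
print(round(pooled, 2))
```

Smaller groups, or a smaller ratio of `tau2` to `sigma2`, give smaller weights and therefore estimates pulled more strongly towards the overall mean.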
Linear Mixed Effects Models
The linear mixed effects model extends the random effects framework to regression settings where both the intercept and slope of a relationship vary across groups. This guide uses the sleep deprivation study from lme4 to develop the varying intercepts, varying slopes, and correlated random effects models. It also covers the decomposition into fixed and random effects, model comparison using likelihood ratio tests, and inference via confidence intervals and the lmerTest package.
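A minimal sketch of this kind of analysis, assuming the lme4 package is installed, might look as follows. It uses the `sleepstudy` data shipped with lme4; the object names `m_int` and `m_slope` are illustrative, and the guide itself may structure the analysis differently.

```r
library(lme4)  # provides lmer() and the sleepstudy data

# Varying intercepts only: each subject has their own baseline reaction time
m_int <- lmer(Reaction ~ Days + (1 | Subject),
              data = sleepstudy, REML = FALSE)

# Varying intercepts and slopes, with their correlation also estimated
m_slope <- lmer(Reaction ~ Days + (Days | Subject),
                data = sleepstudy, REML = FALSE)

# Fixed effects: population-level intercept and slope
print(fixef(m_slope))

# Random effects: per-subject deviations from the fixed effects
print(head(ranef(m_slope)$Subject))

# Likelihood ratio test of the two nested models (both fitted by ML)
print(anova(m_int, m_slope))

# Wald confidence intervals for the fixed effects only
print(confint(m_slope, parm = "beta_", method = "Wald"))
```

Both models are fitted with `REML = FALSE` here so that the likelihood ratio test compares maximum likelihood fits directly.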
Extensions
Multilevel Models for Nested Data
When groups are themselves nested inside higher-level groups, such as pupils in classes in schools, the model acquires a third level. This guide develops the three-level linear mixed effects model using classroom mathematics data, addresses the practical issue of relative versus unique group identifiers, and covers model simplification through variance component comparisons.
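The identifier issue can be seen in a few lines of base R. The data frame below is a made-up toy example, not the classroom mathematics data: two schools each label their classes 1 and 2, so the class label alone does not identify a class.

```r
# Hypothetical data: two schools, each labelling its classes 1 and 2
pupils <- data.frame(
  school = c("A", "A", "A", "A", "B", "B", "B", "B"),
  class  = c(1, 1, 2, 2, 1, 1, 2, 2)  # relative: "class 1" exists in both schools
)

# Taken on its own, `class` appears to have only two levels
n_relative <- length(unique(pupils$class))

# Combining it with the school label gives one identifier per actual class
pupils$class_id <- paste(pupils$school, pupils$class, sep = "_")
n_unique <- length(unique(pupils$class_id))

print(n_relative)  # 2
print(n_unique)    # 4
```

In lme4 formula syntax, relative identifiers call for the nesting term `(1 | school/class)`, which expands to `(1 | school) + (1 | school:class)`. With unique identifiers such as `class_id`, writing `(1 | school) + (1 | class_id)` specifies the same structure.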
Multilevel Models for Crossed Data
In crossed designs, observations belong to more than one grouping factor simultaneously and the groups do not nest. The canonical example is a psycholinguistic experiment in which every participant responds to every word. This guide develops the crossed random effects model using British Lexicon Project data and contrasts it with the nested case.
Group-Level Predictors
Group-level predictors take the same value for every observation in a group, and they require careful handling in mixed effects models. This guide uses the MathAchieve data to demonstrate why group-level predictors cannot be assigned random slopes, how they enter the fixed effects, and how they interact with observation-level predictors to produce cross-level interactions.
Further Topics
Explained Variance
The ordinary \(R^2\) does not extend directly to mixed effects models, because predictions can be formed at the population level (fixed effects only) or at the subject level (fixed plus random effects). This guide shows how \(R^2\) decomposes in the linear model, develops the marginal and conditional \(R^2\) for mixed models due to Nakagawa and Schielzeth, and demonstrates computation via the performance package.
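For the simplest case, a random intercept model, the two quantities can be written as below. The notation is an assumption made here for illustration: \(\sigma^2_f\) is the variance of the fixed effects predictions, \(\sigma^2_\alpha\) the random intercept variance, and \(\sigma^2_\varepsilon\) the residual variance.

\[
R^2_{\text{marginal}} = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_\alpha + \sigma^2_\varepsilon},
\qquad
R^2_{\text{conditional}} = \frac{\sigma^2_f + \sigma^2_\alpha}{\sigma^2_f + \sigma^2_\alpha + \sigma^2_\varepsilon}
\]

The marginal version counts only the fixed effects as explained variance, while the conditional version also counts the random effects. Setting \(\sigma^2_\alpha = 0\) makes both reduce to the ordinary linear model form \(\sigma^2_f / (\sigma^2_f + \sigma^2_\varepsilon)\).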
Power Analysis
Determining adequate sample sizes for multilevel studies requires simulation because analytical formulas are not available in general. This guide uses the simr package to construct a hypothetical mixed effects model with specified effect sizes and variance components, and then estimates power by simulating many datasets and counting how often the key effect is detected.
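A minimal sketch of this workflow, assuming the simr package is installed, is shown below. The design (20 groups measured at 5 values of `x`), the effect sizes, and the variance components are all hypothetical values chosen for illustration, and the number of simulations is kept small so that the example runs quickly.

```r
library(simr)  # builds on lme4

# Hypothetical design: 20 groups, each measured at 5 values of x
design <- expand.grid(x = 1:5, g = factor(1:20))

# Assumed parameter values (illustrative, not taken from any study)
fixed_effects <- c(2, 0.3)  # intercept and slope of x
ri_variance   <- 0.5        # random intercept variance
resid_sd      <- 1          # residual standard deviation

# Build a model object with these parameters, without any observed outcome
model <- makeLmer(y ~ x + (1 | g),
                  fixef   = fixed_effects,
                  VarCorr = ri_variance,
                  sigma   = resid_sd,
                  data    = design)

# Simulate datasets from the model, refit each one, and record how often
# the effect of x is detected by a z-test
set.seed(101)
power <- powerSim(model, test = fixed("x", "z"), nsim = 50)
print(power)
```

In practice `nsim` would be much larger, since the precision of the power estimate depends on the number of simulated datasets.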