Bayesian Generalised Linear Models

Mark Andrews

From linear to generalised linear

Standard linear model: \[y_i \sim \mathrm{N}(\mu_i, \sigma^2), \quad \mu_i = \beta_0 + \sum_k \beta_k x_{ki}\]

Generalised linear model: keep the linear predictor, change the family and add a link: \[g(\mu_i) = \beta_0 + \sum_k \beta_k x_{ki}\]
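As a quick check that the normal linear model is the special case of a GLM with a Gaussian family and identity link, the sketch below fits both to simulated data in base R. The data and the variable names (x, y, fit_lm, fit_glm) are made up for illustration.

```r
# Simulated data (hypothetical): y depends linearly on x with normal noise
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)

# Standard linear model
fit_lm <- lm(y ~ x)

# The same model written as a GLM: Gaussian family, identity link
fit_glm <- glm(y ~ x, family = gaussian(link = "identity"))

# The two sets of coefficient estimates agree up to numerical tolerance
print(cbind(lm = coef(fit_lm), glm = coef(fit_glm)))
```

Swapping the family and link in the glm call, while leaving the formula alone, is all that changes when moving to other response types.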

Binary outcomes: logistic regression

  • Response: \(y_i \in \{0, 1\}\)
  • Parameter of interest: \(\theta_i = P(y_i = 1)\)
  • Logit link: \(g(\theta) = \log[\theta/(1-\theta)]\)
  • Inverse logit: \(\theta_i = \mathrm{ilogit}(\beta_0 + \beta_1 x_i)\)
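The link and its inverse can be written out directly, as in this minimal sketch. The function names logit and ilogit are defined here for illustration; base R provides the same transformations as qlogis and plogis.

```r
# Logit: maps a probability in (0, 1) to the whole real line (log-odds)
logit <- function(theta) log(theta / (1 - theta))

# Inverse logit: maps a log-odds value back to a probability
ilogit <- function(eta) 1 / (1 + exp(-eta))

theta <- c(0.1, 0.5, 0.9)   # example probabilities
eta <- logit(theta)         # corresponding log-odds

# Round trip recovers the original probabilities
print(ilogit(eta))

# Base R equivalents give the same values
print(all.equal(eta, qlogis(theta)))
print(all.equal(ilogit(eta), plogis(eta)))
```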

Fitting logistic regression with brms

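library(brms)

# Frequentist fit: maximum likelihood via base R's glm
M_13 <- glm(I(cigs > 0) ~ educ + age,
  data = smoking_df, family = binomial())

# Bayesian fit: same formula, sampled with Stan via brms
M_14 <- brm(I(cigs > 0) ~ educ + age,
  data = smoking_df, cores = 4,
  family = bernoulli())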

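The formula is the same in both calls; only the fitting framework and the family name differ. Where glm returns point estimates and standard errors, brm returns a full posterior distribution for every coefficient.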

Inspecting priors for GLMs

prior_summary(M_14)

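For logistic regression, priors are on the log-odds scale. A coefficient of 2 on the log-odds scale corresponds to an odds ratio of \(e^2 \approx 7.4\), which is a large effect, so a prior that keeps most of its mass within a few units of zero is already only weakly informative. Note that brms defaults to flat (improper) priors on the regression coefficients and a weakly informative Student-t prior on the intercept. These defaults are often adequate, but check them with prior_summary before relying on them.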
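To see what a prior on the log-odds scale implies on the odds-ratio scale, the sketch below takes a hypothetical Normal(0, 2) prior on a coefficient and exponentiates its central 95% interval. The prior standard deviation of 2 is an assumption chosen for illustration, not a brms default.

```r
# Hypothetical prior on a logistic regression coefficient: Normal(0, prior_sd)
prior_sd <- 2

# Central 95% prior interval on the log-odds scale
log_odds_interval <- qnorm(c(0.025, 0.975), mean = 0, sd = prior_sd)

# The same interval expressed as odds ratios
odds_ratio_interval <- exp(log_odds_interval)

print(log_odds_interval)
print(odds_ratio_interval)   # roughly 0.02 to 50

# A coefficient of exactly 2 corresponds to an odds ratio of about 7.4
print(exp(2))
```

In brms, a prior like this could be supplied through the prior argument of brm, for example prior = set_prior("normal(0, 2)", class = "b").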

Interpreting coefficients

Coefficients are on the log-odds scale:

  • \(\beta_k = 0.3\): a one-unit increase in \(x_k\) multiplies the odds by \(e^{0.3} \approx 1.35\)
  • To convert this to a change in probability, you must fix the values of all the other predictors
  • The logistic function is non-linear, so the same change in log-odds produces a different change in probability depending on the baseline probability
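The points above can be illustrated numerically. In this sketch the coefficient 0.3 comes from the example above, while the three baseline log-odds values are hypothetical and stand in for different settings of the other predictors.

```r
beta_k <- 0.3                      # coefficient on the log-odds scale
baseline_log_odds <- c(-3, 0, 3)   # hypothetical baselines (other predictors fixed)

# Probability before and after a one-unit increase in x_k
p_before <- plogis(baseline_log_odds)
p_after <- plogis(baseline_log_odds + beta_k)

odds <- function(p) p / (1 - p)

effects <- data.frame(
  baseline_log_odds = baseline_log_odds,
  p_before = p_before,
  p_after = p_after,
  change_in_prob = p_after - p_before,
  odds_ratio = odds(p_after) / odds(p_before)
)
print(effects)
```

The odds ratio is the same in every row, while the change in probability is largest near a baseline probability of 0.5 and shrinks towards the extremes.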

Posterior predictive checks for binary data

pp_check(M_14)

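For binary outcomes, the useful comparison is between the observed proportion of ones and the distribution of proportions in datasets simulated from the posterior. The default pp_check plot overlays densities, which is hard to read for 0/1 data; pp_check(M_14, type = "bars") or pp_check(M_14, type = "stat", stat = "mean") shows the comparison directly.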
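The logic of this check can be sketched without brms. The example below uses an intercept-only Bernoulli model with a Beta(1, 1) prior, whose posterior is available in closed form, as a stand-in for posterior draws from a fitted model. The data are simulated and all names are hypothetical.

```r
set.seed(42)

# Hypothetical observed binary data
y <- rbinom(100, size = 1, prob = 0.3)
n <- length(y)
obs_prop <- mean(y)

# Posterior draws of theta under a Beta(1, 1) prior (conjugate update)
n_draws <- 4000
theta_draws <- rbeta(n_draws, 1 + sum(y), 1 + n - sum(y))

# For each posterior draw, simulate a replicated dataset of size n
# and record its proportion of ones
prop_rep <- rbinom(n_draws, size = n, prob = theta_draws) / n

# Where does the observed proportion fall among the replicated ones?
ppp <- mean(prop_rep >= obs_prop)
print(c(observed = obs_prop, ppp = ppp))

hist(prop_rep, main = "Replicated proportions of ones", xlab = "Proportion")
abline(v = obs_prop, lwd = 2)
```

An observed proportion far in the tail of the replicated distribution would suggest the model does not reproduce this feature of the data.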

Other families in brms

Family          Outcome                     Link
bernoulli()     Binary 0/1                  logit
binomial()      Counts with known \(n\)     logit
poisson()       Counts                      log
negbinomial()   Overdispersed counts        log
cumulative()    Ordered categories          logit

The Bayesian workflow is identical across all families.

The general Bayesian GLM workflow

  1. Specify model formula and family
  2. Inspect default priors; set custom priors if needed
  3. Fit with brm
  4. Check MCMC diagnostics (trace plots, Rhat, ESS)
  5. Posterior predictive check
  6. Compare models with LOO or Bayes factors
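The steps above can be collected into a small helper, sketched below. The function run_glm_workflow and its arguments are hypothetical names introduced here; the calls inside it (get_prior, brm, pp_check, loo) are brms functions. It assumes brms and a working Stan toolchain are installed, and nothing is fitted until the function is called.

```r
# Hypothetical helper wrapping steps 1 to 6 of the workflow
run_glm_workflow <- function(formula, data, family, prior = NULL) {
  # Steps 1-2: model specification and the priors brms would use by default
  print(brms::get_prior(formula, data = data, family = family))

  # Step 3: fit the model (prior = NULL keeps the defaults)
  fit <- brms::brm(formula, data = data, family = family,
                   prior = prior, cores = 4)

  # Step 4: MCMC diagnostics; the summary reports Rhat and ESS,
  # and plot() draws trace and density plots
  print(summary(fit))
  plot(fit)

  # Step 5: posterior predictive check
  print(brms::pp_check(fit))

  # Step 6: leave-one-out cross-validation, for later model comparison
  list(fit = fit, loo = brms::loo(fit))
}
```

For the smoking model this might be called as run_glm_workflow(I(cigs > 0) ~ educ + age, smoking_df, brms::bernoulli()). Results from two candidate models can then be compared with loo_compare on their loo elements, and the same helper applies unchanged to other families such as poisson().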

Summary

  • GLMs extend linear regression to non-normal response variables
  • The Bayesian GLM workflow in brms is the same as for linear models
  • Set family = bernoulli() for binary outcomes
  • Coefficients are on the link scale; interpret accordingly
  • Posterior predictive checks and model comparison work identically