Gradient Boosted Mixed Models: Flexible Estimation of Mean and Variance Components for Clustered Data
Abstract
We introduce Gradient Boosted Mixed Models (GBMixed), a framework that extends boosting to clustered data by jointly modeling the mean and variance components of a linear mixed model via likelihood-based gradients. GBMixed estimates a nonparametric fixed effects function characterizing the overall mean of the response, while allowing both the random effects covariance matrix and the residual variance to depend flexibly on covariates. We show how GBMixed yields covariate-dependent random effect predictions and, in turn, point predictions and prediction intervals for individual treatment effects that can adapt between population-level and cluster-level information. Simulations and applications to two real-world datasets demonstrate that GBMixed can accurately recover complex nonlinear fixed effect functions and covariate-dependent covariances in a linear mixed model, while also improving point and probabilistic predictive performance compared with several existing approaches, including parametric linear mixed models, Natural Gradient Boosting, and Gaussian Process Boosting.
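To make the idea of adapting between population-level and cluster-level information concrete, the following is a minimal sketch of the standard shrinkage predictor for a random-intercept linear mixed model. It is not the authors' algorithm. The function names and arguments (`shrinkage_weight`, `predict_cluster_mean`, `f_x`, `var_b`, `var_e`) are hypothetical, and the variance components are passed in as plain numbers, whereas in GBMixed they would be estimated and allowed to depend on covariates.

```python
def shrinkage_weight(var_b, var_e, n):
    """Weight placed on a cluster's mean residual in a random-intercept model.

    var_b: random-intercept variance (assumed known here)
    var_e: residual variance (assumed known here)
    n:     number of observations already seen in the cluster
    """
    return var_b / (var_b + var_e / n)


def predict_cluster_mean(f_x, cluster_residuals, var_b, var_e):
    """Combine a population-level mean f_x with a predicted random intercept.

    cluster_residuals holds y - f(x) for the cluster's observed responses.
    With no observations, the prediction falls back to the population mean.
    """
    n = len(cluster_residuals)
    if n == 0:
        return f_x
    mean_resid = sum(cluster_residuals) / n
    b_hat = shrinkage_weight(var_b, var_e, n) * mean_resid
    return f_x + b_hat
```

A larger random-intercept variance or more observations in the cluster pushes the weight toward 1, so the prediction leans on cluster-level data. A larger residual variance pushes it toward 0, so the prediction stays near the population mean. Letting `var_b` and `var_e` vary with covariates, as the abstract describes, would make this trade-off differ from one observation to the next.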