Using MixedModels.jl and MixedModelsMakie.jl, I will show several ways to visualize both the model fit and the model-fitting process. I will focus especially on shrinkage (the bias of the random effects toward zero relative to comparable estimates from a classical OLS model) and on how MixedModels.jl uses BOBYQA and a profiled log likelihood to explore the parameter space efficiently.
Visualizations of mixed-effects models often reuse the plots of classical regression models: effects plots and diagnostic plots that largely ignore the complexity and subtlety introduced by random effects. MixedModelsMakie.jl provides a shrinkage plot, which displays the change from classical OLS estimates to the conditional modes (random effects) for the block-level predictions. Besides demonstrating the concept of shrinkage, these displays are a useful diagnostic of the random-effects structure. For example, models with a degenerate random-effects structure, i.e. singular models, generally show the excess dimensionality quite clearly in shrinkage plots. Shrinkage plots also provide a convenient way to visualize the tradeoffs of a restricted covariance structure: fewer parameters to optimize, but less efficient shrinkage.
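To make the idea of shrinkage concrete, here is a minimal numerical sketch. It is not MixedModels.jl code and does not fit a model. It assumes a balanced random-intercept model with a known ratio theta = sigma_b / sigma of between-group to residual standard deviation. Under those assumptions the conditional mode of each group offset is the per-group OLS offset multiplied by n * theta^2 / (n * theta^2 + 1). The function name, the data, and the theta values are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def conditional_modes(groups, theta):
    """Return {label: (ols_offset, shrunken_offset)} for a balanced
    random-intercept model with known theta = sigma_b / sigma.

    Assumes every group has the same number of observations, so the
    grand mean is the estimate of the fixed intercept for any theta.
    """
    grand = mean(y for ys in groups.values() for y in ys)
    result = {}
    for label, ys in groups.items():
        n = len(ys)
        ols_offset = mean(ys) - grand  # per-group OLS estimate of the offset
        weight = n * theta**2 / (n * theta**2 + 1)  # shrinkage factor in [0, 1)
        result[label] = (ols_offset, weight * ols_offset)
    return result


# Hypothetical toy data: three groups with three observations each.
data = {
    "a": [10.0, 12.0, 11.0],
    "b": [20.0, 19.0, 21.0],
    "c": [15.0, 14.0, 16.0],
}

for theta in (0.0, 0.5, 2.0):
    print(f"theta = {theta}")
    for label, (ols, shrunk) in conditional_modes(data, theta).items():
        print(f"  {label}: OLS offset {ols:+.2f} -> conditional mode {shrunk:+.2f}")
```

At theta = 0 every offset collapses to zero (complete pooling), and as theta grows the conditional modes approach the OLS offsets. A shrinkage plot draws this movement as arrows from one set of estimates to the other.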
MixedModels.jl also allows tracing of the optimization procedure, i.e. the optimizer's exploration of the parameter space. We can take advantage of this trace to visualize and better understand the behavior of the optimizer and the challenges involved in fitting large or complex models. For example, we can observe that optimization generally follows three phases: an initial phase of broad exploration of the parameter space, a phase of rapid convergence to the neighborhood of the optimum, and a final phase of fine tuning of parameter estimates and verification. In large models, the final phase tends to dominate, which has implications for a speed-accuracy tradeoff in certain applications.
Finally, we can animate shrinkage over the course of optimization. The change in shrinkage has practical implications for speed-accuracy tradeoffs in model fits and also highlights how shrinkage, like all regularization, is an example of the bias-variance tradeoff. For mixed models, this means a tradeoff between the observation-level variance and the between-group variance.