Maintainer: | Ben Bolker, Julia Piaskowski, Emi Tanaka, Phillip Alday, Wolfgang Viechtbauer |

Version: | 2022-10-31 |

URL: | https://CRAN.R-project.org/view=MixedModels |

Source: | https://github.com/cran-task-views/MixedModels/ |

Contributions: | Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide. |

Citation: | Ben Bolker, Julia Piaskowski, Emi Tanaka, Phillip Alday, Wolfgang Viechtbauer (2022). CRAN Task View: Mixed, Multilevel, and Hierarchical Models in R. Version 2022-10-31. URL https://CRAN.R-project.org/view=MixedModels. |

Installation: | The packages from this task view can be installed automatically using the ctv package. For example, `ctv::install.views("MixedModels", coreOnly = TRUE)` installs all the core packages or `ctv::update.views("MixedModels")` installs all packages that are not yet installed and up-to-date. See the CRAN Task View Initiative for more details. |

**Contributors**: Maintainers *plus* Michael Agronah, Matthew Fidler, Thierry Onkelinx

*Mixed* (or *mixed-effect*) *models* are a broad class of statistical models used to analyze data where observations can be assigned *a priori* to discrete groups, and where the parameters describing the differences between groups are treated as random (or *latent*) variables. They are one category of *multilevel*, or *hierarchical* models; *longitudinal* data are often analyzed in this framework. In econometrics, longitudinal or cross-sectional time series data are often referred to as *panel data* and are sometimes fitted with mixed models. Mixed models can be fitted in either frequentist or Bayesian frameworks.

This task view only includes models that incorporate *continuous* (usually although not always Gaussian) latent variables. This excludes packages that handle hidden Markov models, latent Markov models, and finite (discrete) mixture models (some of these are covered by the Cluster task view). Dynamic linear models and other state-space models that do not incorporate a discrete grouping variable are also excluded (some of these are covered by the TimeSeries task view). Bioinformatic applications of mixed models hosted on Bioconductor are excluded as well.

Linear mixed models (LMM) make the following assumptions:

- The expected values of the responses are linear combinations of the fixed predictor variables and the random effects.
- The conditional distribution of the responses is Gaussian (equivalently, the errors are Gaussian).
- The random effects are normally distributed.
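
In matrix form, these assumptions are often written as below. The notation is conventional textbook notation assumed here for illustration, not something the task view defines: $y$ is the response vector, $X$ and $Z$ are the fixed- and random-effects design matrices, $\beta$ the fixed-effect coefficients, $b$ the random effects with covariance $\Sigma$, and $\sigma^2$ the residual variance. The independent, constant-variance error term is the simplest case.

$$
\begin{aligned}
y \mid b &\sim \mathcal{N}\left(X\beta + Zb,\ \sigma^2 I\right), \\
b &\sim \mathcal{N}\left(0,\ \Sigma\right).
\end{aligned}
$$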

*Frequentist:*

The most commonly used packages and/or functions for frequentist LMMs are:

- nlme: `nlme::lme()` provides REML or ML estimation. Allows multiple nested random effects, and provides structures for modeling heteroscedastic and/or correlated errors. Wald estimates of parameter uncertainty.
- lme4: `lme4::lmer()` provides REML or ML estimation. Allows multiple nested or crossed random effects, can compute profile confidence intervals and conduct parametric bootstrapping.
- mbest: fits large nested LMMs using a fast moment-based approach.
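
As a minimal sketch of the frequentist workflow, the following fits a random-intercept LMM with `nlme::lme()`. The `Orthodont` data set (shipped with nlme) and the model formula are illustrative choices, not recommendations from this task view.

```r
# Sketch only: random-intercept LMM on the Orthodont data shipped with nlme.
# The data set and formula are assumptions chosen for illustration.
library(nlme)

# REML fit (the default): one random intercept per Subject
fit_reml <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
                data = Orthodont, method = "REML")

# Refit by ML, e.g. before comparing models that differ in fixed effects
fit_ml <- update(fit_reml, method = "ML")

fixef(fit_reml)                       # fixed-effect estimates
VarCorr(fit_reml)                     # variance components
intervals(fit_reml, which = "fixed")  # Wald-type confidence intervals
```

An analogous fit with lme4 would use `lme4::lmer(distance ~ age + Sex + (1 | Subject), data = Orthodont)`, assuming lme4 is installed.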

*Bayesian:*

Most Bayesian R packages use Markov chain Monte Carlo (MCMC) estimation: MCMCglmm, rstanarm, and brms; the latter two packages use the Stan infrastructure. blme, built on lme4, uses maximum a posteriori (MAP) estimation. bamlss provides a flexible set of modular functions for Bayesian regression modeling.

Generalized linear mixed models (GLMMs) can be described as hierarchical extensions of generalized linear models (GLMs), or as extensions of LMMs to different response distributions, typically in the exponential family. The random-effect distributions are typically assumed to be Gaussian on the scale of the linear predictor.

*Frequentist:*

- MASS: `MASS::glmmPQL()` fits via penalized quasi-likelihood.
- lme4: `lme4::glmer()` uses Laplace approximation and adaptive Gauss-Hermite quadrature; fits negative binomial as well as exponential-family models.
- glmmTMB uses Laplace approximation; allows some correlation structures; fits some non-exponential families (Beta, COM-Poisson, etc.) and zero-inflated/hurdle models.
- GLMMadaptive uses adaptive Gauss-Hermite quadrature; fits exponential family, negative binomial, beta, zero-inflated/hurdle/censored Gaussian models, and user-specified log-densities.
- hglm fits hierarchical GLMs using *h*-likelihood (*sensu* Nelder, Lee and Pawitan (2017)).
- glmm fits GLMMs using Monte Carlo likelihood approximation.
- glmmEP fits probit mixed models for binary data by expectation propagation.
- mbest: fits large nested GLMMs using a fast moment-based approach.
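
As a minimal sketch of a frequentist GLMM fit, the following uses `MASS::glmmPQL()` on the `bacteria` data set shipped with MASS. The data set and formula are illustrative assumptions; the other packages listed above use different approximations and their own interfaces.

```r
# Sketch only: binomial GLMM with a random intercept per patient (ID),
# fitted by penalized quasi-likelihood. Data set and formula are
# illustrative choices, not recommendations from the task view.
library(MASS)   # glmmPQL() and the 'bacteria' data set
library(nlme)   # fixef(); glmmPQL() returns an object inheriting from "lme"

fit_pql <- glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
                   family = binomial, data = bacteria, verbose = FALSE)

fixef(fit_pql)            # fixed effects on the logit (linear predictor) scale
summary(fit_pql)$tTable   # Wald-type table of estimates and standard errors
```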

*Bayesian:*

Most Bayesian mixed model packages use some form of Markov chain Monte Carlo (or other Monte Carlo methods).

- MCMCglmm: Gibbs sampling. Exponential family, multinomial, ordinal, zero-inflated/altered/hurdle, censored, multimembership, multi-response models. Pedigree (animal/kinship/phylogenetic) models.
- rstanarm: Hamiltonian Monte Carlo (based on Stan); designed for `lme4` compatibility.
- brms: Hamiltonian Monte Carlo. Linear, robust linear, count data, survival, response times, ordinal, zero-inflated/hurdle/censored data.
- bamlss: optimization and derivative-based Metropolis-Hastings/slice sampling. Wide range of distributions and link functions.

The following packages (in addition to bamlss) find maximum *a posteriori* fits to Bayesian (G)LMMs by optimization:

- blme wraps lme4 to add prior distributions.
- INLA uses integrated nested Laplace approximation to fit GLMMs using a wide range of latent models (especially for spatial estimation), priors, and distributions.
- inlabru facilitates spatial modeling using integrated nested Laplace approximation via the R-INLA package. Additionally, extends the GAM-like model class to more general nonlinear predictor expressions and implements a log-Gaussian Cox process likelihood for modeling univariate and spatial point processes based on ecological survey data.
- inlatools provides tools to set sensible priors and check the dispersion and distribution of INLA models.

vglmer estimates GLMMs by variational Bayesian methods.

Nonlinear mixed models incorporate arbitrary nonlinear responses that cannot be accommodated in the framework of GLMMs. Only a few packages can accommodate *generalized* nonlinear mixed models (i.e., parametric nonlinear mixed models with non-Gaussian responses). However, many packages allow smooth nonparametric components (see “Additive models” below). Otherwise, users may need to implement GNLMMs themselves in a more general hierarchical modeling framework.

*Frequentist:*

- `nlme::nlme()` from nlme and `lme4::nlmer()` from lme4 fit nonlinear mixed effects models by maximum likelihood.
- `nlmixr2::nlmixr2()` from nlmixr2 fits nonlinear mixed effects models by first-order conditional estimation (focei) maximum likelihood approximation (a different approximation than `nlme::nlme()` and `lme4::nlmer()`), and allows generalized likelihood as well as a selection of built-in link functions.
- `gnlmm()` and `gnlmm3()` from repeated fit GNLMMs by Gauss-Hermite integration.
- saemix and nlmixr2 both use a stochastic approximation of the EM algorithm to fit a wide range of GNLMMs.
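
As a minimal sketch of a nonlinear mixed model fit, the following uses `nlme::nlme()` with the self-starting asymptotic regression model `SSasymp()` on the `Loblolly` data set from base R. The model, data set, and starting values are illustrative assumptions.

```r
# Sketch only: asymptotic growth curve with a random asymptote per seed
# source. Loblolly is a grouped data set (height ~ age | Seed) in base R;
# the starting values below are illustrative.
library(nlme)

fit_nl <- nlme(height ~ SSasymp(age, Asym, R0, lrc),
               data   = Loblolly,
               fixed  = Asym + R0 + lrc ~ 1,   # population-level parameters
               random = Asym ~ 1,              # asymptote varies among seeds
               start  = c(Asym = 103, R0 = -8.5, lrc = -3.3))

fixef(fit_nl)         # population-level parameter estimates
head(ranef(fit_nl))   # seed-level deviations in the asymptote
```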

*Bayesian:*

- brms supports GNLMMs.

Generalized estimating equations (GEEs) are an alternative approach to fitting clustered, longitudinal, or otherwise correlated data. These models produce estimates of the *marginal* effects (averaged across the group-level variation) rather than *conditional* effects (conditioned on group-level information).

- geepack, gee, and geeM are standard GEE solvers, providing GEE estimation of the parameters in mean structures with possible correlation between the outcomes.
- wgeesel implements a weighted extension of generalized linear models for longitudinal clustered data by incorporating the correlation within-cluster when data is missing at random.
- geesmv: GEE estimator using the original sandwich variance estimator proposed by Liang and Zeger (1986), and eight types of variance estimators for improving the finite small-sample performance.
- multgee is a GEE solver for correlated nominal or ordinal multinomial responses.

- **Additive models** (models incorporating smooth functional components such as regression splines or Gaussian processes): gamm4, mgcv, brms, lmeSplines, bamlss, gamlss, LMMsolver, R2BayesX, GLMMRR.
- **Big data/distributed computation**: lmmpar, mbest. See also MixedModels.jl (Julia), diamond (Python).
- **Bioinformatics/quantitative genetics**: MCMC.qpcr, QGglmm, CpGassoc (methylation studies).
- **Censored data** (response data known only up to lower/upper bounds): brms and nlmixr2 (general), ARpLMEC (censored Gaussian, autoregressive errors). Censored Gaussian (Tobit) responses: GLMMadaptive, MCMCglmm, gamlss.
- **Differential equations** (fitting DEs with group-structured parameters; this category overlaps considerably with **pharmacokinetic modeling**): mixedsde for stochastic DEs. Ordinary DEs can be fitted with nlmixr2 using the “focei” or “saem” (EM) methods, or using the nlme package; see also the DifferentialEquations task view.
- **Doubly hierarchical GLMs**: dhglm, mdhglm (multivariate).
- **Factor analytic, latent variable, and structural equation models**: lavaan, nlmm *(archived)*, sem, piecewiseSEM, semtree, and blavaan; see also the Psychometrics task view.
- **Kinship-augmented models** (responses where individuals have a known family relationship): pedigreemm, coxme, kinship2, LMMsolver, MCMCglmm, sommer, rrBLUP, BGLR, lme4GS, lme4qtl, qgtools, cpgen, QTLRel.
- **Location-scale models**: nlme, glmmTMB, brms, mgcv [with `family` chosen from one of the `*ls`/`*lss` options] all allow modeling of the dispersion/scale component.
- **Missing values**: mice, mlmmm (EM imputation), CRTgeeDR, JointAI, mdmb, pan; see also the MissingData task view.
- **Multiple membership models**: (Bayesian) MCMCglmm, brms, rmm; (frequentist) lmerMultiMember (can also fit the Bradley-Terry model).
- **Multinomial responses**: bamlss, R2BayesX, MCMCglmm, mgcv, mclogit.
- **Multi-trait analysis** (multiple dependent variables): BMTME, MCMCglmm, MegaLMM.
- **Non-Gaussian random effects**: brms, repeated, spaMM.
- **Ordinal-valued responses** (responses measured on an ordinal scale): ordinal, cplm.
- **Over-dispersed models**: aod, aods3.
- **Panel data**: in econometrics, *panel data* typically refers to subjects (individuals or firms) that are sampled repeatedly over time. The theoretical and computational approaches used by econometricians overlap with mixed models (e.g., see here). The plm package can fit mixed-effects panel models; see also the Econometrics task view.
- **Quantile regression**: lqmm, qrLMM, qrNLMM.
- **Phylogenetic models**: pez, phyr, MCMCglmm, brms.
- **Repeated measures** (packages with specialized covariance structures for handling repeated measures): nlme, mmrm, glmmTMB, LMMsolver, repeated.
- **Regularized/penalized models** (regularization or variable selection by ridge, lasso, or elastic net penalties): splmm fits LMMs for high-dimensional data by imposing penalties on both the fixed effects and random effects for variable selection. glmmLasso fits GLMMs with L1-penalized (LASSO) fixed effects. bamlss implements LASSO-like penalization for generalized additive models.
- **Robust/heavy-tailed estimation** (downweighting the importance of extreme observations): robustlmm, robustBLME (Bayesian robust LME), CRTgeeDR for the doubly robust inverse probability weighted augmented GEE estimator. Some packages (brms, bamlss, mgcv with `family = "scat"`, nlmixr2) allow heavy-tailed response distributions such as Student-*t*.
- **Skewed data**: skewlmm fits a scale mixture of skew-normal linear mixed models using expectation-maximization (EM). nlmixr2 can fit skewed data using a dynamic transform-both-sides approach with either `boxCox()` or `yeoJohnson()` transformations, estimated by maximum likelihood or the EM method “saem”.
- **Spatial models**: nlme (with `corStruct` functions), CARBayesST, sphet, spind, spaMM, glmmfields, glmmTMB, inlabru (spatial point processes via log-Gaussian Cox processes), brms, LMMsolver, bamlss; see also the Spatial and SpatioTemporal CRAN task views.
- **Sports analytics**: mvglmmRank, multivariate generalized linear mixed models for ranking sports teams.
- **Survival analysis**: coxme.
- **Tree-based models**: glmertree, semtree, gpboost.
- **Weighted models**: WeMix (linear and logit models with weights at multiple levels).
- **Zero-inflated models**: (frequentist) glmmTMB, cplm; (Bayesian) MCMCglmm, brms, bamlss, mgcv (zi Poisson only).

These packages do not directly provide functions to fit mixed models, but instead implement interfaces to general-purpose sampling and optimization toolboxes that can be used to fit mixed models. While models require extra effort to set up, and often require programming in a domain-specific language other than R, these frameworks are more flexible than most of the other packages listed here.

- Interfaces to JAGS/OpenBUGS: R2jags, rjags, R2OpenBUGS (BUGS language).
- Interfaces to Stan (C++ extensions): rstan, cmdstanr, rethinking.
- Other frameworks: TMB (automatic differentiation and Laplace approximation) (C++ extensions), tmbstan, nimble, greta (R interface to TensorFlow).

- **general**: HLMdiag (diagnostic tools for hierarchical (multilevel) linear models), rockchalk, performance, multilevelTools, merTools (for models fitted using `lme4`).
- **influential data points**: influence.ME, influence.SEM.
- **residuals**: DHARMa.

- **Correlations**: iccbeta (intraclass correlation), rptR (repeatabilities).
- ***R*^{2} calculations**: r2glmm (*R*^{2} and partial *R*^{2}), MuMIn (`r.squaredGLMM()` function), partR2, performance (`r2()` function). (Note that there are many different methods for computing *R*^{2} values for (G)LMMs: see e.g. Nakagawa, Johnson and Schielzeth (2017), Jaeger et al. (2017).)
- **Information criteria**: cAIC4 (conditional AIC), blmeco (WAIC).
- **Robust variance-covariance estimates**: clubSandwich, merDeriv.

The first and second derivatives of log-likelihood with respect to parameters can be useful for various model evaluation tasks (e.g., computing sensitivities, robust variance-covariance matrices, or delta-method variances).

Many packages include small example data sets (e.g., lme4, nlme). These packages provide previously described data sets often used in evaluating mixed models.

- mlmRev: examples from the Multilevel Software Comparative Reviews.
- SASmixed: data sets from *SAS System for Mixed Models*.
- StroupGLMM: R scripts and data sets for *Generalized Linear Mixed Models*.
- blmeco: Data and functions accompanying *Bayesian Data Analysis in Ecology using R, BUGS and Stan*.
- nlmeU: Data sets, functions and scripts described in *Linear Mixed-Effects Models: A Step-by-Step Approach*.
- VetResearchLMM: R scripts and data sets for *Linear Mixed Models. An Introduction with applications in Veterinary Research*.
- languageR: R scripts and data sets for *Analyzing Linguistic Data: A practical introduction to statistics using R*.
- nlmixr2data: includes the data sets for testing nlmixr2 against commercial competitors like ‘NONMEM’ and ‘Monolix’.

Functions and frameworks for convenient tabular and graphical output of mixed model results:

- **Tables**: huxtable, broom.mixed, rockchalk, parameters, modelsummary.
- **Figures**: dotwhisker, sjPlot, rockchalk.

These functions provide convenient frameworks to fit and interpret mixed models.

- **Model fitting**: multilevelmod, ez, mixlm, afex, dalmatian (wrapper to JAGS and nimble).
- **Model summary**: broom.mixed, insight.
- **Variable selection & model averaging**: LMERConvenienceFunctions, MuMIn, glmulti (see, e.g., maintainer’s blog or here for use with mixed models).

- **Fixed effects**: car, lmerTest, RVAideMemoire, emmeans, afex, pbkrtest, CLME.
- **Random effects**: varTestnlme, RLRsim, mvctm.

- pbkrtest, lme4 (`lme4::bootMer()` function), lmeresampler.

These topics are closely related because there are few available analytical methods for computing statistical power for mixed models; power usually needs to be estimated by simulation.

- **Power**: longpower, clusterPower, pass.lme, simr.
- **Simulation**: faux; `simulate()` in `lme4` (for formula arguments); rxode2, mrgsolve, PKPDsim (ODE/pharmacokinetic models).
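
The idea of estimating power by simulation can be sketched as follows: repeatedly simulate data from an assumed model, refit the model, and record how often the effect of interest is detected. This is a minimal illustration using `nlme::lme()`; the helper `sim_pvalue()` and all parameter values (effect size, standard deviations, sample sizes, number of replicates) are hypothetical choices, not taken from any package listed here.

```r
# Sketch only: simulation-based power for a group-level treatment effect
# in a random-intercept LMM. All numeric settings are hypothetical.
library(nlme)

# Simulate one data set, fit the model, return the p-value for 'trt'
sim_pvalue <- function(n_groups = 20, n_per = 5, beta_trt = 0.5,
                       sd_group = 1, sd_resid = 1) {
  group <- factor(rep(seq_len(n_groups), each = n_per))
  # Treatment is assigned at the group level, alternating 0/1 across groups
  trt <- rep(rep(0:1, length.out = n_groups), each = n_per)
  b   <- rnorm(n_groups, sd = sd_group)   # group-level random intercepts
  y   <- beta_trt * trt + b[as.integer(group)] +
         rnorm(n_groups * n_per, sd = sd_resid)
  dat <- data.frame(y = y, trt = trt, group = group)
  # Treat a failed fit as missing rather than stopping the simulation
  fit <- tryCatch(lme(y ~ trt, random = ~ 1 | group, data = dat),
                  error = function(e) NULL)
  if (is.null(fit)) return(NA_real_)
  summary(fit)$tTable["trt", "p-value"]
}

set.seed(101)
p_vals    <- replicate(200, sim_pvalue())
# Estimated power: share of simulated data sets with p < 0.05
power_est <- mean(p_vals < 0.05, na.rm = TRUE)
power_est
```

In practice more replicates would be used, and the simulation would be repeated across a grid of sample sizes or effect sizes.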

- cAIC4 (`cAIC4::stepcAIC`), buildmer, MuMIn, StatisticalModels (`GLMERSelect`).

- Help: R-SIG-mixed-models mailing list for discussion of mixed-model-related questions, course announcements, etc.
- Help: [r] + [mixed-models] tags on Stack Overflow.
- Help: Cross Validated.
- Other software: Mixed models Bioconductor
- Other software: ASReml-R (asremlPlus).
- Other software: assist.
- Other software: INLA.
- Other software: Zelig Project.
- Other software: MixWild/MixRegLS for scale-location modeling.
- Other software: MixedModels.jl for mixed models in Julia.
- Book: *Mixed-Effects Models in S and S-PLUS*.
- Book: *SAS System for Mixed Models*.
- Book: *Generalized Linear Mixed Models*.
- Book: *Bayesian Data Analysis in Ecology using R, BUGS and Stan*.
- Book: *Linear Mixed-Effects Models: A Step-by-Step Approach*.
- Book: *Mixed Effects Models and Extensions in Ecology with R*.
- Online Book: *Embrace Uncertainty: Mixed-effects models with Julia*.

- CRAN Task View: Cluster
- CRAN Task View: DifferentialEquations
- CRAN Task View: Econometrics
- CRAN Task View: MissingData
- CRAN Task View: Psychometrics
- CRAN Task View: Spatial
- CRAN Task View: SpatioTemporal
- CRAN Task View: TimeSeries
- GitHub Project: cmdstanr
- GitHub Project: cpgen
- GitHub Project: inlatools
- GitHub Project: lme4GS
- GitHub Project: lme4qtl
- GitHub Project: lmerMultiMember
- GitHub Project: LMMsolver
- GitHub Project: MegaLMM
- GitHub Project: rethinking
- GitHub Project: rmm
- GitHub Project: StatisticalModels