Introduction

broom.mixed is a spinoff of the broom package. The goal of broom is to bring the modeling process into a “tidy”(TM) workflow, in particular by providing standardized verbs that provide information on

  • tidy: estimates, standard errors, confidence intervals, etc.
  • augment: residuals, fitted values, influence measures, etc.
  • glance: whole-model summaries: AIC, R-squared, etc.

broom.mixed aims to provide these methods for as many packages and model types in the R ecosystem as possible. These methods have been separated from those in the main broom package because there are issues that need to be dealt with for these models (e.g. different types of parameters: fixed, random-effects parameters, conditional modes/BLUPs of random effects, etc.) that are not especially relevant to the broader universe of models that broom deals with.

Mixed-model-specific issues

Terminology

  • the upper-level parameters that describe the distribution of random variables (variance, covariance, precision, standard deviation, or correlation) are called random-effect parameters (ran_pars in the effects argument when tidying)
  • the values that describe the deviation of the observations in a group level from the population-level effect (which could be posterior means or medians, conditional modes, or BLUPs depending on the model) are called random-effect values (ran_vals in the effects argument when tidying)
  • the parameters that describe the population-level effects of (categorical and continuous) predictor variables are called fixed effects (fixed in the effects argument when tidying)
  • the categorical variable (factor) that identifies which group or cluster an observation belongs to is called a grouping variable (group column in tidy() output)
  • the particular level of a factor that specifies which level of the grouping variable an observation belongs to is called a group level (level column in tidy() output)
  • the categorical or continuous predictor variables that control the expected value (i.e., enter into the linear predictor for some part of the model) are called terms (term column in tidy() output); note that unlike in base broom, the term column may have duplicated values, because the same term may enter multiple model components (e.g. zero-inflated and conditional models; models for more than one response; fixed effects and random effects)

Time-consuming computations

Some kinds of computations needed for mixed model summaries are computationally expensive, e.g. likelihood profiles or parametric bootstrapping. In this case broom.mixed may offer an option for passing a pre-computed object to tidy(), eg. the profile argument in the tidy.merMod (lmer/glmer) method.

There are many, many things one might want to do with a fitted model, and broom.mixed can only do a few of them.

  • emmeans
  • multcomp
  • car
  • afex
  • sjStats/sjPlots
  • rockchalk

huxtable + broom.mixed

dotwhisker + broom.mixed

dotwhisker is a convenient platform for creating dot-whisker plots - either directly from models or lists of models (tidy() methods are automatically called to convert the models to a tidy format), or from the (possibly post-processed) output of a tidy() call. There are a couple of caveats and issues to be aware of when using dotwhisker in conjunction with broom.mixed, however.

  1. For fixed effects, the group value is set to NA: in the current CRAN version (0.5.0), an unfortunate na.omit() within the dwplot code will eliminate all of the fixed effects unless you drop this column before passing the results to dwplot (this has been fixed in the current GitHub version, which you can install with devtools::install_github("fsolt/dotwhisker")).
  2. In broom.mixed output, it is fairly common for a single tidied model to have duplicated entries in the term column (e.g. effects that appear in both the conditional and the zero-inflated model, or intercept standard deviations for several different grouping variables). dotwhisker::dwplot takes this as evidence that it has been handed a tidied object containing the results from several different models, and asks for a model column that will distinguish the non-unique terms. There are (at least) two strategies you can take:
    • dwplot(list(fitted_model)) will plot all of the non-unique values together
    • tidy(fitted_model) %>% tidyr::unite(term, group, term) will create a new term column that’s the combination of the group and term columns (which will disambiguate random-effect terms from different grouping variables); unite(term, component, term) will disambiguate conditional and zero-inflation parameters. The code below shows a slightly more complicated (but prettier) approach. (Some sort of disambiguate_terms() function could be added in a future version of the package …)
library(dplyr)
library(tidyr)
library(broom.mixed)
if (require("brms") && require("dotwhisker") && require("ggplot2")) {
    L <- load(system.file("extdata", "brms_example.rda", package="broom.mixed"))
    gg0 <- (tidy(brms_crossedRE)
        ## disambiguate
        %>% mutate(term=ifelse(grepl("sd__(Int",term,fixed=TRUE),
                               paste(group,term,sep="."),
                               term))
        %>% dwplot
    )
    gg0 + geom_vline(xintercept=0,lty=2)
}

Capabilities

Automatically retrieve table of available methods:

## # A tibble: 22 × 4
##    class    tidy  glance augment
##    <chr>    <lgl> <lgl>  <lgl>  
##  1 allFit   TRUE  TRUE   FALSE  
##  2 brmsfit  TRUE  TRUE   TRUE   
##  3 gamlss   TRUE  TRUE   FALSE  
##  4 gamm4    TRUE  TRUE   TRUE   
##  5 glmmadmb TRUE  TRUE   TRUE   
##  6 glmmTMB  TRUE  TRUE   TRUE   
##  7 gls      TRUE  TRUE   TRUE   
##  8 lme      TRUE  TRUE   TRUE   
##  9 lmList4  TRUE  FALSE  FALSE  
## 10 mcmc     TRUE  FALSE  FALSE  
## # … with 12 more rows

Manually compiled list of capabilities (possibly out of date):

## Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
## logical.return = TRUE, : there is no package called 'pander'