Ryan Raaum writes:

I’m hoping you’ll be willing to shed some light on a question I have regarding “pooling” in modeling. In your book with Jennifer Hill, you lay out two ends of a spectrum for dealing with structured data: (1) “Complete pooling” – ignoring the groups and pooling everything together for an overall average ± standard error, and (2) “No pooling” – subsetting the data by group and calculating per-group averages ± standard errors. Which is very clear to me. However, many discussions of multilevel models that I have found equate “no pooling” with the usual handling of categorical predictors in standard linear models. Is there not _some_ pooling in the standard linear modeling approach? Not in the group averages, which are identical to those calculated by subsetting the data, but could one say that the residual variation is “pooled” across all observations and then dispersed across all groups?

To illustrate, I’ve used the Galton height data, with 898 measurements from 197 families (average ~4 individuals per family, range 1-15). Working with a subset of the total dataset (so the plots aren’t too crammed), in the first plot I have the “No Pooling” averages and 95% CIs (by subsetting) and the global average and 95% CI. In the second plot, I have the averages and 95% CIs from a standard linear model using dummy variables for the different families. Does the regularization of uncertainty for the per-group average estimates in the second plot count as “pooling” or would you describe it some other way?

My reply: Yes, “no pooling” is relative to the model; it does not mean that the data from different groups are analyzed completely separately.
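To make the distinction concrete, here is a minimal sketch in Python (standard library only) of the two calculations Ryan describes. The family labels, group sizes, and simulated heights are made up for illustration and are not the Galton data, and the function names are hypothetical. The sketch compares standard errors rather than full 95% intervals, which would also need t quantiles with different degrees of freedom in the two approaches.

```python
import math
import random
import statistics


def no_pooling_by_subsetting(groups):
    """Per-group mean and standard error using only that group's own data."""
    out = {}
    for name, ys in groups.items():
        # A group with a single observation has no within-group sd estimate.
        if len(ys) > 1:
            se = statistics.stdev(ys) / math.sqrt(len(ys))
        else:
            se = float("nan")
        out[name] = (statistics.mean(ys), se)
    return out


def dummy_variable_model(groups):
    """Least-squares fit with one indicator per group and no intercept.

    The coefficients are the group means. The standard errors all use one
    residual sd, estimated from the residuals of every group together.
    """
    means = {name: statistics.mean(ys) for name, ys in groups.items()}
    n = sum(len(ys) for ys in groups.values())
    rss = sum((y - means[name]) ** 2 for name, ys in groups.items() for y in ys)
    sigma = math.sqrt(rss / (n - len(groups)))  # shared residual sd
    fit = {name: (means[name], sigma / math.sqrt(len(ys)))
           for name, ys in groups.items()}
    return fit, sigma


# Simulated heights (inches) for five hypothetical families of unequal size.
random.seed(0)
sizes = {"A": 2, "B": 4, "C": 8, "D": 15, "E": 1}
groups = {name: [random.gauss(67, 2.5) for _ in range(k)]
          for name, k in sizes.items()}

sub = no_pooling_by_subsetting(groups)
fit, sigma = dummy_variable_model(groups)

print("family  n   mean   se(subset)  se(dummy-variable model)")
for name, ys in groups.items():
    print(f"{name:>6} {len(ys):>2} {fit[name][0]:6.2f} "
          f"{sub[name][1]:11.2f} {fit[name][1]:10.2f}")
```

The two columns of means agree exactly, but the standard errors do not. Under subsetting each family's uncertainty depends on its own sample sd (and is undefined for a family of one), while in the dummy-variable model every family's standard error is the shared residual sd divided by the square root of its size. That shared sd is the sense in which the residual variation is pooled.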

> could one say that the residual variation is “pooled” across all observations

I went into perhaps too much detail about those issues here: http://andrewgelman.com/wp-content/uploads/2011/05/plot13.pdf

In particular, see figure 4.