In our recent thread on computing hierarchical models with big datasets, someone brought up Blup. I thought it might be worth explaining what Blup is and how it relates to hierarchical models.

Blup stands for Best Linear Unbiased Prediction, but in my terminology it’s just hierarchical modeling. Let me break it down:

- “Best” doesn’t really matter. What’s important is that our estimates and predictions make sense and are as accurate as possible.

- “Linear” isn’t so important. Statistical predictions are linear for Gaussian linear models, otherwise not. We can and do fit hierarchical generalized linear models all the time.

- “Unbiased” doesn’t really matter (see discussion of “Best,” above).

- “Prediction” is the key word for relating Blup and hierarchical modeling to classical statistical terminology. In classical statistics, “estimation” of a parameter theta is evaluated conditional on the true value of theta, whereas “prediction” of a predictive quantity phi is evaluated *unconditionally* on phi, but conditional on theta. “Prediction” is a way to do Bayesian inference in a classical setting. In the classical “empirical Bayes” framework, some of the unknowns are called “parameters” and some are called “predictive quantities” or missing data. “Unbiased estimation” is different from “unbiased prediction.” We discuss this briefly in BDA (maybe in a footnote somewhere).
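To make the estimation/prediction distinction concrete, here's a small simulation sketch (my illustration, not from any particular reference): in the simplest normal-normal setup, the raw observation is the unbiased estimator of each theta, while the shrinkage predictor (the Blup, or posterior mean) is biased conditional on theta but does better when evaluated unconditionally, averaging over the distribution of the unknowns. The model parameters mu, tau, sigma here are assumed known for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Normal-normal model: theta_j ~ N(mu, tau^2), y_j | theta_j ~ N(theta_j, sigma^2).
mu, tau, sigma = 0.0, 1.0, 1.0
J = 100_000

theta = rng.normal(mu, tau, size=J)  # the unknowns ("predictive quantities")
y = rng.normal(theta, sigma)         # one noisy observation per unit

# Unbiased estimator of theta_j, conditional on theta_j: just y_j itself.
# Shrinkage predictor (the Blup / posterior mean): pull y_j toward mu.
B = sigma**2 / (sigma**2 + tau**2)   # shrinkage factor
blup = mu + (1 - B) * (y - mu)

# Evaluate both unconditionally, averaging over the distribution of theta:
mse_unbiased = np.mean((y - theta) ** 2)     # ~ sigma^2
mse_blup = np.mean((blup - theta) ** 2)      # ~ sigma^2 * tau^2 / (sigma^2 + tau^2)

print(mse_unbiased, mse_blup)
```

The shrinkage predictor's mean squared error comes out about half that of the unbiased estimator here, which is the usual partial-pooling story: give up unbiasedness conditional on theta to gain accuracy unconditionally.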

For the purposes of modeling and data analysis, Blup is hierarchical regression.

The relevance of the Blup literature for computation for hierarchical models is not that Blup is a better or even a different method, but rather that Blup users have their own history with applications and big datasets and thus may have developed some useful tools that have not made their way into mainstream statistics and machine learning. Any computational method for Blup can be ported directly into computation for hierarchical modeling, and vice versa.