Phil recently posted on the challenge of extrapolation of inferences to new data. After telling the story of a colleague who flat-out refused to make predictions from his model of buildings to new data, Phil wrote, “This is an interesting problem because it is sort of outside the realm of statistics, and into some sort of meta-statistical area. How can you judge whether your results can be extrapolated to the ‘real world,’ if you can’t get a real-world sample to compare to?”

In reply, I wrote:

I agree that this is an important and general problem, but I don’t think it is outside the realm of statistics! I think that one useful statistical framework here is multilevel modeling. Suppose you are applying a procedure to J cases and want to predict case J+1 (in this case, the cases are buildings and J=52). Let the parameters be theta_1,…,theta_{J+1}, with data y_1,…,y_{J+1}, and case-level predictors X_1,…,X_{J+1}. The question is how to generalize from (theta_1,…,theta_J) to theta_{J+1}. This can be framed in a hierarchical model in which the J cases in your training set are a sample from population 1 and your new case is drawn from population 2. Now you need to model how much the thetas can vary from one population to another, but this should be possible. They’re all buildings, after all. And, as with hierarchical models in general, the more information you have in the observed X’s, the less variation you would hope to have in the thetas.

Unfortunately, I posted this response in the comments and it seems to have gotten lost. Or so I am guessing given that the long thread that followed included very little discussion of hierarchical modeling as a framework for extrapolation. Instead, there was lots of general discussion of bias, extrapolation, randomization, and statistical foundations.

General principles are fine but I like my above suggestion to frame the extrapolation problem as a hierarchical model because it points a way forward, linking general concerns about out-of-sample predictions to information that could be available in a specific problem.
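To make the two-population framing concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the function name `predictive_sd_new_case`, the simulated data, and especially `tau_between`, the standard deviation of the shift between population 1 and population 2, which cannot be estimated from population 1 alone and has to come from subject-matter judgment. For simplicity the sketch treats the thetas as known and ignores uncertainty in the regression coefficients.

```python
import numpy as np

def predictive_sd_new_case(theta, X, tau_between):
    """Rough predictive sd for theta_{J+1} drawn from a second population.

    theta       : (J,) case-level parameters from population 1 (treated as known)
    X           : (J, p) case-level predictors, including a column of ones
    tau_between : ASSUMED sd of the population-1-to-population-2 shift
    """
    J, p = X.shape
    # Least-squares fit of the thetas on the case-level predictors
    beta, *_ = np.linalg.lstsq(X, theta, rcond=None)
    resid = theta - X @ beta
    # Unexplained variation in the thetas within population 1
    sigma_within = np.sqrt(resid @ resid / (J - p))
    # Within-population and between-population variation add in variance
    sd_new = np.sqrt(sigma_within**2 + tau_between**2)
    return sigma_within, sd_new

# Simulated stand-in for J = 52 buildings (illustrative numbers only)
rng = np.random.default_rng(0)
J = 52
x = rng.normal(size=J)
theta = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=J)

ones = np.ones((J, 1))
sigma_no_x, sd_no_x = predictive_sd_new_case(theta, ones, tau_between=0.3)
sigma_with_x, sd_with_x = predictive_sd_new_case(
    theta, np.column_stack([ones, x]), tau_between=0.3
)

print(f"no predictors:   sigma_within={sigma_no_x:.2f}, predictive sd={sd_no_x:.2f}")
print(f"with predictor:  sigma_within={sigma_with_x:.2f}, predictive sd={sd_with_x:.2f}")
```

Including the predictor shrinks the unexplained variation in the thetas, but the predictive standard deviation for the new case can never fall below `tau_between`. That floor is where the judgment about how much buildings can differ between populations enters the analysis.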

I agree that there is no alternative to some sort of hierarchical modeling (at least if one wants to be statistical), but it requires credible arguments for exactly what is common (i.e., what can be hoped to replicate) between the units/studies.

With randomization, one might hope that the variation of effects (treatment interaction) replicates, and some study and summary of that variation would be useful (e.g., essentially no variation may support an unimportant effect or a simple common effect).

Without randomization, the biases will almost surely replicate as well, preventing any learning of direct interest. Information from outside the units/studies will be required to address this, and it would seem that this is best done using explicit priors that are informative about the biases, but few seem willing to do this.
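As a rough illustration of what an explicit prior on the bias does, here is a minimal sketch under a simple normal approximation. It assumes the observed estimate is distributed as N(theta + b, se^2), that the bias b has an informative prior N(bias_mean, bias_sd^2), and that theta has a flat prior. The function name and all numbers are hypothetical.

```python
import math

def bias_adjusted_estimate(estimate, se, bias_mean, bias_sd):
    """Posterior center and sd for theta under the normal model above.

    With a flat prior on theta, integrating out the bias b gives
    theta | data ~ N(estimate - bias_mean, se^2 + bias_sd^2).
    """
    center = estimate - bias_mean
    sd = math.sqrt(se**2 + bias_sd**2)
    return center, sd

# Illustrative numbers only
center, sd = bias_adjusted_estimate(estimate=2.0, se=0.5, bias_mean=0.4, bias_sd=1.2)
print(center, sd)
```

More data shrinks `se` but leaves `bias_sd` untouched, so under these assumptions the uncertainty about theta cannot fall below the prior uncertainty about the bias. Only outside information about the bias can reduce it further.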

Also agree that it is hard to get a sense of what readers make of the posts and comments, especially to get feedback on what did or did not make sense (there were a few notable exceptions).

“Now you need to model how much the thetas can vary from one population to another, but this should be possible.” — Researcher states extrapolation problem.

“They’re all buildings, after all.” — Researcher states theory/prior about the variability of thetas across pop. 1 and pop. 2: Items that look comparable are comparable.

“And, as with hierarchical models in general, the more information you have in the observed X’s, the less variation you would hope to have in the thetas.” — Researcher theory/prior slips in a shape restriction.

The key here is the assumptions and their justification. To the extent that hierarchical models help make these explicit, good. For example, one can always conjure the possibility that buildings in pop. 2 are made of a material u not found in pop. 1, and that the presence of u makes the buildings fragile. This is always a possibility, but how probable is it? Unlikely: “They’re all buildings, after all,” and “the more information you have in the observed X’s, the less variation you would hope to have in the thetas.”

[…] discussing issues of generalizing from sample to population. Here’s the basic theoretical idea (not new to me, it’s just coming from a bunch of papers from about 1970 to 1980 by Lindley, […]