R-squared: useful or evil?

I had the following email exchange with Gary King.

Me: I know you hate R-squared and you hate standardization; nonetheless you might like this paper and this one. I’ve found the standardization idea, in particular, very helpful; I’ve been using it in many applications recently.

Gary: If R-sq were used as a data summary only, I’d have no objection (as an aside, I think ‘data summary,’ which has good uses, often just means ‘it’s just description, so don’t bother me, anything goes!’). Instead, it is used as a measure of the quality or success or correctness or validity of the model, which is usually nuts.

Me: I agree with you there. By “data summary,” I more precisely mean something that inherently depends on the design of the data collection. Thus, keep the model the same but spread out the x’s, and R-squared goes up. But the model doesn’t change. Similarly, “stat. signif.” changes as you change the sample size.
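
To make that concrete, here is a minimal simulation sketch (my illustration, with made-up numbers rather than any real application): the same true model, y = 2 + 3x + noise, is fit to two datasets that differ only in how spread out the x’s are. R-squared comes out much higher under the wider design even though the model itself never changes.

```python
# Same true model (y = 2 + 3x + noise), two designs for x: bunched vs. spread out.
import numpy as np

rng = np.random.default_rng(0)

def r_squared(x, y):
    """Fit y = a + b*x by least squares and return R-squared."""
    b, a = np.polyfit(x, y, 1)
    resid = y - (a + b * x)
    return 1 - resid.var() / y.var()

n, sigma = 200, 5.0
x_narrow = rng.uniform(0, 1, n)    # x's bunched together
x_wide = rng.uniform(0, 10, n)     # same model, x's spread out

y_narrow = 2 + 3 * x_narrow + rng.normal(0, sigma, n)
y_wide = 2 + 3 * x_wide + rng.normal(0, sigma, n)

print("R^2, narrow design:", round(r_squared(x_narrow, y_narrow), 2))  # small
print("R^2, wide design:  ", round(r_squared(x_wide, y_wide), 2))      # much larger
```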

Gary: Spreading out the x’s is changing the model. Also, you can write down two equivalent models where the data give identical inferences about all the key parameters, but R2 can differ drastically. That’s not about the model or data summaries.
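
Here is one way to see the “two equivalent models” point in code (my illustration with simulated data, not Gary’s own example): regress y on x, and regress (y − x) on x. The two fits give identical residuals and identical standard errors, and the slope estimates differ by exactly 1, so the inference about the slope is the same; but the R-squared values differ drastically because the dependent variable has changed.

```python
# Two equivalent specifications of a simple regression, fit to made-up data:
# y on x, and (y - x) on x. Same inference about the slope, very different R^2.
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 10, n)
y = 0.5 + 1.1 * x + rng.normal(0, 1, n)   # true slope 1.1

def ols_summary(x, dep):
    """Simple-regression slope, its standard error, and R-squared."""
    b, a = np.polyfit(x, dep, 1)
    resid = dep - (a + b * x)
    sigma2 = resid @ resid / (len(x) - 2)            # residual variance
    se_b = np.sqrt(sigma2 / ((x - x.mean()) ** 2).sum())
    r2 = 1 - resid.var() / dep.var()
    return b, se_b, r2

print("y on x:       slope=%.3f  se=%.3f  R^2=%.2f" % ols_summary(x, y))
print("(y - x) on x: slope=%.3f  se=%.3f  R^2=%.2f" % ols_summary(x, y - x))
# The slopes differ by exactly 1 and the standard errors match,
# but R^2 drops from about 0.9 to near zero.
```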

Me: What’s in the model depends on where you draw the line. If you have a dose-response model of the form y = f(x) + error, and you’re interested in f, then I don’t consider x part of the model; you set x to get a good estimate of f. At the other extreme, you can define anything as part of the model. Even sample size is part of the model if you consider it as a random variable. But I see what you’re talking about. You’re talking about comparing models. I’m not particularly interested in comparing models. I’m using R^2 to understand a single model (in particular, the way in which a particular dataset is informative about that single model). If I were to compare models, I’d do it directly.

Gary: So if you want to compare models, you wouldn’t use R^2. But almost all uses of R^2 in the literature are about comparisons of some kind, even when implicit (the R^2 indicates that my model is better than yours! etc.). Anyway, I agree that it shouldn’t be used to compare models, although one (perhaps the only?) valid use of R^2 is to compare two models or two specifications so long as they have the same dependent variable. The problem with R^2 is comparisons of data or models or anything else when Y changes. The problem is identifying the question that R^2 is the optimal answer to.
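
For what it’s worth, a small sketch of that caveat (again my illustration, with simulated data): when the dependent variable is the same, two R-squared values at least refer to the same total variation, so comparing them is interpretable; once the outcome changes, say from y to log y, the numbers no longer measure the same thing and the comparison falls apart.

```python
# R^2 for three fits to made-up data: two specifications with the same outcome y
# (comparable with each other), and one with a different outcome, log(y).
import numpy as np

rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = np.exp(0.5 + 0.8 * x1 + rng.normal(0, 0.5, n))   # positive outcome

def r_squared(X, dep):
    """R-squared from a least-squares fit of dep on the columns of X plus an intercept."""
    X1 = np.column_stack([np.ones(len(dep)), X])
    beta, *_ = np.linalg.lstsq(X1, dep, rcond=None)
    resid = dep - X1 @ beta
    return 1 - resid.var() / dep.var()

# Same dependent variable, two specifications: the comparison is interpretable.
print("y ~ x1:      ", round(r_squared(x1, y), 2))
print("y ~ x1 + x2: ", round(r_squared(np.column_stack([x1, x2]), y), 2))
# Different dependent variable: not comparable with the numbers above.
print("log(y) ~ x1: ", round(r_squared(x1, np.log(y)), 2))
```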

That’s enough for now, I’m sure…
