Jay Goodliffe’s comments on standardizing regression inputs

Jay Goodliffe writes,

I recently read your paper on scaling coefficients that you posted on the PolMeth site. I hope you don’t mind if I send a comment/question on your manuscript.

I usually like to include some sort of “substantive significance” table after the regular tables to report something like first differences. I have also thought recently about how to compare relative effects of variables when some variables are binary and others are not.

My current approach is to code all binary variables with the modal category as 0, set all variables to their median, and then see how the predicted dependent variable changes when each independent variable is moved to the 90th percentile, one at a time. This approach makes it easy to specify the “baseline” observation, so there are no .5 Female voters, which occurs if all variables are set to the mean instead. There are, of course, some problems with this. First, you need all of the binary variables to have at least 10% of the observations in each category. Second, it’s not clear this is the best way to handle skewed variables. But it is similar in kind to what you are suggesting.

My comment is that your approach may not always work so well for skewed variables. With such variables, the range mean +/- s.d. will be beyond the range of observed data. Indeed, in your NES example, Black is such a variable. In linear models, this does not matter since you could use the range [mean, mean + 2 s.d.] and get the same size effect. But it might matter in non-linear models, since it matters what the baseline is. And there is something less…elegant in saying that you are moving Black from -0.2 to 0.5, rather than 0 to 1.

My question is: You make some comments in passing that you prefer to present results graphically. Could you give me a reference to something that shows your preferred practice?



P.S. I’ve used tricks from the _Teaching Statistics_ book in my undergraduate regression class.

To start with, I like anyone who uses our teaching tricks, and, to answer the last question first, here’s the reference to my preferred practice on making graphs instead of tables.

On to the more difficult questions: There are really two different issues that Jay is talking about:

1. What’s a reasonable range of variation to use in a regression input, so as to interpret how much of its variation translates into variation in y?

2. How do you summarize regressions in nonlinear models, such as logistic regression?

For question 1, I think my paper on scaling by dividing by two sd’s provides a good general answer: in many cases, a range of 2 sd’s is a reasonable low-to-high range. It works for binary variables (if p is not too far from .5) and also for many continuous variables (where the mean-sd is a low value, and the mean+sd is a high value). For this interpretation of standardized variables, it’s not so important that the range be mean +/- 1 sd; all that matters is the total range. I agree that it’s harder to interpret the range for a binary variable where p is close to 0 or 1 (for example, the indicator for African American). But in these cases, I don’t know that there’s any perfect range to pick: going from 0 to 1 seems like too much, since it overstates the reasonable changes that could be expected, and I’m happy with 2 sd’s as a choice.
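
To make the arithmetic concrete, here is a minimal sketch of the rescaling in Python. It is an illustration only, not code from the paper: the function name `rescale_2sd` and the two toy binary inputs are made up for this example, and it assumes the population sd (the sample sd would give nearly the same answer for data of this size).

```python
import statistics


def rescale_2sd(values):
    """Center an input at its mean and divide by two standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumption: population sd, not sample sd
    if sd == 0:
        raise ValueError("cannot rescale a constant input")
    return [(v - mean) / (2 * sd) for v in values]


# Hypothetical binary input with p = 0.5: sd = 0.5, so 2 sd = 1.
balanced = [0] * 50 + [1] * 50
balanced_scaled = rescale_2sd(balanced)
# Size of the 0-to-1 change, measured on the rescaled scale.
balanced_gap = max(balanced_scaled) - min(balanced_scaled)

# Hypothetical binary input with p = 0.1: sd = 0.3, so 2 sd = 0.6.
rare = [0] * 90 + [1] * 10
rare_scaled = rescale_2sd(rare)
rare_gap = max(rare_scaled) - min(rare_scaled)

print(f"p = 0.5: 0-to-1 change covers {balanced_gap:.2f} rescaled units")
print(f"p = 0.1: 0-to-1 change covers {rare_gap:.2f} rescaled units")
```

When p is near .5, a one-unit change on the rescaled scale is the same as going from 0 to 1. When p is near 0 or 1, the full 0-to-1 change is larger than 2 sd's, which is the sense in which going from 0 to 1 can overstate a reasonable change.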

For question 2, we have another paper just on the topic of these predictive comparisons. The short answer is that, rather than picking a single center point to make comparisons, we average over all of the data, considering each data point in turn as a baseline for comparisons. (I’ll have to post a blog entry on this paper too….)
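
The idea of using each data point in turn as the baseline can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure from the paper: the function name `average_predictive_difference`, the coefficients, and the four data rows are all hypothetical, and the sketch simply switches one input between a low and a high value while leaving the other inputs at their observed values.

```python
import math


def inv_logit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def average_predictive_difference(rows, coefs, intercept, index, low, high):
    """Average over all rows of P(y=1 | x[index]=high) - P(y=1 | x[index]=low).

    Each row serves once as the baseline; its other inputs stay as observed.
    """
    total = 0.0
    for row in rows:
        lo_row = list(row)
        hi_row = list(row)
        lo_row[index] = low
        hi_row[index] = high
        p_lo = inv_logit(intercept + sum(b * x for b, x in zip(coefs, lo_row)))
        p_hi = inv_logit(intercept + sum(b * x for b, x in zip(coefs, hi_row)))
        total += p_hi - p_lo
    return total / len(rows)


# Hypothetical fitted logistic regression: one binary and one continuous input.
intercept = -0.5
coefs = (1.0, 0.8)
rows = [(0, -1.0), (0, 0.0), (1, 0.5), (0, 2.0)]

# Average change in predicted probability when the binary input goes 0 -> 1.
apd = average_predictive_difference(rows, coefs, intercept, index=0, low=0, high=1)
print(f"average predictive difference: {apd:.3f}")
```

Because the logistic curve is nonlinear, the difference in predicted probability varies from row to row; averaging over the rows avoids having to pick a single center point.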

One Comment

  1. Eric L. Sevigny says:

    Hello, I have a question regarding using standardization to assess relative impact of independent variables in a regression. I read Gelman's "Scaling regression inputs by dividing by two standard deviations". An unanswered question for me is how one might go about interpreting the relative importance of an entire construct when measured as a multicategory nominal variable. For example, for drug type when dummies for heroin, meth, and cocaine are included and marijuana is the reference category.

    To provide some context, I am attempting to determine which Xs have the greatest effect on the Y. Y is natural log transformed. The Xs include natural log transformed, untransformed continuous, binary, and multi-category nominal variables.

    Not to complicate the picture too much, but I am employing a design-based approach to account for the survey's complex sampling design as well.

    Any comments would be helpful.