Standardizing regression inputs

Andy Flies, Ph.D. candidate in zoology, writes:

After reading your paper about scaling regression inputs by two standard deviations, I found your blog post stating that you wished you had scaled by 1 sd and coded the binary inputs as -1 and 1. Here is my question:

If you code the binary input as -1 and 1, do you then standardize it? This makes sense to me because the mean of the standardized input is then zero and the sd is 1, which is what the mean and sd are for all of the other standardized inputs. I know that if you code the binary input as 0 and 1 it should not be standardized.

Also, I am not interested in the actual units (e.g., mg/ml) of my response variable, and I would like to compare a couple of different response variables that are on different scales. Would it make sense to standardize the response variable also?

My reply: No, I don’t standardize the binary input. The point of standardizing inputs is to make the coefs directly interpretable, but with binary inputs the interpretation is already clear, since there is only one possible comparison. And yes, I do standardize continuous responses, unless the analysis is on the log scale (as in an elasticity model), in which case, again, the coefs are already directly interpretable.
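To make the recipe concrete, here is a minimal sketch in Python/NumPy with made-up data; the variable names (age, treated, y) are placeholders, not anything from the original question:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data standing in for the real variables.
age = rng.normal(40, 12, size=200)       # continuous input
treated = rng.integers(0, 2, size=200)   # binary input coded 0/1
y = 3.0 + 0.05 * age + 1.2 * treated + rng.normal(0, 1, size=200)

# Continuous input: center and scale, dividing by 2 sd (the paper's rule)
# or by 1 sd (the later preference mentioned above).
age_z = (age - age.mean()) / (2 * age.std())

# Binary input: recode 0/1 as -1/+1 and leave it unstandardized;
# its coefficient is already the one possible comparison.
treated_pm = 2 * treated - 1

# Continuous response: standardize it too when its original units
# (e.g., mg/ml) are not of interest.
y_z = (y - y.mean()) / y.std()
```

For what it's worth, a -1/+1 input with roughly equal group sizes already has mean near 0 and sd near 1, so it ends up on the same footing as the standardized continuous inputs without any further scaling; dividing it by its sd would only muddy the interpretation.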

1 thought on “Standardizing regression inputs”

  1. In our practice, how we approach this depends on what is being measured. Sometimes the original units of measurement make good sense. For things like age we often rescale to 10-year intervals; coefs scaled by the SD of age are sometimes more cumbersome to interpret and compare.
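A tiny sketch of that kind of rescaling, with hypothetical ages in years, contrasted with sd scaling:

```python
import numpy as np

age = np.array([23.0, 35.0, 47.0, 51.0, 68.0])  # ages in years (made-up)

# Rescale to decades: a one-unit change in the input is ten years,
# so the coefficient reads as "change in outcome per decade of age".
age_decades = age / 10

# Compare with sd scaling, where the unit is one sample sd of age
# and its real-world meaning depends on the data at hand.
age_sd = (age - age.mean()) / age.std()
```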
