I Am Too Absolutely Heteroskedastic for This Probit Model

Soren Lorensen wrote:

I’m working on a project that uses a binary choice model on panel data. Since I have panel data and am using MLE, I’m concerned about heteroskedasticity making my estimates inconsistent and biased.

Are you familiar with any statistical packages with pre-built tests for heteroskedasticity in binary choice ML models? If not, is there value in cutting my data into groups over which I guess the error variance might vary and eyeballing residual plots? Have you other suggestions about how I might resolve this concern?

I replied that I wouldn’t worry so much about heteroskedasticity. Breaking up the data into pieces might make sense, but for the purpose of estimating how the coefficients might vary—that is, nonlinearity and interactions.

Soren shot back:

I’m somewhat puzzled however: homoskedasticity is an identifying assumption in estimating a probit model: if we don’t have it all sorts of bad things can happen to our parameter estimates. Do you suggest not worrying about it because the means of dealing with it are so noisy? [I had hoped to test for it using the algorithm suggested by Davidson & MacKinnon (1993) and to correct for it using a multiplicative heteroskedasticity model.]

I recently graduated from undergrad so my concerns stem from very recent study of econometrics (the professors for whom I work at first nearly scoffed at my concern), but could you please describe (or point me to a source / paper) on why we might not be so concerned about heteroskedasticity in maximum likelihood binary choice models?

To which I replied:

If you’re worried you can always check your model fitting using some simulated data. (That’s the sort of thing I always say.)

14 thoughts on “I Am Too Absolutely Heteroskedastic for This Probit Model”

  1. My professor had this concern as well when I worked on my Masters thesis. She had me run the regressions as GLS instead. Other professors in the department thought this was a terrible idea.

  2. I often encounter these sorts of students who try to use everything they learned in class. Just because you learned it doesn’t mean you should use it. My data mining class is raising tons of these types of students who just run decision trees and cross-validations automatically without thinking about whether those tools are appropriate for their end goal.

  3. But Raymond, this question *is* relevant to the end goal: we always want to maximize unbiasedness and consistency (subject to some constraints).

  4. If Soren is concerned with estimation and testing, that is, a priori specifying the hypothesis/parameter of interest (a measure of association) for inference via a probit model (not modeling the data for prediction), then heteroskedasticity is of course a concern when drawing inference on the parameter (say beta1), as the asymptotic distribution used for testing and constructing confidence intervals depends upon the ASSUMED mean-variance relationship of the model. A 95% CI would not cover the “truth” (what beta hat is consistent for) 95% of the time; this stems from the fact that the estimated variability of beta hat is wrong.

    The use of robust standard errors (sandwich estimator) would allow for valid inference (in the sense that 95% CI would cover the “truth” 95% of the time); that is, properly quantify the variability of beta hat.
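    For reference, the sandwich estimator mentioned above can be written out for a probit fit as follows. This is a sketch of the standard form; the symbols (the per-observation log-likelihood, the score s_i, and the matrices A and B) are notation introduced here, not taken from the comment.

```latex
% Per-observation probit log-likelihood and score
\ell_i(\beta) = y_i \log \Phi(x_i'\beta) + (1 - y_i) \log\bigl(1 - \Phi(x_i'\beta)\bigr),
\qquad
s_i(\beta) = \frac{\partial \ell_i}{\partial \beta}
           = \frac{\bigl(y_i - \Phi(x_i'\beta)\bigr)\,\phi(x_i'\beta)}
                  {\Phi(x_i'\beta)\bigl(1 - \Phi(x_i'\beta)\bigr)}\, x_i

% Sandwich ("robust") variance estimate, evaluated at \hat\beta
\widehat{V}(\hat\beta) = A^{-1} B A^{-1},
\qquad
A = -\sum_{i=1}^{n} \frac{\partial^2 \ell_i}{\partial \beta\, \partial \beta'},
\qquad
B = \sum_{i=1}^{n} s_i s_i'
```

    When the model is correctly specified, A and B estimate the same matrix and the expression collapses to the usual A^{-1}. Note that this only corrects the standard errors around whatever beta hat converges to; it does not change beta hat itself.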

  5. If I understand your answer, you’re pointing out that with fake simulated data, you can understand how well the model fits the data and whether something like heteroskedasticity is driving your model to the wrong places. Am I right?

    Manoel

    • Yes. By simulating data with, perhaps, means consistent with a given dataset but variances specified according to guesses about the dimensions along which the het. persists, via Monte Carlo, you can observe rather well how the het. affects the model.
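A fake-data check along these lines might look like the sketch below. Everything in it is an assumption made for illustration: the two groups, the error standard deviations of 1 and 2, the true slope of 1, and the function names `simulate`, `fit_probit`, and `run_monte_carlo`. It uses only the Python standard library and fits an ordinary (homoskedastic) probit by Fisher scoring, both within each group and pooled.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is Phi, pdf is phi


def simulate(n, b0, b1, sigma_by_group, rng):
    """Draw (x, group, y) with y = 1[b0 + b1*x + sigma_g * e > 0], e ~ N(0, 1)."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        g = rng.randrange(len(sigma_by_group))  # group assigned independently of x
        y_star = b0 + b1 * x + sigma_by_group[g] * rng.gauss(0.0, 1.0)
        rows.append((x, g, 1 if y_star > 0 else 0))
    return rows


def fit_probit(rows, max_iter=50, tol=1e-8):
    """Homoskedastic probit of y on (1, x) by Fisher scoring.

    Returns (intercept, slope).
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, _, y in rows:
            eta = b0 + b1 * x
            # Clip the fitted probability away from 0 and 1 for numerical safety.
            p = min(max(STD_NORMAL.cdf(eta), 1e-12), 1.0 - 1e-12)
            d = STD_NORMAL.pdf(eta)
            r = (y - p) * d / (p * (1.0 - p))  # score weight
            w = d * d / (p * (1.0 - p))        # expected-information weight
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 system (information matrix) * step = score.
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


def run_monte_carlo(reps=20, n=2000, seed=1):
    """Average the estimated slope over repeated fake datasets."""
    rng = random.Random(seed)
    totals = {"group0": 0.0, "group1": 0.0, "pooled": 0.0}
    for _ in range(reps):
        # Same true coefficients in both groups; only the error scale differs.
        rows = simulate(n, b0=0.0, b1=1.0, sigma_by_group=(1.0, 2.0), rng=rng)
        totals["group0"] += fit_probit([r for r in rows if r[1] == 0])[1]
        totals["group1"] += fit_probit([r for r in rows if r[1] == 1])[1]
        totals["pooled"] += fit_probit(rows)[1]
    return {name: total / reps for name, total in totals.items()}


if __name__ == "__main__":
    for name, slope in run_monte_carlo().items():
        print(f"{name}: mean estimated slope = {slope:.3f}")
```

Because a probit identifies coefficients only relative to the error scale, the group with error standard deviation 2 should show a slope near 1/2 even though the true slope is 1 in both groups, and the pooled fit should land somewhere in between. Swapping in variance patterns that match your own guesses about the data shows how much that matters for the estimates you care about.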

  6. Does anyone have a good data set or reference to illustrate this? I’m not sure why I’d expect to see this in any data I have, although now I’m curious to know more.

    • Allison, P. D., 1999. Comparing logit and probit coefficients across groups. Sociological Methods & Research 28 (2), 186–208.

      Buis, M. L., 2011. The Consequences of Unobserved Heterogeneity in a Sequential Logit Model. Research in Social Stratification and Mobility 29 (3), 247–262.

      Mare, R. D., 2006. Response: Statistical models of educational stratification: Hauser and Andrew’s models for school transitions. Sociological Methodology 11, 27–37.

      Mood, C., 2010. Logistic regression: Why we cannot do what we think we can do, and what we can do about it. European Sociological Review 26 (1), 67–82.

      Neuhaus, J. M., Jewell, N. P., 1993. A geometric approach to assess bias due to omitted covariates in generalized linear models. Biometrika 80 (4), 807–815.

      Neuhaus, J. M., Kalbfleisch, J. D., Hauck, W. W., 1991. A comparison of cluster-specific and population-averaged approaches for analyzing correlated binary data. International Statistical Review 59 (1), 25–35.

      Williams, R., 2009. Using heterogeneous choice models to compare logit and probit coefficients across groups. Sociological Methods & Research 37 (4), 531–559.

  7. Stata lets you estimate probit models with multiplicative heteroskedasticity, so you can specify which variables (covariates or external variables) affect the variance.
    The routine is “hetprob”.
    With panel data, however, it may be (much) more complicated, as you are presumably estimating a random effects model.
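    For readers who have not seen it, the multiplicative heteroskedasticity probit is usually written as below. The symbols are notation introduced here: x_i are the covariates in the index, z_i are the variables allowed to affect the variance (with no constant term), and gamma are their coefficients.

```latex
% Latent-variable form: the error scale depends on z_i
y_i^* = x_i'\beta + \sigma_i \varepsilon_i,
\qquad \varepsilon_i \sim N(0, 1),
\qquad \sigma_i = \exp(z_i'\gamma),
\qquad y_i = \mathbf{1}[\, y_i^* > 0 \,]

% Implied choice probability and log-likelihood
\Pr(y_i = 1 \mid x_i, z_i) = \Phi\!\left( \frac{x_i'\beta}{\exp(z_i'\gamma)} \right),
\qquad
\log L(\beta, \gamma) = \sum_{i=1}^{n}
  \Bigl[\, y_i \log \Phi_i + (1 - y_i) \log (1 - \Phi_i) \Bigr],
\quad \Phi_i = \Phi\!\left( \frac{x_i'\beta}{\exp(z_i'\gamma)} \right)
```

    Setting gamma = 0 recovers the ordinary probit, so a test of gamma = 0 is a test against this particular form of heteroskedasticity, not against heteroskedasticity in general.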
