
Problems with Heterogeneous Choice Models

Introduction

Heterogeneous choice models allow researchers to model the variance of individual-level choices. These models have their roots in heteroskedasticity (unequal variance) and the problems it creates for statistical inference. In linear regression, unequal variance does not bias the coefficient estimates; rather, it makes the estimated standard errors too large or too small. Unequal variance is more problematic in discrete choice models, such as logit or probit (and their ordered or multinomial variants). If the error term of a discrete choice model has unequal variances, not only are the standard errors incorrect, but the parameter estimates are also biased.
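To make the source of that bias concrete, here is the standard textbook form of a heteroskedastic probit. The notation (x_i for choice covariates, z_i for variance covariates, beta and gamma for their coefficients) is introduced here for illustration and is not taken from the original text.

```latex
\begin{aligned}
y_i^* &= x_i'\beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma_i^2), \qquad \sigma_i = \exp(z_i'\gamma), \\
y_i &= \mathbf{1}[\, y_i^* > 0 \,], \\
\Pr(y_i = 1 \mid x_i, z_i) &= \Phi\!\left( \frac{x_i'\beta}{\exp(z_i'\gamma)} \right).
\end{aligned}
```

An ordinary probit fixes sigma_i = 1 for every observation. When gamma is nonzero, the index that actually drives the choice is x_i'beta / sigma_i, which is no longer linear in x_i, so a probit that ignores the variance equation recovers neither beta nor a constant rescaling of it.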

Example

The classic political science example involves ambivalence and value conflict. Alvarez and Brehm (1995) modeled the variation in responses to survey questions on abortion to demonstrate that this variation results not from respondents offering ill-informed opinions, but from the ambivalence that comes with wrestling with a difficult and important choice. This is a case where the variability in the choice is actually more interesting than what determines the choice.

Estimation

Estimating a heterogeneous choice model is fairly straightforward. A researcher who suspects heterogeneous choices can select a set of covariates and model the heterogeneity. However, the properties of these models are not well understood: there is little analytical or empirical evidence on how well they perform.
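As a rough illustration of why estimation is mechanically simple, the sketch below writes out the log-likelihood of a heteroskedastic probit using only the Python standard library. It assumes a probit link and an exponential variance function; the function name het_probit_loglik and its argument layout are hypothetical, not the authors' code.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit link


def het_probit_loglik(beta, gamma, X, Z, y):
    """Log-likelihood of a heteroskedastic probit (illustrative sketch).

    beta  : coefficients of the choice equation, one per column of X
    gamma : coefficients of the variance equation, one per column of Z
    X, Z  : lists of rows; X should include a constant, Z should not
    y     : list of 0/1 outcomes
    """
    total = 0.0
    for x_i, z_i, y_i in zip(X, Z, y):
        index = sum(b * x for b, x in zip(beta, x_i))
        # Assumed variance function: sigma_i = exp(z_i' gamma) > 0
        sigma = math.exp(sum(g * z for g, z in zip(gamma, z_i)))
        p = _STD_NORMAL.cdf(index / sigma)
        # Guard against log(0) when p is numerically 0 or 1
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total += math.log(p) if y_i == 1 else math.log(1.0 - p)
    return total
```

Setting gamma to zero gives sigma_i = 1 for everyone, so the function reduces to the ordinary probit log-likelihood. In practice one would pass the negative of this function to a numerical optimizer to obtain maximum likelihood estimates of beta and gamma.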

Monte Carlo Experiments

Using Monte Carlo experiments, we examine how well these models estimate the parameters used to make inferences about heterogeneous choices. We find that these models are deeply flawed. Not only are they an ineffective “cure” for heteroskedasticity, but their parameter estimates are biased and their estimated standard errors are incorrect.

The estimated sampling variability and coverage rates were less than ideal even under a perfect specification. Measurement error in the variance model induced significant amounts of bias, and almost any specification error caused the estimates of both the choice and variance models to be completely unreliable. Even in models where the variable used to estimate the choice was correlated with the true variable at 0.90, all the parameters were estimated very poorly, with bias of over 60%. And it is entirely possible to misspecify both the choice and variance models, which should only make the estimates worse.
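The kind of experiment described above can be sketched as follows. This is a minimal illustration under assumed settings, not the authors' design: the data-generating process, the single covariate in each equation, and the names simulate_dataset, percent_bias, and coverage_rate are all hypothetical. The proxy is built so that its population correlation with the true choice covariate equals rho (for example 0.90).

```python
import math
import random


def simulate_dataset(n, beta, gamma, rho, rng):
    """Draw one sample from an assumed heteroskedastic probit DGP.

    beta  : (intercept, slope) of the choice equation
    gamma : coefficient on z in the variance equation
    rho   : population correlation between x and its mismeasured proxy
    Returns (x, x_proxy, z, y) as lists of length n.
    """
    x, x_proxy, z, y = [], [], [], []
    for _ in range(n):
        x_i = rng.gauss(0.0, 1.0)
        z_i = rng.gauss(0.0, 1.0)
        # Proxy with unit variance and correlation rho with the true x
        proxy_i = rho * x_i + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        sigma_i = math.exp(gamma * z_i)  # error s.d. varies with z
        y_star = beta[0] + beta[1] * x_i + rng.gauss(0.0, sigma_i)
        x.append(x_i)
        x_proxy.append(proxy_i)
        z.append(z_i)
        y.append(1 if y_star > 0 else 0)
    return x, x_proxy, z, y


def percent_bias(estimates, truth):
    """Mean estimate minus the true value, as a percentage of the truth."""
    mean_est = sum(estimates) / len(estimates)
    return 100.0 * (mean_est - truth) / truth


def coverage_rate(estimates, std_errors, truth, crit=1.96):
    """Share of replications whose normal-theory interval contains the truth."""
    hits = sum(
        1 for est, se in zip(estimates, std_errors)
        if abs(est - truth) <= crit * se
    )
    return hits / len(estimates)
```

In a full experiment one would call simulate_dataset many times, fit the model on each draw using x_proxy in place of x, and pass the collected estimates and standard errors to percent_bias and coverage_rate. With nominal 95% intervals, a well-behaved estimator should show coverage close to 0.95.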

Conclusion

Of course, it is easier to tear down than to build up. To that end, we intend to devise an alternative way to model heterogeneous choices. Bayesian estimation techniques offer such a possibility, with their focus on the variance of all parameters. Such an approach should give us better leverage over the heterogeneity of individual choices.