Clustered standard errors vs. hierarchical modeling

I received the following question in the mail:

My question has to do with the choice between OLS and clustered standard errors, on the one hand, and hierarchical modeling, on the other hand. In finance and perhaps to a lesser extent in economics generally, people seem to use clustered standard errors. Hierarchical modeling seems to be very rare. When I ask financial economists about it, no one even knows what it is. Everyone, however, knows about clustered standard errors. The only reason why I know about hierarchical modeling is that an epidemiologist brought it to my attention. Eventually, I found your book.

It seems to me (and I’m about as formally untrained in statistics as one can get; it’s a long story) that both OLS with clustered standard errors and hierarchical modeling are intended to address the same problem: the correlation of residuals within a cluster (be it a state, as in some of your research, or a country, as in my research). Am I correct here?

Assuming that I am correct, then the obvious question for an applied person like me is: which method, OLS with clustered standard errors or hierarchical modeling, is best for my particular problem? I realize that the Primo et al. article that you link on your blog essentially tackles this problem, but it doesn’t seem a very satisfying treatment to me. I realize that you have a brief reference in your text, but that seems to be it. Do you know of other papers that address the choice between clustered OLS and hierarchical modeling? . . .

I’m particularly concerned with how hierarchical modeling deals with small clusters. Take Venezuela. As I said, there are only two observations. Even if these are randomly chosen, how much do two observations tell us about the ownership concentration in Venezuela? Not much, if anything. Yet those two observations seem to make a big difference for hierarchical modeling, at least compared with clustered OLS.

Sorry to be so long-winded, but this sure has been a challenging research project. I guess I have three specific questions (all noted above in my long-winded way).

1. Do you know of any papers (other than Primo et al.) comparing clustered OLS with hierarchical modeling? (BTW, if such a paper doesn’t exist, you should think about writing it. I think it could be very influential.)

2. Does it seem crazy to use country averages of ownership concentration when you have the firm-level data and you want to understand a firm-level characteristic?

3. In my case of firms nested within countries, with both firm-level and country-level determinants, and unbalanced clusters, including some very small clusters, which would you recommend: clustered OLS or hierarchical modeling?

My reply: In econometrics the usual focus is on estimating a single beta, in which case clustered se’s or hierarchical models are different ways of handling correlated errors. But once you start to be interested in varying coefficients (e.g., what’s happening in individual states and countries), then I think hierarchical models are better because they do partial pooling of coefficients when sample sizes are small. I don’t know of any articles on the topic, but maybe there’s something out there.
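To make the partial-pooling point concrete, here is a toy sketch in Python. The data values, the Brazil cluster, and the variance settings sigma2 (within-country) and tau2 (between-country) are all invented for illustration; in a real hierarchical model these variances would be estimated from the data rather than fixed.

```python
# Toy illustration of partial pooling: each country's estimate is a
# precision-weighted compromise between its own mean and the grand mean,
# with the pull toward the grand mean strongest for small clusters.

def partial_pool(cluster_data, sigma2, tau2):
    """Shrink cluster means toward the grand mean.

    sigma2: assumed within-cluster variance of observations.
    tau2:   assumed between-cluster variance of true cluster means.
    """
    all_obs = [y for ys in cluster_data.values() for y in ys]
    mu = sum(all_obs) / len(all_obs)  # grand mean across all observations
    pooled = {}
    for name, ys in cluster_data.items():
        n = len(ys)
        ybar = sum(ys) / n
        # weight on the cluster's own mean grows with its sample size n
        w = (n / sigma2) / (n / sigma2 + 1 / tau2)
        pooled[name] = w * ybar + (1 - w) * mu
    return pooled

data = {"Venezuela": [0.9, 0.8],  # only two firms, as in the question
        "Brazil": [0.5, 0.6, 0.4, 0.5, 0.55, 0.45, 0.5, 0.6]}
est = partial_pool(data, sigma2=0.04, tau2=0.01)
```

With only two observations, Venezuela’s raw mean (0.85) is pulled strongly toward the grand mean, while the larger invented cluster barely moves; clustered OLS has no analogous mechanism for stabilizing small-cluster estimates.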

4 thoughts on “Clustered standard errors vs. hierarchical modeling”

  1. I share the original poster's interest in understanding why some people are excited about multilevel models. It is true that economists do not share this enthusiasm to the same degree.

    In my experience economists' first preoccupation is to

    1. get an unbiased estimate of the effect they are interested in (which typically includes interactions/non-linearities, contrary to your impression from reading the Angrist & Pischke book).

    After this they try to get

    2. the standard errors right.

    Somehow your remark seems to confound 1 and 2.

    I think that economists see multilevel models as general random effects models, which they typically find less compelling than fixed effects models.

    Ed.

  2. Ed: I agree with what you wrote about the economists' perspective. As I wrote above, multilevel models are particularly helpful when you're estimating varying treatment effects or, more generally, interactions. Multilevel models are also helpful for problems of prediction. If all you care about is a single "beta," then the motivation for multilevel models is indeed less compelling, although even there they can help in settings with unbalanced designs or small sample sizes.

  3. Hi All,

    Here are a few papers that I've found helpful on these topics plus my own hunches based on them.

    (1) David Freedman did not like the post-hoc correction strategy in general: his article here.

    Among other worries, he seems to suggest that estimating a model by MLE that one knows is wrong and then post-hoc adjusting some pieces of it is conceptually confused. He does note that the linear case makes such a correction easier to justify in some ways. I do not know much about the theory of quasi-likelihood estimation (i.e., estimating wrong models but perhaps correct means and variances) or whether it might help with some of the conceptual confusion described by Freedman.

    (2) Don Green and Lynn Vavreck compare these methods for the analysis of cluster-assigned experiments (only varying intercepts in the multilevel aka random effects models): here

    (3) Eduardo Leoni compares these kinds of strategies for logit models: Analyzing Multiple Surveys: Results from Monte Carlo Experiments

    I seem to recall that Leoni highlights the importance of remembering that the cluster-correction is consistent in the number of clusters.

    Those pieces, plus reading of the literature on the HC1, HC2, HC3, HC4, etc. versions of the correction (each of which tries to manage the problems in the original correction that stem from influential points), make me worry especially about the coverage of confidence intervals produced from the cluster correction when the number of clusters is small, when the model fit is not linear, and/or when there are many especially influential points. (For more on HC1–HC4, see Cribari-Neto, F. 2004. Asymptotic inference under heteroskedasticity of unknown form. Computational Statistics and Data Analysis; and Long, J. S., and Ervin, L. H. 2000. Using heteroskedasticity-consistent standard errors in the linear regression model. American Statistician 54, 217–224.)
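    For concreteness, here is a minimal numpy sketch of the HC0–HC3 family discussed above. The leverage adjustments in HC2 and HC3 are what target the influential-point problem. This is illustrative only (data and function name are invented); real work would use a tested library implementation.

```python
import numpy as np

def hc_se(X, y, kind="HC0"):
    """Heteroskedasticity-consistent 'sandwich' standard errors for OLS."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta                        # OLS residuals
    h = np.sum((X @ XtX_inv) * X, axis=1)   # leverages (hat-matrix diagonal)
    if kind == "HC0":
        u2 = e**2                           # White's original estimator
    elif kind == "HC1":
        u2 = e**2 * n / (n - k)             # degrees-of-freedom scaling
    elif kind == "HC2":
        u2 = e**2 / (1 - h)                 # leverage adjustment
    elif kind == "HC3":
        u2 = e**2 / (1 - h) ** 2            # stronger leverage adjustment
    meat = X.T @ (X * u2[:, None])          # sum_i u2_i * x_i x_i'
    V = XtX_inv @ meat @ XtX_inv            # the sandwich
    return np.sqrt(np.diag(V))
```

    Because the leverages satisfy 0 <= h < 1, the HC3 standard errors are never smaller than HC2, which in turn are never smaller than HC0; the corrections matter most exactly when some points are highly influential.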

    Are multilevel models better in those situations? Correct coverage of confidence intervals will still depend on the number of clusters being large (of course, a meaningful posterior does not depend on large sample sizes). And the distinction between Normal and Binomial or Poisson outcome models seems less important for the issue of getting standard errors right. Although I suppose that with few clusters the attention ought to turn to getting the models of the parameters (the priors) to be credible, since they will exert more control over the posterior as the amount of information in the likelihood goes down.

    What to do about situations with thousands of people in relatively few states? I don't think the thousands of people help overcome the problem of few clusters when it comes to the cluster correction. I suspect they may help more when it comes to multilevel models, but that, still, the operating characteristics of the multilevel model for estimating cluster-level effects depend on large numbers of clusters.
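    The dependence on the number of clusters is visible directly in the structure of the cluster-robust sandwich: the "meat" is a sum of one score per cluster, so adding people within a cluster adds no new terms to that sum. A sketch (illustrative only; the scaling shown is the common small-sample correction, and the function name is invented):

```python
import numpy as np

def cluster_se(X, y, groups):
    """Cluster-robust OLS standard errors with the usual small-sample scaling."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        Xg, eg = X[groups == g], e[groups == g]
        s = Xg.T @ eg              # one aggregated score per cluster
        meat += np.outer(s, s)     # G terms total, however many people per cluster
    G = len(np.unique(groups))
    # common finite-sample correction: G/(G-1) * (n-1)/(n-k)
    V = XtX_inv @ meat @ XtX_inv * (G / (G - 1)) * ((n - 1) / (n - k))
    return np.sqrt(np.diag(V))
```

    With G clusters, the meat is an average of only G outer products, which is why consistency arguments run in the number of clusters rather than the number of observations.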

    Just my hunches about this stuff. I'd love to know whether these guesses are sensible or not. Even if they are not sensible, I hope the pointers to papers referenced above are helpful.

  4. Jake: I totally agree with you about conceptual confusion. Your hunches sound sensible to me.

    To defend against Freedman a bit: rather than claim to be patching up a wrong-model MLE, one can instead motivate clustered SEs directly, as a tool for inference about population parameters, not parameters in a potentially mis-specified parametric model. Then, as you're not assuming anything model-based, the issue of whether the model is right or wrong goes away.

    Now, in some situations the asymptotics may certainly need some model-based "help", as per Andrew's comment. But in many applied statistical settings they're okay, and there's a lot you can learn from a single, possibly multivariate, "beta". So they should be in everyone's toolbox, econometrician or not.
