Hendrik Juerges writes:

I am an applied econometrician. I am writing because I have been pondering a question for some time now, and I am curious whether you have any views on it.

One problem the practitioner of instrumental variables estimation faces is large standard errors even with very large samples. Part of the problem is of course that one estimates a ratio. Anyhow, more often than not, I and many other researchers I know end up with large point estimates and standard errors when trying IV on a problem. Sometimes some of us are lucky and get a statistically significant result. Those estimates that make it beyond the 2 standard error threshold are often ridiculously large (one famous example in my line of research being Lleras-Muney’s estimates of the 10% effect of one year of schooling on mortality). The standard defense here is that IV estimates the complier-specific causal effect (which is mathematically correct). But still, I find many of the IV results (including my own) simply incredible.

Now comes my question: Could it be that IV is particularly prone to “type M” errors? (I recently read your article on beauty, sex, and power). If yes, what can be done? Could Bayesian inference help?

My reply:

I’ve never actually done any instrumental variables analysis, Bayesian or otherwise. But I do recall that Imbens and Rubin discuss Bayesian solutions in one of their articles, and I think they made the point that the inclusion of a little bit of prior information can help a lot.

In any case, I agree that if standard errors are large, then you’ll be subject to Type M errors. That’s basically an ironclad rule of statistics.
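A minimal simulation sketch of why large standard errors bring Type M (magnitude) errors. It assumes the estimate is normally distributed around the true effect and that "significant" means more than 1.96 standard errors from zero. The function name and the numbers are made up for illustration and are not taken from any real study.

```python
import numpy as np

def type_m_summary(true_effect, std_error, n_sims=200_000, seed=0):
    """Return (power, exaggeration ratio) for a normally distributed estimate.

    The exaggeration ratio is the average absolute size of the statistically
    significant estimates divided by the absolute true effect.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical replications of the same study.
    estimates = rng.normal(true_effect, std_error, size=n_sims)
    # Keep only the replications that clear the 1.96 standard error threshold.
    significant = np.abs(estimates) > 1.96 * std_error
    power = significant.mean()
    exaggeration = np.abs(estimates[significant]).mean() / abs(true_effect)
    return power, exaggeration

# Noisy design: the standard error is twice the true effect.
power_noisy, type_m_noisy = type_m_summary(true_effect=1.0, std_error=2.0)
# Precise design: the standard error is a tenth of the true effect.
power_precise, type_m_precise = type_m_summary(true_effect=1.0, std_error=0.1)

print(f"noisy:   power={power_noisy:.2f}, exaggeration={type_m_noisy:.1f}")
print(f"precise: power={power_precise:.2f}, exaggeration={type_m_precise:.1f}")
```

In the noisy design every significant estimate is at least 3.92 in absolute value (1.96 times a standard error of 2), so the significant estimates overstate a true effect of 1 by a factor of nearly four or more, whatever the random draw.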

My own way of understanding IV is to think of the instrument as having a joint effect on the intermediate and final outcomes. Often this can be clear enough, and you don’t need to actually divide the coefficients.
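To make the "dividing the coefficients" concrete, here is a small sketch with simulated data. With a single instrument, the IV estimate equals the effect of the instrument on the final outcome (the reduced form) divided by its effect on the intermediate outcome (the first stage). All variable names and coefficient values below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
true_effect = 0.5  # assumed causal effect of d on y

z = rng.binomial(1, 0.5, size=n).astype(float)  # instrument
u = rng.normal(size=n)                          # unobserved confounder
d = 0.2 * z + u + rng.normal(size=n)            # intermediate outcome (treatment)
y = true_effect * d + u + rng.normal(size=n)    # final outcome

def slope(x, t):
    """Least-squares slope from regressing t on x (with an intercept)."""
    return np.cov(x, t)[0, 1] / np.var(x, ddof=1)

first_stage = slope(z, d)    # effect of the instrument on the treatment
reduced_form = slope(z, y)   # effect of the instrument on the outcome
iv_estimate = reduced_form / first_stage
ols_estimate = slope(d, y)   # biased upward here because u drives both d and y

print(f"first stage  = {first_stage:.3f}")
print(f"reduced form = {reduced_form:.3f}")
print(f"IV (ratio)   = {iv_estimate:.3f}")
print(f"OLS          = {ols_estimate:.3f}")
```

The two instrument effects can be read on their own. The ratio is where the trouble starts: when the first-stage coefficient is small relative to its noise, small changes in the denominator move the IV estimate a lot.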

And here are my more general thoughts on the difficulty of estimating ratios.

I'm sorry, but I don't get how the IV approach in particular leads one to estimate ratios. Your (Andrew's) last but one paragraph seems to suggest that this refers to the "total effect" estimates (indirect and direct effects), but in IV regressions in particular you are typically not interested in the effect of the instrument on the outcome. (Of course, if it is a valid instrument, there are no direct effects, so total effects are equal to the indirect effects.) For example, when you want to know about the effect of immigration into US cities on those cities' economic outcomes and use proximity to the Mexican border as an instrument for immigration, then you don't care about the indirect effect of border proximity on economic outcomes.

"Those estimates that make it beyond the 2 standard error threshold are often ridiculously large…"

One approach to inference in these types of problems is to condition on the fact that only estimates that exceed some threshold will be reported. I like the paper of Ghosh et al. on this subject (http://dx.doi.org/10.1016/j.ajhg.2008.03.002). The discussion is in the context of genome-wide association studies, but the math should be broadly applicable.
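A rough sketch of what conditioning on the threshold can look like in the simplest case. This is not the procedure from the paper, only a one-parameter illustration under stated assumptions: the z-statistic is normal with unit variance around a standardized true effect mu, and it is only observed when its absolute value exceeds a cutoff c. The function names and the grid search are hypothetical choices.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def conditional_log_lik(mu, z, c=1.96):
    """Log-likelihood of mu given z, conditional on |z| > c (up to a constant)."""
    log_density = -0.5 * (z - mu) ** 2
    # Probability that a N(mu, 1) draw lands beyond the cutoff on either side.
    prob_selected = _STD_NORMAL.cdf(mu - c) + _STD_NORMAL.cdf(-mu - c)
    return log_density - math.log(prob_selected)

def conditional_mle(z, c=1.96, grid_size=2001):
    """Grid-search maximizer of the conditional likelihood, in standard error units."""
    if abs(z) <= c:
        raise ValueError("z must exceed the cutoff for the conditioning to apply")
    # Selection can only pull the estimate toward zero, so search between 0 and |z|.
    grid = [abs(z) * i / (grid_size - 1) for i in range(grid_size)]
    best = max(grid, key=lambda mu: conditional_log_lik(mu, abs(z), c))
    return math.copysign(best, z)

barely_significant = conditional_mle(2.0)
clearly_significant = conditional_mle(5.0)
print(f"z = 2.0 -> conditional estimate {barely_significant:.2f}")
print(f"z = 5.0 -> conditional estimate {clearly_significant:.2f}")
```

An estimate that barely clears the cutoff is pulled strongly toward zero, while one far beyond it is left almost unchanged. Multiplying the result by the standard error returns it to the original scale.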