## How to think about instrumental variables when you get confused

“Instrumental variables” is an important technique in applied statistics and econometrics, but it can get confusing. See here for our summary (Chapter 10 in particular, though Chapter 9 would help too).

Now an example. Piero spoke in our seminar last Thursday on the effects of defamation laws on reporting of corruption in Mexico. In the basic analysis, he found that, in the states where defamation laws are more punitive, there is less reporting of corruption, which suggests a chilling effect of the laws. But there are the usual worries about correlation-is-not-causation, and so Piero did a more elaborate instrumental variables analysis using the severity of homicide penalties as an instrument.

We had a long discussion about this in the seminar. I originally felt that “severity of homicide penalties” was the wackiest instrument in the world, but Piero convinced me that it was reasonable as a proxy for some measure of general punitiveness of the justice system. I said that if it’s viewed as a proxy in this way, I’d prefer to use a measurement-error model, but I can see the basic idea.

Still, though, there was something bothering me. So I decided to go back to basics and use my trick for understanding instrumental variables. It goes like this:

### The trick: how to think about IVs without getting too confused

Suppose z is your instrument, T is your treatment, and y is your outcome. So the causal model is z -> T -> y. The trick is to think of (T,y) as a joint outcome and to think of the effect of z on each. For example, an increase of 1 in z is associated with an increase of 0.8 in T and an increase of 10 in y. The usual “instrumental variables” summary is to just say the estimated effect of T on y is 10/0.8=12.5, but I’d rather just keep it separate and report the effects on T and y separately.
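To make the arithmetic concrete, here is a minimal sketch of the "joint outcome" summary in Python. Everything in it is an assumption for illustration: the data are simulated (not Piero's), the helper names `slope` and `joint_outcome_summary` are made up, and the coefficients are chosen only so that the effect of z on T comes out near 0.8 and the effect of z on y near 10, as in the example above.

```python
import numpy as np

def slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def joint_outcome_summary(z, T, y):
    """Report the effect of z on T and on y separately, plus their ratio."""
    z_on_T = slope(z, T)  # often called the "first stage"
    z_on_y = slope(z, y)  # often called the "reduced form"
    return {"z_on_T": z_on_T, "z_on_y": z_on_y, "ratio": z_on_y / z_on_T}

# Simulated data (hypothetical): u is an unobserved confounder of T and y.
rng = np.random.default_rng(0)
n = 100_000
z = rng.normal(size=n)
u = rng.normal(size=n)
T = 0.8 * z + u + rng.normal(size=n)
y = 12.5 * T + 5.0 * u + rng.normal(size=n)

summary = joint_outcome_summary(z, T, y)
naive = slope(T, y)  # plain regression of y on T, pulled upward by u

print(summary)
print("naive slope of y on T:", naive)
```

The two slopes in `summary` are the things I'd report; the ratio is just one divided by the other, which in this simulated setup is the usual instrumental variables estimate.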

In Piero’s example, this translates into two statements: (a) States with higher penalties for murder had higher penalties for defamation, and (b) States with higher penalties for murder had less reporting of corruption.

Fine. But I don’t see how this adds anything at all to my understanding of the defamation/corruption relationship, beyond what I learned from his simpler finding: States with higher penalties for defamation had less reporting of corruption.

### In summary . . .

If there’s any problem with the simple correlation, I see the same problems with the more elaborate analysis, the pair of correlations that is given the label “instrumental variables analysis.” I’m not opposed to instrumental variables in general, but when I get stuck, I find it extremely helpful to go back and see what I’ve learned from separately thinking about the correlation of z with T and the correlation of z with y, since that’s ultimately what instrumental variables analysis is doing.

1. Ken says:

The model of higher penalties for defamation resulting in reduced reporting seems a reasonable model, and it would be surprising if the data didn't confirm this. I know that in Australia, where defamation is unlikely to be a criminal offence but can be an expensive civil case, newspapers take defamation very seriously, as defending a case costs a minimum of \$100,000 and penalties may be several million. The risk of criminal convictions would make them even more careful.

I think the relationship the author is trying to get at is that where newspapers are opinionated, the penalties may have been raised even higher. The only measurement of this is the relationship between defamation penalties and other penalties, for example homicide. If it is, it doesn't come through clearly in the paper.

2. bccheah says:

I'm always confused about IV as well so thanks for the post. Isn't there some requirement that z also be independent of the error term in the causal model of z ~ Y (in a regression context)? Also, I'm confused about the estimated size using the instrument – is the estimated size valid since different instruments will give different size estimates or am I confusing myself even more?

3. Loren says:

> In Piero's example, this translates into two statements: (a) States with higher penalties for murder had higher penalties for defamation, and (b) States with higher penalties for murder had less reporting of corruption.

If the policy question is based on the prediction problem: "What is the effect of defamation laws on reporting of corruption?", how can these statements help? Presumably the motivation for IV in this example is some evidence or conviction that the correlation between defamation laws and reporting is non-informative.

4. Hal Varian says:

You have to assume that the only way that z affects y is through the treatment, T. So the IV model is

$$T = az + e$$

$$y = bT + d$$

It follows that

$$E(y \mid z) = b\,E(T \mid z) + E(d \mid z)$$

Now if we

1) assume $E(d \mid z) = 0$

2) verify that $E(T \mid z) \neq 0$

we can solve for b by division. Of course, assumption 1 is untestable.

An extreme case is a purely randomized experiment, where e=0 and z is a coin flip.
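A rough sketch of what the division does when assumption 1 holds and when it fails, using simulated data. All of this is hypothetical: the function names `iv_ratio` and `simulate`, the parameter values a = 0.8 and b = 2, and the `direct_effect` knob (a direct path from z to y that bypasses T) are made up for illustration and are not from the paper under discussion.

```python
import numpy as np

def iv_ratio(z, T, y):
    """cov(z, y) / cov(z, T): 'solving for b by division'."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (T - T.mean())))

def simulate(direct_effect, n=200_000, a=0.8, b=2.0, seed=1):
    """Draw (z, T, y) from T = a z + e, y = b T + d.

    Here d = direct_effect * z + u + noise, so E(d|z) is constant in z
    only when direct_effect == 0. The shared term u also enters e, which
    confounds a plain regression of y on T.
    """
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n).astype(float)  # coin-flip instrument
    u = rng.normal(size=n)
    e = u + rng.normal(size=n)
    d = direct_effect * z + u + rng.normal(size=n)
    T = a * z + e
    y = b * T + d
    return z, T, y

# Assumption 1 holds: the ratio should land close to b.
valid = iv_ratio(*simulate(direct_effect=0.0))

# Assumption 1 fails: the ratio is off by roughly direct_effect / a.
invalid = iv_ratio(*simulate(direct_effect=0.4))

print("ratio when z affects y only through T:", valid)
print("ratio when z also affects y directly: ", invalid)
```

Nothing in the observed (z, T, y) tells you which of the two cases you are in, which is the sense in which assumption 1 is untestable.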

5. Andrew Gelman says:

Hal: Yes, of course. Jennifer and I discuss this in sections 10.5-10.6 of our book. The point of my note above is to explain my thinking in settings where you don't necessarily believe the assumptions. In those more speculative contexts, I find it clears my head to think of (T,y) as a joint outcome.