## Difference-in-difference estimators are a special case of lagged regression

Jens Hainmueller has an interesting entry here about estimating the causal effects of the 2004 Madrid bombing on the subsequent Spanish elections, by comparing regular votes to absentee votes that were cast before the bombing. Jens cites a paper by Jose Montalvo that uses difference-in-difference estimation; that is, a comparison of this form:

[(avg of treated units at time 2) – (avg of controls at time 2)] – [(avg of treated units at time 1) – (avg of controls at time 1)]
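As a minimal sketch of the comparison above, the snippet below computes the difference-in-difference from four group averages. The vote shares and the function name `diff_in_diff` are made up for illustration; they are not taken from the Montalvo paper.

```python
from statistics import mean

# Hypothetical vote shares (percent), invented purely for illustration.
treated_t1 = [40.0, 42.0, 38.0]   # treated units, time 1
treated_t2 = [46.0, 47.0, 45.0]   # treated units, time 2
control_t1 = [39.0, 41.0, 40.0]   # controls, time 1
control_t2 = [41.0, 42.0, 43.0]   # controls, time 2


def diff_in_diff(treated_t1, treated_t2, control_t1, control_t2):
    """(treated - control gap at time 2) minus (treated - control gap at time 1)."""
    gap_t2 = mean(treated_t2) - mean(control_t2)
    gap_t1 = mean(treated_t1) - mean(control_t1)
    return gap_t2 - gap_t1


did = diff_in_diff(treated_t1, treated_t2, control_t1, control_t2)
print(did)  # 4.0: the gap is 4 points at time 2 and 0 points at time 1
```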

I’m sure this is fine, but it’s just a special case of lagged regression where the lagged outcome is restricted to have a coefficient of 1. In educational research, this is sometimes called the analysis of “gain scores.” In any case, you’re generally limiting your statistical efficiency and range of applicability by using differences rather than the more general regression formulation.
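Here is a rough numerical sketch of that claim, using simulated data with arbitrary parameter values (none of this comes from the election example). Regressing the gain score on the treatment indicator reproduces the difference-in-difference of means exactly, which is the same as fixing the coefficient on the time-1 outcome at 1; the lagged regression instead estimates that coefficient from the data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
d = np.repeat([0.0, 1.0], n // 2)            # treatment indicator
y0 = rng.normal(50.0, 10.0, size=n)          # time-1 outcome (simulated)
# Assumed data-generating process, chosen only for illustration.
y1 = 5.0 + 0.6 * y0 + 3.0 * d + rng.normal(0.0, 5.0, size=n)

ones = np.ones(n)

# Gain-score regression: (y1 - y0) on an intercept and d.
beta, *_ = np.linalg.lstsq(np.column_stack([ones, d]), y1 - y0, rcond=None)

# Difference-in-difference computed directly from the four group means.
did_means = (y1[d == 1].mean() - y1[d == 0].mean()) - (
    y0[d == 1].mean() - y0[d == 0].mean()
)

# Lagged regression: y1 on an intercept, y0, and d, with a free coefficient on y0.
gamma, *_ = np.linalg.lstsq(np.column_stack([ones, y0, d]), y1, rcond=None)

print("gain-score coefficient on d:", beta[1])
print("difference-in-difference of means:", did_means)
print("lagged regression coefficient on y0:", gamma[1])
print("lagged regression coefficient on d:", gamma[2])
```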

I can see why people set up these difference models: if you have a model with error terms for individual units (in this case, precincts or whatever; I can’t actually get the link to the Montalvo paper), then differencing makes the error terms drop out, seemingly giving a cleaner estimator. But once you realize that it’s a special case of regression, and you start thinking of things like regression to the mean (not to mention varying treatment effects), you’re led to the more general lagged regression.

Not that lagged regression solves all problems. It’s just better than difference in differences.

P.S. Actually, I would expect there to be varying treatment effects in the Spanish election example.

1. Bruce McCullough says:

Andrew,

What chapter of your book covers this?

Regards,

Bruce

2. Andrew says:

Bruce,

It's on page 177. (You can look up "gain scores" in the index.)

3. Winston Lin says:

Andrew, there's a very good paper by Paul Allison comparing difference-in-differences and lagged regression ("Change scores as dependent variables in regression analysis," Sociological Methodology, 1990). The model that justifies DID isn't a special case of the lagged regression model. Lagged regression is appropriate if regression to the mean occurs to the same extent within and between groups. DID is appropriate if regression to the mean occurs within groups, but not between groups.

4. Jens says:

Andrew,

they are not the same. In fact, they are based on very different assumptions.

Think about it: Let Y_1i and Y_0i be the responses for unit i in the post- and pre-treatment periods, respectively. D_i is a binary treatment indicator. Also assume no covariates and linearity; then the DID approach yields:

Y_1i – Y_0i = beta_0 + beta_1 D_i + e_i

so you need mean independence (e_i orthogonal to D_i), while the lagged regression yields:

Y_1i = gamma_0 + gamma_1 Y_0i + gamma_2 D_i + e_i

See the difference? It's not true that the latter model is less demanding because it frees up gamma_1. On the contrary, the DID assumption implies that you may not want to condition on Y_0i; it could be correlated with e_i.
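One way to see the point is a small simulation, offered as a sketch under a specific assumed model rather than a general verdict. Each unit has a persistent effect `a` that is higher on average in the treated group, and Y_0i and Y_1i are both noisy readings of it. Under this assumed model, differencing removes `a`, while conditioning on the noisy Y_0i does not, so the lagged regression's coefficient on D_i is pushed away from the true effect. All parameter values are arbitrary, and a different data-generating process (for example, treatment assigned on the basis of Y_0i) could favor the lagged regression instead.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
tau = 2.0                                     # true treatment effect (assumed)
d = np.repeat([0.0, 1.0], n // 2)             # treatment indicator

# Persistent unit effect with a higher mean among treated units.
a = rng.normal(loc=d, scale=1.0)
y0 = a + rng.normal(0.0, 1.0, size=n)         # pre-treatment outcome
y1 = a + tau * d + rng.normal(0.0, 1.0, size=n)  # post-treatment outcome

# DID: difference in mean gains between treated and control units.
gain = y1 - y0
did = gain[d == 1].mean() - gain[d == 0].mean()

# Lagged regression: y1 on an intercept, y0, and d.
X = np.column_stack([np.ones(n), y0, d])
gamma, *_ = np.linalg.lstsq(X, y1, rcond=None)

print("true effect:", tau)
print("DID estimate:", did)
print("lagged regression coefficient on y0:", gamma[1])
print("lagged regression coefficient on d:", gamma[2])
```

In this setup the coefficient on Y_0i lands near 0.5 rather than 1, because Y_0i measures the unit effect with error, and the coefficient on D_i absorbs part of the baseline gap between groups.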

So it's not that one is better; they are very different. You have to decide, based on substantive knowledge, which assumptions better fit your application. Evidently, there will be little difference between the two estimates if gamma_1 is close to 1, or if the average outcomes in the treatment and control groups are similar in the first period.
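The size of the disagreement can be written down exactly for least-squares fits. Because OLS residuals average zero within each treatment group, the lagged regression's coefficient on D_i equals (treated minus control gap in Y_1) minus gamma_1 times (treated minus control gap in Y_0). Subtracting this from the DID estimate leaves (gamma_1 - 1) times the baseline gap. The sketch below checks that identity on simulated data with arbitrary parameters.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
d = np.repeat([0.0, 1.0], n // 2)             # treatment indicator
y0 = rng.normal(40.0 + 5.0 * d, 8.0)          # groups differ at baseline
y1 = 10.0 + 0.7 * y0 + 2.0 * d + rng.normal(0.0, 4.0, size=n)

# DID estimate from group means.
baseline_gap = y0[d == 1].mean() - y0[d == 0].mean()
followup_gap = y1[d == 1].mean() - y1[d == 0].mean()
did = followup_gap - baseline_gap

# Lagged regression: y1 on an intercept, y0, and d.
X = np.column_stack([np.ones(n), y0, d])
gamma, *_ = np.linalg.lstsq(X, y1, rcond=None)

lhs = did - gamma[2]                          # disagreement between the two estimates
rhs = (gamma[1] - 1.0) * baseline_gap         # (gamma_1 - 1) * baseline gap

print("DID minus lagged estimate:", lhs)
print("(gamma_1 - 1) * baseline gap:", rhs)
```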

Re. heterogeneous treatment effects: notice that in the DID setup the effect of the intervention might differ across individuals. The standard DID estimand then simply gives you the average effect of the intervention on the treatment group (the ATT).
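A quick simulated illustration of the ATT point, under assumptions chosen for this sketch: unit effects are higher among the treated, both groups share a common time trend, and the individual effect `tau_i` grows with the unit effect. With these (arbitrary) settings the DID estimate tracks the average effect among treated units, not the average over everyone.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
d = (rng.random(n) < 0.4).astype(float)       # about 40% of units treated

a = rng.normal(loc=2.0 * d, scale=1.0)        # unit effects, higher among treated
tau_i = rng.normal(loc=1.0 + 0.5 * a, scale=0.5)  # unit-specific treatment effects
trend = 0.7                                   # common time trend (parallel trends)

y0 = a + rng.normal(0.0, 1.0, size=n)
y1 = a + trend + tau_i * d + rng.normal(0.0, 1.0, size=n)

gain = y1 - y0
did = gain[d == 1].mean() - gain[d == 0].mean()

att = tau_i[d == 1].mean()                    # average effect among the treated
ate = tau_i.mean()                            # average effect over all units

print("DID estimate:", did)
print("ATT in the simulation:", att)
print("ATE in the simulation:", ate)
```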

Jens

5. Andrew says:

Thanks for the comments. I'll take a look more carefully and get back to you all. My tentative answer is that, although differences-in-differences might have some nice theoretical properties and be great in some scenarios, I suspect that in most cases where it's used, it's basically just a lagged regression with coefficient constrained to be 1, and inferior to a usual lagged regression. I could be wrong on this: most likely, the different approaches have different regimes of applicability.

6. Jim B. says:

Right, I think the comments saying they are not equivalent in the general sense are correct. I also had something prepared about economic data, but I think the commenters sum up what I had to say pretty well.