Jens Hainmueller has an interesting entry here about estimating the causal effects of the 2004 Madrid bombing on the subsequent Spanish elections, by comparing regular votes to absentee votes that were cast before the bombing. Jens cites a paper by Jose Montalvo that uses difference-in-difference estimation; that is, a comparison of this form:
[(avg of treated units at time 2) - (avg of controls at time 2)] - [(avg of treated units at time 1) - (avg of controls at time 1)]
I’m sure this is fine, but it’s just a special case of lagged regression (regressing the time 2 outcome on the treatment indicator and the time 1 outcome) where the coefficient on the time 1 outcome is restricted to equal 1. In educational research, this is sometimes called the analysis of “gain scores.” In any case, you’re generally limiting your statistical efficiency and range of applicability by using differences rather than the more general regression formulation.
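To spell out the special-case claim, here is a short derivation. The notation is my own assumption for illustration and is not taken from the Montalvo paper: y_{i1} and y_{i2} are the outcomes for unit i at times 1 and 2, z_i is the treatment indicator, and T and C label the treated and control groups.

```latex
\begin{align*}
\text{lagged regression:}\quad
  & y_{i2} = \alpha + \theta z_i + \beta y_{i1} + \epsilon_i \\
\text{restricting } \beta = 1:\quad
  & y_{i2} - y_{i1} = \alpha + \theta z_i + \epsilon_i \\
\text{least squares on the gains:}\quad
  & \hat{\theta} = (\bar{y}_{T2} - \bar{y}_{T1}) - (\bar{y}_{C2} - \bar{y}_{C1}) \\
  & \phantom{\hat{\theta}} = (\bar{y}_{T2} - \bar{y}_{C2}) - (\bar{y}_{T1} - \bar{y}_{C1})
\end{align*}
```

The last line is the comparison of averages written out above, so the difference estimate is what you get from the regression when you fix the lag coefficient at 1 instead of estimating it.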
I can see why people set up these difference models: if you have a model with error terms for individual units (in this case, precincts or whatever; I can’t actually get the link to the Montalvo paper), then differencing makes the unit-level error terms drop out, seemingly giving a cleaner estimator. But once you realize that it’s a special case of regression, and you start thinking of things like regression to the mean (not to mention varying treatment effects), you’re led to the more general lagged regression.
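Here is a minimal simulation sketch of the regression-to-the-mean point. Everything in it is an assumption made up for illustration: the sample size, a true lag coefficient of 0.5, a constant treatment effect of 2, and treatment assigned more often to units with low time 1 values. None of it comes from the Spanish election data or the Montalvo paper, and the function names are hypothetical.

```python
import numpy as np


def simulate(n=20000, b=0.5, theta=2.0, seed=0):
    """Simulate two time points under an assumed lagged-regression model."""
    rng = np.random.default_rng(seed)
    y1 = rng.normal(0.0, 1.0, n)  # time 1 outcome
    # Assumption: units with low time 1 outcomes are more likely to be treated.
    p_treat = 1.0 / (1.0 + np.exp(y1))
    z = (rng.uniform(size=n) < p_treat).astype(float)
    # Assumption: the true lag coefficient is b (not 1), with effect theta.
    y2 = 1.0 + b * y1 + theta * z + rng.normal(0.0, 1.0, n)
    return y1, y2, z


def diff_in_diff(y1, y2, z):
    """(treated - control at time 2) minus (treated - control at time 1)."""
    t = z == 1
    return (y2[t].mean() - y2[~t].mean()) - (y1[t].mean() - y1[~t].mean())


def lagged_regression(y1, y2, z):
    """Least-squares coefficient on z when regressing y2 on (1, z, y1)."""
    X = np.column_stack([np.ones_like(y1), z, y1])
    coef, *_ = np.linalg.lstsq(X, y2, rcond=None)
    return coef[1]


y1, y2, z = simulate()
print("true effect:        2.0")
print("diff in diff:      ", round(diff_in_diff(y1, y2, z), 3))
print("lagged regression: ", round(lagged_regression(y1, y2, z), 3))
```

Under this assumed setup, the treated units start low and drift back up toward the mean on their own, so the difference estimate attributes some of that drift to the treatment and overshoots, while the lagged regression lands close to the true effect. A different data-generating process could give a different picture.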
Not that lagged regression solves all problems. It’s just better than difference in differences.
P.S. Actually, I would expect there to be varying treatment effects in the Spanish election example.