
Doug Hibbs on the fundamentals in 2010

Hibbs, one of the original economy-and-elections guys, writes:

The number of House seats won by the president’s party at midterm elections is well explained by three pre-determined or exogenous variables: (1) the number of House seats won by the in-party at the previous on-year election, (2) the vote margin of the in-party’s candidate at the previous presidential election, and (3) the average growth rate of per capita real disposable personal income during the congressional term. Given the partisan division of House seats following the 2008 on-year election, President Obama’s margin of victory in 2008, and the weak growth of per capita real income during the first 6 quarters of the 111th Congress, the Democrats’ chances of holding on to a House majority by winning at least 218 seats at the 2010 midterm election will depend on real income growth in the 3rd quarter of 2010. The data available at this writing indicate that the Democrats will win 211 seats, a loss of 45 from the 2008 on-year result that will put them in the minority for the 112th Congress.

Hibbs clarifies:

Although this essay features some predictions about likely outcomes of the 2010 election for the US House of Representatives, the underlying statistical model is meant to be structural or causal and is not targeted on forecasting accuracy.

The model presented in this essay is designed to explain midterm House election outcomes in terms of systematic predetermined and exogenous factors rather than to deliver optimal predictions. For that reason the model does not include trend terms or polling measurements of the public’s political sentiments and voting intentions of the sort populating forecasting equations.

I defer to Hibbs entirely on the political economy, but I would like to make one small methodological point. Hibbs writes:

Most statistical models of aggregate House election outcomes focus exclusively on vote shares going to the major parties. . . . But aggregate votes are mainly of academic interest. What really matters politically is the partisan division of seats, and that is the object of attention here.

I think Hibbs is missing the point here. Even if your sole goal is to forecast seats, I think the most efficient way to do this is to forecast national vote trends, and then apply the national swing to each district, correcting for incumbency and uncontestedness where appropriate. See here for further discussion of this point. Or you could go even further and use the fundamentals (for example, local economic conditions and demographic trends) to modify your vote forecast at the regional and state levels.
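To make the votes-then-seats idea concrete, here is a minimal sketch of a uniform national swing applied district by district. Everything in it is an assumption for illustration: the `District` fields, the `project_dem_seats` function, the five made-up districts, and the 5-point incumbency advantage are hypothetical, not estimates from Hibbs or from any fitted model. Uncontested seats are simply left with the party that holds them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class District:
    name: str
    prev_dem_share: Optional[float]  # Democratic two-party share last time; None if uncontested
    holder: str                      # party currently holding the seat, "D" or "R"
    inc_prev: int                    # incumbency last time: +1 Dem, -1 Rep, 0 open seat
    inc_now: int                     # incumbency this time, same coding


def project_dem_seats(districts: List[District], national_swing: float,
                      incumbency_adv: float = 0.05) -> int:
    """Count projected Democratic seats under a uniform national swing.

    incumbency_adv is an assumed value, not an estimate.
    """
    seats = 0
    for d in districts:
        if d.prev_dem_share is None:
            # Uncontested last time: no vote share to swing, so assume the seat is held.
            dem_wins = d.holder == "D"
        else:
            # Shift last election's share by the national swing, then adjust
            # for any change in incumbency status since then.
            share = (d.prev_dem_share + national_swing
                     + incumbency_adv * (d.inc_now - d.inc_prev))
            dem_wins = share > 0.5
        seats += int(dem_wins)
    return seats


# Made-up districts, for illustration only.
example = [
    District("A", 0.56, "D", +1, +1),
    District("B", 0.53, "D", +1, +1),
    District("C", 0.52, "D", +1, 0),   # Democratic incumbent retiring
    District("D", 0.45, "R", -1, -1),
    District("E", None, "D", +1, +1),  # uncontested last time
]

if __name__ == "__main__":
    for swing in (0.0, -0.04):
        print(swing, project_dem_seats(example, swing))
```

A fuller version would replace the single national swing with a forecast distribution and add district-level noise, so that the output is a distribution of seat totals rather than one number.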

I mean, sure, it’s ok to forecast seats directly. It’s simple, clear, and less effort than forecasting votes and then doing the district-by-district work of transmuting vote swings to expected seat swings. But it’s nothing to be proud of–it’s certainly not better than modeling votes, then seats.

But I don’t want to end with that criticism, which is (as noted above) minor. The real point is the connection between the economy and the vote, and on that topic Hibbs has interesting things to say.


  1. William Ockham says:

    I still don't understand the theoretical basis for the assumption that the same structural factors that governed congressional elections in 1950 are still in force at exactly the same level today. There are so many things that have changed that literally must have had an impact on this stuff that the assumption that one formula will work in 1950 and in 2010 seems completely ludicrous to me. The composition of the electorate in 1950 was completely different, especially in specific congressional districts in the South. It's pretty clear from looking at Presidential elections that black voters' choices are completely immune to economic forces. The same seems to be true of Hispanic voters.

    In addition, Hibbs' assertion of "the electorate's propensity to seek balance in partisan dominance of the executive and legislative branches of government at the first opportunity following each on-year presidential election outcome" is oft asserted, but I don't see evidence for it. Hibbs' definition of it is different from the usual definition, but I don't think there is evidence for either, especially in the last 40 years.

    I am particularly suspicious of a formula that makes 1974 look like every other midterm. Republicans lost big that year because of the Watergate scandal, not due to some propensity to seek balance or the economic conditions or anything else. The evidence that it was Watergate is pretty overwhelming. Likewise, 2002 really was about 9/11. The models that factor in Presidential approval will at least capture effects like that.

  2. Matt says:

    As a statistician I see the argument you're making on the methodological point, but I also saw Nate Silver's argument that forecasting seat totals is, in his opinion, harder when you start from national swing and then try to adjust on a seat by seat basis, as compared to what he describes as his approach, which is more starting at the district by district level.

    As someone with no experience with applications in poli sci, much less election forecasting, I haven't dug deep enough to have a particularly informed opinion about which approach seems more likely to produce better predictions, but to me it looks quite plausible that either could be the better prediction model. Are you saying you're pretty sure your approach makes more accurate predictions, or that you think they could both make equally good predictions and you just prefer starting from national swing?

  3. Andrew Gelman says:


    I agree that these models are just starting points and can be improved in various ways.


    I think that what Nate does is reasonable, but his description of it can be confusing. He does indeed work with the national swing and then adjust it at the district level, which is what I am recommending. See my link above for further details.

  4. David Shor says:


    The argument is that they *can't*. n=15, and there are already 5 parameters. There isn't enough room for more complicated models.

    Of course, neither vote share nor economic performance is constant regionally. One can imagine vastly increasing the sample size by doing it at the state-by-state or even district level.

    To some extent, I don't think it would be as accurate locally – Note Ben Nelson heading toward a catastrophic loss despite Nebraska's relatively good economic performance. But on aggregate, it probably works pretty well. I wonder why nobody has done it.

  5. Andrew Gelman says:


    You can't do it using least squares. But you can still do it. There's always room for more complicated models as long as you don't tie one hand behind your back when fitting them.

    Also, people have forecast presidential elections using state-level economic variables. Rosenstone did it in his 1983 book, and Campbell did it in an article in 1992. People generally seem to put less effort into forecasting off-year elections.

  6. David Shor says:


    Sorry to create a back and forth here. But using more complex modeling just replaces over-fitting with an inability to have informative posteriors on the parameters.

    One could put informative priors on the parameters, but poli sci isn't pharmacokinetics. Models don't fit very well. And so I'd worry about robustness concerns when dealing with n=15. But the model fits pretty well, so maybe I'm just rambling.

    Also, thank you for the references. Those had slipped under my radar…

  7. Andrew Gelman says:


    No problem on the back-and-forth; I just hope we're not the only two people reading this thread!

    Excluding a variable from a regression model is equivalent to setting its coefficient to 0. If I'm trying to make a prediction, I can replace this 0 with something more reasonable, something that is not zero and also is consistent with the data and whatever conceptual understanding I have.

    You can also get larger n's by forecasting other elections, such as state legislature, governor, etc.

    Anyway, I'm not saying that you need to include other predictors. Hibbs's model seems pretty reasonable to me. But if you want to add information, it can be done.
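
The point in the last comment, that excluding a variable from a regression is the same as fixing its coefficient at 0, can be checked numerically. The sketch below is an illustration under stated assumptions, not anyone's actual forecasting model: the data are simulated, the `posterior_mean` helper is hypothetical, and it uses the standard posterior-mean formula for a linear regression with independent normal priors and known noise variance. A prior precision of 0 means a flat prior; a very large precision pins the coefficient at its prior mean.

```python
import numpy as np


def posterior_mean(X, y, prior_mean, prior_precision, noise_var=1.0):
    """Posterior mean of regression coefficients under independent normal priors.

    prior_precision[j] = 0 is a flat prior on coefficient j; a very large
    value effectively fixes coefficient j at prior_mean[j].
    """
    P = np.diag(prior_precision)
    A = X.T @ X / noise_var + P
    b = X.T @ y / noise_var + P @ prior_mean
    return np.linalg.solve(A, b)


# Simulated data with n = 15, purely for illustration.
rng = np.random.default_rng(0)
n = 15
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)
X = np.column_stack([x1, x2])

# Option 1: drop x2 and run least squares on x1 alone.
b_drop = np.linalg.lstsq(X[:, :1], y, rcond=None)[0]

# Option 2: keep x2 but pin its coefficient at 0 with a near-infinite precision.
b_pinned = posterior_mean(X, y, np.array([0.0, 0.0]), np.array([0.0, 1e8]))

# Option 3: replace the hard 0 with a softer guess (assumed prior: mean 0.3, sd 0.5).
b_soft = posterior_mean(X, y, np.array([0.0, 0.3]), np.array([0.0, 1 / 0.5**2]))

if __name__ == "__main__":
    print("drop x2:    ", b_drop)
    print("pinned at 0:", b_pinned)
    print("soft prior: ", b_soft)
```

Options 1 and 2 give the same coefficient on `x1`, which is the sense in which omission is a very strong prior. Option 3 lets the 15 data points and the prior share the work; how much it helps depends entirely on whether the assumed prior is sensible.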