Before Iowa, Hillary was beating Obama in NH by like 20 points, or at least double digits. After Iowa, Obama got this huge surge in the polls. You can see the time series here.
It’s a mystery why the polls were so wrong. Here’s my theory (which gets a bit long and technical but might be interesting to some, and I just feel like writing it down). I think it comes down to three parts:
1. The likely voter screen and its potential deficiencies
2. Problems in survey weighting, especially when Iowa turnout was so strange
3. Obama being black
First – Erikson, Panagopoulos, and Wlezien wrote a paper showing that the Gallup poll overstates fluctuations in voter preferences when it uses the likely voter screen early in the election (paper attached). In a nutshell, what happens is this: because Gallup (like most other pollsters) is interested in interviewing “likely voters” only, they ask a series of screening questions at the beginning of the poll to gauge the respondents’ interest in the election. They then have some formula to determine who is a “likely voter”, and they throw out the rest of the responses. The paper examined the discarded responses alongside the published results and found that, when something is going wrong for a candidate, their supporters are less enthusiastic and therefore less likely to be classified as “likely voters” during this screening process. As a result, many of the supporters of the “losing” candidate just aren’t counted in the poll, because pollsters think they’re not going to vote. This makes fluctuations in polling seem more dramatic than they actually are.
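To make the mechanism concrete, here is a toy two-candidate sketch. It is not the method from the paper; the function name `screened_share` and all of the pass rates are made up for illustration. Underlying preferences are held fixed at 50/50, and only the fraction of each candidate's supporters who pass the screen changes.

```python
def screened_share(true_share_a, pass_rate_a, pass_rate_b):
    """Candidate A's share among respondents who pass the likely voter screen.

    true_share_a: A's share among all respondents (two-candidate race)
    pass_rate_a, pass_rate_b: fraction of each candidate's supporters
        who are classified as likely voters (hypothetical values)
    """
    passed_a = true_share_a * pass_rate_a
    passed_b = (1.0 - true_share_a) * pass_rate_b
    return passed_a / (passed_a + passed_b)


# Hypothetical numbers: underlying preferences stay at 50/50 throughout.
# Only enthusiasm, and therefore the screen pass rate, changes.
before = screened_share(0.50, pass_rate_a=0.60, pass_rate_b=0.70)  # bad week for A
after = screened_share(0.50, pass_rate_a=0.70, pass_rate_b=0.60)   # good week for A

print(f"A among likely voters, before: {before:.1%}")
print(f"A among likely voters, after:  {after:.1%}")
print(f"Apparent swing with no change in preferences: {after - before:+.1%}")
```

With these invented rates, A goes from about 46% to about 54% among "likely voters" even though nobody changed their mind.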
In this case, Hillary was winning big in NH. When Barack won Iowa and everyone in the media started praising him relentlessly, he started getting a boost. Because of this likely voter screen thing, his boost in the polls was exaggerated, and because the elections were so close to one another, the polls didn’t have a chance to settle down into an equilibrium. If this is right, Obama was never actually leading, and all this talk about “something happened in the last 24 hours” is a load of BS.
Second – survey weighting. Whenever a pollster does a survey, they need to make the poll representative of the voting electorate (that’s why they do the likely voter screen, for example). Another big thing they do is essentially guess what the demographic makeup of the electorate is going to be. Usually this is based on historical data and census data, but it’s always really hard in primaries because turnout is not very consistent from one primary to the next. So, for example, they might assume that 10% of the electorate will be people 18-25, 25% will be 65+, and so on (I’m making these numbers up). The pollster will first try to get this breakdown among the people they actually talk to, and if they can’t, they’ll then “weight” the survey – meaning count certain people more than others – to simulate the expected breakdown.
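Here is a minimal sketch of that weighting step, using the simplest version of the idea (reweighting on a single variable, age). Real pollsters use more elaborate schemes. The 10% and 25% targets are the made-up numbers from above; the middle group, the sample counts, the support levels, and the function name `weighted_support` are all assumptions for illustration.

```python
def weighted_support(sample, targets):
    """Reweight each age group so its share matches the assumed electorate.

    sample:  {group: (n_respondents, share_supporting_candidate)}
    targets: {group: assumed share of the electorate}, summing to 1
    Returns (per-respondent weights by group, weighted support estimate).
    """
    total_n = sum(n for n, _ in sample.values())
    # Weight = target share / share actually achieved in the sample.
    weights = {g: targets[g] / (n / total_n) for g, (n, _) in sample.items()}
    estimate = sum(targets[g] * support for g, (_, support) in sample.items())
    return weights, estimate


# Hypothetical poll of 500 people that reached too few young respondents.
sample = {"18-25": (25, 0.60), "26-64": (325, 0.40), "65+": (150, 0.30)}
targets = {"18-25": 0.10, "26-64": 0.65, "65+": 0.25}

total_n = sum(n for n, _ in sample.values())
unweighted = sum(n * support for n, support in sample.values()) / total_n
weights, weighted = weighted_support(sample, targets)

print(f"unweighted support: {unweighted:.1%}")
print(f"weighted support:   {weighted:.1%}")
for group, w in weights.items():
    print(f"  each {group} respondent counts as {w:.2f} people")
```

In this example each young respondent counts double, which is exactly why the answer is so sensitive to the guess about how many young people will show up.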
In this case – I’m guessing here, but I think the pollsters saw how weird the electorate was in Iowa (i.e. SO many people turned out, and so many of them young) and tried to compensate by weighting young people a ton in the NH polls that followed. Now, we know that young people support Obama disproportionately. If the pollsters overcompensated for young people, then Obama’s support was artificially inflated in the polls. I’d have to look at the actual turnout numbers in more detail to check this out.
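As a rough sense of scale, the error from guessing the young share wrong is just (assumed share minus actual share) times (the candidate's support among the young minus his support among everyone else). The sketch below works that out; the function name `topline_bias` and every number in it are hypothetical, not actual NH figures.

```python
def topline_bias(assumed_young, actual_young, support_young, support_rest):
    """Error in a candidate's estimated share from a wrong assumed young share."""
    def topline(young_share):
        return young_share * support_young + (1.0 - young_share) * support_rest

    return topline(assumed_young) - topline(actual_young)


# Hypothetical: young voters are really 10% of the electorate, and the
# candidate gets 55% of them versus 35% of everyone else.
for assumed in (0.10, 0.15, 0.20, 0.25):
    bias = topline_bias(assumed, actual_young=0.10,
                        support_young=0.55, support_rest=0.35)
    print(f"assumed young share {assumed:.0%}: candidate overstated by {bias:+.1%}")
```

With these made-up inputs, assuming twice as many young voters as actually turn out overstates the candidate by about 2 points, and in a two-way race the error in the margin is double that.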
Third, Obama is black. Some people have a theory that people will lie in a poll and say they support the black candidate because they don’t want to seem racist, but then they actually vote for the white person. This one is going around in the media already, but I find it hard to believe, or at least I don’t think it’s the only reason for the problems. First, the idea in general seems kind of crazy, that people think it makes sense to lie in this way in large numbers – crazy that it would have such a large effect, anyway. Second, we’re talking about Democratic primary voters, NOT the general electorate. These people are the least likely to be racist. Third, it’s not like the alternative was some gun-toting white guy from the Klan, it was Hillary Clinton. If these people are racist, they’re probably not going to be running to her. Still, this might have had a small effect.
So anyway, that’s my theory. It should be noted that none of this is being written or talked about anywhere in the media, which is a shame in my opinion. And if I’m correct, this has huge implications for the election which are going to be ignored. Specifically, what I mean is this – these early primaries and caucuses are important not really because of the delegates, but mostly because they build momentum and a storyline for the media to talk about in advance of the future primaries. In this case, the media’s storyline goes something like this … “Obama won Iowa and had all the momentum. Hillary was on the ropes and losing by double digits. But her campaign rallied in the final 24 hours. She ‘found her voice’, showed some resiliency and this is a turning point for her.” Based on the data they’re looking at, this makes sense. Too bad it might be totally wrong.
In reality, I think Hillary was steadily losing ground as Obama was gaining momentum, and it truly is remarkable that Obama closed the gap by so much in the final weeks. If the media saw this, the storyline would be totally different, which has significant effects on the future primaries – donations, momentum, etc. And to me, this storyline makes a lot more sense. I’m sorry, but I just don’t believe that crying on TV in the middle of a speech is good for a presidential campaign. I looked at a bunch of the events on CSPAN in the last week, and I’m telling you, that looked like a campaign on the ropes. She was breaking down, Bill was going crazy, the audiences were NOT enthusiastic at all, and the media coverage was dismal. The theory that “something just happened” in the last 24 hours seems insane to me.
My only comment is that things are a lot less stable when there are several candidates in the race to choose from. Even if the main focus is on #1 and #2, there are a lot of these other options floating around that make the decision more complicated and the outcome less predictable.
P.P.S. More here.