How did white people vote? Updated maps and discussion

A while ago I posted some maps based on the Pew pre-election polls to estimate how Obama and McCain did among different income groups, for all voters and for non-Hispanic whites alone. The next day the blogger and political activist Kos posted some criticisms. I disagree with one of Kos’s suggestions (he wanted me to rely on exit polls, but I don’t actually see them as more reliable than the Pew pre-election polls), but he pointed out some serious problems with my maps. I realized that some fixes were in order. Most importantly:

– My maps would be improved by replacing solid red and blue with continuous shading to distinguish between landslides and narrow margins.

– I needed a more flexible model that would allow the nonlinear pattern of voting and income to vary by state. (In the previous model, I fit a nonlinear pattern, by including a separate logistic regression coefficient for each of the five income categories, but allowed the states to vary only with intercepts and slopes. In the new model, we’re letting all five coefficients vary by state.)

During the past couple of months, I’ve been working on this when I’ve had a spare hour or two, and now I think we have something reasonable to share. Here it is:


States colored deep red and deep blue indicate clear McCain and Obama wins; pink and light blue represent wins by narrower margins, with a continuous range of shades going to pure white for states estimated at exactly 50/50.

General comments

The maps are based on a model fit to four ethnic categories (non-Hispanic white, black, Hispanic, other), but I’m only displaying total and non-Hispanic whites. The others are interesting too but they’re based on a lot less data: they’re my (current) best estimates but are much more reliant on model extrapolation.

The estimates are entirely based on the Pew data—except that we use Census-based voter turnout estimates to reweight estimates in each state, and we shift each state’s estimates to be consistent with the actual election outcome in the state. (For example, if our estimate says that Obama got 48% of the total vote in a state (adding up voters from all income and ethnicity categories), and he actually got 46%, then we’d pull down our estimates for each category so that the estimated total is 46%.)
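The shift to the actual outcome can be sketched in a few lines of Python. This is only an illustration, not the R code used for the maps: the post does not say on what scale the estimates are shifted, so the sketch assumes a constant shift on the logit scale, and the four cell estimates and turnout weights are made-up numbers chosen so that the unadjusted total is 48%.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def shift_to_total(estimates, weights, target, tol=1e-12):
    """Shift every cell by the same amount on the logit scale (an assumption)
    so that the turnout-weighted average equals the target vote share."""
    total_w = sum(weights)

    def weighted_total(d):
        return sum(w * inv_logit(logit(p) + d)
                   for p, w in zip(estimates, weights)) / total_w

    # weighted_total is increasing in d, so bisection finds the shift.
    lo, hi = -10.0, 10.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if weighted_total(mid) < target:
            lo = mid
        else:
            hi = mid
    d = (lo + hi) / 2
    return [inv_logit(logit(p) + d) for p in estimates]

# Hypothetical Obama shares in four cells, with equal turnout weights.
cell_estimates = [0.30, 0.45, 0.55, 0.62]   # weighted total is 48%
cell_weights = [0.25, 0.25, 0.25, 0.25]
adjusted = shift_to_total(cell_estimates, cell_weights, target=0.46)
adjusted_total = sum(w * p for w, p in zip(cell_weights, adjusted))
print([round(p, 3) for p in adjusted], round(adjusted_total, 3))
```

A logit shift keeps every adjusted estimate strictly between 0 and 1 and preserves the ordering of the cells, which a flat subtraction of two percentage points would not guarantee near the boundaries.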

Some particular changes

I’ll talk about a couple of states where Kos pointed out issues with my original maps.

New Hampshire. John McCain won 45% of the two-party vote in New Hampshire, a state which is 93% non-Hispanic white, 1% black, 2% Hispanic, 2% Asian, and 2% other. Based on the Census survey, we estimate that non-Hispanic whites were 96% of New Hampshire’s voters in 2008. If whites represented 96% of the voters, and if McCain received 20% of the votes of the other 4%, then his share of the white vote would be 46%—thus, as Kos pointed out, it’s hard to believe that McCain won in four of the five income categories among whites in the state, as my original map had implied. The problem was in the way that I’d adjusted things to the national vote.
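The New Hampshire arithmetic is easy to check. In this small Python sketch the function name is my own, and the 20% McCain share among nonwhite voters is the hypothetical figure from the paragraph above, not an estimate.

```python
def implied_white_share(total_share, white_frac, nonwhite_share):
    """Solve total = white_frac * white + (1 - white_frac) * nonwhite for white."""
    return (total_share - (1 - white_frac) * nonwhite_share) / white_frac

# McCain: 45% of the two-party vote; whites assumed to be 96% of voters;
# hypothetical 20% McCain share among the remaining 4%.
nh_white_share = implied_white_share(0.45, 0.96, 0.20)
print(round(nh_white_share, 3))  # about 0.46
```

Even if McCain had received none of the nonwhite vote, his implied share among whites would be only 0.45/0.96, or about 47%, so he could not plausibly have won most white income categories.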

Michigan. As Kos points out, Michigan was closely divided among whites, and so there was something fishy about my original maps, which had Obama winning among whites in four of the five income categories. The new map does not have this problem.

Colorado. This state reveals some problems with the published exit poll data: according to CNN, McCain got 48% of the white vote in Colorado, but, when this was broken down by income, he got 45% of the vote of whites under $50,000 and 47% of the vote of whites over $50,000. This is a mathematical impossibility: using the exit poll numbers, McCain’s percentage of the total white vote should then be (.19*45% + .62*47%)/(.19+.62) = 46.5%, not 48%. I don’t know which of these, if either, is correct. I assume all of these numbers are from the corrected exit polls, adjusted to match up to the actual vote proportions in each state. Our estimate gives McCain 51% of the white vote in Colorado. I think this is possible too, and for that matter it’s consistent with the exit poll estimate of 48%, which has a standard error of at least sqrt(.48*(1-.48)/(.81*1254))=.016, so the exit poll number is within two standard errors of our estimate.
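Both Colorado calculations can be reproduced directly. The sketch below uses only the numbers quoted above (the exit poll shares, the sample size of 1254, and the 81% white share of respondents); the variable names are mine.

```python
import math

# (share of all respondents, McCain share) for whites in each income group
under_50k = (0.19, 0.45)
over_50k = (0.62, 0.47)

# Weighted average of the two income groups: the implied McCain share among whites.
white_frac = under_50k[0] + over_50k[0]
combined = (under_50k[0] * under_50k[1] + over_50k[0] * over_50k[1]) / white_frac

# Binomial standard error for the published 48%, based on about 0.81 * 1254 white
# respondents. This is a lower bound because it ignores any survey design effect.
se = math.sqrt(0.48 * (1 - 0.48) / (white_frac * 1254))

# Distance between our 51% estimate and the exit poll's 48%, in standard errors.
z = (0.51 - 0.48) / se

print(round(combined, 3), round(se, 4), round(z, 2))
```

The gap between the two estimates comes out a little under two standard errors, which is the sense in which they are consistent.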

Estimates and raw data

Here are graphs showing our estimates, along with the weighted average from the Pew surveys in each group (including only those respondents who expressed a preference for Obama or McCain and also said they were “absolutely certain” they had registered to vote):


You can see the partial pooling from the data to the model, with more pooling in small states such as Wyoming, Rhode Island, and Vermont, and less pooling in states such as California, Texas, and New York where sample sizes were larger. The graphs show estimated McCain vote share, so, unsurprisingly, the lines for whites are higher than the lines for all voters, with differences smaller in states such as Wyoming or Vermont where there are very few nonwhite voters.

Some technical details

Even after restricting to respondents who are certain they are registered, the pre-election polls don’t do a great job matching the population of voters. To correctly weight to voters (rather than to the general adult population), we used the 2008 Current Population Survey post-election supplement, which has information on voter turnout. We’ll write a technical article describing exactly what we did, but the short version is that the CPS numbers are generally considered to be much more reliable than exit polls or pre-election polls for estimating turnout rates among different groups within a state. What we actually did was to use a multilevel model to smooth the CPS numbers using the latest population totals from the American Community Survey.

Yair also came up with a cool color scheme. Instead of going from deep red to deep blue through purple, we divided up the color scheme as follows: for proportions between 0 and .5, we used different shades of blue (deep blue, getting progressively lighter, toward white), then going from .5 to 1, we used deeper and deeper reds, starting with white, through light pink, to red. (Don’t worry, I’ll post the R code.) This worked much, much better than the purple schemes I was playing with before. More visual resolution, and a key benefit is that it’s immediately clear which states are above and below the 50% threshold. Finally, I did a little trick of my own and used a square-root transformation (more specifically, if the estimated vote proportion for McCain is x, I defined z = 2*(x-.5), and then worked with sign(z)*sqrt(|z|)) to spread out the resolution near 0.5 and compress it near 0 and 1.
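Here is a rough Python sketch of the color mapping described above. It is not the R code used for the maps: the pure-red and pure-blue RGB endpoints and the function names are assumptions for illustration, and only the signed square-root transformation comes from the description.

```python
import math

def signed_sqrt(x):
    """Map a McCain vote proportion x in [0, 1] to [-1, 1], stretching values near 0.5."""
    z = 2 * (x - 0.5)
    return math.copysign(math.sqrt(abs(z)), z)

def vote_color(x):
    """Return an (R, G, B) tuple: blue below 50%, white at 50%, red above 50%."""
    t = signed_sqrt(x)
    fade = round(255 * (1 - abs(t)))   # 255 at a tie, 0 at a unanimous vote
    if t >= 0:
        return (255, fade, fade)       # white -> pink -> red (assumed endpoint)
    return (fade, fade, 255)           # white -> light blue -> blue (assumed endpoint)

for share in (0.30, 0.48, 0.50, 0.52, 0.70):
    print(share, round(signed_sqrt(share), 3), vote_color(share))
```

A 52% state lands at about 0.2 on the transformed scale rather than 0.04, so narrow wins are visibly tinted instead of looking white.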

One other thing. The Pew organization sent me their raw data and posted them on the web for anyone to use. The exit polls still refuse to report anything but summaries. I don’t see this refusal as a sign of confidence on their part. Please also read my earlier note for further discussion of the Pew and exit polls.

All this work is joint with Yair Ghitza.


  1. William Ockham says:

    The raw data for the 2006 national exit polls are available through ICPSR. The D/R split on family income is very similar to the 2008 numbers. Of course, the income categories (200k) are different than Pew's.

  2. William Ockham says:

    Oops. I used "less than" and "greater than" symbols in my description of income categories. Of course, those are html tag signifiers in this context. That should read:

    (less than 15k, 15-30k, 30-50k, 50-75k, 75-100k, 100k-150k, 150-200k, and greater than 200k)

  3. Robert Kern says:

    My colorblind eyes thank you so much for using a red-white-blue scheme instead of the awful red-purple-blue one that has become so popular. Let us hope you start a trend.

  4. BWB says:

    I like what you've done with these charts… but I'm wondering whether the Pew data would allow you to break down the subsets by religiosity and/or religion. The goal here would be to find a characteristic that is better defined than income is.

    My guess is that you will receive better nuance from tagging evangelicals wherever they may reside.

  5. Andrew Gelman says:

    Yes, Pew breaks things down by religion. I've posted a few graphs on religion and voting on this blog already, maybe will do some more at some point.

  6. Adam says:

    I would love to know how this compared to 2004 when Bush defeated Kerry?? Thoughts?

  7. Andrew Gelman says:

    Adam: Mostly, it's a uniform swing since 04. See here.

  8. William Ockham says:

    I'm still confused by your statement that the difference between 2004 and 2008 was mostly a uniform swing. In both the Pew data and the national exit polls, it's pretty clear that white voters born before 1962 voted in 2008 exactly the same way they did in 2004. That's nearly 50% of the electorate which didn't change at all and split roughly 57% R – 42% D. The entire swing between 04 and 08 occurred among minority voters and younger whites.

  9. Peter says:


    But a couple questions comments:
    1) Wouldn't it be better to show nonHispanic Whites and "all others" rather than "total"?

    2) There are some clear floor/ceiling effects, especially in some states and some income levels. How many nonWhites or Hispanics with income over $150K are there in (say) North Dakota or Montana?

    In Montana, only 0.227 out of every 100 people are Black, and the median income for all residents was $42K or so …

  10. AgainstTheRull says:

    "With the politicians its n*gg*r this and n*gg*r that, but its just trick the rich whites play on the poor whites; if they hate the negroes enough they don't notice they're still poor white trash."
    This outburst is my oldest political memory; it's from a family political fight between my leftwing redneck grandmother and her more traditional sibs. At issue was who should be President: Truman, Dewey, or the Dixiecrat. As I was 4 at the time, I don't know if the Strom was on the California ballot.

    I know for a fact that my grandmother had never read "The Mind of the South", or heard of W.J. Cash or the "proto-Dorian bond", but she knew a trick when she saw it.

    While the trick has lost its economic function for the rich, your graphs suggest that it may still ease the pain of poor Southern whites.

  11. Pete Backof says:

    Are you still willing to post the R code?