Nate Cohn at the New York Times arranged a comparative study on a recent Florida pre-election poll. He sent the raw data to four groups (Charles Franklin; Patrick Ruffini; Margie Omero, Robert Green, Adam Rosenblatt; and Sam Corbett-Davies, David Rothschild, and me) and asked each of us to analyze the data how we’d like to estimate the margin of support for Hillary Clinton vs. Donald Trump in the state. And then he compared this to the New York Times pollster’s estimate.
Here’s what everyone estimated:
Franklin: Clinton +3 percentage points
Ruffini: Clinton +1
Omero, Green, Rosenblatt: Clinton +4
Us: Trump +1
NYT, Siena College: Clinton +1
We did Mister P, and the big reason our estimate was different from everyone else’s was that one of the variables we adjusted for was party registration, and this particular sample of 867 respondents had more registered Democrats than you’d expect, compared to Florida voters from 2012 with an adjustment for anticipated changes in the electorate for the upcoming election.
In previous efforts we’d adjusted on stated party identification but in this case the survey was conducted based on registered voter lists, so we knew party registration of the respondents. It was simpler to adjust for registration than stated party ID because we know the poststratification distribution for registration (based on the earlier election), whereas if we wanted to poststratify on party ID we’d need to take the extra step of estimating from that distribution.
Anyway, the exercise was fun and instructive. As Cohn put it:
We Gave Four Good Pollsters the Same Raw Data. They Had Four Different Results. . . . How so? Because pollsters make a series of decisions when designing their survey, from determining likely voters to adjusting their respondents to match the demographics of the electorate. These decisions are hard. They usually take place behind the scenes, and they can make a huge difference.
And he also provided some demographics on the different adjusted estimates:
I like this. You can evaluate our estimates not just based on our headline numbers but also based on the distributions we matched to.
But the differences weren’t as large as they look
Just one thing. At first I was actually surprised the results varied by so much. 5 percentage points seems like a lot!
But, come to think of it, the variation wasn’t so much. The estimates had a range of 5 percentage points, but that corresponds to a sd of about 2 percentage points. And that’s the sd on the gap between Clinton and Trump, hence the sd for either candidate’s total is more like 1 percentage point. Put it that way, and it’s not so much variation at all.