“Stop the Polling Insanity”

Norman Ornstein and Alan Abramowitz warn against over-interpreting poll fluctuations:

In this highly charged election, it’s no surprise that the news media see every poll like an addict sees a new fix. That is especially true of polls that show large and unexpected changes. Those polls get intense coverage and analysis, adding to their presumed validity.

The problem is that the polls that make the news are also the ones most likely to be wrong.

Well put. Don’t chase the goddamn noise. We discussed this point on the sister blog the other day.

But this new op-ed by Ornstein and Abramowitz goes further by picking apart problems in recent outlier polls:

Take the Reuters/Ipsos survey. It showed huge shifts during a time when there were no major events. There is a robust scholarship, using sophisticated panel surveys, that demonstrates remarkable stability in voter preferences, especially in times of intense partisan preferences and tribal political identities. The chances that the shifts seen in these polls are real and not artifacts of sample design and polling flaws? Close to zero.

What about the neck-and-neck race described in the NBC/Survey Monkey poll? A deeper dig shows that 28 percent of Latinos in this survey support Mr. Trump. If the candidate were a conventional Republican like Mitt Romney or George W. Bush, that wouldn’t raise eyebrows. But most other surveys have shown Mr. Trump eking out 10 to 12 percent among Latino voters.

There’s only one place where I disagree with Ornstein and Abramowitz. They write:

Part of the problem stems from the polling process itself. Getting reliable samples of voters is increasingly expensive and difficult, particularly as Americans go all-cellular. Response rates have plummeted to 9 percent or less. . . . With low response rates and other issues, pollsters try to massage their data to reflect the population as a whole, weighting their samples by age, race and sex.

So far so good, but then they say:

But that makes polling far more of an art than a science, and some surveys build in distortions, having too many Democrats or Republicans, or too many or too few minorities. If polling these days is an art, there are a lot of mediocre or bad artists.

What’s my problem with that paragraph? First off, I don’t like the “more of an art than a science” framing. Science is an art! Second, and most relevant in this context, the adjustment of the sample to the population is a scientific process! Suppose a chemist is calculating energy release in an experiment and has to subtract off the energy emitted by a heat source. That’s science—it’s taking raw data and adjusting it to estimate what you want to learn. And that’s what we do when we do survey adjustment (for example, here). Yes, you can do this adjustment badly or with bias, just as you can introduce sloppy or biased adjustments in a chemistry experiment. But it’s still science.
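
To make that concrete, here is a minimal sketch of one such adjustment, poststratification, as a little Python script. Everything in it is made up for illustration: the respondents, the cells, and the population shares. A real adjustment would use many more cells and a regression model to stabilize the small ones.

```python
# Poststratification sketch (all numbers made up): estimate support
# within demographic cells, then weight the cells by known population shares.

respondents = [  # (cell, supports the candidate?)
    ("young", 1), ("young", 0), ("young", 1),
    ("old", 0), ("old", 0), ("old", 0), ("old", 1), ("old", 0),
]
population_share = {"young": 0.45, "old": 0.55}  # e.g., from the census

# Unadjusted estimate: a plain average over whoever happened to respond.
raw = sum(y for _, y in respondents) / len(respondents)

# Adjusted estimate: average within each cell, then weight by population share.
cells = {}
for cell, y in respondents:
    cells.setdefault(cell, []).append(y)
adjusted = sum(population_share[c] * sum(ys) / len(ys) for c, ys in cells.items())

print(f"raw: {raw:.2f}   poststratified: {adjusted:.2f}")
```

Every step is an explicit, checkable calculation. You can feed it the wrong cells or the wrong population shares and get a bad answer, but that is a scientific failure, not an artistic one.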

Anyway, I agree with the main points in their op-ed.

P.S. For more on polling biases in the 2016 campaign, see this thoughtful news article by Nate Cohn, “Is Traditional Polling Underselling Donald Trump’s True Strength?”

P.P.S. I would’ve posted this all on the sister blog where it’s a natural fit, but I couldn’t muster the energy to add paragraphs of background material. One pleasant thing about blogging here is that I can take it as your responsibility to figure out what I’ve written, not my responsibility to make it accessible to you. Indeed, I suspect that part of the fun of reading this blog for many people is that I don’t write down to you. I write at my own level and give you the chance to join in.

I’m planning to write a few books, though, so I’ll have to shift gears at some point. Damn. I’ve become so comfortable with this style. Good for me to get out of my comfort zone, I know. But still.

35 thoughts on ““Stop the Polling Insanity””

  1. I think the complaints about the characterization of potential polling distortions may be a bit overstated. Yes, science *is* art–I agree completely. But, just because sample adjustment is a scientific process doesn’t mean that it’s going to give the correct answers. If political attitudes are associated with differential response rates, no amount of survey weighting is going to unbias the results completely, not to mention the “forking paths.” Many people (non-scientists) seem to believe that having numbers and following a “scientific” process is a guarantee of true conclusions, which certainly isn’t the case! I think the spirit of the paragraph is to warn non-technical readers that “assumptions may be violated here” without overwhelming them with technical details. Could the phrasing have been better? Probably, but in my view, it’s more important that people be disabused of the fiction that statistics provides a means of divining truth independent of the statistician than to protect some scientific image of polling.

    • If there is no confound between attitudes and response rates, then adjustments will result in minimal to no adjustments, particularly when analyzing the results of a poll using voting day as the outcome. What is required is critically thinking through the potential sources of bias at any given point in time and adjusting in such a way that brings the estimates closer to reality rather than further away.

    • I like to think of the symptom you describe as “mathiness”.

      Once you couch your theory in mathematics, people seem to regard the conclusion as hard “truth”, ignoring the fact that the maths may still be correct but the assumptions / priors / initial-conditions / measurements might be terrible.
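
      Here is a toy simulation of exactly that failure mode (all numbers invented): the weighting arithmetic is carried out correctly, but the hidden assumption that response depends only on age is wrong, so the weighted estimate ends up well off the truth.

      ```python
      import random

      random.seed(0)

      # Toy population (every number invented): support differs by age, but
      # nonresponse also depends on the opinion itself, not just on age.
      N = 100_000
      population = []
      for _ in range(N):
          age = "young" if random.random() < 0.45 else "old"
          p_support = 0.60 if age == "young" else 0.40
          population.append((age, 1 if random.random() < p_support else 0))

      truth = sum(s for _, s in population) / N

      # Supporters pick up the phone half as often as non-supporters.
      sample = [(a, s) for a, s in population
                if random.random() < (0.05 if s == 1 else 0.10)]

      # Weight the respondents back to the known age distribution.
      pop_share = {"young": 0.45, "old": 0.55}
      by_age = {}
      for a, s in sample:
          by_age.setdefault(a, []).append(s)
      weighted = sum(pop_share[a] * sum(v) / len(v) for a, v in by_age.items())

      print(f"truth: {truth:.3f}   age-weighted poll: {weighted:.3f}")
      # The arithmetic is fine, but the estimate stays well below the truth:
      # weighting on age cannot fix nonresponse driven by the opinion itself.
      ```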

  2. If there is no confound between attitudes and response rates, then adjustments will result in minimal to no change in parameter estimates, particularly when analyzing the results of a poll using voting day as the outcome. What is required is critically thinking through the potential sources of bias at any given point in time and adjusting in such a way that brings the estimates closer to reality rather than further away.

  3. Edit: “adjustments will result in minimal to no change in parameter estimates, particularly when analyzing the results of a poll using voting day as the outcome.”

  4. “Science is an art” but art is not science. Whenever samples are “adjusted” to approximate populations, there needs to be some adjudication that this population is not an “artistic” creation. This is always a temptation, especially when one has a point to prove. Better, I think, to sample randomly.

    • If responses were random, there would be no need to adjust. I think you understand neither the science nor the art of response distribution adjustment.

      • No need to be nasty. If attitudes are associated with demographics, for example, artistic reweighting of samples can give results to support any position.

        • I apologize if my response seemed nasty; it is simply a battle I have with some people I work with who do not seem to understand the distinction between empirical reweighting and subjective reweighting. So perhaps I was engaged in a bit of projection.

          I am attempting to point out that reweighting poll data is itself done empirically, with variables for which there is evidence of response bias with possible causal implications. That aspect of it is not artistic so much as a result of thinking critically about the potential biases that can enter into the process between the random sample (or stratified random sample, etc.) and the response sample that will be analyzed.

          1. It is not possible to empirically reweight something into anything you want.

          2. It IS on the other hand possible to use subjective weights to distort a parameter intentionally. Perhaps that is what you are talking about.

          If you are referring to (2) then we are in agreement, but I would disagree strongly if you are arguing that a simple random sample will produce a more robust estimate than a properly empirically reweighted sample.
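
          To make “empirical reweighting” concrete, here is a minimal sketch of raking (iterative proportional fitting), with made-up respondents and made-up population margins; real implementations add weight trimming, diagnostics, and many more variables.

          ```python
          # Raking sketch: adjust respondent weights so the weighted sample
          # matches known population margins. All data below are invented.
          respondents = [
              {"sex": "f", "age": "young"}, {"sex": "f", "age": "old"},
              {"sex": "m", "age": "old"},   {"sex": "m", "age": "old"},
              {"sex": "f", "age": "old"},
          ]
          targets = {"sex": {"f": 0.52, "m": 0.48},
                     "age": {"young": 0.40, "old": 0.60}}

          weights = [1.0] * len(respondents)
          for _ in range(50):                      # iterate until margins match
              for var, margin in targets.items():
                  totals = {}
                  for w, r in zip(weights, respondents):
                      totals[r[var]] = totals.get(r[var], 0.0) + w
                  total = sum(weights)
                  for i, r in enumerate(respondents):
                      weights[i] *= margin[r[var]] * total / totals[r[var]]

          print([round(w, 2) for w in weights])  # check for extreme weights
          ```

          The weights come from the data and the known margins, not from anyone’s preferences about the answer.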

        • If you are referring to (2) then we are in agreement, but I would disagree strongly if you are arguing that a simple random sample with response bias will produce a more robust estimate than a properly empirically reweighted sample adjusted for response bias.

        • The effects of polling adjustment strategies can in fact be compared empirically by modeling them against voting results.

        • The problem is that the polling adjustment models themselves are totally opaque to most consumers of the data.

          Even a professional journalist can hardly evaluate the goodness of one adjustment strategy over another.

        • If you prefer the term ‘art’ to ‘critical thinking’, fine. But what underlies the skill in this ‘art form’ is ‘critical thinking’.

  5. “Indeed, I suspect that part of the fun of reading this blog for many people is that I don’t write down to you.”

    Yeah — the transition from blogging-as-conversation to Explainers has had heavy stylistic costs. (Also I do not mind the “more of an art than a science” phrase. I interpret it as “requires multiple judgment calls” or “is non-algorithmic” and find it perfectly natural to use e.g. when talking about ad hoc solution methods for physics problems.)

  6. “Adjusting” opinion polling data after its collection relies upon correct prior knowledge/assumptions about key aspects of the population under study.
    Yes, polling data adjustments generally are attempted (artistic?) fixes for non-random sampling.
    For example, pollsters typically use Census data to weight poll data for assumed under- or over-representation of sub-populations (gender, ethnicity, age, etc.). This supposes official Census data is correct and assumes significant uniformity in the opinions of interest within the sub-populations.
    How does one scientifically apply/weight Census data to an initially unknown opinion distribution in a population?

    • Velaren:

      Sure, but the point is that some adjustment has to be done in any case. “Not adjusting” is not an option. In answer to your question, “How does one scientifically apply/weight Census data to an initially unknown opinion distribution in a population?”: There are different ways to do this. One approach is to model the unknown quantities and then check your model in various ways. This is a scientific procedure and, like most scientific procedures, not guaranteed to give the correct answer.
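
      As a minimal sketch of that procedure, with invented numbers and a crude shrinkage step standing in for a real multilevel model (for example, MRP):

      ```python
      # Minimal "model the unknown quantities, then check the model" sketch.
      # All numbers are invented; the shrinkage is a stand-in for a real model.

      poll = {  # cell -> (number of respondents, number of supporters)
          "young_f": (40, 26), "young_m": (35, 20),
          "old_f": (120, 50), "old_m": (105, 42),
      }
      census_share = {"young_f": 0.22, "young_m": 0.21, "old_f": 0.30, "old_m": 0.27}

      n_total = sum(n for n, _ in poll.values())
      overall = sum(y for _, y in poll.values()) / n_total

      # Partial pooling: small, noisy cells get pulled toward the overall rate.
      prior_strength = 30.0
      cell_rate = {c: (y + prior_strength * overall) / (n + prior_strength)
                   for c, (n, y) in poll.items()}

      # Poststratify: weight each modeled cell rate by its census share.
      estimate = sum(census_share[c] * cell_rate[c] for c in poll)
      print(f"poststratified estimate of support: {estimate:.3f}")

      # One simple check: compare the sample's composition to the census margins.
      for c, (n, _) in poll.items():
          print(c, "sample share:", round(n / n_total, 2), "census share:", census_share[c])
      ```

      The checks are part of the science: if the sample composition or the cell estimates look implausible, you go back and revise the model.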

  7. Andrew wrote: I can take it as your responsibility to figure out what I’ve written, not my responsibility to make it accessible to you.

    Well, maybe. But I find this blog informative and accessible; you may not regard it as your “responsibility,” but you usually make it accessible.

    Bob

  8. This is all, of course, good and true (and sort of obvious), but probably not enough. We are talking science, aren’t we? “Close to zero” is not good enough. The question is: how close? If you come up with a number, then we can talk about some reasonable view of individual polls. Take as an example a recent mea culpa postmortem by Nate Silver about his Trump coverage. He comes to the conclusion that somewhere around last October he should have adjusted his expectations about the possibility of a Trump win, and that he missed this because of the lack of a formal model.

    What sort of model is required here? Suppose that at this stage a month’s worth of national polling (maybe time-discounted) is needed to establish a “trend” in the race (I put scare quotes here because it is possible that there is not much of a trend in the race overall, only short-term and long-term fluctuations). Now we can simply look at how much this one poll shifts our model. Maybe it is 0.1% toward Trump (from Clinton +3.0% to Clinton +2.9%), whatever that means (almost nothing, but every little bit means almost nothing). Also interesting would be to look at the uncertainty. After all, Clinton +3% does not mean much at this stage, but “does not mean much” can be 3+/-10 or 3+/-15 or 3+/-20, which sort of makes a difference. Ask Nate Silver.
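
    A back-of-the-envelope version of that calculation, with invented numbers, treating the existing polling average as a prior and the new poll as one noisy measurement:

    ```python
    import math

    # Precision-weighted update (invented numbers): how much should one
    # new national poll move the running polling average?
    prior_mean, prior_sd = 3.0, 1.0   # polling average: Clinton +3, in points
    poll_mean, poll_sd = 0.0, 5.0     # a noisy outlier showing a tied race

    w_prior, w_poll = 1 / prior_sd**2, 1 / poll_sd**2
    post_mean = (w_prior * prior_mean + w_poll * poll_mean) / (w_prior + w_poll)
    post_sd = math.sqrt(1 / (w_prior + w_poll))

    print(f"before the poll: {prior_mean:+.1f} +/- {prior_sd:.1f}")
    print(f"after the poll:  {post_mean:+.1f} +/- {post_sd:.1f}")
    # Here the outlier moves the average by roughly a tenth of a point;
    # the wider the prior uncertainty, the more one poll can drag it around.
    ```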

  9. I put much more credibility in the prediction markets than in polling, e.g., the Iowa Electronic Markets or PredictIt, if for no other reason than that they are probabilistic. It’s one index of just how uncertain the November election will be that these two resources vary almost as much as the polls. The IEM has Clinton in the lead with a 60% chance of winning (for WTA) while PredictIt gives her a 57% chance of winning.

  10. “…can in fact be compared empirically by modeling them against voting results”

    Empiricism (direct observation/experimentation) is the opposite of abstract modeling; comparison done by modeling is not empiricism.

    It is difficult, if not impossible, to verify the accuracy of opinion polling adjustments by comparison with actual voting results.
    One cannot isolate or measure the polling adjustment variables apart from all the other variables and error sources in the overall opinion polling process. Also, officially reported voting results themselves usually have an appreciable margin of error.

    It would seem that all “adjustments” in mainstream national opinion polls are ultimately arbitrary and empirically non-verifiable.

    • Tlg:

      No survey of humans is truly a random sample. Nonresponse rates in U.S. polls exceed 90%. All surveys need to be adjusted. The results of a survey depend on choices made in design and analysis; as such, yes, these can be considered arbitrary. Nonetheless, the adjustment can be done according to scientific principles; we can do the best we can. The phrase “ultimately arbitrary and empirically non-verifiable” applies to just about everything we do. But there are degrees of arbitrariness and degrees of verifiability.

      Your attitude that empiricism is “the opposite of abstract modeling; comparison done by modeling is not empiricism” is naive. For some simple principles you can use something like pure empiricism, for others you can use something like pure abstract modeling, but for the interesting problems we work on, we use both.

  11. How about we just stop over-interpreting polls?

    didn’t some political scientists point out that poll results mos. in advance of an election are not particularly reliable? B/c people don’t pay attention until close to the election? Can’t remember who; also something to do w/ x-boxes?

    But anyway, people reason from “election polls,” where we know there is a pretty good fit between what the survey item is measuring & what we think it is measuring, to the reliability/validity of all manner of public opinion polls that in fact aren’t measuring *anything* in particular (I’d put in another hyperlink here, to Pew survey data on GM foods risks, http://www.culturalcognition.net/blog/2015/1/31/weekend-update-pews-disappointing-use-of-invalid-survey-meth.html, but I’ve learned that the Hal9000 series computer that evaluates comments treats every comment w/ more than 1 link — at least by me! — as spam!)

    Pollsters & bad social science researchers think that responses to survey items are sampling “opinions” or “views” on something. Good opinion researchers think that survey items are either valid or invalid indicators of something we can’t see in people; they worry about how to validate that what’s being measured is what they think it is & they figure out means for assessing the measurement precision of the survey items in relation to that unobserved thing.

    A few relevant references, ones I’d hyperlink if Hal9000 weren’t so damn sensitive (maybe my spelling is the problem?):

    Bishop, G.F. The Illusion of Public Opinion: Fact and Artifact in American Public Opinion Polls (Rowman & Littlefield, Lanham, MD, 2005).

    Berinsky, A.J. & Druckman, J.N. The Polls—Review: Public Opinion Research and Support for the Iraq War. Public Opin Quart 71, 126-141 (2007).

    Shuman, H. Interpreting the Poll Results Better. Public Perspective 1, 87-88 (1998).

    Damn, I better get back to grading exams…
