The National Election Study is hugely important in political science, but, as with just about all surveys, it has problems of coverage and nonresponse. Hence, some adjustment is needed to generalize from sample to population.
Matthew DeBell and Jon Krosnick wrote this report summarizing some of the choices that have to be made when considering adjustments for future editions of the survey. The report was put together in consultation with several statisticians and political scientists: Doug Rivers, Martin Frankel, Colm O’Muircheartaigh, Charles Franklin, and me. Survey weighting isn’t easy, and this sort of report is just about impossible to write: you can’t help leaving things out. They did a good job, though, and it’s great to have this stuff put down in an official way, so that people can work off of it going forward.
It’s a lot harder to write a procedure for general use than to do a single analysis oneself.
I have a few corrections to add to the report that unfortunately didn’t make it into the final version (no doubt because of space limitations):
p.4, item 1: In practice this statement is ok, but in general it is not correct. What is ultimately relevant is not the probability of selection but rather the number of people from such households in the sample, compared to the number in the population. “Probability of selection” is wrong for (at least) two reasons: (1) what is more important is probability of inclusion (which includes nonresponse), not merely probability of selection; (2) strictly speaking, it is not probability that matters but the actual number in the sample. I discuss this last point in the final paragraph of this brief article. And I discuss it in more detail here.
p.5, item 2: weighting by #adults in household is not a good idea. It would be better to poststratify on household size (this variable is in the Census) or as an approximation to weight by sqrt (# adults in household); see this article from Public Opinion Quarterly. In particular, see table 2 in that article.
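To make the two alternatives concrete, here is a minimal sketch in Python. The respondent data and the population shares are made up for illustration (they stand in for Census figures on the share of adults living in households of each size), and the function names are hypothetical, not anything from the report.

```python
from collections import Counter
from math import sqrt

# Hypothetical respondents: number of adults in each respondent's household.
sample_adults = [1, 1, 1, 2, 2, 2, 2, 3, 3, 4]

# Made-up population shares of adults by household size (an assumption
# standing in for Census figures; not real data).
population_share = {1: 0.20, 2: 0.50, 3: 0.20, 4: 0.10}


def poststrat_weights(sample, pop_share):
    """Weight each respondent by (population share) / (sample share) of their cell."""
    n = len(sample)
    counts = Counter(sample)
    return [pop_share[k] / (counts[k] / n) for k in sample]


def sqrt_weights(sample):
    """Approximation: weight proportional to sqrt(# adults), normalized to mean 1."""
    raw = [sqrt(k) for k in sample]
    mean_raw = sum(raw) / len(raw)
    return [r / mean_raw for r in raw]


ps = poststrat_weights(sample_adults, population_share)
sq = sqrt_weights(sample_adults)

for k, w_ps, w_sq in zip(sample_adults, ps, sq):
    print(f"adults={k}  poststrat={w_ps:.3f}  sqrt={w_sq:.3f}")
```

The poststratification weights depend on how many people of each household size actually ended up in the sample, which is the point made under item 1 above; the square-root weights depend only on household size and so are a cruder approximation.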
p.5, item 3: It says that location is the only thing known about all cases that did not complete an interview. You should also know sex (people can estimate this with close to 100% accuracy over the phone, also pretty well in face-to-face interviews, I would think).
p.6-7, items 5-6: There is an element of judgment here. I can imagine that this 2-5% rule has worked in the past, but it can’t really work as a general principle.
Different weighting procedures–really, I’d say different analysis procedures–can also be warranted by differences in the questions being asked.
p.10, item 14: This is a great idea (“provide separate variables to indicate all levels of stratification and clustering, such as stratum, PSU/cluster, and area segment”). As the report points out, this information can be given without violating any confidentiality “by assigning each unit an arbitrary number without specifying exactly which geographic location is indicated by each number.”
p.5, item 24: Perhaps also add an item 25 that a design effect be calculated and reported based on the estimate of the vote intention variable? This could be a useful starting point for any simple analyses that are done. Although once people start running regressions, such crude design effects won’t be so helpful.
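As a rough illustration of what such a crude design effect might look like, here is a sketch using Kish's approximation, deff = n * sum(w^2) / (sum(w))^2. This is one common shortcut that I am swapping in for illustration, not necessarily what the report or NES would use: it captures only the variance inflation from unequal weights and ignores clustering and stratification. The weights and the vote-intention responses below are invented.

```python
from math import sqrt


def kish_design_effect(weights):
    """Kish's approximate design effect from unequal weighting alone."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def weighted_proportion(y, weights):
    """Weighted estimate of a proportion for a 0/1 variable."""
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


# Invented survey weights and 0/1 vote-intention responses (not real data).
weights = [0.5, 0.5, 1.0, 1.0, 1.0, 1.0, 1.5, 1.5, 2.0, 2.0]
intends_to_vote = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

n = len(weights)
deff = kish_design_effect(weights)
p_hat = weighted_proportion(intends_to_vote, weights)

# Effective sample size and a standard error inflated by the design effect.
n_eff = n / deff
se = sqrt(deff * p_hat * (1 - p_hat) / n)

print(f"deff={deff:.3f}  n_eff={n_eff:.2f}  p_hat={p_hat:.3f}  se={se:.3f}")
```

With equal weights the formula returns exactly 1, and it grows as the weights become more variable. A design effect estimated directly for the vote intention variable from the full design (strata, PSUs, weights) would generally differ from this shortcut.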
p.11: Wow, I’ve never seen the word “attrited” before!
p. 11, 14: I like the idea of the NES reporting design effects for selected statistics of interest. This should help a lot.
The big issue
Finally, the only flat-out error in the report appears right near the end, where the report says, “unweighted percentages and regression coefficients do not constitute legitimate estimates of population parameters.” I’m with them on percentages but not necessarily on regression coefficients. In a regression analysis, if you control for the variables used in the weighting, then you should be running an unweighted regression. And then poststratification can be used to sum back to the population totals of interest. With that in mind, I think it would be good for NES to provide a supplementary file with all the poststratification information used to construct its weights. I recommend that this be done in coordination with Jeff Lax and Justin Phillips, because they are supplying this sort of poststratification file for their Mister P analyses (published in AJPS and APSR) for state-level opinions.
I don’t mind the report’s focus on weighting, but in the recommendations section I think we should be clear that in a regression context it is not necessarily appropriate to use weights.
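Here is a minimal sketch of the regress-then-poststratify idea, under strong simplifying assumptions. The "regression" is the simplest possible one: an unweighted fit of the outcome on indicators for a single weighting variable, whose fitted values are just the cell means. The cells, responses, and population counts are all invented; a real analysis (such as the multilevel models used in Mister P) would replace the cell-means step with a richer model and use the actual poststratification table.

```python
from collections import defaultdict

# Invented respondents: (cell of the weighting variable, 0/1 outcome).
respondents = [
    ("young", 1), ("young", 0), ("young", 1), ("young", 1),
    ("old", 0), ("old", 0), ("old", 1), ("old", 0), ("old", 0), ("old", 1),
]

# Invented population counts per cell (the poststratification table).
population_counts = {"young": 600, "old": 400}


def fit_cell_means(data):
    """Unweighted regression on cell indicators: fitted value = cell mean."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cell, y in data:
        totals[cell] += y
        counts[cell] += 1
    return {cell: totals[cell] / counts[cell] for cell in totals}


def poststratify(cell_estimates, pop_counts):
    """Average the cell-level predictions using population counts as weights."""
    total = sum(pop_counts.values())
    return sum(pop_counts[c] * cell_estimates[c] for c in pop_counts) / total


cell_means = fit_cell_means(respondents)
raw_mean = sum(y for _, y in respondents) / len(respondents)
poststratified = poststratify(cell_means, population_counts)

print(f"raw sample mean:         {raw_mean:.3f}")
print(f"poststratified estimate: {poststratified:.3f}")
```

No survey weights enter the model fit; the population information is used only at the final step, when the cell-level predictions are summed back to the population.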
DeBell and Krosnick did a great job in preparing this report, and I’m glad to have had the opportunity to work (if only remotely; we didn’t all actually meet) with several researchers who know a lot more about public opinion and survey research than I do.