Some thoughts on census adjustment

Groves rules out use of sampling in 2010 census:

President Barack Obama’s pick to lead the Census Bureau on Friday ruled out the use of statistical sampling in the 2010 head count, seeking to allay GOP concerns that he might be swayed to put politics over science. Robert M. Groves, a veteran survey researcher from the University of Michigan, also testified during his confirmation hearing that he remains worried about fixing a persistent undercount of hard-to-reach populations . . . Census officials have already acknowledged that tens of millions of residents in dense urban areas — about 14 percent of the U.S. population — are at high risk of being missed because of language problems and an economic crisis that has displaced homeowners.

My comments:

I have a great respect for Bob Groves, and I would trust his decisions on what to do with the Census more than I would trust my own.

Bob’s statement that “there is simply no time to prepare for it” seems eminently reasonable to me, especially given the cost constraints under which the census operates. On the statistical merits of the issue, I’m pretty sure that adjusted numbers would be better than unadjusted numbers. The census people know what they’re doing, and there are known problems of nonresponse, and, for anything where I care about the damn answer, I’d use their adjusted estimates over the raw numbers.

As a social scientist, I hope the Census Bureau can release two sets of numbers: one left unadjusted, for political reasons, and one adjusted, for those of us who want the most accurate inferences possible.

That said, I’m ignoring a possible indirect effect of adjusting the numbers: If people know that the census will do adjustment, maybe they’ll be less likely to participate in the enumeration in the first place. It’s hard to measure such an effect and, hey, it might be important. I don’t know.

I’m not thinking so much of individuals deciding whether to respond to the census, but rather of the decisions of local jurisdictions, where various spending formulas depend on population. For example, if it’s known that the census won’t be adjusted, then I’d expect the government of New York City to put a lot of effort into convincing people to participate. If it is known that the census will be adjusted, then there’d be a lot less motivation for localities to do what it takes to boost participation.

Conditional on the data already being collected, you’d definitely want to make statistical adjustments; it’s a tougher call to decide on this ahead of time. Also, if you know for sure you won’t be adjusting, this will affect the effort you put into collecting the data in different places. So if you’re not going to adjust, you might as well make that decision right away.

P.S. To expand on this slightly, I think any debates over census adjustments are fundamentally political debates, not statistical disagreements. The scientific consensus on adjustment is pretty easy (although people can argue about the details of implementation, as noted by Lawrence in comments below). It’s the political consensus that’s difficult, as there are clear winners and losers. With a lack of political consensus, all you need is a little bit of dust and confusion in the air to give a sense of a lack of scientific consensus, which then gets piped back in to justify inaction in the political process.

7 thoughts on “Some thoughts on census adjustment”

  1. I think it is a very odd individual who is less motivated to fill out the census because they know that the census is using inference. I could see myself thinking along those lines, and you and your readers. But I don't think we add up to statistically significant numbers.

    I like the idea of two datasets. I feel like we can't let go of the goal of a raw count for sentimental reasons. It is in the constitution! It is traditional. But yeah, accurate numbers would be good too.

  2. Why is there only a dichotomous choice between the unadjusted numbers and the most accurate possible statistical estimate? Seventy social scientists given the task of estimating the "actual" US population are probably not going to return the same numbers. Maybe they would all be identical in terms of allocating congressional seats, but there might be substantial variation in terms of intrastate estimates that would affect federal funding. So isn't the choice between a head count and an indefinite number of estimates?

  3. Lawrence: It's not a matter of arbitrary guesses. It's a matter of professional judgment, which the census bureau is qualified to carry out. There are various legal issues that are determined by professional judgment; to take a simple example, consider property tax assessments. The fact that an adjustment procedure won't be perfect isn't, to me, enough of a reason to prefer something demonstrably worse.

  4. Hi Andrew: I did not claim that statistical estimates are arbitrary guesses. I just tried to say that there's no guarantee that the estimates produced by the census would be the best possible ones.

    Compared to the percentage missed by the census, a much larger percentage of people don't vote. Social scientists could run a statistical model that better approximates the "public will" than raw election returns, but for some reason there is not much of a push to replace election numbers with statistical estimates.

    Maybe that's because an inferior result from a well-defined rule is sometimes preferable to an estimate based on debatable assumptions and post-hoc adjustments. That's why students who miss a quiz should receive a zero instead of a statistical approximation of what their grade might have been if they showed up.

    Compare the relative faith that sports fans have in results from college football and professional football, despite the former's use of supercomputers and professional judgment. The Arizona Cardinals probably weren’t the best team in the NFC last year, but I would rather they make the Super Bowl through a tournament than have the New York Giants or Carolina Panthers placed there because of a model that Bob Groves ran.

  5. Lawrence,

    I agree with you that it's all about the rules. But the rules for the census are that they are supposed to count everyone. Everyone. Not just the people who want to get counted.

    Voting is different. Here the rules are that you just count the voters. It's not about the public will, it's not about public opinion in the abstract, it's about the voters. As we've learned in Florida, Minnesota, and elsewhere, statistics are still necessary here: actual ballots can be ambiguous. But that's another story.

    Sports is different in another way. Winning matters. Take a simpler case, no tournament, just two teams playing. If team A beats team B, that's it. That's the winner. Team B doesn't get a trophy even if they are better in some abstract sense. It's about winning, because them's the rules.

    Back to the census. The goal is to estimate the number of people, and I'd like to get the best estimate. As I noted above, the census has a limited budget, and if Groves thinks it's not gonna happen, that's good enough for me.

  6. I was actually thinking a few months ago that I'd be more likely to refuse to answer if I knew they *weren't* adjusting. Since I'm a relatively wealthy, educated, 40-year-old white guy, I figure my participation can only make any biases worse…

  7. Currently, the census is used to collect all sorts of statistical data. But the constitutional requirement is just for a headcount.

    Why not two surveys: a headcount that is exactly that, a count of how many people live in each place, and a completely separate survey to collect statistical data that can use sampling, inference, etc.? More people might actually cooperate with the headcount if it were just a headcount.

    On another note, it wouldn't be a bad idea to have a central federal statistical office that would collect data for all federal departments in a nonpoliticized manner.
