The authors compare the official census count (based on the tallying up of all Census forms) with their own calculations, based on the sub-sample released for researchers (the “public use micro sample,” available through IPUMS). If all is well, then the authors’ estimates should be very close to 100% of the official population count. But they aren’t:
The two estimates are pretty similar for those younger than 65. But then things go haywire, with the alternative estimates disagreeing by as much as 15%. . . .
What’s the source of the problem? The Census Bureau purposely messes with the microdata a little, to protect the identity of each individual. . . . But the problem arose because of a programming error in how the Census Bureau ran these procedures. The right response is obvious: fix the programs, and publish corrected data. Unfortunately, the Census Bureau has refused to correct the data.
Huh? There must be something I’m missing here. Also, apparently there are similar problems with the American Community Survey and the Current Population Survey.
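For anyone who wants to run this sort of sanity check on their own extract, here is a minimal sketch of the comparison described above: sum the person weights in the microsample by age group, then express each total as a percentage of the official count. The function names, the age-group labels, the 5% tolerance, and all the numbers are made up for illustration; they are not the authors' code, not IPUMS variable names, and not real census figures.

```python
from collections import defaultdict


def weighted_counts(records):
    """Estimate population per group by summing person weights.

    `records` is an iterable of (age_group, person_weight) pairs,
    a hypothetical stand-in for rows of a public-use microsample.
    """
    totals = defaultdict(float)
    for age_group, weight in records:
        totals[age_group] += weight
    return dict(totals)


def coverage_ratios(estimates, official):
    """Microdata estimate as a percentage of the official count, per group."""
    return {
        group: 100.0 * estimates.get(group, 0.0) / official[group]
        for group in official
    }


def flag_discrepancies(ratios, tolerance_pct=5.0):
    """Return groups whose ratio is more than `tolerance_pct` away from 100%."""
    return sorted(
        group for group, ratio in ratios.items()
        if abs(ratio - 100.0) > tolerance_pct
    )


# Toy data, invented purely to exercise the functions.
records = [
    ("under 65", 100.0),
    ("under 65", 100.0),
    ("85+", 100.0),
    ("85+", 130.0),
]
official = {"under 65": 200.0, "85+": 200.0}

ratios = coverage_ratios(weighted_counts(records), official)
flagged = flag_discrepancies(ratios)
print(ratios)
print(flagged)
```

If all is well, every ratio sits near 100% and nothing is flagged. In the toy data the oldest group is deliberately off by 15% so that the check has something to catch.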
P.S. There’s one place where I disagree with Justin, though. He writes:
The microdata suggest that there are more very old men than very old women — I know some senior women who wish this were true!
You'd better do the decision analysis on this one carefully. If “this were true,” presumably it would mean that these senior women might very well be wishing themselves to be dead!
P.P.S. Usually there’s little point in my linking to a blog that gets 100 times our readership, but this is important enough, both to political scientists and more generally in sending the message that we must always check our data, that I wanted to highlight it here. As Daniel Lee can tell you, I spend lots and lots and lots of time checking my data. Even so, I’ve used IPUMS and never checked this!