Perils of comparing statistics, and don’t forget to look at standard errors too

Stuart Buck has an interesting story (linked from Tyler Cowen and Jane Galt) of a map that was published in the newspaper showing gains and losses in median household incomes. Apparently the map (from the Detroit Free Press) was mistaken. Buck writes,

Let’s take my home state of Arkansas. According to the Census Bureau’s page, Arkansas’ 1999 median household income — in 2005 dollars — was $34,770. Then in 2005, the median household income was $36,658. That’s an increase of 5.4%, as opposed to the 7.2% decrease that the Detroit Free Press claims to have found.

How about another state: Utah. In 1999 (again, in 2005 dollars): $53,943. In 2005: $54,813. That’s a rise of 1.6%, not a decline of 10.5% as the Free Press claims. . . .

The first journalist then followed up and explained further that the 1999 data came from the 2000 Census (it’s available here). They used the inflation calculator recommended by the Census Bureau. And then the 2005 data came from the American Community Survey (here). . . .

Estimates from any one survey will almost never exactly match the estimates from any other (unless explicitly controlled), because of differences such as in questionnaires, data collection methodology, reference period, and edit procedures.

Most importantly here, the American Community Survey seems, for whatever reason, to produce lower results than the official Census figures. For example, in one detailed analysis comparing ACS to the Census in a couple of counties, the Bureau reported:

There were significant differences in the estimation of median household income. In Tulare County, the Census reported a value of $33,983 compared to the ACS estimate of $31,467. This is consistent with Census Bureau research in other ACS sites that generally found lower income values reported in the ACS . . . .

This seems like a great example for a statistics (or policy analysis) class. Of course, the ultimate solution is not to give up but to get parallel series of both surveys (if possible) to better adjust for differences in making comparisons.

The other thing to be considered is uncertainty. Looking at the linked webpage, I see some big standard errors. For example, considering Stuart Buck’s example of Arkansas, we see $36,700 +/- 1400 (for 2005) and $34,800 +/- 1200 (for 1999). Assuming independent surveys (which maybe isn’t right), the difference is $1900 +/- 1800. That is, a difference of 5.4% +/- 5.2%. With numbers like these, it seems a little silly to be looking at individual states.
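Here is a minimal sketch of that arithmetic in Python, treating the "+/-" values as standard errors and the two surveys as independent (which, as noted, maybe isn't right). The function name is mine, made up for illustration, and the relative uncertainty treats the 1999 baseline as fixed, which is a simplification.

```python
import math

def difference_with_se(est_new, se_new, est_old, se_old):
    """Difference of two estimates and its standard error,
    assuming the two estimates are independent."""
    diff = est_new - est_old
    # Variances add for independent estimates, so standard errors
    # combine in quadrature.
    se_diff = math.sqrt(se_new**2 + se_old**2)
    return diff, se_diff

# Rounded Arkansas figures from the text: 2005 (ACS) and 1999 (Census).
baseline = 34_800
diff, se_diff = difference_with_se(36_700, 1_400, baseline, 1_200)

# Express both as a percentage of the 1999 baseline (baseline taken as fixed).
pct = 100 * diff / baseline
pct_se = 100 * se_diff / baseline

print(f"difference: ${diff:,.0f} +/- {se_diff:,.0f}")
print(f"relative:   {pct:.1f}% +/- {pct_se:.1f}%")
```

Up to rounding, this gives the same picture as the numbers above: the uncertainty in the difference is about as large as the difference itself.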

There is a statistical message here, too, which is that differences are hard to estimate precisely (unless they are studied using a panel design which keeps the data comparable from year to year).

P.S. See here for a table showing how variable the state estimates are, with color and two significant digits included to make the noise even more visible! There are many comments on that blog entry, and they all seem to be taking the numbers at face value.

1 thought on “Perils of comparing statistics, and don’t forget to look at standard errors too”

  1. The weird thing is that 2000 ACS data does exist. I'm not sure why the journalists didn't use that and compare it to the 2005 data. (One can get some extremely unlikely "changes" from 1999 to 2000 by using the difference between the Census and ACS data.) I'd assume sloppiness, since they seem like reasonable (if misguided) people based on their emailed replies to the critiques.
