Where the fat people at?

Pearly Dhingra points me to this article, “The Geographic Distribution of Obesity in the US and the Potential Regional Differences in Misreporting of Obesity,” by Anh Le, Suzanne Judd, David Allison, Reena Oza-Frank, Olivia Affuso, Monika Safford, Virginia Howard, and George Howard, who write:

Data from BRFSS [the Behavioral Risk Factor Surveillance System] suggest that the highest prevalence of obesity is in the East South Central Census division; however, direct measures suggest higher prevalence in the West North Central and East North Central Census divisions. The regions relative ranking of obesity prevalence differs substantially between self-reported and directly measured height and weight.

And they conclude:

Geographic patterns in the prevalence of obesity based on self-reported height and weight may be misleading, and have implications for current policy proposals.

Interesting. Measurement error is important.

But, hey, what’s with this graph:

Screen Shot 2015-07-31 at 10.45.29 AM

Who made this monstrosity? Ed Wegman?

I can’t imagine a clearer case for a scatterplot. Ummmm, OK, here it is:

fatties

Hmmm, I don’t see the claimed pattern between region of the country and discrepancy between the measures.

Maybe things will be clearer if we remove outlying Massachusetts:

fatties_no_ma

Maryland’s a judgment call; I count my home state as northeastern but the cited report places it in the south. In any case, I think the scatterplot is about a zillion times clearer than the parallel coordinates plot (which, among other things, throws away information by reducing all the numbers to ranks).

P.S. Chris in comments suggests redoing the graphs with the same scale on the two axes. Here they are:

fatties_square

fatties_no_ma_square

It’s a tough call. These new graphs make the differences between the two assessments more clear, but then it’s harder to compare the regions. It’s fine to show both, I guess.

26 thoughts on “Where the fat people at?”

  1. Why don’t you have the x and y scales and range the same?
    That way the reader can quickly tell which states are over or under-reporting vs. a 45 degree line.

  2. – it’s obvious (by inspection) that their conclusion was made (by inspection) using the top two in each list.
    – Hi Massachusetts, this is your mom. Is everything OK? Are you eating enough? I worry about you.

  3. The first two comments beat me to it. Certainly the trellis chart is not as useful as the scatter, but the scatter does not directly address the claims about whether self-reporting is biased from the actual data by region – without including the 45 degree line. I have plotted it that way as well as just comparing ranks with a 45 degree line. In either case, there is no apparent regional difference. I would note that virtually all states under-report obesity in terms of the raw measures, while they are more or less equally split (between under- and over-reporting) by rank.

  4. The self reported numbers, particularly ignoring MA, seem uncorrelated with the measured numbers. I’m too lazy to copy the numbers out of the data though. If anything the reported numbers seem to indicate whether or not the state borders an ocean.

    • I also looked at the rank differences a bit, and as you say, they appear quite different. I think the rank differences speak to perceptions about relative positions, while the absolute differences just speak to obesity perceptions vs. reality. However, we are really digging deep here into what seem to me to be fairly subtle and uninteresting “findings.”

      • I haven’t really been able to intuitively grasp the difference between what the “ranking by raw difference” means vs what “ranking by difference of ranking” means.

        All I can intuit is that the second approach ought to amplify noise, much as taking numerical derivatives of noisy data does.

        But if the relative position of states changes based on the approach I don’t know how to explain that to myself.
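One way to see how the two approaches can disagree is a toy calculation. The sketch below uses five made-up states with invented rates (not the numbers from the article), and the helper `ranks` is a hypothetical name introduced only for this illustration. Every state under-reports in raw terms, yet the rank differences must sum to zero, so they cannot all point the same way.

```python
# Hypothetical obesity rates (percent) for five made-up states.
# These numbers are invented for illustration; they are NOT the article's data.
measured = {"A": 30.0, "B": 33.0, "C": 34.0, "D": 38.0, "E": 39.0}
reported = {"A": 29.0, "B": 24.0, "C": 30.0, "D": 31.0, "E": 32.0}


def ranks(values):
    """Rank states from 1 (lowest rate) upward; assumes no tied rates."""
    ordered = sorted(values, key=values.get)
    return {state: i + 1 for i, state in enumerate(ordered)}


# Raw difference: self-reported rate minus measured rate, in percentage points.
raw_diff = {s: reported[s] - measured[s] for s in measured}

# Rank difference: position in the self-reported list minus position in the measured list.
measured_rank = ranks(measured)
reported_rank = ranks(reported)
rank_diff = {s: reported_rank[s] - measured_rank[s] for s in measured}

for s in measured:
    print(s, raw_diff[s], rank_diff[s])
```

In this toy data, state A under-reports by only one point but moves up a rank (because B drops below it), while D and E each under-report by seven points and do not change rank at all. The raw difference measures how far a state's self-report is from its own measurement; the rank difference measures only how it moved relative to the other states.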

  5. Those scatterplots are horrible! What’s with the tiny ticks and labels so far away from the ticks? I would also prefer a grey background with some nice lighter grey rules so the contrast isn’t so glaring. And maybe drop the primary colors—they’re tacky. Add a standard legend indicating the color coding rather than sneaking it into the subtitle. Why label all the states rather than just color coding—you don’t want everyone to read all the state names, do you?

    And isn’t the whole issue much more socio-economic than geographic? But what I really want is a red-state/blue-state analysis (of income vs. Republican vote share), but instead with income versus obesity. There’s a huge difference (pun intended) in size of people moving from 12th St (where I get on the subway downtown) up to 125th St (where I get off the subway uptown). Similar difference to moving from the neighborhood where my sister now lives (Birmingham, MI) to where I went to high school (Livonia, MI), two very different socio-economic communities in the Detroit suburbs.

    • Since I grew up in Detroit proper, I couldn’t help but look up whether the differences between Birmingham and Livonia are similar to what I remember — and also to compare with Detroit proper. The differences between Birmingham and Livonia seem a little greater than when I was a kid (with Birmingham still the ritzier), but both are still predominantly white, whereas Detroit is about 83% black. Per capita incomes compare as follows:

      Birmingham $68,435

      Livonia $29,536

      Michigan overall $25,681

      Detroit $14,870

  6. Maryland ain’t the Northeast or the South! Maryland is Mid-Atlantic, where Mid-Atlantic is defined as any state where you can find scrapple on a diner menu.

  7. The original chart was focused on the differences in *ranks*, and the scatterplots are of the actual obesity rates. Hence the authors’ misguided attempt to illustrate the differences in rank with that chart. However, I do believe that focusing on the differences in *rates* is more informative than the ranks.

    Regarding what the (definitely improved) graphs reveal – it appears that there is a line that runs parallel to the “no difference” line, only starting at about 33% on the REGARDS axis (FL/NY to LA/MS). Most of these points are in the Southeast, while the blue points are more varied above & below this line. Not sure if this means anything…just an observation.
