Calorie labeling reduces obesity Obesity increased more slowly in California, Seattle, Portland (Oregon), and NYC, compared to some other places in the west coast and northeast that didn’t have calorie labeling

Ted Kyle writes:

I wonder if you might have some perspective to offer on this analysis by Partha Deb and Carmen Vargas regarding restaurant calorie counts.

[Thin columnist] Cass Sunstein says it proves “that calorie labels have had a large and beneficial effect on those who most need them.”

I wonder about the impact of using self-reported BMI as a primary input and also the effect of confounding variables. Someone also suggested that investigator degrees of freedom is an important consideration.

They’re using data from a large national survey (Behavioral Risk Factor Surveillance System) and comparing the self-reported body mass index of people who lived in counties with calorie-labeling laws to that of people in counties without such laws, and they come up with these (distorted) maps:

[Images: two maps from the paper]

Here’s their key finding:

[Image: table of results from the paper]

The two columns correspond to two different models they used to adjust for demographic differences between the people in the two groups of counties. As you can see, average BMI seems to have increased faster in the no-calorie-labeling counties.

On the other hand, if you look at the map, it seems like they’re comparing {California, Seattle, Portland (Oregon), and NYC} to everyone else (with Massachusetts somewhere in the middle), and there are big differences between these places. So I don’t know how seriously we can attribute the differences between those trends to food labeling.

Also, figure 5 of that paper, showing covariate balance, is just goofy. I recommend simpler and more readable dotplots as in chapter 10 of ARM. Figure 4 is a bit mysterious too; I’m not quite clear on what is gained by the barplots on the top. Aren’t they just displaying the means of the normal distributions on the bottom? And Figures 1 and 2, the maps, look weird: they’re using some bad projection, maybe making the rookie mistake of plotting latitude vs. longitude, not realizing that when you’re away from the equator one degree of latitude is not the same distance as one degree of longitude.
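To put a number on the latitude/longitude point, here is a minimal sketch assuming a spherical Earth, where the east-west distance covered by one degree of longitude shrinks by the cosine of the latitude. The function name and the approximate city latitudes are mine, for illustration only.

```python
import math

def lon_lat_aspect(latitude_deg: float) -> float:
    """East-west length of one degree of longitude, as a fraction of the
    north-south length of one degree of latitude (spherical-Earth approximation)."""
    return math.cos(math.radians(latitude_deg))

# Approximate latitudes in degrees north, for illustration.
for name, lat in [("Equator", 0.0), ("New York City", 40.7), ("Seattle", 47.6)]:
    ratio = lon_lat_aspect(lat)
    print(f"{name}: one degree of longitude spans about {ratio:.2f} "
          f"of the distance of one degree of latitude")
```

A plot of raw longitude against latitude with equal axis scales therefore stretches northern places horizontally. Setting the plot's aspect ratio to 1 / cos(latitude) at the center of the map is a cheap fix when a proper projection is not available.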

As to the Cass Sunstein article (“Calorie Counts Really Do Fight Obesity”), yeah, it seems a bit hypey. Key Sunstein quote: “All in all, it’s a terrific story.” Even aside from the causal identification issues discussed above, don’t forget that the difference between significant and non-significant etc.

Speaking quite generally, I agree with Sunstein when he writes:

A new policy might have modest effects on Americans as a whole, but big ones on large subpopulations. That might be exactly the point! It’s an important question to investigate.

But of course researchers—even economists—have been talking about varying treatment effects for a while. So to say we can draw this “large lesson” from this particular study . . . again, a bit of hype going on here. It’s fine for Sunstein if this particular paper has woken him up to the importance of interactions, but let’s not let his excitement about the general concept, and his eagerness to tell a “terrific story” and translate into policy, distract us from the big problems of interpreting the claims made in this paper.

And, to return to the multiple comparisons issue, ultimately what’s important is not so much what the investigators did or might have done, but rather what the data say. I think the right approach would be some sort of hierarchical model that allows for effects in all groups, rather than a search for a definitive result in some group or another.

P.S. Kyle referred to the article by Deb and Vargas as a “NBER analysis” but that’s not quite right. NBER is just a consortium that publishes these papers. To call their paper an NBER analysis would be like calling this blog post “a WordPress analysis” because I happen to be using this particular software.


  1. Jonathan says:

    What is also interesting is that the article you are referring to came shortly after this article:

    I would also point to the new journal article about to be published in Health Economics on this topic:

  2. Carlos Ungil says:

    The analogy in your PS seems a bit strange. I’m curious: would you raise the same objection if a piece published as part of the Columbia Economics Discussion Paper Series was referred to as a “Columbia University analysis”?

    • Corey says:

      Based on Krugman’s description of NBER working papers, I’d say AG is right to note the oddity of calling a paper an NBER analysis even though the analogy to WordPress is a bit tenuous IMHO.

    • Andrew says:


      Yes, I would have the same objection.

      • Carlos Ungil says:

        Ok, thanks for the clarification. It makes sense to distinguish authors from the organisations they are affiliated with. This kind of reference (“a Yale study”, “a new Harvard paper”) is a simple way to provide some context, but I can also see how it could be used to mislead the reader.

        • Andrew says:


          Rather than “a Harvard paper,” I’d say, “a paper by Harvard professor Roland Fryer,” for example. You can provide the institutional affiliation while still naming the author. But NBER in particular conveys very little information. It’s like saying that a paper’s on Arxiv: it tells the reader where you found the paper, but NBER or Arxiv is not the author of the paper or even the organization that supported the writing of the paper.

          • Carlos Ungil says:

            Arxiv is not a research institute; I don’t think this analogy is valid either. If the authors think their affiliation with NBER conveys very little information, why don’t they publish it elsewhere?

            • Andrew says:


              For members of the club, NBER is just an easy way of posting unpublished papers in a way that they can get some attention. It really is like Arxiv, it’s just a different club of authors. I don’t think the organization NBER does any vetting of the papers uploaded to their site, beyond the same very basic screening that Arxiv does for the papers uploaded there.

              And people do publish NBER papers elsewhere. NBER is where economists post preprints; once a paper is accepted in some journal, it’s published there too.

              In any case, my real point is that NBER is in no sense the author of the paper, and in no sense is NBER doing the analysis.

              • Carlos Ungil says:

                There is no club at all in the arxiv case, anyone can submit anything (I don’t know what kind of filters are in place, but the bar doesn’t seem too high). NBER, by contrast, is an invitation-only club with some fifteen hundred members.

                I understand your point: NBER is in no sense the author of the paper. But this is no different from saying that Columbia University is in no sense the author of any of your papers, and I don’t think Columbia University does any vetting of the working papers that you publish on your web page, and of course those papers will be published in some journal once they are accepted. Or at least the difference is much more subtle than when you compare NBER with WordPress or arxiv.

              • Carlos Ungil says:

                By the way, people do sometimes include their NBER affiliation in the published version of the papers. For example:

              • Andrew says:


                Exactly what you say. I include my Columbia affiliation in the papers I write, but Columbia is in no sense the author of my papers.

  3. Dean Eckles says:

    “Inference is based on standard errors adjusted for clustering at the county level.”
    If the treatment is actually correlated beyond the county level (i.e. at the state or regional level) — as we expect for the outcome — this is going to be anti-conservative.

    This is similar to Andrew’s point about what is actually being compared (often really big states)… Like Barrios et al., we can ask, if treatment assignment were as good as random, how much independence is there really in the randomization of counties to treatments?
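For a rough sense of how anti-conservative this can be, here is a back-of-the-envelope sketch using the standard design-effect approximation 1 + (m - 1)ρ. It assumes equal-sized groups of m counties per state, an intraclass correlation ρ among counties in the same state, and a treatment that is constant within a state. The function names and the numbers plugged in are hypothetical, not taken from the paper.

```python
import math

def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation from ignoring within-cluster correlation,
    for equal-sized clusters and a regressor that is constant within cluster."""
    return 1.0 + (cluster_size - 1) * icc

def se_inflation(cluster_size: int, icc: float) -> float:
    """Factor by which the naive standard error should be multiplied."""
    return math.sqrt(design_effect(cluster_size, icc))

# Hypothetical: 50 counties per state, modest correlation between counties in a state.
m, rho = 50, 0.05
print(f"design effect: {design_effect(m, rho):.2f}")
print(f"standard errors understated by a factor of about {se_inflation(m, rho):.2f}")
```

Even a small between-county correlation within states can matter a lot when many counties share the same treatment status, which is the situation when a single state law covers all of them.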

  4. jrc says:

    “On the other hand, if you look at the map, it seems like they’re comparing {California, Seattle, Portland (Oregon), and NYC} to everyone else (with Massachusetts somewhere in the middle), and there are big differences between these places. So I don’t know how seriously we can attribute the differences between those trends to food labeling.”

    I would guess, based on their discussion, that they are actually using estimators that rely on mean-differencing within a county over time (“xtreg, fe” in Stata). So while the laws were concentrated in some states, the “persistent” differences across states (in the mean) are wiped out by the mean-differencing, and then indicator variables for each year control for an idiosyncratic national trend.
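To make the mean-differencing concrete, here is a minimal sketch on a made-up balanced panel. This is not the authors' code or data. It subtracts county means and year means (adding back the grand mean) and then regresses the demeaned outcome on the demeaned treatment indicator, which for a balanced panel matches a regression with county and year indicators. All names and numbers are hypothetical.

```python
from statistics import mean

def two_way_demean(panel):
    """panel: dict mapping (unit, year) -> value for a balanced panel.
    Returns values with unit means and year means removed."""
    units = sorted({u for u, _ in panel})
    years = sorted({t for _, t in panel})
    unit_mean = {u: mean([panel[u, t] for t in years]) for u in units}
    year_mean = {t: mean([panel[u, t] for u in units]) for t in years}
    grand = mean(list(panel.values()))
    return {(u, t): panel[u, t] - unit_mean[u] - year_mean[t] + grand
            for (u, t) in panel}

def within_slope(y, d):
    """Slope from regressing demeaned y on demeaned d (no intercept needed)."""
    y_dm, d_dm = two_way_demean(y), two_way_demean(d)
    num = sum(y_dm[k] * d_dm[k] for k in y_dm)
    den = sum(d_dm[k] ** 2 for k in d_dm)
    return num / den

# Hypothetical toy data: persistent county levels, common year shocks, no noise.
levels = {"A": 27.0, "B": 29.0, "C": 26.0}
shocks = {2006: 0.0, 2007: 0.2, 2008: 0.5, 2009: 0.6}
adopted = {"A": 2008, "C": 2009}  # county B never adopts a labeling law
true_effect = -0.3

d = {(u, t): 1.0 if u in adopted and t >= adopted[u] else 0.0
     for u in levels for t in shocks}
y = {(u, t): levels[u] + shocks[t] + true_effect * d[u, t]
     for u in levels for t in shocks}

print(round(within_slope(y, d), 6))  # recovers true_effect up to rounding
```

The large level differences between counties A, B, and C play no role in the estimate. The sketch does nothing about county-specific trends, which is the concern raised in the next paragraph.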

    Now of course, that isn’t perfect, and if the states had differential trends in BMI rates that were also correlated with adoption of a labeling law, then there would be problems. But I think that is second-order to the claim you are making.

    One very nice way to show this (hinted at in their figures, but not explicitly shown) is as an “event study” or “semi-parametric difference-in-difference”. That is, instead of just showing us one treatment effect average, let that effect play out from -T years before the law to +T years after the law. Then we could have visual evidence of an effect that is accumulating over time (which I think this effect would have to be). This also doesn’t take care of the fact that choice to implement labeling laws is correlated with all kinds of unobservable things about counties, but it would add some credibility to the estimated effects if we could see them playing out over time.

    But I actually suspect they see the contribution in the re-weighting scheme and in the finite-mixture-models and exploring heterogeneity in treatment effects in that manner. I’m not super convinced their re-weighting scheme is really going to help with the differential trends assumption as they claim it might (too much work driving trends done by unobservables), but I can see how the idea would interest people.

    • Andrew says:


      I recognize they are looking at time trends. Just look at the title of my post. When I say “there are big differences between these places,” yes, I think these big differences could correspond to differences in time trends, not just differences in legels.

  5. jrc says:

    That’s what I get for judging a cover by its book! I should’ve read the title.

    I retract my claims regarding your interpretation. I stick by my desire to see things in event time.

    PS – for those who don’t know, “legels” is a technical term meaning “propensity or desire to legislate nutritional information dissemination”, as in “California has serious legels compared to Nevada.”

  6. I have a student doing an honours project on combining real measurements from NHANES with self-report from BRFSS. Unsurprisingly, BMI from self-reported weight is lower than from measured weight, but the bias seems reasonably stable over 14 years of data, so trends from BRFSS may be ok.
