
Controversy over the Christakis-Fowler findings on the contagion of obesity

Nicholas Christakis and James Fowler are famous for finding that obesity is contagious. Their claims, which have been received with both respect and skepticism (perhaps we need a new word for this: “respecticism”?), are based on analysis of data from the Framingham heart study, a large longitudinal public-health study that happened to have some social network data (for the odd reason that each participant was asked to provide the name of a friend who could help the researchers locate them if they were to move away during the study period).

The short story is that if your close contact became obese, you were likely to become obese also. The long story is a debate about the reliability of this finding (that is, can it be explained by measurement error and sampling variability) and its causal implications.

This sort of study is in my wheelhouse, as it were, but I have never looked at the Christakis-Fowler work in detail. Thus, my previous and current comments are more along the lines of reporting, along with general statistical thoughts.

We last encountered Christakis and Fowler in April, when Dave Johns reported on some criticisms coming from economists Jason Fletcher and Ethan Cohen-Cole and mathematician Russell Lyons.

Lyons’s paper was recently published under the title, The Spread of Evidence-Poor Medicine via Flawed Social-Network Analysis. Lyons has a pretty aggressive tone–he starts the abstract with the phrase “chronic widespread misuse of statistics” and it gets worse from there–and he’s a bit rougher on Christakis and Fowler than I would be, but this shouldn’t stop us from evaluating his statistical arguments. Here are my thoughts:

1. Lyons’s statistical critiques seem reasonable to me. There could well be something important that I’m missing, but until I hear otherwise (for example, in a convincing reply by Christakis and Fowler, which could well appear soon), I’d have to go with Lyons and say that the claimed results on contagion of obesity (and also sleep problems, drug use, depression, and divorce) have not been convincingly demonstrated.

2. That said, this does not mean that Christakis and Fowler are wrong in their claims, merely that their evidence is weaker than may have at first appeared. Lyons recognizes this, writing, “while the world may indeed work as C&F say, their studies do not provide evidence to support such claims.” I wouldn’t go quite so far as to say they don’t provide evidence, but it seems fair to say they don’t provide convincing or compelling evidence. And, had the criticisms of Lyons and others been available when the papers were first submitted, I doubt they would’ve been accepted by top journals. (Again, this is my current impression, and I’m open to changing my opinion if Christakis, Fowler, or others can supply a convincing response.)

3. In debates about empirical social science, there is often a tendency to simply accept descriptive claims and move straight to the arguments about their implications. But as I’ve learned in my own research, often the descriptive claims themselves should be disputed. (For example: No, congressional elections are not increasingly likely to be close; No, redistricting does not in general create safe seats; No, we don’t need to explain why rich people now vote for Democrats or why Kansas has suddenly gone Republican.)

So I’d like to separate Lyons’s criticism of the descriptive inferences from his criticism of the causal implications. The descriptive criticism is that some of Christakis and Fowler’s observed differences are not statistically significant, so there is some doubt about generalization to the larger population; it could all just be patterns in random noise. The causal criticism is that, even if the descriptive patterns do generalize, they could be explained in ways other than contagion.
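The causal criticism is worth spelling out. Here’s a toy simulation (all numbers invented, nothing to do with the Framingham data) in which friends share an environment that affects each of them independently: a contagion-like association appears even though neither friend influences the other.

```python
import random

random.seed(1)

n_pairs = 20000
ego_obese = friend_obese = both = 0
for _ in range(n_pairs):
    shared_env = random.random() < 0.5   # e.g., same neighborhood or background
    p = 0.4 if shared_env else 0.1       # risk depends only on the shared environment
    ego = random.random() < p
    friend = random.random() < p         # independent of ego: zero contagion
    ego_obese += ego
    friend_obese += friend
    both += ego and friend

# Rate of obesity among friends of obese egos, versus the overall rate
print(both / ego_obese, friend_obese / n_pairs)
```

With these invented parameters the conditional rate works out to about 0.34 against a marginal rate of 0.25: a 36% “increased risk” of obesity given an obese friend, with zero contagion anywhere in the simulation.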

4. Some of Lyons’s points relate to my own research! In particular, he notes on page 6 that the difference between significant and non-significant is not itself statistically significant, a point that should be familiar to regular readers of this space. And on page 20, he discusses difficulties with average predictive comparisons in nonlinear models.
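To spell out the significant/non-significant point with a quick numerical example (estimates and standard errors invented for illustration):

```python
import math

def z(est, se):
    # z-score: estimate divided by its standard error
    return est / se

est_a, se_a = 25.0, 10.0   # z = 2.5: "statistically significant"
est_b, se_b = 10.0, 10.0   # z = 1.0: "not significant"

# The difference between the two estimates, however, is far from significant:
diff = est_a - est_b
se_diff = math.sqrt(se_a**2 + se_b**2)   # standard error of a difference
print(z(est_a, se_a), z(est_b, se_b), z(diff, se_diff))
```

The third z-score comes out around 1.06, nowhere near 1.96: comparing one group’s p-value to another’s is not the same as testing the comparison itself.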

I love seeing these ideas in new places. It’s like traveling in some foreign country and seeing McDonald’s and Gap.

5. Lyons goes a bit over the top in the conclusion of his article, slamming observational studies and modeling in general. But statistical modeling is important and useful in many, many areas of science and engineering. We all know about the modeling successes of the past, Kepler etc., but even modern-day statisticians can make progress with models. For example, see this paper, where we explicitly discuss how modeling allowed us to fit a nonlinear differential equation in toxicology. There’s also political science (lots of examples, starting with the recent work by Lax and Phillips, who used multilevel regression and poststratification to estimate state-level opinions on gay rights issues), civil engineering (they’ve been modeling road traffic for a long time, as discussed in the comments to Aleks’s recent blog entry on the topic), indoor air quality (ask Phil for details), business (lots of models were used in the Netflix prize, including by the winning teams), etc., etc.

Bob writes the following about models in computational linguistics:

Google translate is heavily model based, being derived from IBM’s original statistical translation models.

Ad placement is also heavily model based, and works at least as far as Google’s revenue is concerned.

All of the speech recognition in everything from call centers to the desktop is heavily model based, and works pretty well judging by the numbers of people using it.

A neat example is the Swype and T9 interfaces for entering text on cell phones. . . .

All of these models make crazy and wrong assumptions about independence and so on, but they work in the sense that they’re useful, not in the sense that they’re right.

Let me echo my friend here on the “useful” point. I don’t think our models are true. Even seemingly slam-dunk models such as simple random sampling are not true with real surveys, nor are randomization models actually true with experiments on real people. Models are always approximations. (Further ranting here.)

I think one should step back before slamming any research just cos it’s observational and model based. (And it doesn’t help that Lyons cites Larry Summers as an authority on statistical evidence.)

6. Just a minor technical point: On pages 17-18, Lyons implies that modeling is something you do when you don’t have enough data:

Small-scale experiments could be initiated to see what the effects of intervention actually are. Since the collection of good data is usually very hard and expensive, most papers substitute for it by statistical modeling.

This is misleading. First, a small-scale experiment can be noisy, and, by the very nature of its small scale, its larger implications can be limited. See our recent discussion of the claim (based on evidence from a randomized experiment!) that “a raise won’t make you work harder.” Reliability score: 100. Validity score: zero. Well, maybe not zero, but I don’t buy the generalization from lab to real world at all in that example. To me it seems more a case of lab results + ideology = policy claim.

The second problem with Lyons’s argument above is the implication that modeling is what you do when you don’t have good data. Au contraire! Once you have good data, you might very well want to model to learn important things. Consider our radon project. We had 5000 excellent data points and 80,000 good data points. And to learn what we needed to learn, we fit a model. Which involved lots of work, lots of interaction between the science and the data, and lots of checking. It wasn’t easy but it’s what we needed to do.

Some of these points are subtle. Applied statistics can be subtle. Taking an intro stat course, or even teaching such a class, doesn’t give you the full story. From an intro book, you can easily get the idea that when you have clean data you don’t need to model. But it all depends on what questions you’re asking.

It’s easy to write a sentence like, “viewing observational data through the lens of statistical modeling produces new biases, generally unknown and mostly unacknowledged, lurking in mathematical thickets.” That sounds reasonable enough. But if I want an estimate (and uncertainty) about the distribution of radon levels of houses with basements in Lac Qui Parle County, then, yes, I’ll accept those mathematical thickets. Math is ok if it helps us get good answers.
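To give a flavor of what such a model buys you, here is a minimal sketch of partial pooling of the kind a multilevel radon model does: a county with only a few measurements gets an estimate shrunk toward the overall mean, via precision-weighted averaging. All numbers here are invented; this is not the actual radon model.

```python
# Toy partial-pooling estimate for a small county's mean log radon level.
# All parameters are invented for illustration.
state_mean = 1.2               # overall mean of log radon across the state
sigma_y = 0.8                  # measurement-level standard deviation
tau = 0.3                      # between-county standard deviation
county_obs = [2.1, 1.7, 2.4]   # only three measurements in this county

n = len(county_obs)
ybar = sum(county_obs) / n     # noisy raw county average

# Precision-weighted compromise between the county data and the state mean:
w = (n / sigma_y**2) / (n / sigma_y**2 + 1 / tau**2)
county_est = w * ybar + (1 - w) * state_mean
county_se = (n / sigma_y**2 + 1 / tau**2) ** -0.5
print(round(county_est, 2), round(county_se, 2))
```

With only three measurements, the raw county average of about 2.07 gets pulled most of the way back toward the overall mean of 1.2, and the estimate comes with an uncertainty attached. That’s the kind of answer (estimate plus uncertainty for a sparsely measured county) that the model delivers.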

The bottom line

To return to Christakis and Fowler: I’d be interested to see their reply to the criticisms of Lyons and others. Perhaps they’ll simply step back a few paces and say that the Framingham data are sparse, that they’ve found some interesting patterns that they hope will inspire further study in other contexts.

After all, even if the Framingham results were unambiguously statistically significant, robust to reasonable models of measurement error, and had a clean identification strategy–even then, it’s just one group of people. In that sense, the debate about Christakis and Fowler’s particular claims, interesting and (methodologically) important as it is, is only part of a larger story of personal networks, health, and behavior. I hope that Lyons’s article and any responses by Christakis, Fowler, and others will be helpful in designing and analyzing future studies and in piecing together the big picture.

P.S. I conveyed point 5 above to Lyons and he responded that he respected models too but was concerned with models that cannot be tested. I agree with him on that. I believe that model checking is central to applied statistics. (Another point that will be familiar to regular readers of this blog.)

P.P.S. More Fowler here and here.


  1. Kaiser says:

    I have trouble even understanding what they mean by "obesity is contagious". Is that a causal statement in the sense that obesity is like the flu, and there exists some as yet unknown vector that transmits the condition from one person to another? Or is it a statement of correlation saying that the chance of someone being obese is higher if we know this person's "best friend" is obese?
    If the former, we can hold our belief until someone discovers the causal mechanism. If the latter, I'm sure there are other datasets that can corroborate or overturn this correlational evidence.
    I haven't read anything yet except Andrew's musings. But what if these "best friends" are often relatives and siblings, and in that case, aren't they just saying obesity runs in the family?

  2. Michael says:

    Another great, readable response to a thorny issue – this is why I enjoy reading your blog.

    I haven't had time to look carefully through Lyons' statistical arguments. I'm expecting many of them to be reasonable and worth discussing, and maybe I'll pipe in later in the comments. That being said, I think you are much too kind about Lyons' tone in this paper. He is nothing less than savage in his direct attacks on Christakis and Fowler, which I think is inexcusable professional behavior. Don't you find the tone of the language unacceptably personal?

    Also, a brief thought on the bigger picture: perhaps the methods used in the C&F studies will seem laughably outdated in a few years, but I think it's reasonable to say that much of the increase in research on statistical methods for network science is due to the popularity/controversy of these papers. Good science should inspire new questions and excite people; if it is proven wrong in the future, so be it, but nothing is more important than publishing something compelling enough to create entirely new directions for research.

    (disclosure: I work with Christakis closely, but have had zero input on the social contagion projects)

  3. Michael says:

    Also, by the way, I am struck by the extreme arrogance of writing a section in one's paper criticizing the peer review process because journals have rejected the paper. Peer review certainly has lots of problems, but perhaps Lyons is also getting a message that he needs to find another way to communicate his nasty critique rather than suggesting that everyone else is wrong?

  4. conchis says:

    The core of the critique really seems to lie in Lyons' section 2 (which, like Andrew, I pretty much buy – unless there's a convincing rebuttal forthcoming).

    The critiques in section 4 and 5 seem to me to be based on a) C&F's description of their methods not being entirely free of ambiguity; b) Lyons' resolving the ambiguity in the least favourable way possible. Hard to know really what to make of that.

    Either way, I agree with Michael that the tone is pretty irritating.

  5. David Manheim says:

    Agreed – regardless of the validity of the claims, if we do not insist upon some level of civility, we effectively reward antisocial behaviors.

    You don't need to be a statistical modeler to see the effect that declining levels of civility have on the overall tone of public discourse in this country.

  6. Joseph says:

    I agree about the issue of civility. It makes it harder to resolve disputes and personalizes what should be serious work on trying to understand an often confusing world.

    I was once on a paper that had an issue (it was tangential to the main point). When it got personal, tempers flared. When it got less personal, we ended up becoming friends and allies with our critics. And my personal reaction to thinking I might have made a mistake on a graph: intense shame. Nobody needed to rub it in.

  7. I think this article, "On Chomsky and the Two Cultures of Statistical Learning" by Peter Norvig, could be relevant here. It illustrates a danger of models that goes beyond the interpretation of data: when models become ideologies, they become part of the language and preclude the formulation of other models.

    @David Manheim: I think that one should be a statistician before claiming "declining levels of civility". More than likely there is no such thing.

  8. numeric says:

    The short story is that if your close contact became obese, you were likely to become obese also. The long story is a debate about the reliability of this finding (that is, can it be explained by measurement error and sampling variability) and its causal implications.

    Haven't read the paper, but there are numerous anecdotal (and maybe scholarly) articles on how we are becoming two peoples: a lower class of those who are overweight and an upper class of those who are "thin" (how many overweight colleagues do you have in the Columbia Statistics Department?–yet 50% of Americans are overweight, or some such factoid).

    Anyway, the obvious point is that your friend is likely of the same social class as you, and if you are lower, you're heavier. Correlation/causation etc. Like the observation that shoe size and scores on adult IQ tests are heavily correlated, until you control for age.

  9. connecting links says:

    Here is a link to your follow up post (containing links to still other papers and to a draft response by Christakis/Fowler), added to this thread, in case those coming here directly don't see it: