
It’s hard to know what to say about an observational comparison that doesn’t control for key differences between treatment and control groups, chili pepper edition

Jonathan Falk points to this article and writes:

Thoughts? I would have liked to have seen the data matched on age, rather than simply using age in a Cox regression, since I suspect that’s what’s really going on here. The non-chili eaters were much older, and I suspect that the failure to interact age, or at least specify the age effect more finely, has a gigantic impact here, especially since the raw inclusion of age raised the hazard ratio dramatically. Having controlled for Blood, Sugar, and Sex, the residual must be Magik.

My reply: Yes, also they need to interact age x sex, and smoking is another elephant in the room. A good classroom example, I guess.

It’s no scandal that a weak analysis is published in PLOS ONE. Indeed, I think it’s just fine for such speculative research to be published, along with the raw data and code used in the analysis, so that others can follow it up.


  1. a reader says:

    Yes, with access to the data, we can follow up and answer all the questions the authors may have left open! Which is important, because no reader will ever be 100% satisfied by an author’s analysis, no matter how thorough the author is.

    Looking at the demographics, it’s very interesting that the hypertension rate was about 35% higher for the non-chili-eating group: 27.1% vs 19.9%. It’s my guess that this would not be explained by the age difference alone (difference in average age = 6 years). But before we start making causal conclusions, we also have to worry about survivorship bias: while I love the flavor, I don’t eat hot chilis because I have stomach problems. This could easily put me at risk for a number of other issues.

    Man, applied analyses can be hard.

  2. Anoneuoid says:

    Lots of things are correlated with food. Look at this one, where they find that patients with “probable Alzheimer’s” are much better at smelling peanut butter with their left nostril than the right compared to people in other groups:

  3. This is an example where I think using age as the time scale makes more sense than using time since NHANES interview. That way, all the information in the partial likelihood comes from comparisons between people of the same age. I think all survival analysis software now supports left-truncation (delayed entry), so it isn’t that hard to do. You do introduce the problem that at the same age some people’s dietary assessment will be more recent than others, but if that matters a lot I think you’re in trouble with a single dietary assessment anyway.

  4. dl says:

    From the title, I thought this was going to be about RateMyProfessor
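
For readers who want to see what comment 3 is proposing, here is a minimal sketch of a Cox partial likelihood with age as the time scale and delayed entry. Everything in it is an assumption made for illustration: the `Subject` fields, the single `eats_chili` covariate, and the five made-up records, which are not NHANES data. It is not the analysis from the paper, just a small instance of the idea that each death is compared only with people who were under observation at that same age.

```python
import math
from dataclasses import dataclass


@dataclass
class Subject:
    entry_age: float  # age at interview, when follow-up starts (delayed entry)
    exit_age: float   # age at death or at censoring
    died: bool        # True if exit_age is an age at death
    eats_chili: int   # hypothetical binary exposure: 1 = yes, 0 = no


def risk_set(subjects, age):
    """Subjects under observation at `age`: already entered, not yet exited."""
    return [s for s in subjects if s.entry_age < age <= s.exit_age]


def partial_log_likelihood(subjects, beta):
    """Cox partial log-likelihood on the age scale (Breslow form for ties)."""
    total = 0.0
    for s in subjects:
        if not s.died:
            continue
        # Only people observed at this exact age contribute to the comparison.
        at_risk = risk_set(subjects, s.exit_age)
        denom = sum(math.exp(beta * r.eats_chili) for r in at_risk)
        total += beta * s.eats_chili - math.log(denom)
    return total


# Made-up records for illustration only.
cohort = [
    Subject(40.0, 52.0, False, 1),
    Subject(45.0, 58.0, True, 1),
    Subject(50.0, 63.0, False, 0),
    Subject(55.0, 61.0, True, 0),
    Subject(60.0, 72.0, True, 0),
]

if __name__ == "__main__":
    for age in (58.0, 61.0, 72.0):
        print(age, len(risk_set(cohort, age)))
    print(partial_log_likelihood(cohort, 0.0))
```

In this toy cohort the person who entered at age 60 is not in the risk set for the death at age 58, and the person censored at 52 is not either. That is the left-truncation point: someone interviewed at 60 tells us nothing about mortality at 58, and on the age scale they are never used as if they did.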
