It’s not enough to be a good person and to be conscientious. You also need good measurement. Cargo-cult science done very conscientiously doesn’t become good science; it just falls apart from its own contradictions.

Kevin Lewis points us to a biology/psychology paper that was a mix of reasonable null claims (along the lines of: the data don’t give us enough information to say anything about XYZ) and some highly questionable noise mining supported by p-values and forking paths.

The whole thing is just so sad. The researchers are aware of the statistical problems of forking paths, but they still persist in doing noise-mining research, perhaps in response to the requirements of clueless reviewers. The thing that doesn’t always seem to be understood in this sort of work is that it’s not enough to be a good person and to be conscientious. You also need good measurement. Cargo-cult science done very conscientiously doesn’t become good science; it just falls apart from its own contradictions.

Again: you don’t have to be a good person to be a good scientist.

If you do happen to be a good person, the above sentence implies two things:

1. You can be doing bad work! If your measurements are noisy or not well connected to theory, and you’re using statistical methods that don’t work well in such settings, you can mislead yourself and others. Purity of heart is no protection: the math doesn’t care. If you conduct power = .06 research, or if you try to study ovulation and you get the dates of ovulation wrong, or if you study sex ratios without understanding scales of variation, or if you study himmicanes without getting control of your data, etc., then you will fail to learn about reality. You will be doing bad science. Science has its own logic. (The simulation sketch after this list makes the power = .06 point concrete.)

2. To flip it around: if you do happen to be doing bad work, that doesn’t make you a bad person! And if Greg Francis or Uri Simonsohn or Anna Dreber or Jeremy Freese etc etc say that you’re doing bad work, that doesn’t mean they’re saying you’re a bad person. They’re just saying you’re doing bad science, that the methods you’re using are not appropriate for the scientific questions you’re trying to study.
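To make point 1 concrete, here is a minimal simulation, using assumed numbers rather than data from any actual study, of what “power = .06” research implies: the estimates that clear the significance filter exaggerate the true effect by roughly an order of magnitude, and about a quarter of them have the wrong sign (type M and type S errors).

```python
# A minimal sketch of "power = .06" research; the effect size and
# standard error below are illustrative assumptions, not numbers
# from any real study.
import numpy as np

rng = np.random.default_rng(42)

true_effect = 0.12   # small but real effect (assumed)
se = 0.5             # standard error of one noisy study (assumed)
n_sims = 100_000

# Each simulated study yields one noisy estimate of the true effect.
estimates = rng.normal(true_effect, se, n_sims)
significant = np.abs(estimates / se) > 1.96  # two-sided test at p < 0.05

print(f"power: {significant.mean():.2f}")  # roughly 0.06
print("exaggeration among significant results: "
      f"{np.abs(estimates[significant]).mean() / true_effect:.0f}x")
print("share of significant results with the wrong sign: "
      f"{(estimates[significant] < 0).mean():.2f}")  # roughly 0.25
```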

Or if you completely screwed up on a project and have to publicly correct your work (which has happened to me four separate times; see here, here, here, and here), that doesn’t make you a bad person. And if someone points out a serious error in your work, they’re not calling you a bad person. Just eat the ball; you don’t have to do a Yepremian.

Remember, we all make mistakes, and as scientists it’s our job to learn from them.

P.S. I removed the link and reference to the paper that motivated this post, because (a) the point I’m trying to make is general and doesn’t really require reference to any particular paper, and (b) in this particular case, it’s possible that some of the worst of the analysis was reluctantly done in response to the review process, which is annoying in its own right but not really so relevant to our main concerns here. It’s just another example of the problems with peer review—in this case, review can make a paper worse.

18 thoughts on “It’s not enough to be a good person and to be conscientious. You also need good measurement. Cargo-cult science done very conscientiously doesn’t become good science; it just falls apart from its own contradictions.”

  1. I’ve been reading this blog for about 6 months, and I just dropped in to say I never thought I’d find a Garo Yepremian reference, but here we are. There are now at least TWO stats-related blogs out there that have mentioned Yepremian not falling on the stupid ball as a teaching point.

    • Anon:

      I assume that when Cuddy says “fake it,” she’s not advocating dishonesty but rather advocating that people proceed as if they know what they’re doing even when they lack full confidence, on the theory that the social benefits of people trying new things will exceed the costs of failure. This does not seem an unreasonable position to take. In keeping with the post above, I think I’d say something like:

      Fake it till you make it and then learn when you mistake it.

    • Mason:

      Getting control of your data includes understanding what’s being measured and how the measurements relate to the statistical analysis and the underlying questions of interest. The himmicanes and air rage studies were two examples in which researchers didn’t seem in complete control of their data; they seemed to think it was enough to gather some data that had some general relevance to their questions, and then just proceed with analysis and interpretation. Other examples were the ovulation-and-voting and ovulation-and-clothing studies, where the researchers mislabeled the dates of peak fertility, which completely destroyed the claims they were making with regard to their substantive theory.

  2. Grad student here who has recently been reading this blog and still learning about detecting the forking paths issue. I’m curious, what is the major objection to the paper? My takeaway is “yep, power posing doesn’t do anything to testosterone,” and that seems very worthy of publishing given many people are interested in power posing (or: the issue is not resolved for many people, even if it is for some). The researchers seem to indicate with the last sentence that their takeaway is the same as my interpretation, but with a slightly less dismissive tone that seems appropriate for publishing a basically null finding in a journal. Is it that they don’t do enough to say “this is noise”, the actual design itself, or something else?

    • Alex:

      I have not looked at every detail of the paper, but some clear forking paths include the choice of which interactions to focus on, and the sort of overinterpretation of noise that commonly occurs after dichotomizing results into statistically significant and non-statistically significant. A small simulation below illustrates the dichotomization point.
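      Here is a generic sketch of that mechanism (my own simulation, not the paper’s analysis): the outcome below is pure noise by construction, yet scanning all two-way interactions among a handful of predictors and dichotomizing at p < 0.05 “finds” something a large share of the time. All sizes and settings are illustrative assumptions.

      ```python
      # Pure-noise data plus many candidate interactions: the forking-paths
      # problem in miniature. All sizes here are illustrative assumptions.
      import itertools
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(1)
      n_obs, n_predictors, n_reps = 100, 5, 500

      def any_significant_interaction() -> bool:
          X = rng.normal(size=(n_obs, n_predictors))
          y = rng.normal(size=n_obs)  # outcome unrelated to every predictor
          for i, j in itertools.combinations(range(n_predictors), 2):
              design = sm.add_constant(
                  np.column_stack([X[:, i], X[:, j], X[:, i] * X[:, j]]))
              if sm.OLS(y, design).fit().pvalues[3] < 0.05:  # interaction term
                  return True
          return False

      hits = sum(any_significant_interaction() for _ in range(n_reps))
      print(f"Pr(at least one 'significant' interaction): {hits / n_reps:.2f}")
      # With 10 interaction tests per dataset, this comes out around 0.4,
      # far above the nominal 0.05.
      ```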

  3. And if you are an author of a stats textbook that garners royalties year after year from thousands of students who are required to buy it for $100-200+ a pop, and if you put that book on your cv and it helps get you tenure and promotion, and if that book doesn’t dig deep into the replication crisis and really help students understand what Meehl and Cohen and Ioannidis and Simonsohn and many others have been saying for the last half century or more, then you are a huge part of the problem.

    I’m not singling out Gelman and Hill, since Andrew has said he and Jennifer are revising their book along these lines. But I just did a quick search on Amazon for “statistics textbook” and then searched for any reference to Meehl. Freedman had one footnote. None of the others I could search had anything (but many weren’t searchable either). I also did a quick search on google books. McElreath cites several of the above folks, and a few recent edited volumes have some refs, but not too much else that I could see.

    • Ed:

      My textbooks all cost less than $100. I’ve put in a lot of effort to keep the prices down. Royalties from book sales are a tiny proportion of my income. Also, my books did not help me get tenure or promotion. Actually, based on the remarks of the department chair at Berkeley back when I worked there, I think that my book made it harder for me to get tenure!

      • Ed, Andrew:

        Maybe there’s a more general point here that’s consistent with what both of you have said:

        If you have enough status to challenge obvious flaws in a field, then you were probably oblivious to those flaws for a long time. Otherwise, you wouldn’t have gained enough status in the field to challenge its flaws successfully.

        So, in practice, most people who challenge deep-seated flaws used to be part of the problem. On the other hand, those who were aware of the problem to begin with probably decided to work in another field, left science, had little status, or were OK pretending the problem didn’t exist. Those people aren’t likely to address the problem successfully.

        This applies well to me and others I know. In the field I work in currently, cancer screening, there are fundamental flaws that, looking back, really should have been obvious years ago. But it is only the people who were initially oblivious to these problems who gained enough skill and experience in the field to start addressing them.

  4. Last year we had an editor write something along these lines: “In avoiding the garden of forking paths [something we said we were trying to do by not following all of the data probing analyses suggested] you may wander into the garden of missed opportunities.” We withdrew the paper from that journal and went elsewhere.

    • Lorne:

      It sounds like there was some misunderstanding. My recommended solution to the forking-paths problem is not to look at only a subset of comparisons but rather to look at all comparisons of potential interest (a small sketch below shows one way to do this). Indeed, my criticism of typical forking-paths analyses is that they select arbitrary, essentially random, comparisons that happened to reach some threshold. This can represent a missed opportunity to learn.

      The general point seems important to me, if people mistakenly think there’s a tradeoff in which noise-chasing data analysis represents “opportunities” that can be missed. So maybe this is worth a longer post. Do you happen to have a quote from that editor’s report that you could share?
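      For what it’s worth, here is one minimal sketch, on simulated data with assumed numbers, of what analyzing all comparisons at once can look like: a simple normal-normal partial-pooling estimate that keeps every comparison and shrinks the noisy ones toward the grand mean, instead of discarding whatever fails to cross a threshold.

      ```python
      # Partial pooling of all comparisons (a simple normal-normal model);
      # the group count and standard error are illustrative assumptions.
      import numpy as np

      rng = np.random.default_rng(7)

      n_groups = 20
      true_effects = rng.normal(0.0, 0.2, n_groups)  # assumed small real variation
      se = 0.5                                       # assumed per-comparison std. error
      estimates = rng.normal(true_effects, se)

      # Method-of-moments estimate of the between-group variance tau^2:
      # total variance of the raw estimates minus the sampling variance.
      tau2 = max(estimates.var(ddof=1) - se**2, 0.0)

      # Shrink every raw estimate toward the grand mean; nothing is discarded,
      # and the shrinkage is strongest when the estimates are noisiest.
      shrinkage = tau2 / (tau2 + se**2)
      pooled = estimates.mean() + shrinkage * (estimates - estimates.mean())

      for raw, shrunk in zip(estimates, pooled):
          print(f"raw {raw:+.2f} -> partially pooled {shrunk:+.2f}")
      ```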

    • Exploring the data is good science. Unfortunately, it is incompatible with the standard method of drawing conclusions (NHST). Why not keep the science and lose the NHST?
