“Six red flags for suspect work”

Raghu Parthasarathy sends along this article by C. Glenn Begley listing six questions to ask when worried about unreplicable work in biology:

Were experiments performed blinded? (Even animal studies should be blinded when it comes to the recording and interpretation of the data—do you hear that, Mark Hauser?)

Were basic experiments repeated? (“If reports fail to state that experiments were repeated, be sceptical.”)

Were all the results presented? (That one’s a biggie.)

Were there positive and negative controls? (He offers some details from lab experiments.)

Were reagents validated? (Whatever.)

Were statistical tests appropriate? (I don’t like the idea of statistical “tests” at all, but I agree with his general point.)

9 thoughts on “‘Six red flags for suspect work’”

    • Louis,

Remembering my experience “volunteering*” for psychology experiments as an undergrad, I think a lot of them can be blinded in some way. In the case I remember, I was told to read something, then answer some questions. It was a framing experiment, so my answers were to be analyzed in relation to what I had read. There was no need for the grad student administering the experiment to know which document/treatment I had read. He could just hand over an envelope that had a number on it, write down my name with the number next to it, and the statistician/researcher could match envelope number, outcome responses, and treatment status later (see the sketch after this comment).

Similarly, in a larger-scale field experiment in public health that I work on, we have one team that distributes the “treatment” and a separate team, which meets the subjects in a separate location, that records the outcomes. The first team is obviously not blinded to treatment, but the team collecting the outcome data is.

So it can be done in a lot of contexts with just a little bit of creativity, and it’s not that everyone always has to be blind, just the people collecting the outcome data.
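A minimal sketch of that envelope-and-key protocol, in Python; everything here (the make_envelopes function, the frame_A/frame_B arms, the subject IDs) is a made-up name for illustration, not anything from the comment:

```python
# A sketch of the numbered-envelope protocol: the experimenter handles only
# envelope numbers; the statistician alone holds the number-to-treatment key.
import random

def make_envelopes(treatments, n_per_arm, seed=42):
    """Statistician's step: randomly assign treatments to envelope numbers."""
    rng = random.Random(seed)
    labels = [t for t in treatments for _ in range(n_per_arm)]
    rng.shuffle(labels)
    return {i + 1: t for i, t in enumerate(labels)}  # envelope number -> arm

key = make_envelopes(["frame_A", "frame_B"], n_per_arm=3)

# Experimenter's log: subject -> envelope number, with no treatment info.
log = {"subject_01": 1, "subject_02": 2, "subject_03": 3}

# Only after the outcomes are recorded does the statistician merge log and key.
for subject, envelope in log.items():
    print(subject, envelope, key[envelope])
```

The whole point of the design is that join at the end: the person facing the subjects never needs the key, so nothing they record can be colored by knowing the treatment.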

In some cases, you may not be able to blind a study, but you may be able to blind the analysis. In marketing, for example, you might introduce a new product into a set of randomly chosen areas, say anchovy-flavored Jello. Obviously, you can’t fully blind this study (particularly as you need to advertise the product!). But you might be able to blind much of the analysis, e.g., as to how much the new item increased overall Jello sales in the test stores versus the control stores (see the sketch after this comment).

      Unfortunately, this is seldom done.
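Here is a rough sketch of what a masked analysis could look like; the store names, sales figures, and group labels are all invented for illustration, not from the article or the thread:

```python
# Blinding the analysis rather than the study: the analyst sees only opaque
# group labels, and a colleague holds the mask until the analysis is locked.
import random
import statistics

rng = random.Random(1)
sales = {f"store_{i:02d}": rng.gauss(100, 10) for i in range(20)}
test_stores = {s for i, s in enumerate(sales) if i < 10}  # got the new item

# A colleague replaces the real arm names with opaque labels.
mask = {"test": "group_X", "control": "group_Y"}
labels = {s: mask["test" if s in test_stores else "control"] for s in sales}

def group_mean(g):
    return statistics.mean(v for s, v in sales.items() if labels[s] == g)

# The analyst estimates the group difference knowing only X vs. Y; only
# afterward is the mask revealed (which of X/Y was the test group).
print("group_X - group_Y:", round(group_mean("group_X") - group_mean("group_Y"), 2))
```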

“Were reagents validated” seems to be covering a lot of ground that we’d spend in social science talking about measurement reliability and validity (which, for example, I’ve seen someone somewhere list as the most important assumption of regression models).

    • I would say reagent validation is more analogous to manipulation checks than it is to measurement reliability. I suppose in some cases the effect observed on some reagent might be an outcome measure, in which case your analogy is more apt. But I think more often reagents are treatments or interventions (or a component thereof). This is an important point because measurement reliability and validity frequently get a lot of discussion, but manipulation checks are often overlooked.

My first instinct is that I also wouldn’t say that measurement reliability and validity are an assumption of regression models per se, because the same assumptions are required for any other statistical method you might use. A regression model on bad measures still produces a valid estimate of the relationship between those measures, assuming the real assumptions of the method (independence, equal variance, etc.) are met.
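A quick simulation (the setup and numbers are illustrative, not from the thread) makes that distinction concrete: with a true slope of 2 and a measure of x whose error variance equals its signal variance, the regression consistently estimates the measure-level slope, here the attenuated value 1:

```python
# Regression on a noisy measure: a consistent estimate of the slope between
# the *measures*, attenuated relative to the true-variable slope.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x_true = rng.normal(0.0, 1.0, n)
y = 2.0 * x_true + rng.normal(0.0, 1.0, n)   # true slope between X and Y is 2
x_obs = x_true + rng.normal(0.0, 1.0, n)     # unreliable measure of X

# OLS slope of y on the noisy measure converges to
# 2 * var(X) / (var(X) + var(error)) = 1, not to the true-variable slope 2.
slope = np.cov(x_obs, y, ddof=1)[0, 1] / np.var(x_obs, ddof=1)
print(round(float(slope), 3))   # ~ 1.0
```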

  2. > Were reagents validated? (Whatever.)

    “Whatever.”? No. Definitely not “Whatever.”

When I run simulations I don’t generally check to confirm that the machine I’m using reports 4 when I ask it to add 2 and 2, but if it’s reporting 3.97 then I have a problem. Reagent validation is the equivalent of checking to see that you get 4.*

    (* “Why did Intel call the successor to the 486 Pentium instead of 586?” “Because they added 100 to 486 and got 585.99999999.” Probably funnier twenty years ago than it is now.)
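In that spirit, the computational analogue of reagent validation is a known-answer test run before the real work; a minimal sketch, where the particular cases and the tolerance are my own illustrative choices:

```python
# Known-answer checks to run before trusting a simulation: if any of these
# fail, the machinery is broken and no downstream result can be trusted.
import math

def sanity_checks():
    assert 2 + 2 == 4
    # Accumulated rounding: fsum should recover exactly 1.0 from ten 0.1s.
    assert math.isclose(math.fsum([0.1] * 10), 1.0)
    # The classic Pentium FDIV probe: buggy chips were off by 256 here.
    assert abs(4195835 - (4195835 / 3145727) * 3145727) < 1e-4

sanity_checks()  # cheap enough to run before every simulation batch
```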

Validating reagents is really important in biology or chemistry, but not so important if you’re trying to extrapolate the same issues to social science, unless you’re testing for the effect of administered drugs on voting policy ;-). Perhaps that’s what Andrew meant by “whatever.”
