“I thought it would be most unfortunate if a lab . . . wasted time and effort trying to replicate our results.”

[cat picture]

Mark Palko points us to this news article by George Dvorsky:

A Harvard research team led by biologist Douglas Melton has retracted a promising research paper following multiple failed attempts to reproduce the original findings. . . .

In June 2016, the authors published an article in the open access journal PLOS One stating that the original study had deficiencies. Yet this peer-reviewed admission was not accompanied by a retraction. Until now.

Melton told Retraction Watch that he finally decided to issue the retraction to ensure zero confusion about the status of the paper, saying, “I thought it would be most unfortunate if a lab missed the PLOS ONE paper, then wasted time and effort trying to replicate our results.”

He said the experience was a valuable one, telling Retraction Watch, “It’s an example of how scientists can work together when they disagree, and come together to move the field forward . . . The history of science shows it is not a linear path.”

True enough. Each experiment, successful or not, takes us a step closer to an actual cure.

Are you listening, John Bargh? Roy Baumeister?? Andy Yap??? Editors of the Lancet???? Ted talk people????? NPR??????

I guess the above could never happen in a field like psychology, where the experts assure us that the replication rate is “statistically indistinguishable from 100%.”

In all seriousness, I’m glad that Melton and his colleagues recognize that there’s a cost to presenting shaky work as solid and thus sending other research teams down blind alleys for years or even decades. I don’t recall any apologies on those grounds ever coming from the usual never-admit-error crowd.

35 thoughts on ““I thought it would be most unfortunate if a lab . . . wasted time and effort trying to replicate our results.””

  1. Well I was gonna comment on what an excellent example of professional behavior (not to mention epistemological candor) we just witnessed in that post… but now I’m gonna comment on the fact that I know you just spent the last 10 minutes or so looking for an appropriate cat picture.

    Bravo.

  2. I thought this was the best part:

    “In retrospect, he [Douglas Melton] said he wished he’d performed the original experiment with more mice (“more attention to the statistical strength is a lesson that I’ve learned”).”

    I think this bit needs to be drilled more often into budding researchers.

  3. The part that makes me cringe is that Google tells me that the original Cell article (now retracted) has already been cited by 337 other papers.

    Ouch. Too little too late?

    I wonder how much crap proliferation this has already initiated and whether the downstream authors & readers will ever realize they are using a retracted starting point.

    • Let’s look at the rate of citations for that paper pre/post the PLOS follow-up. And then pre/post the retraction. It is a neat little natural experiment in the propagation of false information through the literature.

      Just, like, date on the x-axis and citations (in that month) on the y-axis, and some vertical lines where the two papers were published. And then connect the dots (and/or smooth between them). We could judiciously choose some “control” papers if we wanted, and graph the difference. (A rough sketch of this plot is at the end of this comment.)

      If we grab more examples, we can make the x-axis “event time” with 0=date of announcement. That would be cool.

      I mean, maybe if we wasted a whole lot of money that would suck, or if someone got hurt or something that would be real real bad. But otherwise, I’m not necessarily worried about 300+ citations.
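
      If someone wants to actually make that plot, here’s a rough sketch in Python. Everything below is a placeholder: the counts are simulated and the two event dates are only approximate, so you’d swap in the real monthly citation counts (pulled by hand from Google Scholar or Web of Science) and the actual publication dates.

      # Rough sketch of the pre/post citation plot described above.
      # All counts and dates here are placeholders, not real data.
      import matplotlib.pyplot as plt
      import numpy as np
      import pandas as pd

      # Hypothetical monthly citation counts for the retracted paper
      months = pd.date_range("2013-05-01", periods=45, freq="MS")
      rng = np.random.default_rng(0)
      cites = rng.poisson(lam=8, size=len(months))

      # Approximate event dates (substitute the real ones)
      plos_followup = pd.Timestamp("2016-06-01")   # PLOS ONE follow-up
      retraction = pd.Timestamp("2017-01-01")      # retraction

      fig, ax = plt.subplots()
      ax.plot(months, cites, marker="o")           # connect the dots
      ax.axvline(plos_followup, linestyle="--", color="gray", label="PLOS ONE follow-up")
      ax.axvline(retraction, linestyle="--", color="black", label="Retraction")
      ax.set_xlabel("Date")
      ax.set_ylabel("Citations in that month")
      ax.legend()
      plt.show()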

      • +1 I had the exact same thought. Does retraction cause any step change in the citation rate of that retracted paper?

        OTOH, there may be a counter effect where the paper gets post-retraction citations from authors pointing to the paper in some meta sense.

        • Wow, what a great idea for a way to increase my citations – start retracting original papers with a lot of fanfare!

          Wonder what the optimal citation count or fall-off in citation rate should be before sending off the retraction?

      • What’d be interesting is to see a post-hoc analysis of the statistical arguments in the retracted paper. Could we have, in hindsight, said that the arguments were weak / flawed?

    • It looks like the original paper was cited about 100 times AFTER another paper in Cell disputing the study was published. I am a bit divided in my response to this paper. On the one hand, the community raised questions about the paper as soon as it came out; there were multiple papers questioning its validity and calls for a retraction. Dr. Melton, on the other hand, did not retract but proceeded to test the work by collaborating with others in a blinded fashion. That is probably a commendable way to deal with the problem and sort out the science. Apparently another group replicated the work in rats while the controversy was going on, so there was motivation, if any was needed. And then there is the issue of many labs basing their work on the original paper, which was all over the media. So…

      PS: The “Retraction Watch” site has a better summary of the situation than the Gizmodo link above. I got the reference numbers from there. I was aware of the controversy because of similar problems that stem cell research was having at the time.

  4. I think the situation is different in medicine than in psychology. In the former case, lives depend on a result being true. In the latter, careers depend on results being published, and there can be no fatalities or mistreatment of patients. As a consequence, it’s fine if in psych* non-replicable results remain unretracted, but it’s not fine in medicine.

    • So suicides after therapy based on shoddy research aren’t fatalities? Of course some fields are more distant from human lives, and the closer they get the more we check. But it’s ~the same thing in medicine.

        • One of the most entrenched dogmas of research into organizational sensemaking is that “any old map will do”. Andrew and I have written about how this came about through the uncritical circulation of a plagiarized story about some soldiers in the Alps using a map of the Pyrenees. Many scholars in the social psychology of organizing seriously believe that it doesn’t matter whether your map accurately depicts the territory you are navigating in. Even Daniel Kahneman has invoked the story as though it really happened.

          The management of Eli Lilly used the story internally to guide the company’s strategic response to the expiration of their patent on Prozac in 2004. The strategy would include attempts to improve the sales of its other products. Two management scholars explained Eli Lilly’s approach as follows:

          “…the leadership, as it were, knew that [their description of the current situation to the staff wasn’t] a map of the Alps. They were clever enough to know that it was what people did rather than the map that made strategy in such circumstances. The leadership was also smart enough to keep this knowledge to themselves.” (Colville and Pye 2010, p. 374)

          The rest, as they say, is history. In 2009, Eli Lilly was found criminally responsible for promoting off-label uses of Zyprexa and had to pay 1.4 billion dollars in fines and damages. Bad social psychology has real-world consequences. It can be (and has been) used to justify doubtful strategic decisions as “wisdom”.

        • But it’s just stupid to use social psychology research for anything real. The Amy Cuddy level of stuff is fun to read about in Bild Zeitung, but hopefully it will never be used in real-life applications.

          I do take your point though; even useless and irrelevant research can filter down to real life and have real life consequences.

          I guess there are intrinsic reasons to try to do the best one can even in these useless areas. Thankfully, when people don’t do this, the probability of it hurting real people is *generally* low.

        • It’s important to keep in mind that many business leaders are educated in business schools through what is called “research-based” teaching. They are being explicitly taught these things as a basis for their decision-making. So it doesn’t just “filter down”. It’s being actively inculcated in the minds of the students.

        • I didn’t know that. I always thought that business schools must be teaching business-related stuff like accounting, i.e., adding, subtracting, multiplication, and division. I guess they should teach statistical criticism as a discipline.

        • Shravan:

          My experience in MBA school was very much as Thomas indicated, “research-based,” or as I would put it, “trying to bring the best of science to bear on the management of projects and assets.” So there was a bunch of material from social psychology research (which I critically filtered, given my previous training in semiotics and the theory of inquiry, which other students did not have). Overall I enjoyed the program very much – the statistical material, though, was extremely poor, but again I had taken some serious statistics courses previously that helped me realize that. Maybe social psychology research does not require the same degree of training for a faculty member to appear to get it mostly right?

          I also think you are being overly optimistic about non-replicable results in clinical research being quickly or definitively retracted. As a case in point, in this 2014 article there is a dispute about whose analysis (2008 versus 2010) is leading to many avoidable deaths – which one do you think should be retracted? – “Regulatory decisions pertaining to aprotinin may be putting patients at risk” http://www.cmaj.ca/content/186/18/1379.full.pdf+html?sid=4b041ca6-01d6-4b0e-99f2-26d5e86044d5

        • My impression is that “research-based” is often interpreted as “Someone has proposed this theory in a Peer-Reviewed-Publication, and the following is based on that theory.”

        • This social psychology stuff is all around. To some degree it shapes everybody’s beliefs about human behavior. How many people now talk about “cognitive biases” in the Kahneman and Tversky style? Some chapters of Danny’s book (Thinking, Fast and Slow) have a really low replication rate, but I’m sure lots of people interpret the world using those results.

    • Sadly, medical research is plagued by the same pressures to publish as psychology, and perhaps a greater resistance to admitting error. Much research is done by MDs with little training in science, often by students, and strong egos are at play. I’m very aware of hospitals treating patients on the basis of weak & invalidated studies. I take whatever my doctors advise with a strong sense of skepticism.

      • +1

        And an MD might say to a patient, “If we could clone you, and you took this medicine but the clone didn’t, you would live longer than the clone.” (Yes, I actually heard this from an MD — no mention of probability; no evidence that the MD was aware that “conclusions” are based on means or proportions.)

        • Yes, that’s sort of like what Brian Little says in Me, Myself, and Us (regarding the “lemon introvert test”):

          “One of the more interesting ways of informally assessing extraversion at the biogenic level is to do the lemon-drop test. [Description of experiment omitted from present quote—DS.] For some people the swab will remain horizontal. For others it will dip on the lemon juice end. Can you guess which? For the extraverts, the swab stays relatively horizontal, but for introverts it dips. … I have done this exercise on myself a number of times, and each time my swab dips deeply. I am, at least by this measure, a biogenic introvert.”

          I mean, really….

        • Diana:

          What the hell? People are so science-illiterate. But maybe we’re making progress. 50 years ago we’d be hearing rumors of perpetual motion machines, car engines that ran on water, spoon bending, and even bigfoot. Now the purveyors of pseudoscience have moved to embodied cognition, lemon juice extraversion, power pose, and himmicanes. The boundaries of pseudoscience are being pushed back into the trivial. From perpetual motion to the lemon juice test: in the grand scheme of things this is a retreat.

        • Andrew,

          Yes, maybe this is cause for hope: “The boundaries of pseudoscience are being pushed back into the trivial.”

          Along with that, there’s a trivialization of the trivial: a tendency (maybe increasing, maybe not) to take an already silly study and draw absurd and oddly domestic conclusions from it.

          Little and others are referring to an experiment discussed in a 1967 paper by Eysenck and Eysenck: H. J. Eysenck and Sybil B. G. Eysenck, “On the Unitary Nature of Extraversion,” Acta Psychologica, vol. 26 (1967): 383–390.

          In this paper, which inquires into the unitary nature of extraversion, they refer to an earlier, apparently unpublished experiment, which seems to be the one that gets all the press. In reference to this earlier experiment, they state that “extreme extraverts show little or no increment in salivation, while extreme introverts show an increment of almost 1 gram; intermediate groups show intermediate amounts of increment.” They claim to have found a correlation of .71 on 50 male and 50 female subjects between increment scores and introversion, with no difference between the sexes. I suspect that the middle range was quite noisy–but it’s impossible to know without looking at the data.

          In any case, they do not suggest that the results apply to individuals or that you can find anything out by performing a lemon test on yourself. But that’s the notion that took off in the press–in Little’s book, in Susan Cain’s book (and the version for teenagers), and in various online articles.

          There seems to be a drift toward Amazing Science that You Can Perform on Yourself at Home or in a Public Restroom–which, on the one hand, lends itself to hype (You can do it too! In two minutes!) but which also suggests a retreat, as you say.

        • P.S. The Eysenck & Eysenck paper focuses mainly on a subsequent lemon juice experiment. Here 45 men and 48 women took the lemon test and a 57-question version of the Eysenck Personality Inventory. According to the authors, the lemon test had a loading of -0.74 on extraversion and a loading of 0.01 on neuroticism; they provide no raw data. They then analyze the factor loadings of individual EPI test items. Too much to go into here–but in any case this test says nothing about salivation and introversion at the individual level.

    • Yes we know. That in the abode above the rape factory there were angels sitting and shipping out food to the poor.

      Gawker were lowlifes. But they had other useful sub-websites.

      Do we need to be gentle with horrible people just because they also have useful webpages?

  5. love the relentless pursuit of scientific charlatans on this blog. before i found this blog, i used to rail against absurd social psych experiments (e.g. https://goo.gl/1FmxY4) whose main goal seemed to be to confirm everyday observations with p-values, and whose practitioners seemed all too eager to impress us with their “scientific” wisdom.
