Too big to fail: Why it’s unrealistic to expect scientific journals to retract their huge backlog of erroneous papers

A couple years ago I wrote an article, “It’s too hard to publish criticisms and obtain data for replication,” giving two examples of the struggles that I and others have had in getting journals to admit errors. The problem is that the standards for post-publication review are higher than for pre-publication review. You can find an error that would clearly make a paper ineligible for publication—but if the paper has already appeared, it’s not enough to point out the error; you need to demonstrate “irrefutable proof,” in the words of one of the authors whose work had been questioned. We’ve talked about this a lot: the idea that once a paper is published, it is supposed to have some special truth-status.

Recently David Allison, Andrew Brown, Brandon George, and Kathryn Kaiser published an important article making similar points but focusing on journals’ policies on corrections.

It’s not a pretty sight. Here are some quotes:

In the course of assembling weekly lists of articles in our field, we began noticing more peer-reviewed articles containing what we call substantial or invalidating errors. These involve factual mistakes or veer substantially from clearly accepted procedures in ways that, if corrected, might alter a paper’s conclusions. . . .

After attempting to address more than 25 of these errors with letters to authors or journals, and identifying at least a dozen more, we had to stop — the work took too much of our time. . . .

Too often, the process spiralled through layers of ineffective e-mails among authors, editors and unidentified journal representatives, often without any public statement added to the original article. . . .

Science relies essentially but complacently on self-correction, yet scientific publishing raises severe disincentives against such correction. One publisher states that it will charge the author who initiates withdrawal of a published paper US$10,000.

Journals rarely state whom to contact about potentially invalidating errors. We had to guess whether to send letters to a staff member or editor, formally submit the letter as a manuscript, or contact the authors of a paper directly.

That’s what happened to me with the American Sociological Review. They have no method of publishing corrections. All I could do was submit my correction as its own article. In the review process they did not disagree with my points at all, but they refused to publish the correction on the grounds that they only publish the very best submissions to ASR. Of course, they published the original article with its error, but since the error was not caught in the original round of reviewing, I guess it stands forever as ASR-worthy!

Allison et al. continue:

For one article that we believed contained an invalidating error, our options were to post a comment in an online commenting system or pay a ‘discounted’ submission fee of US$1,716. With another journal from the same publisher, the fee was £1,470 (US$2,100) to publish a letter. Letters from the journal advised that “we are unable to take editorial considerations into account when assessing waiver requests, only the author’s documented ability to pay”.

Wow! I wonder what publisher that was! I’m reminded of that journal “Wiley Interdisciplinary Reviews: Computational Statistics” which was charging people $2800 for that article that Weggy plagiarized from Wikipedia. Who knows, though, maybe a garbled and plagiarized Wikipedia article is worth the price—it does bear the signature of a recipient of the American Statistical Association Founders Award. . . .

Allison et al. conclude:

Scientists who engage in post-publication review often do so out of a sense of duty to their community, but this important work does not come with the same prestige as other scientific endeavours. Recognizing and incentivizing such activities could go a long way to cleaning up the literature.

Our work was not a systematic search; we simply looked more closely at papers that caught our eye and that we were prepared to assess. We do not know the rate of errors or the motivations behind them (that is, whether they are honest mistakes or a ‘sleight of statistics’). But we showed that a small team of investigators with expertise in statistics and experimental design could find dozens of problematic papers while keeping abreast of the literature. Most were detected simply by reading the paper.

No joke. Especially if you include as errors things like basing strong claims on p-values that are subject to the garden of forking paths, or miscalculated p-values that get pushed over to the right side of .05. You’d have to trash some huge chunk of the literature on embodied cognition, and I think literally half the papers that have appeared in the journal Psychological Science in the past five years. (See, for example, slide 16 here).
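
To see how forking paths can push pure noise to the “right” side of .05, here’s a toy simulation (a sketch of my own, not anyone’s actual analysis): two groups with no true difference, an analyst who runs the planned comparison plus a couple of post-hoc subgroup analyses and a nonparametric fallback, and a report of whichever p-value comes out smallest.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    n_sims, n = 10_000, 50
    hits = 0

    for _ in range(n_sims):
        y_a = rng.normal(size=n)        # group A outcomes: pure noise
        y_b = rng.normal(size=n)        # group B outcomes: no true effect
        male_a = rng.random(n) < 0.5    # an arbitrary covariate for group A
        male_b = rng.random(n) < 0.5    # and for group B
        pvals = [
            stats.ttest_ind(y_a, y_b).pvalue,                    # the planned test
            stats.ttest_ind(y_a[male_a], y_b[male_b]).pvalue,    # men-only subgroup
            stats.ttest_ind(y_a[~male_a], y_b[~male_b]).pvalue,  # women-only subgroup
            stats.mannwhitneyu(y_a, y_b).pvalue,                 # "the data looked skewed"
        ]
        if min(pvals) < 0.05:           # report whichever analysis "worked"
            hits += 1

    print(f"realized false-positive rate: {hits / n_sims:.3f}")  # well above 0.05

Each test alone is valid at the 5% level; it’s the data-dependent choice among them that inflates the rate, and nothing here would look like cheating in the published write-up.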

That’s part of the problem right there. If that many papers get retracted, and if every retraction required its own investigation, it would suck up all the resources of the journals for years. And this doesn’t even get into the costs in prestige. If PPNAS retracts the himmicanes paper and the Chinese air pollution paper and all the rest of those noise-chasing social science papers it’s been publishing, if JPSP goes into its archives and retracts all the erroneous papers in its history . . . can you imagine?

So unfortunately I don’t think it’s possible. Reform would be great, post-publication review is great, but I think we just have to give up on retraction. The system is just too big to fail.

The new rules

Let’s just put a bright line down right now. 2016 is year 1. Everything published before 2016 is provisional. Don’t take publication as meaning much of anything, and just cos a paper’s been cited approvingly, that’s not enough either. You have to read each paper on its own. Anything published in 2015 or earlier is part of the “too big to fail” era, it’s potentially a junk bond supported by toxic loans and you shouldn’t rely on it.

P.S. See here for Retraction Watch interview with Allison and Brown, and here for a news article from Retraction Watch’s Ivan Oransky.

22 thoughts on “Too big to fail: Why it’s unrealistic to expect scientific journals to retract their huge backlog of erroneous papers”

  1. And the realistic path to fixing this is what?
    The journal managers will listen to They Who Hold The Purse Strings, which is largely still the U.S. federal government through NSF grant funding, etc., right? Could the NSF specify that publication of results be done in a journal that complies with some NSF-specified standards for making corrections?
    Or is it going to be a mess whether changes are made or not?

    Are there other journals in other fields somewhere that are doing it right?

    • The income comes from journal subscribers (universities, professional societies, companies, etc.) and pay-to-publish fees (the usual route to open access at most of the traditional journals).

      NIH mandated open access to all its funded research. As far as I know, NSF hasn’t done anything like that.

      I think JMLR is an instance of a journal that’s at least doing the open-access thing right. Computational Linguistics has been open access for ages, but their turnaround is very slow and, at least while I was on the editorial board, they had the nasty habit of rating every paper revise-and-resubmit, even after revision and resubmission, leading to a never-ending cascade of re-reviewing.

      I’m pretty much done with journals, but then I never trusted peer review after I was on the reviewing side on editorial boards and grant-review panels.

      • The entire incentive system is broken. Want an academic job? Publish or perish. Just want a PhD? Publish or perish. Not in the Academy? Fork over $35 per paper. Try to write a paper without citing Obscure et al. (2010) or Anonymous Reviewer and Editor’s Own Paper (2015)? Rejected. Have a good idea as someone with a Master’s degree (or less) and data to back it up, but no published paper? You’re nobody. I ain’t got the time to look at your data or hear your argument.

        I’d love to be done with journals, but I find myself as a quantitative person in my chosen field (environmental and ecological science) outside the Academy. Reputation matters more than evidence out here. The difference in aptitude between two Master’s Degrees and a PhD is not measurable, but it is significant to those who don’t understand Statistics. But every time I think about going back, the three years of indentured servitude to a system with such a perverse reward structure just demoralizes me. Shouldn’t have waited ’til my thirties, I suppose.

  2. Papers need to have an afterlife. There is absolutely no excuse for using a format today that is optimised for dissemination methods involving horse carriages. One big obstacle is the publishing model: as long as someone’s supposed to pay for papers, basically everyone will download them from random places in fixed pdf format. Post-publication stuff can work well on dynamic web-based platforms, but people will only look at those if they are free.

  3. Also, published papers are sacred cows now. It’s the ultimate horror for a researcher to discover that there was an error in a paper, especially in an early career stage. But this entanglement of epistemological and existential risk creates a very distorted incentive, and must be replaced by a rational culture of discussion where one can be wrong without being cast out. Eve Marder gets it right: http://lens.elifesciences.org/11628/index.html

  4. “Our efforts revealed invalidating practices that occur repeatedly…we showed that a small team of investigators with expertise in statistics and experimental design could find dozens of problematic papers while keeping abreast of the literature. Most were detected simply by reading the paper.”
    http://statmodeling.stat.columbia.edu/wp-content/uploads/2016/02/Allison_Comment.pdf

    Yes, right now it is absolutely trivial for someone with a tiny bit of skepticism and competence to find the same flaws over and over; the lack of sophistication is one of the most shocking aspects (see the sketch below for one such check). I also want it to stop, but be careful what you wish for here. From a certain perspective the easiest solution to the problem is to just avoid mentioning any “danger” phrase/analysis, without otherwise changing the behavior or thought process.
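
    For instance, here’s a toy check of my own (in the spirit of tools like statcheck, not anything from Allison et al.): recompute the p-value implied by a reported test statistic and see whether it matches the p-value the paper prints. The t(28) = 2.00, “p = .04” example below is hypothetical.

        from scipy import stats

        def check_reported_t(t_stat, df, reported_p, tol=0.001):
            """Recompute the two-sided p-value for a reported t statistic."""
            p = 2 * stats.t.sf(abs(t_stat), df)
            return p, abs(p - reported_p) <= tol

        # Hypothetical report: t(28) = 2.00, "p = .04"
        p, ok = check_reported_t(2.00, 28, 0.04)
        print(f"recomputed p = {p:.3f}, consistent with report: {ok}")
        # recomputed p = 0.055, so the reported ".04" sits on the wrong side of .05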

    • > Anything published in 2015 or earlier is part of the “too big to fail” era, it’s potentially a junk bond supported by toxic loans
      > and you shouldn’t rely on it.

      Unfortunately that is my take unless I have inside information (I know the researchers’ commitment to carefulness and purposefulness) or someone like the FDA has (or even could have) audited the data. Folks seem a bit shocked when I tell them this – but I do believe it is the most sensible take.

  5. In economics the number of submissions to the top 5 journals has doubled since 1990, and the average length of a paper has also doubled. So there are four times as many pages to read. I imagine the number of reviewers did not keep up with that growth. Are other disciplines also seeing this deluge?
    http://www.voxeu.org/article/nine-facts-about-top-journals-economics

    Regarding incentives, how does that interact with the publication requirement for promotion? If editors just reject more out of hand to focus on a smaller number of quality papers, but the 6 in 6 (or whatever) publication requirement doesn’t change then getting promoted becomes essentially impossible. Rather than too big to fail, this sounds more like regulatory capture. The gatekeepers have an incentive to not punish their constituents.

  6. > “Regarding incentives, how does that interact with the publication requirement for promotion?”

    Yes, incentives seem the core of the problem here.

    Academia should be the key quality-control agent for all these professional papers, but practical disincentives blunt that role. Ye olde publish-or-perish compulsion soils the whole process.

    Academics have the expertise, ostensible motivation, and time to review published papers in their field. And as noted above, identifying a large percentage of faulty papers is often just a matter of a knowledgeable person reading the paper. Academics and academic institutions are also the largest direct buyers of professional journals… and as consumers should have a strong quality interest in getting what they pay for. Senior staff at these professional journals generally have an academic background.

    But the publish-or-perish academic culture works against this. A published paper is generally more important to one’s career than the objective quality of that paper. Quantity trumps Quality, and the informal academic culture of peer review develops a high tolerance for non-replicable, marginal-quality papers/studies (especially in the Social Sciences & Bio-Medical fields) – don’t rock this comfy boat by actually enforcing quality. Bemoaning low quality is fine – just don’t do anything serious about it. Incentives matter.

    • “Academics and academic institutions are also the largest direct buyers of professional journals… and as consumers should have a strong quality interest in getting what they pay for. ”

      Maybe academic librarians should be brought into the conversation — they are part of the chain of approval for institutional subscriptions to journals; they might be interested in (at least suggesting) putting quality of reviewing, retractions, etc. on the checklist for approval.

    • From the same page:

      “We may assume, for the sake of argument in deciding this case, that in a capital case a truly persuasive demonstration of “actual innocence” made after trial would render the execution of a defendant unconstitutional, and warrant federal habeas relief if there were no state avenue open to process such a claim. But because of the very disruptive effect that entertaining claims of actual innocence would have on the need for finality in capital cases, and the enormous burden that having to retry cases based on often stale evidence would place on the States, the threshold showing for such an assumed right would necessarily be extraordinarily high. The showing made by petitioner in this case falls far short of any such threshold.
      Petitioner’s newly discovered evidence consists of affidavits.”

      Quoting legal opinions narrowly is pretty hard to do while remaining accurate to the sense of the thing. Lawyers gotta talk.

  7. The “new rule” you propose is the one people should’ve been using since forever, and should continue to use until God decides to reveal His existence and join the editorial board of some journal(s). Then, and only then, should something being published in a peer reviewed journal mean that it is Truth. Because in reality academics are smart people but often will get things wrong (especially while our society incentivizes outrageous claims), and the fact that 1-10 authors + 3 peer reviewers thought something was worthy of publication doesn’t make it true. (Note also that “worthy of publication” and “true” aren’t necessarily the same thing even in a perfect world)
