
Proposals for alternative review systems for scientific work

I recently became aware of two new entries in the ever-popular genre of Our Peer-Review System Is in Trouble; How Can We Fix It?

Political scientist Brendan Nyhan, commenting on experimental and empirical sciences more generally, focuses on the selection problem that positive rather than negative findings tend to get published, leading via the statistical significance filter to an overestimation of effect sizes. Nyhan recommends that data-collection protocols be published ahead of time, with a commitment to publish the eventual results:

In the case of experimental data, a better practice would be for journals to accept articles before the study was conducted. The article should be written up to the point of the results section, which would then be populated using a pre-specified analysis plan submitted by the author. The journal would then allow for post-hoc analysis and interpretation by the author that would be labeled as such and distinguished from the previously submitted material. By offering such an option, journals would create a positive incentive for preregistration that would avoid file drawer bias. More published articles would have null findings (at least 5%!), but that’s how science is supposed to work.

Quality control could be maintained by “replication audits of a random subset of published articles. At a minimum, these audits would verify that all the results in an article could be replicated. They could conceivably go further in some cases and try to recreate the author’s data and results from publicly available sources, re-run lab experiments, etc. when possible.”

Coming from a completely different direction, theoretical statistician Larry Wasserman (link from Xian) suggests abandoning peer-reviewed journals entirely and replacing them with the Arxiv, a system run by physicists where people can upload their own articles. Arxiv is somewhat restricted (you have to be connected in some way to post an article, and they do enough screening that, for example, they at first refused to publish my zombies paper, and even when they did publish it, they removed George A. Romero from the list of authors), but they don't do anything like refereeing.

As Larry puts it:

The refereeing process is very noisy, time consuming and arbitrary. We should be disseminating our research as widely as possible. Instead, we let two or three referees stand in between our work and the rest of our field. . . . We should think about our field like a marketplace of ideas. Everyone should be free to put their ideas out there. There is no need for referees. Good ideas will get recognized, used and cited. Bad ideas will be ignored. . . .

As Xian points out, there are problems with Larry’s proposal in that it relies so strongly on the Arxiv and on personal websites. Larry’s suggestion of daily scanning of the Arxiv is not so practical—and it would be even less so if his plan kicked in and the Arxiv suddenly started including the tens of thousands of papers outside of math and physics that are submitted to journals daily. Currently, journals serve as a filter for busy researchers and evaluators of research.

I think Larry would respond to this criticism by arguing that a no-journals system would create an incentive for groups of scholars to manage a filtering service. For example, instead of the American Statistical Association running JASA, JEBS, JBES, Technometrics, etc., and maintaining a separate editorial staff for each (representing a huge amount of volunteer service on the part of editors and referees), they could run a set of filtering services. The editors of each filter would be expected to scan the literature and handle submissions (which in this case would be pointers to articles already published on the web). The editorial boards would have the responsibility to come up with monthly (say) recommended reading material. Doing this would require some work, but less than the existing job of producing a journal. The main concern I see would be to keep the editors focused on solid research rather than getting tabloidlike. It would be tempting for an aggregator to give pointers to “pathological science” such as the Bible Code and silly sex-ratio analyses and ESP studies, in order to grab more attention. But as long as these aggregators took their jobs seriously, I’d think they could supplant journals.

Putting all this advice together

Considered separately, both Nyhan’s and Wasserman’s recommendations make sense to me. But it’s striking that they go in opposite directions! Nyhan recommends a more rigorous system, where to publish an article you have to supply data and other replication materials, whereas Wasserman would place no restrictions at all.

I don’t have any answers here; it’s just interesting that two reasonable-sounding reforms of the current system can be so different. I can well imagine someone reading Nyhan’s suggestions and saying Yeah, then reading Wasserman’s suggestions and agreeing to those too, without fully realizing how different their directions are.

15 Comments

  1. Matt R. says:

    Larry’s suggestion of daily scanning of the Arxiv is not so practical

    It’s what hundreds or thousands of physicists like me do every day. Provided that content is broken down into subject areas, it’s not unreasonable to read all the title & author lists and see if something catches your eye. Important things one might have overlooked usually get mentioned in a conversation with a colleague soon enough, anyway.

  2. John Mashey says:

    I think there is an implicit mis-frame, which is that one size fits all.
    In industry, people always think about value-chain analysis to provide a product or service:
    who adds value and how much at each stage of a process. Those who add more value than they cost tend to do well.

    1) Consider a horizontal scale from 0 (blog entries) to 1 (strongly peer-reviewed, strong editing).
    The value chains are different, and yield different products, with plusses and minuses, such as speed versus confidence.

    2) For each discipline, there is probably some optimal weighting of publication across that scale,
    but no obvious reason that one size fits all within a discipline. The filedrawer effect seems more relevant to some than others.

    3) It is not at all clear that the profiles should be the same between disciplines.

  3. K? O'Rourke says:

    Different directions, but in very different domains.

    Nyhan is concerned about claims of empirical outcomes in areas that are next to impossible to replicate (would I get the same result in another 100 patients with end-stage cancer?).

    Larry is concerned about claims of math/deduction (diagrammatic experiments in Peirce’s sense, capable of endless, nearly cost-free replication) or physics experiments that are relatively easy to replicate (or fully documented).

  4. Wayne says:

    I’m a naive outsider on this, but one concern I have with the peer review system is that it allows stifling of dissent from the current orthodoxy. Editors pick articles, reviewers are anonymous and can give positive reviews to their friends and throw roadblocks in front of their competitors/enemies. With so much grant money on the line (not to mention publish-or-perish, gaining tenure, etc), I think this is a real concern.

    How about a solution that’s something like Arxiv Meets Twitter? It would be Arxiv-like but would require all papers to be accompanied by all data, code, etc., necessary to replicate the results. The system would have infrastructure to support this requirement and perhaps some standards for data and recommendations for code. Then, do something like CRAN’s Task Views, Amazon lists, or Twitter: people can make lists of papers they find interesting/impactful. Then readers can vote for/subscribe to these lists, and there would be some kind of authentication process as well, so we could be sure that Andrew Gelman’s list of economics papers really is by you.
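    A toy sketch of the list-and-vote mechanism described above (every class, method, and identifier here is hypothetical, not part of any existing system): curators publish reading lists of paper IDs, readers cast at most one vote per list, and lists are ranked by distinct-voter count.

```python
# Hypothetical sketch of vote-ranked reading lists: curators publish lists
# of paper IDs, readers vote, and lists are ranked by number of voters.
from collections import defaultdict

class ListRegistry:
    def __init__(self):
        self.lists = {}                # (curator, name) -> set of paper IDs
        self.votes = defaultdict(set)  # (curator, name) -> set of voter IDs

    def publish(self, curator, name, paper_ids):
        self.lists[(curator, name)] = set(paper_ids)

    def vote(self, voter, curator, name):
        # Storing voters in a set gives each reader one vote per list,
        # however many times they click.
        self.votes[(curator, name)].add(voter)

    def top_lists(self, n=10):
        # Rank lists by number of distinct voters, most popular first.
        return sorted(self.lists,
                      key=lambda key: len(self.votes[key]),
                      reverse=True)[:n]
```

    The authentication layer suggested above would sit in front of `publish` and `vote`; ranking by raw vote count is of course the crudest possible filter, but it shows where reputation would enter the system.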

    I can see that, say, Andrew’s Economics Feed would gain such prominence that you would wield the power I complain of in my first paragraph. Stay on Gelman’s good side, or he won’t point to your paper and you won’t get that grant from the government guy who just scans Andrew’s Feed every morning and decides who to give money to by lunchtime.

    It’s a lot like the news business: can Twitter/blogs replace the Washington Post, CNN, and other professional news organizations? Probably not, but we’re definitely heading in the direction of crowd-sourced news being one of the pillars of our news cycle, and news organizations are wrestling with where their value-added is.

  5. Gustav says:

    Not sure that these solutions will affect the file drawer problem. Even boring results take time to write up, time that you could instead spend writing up “good” results or performing new studies. Society needs to evaluate scientific success somehow, and the number of citations might be one option in a “journal-free world”; boring results will give you fewer citations.

  6. Tal Yarkoni says:

    Let me toss my own hat into the ring and point to a paper I wrote recently arguing that we should model post-publication evaluation platforms on existing systems implemented in social websites (reddit, stack exchange, etc.). It was a submission for the special topic of Frontiers in Computational Neuroscience that Niko Kriegeskorte’s editing, though I’m not sure if or when it’ll be accepted (ironically, it ran into some problems during the pre-publication review process).

    In general, I fall on the “get rid of conventional journals” side advocated by Wasserman; there’s really no clear purpose traditional peer review serves once you have decent collaborative filtering algorithms, and there’s plenty of empirical evidence showing that peer review doesn’t work very well (it’s certainly better than nothing, but not by very much). If companies like Amazon and Netflix can figure out how to evaluate products collaboratively when billions of dollars are at stake, it stands to reason academics should be able to do the same–though of course we have less money to funnel into such efforts.
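    As one concrete illustration of the kind of collaborative filtering mentioned above (a textbook item-based scheme on made-up ratings, not anything Amazon or Netflix actually runs): predict a reader’s interest in an unread paper from their ratings of other papers, weighted by the cosine similarity between the papers’ rating vectors.

```python
# Toy item-based collaborative filter over hypothetical paper ratings.
import math

ratings = {  # reader -> {paper: rating on a 1-5 scale}
    "alice": {"p1": 5, "p2": 4, "p3": 1},
    "bob":   {"p1": 4, "p2": 5},
    "carol": {"p2": 2, "p3": 5},
}

# Invert to paper -> {reader: rating} so papers can be compared.
papers = {}
for reader, rs in ratings.items():
    for paper, val in rs.items():
        papers.setdefault(paper, {})[reader] = val

def cosine(p, q):
    # Cosine similarity between two papers' rating vectors over readers.
    shared = set(p) & set(q)
    if not shared:
        return 0.0
    num = sum(p[r] * q[r] for r in shared)
    den = (math.sqrt(sum(v * v for v in p.values()))
           * math.sqrt(sum(v * v for v in q.values())))
    return num / den

def predict(reader, paper):
    # Similarity-weighted average of the reader's ratings of other papers.
    pairs = [(cosine(papers[paper], papers[other]), rating)
             for other, rating in ratings[reader].items() if other != paper]
    total = sum(sim for sim, _ in pairs)
    return sum(sim * r for sim, r in pairs) / total if total else 0.0
```

    Real deployments add regularization, implicit signals (downloads, citations), and defenses against vote manipulation, but the core idea scales down to this.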

  7. Chris Said says:

    Let me toss my hat into the ring as well and say that most of these good ideas have been around for decades, and yet we are still stuck with the same dysfunctional system. Scientists cannot solve this problem alone. We need an external force (i.e., the granting agencies) to reward scientists who submit to good-practice journals. As scientists compete to submit to these journals, other journals will adjust their policies.
    http://filedrawer.wordpress.com/2012/04/17/its-the-incentives-structure-people-why-science-reform-must-come-from-the-granting-agencies/

  8. DK says:

    Currently, journals serve as a filter for busy researchers and evaluators of research.

    If that’s true, I don’t know what they are filtering. There are so many journals now that anything, absolutely anything, can be (and does get) published in a “peer-reviewed journal.” And the publishing business is so lucrative that new journals pop up literally daily; I get weekly emails with invitations to submit papers to some new international journal. And I hope that by now no one is seriously defending the validity and value of signaling (Nature – super good, PNAS – good, Biochemistry – probably OK, etc.).

    Larry’s suggestion of daily scanning of the Arxiv is not so practical—and it would be even less so if his plan kicked in and the Arxiv suddenly started including the tens of thousands of papers outside of math and physics that are submitted to journals daily.

    Nothing impractical there. It’s been years since I looked at any journal’s TOC – the search function in Google, Pubmed, and various databases is infinitely more efficient and handles the avalanche of biomedical literature quite nicely. That covers my narrow field, and I rely on various aggregators (blogs included!) to catch up on more general topics/interests. What exactly is the problem?

  9. Alisia Tasso says:

    Is this Nyhan’s idea? Glymour and Kawachi suggested something like this seven years ago:
    http://www.bmj.com/content/331/7517/638.2.short

    And it has been piloted at the Archives:
    http://archinte.ama-assn.org/cgi/content/full/169/11/1022

  10. I agree with DK. I don’t know what journals are filtering or why we need to spend so much time and energy on this “filtering before publication.” Why can’t we simply put everything out there for whoever wants to replicate it and let people decide if the work is good enough? With current search engines, one should not have to waste much time finding the relatively few articles published in one’s narrow area of science.
    We have had some success with this model for biomedical sciences on WebmedCentral but, I have to confess, we are facing some resistance.
    Kamal Mahawar
    CEO, WebmedCentral

  11. Jared says:

    I also peruse arxiv feeds daily (roughly) – title, author and abstract come in via RSS and are easy to manage. If something looks interesting I queue it up to read later. Really it’s a godsend to have so many tech reports and preprints in one place, especially since stat journals have such a long turnaround time.
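    The daily skim described here is easy to script. The sketch below parses a sample feed in arXiv’s Atom format and pulls out titles and authors; the http://export.arxiv.org/api/query endpoint is real, but the sample entry is invented for illustration.

```python
# Sketch of skimming an arXiv-style Atom feed for titles and authors.
# In practice the XML would come from a URL such as
# http://export.arxiv.org/api/query?search_query=cat:stat.ME
# here we parse an inline, made-up sample instead.
import xml.etree.ElementTree as ET

SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>A Hypothetical Paper on Peer Review</title>
    <author><name>A. Author</name></author>
    <author><name>B. Coauthor</name></author>
  </entry>
</feed>"""

NS = {"atom": "http://www.w3.org/2005/Atom"}

def skim(feed_xml):
    # Return a list of (title, [author names]) pairs from an Atom feed.
    root = ET.fromstring(feed_xml)
    out = []
    for entry in root.findall("atom:entry", NS):
        title = entry.findtext("atom:title", namespaces=NS)
        authors = [a.findtext("atom:name", namespaces=NS)
                   for a in entry.findall("atom:author", NS)]
        out.append((title, authors))
    return out
```

    Filtering the resulting pairs by keyword or author is then one more line, which is roughly the “scan the title & author lists” workflow described in the comments above.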

  12. [...] Modeling, Causal Inference and Social Science, Andrew Gelman shared his (contradicting) views on two + two proposals  for alternative peer review [...]

  13. Marcus says:

    Yann LeCun has a proposal for a new publication model that kind of crosses social networking with peer review. See here: http://yann.lecun.com/ex/pamphlets/publishing-models.html

    It’s more elaborate than arxiv but also has a place for the editorial role played by journals and conferences. It’s also one of the more concrete, thorough, and plausible proposals for an alternative that I’ve seen. It doesn’t necessarily reduce the bias against negative results, but it removes the review process’s role in encouraging that bias.

  14. I’ve posted a followup on my blog with discussion of proposals by Said, Glymour and Kawachi, and others: http://www.brendan-nyhan.com/blog/2012/04/more-on-pre-accepted-academic-articles.html

  15. [...] few weeks ago, I came across this. Andrew Gelman has already written many, many, many, many posts on the subject of shoddy/fraudulent analyses and the apparent inability of the [...]