Why isn’t replication required before publication in top journals?

Gabriel Power asks the above question, writing:

I don’t recall seeing, on your blog or elsewhere, this question raised directly. Of course there is much talk about the importance of replication, mostly by statisticians, and economists are grudgingly following suit with top journals requiring datasets and code.

But why not make it a simple requirement? No replication, no publication.

I suppose that it would be too time-consuming (many reviewers shirk even that basic duty) and that there is a risk of theft of intellectual property.

My reply: In this context, “replication” can mean two things. The first meaning is that the authors supply enough information that the exact analysis can be replicated (this information would include raw data (suitably anonymized if necessary), survey forms, data collection protocols, computer programs and scripts, etc.). Some journals already do require this; for example, we had to do it for our paper in the Quarterly Journal of Political Science. The second meaning of “replication” is that the authors would actually have to replicate their study, ideally with a preregistered design, as in the “50 shades of gray” paper. This second sort of replication is great when it can be done, but it’s not in general so easy in fields such as political science or economics where we work with historical data.

13 thoughts on “Why isn’t replication required before publication in top journals?”

  1. There is another approach that is occasionally used in finance, which I described in this post: announcing your hypothesis at a conference. In finance, a lot of work is done on predicting the market, and out-of-sample validation of historical back-testing is rarely done (in industry or academia, it seems). So a recent approach has been to publicly announce your investment strategy with preliminary back-testing results at a conference, and then publish a journal paper a few months later, with the market data since the conference serving as a very good validation set. For example, this was done by Leinweber & Sisk (2011), Journal of Portfolio Management.
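    A minimal sketch of that idea, not the Leinweber & Sisk procedure itself: treat everything before the public announcement date as in-sample back-testing and everything after it as a genuine out-of-sample validation set. The file name, column names, and the toy trading rule below are all hypothetical placeholders.

    ```python
    # Hypothetical illustration: pre-announcement data is the in-sample back-test,
    # post-announcement data is the out-of-sample check. Nothing here is from the
    # cited paper; "daily_returns.csv", the "date"/"ret" columns, and the
    # moving-average rule are placeholders.
    import pandas as pd

    returns = pd.read_csv("daily_returns.csv", parse_dates=["date"])
    announcement = pd.Timestamp("2010-01-15")  # date the strategy was presented

    insample = returns[returns["date"] <= announcement]
    outsample = returns[returns["date"] > announcement]

    def strategy_signal(df: pd.DataFrame) -> pd.Series:
        # Toy rule: hold the asset whenever the trailing 20-day mean return is positive.
        return (df["ret"].rolling(20).mean() > 0).astype(int)

    def total_return(df: pd.DataFrame) -> float:
        # Apply yesterday's signal to today's return and sum up.
        return float((strategy_signal(df).shift(1) * df["ret"]).sum())

    print("in-sample (pre-announcement) return:    ", total_return(insample))
    print("out-of-sample (post-announcement) return:", total_return(outsample))
    ```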

  2. The hard part is deciding how “raw” is raw enough. In the studies I work on, the data change hands many times, and many layers of pre-processing result in the data I receive. If I provide the version of the data I got, a well-documented set of programs I used to clean, recode, and restructure the data, and a detailed description of results, is this enough? Who knows how badly someone screwed up the data before it ever landed on my desk?

    • If I were reading a report where the lead author says “who knows how badly someone screwed up the data before it ever landed on my desk?”, I would be worried about the reliability of what is being reported. As a researcher you often have to trust what others have done with the data before it gets to you, but the report is about the entire study, not just what you have done.

      • To be clear, I’m usually not the lead author on these papers, and I always disclose all the ingredients I contributed. I’m just wondering how much this would improve the replicability of research when the worst errors tend to get made long before the data is made available to the statistician.

    • Let’s at least get started. The status quo is so opaque that any degree of “raw” is bound to be an improvement.

      This is like hesitating to patch up highway potholes just because we cannot agree on a definition of flatness.

  3. Leave aside poli sci or econ, but what if this were made mandatory in, say, chemistry? It might not be too bad, I think.

    I’d call it the Journal of More Reliable Research.

  4. Some research areas do have a replication requirement. For example, for studies that claim to find associations between human traits and genetic variations, many journals, including glamour mags like Science and Nature, require replication in a second (independent) sample.

  5. Even the first type of replication described in this post – release data and programs – isn’t necessarily so easy. I work with historical data, and am 100% in favor of rules requiring release of data and programs wherever possible. However, lots of high-quality historical data is proprietary, and my view of “suitably anonymized” is often not the same as that of the data vendor (or their attorney) who has to sign off, or that of the privacy advocate concerned with de-anonymization. For the statisticians who track this blog, creating off-the-shelf tools for pretty-good-anonymization looks to me like a high social-value activity.
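    For what it’s worth, here is a minimal sketch of what one step of “pretty-good-anonymization” might look like: replace direct identifiers with a keyed hash and coarsen the quasi-identifiers. The file and column names are hypothetical, and this is an illustration, not a vetted privacy guarantee or anything the data vendor’s attorney would necessarily accept.

    ```python
    # Hypothetical illustration: keyed hashing of a direct identifier plus
    # coarsening of quasi-identifiers. Column names ("ssn", "birth_date", "zip")
    # are placeholders; this is not a substitute for a formal privacy review.
    import hashlib
    import hmac
    import pandas as pd

    SECRET_KEY = b"keep-this-key-out-of-the-released-archive"

    def pseudonymize(value: str) -> str:
        # Keyed hash: the same subject always maps to the same ID, but the
        # mapping cannot be reversed without the key.
        return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

    df = pd.read_csv("subjects.csv", dtype=str)                  # hypothetical input
    df["person_id"] = df["ssn"].map(pseudonymize)                # replace direct identifier
    df["birth_year"] = pd.to_datetime(df["birth_date"]).dt.year  # coarsen date of birth
    df["zip3"] = df["zip"].str[:3]                               # coarsen geography
    df = df.drop(columns=["ssn", "birth_date", "zip"])
    df.to_csv("subjects_anonymized.csv", index=False)
    ```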

  6. An important issue, which JG Gardin pointed out when I was a graduate student in his semiotics seminar, is that authors have no business replicating their own work (in the second sense). That needs to be done by a third party at arm’s length. So it’s not ideal, and likely very inefficient, to ask authors to replicate their own work.

    Perhaps what is needed is a structured, cooperative replication effort between universities. Let’s say Harvard and the University of Toronto decided to do this (they are the top two publishers in the world by some measures). Research in need of credibility (most stuff discussed on this blog) carried out at Harvard would be sent off to U of T for credibility upgrading (starting with peer review, the first meaning of replication would be assessed, and if it passes, the second meaning would be addressed). Both universities would likely want auditors to ensure the peer review and replication efforts are adequate.

    Now if there is a perception that this sort of arrangement would give the two universities a competitive advantage, it might actually start to happen.

  7. I’m a molecular biologist who has been reading economics blogs (Brad DeLong, Economist’s View) for the last few years.
    One thing that has really struck me is how a lot of econ papers have relatively small data sets (a few hundred to a few thousand points) and there is no Excel or tab-delimited file available; you can download a PDF of the journal paper, but all you get are highly abstracted statistical values.

    I suspect a lot of these papers are like (1), where the authors made errors that they could have avoided if they had literally just taken pencil and paper and done some doodling on graph paper for an hour or so (see the sketch at the end of this comment)…

    1) http://en.wikipedia.org/wiki/Growth_in_a_Time_of_Debt

    In general, this whole idea of replication is stupid: if it is important, someone will replicate it; if no one bothers, then it probably wasn’t worth paying attention to anyway. (By replicate, I don’t mean just replicate, but do additional work predicated on the earlier study; if the early study is not a good foundation, you will find out quickly enough.)
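    A minimal sketch of the graph-paper point, with a hypothetical file and hypothetical columns (the debt/growth framing just echoes footnote 1): plot the raw observations before trusting any published summary statistic.

    ```python
    # Hypothetical illustration: eyeball the raw numbers before trusting the
    # reported coefficients. "debt_gdp.csv" and its columns are placeholders.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("debt_gdp.csv")

    fig, ax = plt.subplots()
    ax.scatter(df["debt_ratio"], df["growth"], alpha=0.6)
    ax.set_xlabel("debt-to-GDP ratio (%)")
    ax.set_ylabel("real GDP growth (%)")
    ax.set_title("Raw data, one point per country-year")
    plt.show()
    ```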

  8. Many top psychology journals do require replication – the multi-experiment paper is the most common article type in JEP and many other journals. The problem is, perhaps, a perfect storm of conceptual replications, undisclosed researcher degrees of freedom, a low tolerance for messy results, and a focus on surprising, counterintuitive, or exciting results as a criterion for acceptance. These factors vary from journal to journal but seem most pronounced in some subareas of experimental social psychology.

    My intuition is that psychology journals that have fewer of these features have fairly high replicability among papers. Thus most of the JEP stable tend to publish papers with incremental changes in experiments (so experiment 3 or 4 is typically a more tightly controlled variant of experiment 1 rather than a conceptual replication). They are also often rather “dull” papers (at first glance) compared to, say, the social priming studies. The problem in some (by no means all) subareas of experimental social psychology is that they follow the style but not substance of this approach.
