
It’s Too Hard to Publish Criticisms and Obtain Data for Replication

Peter Swan writes:

The problem you allude to in the above reference and in your other papers on ethics is a broad and serious one. My students and I have attempted to replicate a number of top articles in the major finance journals. Either they cannot be replicated due to missing data, or what might appear to be relatively minor improvements in methodology can remove or sometimes reverse the findings. Almost invariably, the journal is reluctant to publish a comment. Thanks to the introduction of a new journal, Critical Finance Review, by Ivo Welch, which insists on the provision of data and code and encourages the original authors to comment further, this poor outlook is improving in the finance discipline.

See for example: Gavin S. Smith and Peter L. Swan, "Do Concentrated Institutional Investors Really Reduce Executive Compensation Whilst Raising Incentives?", CFR 3-1, 49-83.

and the response:

Jay C. Hartzell and Laura T. Starks, Institutional Investors and Executive Compensation Redux: A Comment on “Do Concentrated Institutional Investors Really Reduce Executive Compensation Whilst Raising Incentives”, CFR 3-1, 85-97.

The model of criticism and rebuttal is fine, but it’s disturbing that the people criticized never seem to back down and say they were wrong. I don’t think people should always admit they’re wrong, because sometimes they’re not. But everybody makes mistakes, while the rate of admission of mistakes seems suspiciously low!


  1. Handle says:

    Over at Arnold Kling’s I left a comment to the effect that what is needed is an incentive or liability system that will compel scholars to stand by their results or abandon them.

    • Rahul says:

      What if we made publishing a more adversarial system? I.e., the referee starts with the working assumption that the study is bogus and keeps asking for supporting evidence until convinced that it is not.

      In other words, let at least the methodological parts of the validation happen at the refereeing stage itself. Let's choose referees by asking who would be the best person to replicate this work.

      The status quo of refereeing seems based on too much trust, goodwill & naivete.

      • “The status quo of refereeing seems based on too much trust, goodwill & naivete.”

        Rahul, I don’t know what field you publish in, but this has definitely not been my experience with referees.

        • Rahul says:

          How often have referees asked you for your raw data or samples? Lab Notebooks? Actual working code? Or maybe an interview with the student who ran the experiment / analysis (one way for fishing to come to light)? That’s the level of skepticism that I was thinking of.

          • The next time I review, I will ask for all raw data and code that led to the paper, and see how that changes my decision. Of course it will double or triple my reviewing time, so I can see why a reviewer might not want to ever do that (we get no payment at all).

            But often it’s easy to see from the paper already that monkey business was afoot. A researcher who always does x (e.g., cuts off all data more than 2.5 SDs away from the mean) doesn’t do it in one paper—red flag. There are many such things that can reveal a lot.
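As a hypothetical illustration of the trimming rule mentioned above (the data, threshold, and function name here are my own, not from the thread), here is a minimal sketch of cutting off observations more than 2.5 standard deviations from the mean, and of how much that choice can move an estimate:

```python
import statistics

def trim_outliers(values, k=2.5):
    """Drop observations more than k sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= k * sd]

# A toy sample: 19 ordinary observations plus one extreme value.
sample = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.0, 0.9, 1.1,
          1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.1, 0.9, 1.0, 9.0]

trimmed = trim_outliers(sample, k=2.5)

print(len(sample), len(trimmed))    # the extreme value 9.0 is dropped
print(statistics.mean(sample))      # raw mean, pulled up by the outlier
print(statistics.mean(trimmed))     # trimmed mean, close to 1.0
```

The point of the example is not that trimming is right or wrong, but that applying the rule in one paper and silently dropping it in another changes the estimate substantially, which is exactly the kind of inconsistency a careful reader can spot.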

            • Dan Wright says:

              Last week I wrote to an editor to get more information about the procedure used in a paper I am reviewing. I occasionally do this. Once I got a note from the editor telling me to just review the paper without the information I needed. I did not.

    • Martha says:

      Ioannidis has recently published suggestions for improved research practices and changed incentives and rewards: "How to Make More Published Research True," PLoS Medicine, October 2014.

  2. jonathan says:

    I don’t see why social science should be different from the rest of life. Given this season, it’s been 2000 years and it seems the only signs of God are burn marks in various forms of bread but … In other words, belief is Colbertian, more truthiness than truth, and when you couple that with two other natural desires, wanting renown (and its attendant goodies) and the desire to get away with something, the replication issue is both Colbertian and Sisyphean.

  3. ed says:

    Peter Swan, I want to read more about the failed finance replications. Surely some results replicate successfully?

  4. Jack PQ says:

    However, there is also the opposite problem of “replication trolls”: researchers who troll for datasets to challenge, sometimes dishonestly so, and score a Comment publication off of a famous (and usually correct) paper. The existence of replication trolls makes researchers hesitant to share data more than is necessary.

    • Andrew says:


      I read the Deaton post that you link to, and I do not find it convincing at all. If people are going to criticize, I’d rather have them make the criticism based on the raw data, as much as is possible. I don’t think it’s “trolling” for people to want to see the paper trail, the sequence of steps leading from raw data to final conclusions. The Deaton post seems particularly ridiculous in that he criticizes the critics of Reinhart and Rogoff—but those critics are the ones who discovered the notorious Excel error. The system of official experts didn’t seem to work so well in this case, and I’m glad there were some outsiders who weren’t simply going to accept a published assertion without seeing the evidence.

    • Fernando says:

      People who make this kind of comment typically have never tried to get a replication published.

      For starters, some highly regarded journals like AJPS do not even consider replications for review. (Yes, you read that correctly.)

      However, those journals that do consider them typically want to see blood (sorry, new results). This of course can turn the replicator into an unwilling troll, searching for additional specifications just to please the reviewers.

  5. jrc says:

    Agreed. A few arguments against being against replication trolls:

    1 – The “Voter Fraud” argument – it just doesn’t really happen that much. It is not like the incentives are there for hordes of young academics to go out specification hunting for some way to say “gotcha” to some famous authors.

    2 – The “Cost of Being Angus Deaton” argument – Angus Deaton’s reputation can handle people challenging him. I’m sorry he had to get an extra SS & Med publication out of that experience, but it doesn’t seem terrible for academics with less prestige to challenge the giants of the field (in Deaton’s case, I mean that literally and figuratively).

    3 – The “Science” argument – I mean, Science, right?

    4 – The “Muddy Waters Rocks” argument – One theme that keeps coming up on this blog is the twin statistical pillars of variation and uncertainty, and along with that thinking comes a certain necessary level of intellectual humility or modesty in the researcher. If social scientists have to tone down their rhetoric and implied certainty because they are afraid that something they say will come back to haunt them, that doesn’t seem terrible to me.

    All that said, I think some of us are really interested in ways of displaying the robustness, and lack thereof, of our estimates, and doing away (at least to some degree) with the idea of “preferred estimates” (though emphatically NOT with the ideas of “good” and “bad” estimates). And I think there are some contributions to that coming soon. Because in the end, I think transparency is the best way both to do science and to deal with people who disagree with you.

    • jrc says:

      This was in response to Andrew in response to Jack PQ. My bad.

      • Jack PQ says:

        Great arguments, thanks. My point is to emphasize that there are both costs and benefits to replication, and that the optimal amount of replication isn’t necessarily always “more.” But right now, could we use more replication (and should we find ways to encourage it)? Yes, absolutely.

        If you think of it in a policing sense, potential replication by third parties indeed seems like a more efficient way to encourage accuracy and truthfulness than does full replication of all submitted papers by the reviewers. The latter just isn’t going to happen. It’s hard enough to get a referee report longer than a couple of paragraphs.

  6. Anonymous says:

    This is the problem with a Bayesian approach – a subjective prior yields a subjective posterior :-)

  7. Rahul says:

    Do people count a Comment published on a paper in their citation count?

    • Andrew says:


      One of my most influential papers was published as a comment. The paper was not a comment at all, it was actually a real paper I submitted to the journal. As I remember it, the editor liked the paper but the referees didn’t, so the editor did the favor of getting my paper in the journal by labeling it as a comment on another paper on a similar topic that he was publishing. The paper currently has 1497 citations on Google Scholar but officially it’s listed in the journal as a comment on another article in that journal.

      • Rahul says:

        Interesting. Is it common for editors to overrule referees in this direction? I.e., publish something referees recommend not publishing?

        • Jack PQ says:

          No, it is not common. And even in Prof. Gelman’s case the editor did not overrule the referees, but rather used editorial discretion to publish the paper in a different format.

          In my experience editors always go with the reviewers. Moreover they’d rather reject a good paper than accept a bad one (type I and II errors). However, reviewers do not always agree. Then, it seems editors place more weight on some reviewers’ comments than others. I have heard of only one instance (not mine, but a friend’s) when the editor accepted a paper over the reviewers’ strong objections.
