Clay pigeon

Sam Harper writes:

Not that you are collecting these kinds of things, but I wanted to point to (yet) another benefit of the American Economic Association’s requirement of including replication datasets (unless there are confidentiality constraints) and code in order to publish in most of their journals—certainly for the top-tier ones like Am Econ Review: correcting coding mistakes!
  1. The Impact of Family Income on Child Achievement: Evidence from the Earned Income Tax Credit: Comment
    Lundstrom, Samuel
    The American Economic Review (ISSN: 0002-8282); Volume 107, No. 2, pp. 623-628; 2017-02-01
  2. The Impact of Family Income on Child Achievement: Evidence from the Earned Income Tax Credit: Reply
    Dahl, Gordon B.; Lochner, Lance
    The American Economic Review (ISSN: 0002-8282); Volume 107, No. 2, pp. 629-631; 2017-02-01
The papers are no doubt gated (I attached them if you are interested), but I thought it was refreshing to see what I consider to be close to a model exchange between the original authors and the replicator: the replicator is able to reproduce nearly everything but finds a serious coding error, corrects it, and generates new (and presumably improved) estimates; the original authors admit they made a coding error without making much of a fuss, and they also generate revised estimates. Post-publication review doing what it should. The tone is also likely more civil because the effort to reproduce largely succeeded and the original authors did not have to eat crow or say that they made a mistake that substantively changed their interpretation (and economists' obsession with statistical significance is still disappointing). Credit to Lundstrom for not trying to over-hype the change in the results.
As an epidemiologist I do feel embarrassed that the biomedical community is still so far behind other disciplines when it comes to taking reproducible science seriously—especially the “high impact” general medical journals. We should not have to take our cues from economists, though perhaps it helps that much of the work they do uses public data.
I haven’t looked into this one but I agree with the general point.

10 thoughts on “Clay pigeon”

  1. It’s interesting that in economics, pure theory is moving towards something like the “data available online” policy of empirical work. People have realized that most people just read the propositions and skip the proofs even in good papers, so increasingly the proofs are put in appendices or online. The paper contains the propositions, but also an explanation for why the proposition is correct and perhaps an outline of how the formal proof works.
    Fortunately, no journal has adopted the policy of letting the authors just publish their theoretical results and letting them keep their proofs secret.

  2. Sociology has refused to require availability of data & code. Some authors will send on request but there’s no requirement that they do so. So very little reproducibility except where data are public, such as GSS or Census. In addition to issues of scientific method, these requirements would be a terrific teaching tool for grad stats.

    • I’ve long thought that quantitative social science graduate programs should have a replication seminar, where students download available replication datasets and then try to replicate the published tables. I did this once as a grad student, and learned a lot from it. Beyond that, the students will end up detecting errors, which is good for all of us.

      • I did a replication project as a graduate student in economics as part of my methods class. Now I have my students do one in my second-year field course, on a paper they choose and I approve. I don’t make them replicate every table and every figure, and I’m not interested in them getting exactly the same numbers back. I tend to have them replicate the key results and then add some sort of twist or analyze the data in a different way, a “replicate and extend” type of exercise. The goal is to teach them how to think about the relationship between data and estimation and variability in estimates, not to comb the original paper for imperfections. Monitoring the literature is just a positive externality of the exercise.

        I also use replication in my problem sets. My graduate students often replicate parts of papers (including my own) as part of homework assignments designed to teach specific lessons about modeling or graphing. And my undergraduates “replicate” the results from a big education experiment using Excel and computing differences in means.

        These kinds of pseudo-replications are quite common in Economics. Our literature also includes papers that replicate old results and show them in relation to the new models they’ve run, or run both models on new data. But all that said, it is still difficult to actually publish totally correct criticisms or corrections of published papers unless the original papers are wrong for reasons that will interest a lot of readers (errr… cite-ers).

  3. These AER papers are available for free as PDFs on the AEA website, along with (I believe) all published AER papers. (Accepted but not yet published papers require AEA membership for access.)
