In our recent discussion of publication bias, a commenter linked to a recent paper, “Star Wars: The Empirics Strike Back,” by Abel Brodeur, Mathias Le, Marc Sangnier, and Yanos Zylberberg, who point to the notorious overrepresentation in scientific publications of p-values that are just below 0.05 (that is, just barely statistically significant at the conventional level) and the corresponding underrepresentation of p-values that are just above the 0.05 cutoff.
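To make the pattern concrete, here is a minimal simulation sketch (not from the paper; the effect sizes and publication probabilities are made-up numbers, purely for illustration): if significant results are simply more likely to be written up and published, the distribution of published p-values jumps at the cutoff.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n_studies = 100_000

# Half the studies test a null effect, half a modest true effect (z-score units).
true_effect = rng.choice([0.0, 2.0], size=n_studies)
z = rng.normal(loc=true_effect, scale=1.0)
p = 2 * (1 - norm.cdf(np.abs(z)))  # two-sided p-values

# File drawer: significant results are far more likely to get published.
prob_publish = np.where(p < 0.05, 0.9, 0.2)
published = p[rng.random(n_studies) < prob_publish]

# The published distribution jumps at the cutoff: compare narrow bands on each side.
print(f"share in (0.04, 0.05): {np.mean((published > 0.04) & (published < 0.05)):.4f}")
print(f"share in (0.05, 0.06): {np.mean((published > 0.05) & (published < 0.06)):.4f}")
```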
Brodeur et al. correctly (in my view) attribute this pattern not just to selection (the much-talked-about “file drawer”) but also to data-contingent analyses (what Simmons, Nelson, and Simonsohn call “p-hacking” and what Loken and I call “the garden of forking paths”). They write:
We have identified a misallocation in the distribution of the test statistics in some of the most respected academic journals in economics. Our analysis suggests that the pattern of this misallocation is consistent with what we dubbed an inflation bias: researchers might be tempted to inflate the value of those almost-rejected tests by choosing a “significant” specification. We have also quantified this inflation bias: among the tests that are marginally significant, 10% to 20% are misreported.
They continue with “These figures are likely to be lower bounds of the true misallocation as we use very conservative collecting and estimating processes”—but I would go much further. One way to put it is that there are (at least) three selection processes going on here:
1. (“the file drawer”) Non-significant results (significant results are traditionally presented in a table with asterisks or “stars,” hence the photo above) are less likely to get published.
2. (“inflation”) Near-significant results get jiggled a bit until they fall into the box (see the simulation sketch after this list).
3. (“the garden of forking paths”) The direction of an analysis is continually adjusted in light of the data.
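Here is a rough sketch of item 2, the inflation story, again with made-up numbers: take the tests that just miss the cutoff and let the researcher try a handful of alternative specifications, each modeled (a simplifying assumption) as a z-statistic correlated with the original. The mass just above 0.05 migrates to just below it.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, mu, rho, n_tries = 50_000, 1.5, 0.8, 5  # all illustrative choices

z = rng.normal(mu, 1.0, size=n)            # original z-statistics
p = 2 * (1 - norm.cdf(np.abs(z)))

p_final = p.copy()
for i in np.flatnonzero((p > 0.05) & (p < 0.15)):  # the "almost rejected" tests
    for _ in range(n_tries):
        # An alternative specification: a new z, correlated with the original.
        z_alt = mu + rho * (z[i] - mu) + np.sqrt(1 - rho**2) * rng.normal()
        p_alt = 2 * (1 - norm.cdf(abs(z_alt)))
        if p_alt < 0.05:                   # first "significant" spec wins
            p_final[i] = p_alt
            break

print(f"mass in (0.04, 0.05) before: {np.mean((p > 0.04) & (p < 0.05)):.4f}")
print(f"mass in (0.04, 0.05) after:  {np.mean((p_final > 0.04) & (p_final < 0.05)):.4f}")
```

Nothing in this toy model requires dishonesty: each alternative specification can look perfectly defensible on its own; the bias comes from stopping when the stars appear.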
Brodeur et al. point out that item 1 doesn’t tell the whole story, and they come up with an analysis (featuring a “lemma” and a “corollary”!) explaining things based on item 2. But I think item 3 is important too.
The point is that the analysis is a moving target. Or, to put it another way, there’s a one-to-many mapping from scientific theories to statistical analyses.
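Even a crude calculation shows why this one-to-many mapping matters. If a single theory can be tested by k analyses (choice of covariates, outcome coding, subgroups, and so forth), and we make the simplifying assumption that the resulting tests are independent, then even with no true effect the chance that at least one comes out “significant” grows quickly:

```python
# P(at least one of k independent tests has p < 0.05) under the null.
for k in (1, 2, 5, 10, 20):
    print(f"{k:2d} forks: P(at least one p < 0.05) = {1 - 0.95**k:.2f}")
```

Real forks are correlated, so the numbers above overstate the effect somewhat, but the qualitative point stands even without any conscious fishing.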
So I’m wary of any general model explaining scientific publication based on a fixed set of findings that are then selected or altered. In many research projects, there is either no baseline analysis or else the final analysis is so far away from the starting point that the concept of a baseline is not so relevant.
Although maybe things are different in certain branches of economics, where people are arguing over an agreed-upon set of research questions.
P.S. I only wish I’d known about these people when I was still in Paris; we could’ve met and talked.