I thought you might be interested in our paper [the paper is by Annie Franco, Neil Malhotra, and Gabor Simonovits, and the link is to a news article by Jeffrey Mervis], forthcoming in Science, about publication bias in the social sciences given your interest and work on research transparency.
Basic summary: We examined studies conducted as part of the Time-sharing Experiments for the Social Sciences (TESS) program, where: (1) we have a known population of conducted studies (some published, some unpublished); and (2) all studies exceed a quality threshold as they go through peer review. We found that having null results made experiments 40 percentage points less likely to be published and 60 percentage points less likely to even be written up.
Here’s a funny bit from the news article: “Stanford political economist Neil Malhotra and two of his graduate students . . .” You know you’ve hit the big time when you’re the only author who gets mentioned in the news story!
More seriously, this is great stuff. I would only suggest that, along with the file drawer, you remember the garden of forking paths. In particular, I’m not so sure about the framing in which an experiment can be characterized as producing “strong results,” “mixed results,” or “null results.” Whether a result is strong or not would seem to depend on how the data are analyzed, and the point of the forking paths is that with a given data set it is possible for noise to appear as a strong result. I gather from the news article that TESS is different in that any given study is focused on a specific hypothesis, but even so I would think there is a bit of flexibility in how the data are analyzed and a fair number of potentially forking paths. For example, the news article mentions “whether voters tend to favor legislators who boast of bringing federal dollars to their districts over those who tout a focus on policy matters.” But of course this could be studied in many different ways.
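To make the forking-paths point concrete, here is a minimal simulation sketch in Python. Everything in it is hypothetical and has nothing to do with the actual TESS data: the sample size, the four arbitrary subgroups, the normal-approximation test, and the function names are all assumptions chosen for illustration. Each simulated experiment has no true effect at all; the only question is how often it looks "significant" if the analyst is free to report either the overall comparison or the best-looking subgroup.

```python
import random
from statistics import NormalDist, mean, variance


def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_study(rng, n=400, n_subgroups=4):
    """Simulate one experiment in which the treatment truly does nothing.

    Returns (p_main, p_best): the p-value of the single overall comparison,
    and the smallest p-value among the overall and subgroup comparisons.
    """
    # Each row: (treated?, hypothetical subgroup label, pure-noise outcome).
    rows = [
        (rng.random() < 0.5, rng.randrange(n_subgroups), rng.gauss(0, 1))
        for _ in range(n)
    ]

    def p_for(subset):
        treated = [y for t, _, y in subset if t]
        control = [y for t, _, y in subset if not t]
        return two_sample_p(treated, control)

    p_main = p_for(rows)
    p_subgroups = [
        p_for([r for r in rows if r[1] == g]) for g in range(n_subgroups)
    ]
    return p_main, min([p_main] + p_subgroups)


def false_positive_rates(n_studies=1000, alpha=0.05, seed=0):
    """Share of null studies called 'significant' under each reporting rule."""
    rng = random.Random(seed)
    results = [simulate_study(rng) for _ in range(n_studies)]
    main_rate = sum(p_main < alpha for p_main, _ in results) / n_studies
    forked_rate = sum(p_best < alpha for _, p_best in results) / n_studies
    return main_rate, forked_rate


if __name__ == "__main__":
    main_rate, forked_rate = false_positive_rates()
    print(f"one pre-specified analysis: {main_rate:.3f}")
    print(f"best of five analyses:      {forked_rate:.3f}")
```

The single pre-specified comparison should come out "significant" in roughly 5% of these null studies, while the best-of-several rule should do so noticeably more often. And this sketch is, if anything, too generous to the analyst, because it searches over all the comparisons explicitly. The forking-paths concern applies even when only one analysis is ever run, as long as the choice of that analysis depended on how the data came out.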
In short, I think this is important work you have done, and I just think that we should go beyond the “file drawer” because I fear that this phrase lends too much credence to the idea that a reported p-value is a legitimate summary of a study.
P.S. There’s also a statistical issue that every study is counted only once, as either a 1 (published) or 0 (unpublished). If Bruno Frey ever gets involved, you’d have to have a system where any result gets a number from 0 to 5, representing the number of different times it’s published.