I agree with Uri Simonsohn that you don’t learn much by looking at the distribution of all the p-values that have appeared in some literature. Uri explains:
Most p-values reported in most papers are irrelevant for the strategic behavior of interest.
Covariates, manipulation checks, main effects in studies testing interactions, etc. Including them we underestimate p-hacking and we overestimate the evidential value of data. Analyzing all p-values asks a different question, a less sensible one. Instead of “Do researchers p-hack what they study?” we ask “Do researchers p-hack everything?”
He demonstrates with an example and summarizes:
Looking at all p-values is falsely reassuring.
I agree and will just add two comments:
1. I prefer the phrase “garden of forking paths” because I think the term “p-hacking” suggests intentionality or even cheating. Indeed, in the quoted passage above, Simonsohn refers to “strategic behavior.” I have no doubt that some strategic behavior and even outright cheating goes on, but I like to emphasize that the garden of forking paths can occur even when a researcher does only one analysis of the data at hand and does not directly “fish” for statistical significance.
The idea is that analyses are contingent on data. Researchers can and do make choices in data coding, data exclusion, and data analysis in light of the data they see, setting various degrees of freedom in reasonable-seeming ways that support their model of the world. They can thus obtain statistical significance at a high rate merely by capitalizing on chance patterns in the data. It’s the forking paths, but it doesn’t feel like “hacking,” nor is it necessarily “strategic behavior” in the usual sense of the term.
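To make the point concrete, here is a minimal simulation sketch of my own; it is not taken from Simonsohn's post or from any particular study. Everything in it is an illustrative assumption: the two-by-two design (a hypothetical subgroup A/B crossed with treatment/control), the unit-variance normal outcomes, the z-test, and above all the specific decision rule. The imagined researcher runs exactly one test per dataset but picks which one after looking at the data: if the two subgroups show differences in the same direction, test the pooled main effect; otherwise, test the subgroup with the larger difference.

```python
import math
import random


def z_test(treated, control):
    """Two-sided z-test for a difference in means, assuming unit variance.

    Returns (p_value, mean_difference).
    """
    diff = sum(treated) / len(treated) - sum(control) / len(control)
    se = math.sqrt(1 / len(treated) + 1 / len(control))
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2)), diff


def simulate(n_sims=4000, n_cell=20, alpha=0.05, seed=1):
    """Return (fixed_rate, forking_rate) of 'significant' results under pure noise."""
    rng = random.Random(seed)
    fixed_hits = 0
    forking_hits = 0
    for _ in range(n_sims):
        # Pure noise: there is no treatment effect in either subgroup.
        cells = {
            (group, arm): [rng.gauss(0, 1) for _ in range(n_cell)]
            for group in ("A", "B")
            for arm in ("treated", "control")
        }
        treated_all = cells[("A", "treated")] + cells[("B", "treated")]
        control_all = cells[("A", "control")] + cells[("B", "control")]

        p_all, _ = z_test(treated_all, control_all)
        p_a, d_a = z_test(cells[("A", "treated")], cells[("A", "control")])
        p_b, d_b = z_test(cells[("B", "treated")], cells[("B", "control")])

        # Analysis chosen in advance: always test the pooled main effect.
        fixed_hits += p_all < alpha

        # Analysis chosen in light of the data: still only ONE reported test.
        if d_a * d_b > 0:
            p_reported = p_all  # "the effect shows up across the board"
        elif abs(d_a) > abs(d_b):
            p_reported = p_a    # "the effect appears only in subgroup A"
        else:
            p_reported = p_b    # "the effect appears only in subgroup B"
        forking_hits += p_reported < alpha

    return fixed_hits / n_sims, forking_hits / n_sims


if __name__ == "__main__":
    fixed_rate, forking_rate = simulate()
    print(f"analysis fixed in advance: {fixed_rate:.3f}")
    print(f"analysis chosen from data: {forking_rate:.3f}")
```

Under these assumptions the fixed analysis should come out near the nominal 5 percent, while the data-contingent one comes out noticeably higher, even though nobody in this story ran more than one test on any given dataset.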
2. If p-values are what we have, it makes sense to learn what we can from them, as in the justly influential work of Uri Simonsohn, Greg Francis, and others. But, looking at the big picture, once we move to the goal of learning about underlying effects, I think we want to be analyzing raw data (and in the context of prior information), not merely pushing these p’s around. P-values are crude data summaries, and a lot of information can be lost by moving from raw data to p-values. Doing science using published p-values is like trying to paint a picture using salad tongs.
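Here is a small sketch of what gets lost, using made-up numbers of my own rather than anything from a real study. It assumes a simple setting (a sample mean with known standard deviation 1, tested against zero with a z-test) and compares two hypothetical studies: a small one with a large estimate and a large one with a tiny estimate.

```python
import math


def summarize(estimate, n, sd=1.0):
    """Return (p_value, ci_low, ci_high) for a mean tested against zero.

    Assumes a known standard deviation, so this is a plain z-test
    with a normal-approximation 95% interval.
    """
    se = sd / math.sqrt(n)
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_value, estimate - 1.96 * se, estimate + 1.96 * se


# Two hypothetical studies; the numbers are invented for illustration.
small_study = summarize(estimate=0.40, n=25)
large_study = summarize(estimate=0.04, n=2500)

if __name__ == "__main__":
    for label, (p, low, high) in [("small", small_study), ("large", large_study)]:
        print(f"{label} study: p = {p:.4f}, 95% interval = ({low:.3f}, {high:.3f})")
```

Both studies sit two standard errors from zero, so they report the same p-value, yet the estimates differ by a factor of ten and the intervals do not even overlap. Someone handed only the p-values could not tell these studies apart.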