This came up in a research discussion the other day. Someone had produced some estimates, and there was a question: where are the confidence intervals? I said that if you have replication and you graph the estimates that were produced, then you don’t really need confidence intervals (or, for that matter, p-values). The idea is that the display of the different estimates (produced from different years, or different groups, or different scenarios, or whatever) gives you the relevant scale of variation. In general, this is even better than confidence intervals in that the variation is visually clear and less dependent on assumptions.
What I’m saying is, use the secret weapon.
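To make this concrete, here’s a minimal sketch in Python with made-up numbers: the same model is fit separately to each year’s data, and the per-year estimates are plotted together, so the spread across years itself displays the relevant scale of variation. The years, coefficients, and noise levels here are all hypothetical.

```python
# A minimal sketch of the "secret weapon": fit the same model to each
# year's data and plot the estimates side by side, so the year-to-year
# variation shows the relevant scale of uncertainty.
# The estimates here are simulated; in practice each one would come
# from fitting your model to that year's data.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
years = np.arange(2000, 2016)
# Hypothetical: one coefficient estimate per year, drifting slowly,
# plus sampling noise.
estimates = 0.5 + 0.02 * (years - years[0]) + rng.normal(0, 0.1, len(years))

plt.plot(years, estimates, "o-")
plt.axhline(0, color="gray", linewidth=0.5)
plt.xlabel("year")
plt.ylabel("estimated coefficient")
plt.title("Same model fit separately to each year's data")
plt.show()
```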
I strongly agree.
But an exception to the rule could be a meta-analysis, in particular one based on point estimates extracted from published papers. Then I might want to include the CIs, for a non-traditional reason: if all the CIs had lower bounds close to 0, I would suspect that averaging across the point estimates gives an upwardly biased estimate.
I’m not an expert in the field of meta-analysis. Does anyone know if they have methods that attempt to reduce the upward bias generated by the p = 0.05 filter?
P.S.: I recognize this isn’t really a “counter-example”; I think by replications, you mean no p = 0.05 filter.
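Here’s a quick simulation, with hypothetical numbers, of the upward bias described above: if only estimates that pass the p = 0.05 filter get published, averaging the published point estimates overstates the true effect.

```python
# Simulated studies each estimate a true effect of 0.1 with standard
# error 0.1, but only estimates reaching p < 0.05 get published.
# Averaging the published point estimates then overstates the effect.
import numpy as np

rng = np.random.default_rng(1)
true_effect, se, n_studies = 0.1, 0.1, 10_000

estimates = rng.normal(true_effect, se, n_studies)
z = estimates / se
published = estimates[z > 1.96]  # the p = 0.05 filter (one-sided here)

print(f"mean of all estimates:       {estimates.mean():.3f}")  # ~0.10
print(f"mean of published estimates: {published.mean():.3f}")  # ~0.25, biased upward
print(f"fraction published:          {len(published) / n_studies:.2f}")
```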
Cliff:
Yup. Also let me emphasize that the filter is not just “file drawer” or the selection of what papers to publish; it’s also, and even more importantly, the selection of what to publish within any particular study.
None that actually work, as the filter varies and is largely unknown in most fields’ publications.
Some discussion can be found here: http://www.stat.columbia.edu/~gelman/research/published/GelmanORourkeBiostatistics.pdf
Psychologists have been hard at work to answer that question.
https://osf.io/preprints/psyarxiv/9h3nu
http://escholarship.org/uc/item/2682p4tr#page-1
Does sound like a first step of multilevel modelling (or meta-analysis) rather than a substitute!
Bootstrap is the poor man’s secret weapon
+1
PS +1d this before I checked back on the other thread…
The bootstrap isn’t very accurate if you’re talking about 2-3 replications, which is what I think Andrew is talking about. With 10,000+ replications, sure. In that case, you’re just presenting an estimate of the sampling distribution of the estimator. This is somewhat analogous to presenting the full posterior instead of a credible interval.
I think the key part of Z’s comment is ‘poor man’s’. In the absence of true replications you resort to resampling.
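For illustration, here’s a minimal bootstrap sketch along those lines (the data and the estimator are made up): resample the single observed dataset many times and display the whole distribution of estimates, rather than collapsing it into an interval.

```python
# The "poor man's secret weapon": with only one dataset and no true
# replications, resample it and show the bootstrap estimates directly.
# This presents an estimate of the sampling distribution of the
# estimator, analogous to showing a full posterior instead of a
# credible interval.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
data = rng.exponential(scale=2.0, size=50)  # one observed sample (made up)

n_boot = 10_000
boot_means = np.array([
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(n_boot)
])

plt.hist(boot_means, bins=50)
plt.xlabel("bootstrap estimate of the mean")
plt.title("Bootstrap sampling distribution (10,000 resamples)")
plt.show()
```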