Richard Morey writes:
On the tail of our previous paper about confidence intervals, showing that researchers tend to misunderstand the inferences one can draw from CIs, we [Morey, Rink Hoekstra, Jeffrey Rouder, Michael Lee, and EJ Wagenmakers] have another paper that we have just submitted which talks about the theory underlying inference by CIs. Our main goal is to elucidate for researchers why many of the things commonly believed about CIs are false, and to show that the theory of CIs does not offer a very compelling theory for inference.
One thing that I [Morey] have noted going back to the classic literature is how clear Neyman seemed about all this. Neyman was under no illusions about what the theory could or could not support. It was later authors who tacked on all kinds of extra interpretations to CIs. I think he would be appalled at how CIs are used.
From their abstract:
The width of confidence intervals is thought to index the precision of an estimate; the parameter values contained within a CI are thought to be more plausible than those outside the interval; and the confidence coefficient of the interval (typically 95%) is thought to index the plausibility that the true parameter is included in the interval. We show in a number of examples that CIs do not necessarily have any of these properties, and generally lead to incoherent inferences. For this reason, we recommend against the use of the method of CIs for inference.
I agree, and I too have been pushing against the idea that confidence intervals resolve the well-known problems with null hypothesis significance testing. I also had some specific thoughts:
For another take on the precision fallacy (the idea that the width of a confidence interval is a measure of the precision of an estimate), see my post, “Why it doesn’t make sense in general to form confidence intervals by inverting hypothesis tests.” See in particular the graph in that post, which I think illustrates the problem very clearly.
Regarding the general issue that confidence intervals are no inferential panacea, see my recent article, “P values and statistical practice,” in which I discuss the problem of taking a confidence interval from a flat prior and using it to make inferences and decisions.
My current favorite (hypothetical) example is an epidemiology study of some small effect where the point estimate of the odds ratio is 3.0 with a 95% confidence interval of [1.1, 8.2]. As a 95% confidence interval, this is fine (assuming the underlying assumptions regarding sampling, causal identification, etc. are valid). But if you slap on a flat prior you get a Bayesian 95% posterior interval of [1.1, 8.2], which will not in general make sense, because real-world odds ratios are much more likely to be near 1.1 than to be near 8.2. In a practical sense, the uniform prior is causing big problems by introducing the possibility of these high values that are not realistic. And taking a confidence interval and treating it as a posterior interval gives problems too. Hence the generic advice to look at confidence intervals rather than p-values does not solve the problem.
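To make the odds-ratio example concrete, here is a rough sketch in Python. It works on the log odds ratio with a normal approximation, backs out the standard error from the reported interval, and compares the flat-prior posterior with the posterior under an informative prior. The informative prior, normal with mean 0 and standard deviation 0.5 on the log odds ratio, is purely my assumption for illustration and not something estimated from data; the function names are likewise made up for this sketch.

```python
import math

Z = 1.96  # approximate 97.5th percentile of the standard normal


def log_scale_summary(lower, upper):
    """Back out the log-odds-ratio estimate and standard error from a 95% interval."""
    est = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * Z)
    return est, se


def posterior_interval(est, se, prior_mean, prior_sd):
    """Normal-normal update on the log odds ratio.

    Returns the central 95% posterior interval on the odds-ratio scale.
    An infinite prior_sd stands in for the flat prior.
    """
    if math.isinf(prior_sd):
        post_mean, post_sd = est, se
    else:
        w_data = 1 / se**2         # precision of the estimate
        w_prior = 1 / prior_sd**2  # precision of the prior
        post_mean = (w_data * est + w_prior * prior_mean) / (w_data + w_prior)
        post_sd = math.sqrt(1 / (w_data + w_prior))
    return math.exp(post_mean - Z * post_sd), math.exp(post_mean + Z * post_sd)


est, se = log_scale_summary(1.1, 8.2)

# Flat prior: the posterior interval simply reproduces the confidence interval.
flat = posterior_interval(est, se, prior_mean=0.0, prior_sd=math.inf)

# Hypothetical informative prior: most odds ratios assumed to lie close to 1.
informative = posterior_interval(est, se, prior_mean=0.0, prior_sd=0.5)

print("flat prior:        [%.2f, %.2f]" % flat)
print("informative prior: [%.2f, %.2f]" % informative)
```

Under the flat prior the interval comes back as the original [1.1, 8.2]. Under the assumed informative prior it is pulled strongly toward 1 and no longer reaches the large values at all. The exact numbers depend entirely on the prior you choose; the point is only how much work the flat prior is quietly doing.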
I think the Morey et al. paper is important in putting all these various ideas together and making clear the unstated assumptions of interval estimation.