I’ve become increasingly uncomfortable with the term “confidence interval,” for several reasons:

– The well-known difficulties in interpretation (officially the confidence statement can be interpreted only on average, but people typically implicitly give the Bayesian interpretation to each case),

– The ambiguity between confidence intervals and predictive intervals. (See the footnote in BDA where we discuss the difference between “inference” and “prediction” in the classical framework.)

– The awkwardness of explaining that confidence intervals are big in noisy situations where you have *less* confidence, and confidence intervals are small when you have *more* confidence.

So here’s my proposal. Let’s use the term “uncertainty interval” instead. The uncertainty interval tells you how much uncertainty you have. That works pretty well, I think.

P.S. As of this writing, “confidence interval” outGoogles “uncertainty interval” by the huge margin of 9.5 million to 54,000. So we have a ways to go.

I don't think there is much positive to be said for "confidence interval" except the fact that it is almost universal usage. But that is quite a big deal.

It is, agreed, a lousy term, not least because it has nothing to do with anyone's confidence in any other sense of the word, which to learners is likely to be as confusing as, or more confusing than, any of the criticisms you list.

But in my view uncertainty interval won't do either. There are so many different intervals that might be calculated to express uncertainty that giving just one that name is as likely to be confusing as any other proposal.

I think we're stuck with explaining that confidence interval is the standard term, even though it's a silly choice of words.

If we reinvented statistical terminology in a rational way, most of it would have to be changed. Significance, regression, consistent, efficient, exact, …: it's easier to list the words that are justifiable!

I have been using the expression "estimates of uncertainty" with my clients for a while, and actually they seem to be more comfortable with it than with "confidence intervals" and "standard errors". The latter in particular gets easily misunderstood by people without at least some training in data analysis. Allegedly, a journalist reporting on one of the projects I am involved in understood "standard errors" as meaning "mistakes that everyone makes"…

Not a fan of "credible intervals" then?

Do you have a page reference for that footnote? I've got the second edition.

Outervals, I believe, were suggested by Tukey to deal with the 3rd issue.

I did find it useful to illustrate the confidence coverage of credible intervals via simulations in which the unknown parameter was randomly generated first from the assumed prior, then from a different prior, and finally from a point prior (a fixed, set value).
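That exercise can be sketched in a few lines. Here is a minimal simulation (model, priors, and all numbers invented for illustration) using a conjugate normal-normal model with known variance, checking how often the 95% posterior interval covers the true parameter under the three scenarios described above:

```python
import random
from statistics import NormalDist

rng = random.Random(1)
n_sims, n, sigma = 20_000, 10, 1.0
m0, s0 = 0.0, 1.0  # the analyst's assumed N(m0, s0) prior
z95 = NormalDist().inv_cdf(0.975)

def coverage(draw_theta):
    """Long-run coverage of 95% posterior intervals when each simulated
    dataset's true parameter comes from draw_theta()."""
    post_prec = 1 / s0**2 + n / sigma**2       # posterior precision
    post_sd = post_prec ** -0.5
    hits = 0
    for _ in range(n_sims):
        theta = draw_theta()
        ybar = rng.gauss(theta, sigma / n**0.5)  # sample mean of n obs
        post_mean = (m0 / s0**2 + n * ybar / sigma**2) / post_prec
        hits += abs(theta - post_mean) <= z95 * post_sd
    return hits / n_sims

print(coverage(lambda: rng.gauss(m0, s0)))    # prior matches: ~0.95
print(coverage(lambda: rng.gauss(3.0, 2.0)))  # mismatched prior: below 0.95
print(coverage(lambda: 2.5))                  # point "prior": a fixed value
```

When the parameter really is drawn from the assumed prior, the 95% credible intervals have exactly 95% frequentist coverage on average; under the mismatched prior or a fixed true value, the shrinkage toward m0 pulls coverage away from the nominal level.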

Think it ties in with multiple perspectives leading to a fuller understanding of thinks (or even Peirce's pragmatic maxim of getting at all the possible implications and suggestions an idea might have on other ideas).

But also, I recently became (re)aware of more formal work on the frequency properties of Bayesian inference, termed "relative surprise inference".

http://www.utstat.utoronto.ca/mikevans/research.h…

And yes "thinks" was a typo but a neat one.

K?

… "estimates of uncertainty" seems very likely to get mixed up with "estimated standard errors" – leading to people getting things wrong by a factor of 4 or so.

What about "margin of error"? It's widely used already.

To be more confident/certain about covering the truth, we use wider intervals.

But we can also be more confident/certain about what the truth is as sample size increases, and then we use narrower intervals.

More:wider, and more:narrower; either way it's a mess. Use of "credible" seems equally confusing.
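The two directions pull apart numerically too. A small sketch (normal model with known sigma; all numbers made up) of the half-width of a central interval: raising the confidence level widens it, while raising the sample size narrows it.

```python
import math
from statistics import NormalDist

def interval_half_width(level, sigma, n):
    """Half-width of a central `level` interval for a normal mean
    with known sigma and sample size n."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sigma / math.sqrt(n)

# more confidence -> wider, at fixed n
print(interval_half_width(0.95, 1, 100))  # ~0.196
print(interval_half_width(0.50, 1, 100))  # ~0.067
# more data -> narrower, at fixed level
print(interval_half_width(0.95, 1, 400))  # ~0.098
```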

Nick:

I agree that it's hard to change common usage. In my books, I've tended to just use the word "interval" (as in, "95% interval" or "50% interval") without including the confusing word "confidence." I've become more aware of this issue when teaching introductory statistics. "Confidence interval" is the technical term but it's confusing.

I do think that with effort it can be possible to change people's terminology for the better. For example, I chose the term "Bayesian data analysis" because it is more general than "inference" (the phrase "data analysis" encompasses Step 1 (model building) and Step 2 (model checking) as well as Step 3 (inference)) and more precise than "statistics" (which also includes design and decision analysis, neither of which we discussed much in our book). Similarly, I've pushed to use the term "outcome variable" rather than "dependent variable." And other examples of varying success. "Bayesian data analysis" has really caught on, though. I've seen it used as a generic term without reference to our book.

Bill: p.411. Also see the example on pp. 248-249.

Anon: "Margin of error" is ok except that some people use it to mean +/- 1 s.e. and others use it to mean +/- 2 s.e.
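That ambiguity is easy to make concrete with a made-up poll: the same survey yields two different "margins of error" depending on which convention the reporter uses.

```python
# hypothetical poll: 520 of 1000 respondents favor a candidate
p_hat, n = 0.52, 1000
se = (p_hat * (1 - p_hat) / n) ** 0.5  # standard error of the proportion

print(round(se, 3))      # "margin of error" as +/- 1 s.e.: ~0.016
print(round(2 * se, 3))  # "margin of error" as +/- 2 s.e.: ~0.032
```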

"Precision interval"? More, well, precise description of what is being quantified than confidence or uncertainty. And much more likely to convey an accurate conceptual understanding to smart people who have only modest stats literacy

Andrew: Naturally I agree that terminology is changeable and admire your optimism on this front. Tukey managed it with box plot. Who now even remembers any of the pre-Tukey terms such as dispersion diagram or range bar?

More importantly, either of response or outcome is a big improvement on dependent variable (or, horrors, DV).

Just "95% interval," or whatever, is an interesting idea. I think teachers still need a kind of footnote or aside that confidence interval is very common usage. Using what a teacher thinks are the best terms is the lesser concern. Unless students' experience with statistics ends with the end of the course, what the rest of the world says cannot be ignored.

Do agree with Nick but this was why Cobol was taught for so many years.

And SAS still!

Multiple comparative perspectives likely is the high road.

As an aside, I do remember being happy about enabling some researchers to easily simulate power (using SAS at the time) in their proposed trials. It ended when they pointed out they absolutely needed to use formulas for granting agencies and ethics boards.

K?

Terminology is one thing. However, there seem to be many ways to interpret these intervals. I often see coefficients whose 95% interval includes zero but whose 90% interval does not treated as more likely to have an effect (positive or negative, depending on the sign of the coefficient) than coefficients whose 85% interval includes zero but whose 80% interval does not. I have my doubts, though, that this is a correct interpretation.
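For what it's worth, under a normal approximation that kind of nested-interval statement is just a p-value in disguise: "the 95% interval includes zero but the 90% does not" means the two-sided p-value lies between .05 and .10, nothing more. A small sketch (function name and numbers invented for illustration):

```python
from statistics import NormalDist

def narrowest_level_excluding_zero(est, se):
    """The confidence level at which a central normal-approximation
    interval for the coefficient just touches zero; any wider (higher-
    level) interval includes zero, any narrower one excludes it."""
    p = 2 * (1 - NormalDist().cdf(abs(est) / se))  # two-sided p-value
    return 1 - p

# 95% interval includes zero, 90% does not:
print(narrowest_level_excluding_zero(1.8, 1.0))  # ~0.93, i.e. p ~ 0.07
# 85% interval includes zero, 80% does not:
print(narrowest_level_excluding_zero(1.4, 1.0))  # ~0.84, i.e. p ~ 0.16
```

So the comparison ranks coefficients by p-value; whether that says anything about which effects are "more likely" real is exactly the interpretive question at issue.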

Gustav: correct interpretation of an interval is an insurmountable opportunity.

A particular paper from the link I gave above that tries to do this might provide a good sense of the challenges involved: http://www.utstat.utoronto.ca/mikevans/papers/opt…

Credible intervals might seem less challenging to interpret than confidence intervals, but recall that conditional probabilities do not imply but rather just suggest (various) credible intervals.

On the other hand what is even exactly meant by a confidence interval – is in itself – an insurmountable opportunity.

(Like that paper by Tweedy? that obtained what was called a confidence interval for the mean from a single observation from a Normal with both mean and variance unknown.)

K?

I'm with Nick Cox@1 (I think) that this is not so obviously an increase in clarity.

But if you want to help the world, please throw your authority and influence against a more harmful foe: "statistically significant". "Statistically discernible" has its flaws, but it is so vastly more accurate and useful in every possible context, and should not upset any _intellectually legitimate_ opponents of change too much, that it's a no-brainer. If your (statistician) colleagues cannot get behind this, what is the real value of arguing for an even more quixotic change in terminology?

It's axiomatic that certainty is unattainable in science and in reasoning. Thus, I like Andrew's suggestion of using "uncertainty interval" because it is faithful to the epistemological underpinnings. But, old customs die hard!