Kent Holsinger sends along this statistics discussion from a climate scientist. I don’t really feel like going into the details on this one, except to note that this appears to be a discussion between two physicists about statistics. The blog in question appears to be pretty influential, with about 70 comments on most of its entries. When it comes to blogging, I suppose it’s good to have strong opinions even (especially?) when you don’t know what you’re talking about.

P.S. Just to look at this from the other direction: I know next to nothing about climate science, but at least I recognize my ignorance! This is perhaps related to Kaiser’s point that statisticians have to be comfortable with uncertainty. In contrast, I could well believe that someone who goes into physics would find complete certainty to be appealing. The laws of physics are pretty authoritative, after all.

P.P.S. Further discussion here. (We seem to have two parallel comment threads. I’d say that we could see how long they need to run before mixing well, but this note has induced between-thread dependence, so the usual convergence diagnostics won’t be appropriate.)

"I know next to nothing about climate science, but at least I recognize my ignorance!"

That's always been my view about the global warming controversy:

To come to an opinion worth publicizing on global warming, first, I'd have to become as expert with statistics as Dr. Gelman.

Then, I'd have to apply my hard-earned technical expertise to the arcane field of climate science.

So, I've largely held my tongue on climate.

In contrast, there are many other fields of controversy, such as education, immigration, human biodiversity and so forth that are so weighted down with political taboos that basic numeracy, an appreciation for Occam's Razor, and a thick skin can make you into the One-Eyed Man in the Kingdom of the Blind.

Tamino isn't exactly a climate scientist; I think he's an astrophysicist who does climate statistics as a hobby.

Andrew,

I guess I'm not following — The author of the textbook in question (the subject of the post) does appear to be a physicist (or at least in a physics department), but the proprietor of that blog is a statistician. Can you elaborate on where he ('Tamino') appears to have gone wrong (as you seem to indicate by "when you don't know what you're talking about") in his reanalysis of the textbook example from the post? Another statistician (Ian Jolliffe) joins in the discussion on the post you link to, and in general discussion on that particular blog is sometimes pleasingly technical, so I'd be interested (as I suspect the other readers of that blog would be) in your perspective.

David MacKay, the physicist mentioned who wrote the book, is actually a pioneer in Bayesian machine learning; he did some early work on Bayesian neural networks using the variational approximation, just slightly before Radford Neal's MCMC approach…

MacKay's reach is pretty broad, but I think he is a competent statistician…

Kevin:

I couldn't figure out from the blog the name of the blogger or his or her profession, and the person who sent me the link described the blogger as a climate scientist, which I assume is a physicist unless I'm informed otherwise. I'm happy to be corrected on this point.

Regarding your question of what's wrong with the analysis on the blog: lots of things, but one tipoff was the statement, based on a p-value, that "it's likely (one could even say “statistically significant”) that the treatment is effective."

What would I do in this example? I'd probably start with a simple Bayes estimate of the form (y2+1)/(n2+2) - (y1+1)/(n1+2), with a standard error of sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2), where p1 and p2 are the adjusted proportions. This sort of procedure also has better frequentist properties than so-called exact inference (see Agresti and Coull, 1998).
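For readers who want to check the arithmetic, here is that estimate in Python (a sketch of my own; the function name `compare_props` is mine, not from any published code):

```python
import math

def compare_props(y1, n1, y2, n2):
    # Add one success and one failure to each arm (the a=1 adjustment),
    # then compare the adjusted proportions.
    p1 = (y1 + 1) / (n1 + 2)
    p2 = (y2 + 1) / (n2 + 2)
    estimate = p2 - p1
    # Standard error based on the adjusted proportions and adjusted n's.
    se = math.sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    return estimate, se

est, se = compare_props(1, 30, 3, 10)
print(round(est, 2), round(se, 2))  # 0.27 0.14
```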

David Rohde: MacKay's book is fine and I agree that he's a competent statistician. My point was really that the blogger I linked to is showing a ridiculously high level of confidence in a context-free discussion. I'm used to seeing this level of presumed certainty in political blogs (Instapundit, Daily Kos, etc.) but it's a bit jarring to see it in a discussion of statistics.

It's a good reminder to professional statisticians such as myself that much (most?) of the discussions in our field are happening in the lab and on the street, as it were, and not in the measured tones of our journal articles.

But I guess I was unfair in picking on physicists. I've met lots of statisticians who are overconfident about statistical methods.

There's now a follow-up post, where Tamino tries to clarify his original example with a second example: four people get Treatment A but none contract the disease, while only one person gets Treatment B (the placebo), and that person subsequently contracts the disease. I think his objection is that, given what MacKay wrote (assume the placebo and treatment don't have equal effects), theta A comes out "significantly" less than theta B; p=.952. Tamino suggests that this is a situation better suited to traditional NHST.
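To check that .952 figure: with uniform Beta(1,1) priors, the posteriors are Beta(1,5) for Treatment A (0 of 4 diseased) and Beta(2,1) for Treatment B (1 of 1), and Pr(theta_A < theta_B) can be estimated by simulation. A minimal Python sketch (mine, not Tamino's or MacKay's code):

```python
import random

random.seed(1)

# Posterior draws under uniform priors: Beta(y+1, n-y+1) for each arm.
n_sims = 200_000
p = sum(
    random.betavariate(0 + 1, 4 + 1) < random.betavariate(1 + 1, 0 + 1)
    for _ in range(n_sims)
) / n_sims
print(round(p, 2))  # 0.95, close to the exact value 20/21 = 0.952...
```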

Ian:

Here's my default Bayesian analysis for the two examples:

<pre>
# simple inference for difference in proportions
compare = function (y.1, n.1, y.2, n.2){
  library (arm)
  a = 1
  n.new.1 = n.1 + 2*a
  p.new.1 = (y.1+a)/n.new.1
  n.new.2 = n.2 + 2*a
  p.new.2 = (y.2+a)/n.new.2
  estimate = p.new.2 - p.new.1
  se = sqrt (p.new.1*(1-p.new.1)/n.new.1 +
    p.new.2*(1-p.new.2)/n.new.2)
  cat ("Estimate (and s.e.): ", fround (estimate, 2), " (",
    fround (se, 2), ")\n", sep="")
}

compare (1, 30, 3, 10)
compare (0, 4, 1, 1)
</pre>

And the results:

<pre>
> compare (1, 30, 3, 10)
Estimate (and s.e.): 0.27 (0.14)
> compare (0, 4, 1, 1)
Estimate (and s.e.): 0.50 (0.31)
</pre>

Make of this what you will. It looks like I'm not disagreeing much with the blogger on the results, though, which makes sense given that the default analysis is adding very little prior information.

I'd also like to point out that this sort of very simple example misses a lot of the point of Bayesian inference, in that it's an unusual case in which the data model (the "likelihood") is uncontroversial. Once you move to linear regression, logistic regression, and so forth, any statistician–Bayesian or not–is going to be making a lot of assumptions, and in this context the prior distribution is just part of the larger model.

Is there something wrong with

> n=100000; a=1; sum(rbeta(n,1+a,29+a)-rbeta(n,3+a, 7+a) < 0)/n

[1] 0.01282

> n=100000; a=1; sum(rbeta(n,1+a,0+a)-rbeta(n,0+a, 4+a) < 0)/n

[1] 0.0478

… which is what MacKay did (I guess)?

Janne:

Yes, this is fine too. (At least until Jouni publishes his paper explaining why we should all be using a=1/3.)

I used the normal approximation to more easily make the connection to classical statistical procedures, also because generally I get more out of the mean and se (or something similar such as the median and 50% interval) than out of the probability the difference exceeds zero. One reason for all this confusion, I think, is that the problem was framed in terms of testing the hypothesis of zero difference rather than in terms of estimation.

_Likelihood taming_ becomes a critical role for priors in more complex models: ensuring intervals that repeatedly work well in practice.

I'll try to plot out some of this, at least in terms of estimation.

K

I think I'll steal this famous quote

"Whenever I hear the word testing, I want to reach for my pistol"

Too bad there would be nothing to shoot at ;-)

It takes a village:

http://tamino.wordpress.com/2010/03/24/bad-bayes-…