Multiple comparisons dispute in the tabloids

Yarden Katz writes:

I’m probably not the first to point this out, but just in case, you might be interested in this article by T. Florian Jaeger, Daniel Pontillo, and Peter Graff on a statistical dispute [regarding the claim, “Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion from Africa”].

Seems directly relevant to your article on multiple hypothesis testing and associated talk at the Voodoo correlations meeting. Curious to know your thoughts on this if you think it’s blog-worthy.

Here’s the abstract of the paper:

Atkinson (Reports, 15 April 2011, p. 346) argues that the phonological complexity of languages reflects the loss of phonemic distinctions due to successive founder events during human migration (the serial founder hypothesis). Statistical simulations show that the type I error rate of Atkinson’s analysis is hugely inflated. The data at best support only a weak interpretation of the serial founder hypothesis.

My reaction:

I did not look at either the science or the statistics in detail, so I can’t judge the arguments being made on the two sides, but one thing I would like to comment on, and disagree with, is the implication that the goal of a statistical analysis is to find a correct p-value.

For example, in his response, Quentin Atkinson writes, “What we really want to know, however, is the probability of finding an effect of distance from any origin by chance that is at least as large as the effect we observe in the real data.” I don’t always mind p-values: it can often be useful to check model fit by comparing observed to potentially replicated data, thus giving a sense of whether an observed pattern can easily be explained by chance. But I’d prefer to summarize via inferences on scientifically meaningful parameters of the model, for example the magnitudes of the different founder events (or whatever is a reasonable way to look at these models). Instead of arguing that the p-value is really 0.01 rather than 0.0001, I think it makes more sense to convert these discrepancies into statements about directly interpretable and generalizable parameters.
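To make the quantity in Atkinson's quote concrete, here is a minimal sketch of the multiple comparisons issue under dispute. It is not the analysis from either paper: the data are pure noise on a made-up unit square, and the names `best_origin` and `permutation_p_values` are hypothetical. The idea is that if you search many candidate origins and keep the one with the strongest distance correlation, a p-value that treats that origin as fixed in advance will be too small; comparing against the best origin in each permuted dataset accounts for the search.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def best_origin(dist_table, values):
    """Return (index, |r|) for the origin whose distances correlate most with values."""
    scores = [abs(pearson(d, values)) for d in dist_table]
    i = max(range(len(scores)), key=scores.__getitem__)
    return i, scores[i]


def permutation_p_values(locations, values, origins, n_perm=500, seed=0):
    """Return (naive_p, adjusted_p) from the same set of permutations.

    naive_p treats the best origin as if it had been chosen in advance.
    adjusted_p re-runs the search over all origins in every permutation.
    """
    rng = random.Random(seed)
    dist_table = [[math.dist(o, loc) for loc in locations] for o in origins]
    best, observed = best_origin(dist_table, values)
    naive_hits = adjusted_hits = 0
    shuffled = list(values)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between location and outcome
        if abs(pearson(dist_table[best], shuffled)) >= observed:
            naive_hits += 1
        if best_origin(dist_table, shuffled)[1] >= observed:
            adjusted_hits += 1
    return (naive_hits + 1) / (n_perm + 1), (adjusted_hits + 1) / (n_perm + 1)


# Synthetic data (an assumption for illustration): no true origin effect at all.
rng = random.Random(1)
locations = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(60)]
values = [rng.gauss(0, 1) for _ in range(60)]
origins = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(25)]

naive_p, adjusted_p = permutation_p_values(locations, values, origins)
print(f"naive p = {naive_p:.3f}, search-adjusted p = {adjusted_p:.3f}")
```

Because both p-values use the same shuffles, the adjusted one can never be smaller than the naive one. That still only tells you how surprised to be under pure noise; it says nothing about how large a founder effect is, which is the kind of parameter I'd rather see estimated.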
