## “Statistics Done Wrong”

Govind Manian points me to this online textbook by Alex Reinhart. It’s hard for me to evaluate because I am so close to the material. But on first glance it looks pretty reasonable to me.

I’m sure you’ll be glad to hear that the revised edition I’m preparing, which will be published as a physical book (probably late next year), focuses much more on type M errors, though I call them “truth inflation” because that’s more fun.

I’m also writing about issues like the dichotomization of predictor and outcome variables in multiple regression.

I’m happy to take suggestions from you and your readers. I’m trying to build a book useful to graduate students and statistically-undereducated practitioners in the sciences. If I can convince a few graduate students to get statistical advice before starting their thesis research, I’ll count it as a success.

Is there a PDF version? It would be easier to print, etc.

Various sections of the book are about, or involve, confusions about statistical significance vs. “significance” (you can say the latter is “practical significance,” but really it is “‘significance’ as understood by native English speakers”).

If you really want to help people on that front, one amazingly helpful trick to teach (and I know it’s old, and I wish I knew whom to cite properly) is to mentally rewrite “statistically significant” as “statistically discernible” on each encounter. Nothing is perfect, including this, but IMO it has a 10-to-1 benefit-vs-harm ratio. Yet you don’t do this, and it is fairly rare in other critiques/warnings about statistical significance: what do you think about this? Not so helpful? Some hidden downside I can’t see? IMO it helps people see through a lot of the fog more quickly, and with fairly low risk.

That’s a good idea, although to be consistent I’d probably have to use “discernible” whenever I mean “statistically significant” throughout the book. I worry about using nonstandard terminology, since readers will see “statistically significant” everywhere.

RA Fisher really should have hired JRR Tolkien to invent a new set of terms, without previous (and potentially confusing) English meanings, for statisticians to use to describe their results.

Alex Reinhart’s stuff makes for terrific reading and I can’t wait to quote him, but I do have two comments:

1. I did not find any Bayesian ideas, inasmuch as he seems to concentrate on the foibles of frequentism, especially reliance on the p-value with all its well-known weaknesses and misunderstandings.

2. Many critics of statistical material (such as Alex Reinhart) tend to overlook the common error of researchers who conflate exploration with confirmation. That is, the data may suggest something, but that same data should not be used again for verification. Although textbooks routinely point this out, the conflation remains commonplace enough that it is ignored. One reason the recent fecal transplant study, for example,

http://www.nytimes.com/2013/01/17/health/disgusting-maybe-but-treatment-works-study-finds.html?_r=0

is so highly thought of is that, unlike typical medical studies, it confirmed via a randomized clinical trial what was suspected from many previous unrandomized studies.

1. I have very little experience with practical Bayesian analysis, so I don’t have much to say about it. I may discuss Bayes factors, BIC and hierarchical models in a few places, but I don’t think I can credibly give Bayesian data analysis advice.

2. I’ve recently been writing a chapter about this. I also have a chapter about researcher degrees of freedom which has been influenced by Andrew’s recent preprint “The garden of forking paths.”

Alex,

Regarding your request for comments, I liked your discussion of p-values but found one aspect non-intuitive:

“A p value is … a measure of how surprised you should be if there is no actual difference between the groups, but you got data suggesting there is. A bigger difference, or one backed up by more data, suggests more surprise and a smaller p value.”

“The p value is a measure of surprise …”

In your analogy, p-values are an inverse measure of surprise: the greater the surprise, the smaller the p-value. When I read “X is a measure of Y,” my default tendency is to think of a direct, positive correlation. If you don’t feel it would overcomplicate things, perhaps consider reminding the reader of the inverse connection at each mention?
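To make the inverse relationship concrete, here is a quick sketch (my own illustration, not from the book or the comment above). It uses a normal (z) approximation to the two-sample t-test so it runs on the Python standard library alone; the effect sizes and sample sizes are made-up numbers for illustration:

```python
from math import sqrt
from statistics import NormalDist

def two_sample_p(diff, sd, n):
    """Two-sided p-value for an observed difference of means between
    two groups of size n, using a normal (z) approximation to the
    two-sample t-test."""
    se = sd * sqrt(2.0 / n)              # standard error of the difference
    z = diff / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Bigger observed difference -> more "surprise" -> smaller p:
for diff in (0.2, 0.5, 1.0):
    print(f"difference {diff:.1f}, n=50 per group: p = {two_sample_p(diff, 1.0, 50):.4f}")

# Same difference, more data -> smaller p again:
for n in (20, 50, 200):
    print(f"difference 0.5, n={n:3d} per group: p = {two_sample_p(0.5, 1.0, n):.4f}")
```

Both loops show p shrinking as surprise grows, in exactly the two ways the quoted passage describes: a larger observed difference, or the same difference backed by more data.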

This is the second time I’ve heard this comment, so it has passed from anecdote to anecdata. I’ll have to tweak my explanation.

Looks promising and you are being appropriately grim.

Does seem focussed on “Intensifying the process of being less wrong.” http://andrewgelman.com/2013/12/27/statistics-nobel-prize/#comment-152736

I am a bit surprised that someone doing a PhD in Statistics is interested in how “others poorly cope with statistics.”

(Maybe the times have changed.)

One generic comment is that it is very difficult to know what people without an understanding of statistics will make of what you write. It will make sense to them, or they will find a way of making sense of it for themselves – but without some understanding of how statistics works, it is likely to get them into trouble. For folks with some training in statistics, one of my favourite diagnostic questions is: “If there is no effect, the statistics are very straightforward (e.g. a t-test), and the assumptions are true – what will the distribution of p-values be?” If they get that right, then ask: “What if the comparison was between non-randomised groups and some confounding is surely present?”
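The answer to that first diagnostic question – that when the null is exactly true and the assumptions hold, p-values are uniformly distributed on [0, 1] – can be checked with a short simulation. This sketch is my own, not from the comment; it uses a z approximation to the two-sample t-test so that only the Python standard library is needed:

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

random.seed(1)

def null_p_value(n=30):
    """p-value from a z-approximate two-sample test when the null is
    exactly true: both groups are drawn from the same N(0, 1)."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    se = sqrt(stdev(a) ** 2 / n + stdev(b) ** 2 / n)
    z = (mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

ps = [null_p_value() for _ in range(5000)]

# Under the null, p-values are (approximately) uniform on [0, 1]:
# about 5% fall below 0.05, about half fall below 0.5.
print(f"fraction below 0.05: {sum(p < 0.05 for p in ps) / len(ps):.3f}")
print(f"fraction below 0.50: {sum(p < 0.50 for p in ps) / len(ps):.3f}")
```

Roughly 5% of null p-values land below 0.05 and about half below 0.5, as uniformity predicts. For the second question, the point is that confounding between non-randomised groups shifts this distribution toward zero, so small p-values pile up even with no causal effect.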

Statistics, you sure done wrong.