Archive of posts filed under the Decision Theory category.

We need to stop sacrificing women on the altar of deeply mediocre men (ISBA edition)

(This is not Andrew. I would ask you not to speculate in the comments about who S is; this is not a great venue for that.) Kristian Lum just published an essay about her experiences being sexually assaulted at statistics conferences. You should read the whole thing because it’s important, but here’s a sample paragraph. I […]

Always crashing in the same car

“Hey, remember me?  I’ve been busy working like crazy” – Fever Ray I’m at the Banff International Research Station (BIRS) for the week, which is basically a Canadian version of Disneyland where during coffee breaks a Canadian woman with a rake politely walks around telling elk to “shoo”. The topic of this week’s workshop isn’t […]

“How to Assess Internet Cures Without Falling for Dangerous Pseudoscience”

Science writer Julie Rehmeyer discusses her own story: Five years ago, against practically anyone’s better judgment, I knowingly abandoned any semblance of medical evidence to follow the bizarre-sounding health advice of strangers on the internet. The treatment was extreme, expensive, and potentially dangerous. If that sounds like a terrible idea to you, imagine how it […]

Oooh, I hate all talk of false positive, false negative, false discovery, etc.

A correspondent writes: I think this short post on p value, bayes, and false discovery rate contains some misinterpretations. My reply: Oooh, I hate all talk of false positive, false negative, false discovery, etc. I posted this not because I care about someone, somewhere, being “wrong on the internet.” Rather, I just think there’s so […]

What’s the point of a robustness check?

Diomides Mavroyiannis writes: I am currently a doctoral student in economics in France. I’ve been reading your blog for a while, and I have this question that’s bugging me. I often go to seminars where speakers present their statistical evidence for various theses. I was wondering if you could shed light on robustness checks, what is […]

“Five ways to fix statistics”

Nature magazine just published a short feature on statistics and the replication crisis, featuring the following five op-ed-sized bits: Jeff Leek: Adjust for human cognition Blake McShane, Andrew Gelman, David Gal, Christian Robert, and Jennifer Tackett: Abandon statistical significance David Colquhoun: State false-positive risk, too Michele Nuijten: Share analysis plans and results Steven Goodman: Change […]

Poisoning the well with a within-person design? What’s the risk?

I was thinking more about our recommendation that psychology researchers routinely use within-person rather than between-person designs. The quick story is that a within-person design is more statistically efficient because, when you compare measurements within a person, you should get less variation than when you compare different groups. But researchers often use between-person designs out […]
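To make the efficiency argument concrete, here is a minimal simulation sketch (mine, not from the post; every parameter value is an invented assumption) in which the person-level variation cancels out of the within-person comparison but stays in the between-person one:

```python
# A minimal simulation sketch (not from the post): compare the sampling
# variability of the between-person and within-person estimates of the same
# treatment effect. All parameter values here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, effect, sd_person, sd_noise = 100, 0.2, 1.0, 0.5

between_ests, within_ests = [], []
for _ in range(2000):
    # Between-person: each person is measured once, under a single condition,
    # so the person-level variation stays in the comparison.
    control = rng.normal(0.0, sd_person, n) + rng.normal(0.0, sd_noise, n)
    treated = rng.normal(0.0, sd_person, n) + effect + rng.normal(0.0, sd_noise, n)
    between_ests.append(treated.mean() - control.mean())

    # Within-person: each person is measured under both conditions, so the
    # person-level term cancels in the per-person difference.
    person = rng.normal(0.0, sd_person, n)
    pre = person + rng.normal(0.0, sd_noise, n)
    post = person + effect + rng.normal(0.0, sd_noise, n)
    within_ests.append((post - pre).mean())

# With these settings, roughly 0.16 vs. 0.07: the within-person design wins.
print("between-person SE:", np.std(between_ests))
print("within-person SE: ", np.std(within_ests))
```

The gap grows as the person-level variation gets large relative to the measurement noise, which is exactly the "less variation within a person" intuition in the excerpt.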

“A Bias in the Evaluation of Bias Comparing Randomized Trials with Nonexperimental Studies”

Jessica Franklin writes: Given your interest in post-publication peer review, I thought you might be interested in our recent experience criticizing a paper published in BMJ last year by Hemkens et al. I realized that the method used for the primary analysis was biased, so we published a criticism with mathematical proof of the bias […]

No to inferential thresholds

Harry Crane points us to this new paper, “Why ‘Redefining Statistical Significance’ Will Not Improve Reproducibility and Could Make the Replication Crisis Worse,” and writes: Quick summary: Benjamin et al. claim that FPR would improve by factors greater than 2 and replication rates would double under their plan. That analysis ignores the existence and impact […]
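For context, here is a back-of-the-envelope sketch of the textbook false-positive-risk calculation that claims like this revolve around; the prior null fraction and power below are illustrative assumptions, not numbers taken from Benjamin et al. or from Crane:

```python
# A back-of-the-envelope sketch of the textbook false-positive-risk
# calculation: FPR = P(null is true | result is significant), assuming a
# fraction pi0 of tested hypotheses are truly null. The pi0 and power values
# are illustrative assumptions, not numbers from either paper.
def false_positive_risk(alpha: float, power: float, pi0: float) -> float:
    """P(null is true | significant at level alpha)."""
    return pi0 * alpha / (pi0 * alpha + (1 - pi0) * power)

for alpha in (0.05, 0.005):
    print(f"alpha = {alpha}: FPR = {false_positive_risk(alpha, 0.8, 0.5):.3f}")
# Under these assumptions the FPR falls from about 0.059 to 0.006, but the
# calculation leans on pi0 and power being treated as known and fixed.
```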

3 more articles (by others) on statistical aspects of the replication crisis

A bunch of items came in today, all related to the replication crisis: – Valentin Amrhein points us to this fifty-author paper, “Manipulating the alpha level cannot cure significance testing – comments on Redefine statistical significance,” by Trafimow, Amrhein, et al., who make some points similar to those made by Blake McShane et al. here. […]

What should this student do? His bosses want him to p-hack and they don’t even know it!

Someone writes: I’m currently a PhD student in the social sciences department of a university. I recently got involved with a group of professors working on a project which involved some costly data-collection. None of them have any real statistical prowess, so they came to me to perform their analyses, which I was happy to […]

Statistical Significance and the Dichotomization of Evidence (McShane and Gal’s paper, with discussions by Berry, Briggs, Gelman and Carlin, and Laber and Shedden)

Blake McShane sent along this paper by himself and David Gal, which begins: In light of recent concerns about reproducibility and replicability, the ASA issued a Statement on Statistical Significance and p-values aimed at those who are not primarily statisticians. While the ASA Statement notes that statistical significance and p-values are “commonly misused and misinterpreted,” […]

“Quality control” (rather than “hypothesis testing” or “inference” or “discovery”) as a better metaphor for the statistical processes of science

I’ve been thinking for a while that the default ways in which statisticians think about science—and in which scientists think about statistics—are seriously flawed, sometimes even crippling scientific inquiry in some subfields, in the way that bad philosophy can do. Here’s what I think are some of the default modes of thought: – Hypothesis testing, in which […]

The Publicity Factory: How even serious research gets exaggerated by the process of scientific publication and reporting

The starting point is that we’ve seen a lot of talk about frivolous science, headline-bait such as the study that said that married women are more likely to vote for Mitt Romney when ovulating, or the study that said that girl-named hurricanes are more deadly than boy-named hurricanes, and at this point some of these […]

No tradeoff between regularization and discovery

We had a couple recent discussions regarding questionable claims based on p-values extracted from forking paths, and in both cases (a study “trying large numbers of combinations of otherwise-unused drugs against a large number of untreatable illnesses,” and a salami-slicing exercise looking for public opinion changes in subgroups of the population), I recommended fitting a […]
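The excerpt is cut off above, but in the spirit of the title, here is a hedged sketch (mine, not the post’s) of the regularization idea: partially pooling many noisy estimates toward zero shrinks both the noise and the spurious “discoveries” that raw p-value screening would flag. All numbers are invented for the demo:

```python
# A hedged illustration (mine, not the post's) of regularization via partial
# pooling: shrink many noisy subgroup estimates toward zero instead of
# screening them with raw p-values. All numbers are invented for the demo.
import numpy as np

rng = np.random.default_rng(1)
J, tau, sigma = 50, 0.1, 0.5             # subgroups, true-effect sd, noise sd
theta = rng.normal(0.0, tau, J)           # true subgroup effects (mostly tiny)
y = theta + rng.normal(0.0, sigma, J)     # one noisy estimate per subgroup

# Empirical-Bayes shrinkage factor: estimated signal variance over total variance.
tau2_hat = max(np.var(y) - sigma**2, 0.0)
pooled = y * tau2_hat / (tau2_hat + sigma**2)

print("raw estimates flagged at |y| > 1.96*sigma:", int(np.sum(np.abs(y) > 1.96 * sigma)))
print("mean abs error, raw:   ", round(float(np.mean(np.abs(y - theta))), 3))
print("mean abs error, pooled:", round(float(np.mean(np.abs(pooled - theta))), 3))
```

The pooled estimates have a smaller typical error precisely because they are regularized, which is the sense in which there is no tradeoff between regularization and discovery.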

Freelance orphans: “33 comparisons, 4 are statistically significant: much more than the 1.65 that would be expected by chance alone, so what’s the problem??”

From someone who would prefer to remain anonymous: As you may know, the relatively recent “orphan drug” laws allow (basically) companies that can prove an off-patent drug treats an otherwise untreatable illness to obtain intellectual property protection for otherwise generic or dead drugs. This has led to a new business of trying large numbers of […]
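For what it’s worth, the arithmetic quoted in the title is easy to check with the standard library alone; this quick sketch (mine, not the correspondent’s) also shows that 4 significant results out of 33 null comparisons is not even all that surprising:

```python
# Checking the arithmetic in the title (standard library only): with 33
# independent tests at the 0.05 level and no real effects, the expected
# number of "significant" results is 33 * 0.05 = 1.65, and seeing 4 or more
# happens by chance about 8% of the time.
from math import comb

n_tests, alpha = 33, 0.05
print("expected significant by chance:", n_tests * alpha)  # 1.65

# P(X >= 4) for X ~ Binomial(33, 0.05) is 1 - P(X <= 3).
p_le_3 = sum(comb(n_tests, k) * alpha**k * (1 - alpha)**(n_tests - k)
             for k in range(4))
print(f"P(at least 4 significant): {1 - p_le_3:.3f}")
```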

Mick Cooney: case study on modeling loss curves in insurance with RStan

This is great. Thanks, Mick! All the Stan case studies are here.

When do we want evidence-based change? Not “after peer review”

Jonathan Falk sent me the above image in an email with subject line, “If this isn’t the picture for some future blog entry I’ll never forgive you.” This was a credible threat so here’s the post. But I don’t agree with that placard at all! Waiting for peer review is a bad idea for two […]

Workshop on Interpretable Machine Learning

Andrew Gordon Wilson sends along this conference announcement: NIPS 2017 Symposium: Interpretable Machine Learning. Long Beach, California, USA, December 7, 2017. Call for Papers: We invite researchers to submit their recent work on interpretable machine learning from a wide range of approaches, including (1) methods that are designed to be more interpretable from the start, […]

I respond to E. J.’s response to our response to his comment on our paper responding to his paper

In response to my response and X’s response to his comment on our paper responding to his paper, E. J. writes: Empirical claims often concern the presence of a phenomenon. In such situations, any reasonable skeptic will remain unconvinced when the data fail to discredit the point-null. . . . When your goal is to […]