Archive of posts filed under the Decision Theory category.

“A Bias in the Evaluation of Bias: Comparing Randomized Trials with Nonexperimental Studies”

Jessica Franklin writes: Given your interest in post-publication peer review, I thought you might be interested in our recent experience criticizing a paper published in BMJ last year by Hemkens et al. I realized that the method used for the primary analysis was biased, so we published a criticism with mathematical proof of the bias […]

No to inferential thresholds

Harry Crane points us to this new paper, “Why ‘Redefining Statistical Significance’ Will Not Improve Reproducibility and Could Make the Replication Crisis Worse,” and writes: Quick summary: Benjamin et al. claim that the false positive rate (FPR) would improve by factors greater than 2 and replication rates would double under their plan. That analysis ignores the existence and impact […]
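For intuition about where factor-of-two-or-more claims like this come from, here is a minimal back-of-envelope sketch (my own illustration, not Crane's or Benjamin et al.'s exact model) of the false positive risk among "significant" findings, assuming an idealized world with a fixed prior probability pi that a tested effect is real and fixed power:

```python
def fpr(alpha, power=0.8, pi=0.1):
    """Share of 'significant' results that are false positives,
    under the (strong) assumptions that power and the prior
    probability pi of a real effect are held fixed."""
    return alpha * (1 - pi) / (alpha * (1 - pi) + power * pi)

print(fpr(0.05))   # roughly 0.36
print(fpr(0.005))  # roughly 0.05
```

Under these idealized assumptions, tightening the threshold from 0.05 to 0.005 cuts the FPR by far more than a factor of 2; the dispute is over whether such clean calculations describe how research is actually done.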

3 more articles (by others) on statistical aspects of the replication crisis

A bunch of items came in today, all related to the replication crisis: – Valentin Amrhein points us to this fifty-authored paper, “Manipulating the alpha level cannot cure significance testing – comments on Redefine statistical significance,” by Trafimow, Amrhein, et al., who make some points similar to those made by Blake McShane et al. here. […]

What should this student do? His bosses want him to p-hack and they don’t even know it!

Someone writes: I’m currently a PhD student in the social sciences department of a university. I recently got involved with a group of professors working on a project which involved some costly data-collection. None of them have any real statistical prowess, so they came to me to perform their analyses, which I was happy to […]

Statistical Significance and the Dichotomization of Evidence (McShane and Gal’s paper, with discussions by Berry, Briggs, Gelman and Carlin, and Laber and Shedden)

Blake McShane sent along this paper by himself and David Gal, which begins: In light of recent concerns about reproducibility and replicability, the ASA issued a Statement on Statistical Significance and p-values aimed at those who are not primarily statisticians. While the ASA Statement notes that statistical significance and p-values are “commonly misused and misinterpreted,” […]

“Quality control” (rather than “hypothesis testing” or “inference” or “discovery”) as a better metaphor for the statistical processes of science

I’ve been thinking for a while that the default ways in which statisticians think about science—and which scientists think about statistics—are seriously flawed, sometimes even crippling scientific inquiry in some subfields, in the way that bad philosophy can do. Here’s what I think are some of the default modes of thought: – Hypothesis testing, in which […]

The Publicity Factory: How even serious research gets exaggerated by the process of scientific publication and reporting

The starting point is that we’ve seen a lot of talk about frivolous science, headline-bait such as the study that said that married women are more likely to vote for Mitt Romney when ovulating, or the study that said that girl-named hurricanes are more deadly than boy-named hurricanes, and at this point some of these […]

No tradeoff between regularization and discovery

We had a couple of recent discussions regarding questionable claims based on p-values extracted from forking paths, and in both cases (a study “trying large numbers of combinations of otherwise-unused drugs against a large number of untreatable illnesses,” and a salami-slicing exercise looking for public opinion changes in subgroups of the population), I recommended fitting a […]

Freelance orphans: “33 comparisons, 4 are statistically significant: much more than the 1.65 that would be expected by chance alone, so what’s the problem??”

From someone who would prefer to remain anonymous: As you may know, the relatively recent “orphan drug” laws allow (basically) companies that can prove an off-patent drug treats an otherwise untreatable illness, to obtain intellectual property protection for otherwise generic or dead drugs. This has led to a new business of trying large numbers of […]
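The "1.65 expected by chance" in the title is just 33 × 0.05. A quick sketch (my own arithmetic, not the correspondent's) shows that 4 or more significant results out of 33 comparisons is not even all that rare when every null is true, assuming for simplicity that the tests are independent:

```python
import math

n, alpha, k = 33, 0.05, 4
expected = n * alpha  # 33 * 0.05 = 1.65 significant results expected by chance

# Probability of k or more significant results out of n independent tests
# when all null hypotheses are true (binomial tail probability).
p_at_least_k = sum(
    math.comb(n, i) * alpha**i * (1 - alpha)**(n - i) for i in range(k, n + 1)
)
print(round(expected, 2))  # 1.65
print(p_at_least_k)        # roughly 0.08
```

So chance alone produces 4 or more "hits" about 8% of the time, before accounting for any dependence or selection among the 33 comparisons — which is why "4 is much more than 1.65" is not the reassurance it sounds like.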

Mick Cooney: case study on modeling loss curves in insurance with RStan

This is great. Thanks, Mick! All the Stan case studies are here.

When do we want evidence-based change? Not “after peer review”

Jonathan Falk sent me the above image in an email with subject line, “If this isn’t the picture for some future blog entry I’ll never forgive you.” This was a credible threat so here’s the post. But I don’t agree with that placard at all! Waiting for peer review is a bad idea for two […]

Workshop on Interpretable Machine Learning

Andrew Gordon Wilson sends along this conference announcement: NIPS 2017 Symposium Interpretable Machine Learning Long Beach, California, USA December 7, 2017 Call for Papers: We invite researchers to submit their recent work on interpretable machine learning from a wide range of approaches, including (1) methods that are designed to be more interpretable from the start, […]

I respond to E. J.’s response to our response to his comment on our paper responding to his paper

In response to my response and X’s response to his comment on our paper responding to his paper, E. J. writes: Empirical claims often concern the presence of a phenomenon. In such situations, any reasonable skeptic will remain unconvinced when the data fail to discredit the point-null. . . . When your goal is to […]

I disagree with Tyler Cowen regarding a so-called lack of Bayesianism in religious belief

Tyler Cowen writes: I am frustrated by the lack of Bayesianism in most of the religious belief I observe. I’ve never met a believer who asserted: “I’m really not sure here. But I think Lutheranism is true with p = .018, and the next strongest contender comes in only at .014, so call me Lutheran.” […]

I’m not on twitter

This blog auto-posts. But I’m not on twitter. You can tweet at me all you want; I won’t hear it (unless someone happens to tell me about it). So if there’s anything buggin ya, put it in a blog comment.

Should we worry about rigged priors? A long discussion.

Today’s discussion starts with Stuart Buck, who came across a post by John Cook linking to my post, “Bayesian statistics: What’s it all about?”. Cook wrote about the benefit of prior distributions in making assumptions explicit. Buck shared Cook’s post with Jon Baron, who wrote: My concern is that if researchers are systematically too optimistic […]

BREAKING . . . . . . . PNAS updates its slogan!

I’m so happy about this, no joke. Here’s the story. For a while I’ve been getting annoyed by the junk science papers (for example, here, here, and here) that have been published by the Proceedings of the National Academy of Sciences under the editorship of Susan T. Fiske. I’ve taken to calling it PPNAS (“Prestigious proceedings […]

When considering proposals for redefining or abandoning statistical significance, remember that their effects on science will only be indirect!

John Schwenkler organized a discussion on this hot topic, featuring posts by – Dan Benjamin, Jim Berger, Magnus Johannesson, Valen Johnson, Brian Nosek, and E. J. Wagenmakers – Felipe De Brigard – Kenny Easwaran – Andrew Gelman and Blake McShane – Kiley Hamlin – Edouard Machery – Deborah Mayo – “Neuroskeptic” – Michael Strevens – […]

Alan Sokal’s comments on “Abandon Statistical Significance”

The physicist and science critic writes: I just came across your paper “Abandon statistical significance”. I basically agree with your point of view, but I think you could have done more to *distinguish* clearly between several different issues: 1) In most problems in the biomedical and social sciences, the possible hypotheses are parametrized by a […]

2 quick calls

Kevin Lewis asks what I think of these: Study 1: Using footage from body-worn cameras, we analyze the respectfulness of police officer language toward white and black community members during routine traffic stops. We develop computational linguistic methods that extract levels of respect automatically from transcripts, informed by a thin-slicing study of participant ratings of […]