A month ago I (Aki) started a series of tweets about “scientific books which have had big influence on me…”. They are partially in time order, but I can’t remember the exact order. I may have forgotten some, and some stretched the original idea, but I can recommend all of them. I have collected all […]

**Miscellaneous Statistics** category.

## Doomsday! Problems with interpreting a confidence interval when there is no evidence for the assumed sampling model

Mark Brown pointed me to a credulous news article in the Washington Post, “We have a pretty good idea of when humans will go extinct,” which goes: A Princeton University astrophysicist named J. Richard Gott has a surprisingly precise answer to that question . . . to understand how he arrived at it and what […]

## Walter Benjamin on storytelling

After we discussed my paper with Thomas Basbøll, “When do stories work? Evidence and illustration in the social sciences,” Jager Hartman wrote to me: Here is a link to the work by Walter Benjamin I think of when I think of storytelling. He uses storytelling throughout his works and critiques done on his works are […]

## “We continuously increased the number of animals until statistical significance was reached to support our conclusions” . . . I think this is not so bad, actually!

Jordan Anaya pointed me to this post, in which Casper Albers shared this snippet from a recently published article in Nature Communications: The subsequent Twitter discussion is all about “false discovery rate” and statistical significance, which I think completely misses the point. The problems Before I get to why I think the quoted […]

## Early p-hacking investments substantially boost adult publication record

In a post with the title “Overstated findings, published in Science, on long-term health effects of a well-known early childhood program,” Perry Wilson writes: In this paper [“Early Childhood Investments Substantially Boost Adult Health,” by Frances Campbell, Gabriella Conti, James Heckman, Seong Hyeok Moon, Rodrigo Pinto, Elizabeth Pungello, and Yi Pan], published in Science in […]

## Don’t do the Wilcoxon (reprise)

František Bartoš writes: I’ve read your and various others statistical books and from most of them, I gained a perception, that nonparametric tests aren’t very useful and are mostly a relic from pre-computer ages. However, this week I witnessed a discussion about this (in Psych. methods discussion group on FB) and most of the responses […]

## “Statistics: Learning from stories” (my talk in Zurich on Tues 28 Aug)

Statistics: Learning from stories Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University, New York Here is a paradox: In statistics we aim for representative samples and balanced comparisons, but stories are interesting to the extent that they are surprising and atypical. The resolution of the paradox is that stories can be […]

## It’s all about Hurricane Andrew: Do patterns in post-disaster donations demonstrate egotism?

Jim Windle points to this post discussing a paper by Jesse Chandler, Tiffany M. Griffin, and Nicholas Sorensen, “In the ‘I’ of the Storm: Shared Initials Increase Disaster Donations.” I took a quick look and didn’t notice anything clearly wrong with the paper, but there did seem to be some opportunities for forking paths, in […]

## Do Statistical Methods Have an Expiration Date? (my talk noon Mon 16 Apr at the University of Pennsylvania)

Do Statistical Methods Have an Expiration Date? Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University There is a statistical crisis in the human sciences: many celebrated findings have failed to replicate, and careful analysis has revealed that many celebrated research projects were dead on arrival in the sense of never having […]

## Failure of failure to replicate

Dan Kahan tells this story:

## More bad news in the scientific literature: A 3-day study is called “long term,” and nobody even seems to notice the problem. Whassup with that??

Someone pointed me to this article, “The more you play, the more aggressive you become: A long-term experimental study of cumulative violent video game effects on hostile expectations and aggressive behavior,” by Youssef Hasan, Laurent Bègue, Michael Scharkow, and Brad Bushman. My correspondent was suspicious of the error bars in Figure 1. I actually think […]

## Are self-driving cars 33 times more deadly than regular cars?

Paul Kedrosky writes: I’ve been mulling the noise over Uber’s pedestrian death. While there are fewer pedestrian deaths so far from autonomous cars than non-autonomous (one in a few thousand hours, versus 1 every 1.5 hours), there is also, of course, a big difference in rates per passenger-mile. The rate for autonomous cars is now […]

## Lessons learned in Hell

This post is by Phil. It is not by Andrew. I’m halfway through my third year as a consultant, after 25 years at a government research lab, and I just had a miserable five weeks finishing a project. The end product was fine — actually really good — but the process was horrible and I […]

## The purpose of a pilot study is to demonstrate the feasibility of an experiment, *not* to estimate the treatment effect

David Allison sent this along: – Press release from original paper: “The dramatic decrease in BMI, although unexpected in this short time frame, demonstrated that the [Shaping Healthy Choices Program] SHCP was effective . . .” – Comment on paper and call for correction or retraction: “. . . these facts show that the analyses […]

## What We Talk About When We Talk About Bias

Shira Mitchell wrote: I gave a talk today at Mathematica about NHST in low power settings (Type M/S errors). It was fun and the discussion was great. One thing that came up is bias from doing some kind of regularization/shrinkage/partial-pooling versus selection bias (confounding, nonrandom samples, etc). One difference (I think?) is that the first […]

## Gaydar and the fallacy of objective measurement

Greggor Mattson, Dan Simpson, and I wrote this paper, which begins: Recent media coverage of studies about “gaydar,” the supposed ability to detect another’s sexual orientation through visual cues, reveal problems in which the ideals of scientific precision strip the context from intrinsically social phenomena. This fallacy of objective measurement, as we term it, leads […]

## You need 16 times the sample size to estimate an interaction than to estimate a main effect

Yesterday I shared the following exam question: In causal inference, it is often important to study varying treatment effects: for example, a treatment could be more effective for men than for women, or for healthy than for unhealthy patients. Suppose a study is designed to have 80% power to detect a main effect at a […]

## Classical hypothesis testing is really really hard

This one surprised me. I included the following question in an exam: In causal inference, it is often important to study varying treatment effects: for example, a treatment could be more effective for men than for women, or for healthy than for unhealthy patients. Suppose a study is designed to have 80% power to detect […]

## Incorporating Bayes factor into my understanding of scientific information and the replication crisis

I was having this discussion with Dan Kahan, who was arguing that my ideas about type M and type S error, while mathematically correct, represent a bit of a dead end in that, if you want to evaluate statistically-based scientific claims, you’re better off simply using likelihood ratios or Bayes factors. Kahan would like to […]

## Important statistical theory research project! Perfect for the stat grad students (or ambitious undergrads) out there.

Hey kids! Time to think about writing that statistics Ph.D. thesis. It would be great to write something on a cool applied project, but: (a) you might not be connected to a cool applied project, and you typically can’t do these on your own, you need collaborators who know what they’re doing and who care […]