Archive of posts filed under the Miscellaneous Statistics category.

Why the garden-of-forking-paths criticism of p-values is not like a famous Borscht Belt comedy bit

People point me to things on the internet that they’re sure I’ll hate. I read one of these a while ago—unfortunately I can’t remember who wrote it or where it appeared, but it raised a criticism, not specifically of me, I believe, but more generally of skeptics such as Uri Simonsohn and myself who keep bringing […]

“Find the best algorithm (program) for your dataset.”

Piero Foscari writes: Maybe you know about this already, but I found it amazingly brutal; while looking for some reproducible research resources I stumbled onto the following at mlcomp.org (which would be nice if done properly, at least as a standardization attempt): Find the best algorithm (program) for your dataset. Upload your dataset and run existing programs on it to […]

Looking at the polls: Time to get down and dirty with the data

Poll aggregation is great, but one thing that we’ve been saying a lot recently (see also here) is that we can also learn a lot by breaking open a survey and looking at the numbers crawling around inside. Here’s a new example. It comes from Alan Abramowitz, who writes: Very strange results of new ABC/WP […]

No statistically significant differences for get up and go

Anoop Balachandran writes:

Multicollinearity causing risk and uncertainty

Alexia Gaudeul writes: Maybe you will find this interesting / amusing / frightening, but the Journal of Risk and Uncertainty recently published a paper with a rather obvious multicollinearity problem. The issue does not come up that often in the published literature, so I thought you might find it interesting for your blog. The paper […]
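The issue is easy to reproduce in a toy simulation. Here is a minimal sketch—not taken from the paper in question; the variable names, sample size, and noise scale are made-up illustrations—of how two nearly collinear predictors can give a fine overall fit while leaving the individual coefficients essentially uninterpretable:

# Minimal simulation of the multicollinearity problem: two predictors that are
# nearly copies of each other. All numbers are illustrative, not from the paper.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # x2 is almost identical to x1
y = 1.0 + 2.0 * x1 + rng.normal(size=n)    # true model depends only on x1

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

# The overall fit looks great, but the individual slopes are wildly uncertain:
print(fit.rsquared)
print(fit.params)   # estimates for const, x1, x2
print(fit.bse)      # standard errors: huge for x1 and x2 separately

The model predicts fine; it just can’t tell you which of the two near-duplicate predictors is doing the work, which is exactly the trap when the coefficients themselves are what get interpreted.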

Hey, PPNAS . . . this one is the fish that got away.

Uri Simonsohn just turned down the chance to publish a paper that could’ve been published in a top journal (a couple years ago I’d’ve said Psychological Science but recently they’ve somewhat cleaned up their act, so let’s say PPNAS which seems to be still going strong) followed by features in NPR, major newspapers, BoingBoing, and […]

Pro Publica Surgeon Scorecard Update

Adan Becerra writes: In light of your previous discussions on the ProPublica surgeon scorecard, I was hoping to hear your thoughts about this article recently published in Annals of Surgery titled, “Evaluation of the ProPublica Surgeon Scorecard ‘Adjusted Complication Rate’ Measure Specifications.” The article is by K. Ban, M. Cohen, C. Ko, M. Friedberg, J. […]

Bayesian Statistics Then and Now

I happened to recently reread this article of mine from 2010, and I absolutely love it. I don’t think it’s been read by many people—it was published as one of three discussions of an article by Brad Efron in Statistical Science—so I wanted to share it with you again here. This is the article where […]

Hypothesis Testing is a Bad Idea (my talk at Warwick, England, 2pm Thurs 15 Sept)

This is the conference, and here’s my talk (will do Google hangout, just as with my recent talks in Bern, Strasbourg, etc): Hypothesis Testing is a Bad Idea Through a series of examples, we consider problems with classical hypothesis testing, whether performed using classical p-values or confidence intervals, Bayes factors, or Bayesian inference using noninformative […]

It’s not about normality, it’s all about reality

This is just a repost, with a snazzy and appropriate title, of our discussion from a few years ago on the assumptions of linear regression, from section 3.6 of my book with Jennifer. In decreasing order of importance, these assumptions are: 1. Validity. Most importantly, the data you are analyzing should map to the research […]

Publication bias occurs within as well as between projects

Kent Holsinger points to this post by Kevin Drum entitled, “Publication Bias Is Boring. You Should Care About It Anyway,” and writes: I am an evolutionary biologist, not a psychologist, but this article describes a disturbing scenario concerning oxytocin research that seems plausible. It is also relevant to the reproducibility/publishing issues you have been discussing […]

Better to just not see the sausage get made

Mike Carniello writes: This article in the NYT leads to the full text, in which these statements are buried (no pun intended): What is the probability that two given texts were written by the same author? This was achieved by posing an alternative null hypothesis H0 (“both texts were written by the same author”) and […]

A day in the life

I like to post approx one item per day on this blog, so when multiple things come up in the same day, I worry about the sustainability of all this. I suppose I could up the posting rate to 2 a day but I think that could be too much of a burden on the […]

One more thing you don’t have to worry about

Baruch Eitam writes: So I have been convinced by the futility of NHT for my scientific goals and by the futility of significance testing (in the sense of using p-values as a measure of the strength of evidence against the null). So convinced that I have been teaching this for the last 2 years. […]

Kaiser Fung on the ethics of data analysis

Kaiser gave a presentation and he’s sharing the slides with us here. It’s important stuff.

The history of characterizing groups of people by their averages

Andrea Panizza writes: I stumbled across this article on the End of Average. I didn’t know about Todd Rose, thus I had a look at his Wikipedia entry: Rose is a leading figure in the science of the individual, an interdisciplinary field that draws upon new scientific and mathematical findings that demonstrate that it is not […]

Will youths who swill Red Bull become adult cocaine addicts?

The above is the question asked to me by Michael Stutzer, who writes: I have attached an increasingly influential paper [“Effects of Adolescent Caffeine Consumption on Cocaine Sensitivity,” by Casey O’Neill, Sophia Levis, Drew Schreiner, Jose Amat, Steven Maier, and Ryan Bachtell] purporting to show the effects of caffeine use in adolescents (well, lab rats […]

Documented forking paths in the Competitive Reaction Time Task

Baruch Eitam writes: This is some luscious garden of forking paths. Indeed. Here’s what Malte Elson writes at the linked website: The Competitive Reaction Time Task, sometimes also called the Taylor Aggression Paradigm (TAP), is one of the most commonly used tests to purportedly measure aggressive behavior in a laboratory environment. . . . While […]
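To see why the flexibility matters, here is an illustrative simulation. The scoring rules below are stand-ins for, not reproductions of, the many quantification strategies Elson catalogs: the same null data (no true group difference), scored several plausible ways, gives a different p-value each time.

# Forking-paths sketch for a CRTT-like design: simulated noise-blast intensity
# and duration with NO true group difference, summarized under several
# plausible (made-up) scoring rules. Each rule yields its own p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_per_group, n_trials = 40, 25
# intensity on a 1-10 scale, duration in ms; identical distributions in both groups
intensity = rng.integers(1, 11, size=(2, n_per_group, n_trials))
duration = rng.gamma(shape=2.0, scale=500.0, size=(2, n_per_group, n_trials))

scores = {
    "mean intensity": intensity.mean(axis=2),
    "mean duration": duration.mean(axis=2),
    "first-trial intensity": intensity[:, :, 0],
    "intensity x log duration": (intensity * np.log(duration)).mean(axis=2),
    "count of high-intensity trials": (intensity >= 8).sum(axis=2),
}

for name, s in scores.items():
    t, p = stats.ttest_ind(s[0], s[1])
    print(f"{name:32s} p = {p:.3f}")

With dozens of documented scoring variants rather than five, the chance that at least one comparison lands below 0.05 on pure noise gets uncomfortably high.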

The p-value is a random variable

Sam Behseta sends along this paper by Laura Lazzeroni, Ying Lu, and Ilana Belitskaya-Lévy, who write: P values from identical experiments can differ greatly in a way that is surprising to many. The failure to appreciate this wide variability can lead researchers to expect, without adequate justification, that statistically significant findings will be replicated, only […]
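Their point is easy to see in a quick simulation. The sketch below (sample size and effect size are illustrative choices, not taken from the paper) repeats the identical two-group experiment many times and looks at the spread of the resulting p-values:

# Repeat the *same* experiment many times and watch the p-value bounce around.
# The effect size and sample size here are illustrative, not from the paper.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, true_effect, n_reps = 30, 0.5, 10_000   # roughly 50% power per experiment

pvals = np.empty(n_reps)
for i in range(n_reps):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(true_effect, 1.0, n)
    pvals[i] = stats.ttest_ind(a, b).pvalue

# Identical experiments, wildly different p-values:
print(np.quantile(pvals, [0.1, 0.25, 0.5, 0.75, 0.9]))
print((pvals < 0.05).mean())   # the "replication rate" is nowhere near 1

A single p = 0.03 tells you very little about what the next identical experiment will show.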

A kangaroo, a feather, and a scale walk into Viktor Beekman’s office

E. J. writes: I enjoyed your kangaroo analogy [see also here—ed.] and so I contacted a talented graphical artist—Viktor Beekman—to draw it. The drawing is on Flickr under a CC license. Thanks, Viktor and E.J.!