Archive of posts filed under the Miscellaneous Statistics category.

Forking paths vs. six quick regression tips

Bill Harris writes: I know you’re on a blog delay, but I’d like to vote to raise the odds that my question in a comment to http://andrewgelman.com/2015/09/15/even-though-its-published-in-a-top-psychology-journal-she-still-doesnt-believe-it/ gets discussed, in case it’s not in your queue. It’s likely just my simple misunderstanding, but I’ve sensed two bits of contradictory advice in your writing: fit one complete model all at […]

You’ll never guess what I say when I have nothing to say

A reporter writes: I’m a reporter working on a story . . . and I was wondering if you could help me out by taking a quick look at the stats in the paper it’s based on. The paper is about paedophiles being more likely to have minor facial abnormalities, suggesting that paedophilia is a […]

What’s the difference between randomness and uncertainty?

Julia Galef mentioned “meta-uncertainty,” and how to characterize the difference between a 50% credence about a coin flip coming up heads, vs. a 50% credence about something like advanced AI being invented this century. I wrote: Yes, I’ve written about this probability thing. The way to distinguish these two scenarios is to embed each of […]

The Notorious N.H.S.T. presents: Mo P-values Mo Problems

Alain Content writes: I am a psycholinguist who teaches statistics (and also sometimes publishes in Psych Sci). I am writing because as I am preparing for some future lessons, I fall back on a very basic question which has been worrying me for some time, related to the reasoning underlying NHST [null hypothesis significance testing]. […]

The time-reversal heuristic—a new way to think about a published finding that is followed up by a large, preregistered replication (in context of Amy Cuddy’s claims about power pose)

[Note to busy readers: If you’re sick of power pose, there’s still something of general interest in this post; scroll down to the section on the time-reversal heuristic. I really like that idea.] Someone pointed me to this discussion on Facebook in which Amy Cuddy expresses displeasure with my recent criticism (with Kaiser Fung) of […]

2 new reasons not to trust published p-values: You won’t believe what this rogue economist has to say.

Political scientist Anselm Rink points me to this paper by economist Alwyn Young which is entitled, “Channelling Fisher: Randomization Tests and the Statistical Insignificance of Seemingly Significant Experimental Results,” and begins, I [Young] follow R.A. Fisher’s The Design of Experiments, using randomization statistical inference to test the null hypothesis of no treatment effect in a […]

Paxil: What went wrong?

Dale Lehman points us to this news article by Paul Basken on a study by Joanna Le Noury, John Nardo, David Healy, Jon Jureidini, Melissa Raven, Catalin Tufanaru, and Elia Abi-Jaoude that investigated what went wrong in the notorious study by Martin Keller et al. of the GlaxoSmithKline drug Paxil. Lots of ethical issues here, […]

Read this to change your entire perspective on statistics: Why inversion of hypothesis tests is not a general procedure for creating uncertainty intervals

Dave Choi writes: A reviewer has pointed me to something that you wrote in your blog on inverting test statistics. Specifically, the reviewer is interested in what can happen if the test can reject the entire assumed family of models, and has asked me to consider discussing whether it applies to a paper that I am […]

He’s skeptical about Neuroskeptic’s skepticism

Jim Delaney writes: Through a link in the weekend reads on Retraction Watch, I read Neuroskeptic’s post-publication peer review of a study on an antidepressant application of the drug armodafinil. Neuroskeptic’s main criticism is that he/she feels that a “conclusion” in the abstract is misleading, “… Adjunctive armodafinil 150 mg/day reduced depressive symptoms associated with […]

Rapid post-publication review

A colleague points me to a published paper and writes: Do you believe this finding? If your biology isn’t strong enough to pass judgement — mine certainly isn’t — can you ask somebody who knows? My reply: 4 groups with a total n=71? No way. The topic is too sad for me to mock on […]

Death of a statistician

It’s not often that one of our profession earns an obituary in the New York Times: Lawrence R. Herkimer, who elevated cheerleading into an aspirational goal for generations of youths and a highly successful business for himself, organizing camps for would-be cheerleaders and selling the clothing and gear they would need, died on Wednesday in […]

Gathering of philosophers and physicists unaware of modern reconciliation of Bayes and Popper

Hiro Minato points us to a news article by physicist Natalie Wolchover entitled “A Fight for the Soul of Science.” I have no problem with most of the article, which is a report about controversies within physics regarding the purported untestability of physics models such as string theory (as for example discussed by my Columbia […]

Jökull Snæbjarnarson writes . . .

Wow! After that name, anything that follows will be a letdown. But we’ll answer his or her question anyway. So here goes. Jökull Snæbjarnarson writes: I’m fitting large Bayesian regression models in Stan where I have many parameters. Having fitted a model and some of the “beta” coefficients HDI’s, where beta is the beta in […]

Syllabus for my course on design and analysis of sample surveys

Here’s last year’s course plan. Maybe I’ll change it a bit, haven’t decided yet. The course number is Political Science 4365, and it’s also cross-listed in Statistics.

Questions about data transplanted in kidney study

Hey—check out the above title. It’s my attempt at a punny, Retraction-Watch-style headline! OK, now on to the content. Dan Walter writes: In order to gauge longevity of kidney donors, this paper [by Dorry Segev, Abimereki Muzaale, Brian Caffo, Shruti Mehta, Andrew Singer, Sarah Taranto, Maureen McBride, and Robert Montgomery] compares data collected on about […]

Symposium on Population Inference at Johns Hopkins University, Friday February 26, 2016

Liz Stuart announces this conference:

Gary McClelland agrees with me that dichotomizing continuous variables is a bad idea. He also thinks my suggestion of dividing a variable into 3 parts is also a mistake.

In response to some of the discussion that inspired yesterday’s post, Gary McClelland writes: I remain convinced that discretizing a continuous variable, especially for multiple regression, is the road to perdition. Here I explain my concerns. First, I don’t buy the motivation that discretized analyses are easier to explain to lay citizens and the press. […]

Beyond the median split: Splitting a predictor into 3 parts

Carol Nickerson pointed me to a series of papers in the journal Consumer Psychology, first one by Dawn Iacobucci et al. arguing in favor of the “median split” (replacing a continuous variable by a 0/1 variable split at the median) “to facilitate analytic ease and communication clarity,” then a response by Gary McClelland et al. […]
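To make the two codings concrete, here is a minimal sketch in Python (standard library only). The function names `median_split` and `three_way_split` are hypothetical, and the handling of ties and the -1/0/1 coding of the thirds are assumptions made for illustration, not something taken from the papers mentioned above.

```python
import statistics


def median_split(values):
    """Replace a continuous variable with a 0/1 indicator split at the median.

    Assumption: values equal to the median are coded 0.
    """
    cutoff = statistics.median(values)
    return [1 if v > cutoff else 0 for v in values]


def three_way_split(values):
    """Code a continuous variable as -1 / 0 / 1 for lower / middle / upper third.

    Assumption: cut points are the tertiles from statistics.quantiles
    (default "exclusive" method); values on a cut point go to the lower group.
    """
    low_cut, high_cut = statistics.quantiles(values, n=3)
    coded = []
    for v in values:
        if v <= low_cut:
            coded.append(-1)
        elif v <= high_cut:
            coded.append(0)
        else:
            coded.append(1)
    return coded


x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
two_groups = median_split(x)
three_groups = three_way_split(x)
```

Either coding discards the within-group variation in the predictor; the three-part version at least keeps a middle category separate from the two extremes.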

Benford lays down the Law

A few months ago I received in the mail a book called An Introduction to Benford’s Law by Arno Berger and Theodore Hill. I eagerly opened it but I lost interest once I realized it was essentially a pure math book. Not that there’s anything wrong with math, it just wasn’t what I wanted to […]

Tip o’ the iceberg to ya

Paul Alper writes: The Washington Post ran this article by Fred Barbash with an interesting quotation: “Every day, on average, a scientific paper is retracted because of misconduct,” Ivan Oransky and Adam Marcus, who run Retraction Watch, wrote in a New York Times op-ed in May. But, can that possibly be true, just for misconduct […]