Archive of posts filed under the Miscellaneous Statistics category.

Gary McClelland agrees with me that dichotomizing continuous variables is a bad idea. He also thinks my suggestion of dividing a variable into 3 parts is a mistake.

In response to some of the discussion that inspired yesterday’s post, Gary McClelland writes: I remain convinced that discretizing a continuous variable, especially for multiple regression, is the road to perdition. Here I explain my concerns. First, I don’t buy the motivation that discretized analyses are easier to explain to lay citizens and the press. […]

Beyond the median split: Splitting a predictor into 3 parts

Carol Nickerson pointed me to a series of papers in the Journal of Consumer Psychology, first one by Dawn Iacobucci et al. arguing in favor of the “median split” (replacing a continuous variable by a 0/1 variable split at the median) “to facilitate analytic ease and communication clarity,” then a response by Gary McClelland et al. […]
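For readers who want to see what a median split does in practice, here is a minimal sketch in Python. It is an illustration only and is not taken from any of the papers mentioned. The function names (`median_split`, `pearson`, `demo`) and the simulated data are assumptions made up for this example. It codes a predictor as 0/1 at its sample median and compares how strongly the continuous and the dichotomized versions correlate with an outcome.

```python
import math
import random
import statistics


def median_split(values):
    """Code each value as 1 if it is above the sample median, else 0."""
    cutoff = statistics.median(values)
    return [1 if v > cutoff else 0 for v in values]


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def demo(seed=0, n=2000):
    """Simulate y = x + noise (hypothetical data) and return the correlation
    of y with the continuous x and with the median-split x."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [xi + rng.gauss(0, 1) for xi in x]
    return pearson(x, y), pearson(median_split(x), y)


if __name__ == "__main__":
    r_continuous, r_split = demo()
    print(f"continuous predictor:   r = {r_continuous:.3f}")
    print(f"median-split predictor: r = {r_split:.3f}")
```

In simulations like this one the dichotomized predictor typically shows a weaker correlation with the outcome than the continuous one, which is the information loss at issue in the exchange.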

Benford lays down the Law

A few months ago I received in the mail a book called An Introduction to Benford’s Law by Arno Berger and Theodore Hill. I eagerly opened it but I lost interest once I realized it was essentially a pure math book. Not that there’s anything wrong with math, it just wasn’t what I wanted to […]

Tip o’ the iceberg to ya

Paul Alper writes: The Washington Post ran this article by Fred Barbash with an interesting quotation: “Every day, on average, a scientific paper is retracted because of misconduct,” Ivan Oransky and Adam Marcus, who run Retraction Watch, wrote in a New York Times op-ed in May. But, can that possibly be true, just for misconduct […]

First, second, and third order bias corrections (also, my ugly R code for the mortality-rate graphs!)

As an applied statistician, I don’t do a lot of heavy math. I did prove a true theorem once (with the help of some collaborators), but that was nearly twenty years ago. Most of the time I walk along pretty familiar paths, just hoping that other people will do the mathematical work necessary for me […]

Just Filling in the Bubbles

Collin Hitt writes: I study wrong answers, per your blog post today. My research focuses mostly on surveys of schoolchildren. I study the kids who appear to be just filling in the bubbles, who by accident actually reveal something of use for education researchers. Here’s his most recent paper, “Just Filling in the Bubbles: Using […]

Asking the question is the most important step

In statistics, the glamour often comes to those who perform a challenging data analysis that extracts signal from noise, as in Aki Vehtari’s decomposition of the famous birthday data which led to the stunning graphs on the cover of BDA3. But, from a social-science point of view, the biggest credit has to go to whoever […]

3 postdoc opportunities you can’t miss—here in our group at Columbia! Apply NOW, don’t miss out!

Hey, just once, the Buzzfeed-style hype is appropriate. We have 3 amazing postdoc opportunities here, and you need to apply NOW. Here’s the deal: we’re working on some amazing projects. You know about Stan and associated exciting projects in computational statistics. There’s the virtual database query, which is the way I like to describe our […]

Ta-Nehisi Coates, David Brooks, and the “Street Code” of Journalism

In my latest Daily Beast column, I decide to be charitable to the factually-challenged NYT columnist: From our perspective, Brooks’s refusal to admit error makes him look like a buffoon. But maybe we’re just judging him based on the norms of another culture. . . . From our perspective, Brooks spreading anti-Semitic false statistics in […]

Click here to get FREE tix to my webinar with Brad Efron this Wednesday!

The Royal Statistical Society (U.K.) has organized a discussion of a new paper, Frequentist accuracy of Bayesian estimates, by Brad Efron. The discussion will be an online event (a “webinar”) on 21 Oct 2015 (that’s right, “Back to the Future Day”) at 11am eastern time (4pm in the U.K.). Brad will present, I’ll ask […]

Lee Sechrest

Quantitative psychologist Lee Sechrest has passed away at the age of 86. Lee was a friend of the blog and we corresponded by email a bit during the past two years but I never had the privilege of meeting him. A wonderful obituary is here. As the saying goes, read the whole thing. Actually, I […]

Hierarchical logistic regression in Stan: The untold story

Corey Yanofsky pointed me to a paper by Neal Beck, Estimating grouped data models with a binary dependent variable and fixed effects: What are the issues?, which begins: This article deals with a very simple issue: if we have grouped data with a binary dependent variable and want to include fixed effects (group specific intercepts) […]

What do you learn from p=.05? This example from Carl Morris will blow your mind.

I keep pointing people to this article by Carl Morris so I thought I’d post it. The article is really hard to find because it has no title: it appeared in the Journal of the American Statistical Association as a discussion of a couple of other papers. All 3 scenarios have the same p-value. And, […]

Mindset interventions are a scalable treatment for academic underachievement — or not?

Someone points me to this post by Scott Alexander, criticizing the work of psychology researcher Carol Dweck. Alexander looks carefully at an article, “Mindset Interventions Are A Scalable Treatment For Academic Underachievement,” by David Paunesku, Gregory Walton, Carissa Romero, Eric Smith, David Yeager, and Carol Dweck, and he finds the following: Among ordinary students, the […]

Hot hand explanation again

I guess people really do read the Wall Street Journal . . . Edward Adelman sent me the above clipping and calculation and writes: What am I missing? I do not see the 60%. And Richard Rasiej sends me a longer note making the same point: So here I am, teaching another statistics class, this […]

How to use lasso etc. in political science?

Tom Swartz writes: I am a graduate student at Oxford with a background in economics and on the side am teaching myself more statistics and machine learning. I’ve been following your blog for some time and recently came across this post on lasso. In particular, the more I read about the machine learning community, the […]

Low-power pose

“The samples were collected in privacy, using passive drool procedures, and frozen immediately.” Anna Dreber sends along a paper, “Assessing the Robustness of Power Posing: No Effect on Hormones and Risk Tolerance in a Large Sample of Men and Women,” which she published in Psychological Science with coauthors Eva Ranehill, Magnus Johannesson, Susanne Leiberg, Sunhae […]

What was the worst statistical communication experience you’ve ever had?

In one of the jitts for our statistical communication class we asked, “What was the worst statistical communication experience you’ve ever had?” And here were the responses (which I’m sharing with permission from the students): Not sure if this counts, but I used to work with a public health researcher who published a journal article […]

“I do not agree with the view that being convinced an effect is real relieves a researcher from statistically testing it.”

Florian Wickelmaier writes: I’m writing to tell you about my experiences with another instance of “the difference between significant and not significant.” In a lab course, I came across a paper by Costa et al. [Cognition 130 (2) (2014) 236-254]. In several experiments, they compare the effects in two two-by-two tables by comparing the […]

“The frequentist case against the significance test”

Richard Morey writes: I suspect that like me, many people didn’t get a whole lot of detail about Neyman’s objections to the significance test in their statistical education besides “Neyman thought power is important”. Given the recent debate about significance testing, I have gone back to Neyman’s papers and tried to summarize, for the modern […]