
On deck this week

Mon: “Regular Customer: It was so much easier when I was a bum. I didn’t have to wake up at 4am to go to work, didn’t have all these bills and girlfriends.”

Tues: Rational != Self-interested

Wed: When there’s a lot of variation, it can be a mistake to make statements about “typical” attitudes

Thurs: “Science does not advance by guessing”

Fri: When am I a conservative and when am I a liberal (when it comes to statistics, that is)?

Sat: Science tells us that fast food lovers are more likely to marry other fast food lovers

Sun: 10th anniversary of “Statistical Modeling, Causal Inference, and Social Science”

On deck this month

Lots of good stuff in the queue:

“Regular Customer: It was so much easier when I was a bum. I didn’t have to wake up at 4am to go to work, didn’t have all these bills and girlfriends.”

Rational != Self-interested

When there’s a lot of variation, it can be a mistake to make statements about “typical” attitudes

“Science does not advance by guessing”

When am I a conservative and when am I a liberal (when it comes to statistics, that is)?

Science tells us that fast food lovers are more likely to marry other fast food lovers

10th anniversary of “Statistical Modeling, Causal Inference, and Social Science”

In one of life’s horrible ironies, I wrote a paper “Why we (usually) don’t have to worry about multiple comparisons” but now I spend lots of time worrying about multiple comparisons

The Fault in Our Stars: It’s even worse than they say

Buggy-whip update

The inclination to deny all variation

Hoe noem je?

“Your Paper Makes SSRN Top Ten List”

The Fallacy of Placing Confidence in Confidence Intervals

Try a spaghetti plot

I ain’t got no watch and you keep asking me what time it is

Some questions from our Ph.D. statistics qualifying exam

Solution to the helicopter design problem

Solution to the problem on the distribution of p-values

Solution to the sample-allocation problem

A key part of statistical thinking is to use additive rather than Boolean models

Yes, I’ll help people for free but not like this!

I love it when I can respond to a question with a single link

Boo! Who’s afraid of availability bias?

That last one is a special Halloween-themed post. I hope you enjoy it.

Anova is great—if you interpret it as a way of structuring a model, not if you focus on F tests

Shravan Vasishth writes:

I saw on your blog post that you listed aggregation as one of the desirable things to do. Do you agree with the following argument? I want to point out a problem with repeated measures ANOVA in a talk:

In a planned experiment, say a 2×2 design, when we do a repeated measures ANOVA, we aggregate all responses by subject for each condition. This actually leads us to underestimate the variability within subjects. The better way is to use linear mixed models (even in balanced designs) because they allow us to stay faithful to the experiment design and to describe how we think the data were generated.

The issue is that in a major recent paper the authors did an ANOVA after they failed to get statistical significance with lmer. Even ignoring the cheating and p-value chasing aspect of it, I think that using ANOVA is statistically problematic for the above reason alone.

My response: Yes, this is consistent with what I say in my 2005 Anova paper, I think. But I consider that sort of hierarchical model to be a (modern version of) Anova. As a side note, classical Anova is kinda weird because it is mostly based on point estimates of variance parameters. But classical textbook examples are typically on the scale of 5×5 datasets, and in these cases the estimated variances are very noisy.
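To give a rough sense of how noisy a point estimate of a variance parameter can be at that scale, here is a minimal simulation sketch. Everything in it is an assumption made for illustration and not something from the 2005 paper: a balanced one-way layout with 5 groups of 5 observations, true between-group and within-group standard deviations both set to 1, and the classical method-of-moments estimate (MS_between - MS_within) / n for the between-group variance. The function name and defaults are hypothetical.

```python
import random
import statistics


def simulate_between_variance_estimates(
    n_groups=5, n_per_group=5, sigma_between=1.0, sigma_within=1.0,
    n_sims=2000, seed=0,
):
    """Return classical point estimates of the between-group variance.

    Each simulated dataset is a balanced one-way layout (n_groups x n_per_group).
    The estimate is the usual Anova method-of-moments one:
    (MS_between - MS_within) / n_per_group, which can come out negative.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        group_means = []
        within_vars = []
        for _ in range(n_groups):
            # Draw a group effect, then observations around it.
            alpha = rng.gauss(0.0, sigma_between)
            ys = [alpha + rng.gauss(0.0, sigma_within) for _ in range(n_per_group)]
            group_means.append(statistics.fmean(ys))
            within_vars.append(statistics.variance(ys))
        # Balanced design: pooled within-group variance is the mean of group variances.
        ms_within = statistics.fmean(within_vars)
        ms_between = n_per_group * statistics.variance(group_means)
        estimates.append((ms_between - ms_within) / n_per_group)
    return estimates


if __name__ == "__main__":
    est = simulate_between_variance_estimates()
    cuts = statistics.quantiles(est, n=20)  # 19 cut points: 5%, 10%, ..., 95%
    print("true between-group variance: 1.0")
    print("mean of estimates:", round(statistics.fmean(est), 2))
    print("5% and 95% quantiles:", round(cuts[0], 2), round(cuts[-1], 2))
    print("share of negative estimates:", sum(e < 0 for e in est) / len(est))
```

The estimate is right on average, but any single 5×5 dataset can give a value near zero (or below it) or a value well above the truth, and that is the sense in which plugging in a point estimate is shaky at this scale.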

Carrie McLaren was way out in front of the anti-Gladwell bandwagon

Here she was back in 2005, way before Gladwell-bashing became cool.

65% of principals say that at least 30% of students . . . wha??

Alan Sloane writes:

The OECD put out a report drawing on their PISA and TALIS data:

I notice that it’s already attracted a NY Times op-ed by David Leonhardt:

There are a number of things I find strange in its analysis and interpretation but, for starters, there’s the horizontal axis in the chart that’s reproduced in both the original and the NYT piece. As best I can tell the data is actually drawn from Table 2.4A here:

So what’s actually being measured for each country is “the percentage of teachers working in schools whose principals estimated that 30% or more of their pupils came from socioeconomically disadvantaged homes”. Then what’s initially interesting in the discussion is how the measures on the vertical axis (a supposedly “objective” measure of disadvantage used in the PISA survey) differ from those on the horizontal, i.e. looking at points that lie significantly above or below the diagonal. So Brazil and the US are obvious outliers, although Singapore, Serbia and Croatia are by a proportional measure also fairly notable. So what caught me first is that this measure is obviously affected by the distribution of disadvantage across schools, e.g. if disadvantage (PISA-measure) is concentrated spatially then you can get a high score on the horizontal axis without having a correspondingly high score on the vertical axis. A highly skewed distribution of school size will also affect things (as I guess will a skewed distribution of teachers, but presumably that’s highly correlated with school size).

The discussion on the third dimension, shown in the bubbles, also seems to me to be dubious, but that’s more complicated.

I don’t really have anything to say on this except that I agree these numbers are hard to interpret.
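Sloane’s point about spatial concentration can be sketched with made-up numbers. The example below is purely hypothetical: two invented countries, each with ten equal-sized schools and the same overall share of disadvantaged students (20%), differing only in how that disadvantage is spread across schools. It also assumes that principals’ estimates match the actual school-level shares, which the real survey does not guarantee. The function names and the data are mine, not the OECD’s.

```python
def overall_disadvantaged_share(schools):
    """Share of all students who are disadvantaged (the vertical-axis idea)."""
    total_students = sum(students for _, students, _ in schools)
    total_disadvantaged = sum(disadv for _, _, disadv in schools)
    return total_disadvantaged / total_students


def share_of_teachers_in_flagged_schools(schools, threshold=0.30):
    """Share of teachers working in schools at or above the threshold.

    Each school is a tuple (n_teachers, n_students, n_disadvantaged).
    This mimics the horizontal-axis measure, assuming the principal's
    estimate equals the school's actual disadvantaged share.
    """
    total_teachers = sum(teachers for teachers, _, _ in schools)
    flagged_teachers = sum(
        teachers
        for teachers, students, disadv in schools
        if disadv / students >= threshold
    )
    return flagged_teachers / total_teachers


# Hypothetical country A: disadvantage spread evenly (every school at 20%).
even = [(10, 100, 20)] * 10
# Hypothetical country B: same total, concentrated in 4 of the 10 schools.
concentrated = [(10, 100, 50)] * 4 + [(10, 100, 0)] * 6

if __name__ == "__main__":
    for name, schools in [("even", even), ("concentrated", concentrated)]:
        print(
            name,
            overall_disadvantaged_share(schools),
            share_of_teachers_in_flagged_schools(schools),
        )
```

Both invented countries have the same overall level of disadvantage, yet the threshold-based measure is 0% in one and 40% in the other, so a point far from the diagonal need not say anything about how principals perceive their students.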

Rss move

Our RSS feed is now directly accessible via – no need to go through feedburner. You need to resubscribe to the feed.

International Journal of Epidemiology versus Hivemind and the Datagoround


The Hivemind wins (see the comment thread here, which is full of detective work from various commenters).

As I wrote as a postscript to that earlier post, maybe we should call this the “stone soup” or “Bem” phenomenon, when a highly flawed work stimulates interesting, thoughtful discussion.

In defense of stories and classroom activities, from a resubmission letter from 1999

I was going through my files looking for some old data (which I still haven’t found!) and came across a letter from 1999 accompanying the submission of a revision of this article with Glickman.

Here’s a part of the letter, a response to some questions of one of the reviewers:

With regard to the comment that “You present absolutely no evidence that any of these demonstration methods is actually helpful. For at least a couple of these demonstrations you need to collect data to see if your tools are helping in understanding the concept. I will let you worry about how to measure this but this is a must”:

Of course, your statement is true, but consider the alternative, which is to do examples like this on the blackboard. We haven’t seen Moore & McCabe or Mosteller or anyone else conducting experiments to show that class-participation demos are _not_ better than straight lectures. And, given this state of uncertainty, we think that it’s useful to consider this alternative approach to teaching this material.

We agree that it would be a good idea for someone to collect data on the effectiveness of various teaching approaches. As all are well aware, this is a potentially huge research project. In the meantime, we think that presenting a bunch of demos in an easy-to-use format is potentially a major contribution. Our feeling is that a paper like this should have either (a) some really cool stuff that people can go out and use right away, or (b) some perhaps-boring stuff but with some evidence that it “works” (e.g., studies showing that students learn better when they work in groups). We think that there is room in the literature for papers like ours of type (a) and also other papers of type (b).

You might also notice that all the papers of the form, “A new proof of the central limit theorem” or whatever, never seem to have evidence of whether they are effective in class. Why? Because it seems evident that if such a new proof can increase statistical understanding, then it’s a good thing and can in some way be usefully integrated into a course. We think this is similar with the demos in our paper: they are ultimately about increasing understanding by focusing on the fact that statistics is, in reality, a participatory process with many actors. This is a deep truth which is obscured when a professor merely does blackboard material. (We have added this point in the conclusion to our article.)

. . .

Finally, the referee writes, “I think this paper needs more work so that it is not just a set of interesting stories.” Actually, I think that interesting stories (with useful directions) is not a bad thing. I wouldn’t want all the Teacher’s corner articles to be like that, but the occasional such article, if of high quality, is a contribution, I believe, in that people might actually read the article and use it to improve their teaching.

I continue to hold and express this pluralistic attitude toward research and publication.

Can anyone guess what went wrong here?

OK, here’s a puzzle for all of you. I received the following email:

Dear Professor Gelman:

The editor of ** asked me to write to see if you would be willing to review MS ** entitled


We are hoping for a review within the next 2-3 weeks if possible. I would appreciate if you confirm whether you are willing to advise me on this by clicking on the url below


This site will also not only allow you to choose an alternative due date, but also to suggest alternative referees if you are unable to review.

If you choose to review the manuscript you can upload your report and cover letter via our secure online form at


This is a secure form and your report will be transmitted anonymously. You should supply either the title or the MS number, **, to ensure that your report is properly filed.

Thanks for your assistance. I very much value your advice.


I’ve omitted identifying details as there’s no point in embarrassing the journal editor. We all make mistakes, and this is not a big one.

Anyway, here’s the riddle: What was horribly wrong about the above email?

And here’s a hint: There’s no way you can figure out the problem merely from what I’ve sent you above. You’ll have to guess.

And another hint: The email came from a legitimate journal, not one of those “predatory” or spam journals.

I’ll give the answer tomorrow, but I’m guessing some of you will figure this out right away.

P.S. OK, OK, you win. Everybody guessed it already (see comments). I guess this puzzle was too easy.

Are Ivy League schools overrated?

I won’t actually answer the above question, as I am offering neither a rating of these schools nor a measure of how others rate them (which would be necessary to calibrate the “overrated” claim). What I am doing is responding to an email from Mark Palko, who wrote:

I [Palko] am in broad agreement with this New Republic article by William Deresiewicz [entitled "Don't Send Your Kid to the Ivy League: The nation's top colleges are turning our kids into zombies"] and I’ll try to blog on it if I can get caught up with more topical threads. I was particularly interested in the part about there being a “non-aggression pact” outside of the sciences.

This fits in with something I’ve noticed. I know this sounds harsh, but when I run across someone who is at the top of their profession and yet seems woefully underwhelming, they often have Ivy League BAs in non-demanding majors (For example, Jeff Zucker, Harvard, History. John Tierney, Yale, American Studies). My working hypothesis is that, while everyone who graduates from an elite school has an advantage in terms of reputation and networks, the actual difficulty of completing certain degrees isn’t that high relative to non-elite schools. Thus a history degree from Harvard isn’t worth that much more than a history degree from a Cal State school.

And David Brooks graduated from the University of Chicago with a degree in history . . .

In all seriousness, I don’t know if I agree with the claim in the headline of that article Palko links to.

I was very impressed by some of the Harvard undergrads I taught. Then again, they were statistics majors. In the old days, statistics might have been considered the soft option compared to math, but I don’t think that’s the case anymore. If anything, math majors are sometimes the sleepwalkers who happened to be good at math in school and never thought of stepping off the track. Anyway, it’s hard for me to make any general statements considering that I don’t teach many undergrads at all at Columbia.

Palko responded:

Yeah, I don’t want to put down Harvard grads, even the history majors. I’m sure that a disproportionate number of the brightest, most promising young historians are working on Harvard B.A.s. What’s more, I suspect most of them are developing valuable relationships with some of the most important names in their field.

What I’m wondering about is the popular notion that Ivy League schools are hard to get into and hard to get through. The first part is certainly true and the second appears to be true for STEM (which also has an additional self-selection bias). I’m just not sure if it holds for all fields.

I don’t think there’s any question that selection bias, networking opportunities and halo effects play a large role here. What if they account for most of the benefit of attending an elite school for most students? This is worrisome from both sides: students are twisting themselves into knots to meet artificial and frankly somewhat odd selection criteria; and we’re giving the students who meet these odd criteria huge advantages in terms of wealth, career, and influence.

That can’t be good.