Skip to content

My next 170 blog posts (inbox zero and a change of pace)

I’ve successfully emptied my inbox:

Screen Shot 2016-07-30 at 11.47.28 PM

And one result was to fill up the blog through mid-January.

I think I’ve been doing enough blogging recently, so my plan now is to stop for awhile and instead transfer my writing energy into articles and books. We’ll see how it goes.

Just to give you something to look forward to, below is a list of what’s in the queue. I’m sure I’ll be interpolating new posts on this and that—there is an election going on, after all, indeed I just inserted a politics-related post two days ago. And other things come up sometime that just can’t wait. Also, my co-bloggers are free to post on Stan or whatever else they want, whenever they want.

But this is what’s on deck so far:

In policing (and elsewhere), regional variation in behavior can be huge, and perhaps can give a clue about how to move forward.

Guy Fieri wants your help! For a TV show on statistical models for real estate

The p-value is a random variable

“What can recent replication failures tell us about the theoretical commitments of psychology?”

Documented forking paths in the Competitive Reaction Time Task

Shameless little bullies claim that published triathlon times don’t replicate

Boostrapping your posterior

You won’t be able to forget this one: Alleged data manipulation in NIH-funded Alzheimer’s study

Are stereotypes statistically accurate?

Will youths who swill Red Bull become adult cocaine addicts?

Science reporters are getting the picture

Modeling correlation of issue attitudes and partisanship within states

Tax Day: The Birthday Dog That Didn’t Bark

The history of characterizing groups of people by their averages

Calorie labeling reduces obesity Obesity increased more slowly in California, Seattle, Portland (Oregon), and NYC, compared to some other places in the west coast and northeast that didn’t have calorie labeling

What’s gonna happen in November?

An ethnographic study of the “open evidential culture” of research psychology

Things that sound good but aren’t quite right: Art and research edition

Michael Porter as new pincushion

Kaiser Fung on the ethics of data analysis

One more thing you don’t have to worry about

Evil collaboration between Medtronic and FDA

His varying slopes don’t seem to follow a normal distribution

A day in the life

Letters we never finished reading

Better to just not see the sausage get made

Oooh, it burns me up

Birthdays and heat waves

Publication bias occurs within as well as between projects

Graph too clever by half

Take that, Bruno Frey! Pharma company busts through Arrow’s theorem, sets new record!

A four-way conversation on weighting and regression for causal inference

How paracompact is that?

In Bayesian regression, it’s easy to account for measurement error

Garrison Keillor would be spinning etc

“Brief, decontextualized instances of colaughter”

The new quantitative journalism

It’s not about normality, it’s all about reality

Hokey mas, indeed

You may not be interested in peer review, but peer review is interested in you

Hypothesis Testing is a Bad Idea (my talk at Warwick, England, 2:30pm Thurs 15 Sept)

Genius is not enough: The sad story of Peter Hagelstein, living monument to the sunk-cost fallacy

Bayesian Statistics Then and Now

No guarantee

Let’s play Twister, let’s play Risk

“Evaluating Online Nonprobability Surveys”


Pro Publica Surgeon Scorecard Update

Hey, PPNAS . . . this one is the fish that got away

FDA approval of generic drugs: The untold story

Acupuncture paradox update

More p-value confusion: No, a low p-value does not tell you that the probability of the null hypothesis is less than 1/2

Multicollinearity causing risk and uncertainty

Andrew Gelman is not the plagiarism police because there is no such thing as the plagiarism police.

Cracks in the thin blue line

Politics and chance

I refuse to blog about this one

“Find the best algorithm (program) for your dataset.”

NPR’s gonna NPR

Why the garden-of-forking-paths criticism of p-values is not like a famous Borscht Belt comedy bit

Don’t trust Rasmussen polls!

Astroturf “patient advocacy” group pushes to keep drug prices high

It’s not about the snobbery, it’s all about reality: At last, I finally understand hatred of “middlebrow”

The never-back-down syndrome and the fundamental attribution error

Michael Lacour vs John Bargh and Amy Cuddy

It’s ok to criticize

“The Prose Factory: Literary Life in England Since 1918” and “The Windsor Faction”

Note to journalists: If there’s no report you can read, there’s no study


No, I don’t think the Super Bowl is lowering birth weights

Gray graphs look pretty

Should you abandon that low-salt diet?

Transparency, replications, and publication

Is it fair to use Bayesian reasoning to convict someone of a crime?

“Marginally Significant Effects as Evidence for Hypotheses: Changing Attitudes Over Four Decades”

Some people are so easy to contact and some people aren’t.

Should Jonah Lehrer be a junior Gladwell? Does he have any other options?

Advice on setting up audio for your podcast

The Psychological Science stereotype paradox

We have a ways to go in communicating the replication crisis

Authors of AJPS paper find that the signs on their coefficients were reversed. But they don’t care: in their words, “None of our papers actually give a damn about whether it’s plus or minus.” All right, then!

Another failed replication of power pose

“How One Study Produced a Bunch of Untrue Headlines About Tattoos Strengthening Your Immune System”

Ptolemaic inference

How not to analyze noisy data: A case study

The problems are everywhere, once you know to look

“Generic and consistent confidence and credible regions”

Happiness of liberals and conservatives in different countries

Conflicts of interest

“It’s not reproducible if it only runs on your laptop”: Jon Zelner’s tips for a reproducible workflow in R and Stan

Unintentional parody of Psychological Science-style research redeemed by Dan Kahan insight

Rotten all the way through

Some modeling and computational ideas to look into

How to improve science reporting? Dan Vergano sez: It’s not about reality, it’s all about a salary

Kahan: “On the Sources of Ordinary Science Knowledge and Ignorance”

Why I prefer 50% to 95% intervals

How effective (or counterproductive) is universal child care? Part 1

How effective (or counterproductive) is universal child care? Part 2

“Another terrible plot”

Can a census-tract-level regression analysis untangle correlation between lead and crime?

The role of models and empirical work in political science

More on my paper with John Carlin on Type M and Type S errors

Should scientists be allowed to continue to play in the sandbox after they’ve pooped in it?

“Men with large testicles”

Sniffing tears perhaps not as effective as claimed

Thinking more seriously about the design of exploratory studies: A manifesto

From zero to Ted talk in 18 simple steps: Rolf Zwaan explains how to do it!

Individual and aggregate patterns in the Equality of Opportunity research project

Unfinished (so far) draft blog posts

Deep learning, model checking, AI, the no-homunculus principle, and the unitary nature of consciousness

How best to partition data into test and holdout samples?

Abraham Lincoln and confidence intervals

“Breakfast skipping, extreme commutes, and the sex composition at birth”

Discussion on overfitting in cluster analysis

Happiness formulas

OK, sometimes the concept of “false positive” makes sense.

“A bug in fMRI software could invalidate 15 years of brain research”

Interesting epi paper using Stan

How can you evaluate a research paper?

Some U.S. demographic data at zipcode level conveniently in R

So little information to evaluate effects of dietary choices

Frustration with published results that can’t be reproduced, and journals that don’t seem to care

Using Stan in an agent-based model: Simulation suggests that a market could be useful for building public consensus on climate change

Data 1, NPR 0

Dear Major Textbook Publisher

“So such markets were, and perhaps are, subject to bias from deep pocketed people who may be expressing preference more than actual expectation”

Temple Grandin

fMRI clusterf******

How to think about the p-value from a randomized test?

Avoiding selection bias by analyzing all possible forking paths

The social world is (in many ways) continuous but people’s mental models of the world are Boolean

Science journalist recommends going easy on Bigfoot, says you should bash of mammograms instead

Applying statistical thinking to the search for extraterrestrial intelligence

An efficiency argument for post-publication review

Bayes is better

What’s powdery and comes out of a metallic-green cardboard can?

Low correlation of predictions and outcomes is no evidence against hot hand

Jail for scientific fraud?

Is the dorsal anterior cingulate cortex “selective for pain”?

Quantifying uncertainty in identification assumptions—this is important!

If I had a long enough blog delay, I could just schedule this one for 1 Jan 2026

Historical critiques of psychology research methods

p=.03, it’s gotta be true!

Objects of the class “George Orwell”

Sorry, but no, you can’t learn causality by looking at the third moment of regression residuals

“The Pitfall of Experimenting on the Web: How Unattended Selective Attrition Leads to Surprising (Yet False) Research Conclusions”

Two unrelated topics in one post: (1) Teaching useful algebra classes, and (2) doing more careful psychological measurements

Transformative treatments

Comment of the year

Migration explaining observed changes in mortality rate in different geographic areas?

Fragility index is too fragile

When you add a predictor the model changes so it makes sense that the coefficients change too.

Nooooooo, just make it stop, please!

“Which curve fitting model should I use?”

We fiddle while Rome burns: p-value edition

The Lure of Luxury

Confirmation bias

Problems with randomized controlled trials (or any bounded statistical analysis) and thinking more seriously about story time

When do stories work, Process tracing, and Connections between qualitative and quantitative research

A small, underpowered treasure trove?

Problems with “incremental validity” or more generally in interpreting more than one regression coefficient at a time

No evidence of incumbency disadvantage?

To know the past, one must first know the future: The relevance of decision-based thinking to statistical analysis

Powerpose update

Absence of evidence is evidence of alcohol?

“Estimating trends in mortality for the bottom quartile, we found little evidence that survival probabilities declined dramatically.”

SETI: Modeling in the “cosmic haystack”

There should really be something here for everyone. I don’t remember half these posts myself, and I look forward to reading them when they come out!

P.S. It’s a good thing I blog for free because nobody could pay me enough for the effort that goes into it.


  1. Bruce McCullough says:


    I have been reading your blog for years. At least two of my articles are directly attributable to your blog; but for your blog I never would have written them or even thought of writing them. Your blog has definitely affected my approach to teaching, hopefully for the better (given measurement problems we may never know…) That’s a plus for my research and my teaching, and I’m sure I’m not the only one. (Now, if you could just help me with my service…. Actually, you have! You gave me some help when I was AE at a journal and had a problem.)

    I was at a small, off-the-record conference last year, where the attendees mostly engaged in field experiments in development. This is millions of dollars of research money; maybe tens of millions over the years. One of the speakers clearly was heading in the direction of the “garden of forking paths”. I waited for him to say the words. When he said, “garden of forking paths” I looked around the room, and at least a third of the heads were nodding knowingly. At a coffee break, I circulated and asked several persons about your paper. They’d all read it, and they all read your blog at least occasionally. So your blog is having a demonstrable effect on policy activities.



    • Rahul says:

      I just love blog posts over (say) articles because if you say something on a blog it doesn’t go easily uncontested. By the end of the comment thread one has a good view of the flaws in the original argument because others pick it apart, sometimes mercilessly.

      In that sense articles are relatively risky, you just take the author at face value. Even if others realize the absolute stupidity of some of the argument they cannot easily communicate their criticism. Each one of us has to read and analyze the argument from scratch.

      In other words, my belief in the soundness (or unsoundness) of something posted on a popular blog is far greater than that in an article.

  2. Rahul says:

    I wonder if Funding Agencies are / ever will be progressive enough to fund a good blogger’s blogging just because of the impact it has on improving science in general.

    If they can fund outreach & science communication why not blogging, eh?

  3. Hernan Bruno says:

    Any advice regarding email management?

    PS: I second the comment praising the impact of this blog. It helped me come to STAN earlier than I would otherwise have. Which has been great for two research projects. Plus all the material useful for teaching. Thanks!

  4. jd says:

    I was just thinking of suggesting that you write about the ego-depletion replication failure, but looks like your plate is already full!

  5. Martha (Smith) says:

    “Boostrapping your posterior”

    Is this corporal punishment by ghosts, or just a typo?

    (Somehow I managed to post this on Thanks eBay! by mistake.)

  6. cugrad says:

    wait, what? you are gonna stop blogging for a while? that’s so not Andrew though!

    • Andrew says:


      A couple days ago I opened up a file on my computer where I dump material that could be suitable for blogs or maybe could appear in some other formats such as op-eds, research articles, books, etc. So now when something is interesting in that way, I have it there in the file, and I can blog it whenever. (Obv no rush on that.)

      The trouble was, I’d have an interesting item, I’d open up a new post on the blog, I’d paste in the item and write some amusing prose around it, this would make me think of new things, I’d paste in some links, maybe find a good image to go with everything, think of a clever title, . . . , it could take 10 minutes or a half hour or more. The time wasn’t wasted—I’d have put in the time to express myself more clearly, and I’d have explored new ideas—but it was time that I could otherwise have put into writing a book, for example. With 170 blog posts in the queue, it seemed to make sense to do some of my writing in a different form.

  7. Tim says:

    This blog is the best, have been reading it for years now, thanks for the effort!
    Can’t wait to find out what these are about: “Men with large testicles” and “Nooooooo, just make it stop, please!”

  8. As a newcomer to the blog, I am grateful for the towering queue that will accompany your break. I also appreciate the reasons for the break.

    Reading can be hazardous, anyway. I had to stop reading your article “No Trump!” (or even thinking about it) because it made me laugh too hard. I got stitches yesterday, and the laughing started to pull at them. Something similar happened when I read “Enough with the Replication Police.” So maybe the measured pace of the blog will encourage moderation in the readers, who will spread their moderation outward to the world. That one little change will revolutionize something or other (hint: TED talk!).

    Then again, there’s always the temptation to reread….

  9. Anonymous says:

    “Why the garden-of-forking-paths criticism of p-values is not like a famous Borscht Belt comedy bit”.

    Can’t wait for this one! I have been reading the garden of forking path article ( and i don’t understand the main point of it. I am not smart enough to follow the formula’s, but i tried to understand the other pieces of the paper.

    It seems to me to want to distinguish “fishing expeditions” from “the garden of forking paths”, but i have a hard time finding out exactly how they differ.

    What does it mean when researchers “compute a single test based on the data” and how does this differ from “reporting J tests and reporting the best result” (p. 2)?

    The same thing in slightly different words can be found on page 3:

    “And it would take a highly unscrupulous researcher to perform test after test in a search for statistical significance (which could almost certainly be found at the 0.05 or even the 0.01 level, given all the options above and the many more that would be possible in a real study). We are not suggesting that researchers generally do such a search. What we are suggesting is that, given a particular data set, it is not so difficult to look at the data and construct completely reasonable rules for data exclusion, coding, and data analysis that can lead to statistical significance – thus, the researcher needs only perform one test, but that test is conditional on the data;”

    I assume that when researchers “look at the data” this implies performing statistical tests to see which ones are significant and then “construct completely reasonable rules for data exclusion, coding, and data-analysis that can lead to statistical significance”. If this is correct, then i don’t understand how this differs from “performing test after test in a search for statistical significance”.

    What am i not understanding?

    • Andrew says:


      You could consider the garden of forking paths to be an informal version of p-hacking. What happens sometimes is that researchers look at their data and make decisions about data coding, data exclusion, and model specification without formally running any tests. Rather, they make what seem to be reasonable decisions, but these decisions are contingent on their data: had the data been different, they could well have chosen different rules for data exclusion, coding, and analysis. They’re not performing test after test, but their analyses have the same problem.

      This all comes up because sometimes I’ve heard people declare that they were not p-hacking because they only performed this one analysis on their data. But what’s relevant for interpreting these p-values is not just what analyses they happened to do, what’s also relevant is the analyses they would’ve done, had the data been different. Asking such questions can make people uncomfortable, but this sort of counterfactual is central to the definition of the p-value.

      • Martha (Smith) says:

        I like to give the following example:

        A group of researchers plans to compare three dosages of a drug in a clinical trial. There’s no intent to compare effects broken down by sex, but the sex of the subjects is recorded as part of routine procedure. The pre-planned comparisons show no statistically significant difference between the three dosages when the data are not broken down by sex. However, since the sex of the patients is known, the researchers decide to look at a chart of the outcomes broken down by combination of sex and dosage. They notice that the results for women in the high-dosage group look much better than the results for the men in the low dosage group, and decide to perform a hypothesis test to check that out. But in looking at the chart, they have informally compared (6×5)/2 = 15 pairs of dosage×sex combinations.

  10. Denis Cote says:

    Thank you Andrew for this blog. I think you are a wonderful example of a public intellectual, caring about spreading important ideas. Your impact has been far greater with this blog than with traditional scientific publications. You bring people to the scientific publications by bringing us behind the sceces. You definitelty enlightened me with your blog.

Leave a Reply