## Handy statistical lexicon

These are all important methods and concepts related to statistics that are not as well known as they should be. I hope that by giving them names, we will make the ideas more accessible to people:

Mister P: Multilevel regression and poststratification.

The Secret Weapon: Fitting a statistical model repeatedly on several different datasets and then displaying all these estimates together.

The Superplot: Line plot of estimates in an interaction, with circles showing group sizes and a line showing the regression of the aggregate averages.

The Folk Theorem: When you have computational problems, often there’s a problem with your model.

The Pinch-Hitter Syndrome: People whose job it is to do just one thing are not always so good at that one thing.

Weakly Informative Priors: What you should be doing when you think you want to use noninformative priors.

P-values and U-values: They’re different.

Conservatism: In statistics, the desire to use methods that have been used before.

WWJD: What I think of when I’m stuck on an applied statistics problem.

Theoretical and Applied Statisticians, how to tell them apart: A theoretical statistician calls the data x, an applied statistician says y.

The Fallacy of the One-Sided Bet: Pascal’s wager, lottery tickets, and the rest.

Alabama First: Howard Wainer’s term for the common error of plotting in alphabetical order rather than based on some more informative variable.

The USA Today Fallacy: Counting all states (or countries) equally, forgetting that many more people live in larger jurisdictions, and so you’re ignoring millions and millions of Californians if you give their state the same space you give Montana and Delaware.

Second-Order Availability Bias: Generalizing from correlations you see in your personal experience to correlations in the population.

The “All Else Equal” Fallacy: Assuming that everything else is held constant, even when it’s not gonna be.

The Self-Cleaning Oven: A good package should contain the means of its own testing.

The Taxonomy of Confusion: What to do when you’re stuck.

The Blessing of Dimensionality: It’s good to have more data, even if you label this additional information as “dimensions” rather than “data points.”

Scaffolding: Understanding your model by comparing it to related models.

Ockhamite Tendencies: The irritating habit of trying to get other people to use oversimplified models.

Bayesian: A statistician who uses Bayesian inference for all problems even when it is inappropriate. I am a Bayesian statistician myself.

Multiple Comparisons: Generally not an issue if you’re doing things right but can be a big problem if you sloppily model hierarchical structures non-hierarchically.

Taking a Model Too Seriously: Really just another way of not taking it seriously at all.

God is in Every Leaf of Every Tree: No problem is too small or too trivial if we really do something about it.

As They Say in the Stagecoach Business: Remove the padding from the seats and you get a bumpy ride.

Story Time: When the numbers are put to bed, the stories come out.

The Foxhole Fallacy: There are no X’s in foxholes (where X = people who disagree with me on some issue of faith).

The Pinocchio Principle: A model that is created solely for computational reasons can take on a life of its own.

The Statistical Significance Filter: If an estimate is statistically significant, it’s probably an overestimate.

Arrow’s Other Theorem (weak form): Any result can be published no more than five times.

Arrow’s Other Theorem (strong form): Any result will be published five times.

The Ramanujan Principle: Tables are read as crude graphs.

The Paradox of Philosophizing: If philosophy is outlawed, only outlaws will do philosophy.

Defaults: What statistics is the science of.

Default, the greatest trick it ever pulled: Convincing the world it didn’t exist.

The Methodological Attribution Problem: The many useful contributions of a good statistical consultant, or collaborator, will often be overly attributed to the statistician’s methods or philosophy.

The Chris Rock Effect: Some graphs give the pleasant feature of visualizing things we already knew, shown so well that we get a shock of recognition, the joy of relearning what we already know, but seeing it in a new way that makes us think more deeply about all sorts of related topics.

The Freshman Fallacy: Just because a freshman might raise a question, that does not make the issue irrelevant.

The Garden of Forking Paths: Multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time.

The One-Way Street Fallacy: Considering only one possibility of a change that can go in either direction.

The Pluralist’s Dilemma: How to recognize that my philosophy is just one among many, that my own embrace of this philosophy is contingent on many things beyond my control, while still expressing the reasons why I prefer my philosophy to the alternatives (at least for the problems I work on).

More Vampirical Than Empirical: Those hypotheses that are unable to be killed by mere evidence. (from Jeremy Freese)

Statistical Chemotherapy: It slightly poisons your key result but shifts an undesired result above the .05 threshold. (from Jeremy Freese)

Tell Me What You Don’t Know: That’s what I want to ask you.

Salad Tongs: Not to be used for painting.

The Edlin Factor: How much you should scale down published estimates.

Kangaroo: When it is vigorously jumping up and down, don’t use a bathroom scale to weigh a feather that is resting loosely in its pouch.

The Speed Racer Principle: Sometimes the most interesting aspect of a scientific or cultural product is not its overt content but rather its unexamined assumptions.

Uncertainty Interval: Say this instead of confidence or credible interval.

What would you do if you had all the data?: Rubin’s first question.

What were you doing before you had any data?: Rubin’s second question.

The Time-reversal Heuristic: How to think about a published finding that is followed up by an unsuccessful replication.

Clarke’s Law: Any sufficiently crappy research is indistinguishable from fraud.

The wedding, never about the marriage: With scientific journals, what it’s all about.

The problem with peer review: The peers.

The “What does not kill my statistical significance makes it stronger” fallacy: The belief that statistical significance is particularly impressive when it was obtained under noisy conditions.

Reverse Poe: It’s evidently sincere, yet its contents are parodic.

The (Lance) Armstrong Principle: If you push people to promise more than they can deliver, they’re motivated to cheat.

The Chestertonian Principle: Extreme skepticism is a form of credulity.

The most important aspect of a statistical method: not what it does with the data but rather what data it uses.

The Pandora Principle: Once you’ve considered a possible interaction or bias or confounder, you can’t un-think it.

The Paradox of Influence: Anticipated influence becomes valueless if you end up saying whatever it takes to keep it.

Cantor’s Corner: Where you want to be.

Correlation: It does not even imply correlation.

The Javert Paradox: Suppose you find a problem with published work. If you just point it out once or twice, the authors of the work are likely to do nothing. But if you really pursue the problem, then you look like a Javert.

Eureka bias: When you think you made a discovery and then you don’t want to give it up, even if it turns out you interpreted your data wrong.

A picture plus 1000 words: Better than two pictures or 2000 words.

The piranha problem: These large effects can’t all coherently coexist.
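
To make one of these entries concrete, here is a minimal simulation sketch of the Statistical Significance Filter. Everything specific in it is an assumption chosen for illustration, not a number from any real study: the true effect (0.2), the standard error (1.0), the 1.96 cutoff for two-sided significance at the 5% level, and the function name `significance_filter`. It simulates many noisy studies of the same small effect, keeps only the estimates that reach significance, and compares them to the truth.

```python
import random
import statistics


def significance_filter(true_effect, std_error, n_sims=100_000, seed=0):
    """Simulate repeated studies and summarize the statistically significant ones.

    All inputs are hypothetical; this is an illustration, not a real analysis.
    """
    rng = random.Random(seed)
    z_crit = 1.96  # assumed two-sided 5% threshold

    # Each simulated study yields an unbiased but noisy estimate of the effect.
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate) > z_crit * std_error:
            significant.append(estimate)

    # Share of studies that reach significance.
    power = len(significant) / n_sims
    # Average magnitude of the significant estimates, relative to the truth.
    mean_abs = statistics.fmean([abs(e) for e in significant])
    exaggeration = mean_abs / abs(true_effect)
    # Share of significant estimates that point in the wrong direction.
    wrong_sign = sum((e > 0) != (true_effect > 0) for e in significant) / len(significant)

    return {"power": power, "exaggeration": exaggeration, "wrong_sign": wrong_sign}


result = significance_filter(true_effect=0.2, std_error=1.0)
print(result)
```

Under these assumed settings, every estimate that passes the filter has magnitude above 1.96, while the true effect is 0.2, so the surviving estimates must overstate the effect several times over. A nontrivial share of them can also have the wrong sign. With a larger true effect or a smaller standard error, the exaggeration shrinks toward 1.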

I know there are a bunch I’m forgetting; can you all refresh my memory, please? Thanks.

P.S. No, I don’t think I can ever match Stephen Senn in the definitions game.

1. marcel says:

In WWJD, you say, "My quick answer is, Yeah, I think it would be excellent for an econometrics class if the students have applied interests. Probably I'd just go through chapter 10 (regression, logistic regression, glm, causal inference), with the later parts being optimal."

So just skip the earlier parts?

2. Andrew Gelman says:

Marcel: When I say "through chapter 10," I mean, "from chapters 1 through 10." And in the last sentence above, I meant "optional," not "optimal." I'll fix that.

3. jonathan says:

Mister P, huh? Isn't that reflective of the old male dominant paradigm?

4. Ken Williams says:

I'm not grokking what "WWJD" stands for. "What Would Jennifer Do"?

5. Andrew Gelman says:

y

6. […] There’s something that fascinates me about these aggressive anti-Bayesians: it’s not enough for them to simply restrict their own practice to non-Bayesian methods; they have to go the next step and put down Bayesian methods that they don’t even understand. This topic comes up from time to time on this blog, for example in discussing the uninformed rants of David Hendry (“I don’t know why he did this, but maybe it’s part of some fraternity initiation thing, like TP-ing the dean’s house on Halloween”), John DiNardo (“if philosophy is outlawed, only outlaws will do philosophy”), and various others (the Foxhole Fallacy). […]

7. […] data, which is the #1 goal of an infographic—if it does work, it’s doing so using the Chris Rock effect, in which we enjoy the shock of recognition of a familiar idea presented in an unfamiliar […]

8. […] Fung has written about “story time“: after researchers do the hard work of causal identification and statistical analysis, they […]

9. […] context of reporting the latest on hurricanes/himmicanes, Freese comes up with a new one for the lexicon. Considering the latest manipulations performed by the hurricanes/himmicanes people, the […]

10. […] the data, and this can occur even if the existing data were analyzed in only one way. This is the garden of forking paths, of which Eric Loken and I give several examples in our paper (and it’s easy enough to find […]

11. […] That last one is an appropriate response to the Freshman Fallacy. […]

12. […] if some nitpickers can argue on the edges about this or that. But it doesn’t work that way. The garden of forking paths is multiplicative, and with enough options it’s not so hard to multiply up to factors of […]

13. […] to . . . an interview with Chris Rock, who lets off some zingers. Also, Rock has a statistical effect named after him. So he moves to the third […]

14. […] analysis, and concomitant immersion in the internet. I landed on Andrew Gelman’s stat blog and remembered that ‘humor’ is a great approach and natural response to dealing with […]

15. […] 2011: Various episodes of scientific misconduct hit the news. Diederik Stapel is kicked out of the psychology department at Tilburg University and Marc Hauser leaves the psychology department at Harvard. These and other episodes bring attention to the Retraction Watch blog. I see a connection between scientific fraud, sloppiness, and plain old incompetence: in all cases I see researchers who are true believers in their hypotheses, which in turn are vague enough to support any evidence thrown at them. Recall Clarke’s Law. […]

16. […] using abundant researcher degrees of freedom. It’s the paradigm of the theory that in the words of sociologist Jeremy Freese, is “more vampirical than empirical—unable to be killed by […]

17. […] evidence that a study’s design was efficient for research purposes. And that’s where the “What does not kill my statistical significance makes it stronger” fallacy comes back […]

18. […] to be shored up with robustness studies, then the result is taken as a stylized fact and it’s story time. There’s nothing particularly bad about this particular paper, indeed their general […]

19. […] individual data, or you can characterize your entire population in terms of x’s and then do Mister P. (That is, you can poststratify; here BART is playing the role of the multilevel model.) It’s […]

20. […] above demonstrates some forking paths, and there are a bunch more in the published paper, for […]

21. […] And the new method they’re using is multilevel regression and poststratification (MRP or Mister P), which was developed by . . . me! And I’m a political scientist […]

22. […] more stable and reasonable output. 3. When looking at the published literature, use some sort of Edlin factor to interpret the claims being made based on biased […]

23. […] like is for the fallacy I described to have a name—even better if it could be listed on your lexicon page. Maybe “The Null Hypothesis Screening Fallacy” or something. Then I could just refer to […]

24. […] line between idiocy and malice isn’t always clear” . . . that reminds me a bit of Clarke’s law, relating to the fine line between scientific incompetence and scientific fraud. At what point does […]

25. […] With continuous data you just have so much more to work with. Remember the adage that the most important aspect of a statistical method is not what it does with the data but what data it […]

26. […] The “What does not kill my statistical significance makes it stronger” fallacy, right there in black and white. This one’s even better than the quote I used in my blog […]

27. […] patients, which turns out to be largely wrong?”, nor am I asking, “What’s a good Edlin factor for clinical research […]

28. […] am proposing a new term: DOCO. I will, in spirit, add it to the already impressive list of useful terminology. DOCO stands for Data(or datum) Otherwise Considered […]

29. […] more stable and reasonable output. 3. When looking at the published literature, use some sort of Edlin factor to interpret the claims being made based on biased […]

30. […] There are different ways to attack this problem but my preferred solution is to use Mister P: […]

31. […] Indeed. And remember the time-reversal heuristic. […]

32. […] that robustness checks lull people into a false sense of you-know-what. It’s a bit of the Armstrong principle, actually: You do the robustness check to shut up the damn reviewers, you have every motivation for […]

33. Zack says:

I can’t decide if I’m very happy or very annoyed that this exists.

On the one hand, I love learning about ALL of this stuff, especially the more subtle fallacies.

But on the other hand, my list of things to read just exploded exponentially.

So, thank you. Jerk.

34. One can just relegate thinking to the dustbin of history b/c much thinking, more generally is constituted from these concepts & methods. Statistics if enabling such thinking will be futile. That’s what I myself have been trying to convey to my circles. I think we are due for new epistemics/epistemology. I can visualize some dimensions already. But how to communicate it is my challenge.

I have identified some individuals who I think can make superb contributions. This forum too can be helpful.

35. Andrew,

It would be great if you got John Ioannidis here to debate the p-value debate. What is its disposition? Everyone goes off on leaving just shy of making an impact debate wise. Is one to conclude that this debate is on the back burner?

36. […] terrible, and it brings to mind the Armstrong principle. If this is really happening, something should be done about it […]

37. […] 2. Without a culture of transparency, there is an incentive to cheat. OK, a short-term incentive. Long-term, if your goal is scientific progress, cheating can just send you down the wrong track, or at best a random track. Cheating can get you publication and fame. There’s also an incentive for a sort of soft cheating, in which researchers pursue a strategy of active incompetence. Recall Clarke’s Law. […]

38. […] Clarke’s Law. Remember Clarke’s Third Law: Any sufficiently crappy research is indistinguishable from fraud. […]

39. […] reminds me of Clarke’s Law: Any sufficiently crappy research is indistinguishable from fraud. Just to clarify: I’m not […]

40. […] let their guard down that they reveal interesting aspects of their life and times. It’s the Speed Racer principle: Sometimes the most interesting aspect of a cultural product is not its overt content but rather […]

41. […] It’s related to the fallacy of the one-sided bet. […]