Here they are. I love seeing all the titles lined up in one place; it’s like a big beautiful poem about statistics: After Peptidegate, a proposed new slogan for PPNAS. And, as a bonus, a fun little graphics project. “Developers Who Use Spaces Make More Money Than Those Who Use Tabs” Question about the secret […]

**Decision Theory** category.

## The Publicity Factory: How even serious research gets exaggerated by the process of scientific publication and media exposure

The starting point is that we’ve seen a lot of talk about frivolous science, headline-bait such as the study that said that married women are more likely to vote for Mitt Romney when ovulating, or the study that said that girl-named hurricanes are more deadly than boy-named hurricanes, and at this point some of these […]

## How has my advice to psychology researchers changed since 2013?

Four years ago, in a post entitled, “How can statisticians help psychologists do their research better?”, I gave the following recommendations to researchers: – Analyze all your data. – Present all your comparisons. – Make your data public. And, for journal editors, I wrote, “if a paper is nothing special, you don’t have to publish […]

## All the things we have to do that we don’t really need to do: The social cost of junk science

I’ve been thinking a lot about junk science lately. Some people have said it’s counterproductive or rude of me to keep talking about the same few examples (actually I think we have about 15 or so examples that come up again and again), so let me just speak generically about the sort of scientific claim […]

## The Other Side of the Night

Don Green points us to this quantitative/qualitative meta-analysis he did with Betsy Levy Paluck and Seth Green. The paper begins: This paper evaluates the state of contact hypothesis research from a policy perspective. Building on Pettigrew and Tropp’s (2006) influential meta-analysis, we assemble all intergroup contact studies that feature random assignment and delayed outcome measures, […]

## PCI Statistics: A preprint review peer community in statistics

X informs me of a new effort, “Peer community in . . .”, which describes itself as “a free recommendation process of published and unpublished scientific papers.” So far this exists in only one field, Evolutionary Biology. But this looks like a great idea and I expect it will soon exist in statistics, political science, […]

## This company wants to hire people who can program in R or Python and do statistical modeling in Stan

Doug Puett writes: I am a 2012 QMSS [Columbia University Quantitative Methods in Social Sciences] grad who is currently trying to build a Data Science/Quantitative UX team, and was hoping for some advice. I am finding myself having a hard time finding people who are really interested in understanding people and who especially are excited […]

## How is a politician different from a 4-year-old?

A few days ago I shared my reactions to an op-ed by developmental psychologist Alison Gopnik. Gopnik replied: As a regular reader of your blog, I thought you and your readers might be interested in a response to your very fair comments. In the original draft I had an extra few paragraphs (below) that speak […]

## Some natural solutions to the p-value communication problem—and why they won’t work.

John Carlin and I write: It is well known that even experienced scientists routinely misinterpret p-values in all sorts of ways, including confusion of statistical and practical significance, treating non-rejection as acceptance of the null hypothesis, and interpreting the p-value as some sort of replication probability or as the posterior probability that the null hypothesis […]

## How to interpret “p = .06” in situations where you really really want the treatment to work?

We’ve spent a lot of time during the past few years discussing the difficulty of interpreting “p less than .05” results from noisy studies. Standard practice is to just take the point estimate and confidence interval, but this is in general wrong in that it overestimates effect size (type M error) and can get the […]
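To make the type M point concrete, here is a minimal simulation sketch. It is not taken from the post. It assumes the estimate is normally distributed around the true effect with a known standard error, and the function name `exaggeration_ratio` and all the numbers are hypothetical choices for illustration. It estimates how much, on average, a statistically significant estimate overstates the true effect.

```python
import random
import statistics


def exaggeration_ratio(true_effect, std_error, n_sims=100_000, z_crit=1.96, seed=0):
    """Average |estimate| / |true effect| among simulated significant results.

    Assumption (for illustration only): each study's estimate is drawn from
    Normal(true_effect, std_error), and "significant" means |estimate| exceeds
    z_crit standard errors.
    """
    if true_effect == 0:
        raise ValueError("true_effect must be nonzero to form a ratio")
    rng = random.Random(seed)
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_effect, std_error)
        # Keep only the estimates that would clear the significance threshold.
        if abs(estimate) > z_crit * std_error:
            significant.append(abs(estimate))
    if not significant:
        raise ValueError("no simulated estimate reached significance")
    return statistics.fmean(significant) / abs(true_effect)


if __name__ == "__main__":
    # Noisy study: the true effect is small relative to the standard error.
    print(exaggeration_ratio(true_effect=1.0, std_error=5.0))
    # Precise study: the true effect is large relative to the standard error.
    print(exaggeration_ratio(true_effect=10.0, std_error=1.0))
```

In the noisy setting every significant estimate must exceed 1.96 standard errors in absolute value, so the ratio is necessarily large; in the precise setting almost every estimate is significant and the ratio stays close to 1.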

## A completely reasonable-sounding statement with which I strongly disagree

From a couple years ago: In the context of a listserv discussion about replication in psychology experiments, someone wrote: The current best estimate of the effect size is somewhere in between the original study and the replication’s reported value. This conciliatory, split-the-difference statement sounds reasonable, and it might well represent good politics in the context […]

## 7th graders trained to avoid Pizzagate-style data exploration—but is the training too rigid?

[cat picture] Laura Kapitula writes: I wanted to share a cute story that gave me a bit of hope. My daughter who is in 7th grade was doing her science project. She had designed an experiment comparing lemon batteries to potato batteries, a 2×4 design with lemons or potatoes as one factor and number of […]

## What hypothesis testing is all about. (Hint: It’s not what you think.)

From 2015: The conventional view: Hyp testing is all about rejection. The idea is that if you reject the null hyp at the 5% level, you have a win, you have learned that a certain null model is false and science has progressed, either in the glamorous “scientific revolution” sense that you’ve rejected a central […]

## The Bolt from the Blue

Lionel Hertzog writes: In the method section of a recent Nature article in my field of research (diversity-ecosystem function) one can read the following: The inclusion of many predictors in statistical models increases the chance of type I error (false positives). To account for this we used a Bernoulli process to detect false discovery rates, […]

## “The earth is flat (p > 0.05): Significance thresholds and the crisis of unreplicable research”

Valentin Amrhein, Fränzi Korner-Nievergelt, and Tobias Roth write: The widespread use of ‘statistical significance’ as a license for making a claim of a scientific finding leads to considerable distortion of the scientific process. We review why degrading p-values into ‘significant’ and ‘nonsignificant’ contributes to making studies irreproducible, or to making them seem irreproducible. A major […]

## Blue Cross Blue Shield Health Index

Chris Famighetti points us to this page which links to an interactive visualization. There are some problems with the mapping software—when I clicked through, it showed a little map of the western part of the U.S., accompanied by huge swathes of Canada and the Pacific Ocean—and I haven’t taken a look at the methodology. But […]

## Using prior knowledge in frequentist tests

Christian Bartels sent along this paper, which he described as an attempt to use informative priors for frequentist test statistics. I replied: I’ve not tried to follow the details but this reminds me of our paper on posterior predictive checks. People think of this as very Bayesian but my original idea when doing this research […]

## Would you prefer three N=300 studies or one N=900 study?

Stephen Martin started off with a question: I’ve been thinking about this thought experiment: — Imagine you’re given two papers. Both papers explore the same topic and use the same methodology. Both were preregistered. Paper A has a novel study (n1=300) with confirmed hypotheses, followed by two successful direct replications (n2=300, n3=300). Paper B has […]