Psychology researcher Alison Gopnik discusses the idea that some of the systematic problems with human reasoning can be explained by systematic flaws in the statistical models we implicitly use.
I really like this idea and I’ll return to it in a bit. But first I need to discuss a minor (but, I think, ultimately crucial) disagreement I have with how Gopnik describes Bayesian inference. She writes:
The Bayesian idea is simple, but it turns out to be very powerful. It’s so powerful, in fact, that computer scientists are using it to design intelligent learning machines, and more and more psychologists think that it might explain human intelligence. Bayesian inference is a way to use statistical data to evaluate hypotheses and make predictions. These might be scientific hypotheses and predictions or everyday ones.
So far, so good. Next comes the problem (as I see it). Gopnik writes:
Here’s a simple bit of Bayesian election thinking. In early September, the polls suddenly improved for Obama. It could be because the convention inspired and rejuvenated Democrats. Or it could be because Romney’s overly rapid response to the Benghazi attack turned out to be a political gaffe. Or it could be because liberal pollsters deliberately manipulated the results. How could you rationally decide among those hypotheses? . . .
Combining your prior beliefs about the hypotheses and the likelihood of the data can help you . . . In this case, the inspiring convention idea is both likely to begin with and likely to have led to the change in the polls, so it wins out over the other two.
I have no problem with the general message here (which is why I label it as only “slightly” garbled), but I’d like to make one correction on the details. (I’m a statistician; I care about details.) I’m (slightly) unhappy with Gopnik’s framing of the problem as A, or B, or C. It’s really A, B, and C. The appropriate claim, I believe, is not that A is more likely than B or C, but rather that the continuous effect of A is probably larger than the effects of B and C.
As noted, this point is minor: I have no problem with Gopnik’s summary that one of the hypotheses “wins out over the other two.” But I think we are led to confusion if we place ourselves in an either/or setting. (This is probably a good place for me to plug my article with Kari Lock from a couple of years ago on Bayesian combination of state polls and election forecasts, where we use continuous weighting.)
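To make the continuous framing concrete, here’s a little Monte Carlo sketch in Python. All the numbers are invented for illustration (neither Gopnik nor I give any): the observed poll bump is modeled as the sum of a convention effect, a gaffe effect, and a pollster-bias effect, and the question becomes not which hypothesis is true but how likely it is that the convention effect is the largest of the three.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up numbers throughout: suppose we see a 4-point poll bump and
# model it as the SUM of three effects, not as exactly one of three causes.
y_obs, sigma = 4.0, 1.0   # observed bump and its measurement noise (assumed)

n = 1_000_000
a = rng.normal(3.0, 1.5, n)   # convention effect: plausibly large a priori
b = rng.normal(0.5, 1.0, n)   # Benghazi-gaffe effect: plausibly small
c = rng.normal(0.0, 0.5, n)   # pollster-manipulation effect: near zero

# Importance weights: normal likelihood of the observed bump given a + b + c.
w = np.exp(-0.5 * ((y_obs - (a + b + c)) / sigma) ** 2)
w /= w.sum()

# The continuous question: what's the chance the convention effect is
# the largest of the three, given the data?
p_a_largest = float(np.sum(w * ((a > b) & (a > c))))
print(f"P(convention effect largest | data) = {p_a_largest:.2f}")
```

The answer is a statement about relative effect sizes, not a winner-take-all choice among mutually exclusive hypotheses.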
Blame the discrete models, not the priors
One way this seemingly minor point can matter is when we follow Gopnik’s suggestion that Bayesian inference “might explain human intelligence.” I agree that we naturally think discretely. But discrete thinking does not describe how much of the biological and social world works. Out there are lots and lots of varying effects of varying sizes. If we, as humans, take these continuous phenomena and try to model them discretely, we will trip up in predictable ways, even if we use (discrete) Bayesian methods.
To put it another way: what if Josh Tenenbaum and his colleagues (not mentioned in Gopnik’s article but you can search for them here on the blog) are right that our brains use some sort of approximate discrete Bayesian reasoning to make decisions and perform inferences about the world? Then this should imply some predictable errors.
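Here’s one toy version of such an error, entirely my own construction with made-up numbers: the true effect is continuous and middling, but the reasoner’s model allows only “no effect” or “big effect.” The discrete Bayesian update then lands, with near-certainty, on “no effect,” missing the real (if modest) effect entirely.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

true_effect = 0.7                          # the continuous truth
y = rng.normal(true_effect, 1.0, size=50)  # simulated observations

# The reasoner's discrete model: the effect is exactly 0 or exactly 2.
hyps = np.array([0.0, 2.0])
log_lik = np.array([stats.norm.logpdf(y, h, 1.0).sum() for h in hyps])
post = np.exp(log_lik - log_lik.max())     # flat prior over the two hypotheses
post /= post.sum()

for h, p in zip(hyps, post):
    print(f"P(effect = {h:.0f} | data) = {p:.3f}")
# Typical result: near-certainty in "no effect," a predictable error
# produced by the model, not by the prior.
```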
Gopnik asks, “If kids are so smart, why are adults so stupid?” She’s referring to this experiment done in her lab: “We gave 4-year-olds and adults evidence about a toy that worked in an unusual way. The correct hypothesis about the toy had a low ‘prior’ but was strongly supported by the data. The 4-year-olds were actually more likely to figure out the toy than the adults were.”
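The arithmetic behind “low prior but strongly supported by the data” is just Bayes’ rule; here’s a sketch with invented numbers, since the article reports none:

```python
prior = 0.10        # hypothetical low prior on the unusual hypothesis
bayes_factor = 20   # hypothetical strength of the evidence in its favor
posterior = prior * bayes_factor / (prior * bayes_factor + (1 - prior))
print(round(posterior, 2))  # 0.69: the "unlikely" hypothesis now dominates
```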
In that example, Gopnik might well be correct: it seems reasonable to suspect that a kid will have a better prior than an adult on how a toy works.
More generally, though, I think we should avoid the temptation to think that, when a Bayesian inference goes wrong, it has to be a problem with the prior. That’s old-fashioned thinking, the idea that the likelihood is God-given and known perfectly, leaving us all to fight over our priors. In many cases, the model matters (for example, in our discussion above about natural-seeming but flawed discrete models). Even if the data model generally makes sense, its details can matter: as I point out to my students, the prior only counts once in the posterior, but the likelihood comes in over and over again, once for each data point.
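To see that last point in code, here’s a minimal grid computation with made-up data: the log prior is added in once, while the log likelihood accumulates once per observation, so even a handful of data points can swamp the prior.

```python
import numpy as np
from scipy import stats

y = np.array([1.2, 0.8, 1.5, 1.1, 0.9])   # hypothetical observations
theta = np.linspace(-3, 5, 1001)           # grid for the unknown mean

log_prior = stats.norm.logpdf(theta, 0, 1)                  # counted once
log_lik = sum(stats.norm.logpdf(yi, theta, 1) for yi in y)  # once per data point

log_post = log_prior + log_lik
post = np.exp(log_post - log_post.max())
post /= post.sum() * (theta[1] - theta[0])  # normalize on the grid

print(theta[np.argmax(post)])  # mode ~0.92: pulled most of the way from
                               # the prior mean (0) toward the data mean (1.1)
```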
If, as I think is the case, our brains like discrete models (perhaps they can be more quickly coded and computed) but the world is continuous and varying, this suggests interesting systematic ways that our brains might be misunderstanding the world in everyday reasoning. (Conversely, if discrete models really do have major computational advantages, maybe statisticians like myself should be giving them a second look.)
P.S. This post had been titled, “I notice a (slightly) garbled version of Bayesian inference, which provokes some thoughts on the applicability of Bayesian models of human reasoning.” But I think the new title is better!