Christopher Chabris reviewed the new book by Malcolm Gladwell:
One thing “David and Goliath” shows is that Mr. Gladwell has not changed his own strategy, despite serious criticism of his prior work. What he presents are mostly just intriguing possibilities and musings about human behavior, but what his publisher sells them as, and what his readers may incorrectly take them for, are lawful, causal rules that explain how the world really works. Mr. Gladwell should acknowledge when he is speculating or working with thin evidentiary soup. Yet far from abandoning his hand or even standing pat, Mr. Gladwell has doubled down. This will surely bring more success to a Goliath of nonfiction writing, but not to his readers.
Afterward he blogged some further thoughts about the popular popular science writer. Good stuff. Chabris gives a thoughtful explanation of why he does not accept the “Gladwell is just an entertainer” alibi. Some of his discussion reminds me of my article with Kaiser Fung on Freakonomics. The situations are different (Levitt does his own research whereas Gladwell describes the research of others, and overall I’m a lot more positive about Freakonomics than Chabris is about Gladwell), but in both cases there is the tension of applying academic standards to popular entertainment, of taking wonderful and complex stories and flattening them into cliches (see discussion here).
All of Chabris’s post is worth reading, but here’s a part that I noticed because it relates to some of our recent blog discussions regarding small-N counterintuitive findings:
This leads to my [Chabris's] last topic, the psychology experiment Gladwell deploys in David and Goliath to explain what he means by “desirable difficulties.” The difficulties he talks about are serious challenges, like dyslexia or the death of a parent during one’s childhood. But the experiment is a 40-person study on Princeton students who solved three mathematical reasoning problems presented in either a normal typeface or a difficult-to-read typeface. Counterintuitively, the group that read in a difficult typeface scored higher on the reasoning problems than the group that read in a normal typeface.
In my review, I criticized Gladwell for describing this experiment at length without also mentioning that a replication attempt with a much larger and more representative sample of subjects did not find an advantage for difficult typefaces. One of the original study’s authors wrote to me to argue that his effect is robust when the test questions are at an appropriate level of difficulty for the participants in the experiment, and that his effect has in fact been replicated “conceptually” by other researchers. However, I cannot find any successful direct replications—repetitions of the experiment that use the same methods and get the same results—and direct replication is the evidence that I believe is most relevant.
This would be even better, blog-style, with links for the study and the failed replication.
P.S. As a bonus, here’s a post on disgraced science writer Jonah Lehrer, where Chabris writes, “Jonah Lehrer was never a very good science writer. He seemed not to fully understand the science he was trying to explain; his explanations were inaccurate, overblown, and often just plain wrong, usually in the direction of giving his readers counterintuitive thrills and challenging their settled beliefs.”
Particularly resonant to me was Chabris’s linking of Lehrer’s ethical lapses with his scientific lapses:
The fabrications and the scientific misunderstanding are actually closely related. The fabrications tended to follow a pattern of perfecting the stories and anecdotes that Lehrer — like almost all successful science writers nowadays — used to illustrate his arguments. Had he used only words Bob Dylan actually said, and only the true facts about Dylan’s 1960s songwriting travails, the story wouldn’t have been as smooth. . . .
After the Dylan episode, others found more examples of how Lehrer did this. I think one of the clearest was Seth Mnookin’s analysis of Lehrer’s retelling of psychologist Leon Festinger’s famous original story of “cognitive dissonance,” based on Festinger’s experience of infiltrating a doomsday cult in 1954. Of the moments after an expected civilization-destroying cataclysm failed to start, Festinger wrote, “Midnight had passed and nothing had happened … But there was little to see in the reactions of the people in that room. There was no talking, no sound. People sat stock still, their faces seemingly frozen and expressionless.” Lehrer narrated the same event as follows: “When the clock read 12:01 and there were still no aliens, the cultists began to worry. A few began to cry. The aliens had let them down.” Do you see the difference? Lehrer’s version is more dramatic: people worry, they cry, they feel let down. It’s more human. Each one of these little errors or fabrications makes the story work a little bit better, makes it match our expectations more closely, and thus gives it greater influence on our beliefs.
This seems closely related to the idea that Thomas Basbøll and I had, that plagiarism (or, more generally, obscuring of the provenance of data) is a statistical crime in that it reduces our ability to learn from reality.
Simplification is necessary for storytelling, but when you smooth away the parts of the story that don’t fit your template (whether you’re Levitt, Gladwell, Lehrer, or anyone else) you close off a possibility for learning.