A kaleidoscope of responses to Dubner’s criticisms of our criticisms of Freakonomics

Jonathan Cantor pointed me to a new blog post by Stephen Dubner in which he expresses disagreement with what Kaiser and I wrote in our American Scientist article, “Freakonomics: What Went Wrong?”.

In response, I thought it would be interesting to go “meta” here by considering all the different ways that I could reply to Dubner’s criticism of our criticism of his writings. (In the same post, Dubner also slams Chris Blattman (or, as he calls him, “a man named Chris Blattman”), but I won’t get into that here.)

In reacting to Dubner, I will give several perspectives, all of which I believe. Usually in writing a response, one would have to choose, but here I find it interesting to present all these different perspectives in one place.

1. Understand.

I understand where Dubner is coming from. We say some positive things and some negative things about his work, but it’s natural for him to focus on the negative. I’m the same way myself; probably almost all authors are. In not replying to every point in Dubner’s post, I am not implying that I agree with the items not mentioned; rather, I’m relying on what Kaiser and I have already written. We said our piece, Dubner’s had his reply, and the reader can work it out from there.

2. Show respect.

As I wrote to Dubner last week:

Both Kaiser and I greatly admire Freakonomics, especially in the way that in your books and your blog you engage the reader in the struggles and excitement of discovery in the social sciences. Also, we share your interest in topics ranging from sports to baby names to criminology. Where we do criticize your writings, it is always from the perspective that your work has set a high standard. As sometime popularizers ourselves, Kaiser and I are well aware of the challenges of communicating topics of scientific debate to general audiences.

In his recent post, Dubner writes that we “come to bury, not to praise” his work. I regret that we gave that impression. As noted, we are fans of Freakonomics. I can’t speak for Kaiser but I continue to read the Freakonomics blog with interest. My disappointment at some of the low-quality items does not remove my appreciation of the good stuff, and I feel bad that we did not emphasize this more in our article.

Here it is again (from a couple years ago).

3. Step back.

Dubner lives in different worlds than those of Kaiser and me. (Levitt is in between, with one foot in the publishing/media world and the other in academia.) To the millions of readers of his books and blogs, Levitt and Dubner are the kings (and rightly so, they’ve done some great stuff), and Kaiser and I have the status of moderately-annoying gnats.

But I suspect Dubner realizes that, outside of his circle, he and Levitt have some credibility problems. They have fans but a lot of non-fans too. As I wrote a couple months ago:

About a year ago, I gave my talk, “Of Beauty, Sex, and Power,” at the meeting of the National Association of Science Writers. At one point I mentioned Freakonomics and the audience groaned. Steve Levitt is not a popular guy with this crowd. And that’s the typical reaction I get: “Freakonomics” is a byword for sloppy science reporting, it’s a word you throw out there if you want an easy laugh. Even some defenders of Freakonomics nowadays will say I shouldn’t be so hard on it, it’s just entertainment.

Now go back a few years. In 2005, Freakonomics was taken seriously. It was a sensation. Entertaining, sure, but not just entertainment—rather, the book represented an exciting new way of looking at the world. There was talk of the government hiring Levitt to apply his Freakonomics tools to catch terrorists.

That’s what Kaiser and I meant when we asked “What went wrong?” Freakonomics was once a forum for a playful discussion of serious, important ideas; now it’s more of a grab-bag of unfounded arguments. There’s some good stuff there but seemingly no filter.

This is what I’m talking about. When a roomful of science reporters treats you like a punch line, the problem isn’t with statisticians Gelman and Fung, or with economists Ariel Rubinstein and John DiNardo, or with bloggers Felix Salmon and Daniel Davies (to name several people who have published serious criticisms of Freakonomics). There are deeper problems, some clue of which might be found by reading all these critiques with an eye to learning rather than mere rebuttal. Don’t get distracted by your fans on the blog—consider that room full of science writers! Try to recover the respect of Felix Salmon and Daniel Davies; that would be a worthy goal.

4. Explain.

As noted above, Dubner seems to misunderstand the purpose of our article. We think Freakonomics has some great stuff but it also seems (to us) to lend an air of authority to some speculations and some outright mistakes. We had to keep our article short to fit the constraints of American Scientist magazine, so we thought it best to detail a few of the problems and then discuss how it all could’ve happened.

Dubner asks why we did not include other examples from the Freakonomics book or from their radio broadcasts. The answer is that we did not have space to include all the examples we had already found. We were not looking for more. Our point was not that X% of Freakonomics articles had problems, our point was that some avoidable errors had gotten through.

In another confusion of motivation, Dubner describes it as “weaselly” when Kaiser and I wrote, “Although there’s no way we can be sure, perhaps, in some of the cases described above, there was a breakdown in the division of labor when it came to investigating technical points.” It looks weaselly to Dubner, but to a statistician such as myself, it’s caution. Dubner compares our writing unfavorably to that of freshman composition students. Fair enough: he’s the professional writer, we are not. I fear it would upset Dubner even further to learn that Kaiser and I are generally considered to write better than the average statistician! I’ll just say that we do our best and that we would welcome the constructive suggestions of any professional writer who cares to give us advice. Until then, I recommend that Dubner accept that academics tend to write in a certain stilted way, and there’s no need to attribute weaseliness to our awkward prose stylings.

To get back to the specific point, we suspect there were breakdowns in the division of labor, but we can’t be sure, so that’s what we wrote. I’ve never met Steven Levitt but I’m pretty sure that he has the technical ability to, say, evaluate a statistically-flawed claim about beauty and sex ratios. But he’s a busy man and doesn’t necessarily have the time for it. That’s an example of a breakdown in the division of labor.

Here are a few of the other things that have been presented uncritically on the Freakonomics blog over the past several years:

– Casey Mulligan’s claim in October 2008 that the economy is not that bad: “The current unemployment rate of 6.1 percent is not alarming.”

– The ESP research of Daryl Bem. In no uncertain terms, “Daryl Bem has demonstrated ‘numerous “retroactive” psi effects – that is, phenomena that are inexplicable according to current scientific knowledge . . .'”

– A report, unqualified by any skepticism, of a dubious claim that “The companies Tiger Woods endorsed — and their shareholders — are feeling the negative effects of his extramarital affairs.”

Any of these (along with the other examples that have come up on this blog and elsewhere) is understandable, but when you put it all together, the message is that you can’t trust what you see in Freakonomics. It’s often interesting, often provocative, sometimes avoidably wrong. There’s an on switch but no off switch, no skeptical Felix Salmon voice in the background saying, “Hey, wait a minute? Are you sure about that?”

Summary

Levitt and Dubner can do better—I know they can do better, I’ve seen it in their best work—and I hope that they will be motivated by our critiques, along with those of Rubinstein, DiNardo, Salmon, and Davies, to be more careful about trusting the latest claims of albedo-focused billionaires, ESP researchers, politically incorrect sociologists, and various other rogues who come to them with compelling stories. Dubner is a professional journalist with a proven talent for inspiring millions of readers with the excitement of discovery; I am a sour statistician, trained to be skeptical. I’m sure each of us could stand to be a bit more like the other.

51 thoughts on “A kaleidoscope of responses to Dubner’s criticisms of our criticisms of Freakonomics”

  1. Mostly a good post, although the point about seeking Daniel Davies’ approval is a bit odd (he’s an English stockbroker whose econometric capabilities don’t seem to be anywhere near Levitt’s or your own).

    • JD:

      I cited Davies and Salmon partly because I thought they made good points in their critiques, and also to indicate that Freakonomics has been criticized by non-academics.

      Also, one thing I forgot to mention above was that I liked that Dubner referred to some research (in this case by law professor Dan Kahan) in his reply to me. I don’t blame Dubner for getting annoyed—I’d feel the same way if someone came out of nowhere to mock me, too. I was happy with Dubner’s measured tone, given the psychological constraints under which we all operate, and given his understandable perception of our article as an attack (even if we did not intend it that way).

  2. Kudos to you guys for following up on this stuff. We scientists _crave_ having science writers out there who can write better than freshman composition students while displaying a high standard of criticality (and hence technical correctness) – there’s just so few of them. So when we see writers with the required writing ability _and_ a potential for criticality it is absolutely necessary to try to nudge them onto the right path (i.e. being more critical).

  3. I agree with a lot of what you have said here. I also listen to the “Freakonomics” podcast and consider it entertainment. Many of their pieces remind me of that old book “How to Lie with Statistics.” Particularly a somewhat recent podcast arguing that it is safer to drive drunk than to walk drunk.

    The one thing that really gets me (as a social scientist but not an economist) that I don’t think you’ve mentioned here (but I haven’t read the other things you have written on this issue) is that their show is often couched in language suggesting that the whole world makes sense if you just “think like an economist” or perhaps more specifically, a behavioral economist. And that every other social scientist is wasting their time not seeing the forest for the trees. I understand that this helps them be provocative and probably results in more sales of the book/listens of the podcast/reads of the blog, but they are also marginalizing a lot of the research from other areas of social science that informs their thinking.

  4. Dubner is a professional journalist with a proven talent for inspiring millions of readers with the excitement of discovery; I am a sour statistician, trained to be skeptical. I’m sure each of us could stand to be a bit more like the other.

    Give me a break. This is a weaselly comment if I ever saw one. A scientist bends over backwards to find problems with a theory or idea. If a statistician is a scientist, that’s what he/she does. A scientist with integrity does it to his/her own work, but other scientists do it if the original one won’t. I realize your position is more equivocal, as you are also a political scientist, and the standards of scientific validity in that field are, to put it mildly, less stringent than in statistics. But just because you’re hobnobbing with the dark side doesn’t mean you have to forget what you learned (though now that I think about it, there’s a downward spiral–physics, statistics, political science–what next? Scientology?)

    • Numeric:

      Hey, now everybody’s calling me a weasel! To unpack my comment a bit: When I do statistics and social science, yes, I think my sour skepticism is just fine. When I popularize, I don’t want to lower my standards, but there should be a way for me to be skeptical without being sour, perhaps allowing more speculation while being clear that I am speculating. When they uncritically promote the work of Daryl Bem and Satoshi Kanazawa, the Freakonomists go too far, but there is an engaging aspect of their books when they breezily go through hypothesis after hypothesis after hypothesis. I tend to be more plodding in my style.

      • I didn’t call you a weasel. I called your comment weaselly. I took my definition of a scientist from Feynman, but let me recall a letter he wrote to the LA Times. He had made the mistake of giving an interview to a science reporter from that “newspaper” (recall the Robert Benchley joke–I was on a train and asked the porter for a newspaper and the poor man, being hard of hearing, brought me a copy of the Los Angeles Times) on the topic of the “fifth” force in physics (a bogus experiment was getting some popular press) and the reporter had mangled Feynman’s answers to make it appear he was supporting this concept. Feynman wrote a letter stating that if there was a fifth force, one would expect the following to be true, and we don’t observe it, and this was the point he was making to the reporter, not what was written.
        My point is that the letter was a straightforward this is implied, these are the results we expect, this is what we get. No editorial comment. You could remove the editorial comments to be less “sour”.

        That aside, pretty much all social scientists are weasels in Feynman’s sense, as he felt that scientists had a duty to expose erroneous reasoning wherever it might occur. You clearly pick your targets and avoid certain individuals. Since there are a number of individuals trying to fire you, it is doubtlessly prudent to avoid alienating your friends, but it is a lacking in the sense that Feynman would have had a scientist behave (I should mention that they threatened to kick him off the Challenger investigation if he issued his original report, and he modified it, so even his integrity was qualified).

        • Numeric:

          You write, “You clearly pick your targets and avoid certain individuals. Since there are a number of individuals trying to fire you, it is doubtlessly prudent to avoid alienating your friends, but it is a lacking in the sense that Feynman would have had a scientist behave.”

          I think you’re confused here. Nobody is trying to fire me; I’ve been tenured forever. And I don’t know what you mean about alienating my friends. Is there a particular friend of mine you’re thinking about, who is doing the statistical equivalent of mistaken “fifth force” research that I’m holding off on criticizing??

          On the contrary, I’ve never had problems with public disagreements with my friends, including Adrian Raftery, Christian Robert, Ray Fisman, . . .

          Your statements about weaselly comments may apply in some general sense to scientists but I don’t think they apply so much to me!

        • I think you’re confused here. Nobody is trying to fire me; I’ve been tenured forever.

          You were complaining about your department head at Columbia wanting to get rid of you. You didn’t receive tenure at Berkeley. I agree it won’t be easy to get rid of you unless you take up with your daughter like some other political science faculty at Columbia (that’s a joke–I don’t know if you have a daughter and if you do I’m sure you won’t take up with her). I do have an example of your failure to be as obnoxious as you might be to someone you aren’t friends with but it would identify me and I have too much fun making snarky comments under the guise of anonymity. I don’t think you’re particularly weaselly but I don’t see how one gets along in an academic environment without being so. I won’t exempt myself here, and I’ll give an example. On a project I’m associated with, a statistician at Berkeley (someone you’d love to see taken down a peg or two) has proposed a model which, while statistically complex and “elegant”, does not fit the problem well at all. One of the investigators on the project (it’s spread over multiple universities) took me aside and told me that this statistical model is a poor choice but that it has the advantage that it shows that another one of the principal investigators’ work is decidedly sub-par, and that principal investigator refuses to listen to anyone but that statistician. So the statistical model which nearly everyone agrees is not a good choice is the model used in the project because of internal political considerations. I suppose one could develop a theory of statistical inference based on this criterion but it’s mostly just an embarrassment not to be talked about.

        • The sad truth is if most people practiced what Feynman preaches they’d be pointlessly unemployed and broke for 2 reasons 1) most people are not irreplaceable (in terms of their work contribution) and 2) you’d probably end up being wrong a lot anyway (I imagine cranks, in Martin Gardner’s sense, probably imagine they’re acting like Feynman… and there are a lot more cranks than Feynmans…).

          Lucky for tenured professors, they’re a bit more shielded from these concerns relative to everyone else … don’t think this applies much to Andrew. The “meta” commentary does make the positive comments sound pretty insincere though.

        • Numeric:

          Regarding your colleague who proposed a model that is elegant but does not fit the problem, I’m reminded of a discussion I’ve had with Don Rubin in the context of several different examples. I wrote about it here:

          Like many (most?) statisticians, my tendency is to try to model the data. Don, in contrast, prefers to set up a model that matches what the scientists in the particular field of application are studying. He doesn’t worry so much about fit to the data and doesn’t do much graphing. For example, the schizophrenics’ reaction time example (featured in the mixture-modeling chapter of Bayesian Data Analysis), we used the model Don recommended of a mixture of normal distributions with a fixed lag between them. Looking at the data and thinking about the phenomenon, a fixed lag didn’t make sense to me, but Don emphasized that the psychology researchers were interested in an average difference and so it didn’t make sense in his perspective to try to do any further modeling on these data. He said that if we wanted to model the variation of the lag, that would be fine but it would make sense to gather more data rather than knocking ourselves out on this particular small data set. In a field such as international relations, this get-more-data approach might not work, but in experimental psychology it seems like a good idea. (And I have to admit that I have not at all kept up with whatever research has been done in eye-tracking and schizophrenia in the past twenty years.)

          Your story is probably different but it’s interesting to think of these different attitudes toward models and data fitting. Rubin’s approach is more like what they do in economics, where a model is chosen on substantive grounds rather than being based on data.

      • I can’t believe you’re letting this guy’s clear misunderstanding of political science and other social sciences stand unchallenged. Or is that an exercise for the reader: find the sentence that proves this guy doesn’t know what he’s talking about?

  5. I agree with your article, though I think you mis-wrote when you said, “I hope that they will be motivated by our critiques, along with those of Rubinstein, DiNardo, Salmon, and Davies, to be more careful about trusting the latest claims of … politically incorrect sociologists, and various other rogues who come to them with compelling stories.”

    Personally, I can’t see any benefit for science when scientists (let’s include sociologists, if we may) are politically correct. (Job security is another matter, and one does have to worry about consensus and political correctness to some degree.) I think I know what you mean by associating “politically incorrect sociologists” with “rogues”, but “politically incorrect” is not really what you’re shooting for.

  6. Personally, I found Freakonomics’ coverage of the “birthdate bulge effect” in youth sports highly informative. I was initially skeptical, but now I make sure to keep the effect in mind.

    As I’ve said before, the strongest black mark against Levitt is his bull-headed insistence from 1999 onward on the correctness of his celebrated abortion-cut-crime analysis, which turned out to be based on his own faulty programming. But, you don’t bring that up.

    Overall, I would say that the Freakonomicsmania of 2005 was absurd, based on the then widespread assumption that since economists have perfected their management of the economy it’s time for them to turn their infallible glance at other questions. But, few feel that way anymore, so I see the Freakonomics blog today as basically benign.

    • Steve:

      When Freak 1 came out, Levitt was a hero. Now he’s a punch line. So something happened (according to my crude before-after analysis). It may very well be, as you suggest, that the decline in reputation was simply a natural ebbing of the hype, a regression to the mean (to continue the analogy of before-after studies). But I think it didn’t help that they lent their credibility to fringe characters such as Bem, Kanazawa, Mr. Albedo, etc.

      Sometimes I think that Levitt, Dubner, and their crew are basing their judgment on the “What would Robert Heinlein think” principle. ESP? Check. Absurd claims of sex differences? Check. Voting is for suckers? Check. Our climate problems solved by the good old American knowhow of a genius billionaire who got a Ph.D. at age 23 and now has a plan to save the world and make another fortune? Check, check, check, check, check.

      Maybe here’s the problem. As I keep saying, Levitt has the ability and the training to see through a lot of the errors that have gotten into Freakonomics. But having the ability and training isn’t the same thing as doing it. Actually looking into a study takes work. And I’ve suspected all along that the Levitt half of the Levitt-Dubner duo sees the whole Freakonomics franchise as a fun side project, not to be taken as seriously as his real academic work. Hence the breakdown of the division of labor. One of Levitt’s “jobs” in the Freakonomics team is to check the scientific reasoning, but maybe he just doesn’t feel like doing it. So he lets his co-bloggers post on ESP, he decides to trust Casey Mulligan on the health of the economy, etc.

      On the other hand, as you say, Freakonomics has a lot of good stuff too. The trouble is that when the good stuff is mixed in with the bad, it’s hard for an outsider to know what to do.

      • “What would Robert Heinlein think?”

        That’s great. I bet you are right on the Heinlein influence on Levitt or Dubner. I’m a lifelong Heinlein fan myself, but I cherish him as an intellectual provocateur, not as a final authority (especially since he constantly contradicted himself depending on the political views of his latest wife). I suspect Heinlein considered setting himself up in the cult seer business like his old buddy L. Ron Hubbard and Ayn Rand, but he resisted the temptation. I suspect he would have gotten bored.

        Anyway, my impression is that Levitt getting humiliated by economists Christopher Foote and Christopher Goetz discovering in late 2005 that his most famous theory, abortion-cut-crime, was based on simple error, had no impact on his celebrity. The real mistake Levitt made was not showing sufficient worshipfulness toward global warming orthodoxy in SuperFreakonomics.

        • Steve:

          1. My impression is that 25+ years ago, Heinlein was a big name and Ayn Rand was an obscure footnote (I certainly hadn’t heard of her!), whereas nowadays it’s the reverse: probably lots of people have Heinlein’s books on their shelves, but Rand is the big shot. It seems kind of unfair, somehow.

          2. I agree that the global warming muddle didn’t help Levitt’s reputation. The interesting thing is, he didn’t go all-in on that stuff the way he did with abortion, drunk walking, it’s-irrational-to-vote, and so on. I took a look at a bunch of things that Levitt and Dubner wrote about climate change around the time their book came out, and they were all over the map, sometimes saying that global warming was a big deal and they wanted to stop it, other times saying that the earth was about to enter a cooling phase. I admire them—sort of—for not trying to maintain a consistent story. But I would’ve admired them even more if they’d just flat-out admitted that they didn’t know what they were talking about. And it’s hard to blame them for getting sucked into Albedo-man’s reality distortion field, given that so many other reporters have been in that black hole themselves (I recall a New Yorker puff piece from a few years ago on his patent-troll shop^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H invention factory and a couple of highly favorable NYT articles on his cooking book).

          Hmm, I’m rambling here . . . what I’m trying to say is, yes, they screwed up by devoting an entire chapter of their book to a topic they knew nothing about. But I think this was a symptom of a larger problem, which is that they were trying to write a book and they didn’t have enough material.

          I haven’t listened to Dubner’s radio show but maybe he’s right that it’s great. I could imagine a radio show being the perfect medium for Freakonomics.

        • 25+ years ago, I don’t know, but between 20 and 25 years ago, which covers the times I read both Heinlein and Rand, both were well-known, but not really as things of contemporary interest; more like “books that were very popular and considered important by the generation before ours, and which you therefore see as used paperbacks a lot.” I wonder which writers I think of as important are seen that way by today’s college students?

    • I’m a layman – why is there no final word on the abortion result? I’ve seen Levitt defend it plausibly. That would be foolish if he were wrong.

  7. “Warning: what follows is a horribly long, inside-baseball post that most people will likely have little interest in reading, and which I had little interest in writing. But it did need to be written. Apologies for the length and the indulgence; we will soon return to our regular programming.” The disclaimer on Dubner’s article.

    Could he get any more condescending? Not only to the critics but also to the readers of the blog. This is boring stuff, you wouldn’t be interested. Why wouldn’t their readers be interested in seeing how they respond to criticism? Surely this is when people learn the most?

  8. What Steve said. I have learned so much about “popular science” and its popularizers from the way they respond to criticism (see a nice example on spiritual sisterblog, Language Log, right now, in which Mark’s blogged critique of a published minuscule correlation lacking any causative mechanism sent the original authors into panic mode: http://languagelog.ldc.upenn.edu/nll/?p=3848).

  9. I’ve never met Steven Levitt but I’m pretty sure that he has the technical ability to, say, evaluate a statistically-flawed claim about beauty and sex ratios.

    I don’t necessarily agree and I think this is the problem. Obviously at some point in the past, Levitt’s career peak in terms of technical econometrics was higher than mine because he’s got a Chicago PhD, which I don’t think you can get without passing more rigorous econometrics courses than I’ve ever taken; also, James Heckman’s view of his work was not in the past as negative as it apparently is now. But, straight out of the gate, he has never really portrayed himself as an econometrician and has always avoided technical econometrics in his academic career – the entire point of Freakonomics was to find a short way with difficult econometric issues by searching for the notorious quirky datasets, and nearly every single one of Levitt’s most famous papers had a co-author who was responsible for the econometrics. I seem to remember reading a more recent interview with Levitt where he basically said that his comparative advantage was now in the finding (and the use of his celebrity to persuade people to release) datasets, and that he intended to follow his comparative advantage and leave all the statistical work to co-authors. I think his technical skills are very likely to have atrophied a bit – although frankly, it is not really as if things like Satoshi Kanazawa and the Tiger Woods effect required any advanced skills at all to see the problems, so this can’t be the whole story.

    I take a somewhat harsher view of the Freakonomics blog – I don’t think that the good stuff excuses the bad, because it’s the presence of the good stuff that ensures that a wider audience will be misled by the bad (I call this the “Economist Effect”, after the newspaper). The real damage in, say, the dot com era was not done by bucket shops, but by reputable research houses that dropped their standards. I really wish Justin Wolfers would either leave the blog, or take the quality-control role that is needed.

    Thanks very much for your kind comments, by the way.

    • Dsquared:

      When I said that Levitt can evaluate a claim, I’m not talking about econometric or statistical skills so much as common sense (of the social-science variety). For example, David Lee is a top econometrician, but I respect Levitt’s political science work much more than Lee’s: when working on a political science problem, Levitt seems focused on the real question, whereas Lee is all about identification without always a clear idea of what is being identified. So what I was saying about Levitt was that he seems to have a sense of the connections between data, analysis, and the social science questions of interest.

      When Levitt goes into junior-Gary-Becker mode (drunk walking, rationality of voting, etc.), I’m not so impressed. Then again, I’m not impressed when Gary Becker goes into Gary Becker mode either.

      Regarding the beauty-and-sex-ratio example, yes this seems easy enough to follow, but maybe it’s not so easy. Here’s how I noticed the problem:

      1. Claimed effect size of 36%. I happened to know this was a ridiculously huge number, but only because I was curious about sex-ratio variation and had looked it up about 20 years ago when preparing a class I was teaching. (See here for the story.)

      2. Cultural politics. I’m suspicious of people who deny sex differences or who exaggerate them. Either way it sounds like someone’s letting their politics get the better of them. In this case, the content of Dubner’s post made it pretty clear that the sex-ratio story matched his politics. He wanted to believe it was true because it fit his worldview.

      3. The statistical significance filter. Actually, this is a concept I developed only after thinking about that example. But at least I knew that p<0.05 isn’t enough reason to believe a claim.

      Levitt presumably didn’t know about sex ratios (thus couldn’t use cue #1 above), was culturally predisposed to want to find huge sex differences (consistent with the biological-deterministic thinking that is all the rage among some economists nowadays) hence he lost out on #2, and his second-hand knowledge of statistics meant that cue #3 was not easily available to him.

      Levitt didn’t actually write that blog post, nor did he write the Tiger Woods post or the ESP post. But he didn’t seem to mind that his franchise was putting them out. All of this just seems to fit the story that Levitt treats the whole Freakonomics enterprise as a bit of fun, not to be taken seriously. I blog as a way to work out ideas, as a form of relaxation, and as part of my professional service. Levitt edits journals as his professional service. I doubt he cares much what appears on the blog. My guess is that if you cornered him, he’d admit that the sex-ratio, Tiger Woods, and ESP reports were all mistaken, but he probably wouldn’t care—not enough to post a correction note, certainly. After all, it’s just entertainment!

      • Always been curious about “blind sights” such as people choosing to smoke beside empty gas cans rather than full gas cans (although the risk is from fumes that are higher in the empty cans, empty _sounds_ safe)

        “his second-hand knowledge of statistics meant that cue #3 was not easily available to him”

        One of my first encounters with this was the (early career) faculty member with a Harvard PhD in Biostats and their senior (late career) epidemiology colleague who argued, “the study was very small with very low power and statistically significant so the true effect must be huge!” My sense is that many of the statistically well trained don’t get this early in their careers (“concept I developed only after thinking about that example”), and that is interesting.
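
        To make the point concrete, here is a minimal simulation sketch of the significance filter. The true effect (0.3), the standard error (1.0), and the function name are hypothetical values chosen only for illustration; they are not taken from any study discussed here. Each simulated study reports an estimate drawn around the true effect, and only the estimates that clear the usual 1.96 standard error threshold are kept.

```python
import random
from statistics import fmean


def significance_filter(true_effect=0.3, std_error=1.0, n_studies=100_000, seed=1):
    """Simulate many small studies and keep only the 'significant' ones.

    All parameter values are illustrative assumptions, not real data.
    Returns (power, exaggeration ratio, share of significant results
    with the wrong sign).
    """
    rng = random.Random(seed)
    significant = []
    for _ in range(n_studies):
        # Each study's estimate is the true effect plus sampling noise.
        estimate = rng.gauss(true_effect, std_error)
        # Two-sided test at the 5% level: |estimate| > 1.96 standard errors.
        if abs(estimate) > 1.96 * std_error:
            significant.append(estimate)
    power = len(significant) / n_studies
    # How much larger, on average, a significant estimate is than the truth.
    exaggeration = fmean(abs(e) for e in significant) / true_effect
    # How often a significant estimate points in the wrong direction.
    wrong_sign = sum(e < 0 for e in significant) / len(significant)
    return power, exaggeration, wrong_sign


if __name__ == "__main__":
    power, exaggeration, wrong_sign = significance_filter()
    print(f"power: {power:.3f}")
    print(f"average exaggeration of significant estimates: {exaggeration:.1f}x")
    print(f"share of significant estimates with the wrong sign: {wrong_sign:.2f}")
```

        Under these assumptions every significant estimate is more than 1.96 in absolute value, so it overstates the true effect of 0.3 by a factor of at least about 6.5. A small, low-powered study that reaches significance is therefore weak evidence for a huge effect; the filter alone guarantees a large reported number.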

        A question I use to identify statistical naivety amongst the statistically well trained is “there is absolutely no effect, the assumptions are all true and the study is repeatedly done – what is the distribution of the p-values?” Of the last two I tried this on, both with MScs in statistics and years of experience, one got it and one didn’t. I asked the one who did get it, and they told me their professor had made them work this out.

        • Well, when I was asked that question (entrance interview to grad biostat school) I said I hoped it would be Normal ;-)

          Usually it’s just “don’t know”. Once, after my suggestion that they do simulations assuming single group Normally distributed data, I got a “I don’t think it is actually uniform, but I am not sure yet.”

          [More generally, as I am sure you know, it is a complicated distribution on functions of nuisance parameters.]

          Simulation _should_ clarify this sort of thing easily for most people.
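
          A minimal sketch of such a simulation, under simplifying assumptions that are mine and not from the thread: a single group of Normally distributed data with true mean zero and known standard deviation 1, tested with a two-sided z-test. The function names, sample size, and number of replications are hypothetical. With these assumptions all holding, the p-values should be spread evenly over the interval from 0 to 1.

```python
import random
from statistics import NormalDist, fmean


def null_p_values(n_studies=20_000, n=20, seed=1):
    """Repeat a study with no true effect and collect its p-values.

    Assumes (for illustration only) Normal data with mean 0 and known
    standard deviation 1, analysed with a two-sided z-test.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    p_values = []
    for _ in range(n_studies):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z statistic for the sample mean when the standard deviation is 1.
        z = fmean(sample) * n ** 0.5
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    return p_values


def decile_shares(p_values):
    """Share of p-values falling in each tenth of the unit interval."""
    counts = [0] * 10
    for p in p_values:
        counts[min(int(p * 10), 9)] += 1
    return [c / len(p_values) for c in counts]


if __name__ == "__main__":
    ps = null_p_values()
    for i, share in enumerate(decile_shares(ps)):
        print(f"{i / 10:.1f}-{(i + 1) / 10:.1f}: {share:.3f}")
    print("share below 0.05:", sum(p < 0.05 for p in ps) / len(ps))
```

          Each tenth of the interval should hold roughly a tenth of the p-values, and about 5% should fall below 0.05. Swapping in an estimated standard deviation or non-Normal data is an easy way to see how the picture changes when the assumptions do not hold exactly.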

          On another note, I was looking for something that argued for simulating power calculations rather than encouraging people to get it (mostly) wrong with oversimplified (and overvalued) formulas and found this – Size matters: just how big is BIG? Quantifying realistic sample size requirements for human genome epidemiology.

          Of possible interest here,
          “Studies enrolling several hundred subjects are commonplace in human genome epidemiology. But,… provide <1% power”

  10. This might be petty of me, but did anyone note that this was Dubner and not Levitt? Are we sure that Levitt agrees with this response? If he does, it seems pretty unfortunate.

    • Jonathan,

      Considering that the response went out on a blog that has Levitt’s name right there, I assume that, even if he doesn’t agree with the response, he endorses it. Just as he implicitly endorsed the claims about ESP, the Tiger Woods effect, beauty and sex ratio, etc etc etc.

      I don’t expect a blogger to endorse (explicitly or not) all the comments on his blog, but I would expect him not to object to the posts.

  11. Since migrating from the NYTimes site, the “Freaks” have turned their site into an economics version of the Drudge Report. As others have noted, the site now mixes tabloid articles with academic research, leading to confusion about why exactly they are promoting certain links. This seems to be the logical conclusion of pursuing counter-intuitive conclusions at any cost. A prime example is this gem, simply described on the site as “British medical students turn to prostitution.” A clever bit of Freakonomic insight or total garbage? You be the judge:
    http://www.freakonomics.com/2012/03/02/some-links-worth-reading/

  12. Pingback: Freakonomics Critique and Rebuttal

  13. Oh I wasn’t denying that, I am curious whether Dubner is credited to it so that Levitt doesn’t get any blow back from the response. I think Levitt would have to respond differently especially on the methodological considerations. But your point is entirely valid.

    • I don’t know, but my guess is that Levitt doesn’t really care about it. It’s always seemed to me that for Levitt, Freakonomics is a fun side project that he doesn’t take seriously.

      • I mean his most recent paper is similar to that. I posted it before but it was a bit disconcerting that it was released at all. Still not sure what he was studying or measuring.
        http://www.nber.org/papers/w17023

        Nothing against the guy but his talents can be used to answer more substantive questions (like the stuff you do).

      • Levitt’s response (http://www.freakonomics.com/2007/04/25/am-i-ruining-economics-or-not/) to Noam Scheiber’s critical article (http://www.tnr.com/article/freaks-and-geeks-how-freakonomics-ruining-the-dismal-science) in the New Republic in 2007 seems to indicate that he takes Freakonomics very seriously. In particular, note his stunningly arrogant opening comments in his rebuttal in the above link.

        Perhaps Levitt has mellowed since then, but Dubner’s incredibly long recent blog post gives the impression that they both perceived Andrew’s and Kaiser Fung’s article in the American Scientist to pose an existential threat to their brand.

        • Welles:

          I followed the link. Levitt is defending his own peer-reviewed research there, but he doesn’t seem to be defending the weaker claims made in the Freakonomics blog and books, such as the claim in Oct 2008 that the economy is going well, the claim that that dude from Cornell discovered ESP, the claim about drunk driving, the Tiger Woods effect, the non-mention of the public health researchers who were right all along on the missing girls, some of his wackier claims about pimps, etc etc. I assume that Levitt would stand by all these—after all, they all appeared on his blog or book and I haven’t heard of him backing down on any of them—but maybe he doesn’t feel so strongly about it that he is willing to debate Kaiser and me on the subject.

          I didn’t think our article was such an existential threat to the Freakonomics brand. The previously-published articles and blogs from Rubinstein, DiNardo, Salmon, and Davies seemed to me to be more comprehensive and hard-hitting. Our goal, as fellow pop-statistics writers (albeit much less successful than Levitt/Dubner) was to explore what went wrong with the Freak brand which, as noted above, is already a punch line in many quarters.

  14. Pingback: » “Freakanomics” by Steven D. Levitt, Stephen J. Dubner Ben Goertz

  15. “When Freak 1 came out, Levitt was a hero. Now he’s a punch line. So something happened (according to my crude before-after analysis).”

    I’ve read some version of this point multiple times by you. It’s completely anecdotal! Your MAIN POINT stems from “I was in a room and some people groaned”. That’s it! Everything else hinges on this. You think that Dubner and Levitt need to “win back” the respect of the people “groaning”.

    I’m not sure you could have chosen a more unscientific (and downright juvenile) basis for lamenting the “science cred” of an author.

    • Tom:

      Our systematic points are explained in the American Scientist article. In the blog, I went into detail on the reception of Freakonomics (and this experience happened many times, it was not just once) to convey to Levitt and Dubner that they don’t just have a problem with Kaiser and me, and with Ariel Rubinstein, and with John DiNardo, and with Daniel Davies, and with Felix Salmon. In many circles, Freakonomics has a bad reputation—which I think is too bad because I worry that this reputation diminishes lots of the excellent stuff that they do. By telling that story, I’m hoping to jog Levitt and Dubner to think seriously about the cost to their reputation of mixing in items of dubious quality along with their good stuff.

      • Here is Dubner’s recent comment: “For what it’s worth, those are two of the key criteria that go into determining what Levitt and I write: overlooked and worth writing about.”

        The Freakonomics Logo is showing how things aren’t always what they seem.

        Their oft-quoted “manifesto” uses phrases like ‘people respond to incentives’, and something like: ‘an honest look at data can tell you things that go against conventional wisdom.’

        I never got the impression that they were out to justify themselves to professional societies or bloggers.

        It’s one thing to say something like ‘they are being taken less seriously as cold-blooded economists because of their subject matters, and this diminished reputation (proven by a handful of people groaning and an enlisted down-thumb army) is unfortunate because they have the capability to be excellent group-think, cold-blooded economists.’ It’s quite another thing to say “We attribute many of these errors to the structure of the authors’ collaboration, which, from what we can tell, relies on an informal social network that has many potential failure points.”

        You can’t tell. You don’t know anything about “the structure of the authors’ collaboration”. And many of the so-called ‘errors’ you bring up are fallacies that Dubner refutes.

        Also, calling Dubner a “professional” writer does him a disservice. He is an exceptional author and the “National Association of Science Writers” should be trying to learn from him, not groaning at his work.

        • Tom:

          I don’t think that Levitt and Dubner were out to justify themselves to professional societies or bloggers. Nonetheless, I think that they should be interested in the opinions of thoughtful outsiders such as Rubinstein, DiNardo, Salmon, and . . . me! There is no “enlisted down-thumb army.” I’m not enlisting anyone and nobody is enlisting me! Like it or not, just as you’re not the only person who appears to think that it’s unfair of me (and, I assume, Rubinstein, Salmon, etc.) to pick on Freakonomics, I’m not the only person who thinks that Freakonomics mixes in some duds along with its good stuff. And, yes, you’re not the only person who thinks that Kaiser and I were mistaken in our criticisms, but, again, lots of other people think we made good points.

          Finally, when I called Dubner a professional writer, that was a compliment! I’m a professional statistician. Professional is good. I don’t think the members of the National Association of Science Writers (or, as you put it, the “National Association of Science Writers”) have a problem with Levitt and Dubner’s writing style, I think they have a problem with their apparent occasional willingness to lend their credibility to weak arguments.

          Again, Kaiser and I are fans of Freakonomics. We offer criticism because we want it to be even better (perhaps in the same spirit that you are criticizing me here!).

        • Andrew,

          Well, I can only speculate about your motives (a disclaimer you should have made multiple times in your criticisms) but it seems to me that after the Nazis carpet-bombed Britain they didn’t hedge and say “we LOVE Britain, we only want them to become even better.” However, I respect you for replying civilly to my aggressive comments. I could be overreacting. That being said, I am a huge fan of the IDEA of Freakonomics and the format, and while I vehemently disagree with some of the implied conclusions of the studies (which the authors work hard to isolate themselves from), I would most certainly write 5-star reviews of the book before spending considerable time trying to blow holes in it. Reading Freakonomics and meeting Dubner combined to flip a switch for me. For the first time, I got excited about my worldview. I had always been so defensive of it, and after reading Freakonomics it was like “wow, you CAN be scientific, rational, thoughtful, and not succumb to group-think.” It felt overtly honest, humble, and true, just like Dubner’s writing style (you should read all his works; they are excellent), which seems hard to come by in the world of rampant agendas, especially within the scientific community. And while I use ascii air quotes facetiously, I meant no disrespect to the National Association of Science Writers (since I know nothing about them).

        • Tom:

          Actually, I think it was the Allies who carpet-bombed Germany, not the other way around. But I guess it depends on the definition of carpet-bombing.

    • Anecdotes are not unscientific.

      They are helpful for abduction or hypothesis generation and perhaps adding salience on top of deductions or inductions.

      Confusing their role is overly wrong (and likely what you meant.)

      Perhaps one of the reasons for a lack of good descriptive studies and case studies being published (which is a loss).

  16. This criticism by Gelman and Fung of Freakonomics is important, in part because: (1) economists have a disproportionate influence on public policy (especially vis-a-vis political scientists, sociologists, and anthropologists); (2) for reasons I don’t fully understand (given the high uncertainty in predicting economic phenomena) many economists have extremely strong prior beliefs (sometimes bordering on arrogance); (3) despite backgrounds in applied math, many economists are using statistical methods in ways out of sync with good statistical practices (e.g., obsessing over extracting pseudo-causal estimates from observational data, neglecting hierarchical or cross-classified data structures, focusing on frequentist interpretations of parameters, being oddly hostile to qualitative data, cramming papers with needless equations, rarely graphing estimates or even the data, etc.); (4) many economists are biased toward beliefs and values that are vaguely right-wing libertarian, which I fear is affecting their interpretations of otherwise good statistical work.

    None of this is to say that economics is “worse” than any of the other social sciences, but to the extent I’m correct about any of the above four points (which I hope I’m not), then Gelman and Fung’s article is an important contribution.

  17. I like this “kaleidoscope” method of responding to a defensive response. It possibly solves a writing problem I’ve been wrestling with for some time. Thanks for the unwitting inspiration.
