I read your post on Kanazawa. I don’t know whether his paper is correct, but I wanted to say something slightly different. Here is my concern.
The whole spirit of your blog would have led, in my view, to a rejection of the early papers arguing that smoking causes cancer (because, your eloquent blog might have written around 1953 or whenever it was exactly, smoking is endogenous). That worries me. It would have led to many extra people dying.
I can tell that you are a highly experienced researcher and intellectually brilliant chap but the slightly negative tone of your blog has a danger — if I may have the temerity to say so. Your younger readers are constantly getting the subtle message: A POTENTIAL METHODOLOGICAL FLAW IN A PAPER MEANS ITS CONCLUSIONS ARE WRONG. Such a sentence is, as I am sure you would say, quite wrong. And one could then talk about type one and two errors, and I am sure you do in class.
Your blog is great. But I often think this.
I appreciate it is a fine distinction.
In economics, rightly or wrongly, referees are obsessed with thinking of some potential flaw in a paper. I teach my students that those obsessive referees would have, years ago, condemned many hundreds of thousands of smokers to death.
I replied as follows:
I agree completely with your point, of course; in fact, I have had colleagues in the past who used to specialize in criticizing applied statistical work, without ever suggesting constructive alternatives.
In my defense, I think that my blog often features studies with potential methodological problems, and I think I am judicious in considering these. For example:
- I had an extensive discussion of the advantages and disadvantages of Seth Roberts’s results from self-experimentation. I think I made it clear that I thought it plausible that his findings would replicate to others, but I can’t be sure.
- Regarding your study, perhaps I was being too harsh by lumping you with some other studies that controlled for total #kids. But I did provide a constructive solution (to run the “intent-to-treat analysis”, not controlling for total #kids), so I don’t think I was rejecting your conclusions, just pointing out a way to get more confidence in them.
- The Kanazawa paper is of lower quality, I think. I say this partly because such huge coefficients are extremely scientifically implausible to me. I’m not in the business of going around trashing random papers (with millions of scientific papers published each year, what’s the point?); I only noticed this one because it had been “Slashdotted.” The point here is not that “a potential methodological flaw means its conclusions are wrong” but rather that its conclusions are highly scientifically implausible, and it has a huge methodological flaw.
In general, I try to be constructive in my comments (and I certainly hope I wasn’t rude in my comments on your paper); I just found the Kanazawa paper particularly irritating because they seemed so confident in their results. For a more typical way in which I comment on a paper, see here.
Finally, I disagree that I would’ve rejected the argument in 1953 that smoking causes cancer. The main point I’d like to make here is that my comments on Kanazawa’s paper (and, to a lesser extent, on yours) involved the potential methodological flaw of controlling for an intermediate outcome. In a situation with outcome y, treatment T, and pre-treatment variables X, the standard approach would be to regress y on T, X, but what you did was to regress (or, in some way, model) y on z, T, X, where z is the intermediate outcome, and look at the coefficient of T. This can be a real problem (even if it didn’t make a big difference in your particular example), and the #kids example is interesting because it seems so natural to subdivide the analysis by #kids, but it can lead to problems. Also, there’s a simple check here, which is to take z out of the model.
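A minimal simulation sketch of this point, in Python with NumPy. Everything here is a hypothetical illustration rather than a reanalysis of either paper: the function name `fit_both`, the data-generating process, and the coefficient values (1.0, 2.0, 0.5, 0.3) are all assumptions chosen so that the total effect of T on y is 2.0 by construction. It compares the coefficient of T with and without the intermediate outcome z in the regression.

```python
import numpy as np

def fit_both(n=100_000, seed=0):
    """Return the coefficient of T from (y ~ T, X) and from (y ~ z, T, X).

    Hypothetical data-generating process (an assumption for illustration):
      X : pre-treatment variable
      T : randomized binary treatment
      z : intermediate outcome, affected by T and X
      y : outcome, affected by T directly and through z
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=n)
    T = rng.integers(0, 2, size=n).astype(float)
    z = 0.5 * T + 0.3 * X + rng.normal(size=n)
    y = 1.0 * T + 2.0 * z + 0.5 * X + rng.normal(size=n)
    # Total effect of T on y by construction: 1.0 + 2.0 * 0.5 = 2.0

    ones = np.ones(n)

    # Standard approach: regress y on T, X (z left out).
    A_total = np.column_stack([ones, T, X])
    coef_total = np.linalg.lstsq(A_total, y, rcond=None)[0][1]

    # Controlling for the intermediate outcome: regress y on z, T, X.
    A_adj = np.column_stack([ones, z, T, X])
    coef_adj = np.linalg.lstsq(A_adj, y, rcond=None)[0][2]

    return coef_total, coef_adj

coef_total, coef_adj = fit_both()
print(f"coefficient of T, z excluded: {coef_total:.2f}")  # should be near 2.0
print(f"coefficient of T, z included: {coef_adj:.2f}")    # should be near 1.0
```

In this sketch the regression that includes z recovers only the part of the effect that does not pass through z, so it understates the total effect of T. The setup is deliberately benign: there is no unobserved common cause of z and y. If there were one, the coefficient of T conditional on z would not have even that clean interpretation.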
In contrast, with the smoking analysis, you’re talking about the problem that T is endogenous. This is a different problem: there is no simple solution, and an observational analysis is often the best you can do. To criticize studies because T is endogenous would shoot down almost all observational studies, and I’m not trying to do that.
So although I agree with the spirit of your comment (that one should have a sense of proportion in one’s criticisms), I think that I’ve actually been ok in the specifics. I suspect it’s a problem with tone rather than content.
In retrospect, Oswald’s comments clearly hit a nerve, since my reply was longer than his original message! In any case, he replied as follows:
I agree that it is tiresome and dangerous when researchers sound like they are unreasonably confident. Perhaps the reason that I’m gentler-spirited than I was when young is that I have seen the relentless pressure to publish sound work that will get you a small pay rise each year. Publishing our Daughters work seemed the right thing to do once Nick and I had seen the same pattern in German data. Until then, we sat on the finding. But I am conscious that, because you can be wrong and people will notice, making any unusual claim is scary.
I feel that, sometimes without being conscious of it, a lot of applied researchers prefer to work (I am going to use loose language here) on an issue that is dull but is 99.9% probably true rather than something deeply iconoclastic and potentially important that is 90.0% likely to be true. They do this partly, perhaps, because subconsciously they very much fear being exposed as having made an error in their conclusion. Whether or not that is rational for an individual researcher, and very likely it is, there is also, I think, a case for believing that society needs risky iconoclasm in a very deep sense and that, to get it, society can live with some medium-run errors (because the profession will go on to correct those reasonably quickly). I shall try to think through whether there is a way to make precise my intuition here that, in a society where scholars’ scientific reputations may be damaged by one false conclusion, individual risk-aversion may be suboptimally high from society’s standpoint. Of course if I’m right there is a convex function of some kind at the bottom of all this.