The difference between “significant” and “non-significant” is not itself statistically significant

Commenter Rahul asked what I thought of this note by Scott Firestone (link from Tyler Cowen) criticizing a recent discussion by Kevin Drum suggesting that lead exposure causes violent crime. Firestone writes:

It turns out there was in fact a prospective study done—but its implications for Drum’s argument are mixed. The study was a cohort study done by researchers at the University of Cincinnati. Between 1979 and 1984, 376 infants were recruited. Their parents consented to have lead levels in their blood tested over time; this was matched with records over subsequent decades of the individuals’ arrest records, and specifically arrest for violent crime. Ultimately, some of these individuals were dropped from the study; by the end, 250 were selected for the results.

The researchers found that for each increase of 5 micrograms of lead per deciliter of blood, there was a higher risk for being arrested for a violent crime, but a further look at the numbers shows a more mixed picture than they let on. In prenatal blood lead, this effect was not significant. If these infants were to have no additional risk over the median exposure level among all prenatal infants, the ratio would be 1.0. They found that for their cohort, the risk ratio was 1.34. However, the sample size was small enough that the confidence interval dipped as low as 0.88 (paradoxically indicating that additional 5 µg/dl during this period of development would actually be protective), and rose as high as 2.03. This is not very convincing data for the hypothesis.

For early childhood exposure, the risk is 1.30, but the sample size was higher, leading to a tighter confidence interval of 1.03-1.64. This range indicates it’s possible that the effect is as little as a 3% increase in violent crime arrests, but this is still statistically significant.

I have not followed this at all and have no comments on the substance of the matter. But based on Firestone’s piece linked above, I am not impressed by his statistical criticisms. He seemed to just be going around looking for subsets of the data with statistically insignificant results. With a small sample size, not every comparison is going to be statistically significant. That does not represent evidence against the hypothesis of an effect.
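As a rough check on the claim in the title, the two estimates Firestone quotes can be compared directly. The sketch below is not taken from the Cincinnati paper. It backs out approximate standard errors on the log scale from the reported 95% intervals (assuming they are symmetric normal intervals for the log risk ratio) and, as a simplification, treats the prenatal and early-childhood estimates as independent even though they come from the same children. The function names are made up for illustration.

```python
import math

Z_95 = 1.96  # normal critical value assumed to underlie the reported 95% intervals


def log_rr_and_se(rr, ci_low, ci_high):
    """Return (log risk ratio, approximate SE) recovered from a 95% interval."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return math.log(rr), se


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - cdf)


# Figures quoted in the post: risk ratio per 5 micrograms/dl with 95% interval.
prenatal_est, prenatal_se = log_rr_and_se(1.34, 0.88, 2.03)
early_est, early_se = log_rr_and_se(1.30, 1.03, 1.64)

# Each estimate tested against "no effect" (log risk ratio of 0).
z_prenatal = prenatal_est / prenatal_se
z_early = early_est / early_se

# The comparison that actually matters: are the two estimates different?
# ASSUMPTION: independence, so the variances simply add.
se_diff = math.sqrt(prenatal_se**2 + early_se**2)
z_diff = (prenatal_est - early_est) / se_diff
p_diff = two_sided_p(z_diff)

print(f"prenatal z = {z_prenatal:.2f}, early childhood z = {z_early:.2f}")
print(f"difference z = {z_diff:.2f}, two-sided p = {p_diff:.2f}")
```

Under these assumptions, one estimate clears the conventional 1.96 threshold and the other does not, yet the z-score for their difference is close to zero. The numbers give no reason to think the prenatal and early-childhood associations differ.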

P.S. Firestone comments, explaining that his goal is not to shoot down the claim but rather to point out areas of uncertainty which should motivate further study.


  1. ScottF says:

    Hi Andrew, I’m the author of the blog. I totally understand why it appears I’m cherry picking and mixing evidence of absence with absence of evidence. Within the context of the discussion, I was hoping to make it clear that I’d just like to see this replicated. I didn’t intend to present it as evidence against the hypothesis as much as make a case of how replication is very important in public health, and really in science all-around.

    Picking the lower bounds of a CI is certainly not something I make a habit of, and I respect the view that it’s inappropriate even in the way I used it. I’m a firm believer in systematic reviews and meta-analysis, but for this case I’d relax my standards a bit, just not where it currently is now. Hope that makes sense.

    • K? O'Rourke says:

      Replication _can be_ helpful, but with non-randomized studies this is hampered by bias (confounding, selection and misclassification) that often can be consistent over independently replicated studies.

      If you have dealt with that, it would be nice to hear about how here.

  2. Andrew, I’ve been hoping that you’d take a look at this. A lot of the criticism I’ve seen amounts to statistical critiques, but from almost exclusively laypeople or non-statisticians.

    In Scott’s Discover piece he explicitly claims that this is potentially an ecological fallacy.

    I keep seeing what amounts to these two claims: “correlation is not causation” (by the naive who think they are sophisticated), and that this argument is an example of the ecological fallacy (by those who are more sophisticated).

    I find this bothersome, having read Drum’s piece and others, and some of the citations.

    As best I understand, this warning is invalid when there’s actually scientific data about individual subjects (that’s the whole point of the critique!) and especially so when there’s a clear, compelling causal model within individuals for this relationship, which there is with lead toxicity, brain development, and crime. We have enormous amounts of data on acute lead toxicity and how that affects cognition and behavior; there’s increasing data about chronic lead exposure’s effects on developing brains that shows similar impairments; there are new, causal neurological models about how this happens and where it happens in the brain (as well as a great deal of data about how those areas are involved in behavior); and there’s at least one long-term cohort study of a group of children, their various lead exposures, their school performance, and their criminal histories (to Scott’s credit, he discusses this in his article).

    Not to mention, of course, that even if we were strictly only within the domain where the ecological fallacy warning truly applies, these correlations are found in different worldwide populations, at different times, with similar temporal offsets, and in this and other ways many of the confounding variables have been accounted for.

    I certainly wouldn’t argue that the matter is settled or that there’s not a great deal more research to be done. But I’m having a lot of trouble taking seriously these two frequent claims when, it seems to me, they’re much less applicable here than they are against a lot of research that is much more widely (and much more uncritically) accepted.

    I, personally, would highly value your opinion on the argument that Drum presents. More importantly, I think that someone of your stature and expertise would be doing a great public service by weighing in, as a widely respected statistician who writes popularly.

    • Rahul says:

      Andrew: Thanks for posting on the topic!

      +1 to Keith. This seems exactly the sort of matter where having an expert like Andrew comment on the strength of the technical analysis and claims would be very valuable.

    • Andrew says:


      Thanks for asking, but looking at the issue in a serious way would require a lot of work on my part! And the policy implications are clear in any case: reduce all sources of lead exposure.

      What interests me here is the political angle: perhaps it is a “liberal” thing to focus on the lead-crime connection and a “conservative” thing to be skeptical of it? I see some general attitude connections: blaming crime on lead is “liberal” in the sense that it is blaming the environment rather than the free will of the criminal. Also, this is a “pollution is bad” story, which resonates better among liberals than conservatives. Finally, the bad guys in this story are the oil companies.

      So maybe there is some sense that liberals resonate to, and conservatives bristle at, this lead-pollution and crime connection.

      • Well, the political aspect of this is that it’s unwelcome to both liberals and conservatives. Both sides prefer to see crime as primarily a function of culture — they just disagree about which parts of culture are at fault. And, if I may speculate, I’d say that this is because crime is for most people intuitively very much inherently a moral issue. It’s very heavily morally loaded conceptually, as an intuitive matter; and therefore it is a lightning rod for implicitly-but-deeply moral arguments about culture that play out in the political sphere. My sense is that even people approaching this mostly free of experience with prior partisan arguments about crime will hear anyone’s claims about what does and doesn’t contribute to crime as implicitly very value-laden and very political. And for those who are familiar with all these prior arguments, this is even more true. So this is in some sense a culture war argument by another name. I mean, in the last few days I’ve noticed that the simple factual claim (a true claim) that violent crime has been declining for about three decades is to many, at best controversial and, at worst, absurd on its face.

        Anyway, conservatives and liberals will disagree about how much individual choice plays a role relative to culture; but neither side wants to see a strong environmental developmental biology component. Conservatives see it as making excuses that deny the importance of moral choice; liberals see it as making deterministic excuses for the status quo and as a means to avoid addressing structural social injustice.

        It’s also a battleground in the science wars. Many social scientists involved — criminologists and political scientists — see it as another incursion of biological determinism.

        I do think that the opposition is more universal on the right for the reasons you say, but also because of the class and race components of uneven environmental exposure. Still, I’ve noted that opposition from the left can be just as intense and angry, or more so, from certain segments that have a lot invested in a particular perspective. For example, a very strong economic model of the sociology of crime, or a deep suspicion of neurological/cognitive approaches to social issues, or both. Speaking for myself, I still think the former is very important and, with regard to the latter, am suspicious of all the science like this in the last ten years, so I have some sympathy.

        What’s interesting is that everything I just wrote works in the opposite direction, too. That is, this can be welcome to conservatives for exactly the reasons that liberals don’t like it (it’s not all about economic inequality or racism!), and welcome to liberals for exactly the reason that conservatives don’t like it (it’s not all about the decline of the nuclear family or religiosity!). Also, it implies a culpability among certain groups and institutions that dwarfs anything comparable (except global warming) and some will find that possibility welcome, others terrifying.

        • Vance Maverick says:

          Well, I’m a very conventional liberal, and I’m not upset that “it’s not all about economic inequality or racism” — indeed, the claim is irritating. A more genuinely “liberal” attitude to this problem is that it makes it quite clear how we can act together (taking advantage of that handy tool, the government) to improve things for everybody.

          Of course, lead exposure is affected strongly by economic inequality. I hope this won’t cause lead abatement to be seen as a program for the benefit only of the poor.

          • I’m progressive as well, and I don’t feel that way, either. And I feel the same way as Joel and you do — irritated — that this is highly politicized. Yet it is highly politicized.

            I’ve been following this issue for a while (primarily because Drum has been writing about it occasionally for several years) and very carefully following the response Drum’s article has generated. I somewhat masochistically read more than 400 comments posted to Monbiot’s Guardian piece. And there was intense opposition from both left and right. This is just the way it is. We can wish that it weren’t, but we can wish for a pony, too. Regardless, it’s worthwhile to examine why people react negatively in what I think is a knee-jerk fashion. The politics matter for the actual implementation of policy. Therefore the nature of the political discourse matters.

          • Vance Maverick says:

            Maybe we’re just disagreeing on terms. You attributed a view to liberals which I, a liberal, don’t hold. If it matters, it’s because action will depend on understanding the actual, effective opposition. Guardian comment sections are not perhaps 100% representative of the American electorate or political class ;-)

          • @Vance, yes but this is a global issue, not merely an American issue, is supported by global data, and this theory is strongly opposed by both liberals and conservatives in the UK for similar reasons as in the US. Trust me, I’ve subjected myself to commentary from more than Guardian readers.

            But, anyway, I attributed views to liberals who are strongly opposed to this theory. That is to say, I asserted that it finds both liberal and conservative opposition (true), and I described how/why liberals and conservatives object. That’s not at all the same thing as attributing some view to all conservatives or all liberals. For what it’s worth, I think that the opposition on the right is more homogeneous and widespread, while the opposition on the left is more varied and is concentrated among those who are especially interested in the political/social issues involving crime.

            In that sense, this is on the left not unlike nature/nurture arguments. For the left-of-center population as a whole, there’s considerable enthusiasm for biological determinist arguments, sophisticated or pop-culture crude varieties. But among certain highly interested communities on the left, they are almost anathema. And this is pretty much in precise proportion to the extent to which such arguments have been used in the past by conservatives to justify inequality and oppression.

            All of which is to say that there’s a long history of various arguments from biology (and not necessarily from genetics, but also from developmental exposures), originating from the political right, that stigmatize certain populations and argue that manifestly unjust and contributory social policies are not at fault. This is particularly true with regard to crime and intelligence and stigmatized behavior; certain very interested and very knowledgeable groups involved in criminology on the left have every reason to have a very strong, initial, reflexive negative reaction.

        • Joel says:

          I’m a liberal and I have no problem whatsoever with the lead hypothesis–in fact, I think it is a welcome insight. I regard attempts to parse this as political dialectic to be pointless, useless and frankly ignorant.

          Look, this is fundamentally a medical/scientific issue. What should interest us is whether or not lead abatement could reduce crime (and potentially other social ills).

          In the ’60s, it was widely believed that peptic ulcers were the result of an alpha personality. Turns out, there is a simple biological mechanism to account for most peptic ulcers–Helicobacter infection. It is pointless, useless and frankly ignorant to discuss how behavioral psychologists would or should react to the hypothesis that a treatable bacterial infection causes ulcers. If the evidence says taking an antibiotic is likely to lead to a good outcome, who cares what a behavioral psychologist says?

          Similarly, if there is compelling evidence that lead abatement leads to a good outcome, who cares what “liberals” or “conservatives” say?

          • Vance Maverick says:

            If the conservatives say, “Lead abatement is a matter of personal virtue and self-reliance, for each of us to handle on his own without the nanny state, and besides, we should wait until all the science is in”, I care that this argument might prevent good being done.

      • Noah says:

        The reason it’s a liberal issue is that liberals have in general been the ones to criticize the massive growth of the prison system and broken windows policing strategies. Those things are generally credited with the long-term decline of crime. If it’s lead instead, then the skyrocketing imprisonment rates are a waste of money, and we should cease incarcerating people at such rates.

        Basically it takes law and order policies off the table.

      • Paul Davis says:


        If you think Contra-Causal Free Will exists then I would say I agree. But brain development does matter as to what actions a person will take. Contra-Causal free will I will say does not exist.

      • hb says:

        Actually some conservatives eat this stuff up because it means that social correlates with violence (like inequality & poverty) can be ignored.

        Case in point:

    • ScottF says:

      There’s a lot of good correlation to make a very strong inference on causality, and a Bayesian analysis may certainly be appropriate given what we do know about lead’s neurological effects. There’s statistical theory way beyond what I’m capable of bringing to the table that could totally undermine my argument, and make it clear that the ecological fallacy has been adequately addressed. I know my own limitations as a researcher, and also that applications where I evaluate evidence maybe don’t hold quite the same way for lead contamination. Long story short, I’m certainly open to being convinced that I’m too skeptical, but really I’m trying to just focus on why reproducing the results of the cohort study directly linking exposure and crime is important.

      My goal with the post was really twofold:

      1. Help people understand why compelling arguments can fall into regulatory limbo, if they even get in front of the agencies.
      2. To highlight how a journalist makes a case for more research vs. how someone in public health perceives it for the benefit of people not in the field. He overstated his conclusion for that effect, and I can accept if it’s generally believed that I did mine.

      Non-randomized studies obviously have their own limitations, and this could really go on forever if someone wanted to make it so. That’s not how I roll. Yours is a solid argument that maybe we do have enough to expect hundreds of billions of dollars in benefits right now; obviously I’m more cautious. That being said, I wouldn’t really object if there was a groundswell of support for lead abatement right now and it actually got done. It would be a good thing. I just seriously doubt that this could occur in reality as it currently stands.

      • Scott, I pressed you both here and on Drum’s blog (and I’m not stalking you, really! I read both blogs every day!) and you’ve been a good sport about it. You’re getting some of my misplaced frustration because I’ve read a lot of criticism that is much less civil, much more immoderate, and much less nuanced than yours, but expressing a similar (and similarly justified) skepticism.

  3. LemmusLemmus says:

    If I’ve counted correctly, Reyes (2007: 6) cites six studies linking lead to crime at the individual level. I have not looked at the cited studies and cannot vouch for their quality, but this suggests that the impression conveyed by the blog post in question – that there is only one individual-level study – is not correct.

  4. Matt says:

    Can I ask a simple question? Is there anyone who believes lead is a nutrient or anything other than bad? If not, the question is how to clean it up and what it costs. As to the latter, don’t we have a lot of people looking for work right now? As to the former, that was answered by Mother Jones: About 30 billion dollars per year over 20 years. The benefits include putting people back to work and a healthier and smarter population, including future soldiers.

    Why is this a debatable idea? Just do it already!

    • Matt says:

      Sorry, I meant to mention that this is less than 5% of the military budget, thus the reference to soldiers.

    • Vance Maverick says:

      It actually matters how bad it is — we need to consider cost (which as you say is quite affordable) vs. benefit. If Drum and his sources are right, the benefit will be huge. If not, we might decide it wasn’t worth it. That’s the debatable part.

  5. Nameless says:

    If the best the article can do is to produce a confidence interval of 1.03-1.64 for a 5 mcg/dl increase, I’m not sure if it’s even worth bringing it up. That’s almost the equivalent of throwing the hands in the air and saying “I tried, I did my best, but my sample size was just too small!”

    The only really interesting question is whether the mean risk of 1.30 or even the upper boundary of 1.64 per 5 mcg/dl is consistent with the narrative presented by Kevin Drum. None of the articles reports directly on the distribution of blood lead levels, but the original article says implicitly that even the highest soil lead levels in the worst parts of New Orleans only get kids somewhere in the vicinity of 8 mcg/dl, which, according to this study, might lead to a doubling of crime rates. Which is, of course, woefully inadequate if we want to explain a 5x swing in nationwide crime rates between 1960 and 1990.
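The arithmetic behind this comment can be sketched as follows. The sketch assumes (this is an assumption, not something the quoted figures establish) that the risk ratio compounds multiplicatively with each 5 mcg/dl step, so the ratio at a given blood lead level relative to zero exposure is the per-step ratio raised to the number of steps. The helper name is hypothetical.

```python
def scaled_risk_ratio(rr_per_step, exposure, step=5.0):
    """Risk ratio at `exposure` relative to zero exposure.

    ASSUMPTION: log-linear dose response, i.e. risk is multiplied by
    `rr_per_step` for every `step` units of blood lead.
    """
    return rr_per_step ** (exposure / step)


EXPOSURE = 8.0      # mcg/dl, the level mentioned for the worst-affected areas
TARGET_SWING = 5.0  # the "5x swing" in crime rates cited in the comment

# Per-5-mcg/dl figures quoted in the thread: point estimate and 95% bounds.
point = scaled_risk_ratio(1.30, EXPOSURE)
lower = scaled_risk_ratio(1.03, EXPOSURE)
upper = scaled_risk_ratio(1.64, EXPOSURE)

print(f"risk ratio at {EXPOSURE} mcg/dl: {point:.2f} ({lower:.2f} to {upper:.2f})")
print(f"upper bound reaches the 5x swing: {upper >= TARGET_SWING}")
```

On this reading, 8 mcg/dl corresponds to a ratio of roughly 1.5 at the point estimate and a little over 2 at the upper bound, so "doubling" is the optimistic end of the interval and both figures fall well short of a factor of 5.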

  6. Steve Sailer says:

    Back in 2007, I looked into the research Drum relies upon today and found it interesting, more resilient to reality checks than Steven Levitt’s celebrated abortion-cut-crime theory. Nonetheless, I pointed out a number of anomalies that needed to be resolved, such as why in densely populated Japan, which had lots of lead-spewing cars, was there never a rise in crime?

    • I think you’d need to make a stronger case than “Japan has lots of people and therefore should show an especially strong correlation”. Because, first of all, while very shallow, Japan’s rate of assault deaths per 100K has a nice curve rising from 1960, peaking in 1990, and falling until 2000. So it doesn’t look contrary right off the bat. Given that, I’d need to know what the actual gasoline consumption was per capita over this period and the timing of the introduction and phase-outs of TEL in Japan.

      Anyway, no one is arguing that environmental lead is the only factor, just an important one among several. All the other sociocultural factors that both the left and right feel strongly are involved would, naturally, vary from culture to culture; and in some cultures the environmental loading of atmospheric lead may have much larger effects on violent crime rates than in others … just as is the case with alcohol consumption. Alcohol consumption is clearly contributory to crime rates (and we have a pretty strong biological/psychological model for why it is), but this is much more strongly seen in some nations than in others. Not to mention that reporting rates vary culturally and nationally.

      Which is to say that failing to find the same magnitude of the relationship from one country to another is not at all disconfirming.

      • Nameless says:

        The following article claims that the situation was completely opposite and that rates of all major violent crimes were falling in Japan from early to mid 50’s to 1990:

        In any event, it’s hard to compare Japan to the United States. The U.S. crime peak of 1970..1990 is supposed to correspond to high levels of lead emissions due to the use of leaded gasoline that peaked in 1950..1970. On the leading edge of this peak, emissions were very low in Japan compared to the U.S., because the country was devastated by WWII, there was almost no domestic auto industry (it did not take off till late 60’s), there never was any domestic oil, and Japanese people relied on public transportation and motorcycles/scooters. Look at this graph:

        On the trailing edge, Japan began restricting and phasing out leaded gasoline about 5 years sooner than the United States. It had the first legislation restricting lead content of gasoline in place by 1971.

  8. Jorgen Klaveness says:

    The sample size of 250 is far too small to allow any conclusions to be drawn. Lead does not have the same effect on everyone, and it does not have a linear dose response curve. Its medical effect is dependent on a number of other synergistic interactions. Some of these (like with mercury) are well understood. Others are still hazy. So, in order to see the effect of lead on any range of outcomes, you need a sample size that is big enough that you don’t get confused by the variability in the supply of those other toxins. This is what you get when you look at the issue on the nation / state / city level. You don’t get it when you only look at 250 patients.


  9. hb says:

    “With a small sample size, not every comparison is going to be statistically significant. That does not represent evidence against the hypothesis of an effect.”

    Nor is it evidence *for* a hypothesis. Nor do the results of one study of one population (90% black, 100% urban, low-income) with a small number of subjects (250) and data points (108 arrests for violent crime) with a few statistically significant results and a few seemingly paradoxical ones constitute proof of a hypothesis.