The next Lancet retraction? [“Subcortical brain volume differences in participants with attention deficit hyperactivity disorder in children and adults”]

[cat picture]

Someone who prefers to remain anonymous asks for my thoughts on this post by Michael Corrigan and Robert Whitaker, “Lancet Psychiatry Needs to Retract the ADHD-Enigma Study: Authors’ conclusion that individuals with ADHD have smaller brains is belied by their own data,” which begins:

Lancet Psychiatry, a UK-based medical journal, recently published a study titled Subcortical brain volume differences in participants with attention deficit hyperactivity disorder in children and adults: A cross-sectional mega-analysis. According to the paper’s 82 authors, the study provides definitive evidence that individuals with ADHD have altered, smaller brains. But as the following detailed review reveals, the study does not come close to supporting such claims.

Below are tons of detail, so let me lead with my conclusion, which is that the criticisms coming from Corrigan and Whitaker seem reasonable to me. That is, based on my quick read, the 82 authors of that published paper seem to have made a big mistake in what they wrote.

I’d be interested to see if the authors have offered any reply to these criticisms. The article has just recently come out—the journal publication is dated April 2017—and I’d like to see what the authors have to say.

OK, on to the details. Here are Corrigan and Whitaker:

The study is beset by serious methodological shortcomings, missing data issues, and statistical reporting errors and omissions. The conclusion that individuals with ADHD have smaller brains is contradicted by the “effect-size” calculations that show individual brain volumes in the ADHD and control cohorts largely overlapped. . . .

Their results, the authors concluded, contained important messages for clinicians: “The data from our highly powered analysis confirm that patients with ADHD do have altered brains and therefore that ADHD is a disorder of the brain.” . . .

The press releases sent to the media reflected the conclusions in the paper, and the headlines reported by the media, in turn, accurately summed up the press releases. Here is a sampling of headlines:

Given the implications of this study’s claims, it deserves to be closely analyzed. Does the study support the conclusion that children and adults with ADHD have “altered brains,” as evidenced by smaller volumes in different regions of the brain? . . .

Alternative Headline: Large Study Finds Children with ADHD Have Higher IQs!

To discover this finding, you need to spend $31.50 to purchase the article, and then make a special request to Lancet Psychiatry to send you the appendix. Then you will discover, on pages 7 to 9 in the appendix, a “Table 2” that provides IQ scores for both the ADHD cohort and the controls.

Although there were 23 clinical sites in the study, only 20 reported comparative IQ data. In 16 of the 20, the ADHD cohort had higher IQs on average than the control group. In the other four clinics, the ADHD and control groups had the same average IQ (with the mean IQ scores for both groups within two points of each other.) Thus, at all 20 sites, the ADHD group had a mean IQ score that was equal to, or higher than, the mean IQ score for the control group. . . .

And why didn’t the authors discuss the IQ data in their paper, or utilize it in their analyses? . . . Indeed, if the IQ data had been promoted in the study’s abstract and to the media, the public would now be having a new discussion: Is it possible that children diagnosed with ADHD are more intelligent than average? . . .

They Did Not Find That Children Diagnosed with ADHD Have Smaller Brain Volumes . . .

For instance, the authors reported a Cohen’s d effect size of .19 for differences in the mean volume of the accumbens in children under 15. . . in this study, for youth under 15, it was the largest effect size of all the brain volume comparisons that were made. . . . Approximately 58% of the ADHD youth in this convenience sample had an accumbens volume below the average in the control group, while 42% of the ADHD youth had an accumbens volume above the average in the control group. Also, if you knew the accumbens volume of a child picked at random, you would have a 54% chance that you could correctly guess which of the two cohorts—ADHD or healthy control—the child belonged to. . . . The diagnostic value of an MRI brain scan, based on the findings in this study, would be of little more predictive value than the toss of a coin. . . .

The authors reported that the “volumes of the accumbens, amygdala, caudate, hippocampus, putamen, and intracranial volume were smaller in individuals with ADHD compared with controls in the mega-analysis” (p. 1). If this is true, then smaller brain volumes should show up in the data from most, if not all, of the 21 sites that had a control group. But that was not the case. . . . The problem here is obvious. If authors are claiming that smaller brain regions are a defining “abnormality” of ADHD, then such differences should be consistently found in mean volumes of ADHD cohorts at all sites. The fact that there was such variation in mean volume data is one more reason to see the authors’ conclusions—that smaller brain volumes are a defining characteristic of ADHD—as unsupported by the data. . . .
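Before going on, a quick check on the effect-size arithmetic in that excerpt: under a simple equal-variance normal model, a Cohen’s d of 0.19 implies numbers very close to the ones Corrigan and Whitaker report. Here is a minimal sketch of the calculation (my own back-of-the-envelope check, not code from the paper or from the critique):

```python
# Back-of-the-envelope: what a Cohen's d of 0.19 means for two overlapping
# normal distributions with equal variance (controls centered at 0, ADHD at -0.19).
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

d = 0.19  # reported effect size for accumbens volume in children under 15

# Fraction of the ADHD group falling below the control-group mean
frac_below_control_mean = phi(d)          # ~0.575

# "Probability of superiority": chance that a randomly chosen control
# has a larger volume than a randomly chosen ADHD case
prob_superiority = phi(d / sqrt(2.0))     # ~0.553

print(f"{frac_below_control_mean:.1%} of the ADHD group below the control mean")
print(f"{prob_superiority:.1%} chance a random control exceeds a random case")
```

So the 58% and 54% figures in the critique look about right (give or take rounding and the exact guessing rule assumed), and the larger point stands: an effect of this size has essentially no diagnostic value for any individual child.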

And now here’s what the original paper said:

We aimed to investigate whether there are structural differences in children and adults with ADHD compared with those without this diagnosis. In this cross-sectional mega-analysis [sic; see P.P.S. below], we used the data from the international ENIGMA Working Group collaboration, which in the present analysis was frozen at Feb 8, 2015. Individual sites analysed structural T1-weighted MRI brain scans with harmonised protocols of individuals with ADHD compared with those who do not have this diagnosis. . . .

Our sample comprised 1713 participants with ADHD and 1529 controls from 23 sites . . . The volumes of the accumbens (Cohen’s d=–0·15), amygdala (d=–0·19), caudate (d=–0·11), hippocampus (d=–0·11), putamen (d=–0·14), and intracranial volume (d=–0·10) were smaller in individuals with ADHD compared with controls in the mega-analysis. There was no difference in volume size in the pallidum (p=0·95) and thalamus (p=0·39) between people with ADHD and controls.

The above demonstrates some forking paths, and there are a bunch more in the published paper, for example:

Exploratory lifespan modelling suggested a delay of maturation and a delay of degeneration, as effect sizes were highest in most subgroups of children (<15 years) versus adults (>21 years): in the accumbens (Cohen’s d=–0·19 vs –0·10), amygdala (d=–0·18 vs –0·14), caudate (d=–0·13 vs –0·07), hippocampus (d=–0·12 vs –0·06), putamen (d=–0·18 vs –0·08), and intracranial volume (d=–0·14 vs 0·01). There was no difference between children and adults for the pallidum (p=0·79) or thalamus (p=0·89). Case-control differences in adults were non-significant (all p>0·03). Psychostimulant medication use (all p>0·15) or symptom scores (all p>0·02) did not influence results, nor did the presence of comorbid psychiatric disorders (all p>0·5). . . .

Outliers were identified at above and below one and a half times the interquartile range per cohort and group (case and control) and were excluded . . . excluding collinearity of age, sex, and intracranial volume (variance inflation factor <1·2) . . . The model included diagnosis (case=1 and control=0) as a factor of interest, age, sex, and intracranial volume as fixed factors, and site as a random factor. In the analysis of intracranial volume, this variable was omitted as a covariate from the model. Handedness was added to the model to correct for possible effects of lateralisation, but was excluded from the model when there was no significant contribution of this factor. . . . stratified by age: in children aged 14 years or younger, adolescents aged 15–21 years, and adults aged 22 years and older. We removed samples that were left with ten patients or fewer because of the stratification. . . .
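To make the quoted model concrete: for each brain structure this is an ordinary linear mixed model, with diagnosis, age, sex, and intracranial volume as fixed effects and a random intercept for site. Here is a minimal sketch of that sort of fit, with hypothetical file and column names, since I have not seen the authors’ data or code:

```python
# A sketch of the per-structure model described in the methods (not the
# authors' code): volume ~ diagnosis + age + sex + intracranial volume,
# with site as a random intercept.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("enigma_adhd.csv")  # hypothetical file, one row per participant

model = smf.mixedlm(
    "accumbens_volume ~ diagnosis + age + sex + icv",  # diagnosis: case=1, control=0
    data=df,
    groups=df["site"],  # site enters as a random (intercept) factor
)
fit = model.fit()
print(fit.summary())  # the coefficient on diagnosis is the adjusted case-control difference
```

Note how many discretionary choices surround a fit like this: the outlier rule, whether handedness stays in the model, whether intracranial volume is a covariate or the outcome, and the age cutoffs used for stratification.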

Forking paths are fine; I have forking paths in every analysis I’ve ever done. But forking paths render published p-values close to meaningless; in particular I have no reason to take seriously a statement such as, “p values were significant at the false discovery rate corrected threshold of p=0·0156,” from the summary of the paper.
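For readers who haven’t seen thresholds like that before: a number such as 0·0156 is presumably the output of a Benjamini-Hochberg-style step-up rule applied to the family of regional tests. I haven’t tried to reproduce the paper’s number, and the authors may have used a different false-discovery-rate procedure, but the basic computation looks something like this (with made-up p-values):

```python
# A sketch of how a Benjamini-Hochberg FDR threshold is computed
# (hypothetical p-values; not an attempt to reproduce the paper's 0.0156).
import numpy as np

def bh_threshold(pvals, q=0.05):
    """Largest p-value declared significant by the BH step-up rule at level q."""
    p = np.sort(np.asarray(pvals))
    m = len(p)
    below = p <= q * np.arange(1, m + 1) / m
    return p[below].max() if below.any() else None

pvals = [0.001, 0.004, 0.009, 0.012, 0.016, 0.21, 0.39, 0.95]  # made up
print(bh_threshold(pvals))  # every p-value at or below this cutoff is called "significant"
```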

So let’s forget about p-values and just look at the data graphs, which appear in the published paper:



Unfortunately these are not raw data or even raw averages for each age; instead they are “moving averages, corrected for age, sex, intracranial volume, and site for the subcortical volumes.” But we’ll take what we’ve got.

From the above graphs, it doesn’t seem like much of anything is going on: the blue and red lines cross all over the place! So now I don’t understand this summary graph from the paper:

I mean, sure, I see it for Accumbens, I guess, if you ignore the older people. But, for the others, the lines in the displayed age curves cross all over the place.
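Part of the difficulty is that these curves are moving averages, so neighbouring points share data: an apparently consistent gap over a stretch of ages can easily be an artifact of the smoothing. Here is a toy simulation of that point (mine, with an arbitrary window size, since the smoothing details aren’t given in the text quoted above):

```python
# Toy simulation: moving averages of two identical (pure-noise) groups still
# produce long stretches of ages where one smoothed curve sits above the other.
import numpy as np

rng = np.random.default_rng(0)
n_ages = 58      # roughly ages 6 to 63
window = 10      # arbitrary smoothing window

def smoothed_curve():
    y = rng.normal(0.0, 1.0, size=n_ages)        # no true group difference at any age
    return np.convolve(y, np.ones(window) / window, mode="valid")

longest_runs = []
for _ in range(1000):
    diff = smoothed_curve() - smoothed_curve()   # "ADHD" curve minus "control" curve
    changes = np.flatnonzero(np.diff(np.sign(diff)) != 0)
    edges = np.concatenate(([0], changes + 1, [len(diff)]))
    longest_runs.append(np.diff(edges).max())    # longest one-sided stretch of ages

print(np.median(longest_runs), np.percentile(longest_runs, 90))
```

The point is just that smoothing induces dependence, so long runs where one curve stays above the other are expected even when the two groups are identical; the apparent consistency at the young end of the published plots may not mean much on its own.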

The article in question has the following list of authors: Martine Hoogman, Janita Bralten, Derrek P Hibar, Maarten Mennes, Marcel P Zwiers, Lizanne S J Schweren, Kimm J E van Hulzen, Sarah E Medland, Elena Shumskaya, Neda Jahanshad, Patrick de Zeeuw, Eszter Szekely, Gustavo Sudre, Thomas Wolfers, Alberdingk M H Onnink, Janneke T Dammers, Jeanette C Mostert, Yolanda Vives-Gilabert, Gregor Kohls, Eileen Oberwelland, Jochen Seitz, Martin Schulte-Rüther, Sara Ambrosino, Alysa E Doyle, Marie F Høvik, Margaretha Dramsdahl, Leanne Tamm, Theo G M van Erp, Anders Dale, Andrew Schork, Annette Conzelmann, Kathrin Zierhut, Ramona Baur, Hazel McCarthy, Yuliya N Yoncheva, Ana Cubillo, Kaylita Chantiluke, Mitul A Mehta, Yannis Paloyelis, Sarah Hohmann, Sarah Baumeister, Ivanei Bramati, Paulo Mattos, Fernanda Tovar-Moll, Pamela Douglas, Tobias Banaschewski, Daniel Brandeis, Jonna Kuntsi, Philip Asherson, Katya Rubia, Clare Kelly, Adriana Di Martino, Michael P Milham, Francisco X Castellanos, Thomas Frodl, Mariam Zentis, Klaus-Peter Lesch, Andreas Reif, Paul Pauli, Terry L Jernigan, Jan Haavik, Kerstin J Plessen, Astri J Lundervold, Kenneth Hugdahl, Larry J Seidman, Joseph Biederman, Nanda Rommelse, Dirk J Heslenfeld, Catharina A Hartman, Pieter J Hoekstra, Jaap Oosterlaan, Georg von Polier, Kerstin Konrad, Oscar Vilarroya, Josep Antoni Ramos-Quiroga, Joan Carles Soliva, Sarah Durston, Jan K Buitelaar, Stephen V Faraone, Philip Shaw, Paul M Thompson, Barbara Franke.

I also found a webpage for their research group, featuring this wonderful map:

The number of sites looks particularly impressive when you include each continent twice like that. But they should really do some studies in Antarctica, given how huge it appears to be!

P.S. Following the links, I see that Corrigan and Whitaker come into this with a particular view:

Mad in America’s mission is to serve as a catalyst for rethinking psychiatric care in the United States (and abroad). We believe that the current drug-based paradigm of care has failed our society, and that scientific research, as well as the lived experience of those who have been diagnosed with a psychiatric disorder, calls for profound change.

This does not mean that the critics are wrong; presumably the authors of the original paper came into their research with their own strong views. It can just be helpful to know where the critics are coming from.

P.P.S. The paper discussed above uses the term “mega-analysis.” At first I thought this might be some sort of typo, but apparently the expression does exist and has been around for a while. From my quick search, it appears that the term was first used by James Dillon in a 1982 article, “Superanalysis,” in Evaluation News, where he defined mega-analysis as “a method for synthesizing the results of a series of meta-analyses.”

But in the current literature, “mega-analysis” seems to simply refer to a meta-analysis that uses the raw data from the original studies.

If so, I’m unhappy with the term “mega-analysis” because: (a) the “mega” seems a bit hypey; (b) what if the original studies are small? Then even all the data combined might not be so “mega”; and (c) I don’t like the implication that plain old “meta-analysis” doesn’t use the raw data. I’m pretty sure that the vast majority of meta-analyses use only published summaries, but I’ve always thought of using the original data as the preferred way to do a meta-analysis.

I bring up this mega-analysis thing not as a criticism of the Hoogman et al. paper—they’re just using what appears to be a standard term in their field—but just as an interesting side-note.

P.P.P.S. The above post represents my current impression. As I wrote, I’d be interested to see the original authors’ reply to the criticism. Lancet does have a pretty bad reputation—it’s known for publishing flawed, sensationalist work—but I’m sure they run the occasional good article too. So I wouldn’t want to make any strong judgments in this case before hearing more.

P.P.P.P.S. Regarding the title of this post: No, I don’t think Lancet would ever retract this paper, even if all the above criticisms are correct. It seems that retraction is used only in response to scientific misconduct, not in response to mere error. So when I say “retraction,” I mean what one might call “conceptual retraction.” The real question is: Will this new paper join the list of past Lancet papers which we would not want to take seriously, and which we regret were ever published?

46 thoughts on “The next Lancet retraction? [“Subcortical brain volume differences in participants with attention deficit hyperactivity disorder in children and adults”]”

  1. It certainly looks odd that in the line graphs the ADHD group is only ever higher than the control group (blue line>red line) up until about age 35 for ICV, but in the data summary they have a negative effect size.

  2. Not much diagnostic signal in a Cohen’s d that small: “if you knew the accumbens volume of a child picked at random, you would have a 54% chance that you could correctly guess which of the two cohorts—ADHD or healthy control—the child belonged to”

  3. I think that what the authors want to show in those lines is just the differences during childhood. Their theory is that there is a delay in brain maturation, so the graph shows some difference in volume until roughly age 14; after that the volumes are comparable, meaning that children with ADHD have a late maturation of those brain regions.
    From this point of view I don’t see any problem in the graphs, although I think that the problem may lie in the other analyses.

    • Gabriel:

      The graphs are hard to interpret because the successive points in the moving average are dependent, so it’s not clear how much information is in the apparent consistency of the patterns at low ages.

  4. In view of recent discussion, my question is: would the types of problems with this study be avoided by requiring programming and/or calculus? I look forward to hearing what people have to say, and I’ll venture my thoughts. As I’ve said before, I don’t think either is necessary or sufficient for avoiding issues such as those identified in this paper. I’m sure a “good” programming or calculus course would be helpful – for that matter, a “good” sociology course would be helpful. But I believe a good data analysis course (yes, with a good GUI) would go further towards helping people think sensibly about the data they have and what it means than requiring either programming or calculus.

    So, what do others think?

    • Perhaps the ADHD diagnosis, or even just visiting a psychiatrist, leads to a smaller brain. I mean, this is too obvious, but there is a reason psychiatrists are called shrinks. Also, perhaps the type of parent who would bring their child to a psychiatrist for ADHD-type issues has a smaller brain…

      I didn’t read too carefully but it wasn’t clear where these controls came from.

    • I think you could make an argument that knowing how to program might make this sort of problem worse. The more you know how to do, the more options there are for doing weird, inappropriate things.

      To be clear, I think it’s very, very useful to learn (statistical) programming, for a number of reasons (e.g., it frees you from only being able to use what’s available in GUI-based software, it makes tools like pystan and pymc available to you, it makes it easy to leave a written record of everything you’ve done). But the fact that programming can be extremely useful does not imply that its effects are all positive all the time.

    • The one last thing I will push back against a bit, based on personal experience, is the notion that GUIs enforce good data analysis habits. They do not. As Dale pointed out in a post a few days ago, what produces a good analysis is thoroughly thinking through the problem and understanding the assumptions underlying what one is doing, whether using a GUI or code. One can make most of the same mistakes with a GUI that one can with code. It does not solve the problem of poorly trained analysts using tools about which they have little understanding, applied to problems they only superficially understand.

  5. I’ve never quite understood the notion that a *smaller* brain is a *worse* brain. What makes brains work is connections, right? Can’t smaller brains have more connections? Do we complain that microprocessors have gotten too small to do computing? Wasn’t all that brain-size-correlated-with-intelligence stuff discredited long ago, even if Stephen Gould completely cheated when he tested it?

    • You can function alright while missing most of your brain:

      Three years ago, a 44-year-old man was admitted to hospital in Marseille, France, complaining of weakness in his left leg. He had no idea what doctors would find to be the source of the problem: a huge pocket of fluid where most of his brain ought to be.

      “We were very surprised when we looked for the first time the CT scan,” says Lionel Feuillet, a neurologist at the Mediterranean University, Marseille. “The brain was very, very much smaller than normal.” Nevertheless, subsequent tests showed the man to have an IQ of 75 — at the lower end of the ‘normal range’.

      The patient was a married father with two children and a job as a civil servant. His problems with his left leg were a neurological symptom of the condition, says Feuillet.

      http://www.nature.com/news/2007/070716/full/news070716-15.html

    • Aha, gotcha. Are you suggesting ADHD is “worse?” Perhaps it is just worse from the point of view of certain powerful social institutions, such as schools, in their current form.

      • Actually, I was assuming it was better! Higher IQ and smaller, ie more compact, brains. I suppose heat dissipation could be an issue though. Just don’t wear a hat!

    • It sounds like the main point of contention is not a “small brains are worse” claim, but this:

      “The data from our highly powered analysis confirm that patients with ADHD do have altered brains and therefore that ADHD is a disorder of the brain.”

      Given the controversy over the nature of ADHD, how it is diagnosed, and how it is treated, claims like “ADHD is a disorder of the brain” are going to be contentious regardless of whether there is anything “worse” about smaller brains generally.

      • That seems odd in so many ways. First, “have altered brains” begs the question of causality. Which way does it run? Second, learning statistics “altered my brain,” just about by definition, but that’s a good thing! Third, since what they are measuring are gross volumetric anomalies, shouldn’t there be at least some effort to demonstrate that gross volumetric anomalies have some independent meaning? Fourth, if ADHD weren’t a disorder of the brain, what would it be a disorder of? I suppose the alternative is that ADHD doesn’t exist, but since it is defined by a collection of symptoms, it exists wherever the symptoms do.

        • These are all great questions, and I don’t know what the answers are. I just gathered from the quote above that the main thing the authors were trying to establish was *some* kind of physical difference between ADHD brains and non-ADHD brains. I’d suspect the pushback would be from people who are either saying “ADHD doesn’t exist” or “ADHD is just a new name given to a collection of symptoms that have been around forever and so it isn’t a disorder” or “what we call ADHD is just an understandable reaction to children being brought up in more and more restrictive environments”. In all these cases it would seem that the claim being put forth could be called into question by a discovery of obvious and consistent differences in the brains of ADHD vs non-ADHD children (and it doesn’t sound like this paper shows obvious and consistent differences).

          But I know nothing about brains, so this is speculation.

        • The brain controls just about everything about our behavior and our cognition — the only exceptions I can think of are some reflexes — so of course any difference in behavior is going to be due to a different brain. Musicians have different brains from physicists, criminals have different brains from non-criminals, people who like anchovies have different brains from people who don’t, etc., etc.

          The claim that the brain of one group is _smaller_ on average (or has fewer or more connections, or a larger or smaller left frontal lobe, or whatever) could be interesting or informative, but it’s a trivial observation that there’s a _difference_. How could there not be?

          And I agree with Jonathon (another one) that even if one identifies a systematic difference in the brains of people in two groups, it’s not clear which way the causality runs. If musicians have a larger area of their brains devoted to processing music, does that mean people with ‘musical brains’ tend to become musicians, or does it mean that people who practice a lot of music end up with more development in that area? Or perhaps both are true?

          In the past ten or fifteen years I have often heard the claim that teenagers are incapable of being good decision-makers because the decision-making parts of their brains aren’t developed enough. But even if this is true, could it be because in modern society children are not required to make consequential decisions, or indeed permitted to do so? If we imagine a world in which nobody practiced music until age 18, perhaps people would say “teenagers can’t be good musicians because the musical part of the brain hasn’t finished developing yet.” I also note that David Farragut (the “damn the torpedoes” guy) was given command of a ship at age 12; Alexander the Great made important and competent decisions at age 16; and until the last 50 years or so it was common to leave teenagers, even young teenagers, in household or business circumstances that required decision-making, and that they were deemed sufficiently competent to do so. I do not assert that teenagers would in fact be capable of making good decisions if only they had practice at doing so; I merely raise the possibility.

        • Indeed, totally inconceivable.

          Brian Harold May, CBE (born 19 July 1947) is an English musician, singer, songwriter and astrophysicist, best known as the lead guitarist of the rock band Queen. (Wiki)

        • Phil: Don’t forget the phenomenon known as neuroplasticity: it is the mind-states that gradually bring about changes in the brain. As I have stated in my other comment, the psychological stress that these individuals are subjected to (as a result of being given a label, etc.) CAUSES gradual changes in the brain. This is what the researchers observed – it is a wrong interpretation of the data.

        • Phil: Nice argument – I would add that representations can be different in the same physical brain (though those different representations will soon change the same brains differently).

  6. I didn’t see any mention of the possibility that many of the subjects with ADHD have been taking medication for it, and that the medication might cause brain changes.

    • From one of Andrew’s block quotes in the post: “Psychostimulant medication use (all p>0·15) or symptom scores (all p>0·02) did not influence results, nor did the presence of comorbid psychiatric disorders (all p>0·5)”. Which I guess doesn’t directly clarify whether they examined *every* medication that people might be taking, but they at least checked some of them.

  7. One thing that jumps out at me when reading this type of study is that even if the study were done perfectly, all it says is that on average some parts of the brain are slightly smaller in people with ADHD than in people without ADHD.

    So many research articles make similar claims. Thinking of experimental design: if one is growing vegetables or making widgets, differences in averages are very meaningful because they increase or decrease yields. But what does it really mean if one group has a slightly different average brain size than another, even if that is true? As the article says, if the distributions were separated enough to help with discrimination that would be useful, but with so much overlap it seems useless.

    I can’t help but wonder how much money was spent on collecting these data. Given that the work is funded by NIH, maybe the data will be made public at some point? It would be interesting to look for other differences: whether people with ADHD have smaller big toes than people without ADHD, and such things. I studied body measurements (creating race-specific growth charts) for a time, and also looked at different measurements in kids with fetal alcohol exposure, and it was striking (to me) how many body measurements differ across different groups of people. I also often wonder how much things are influenced in these types of cases by a few people who have undiagnosed disorders with microcephaly due to fetal alcohol syndrome, inadequate nutrition, or genetic illnesses; given that the average differences are small, it could be a small group of individuals who are fueling the difference, as these disorders come with ADHD-like symptoms as well.
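    A quick back-of-the-envelope version of that last point (my own illustrative numbers, nothing from the paper): if a fraction f of the ADHD sample is shifted down by delta standard deviations and the rest looks just like the controls, the group mean shifts by roughly f times delta.

    ```python
    # Hypothetical numbers: how large a subgroup would it take to produce a
    # mean difference of about d = 0.15 all by itself?
    target_d = 0.15
    for delta in (1.0, 2.0, 3.0):      # subgroup shift, in SD units
        f = target_d / delta           # required fraction of the sample
        print(f"shift of {delta:.0f} SD -> roughly {f:.0%} of the sample")
    ```

    So a subgroup of roughly five to fifteen percent of cases with markedly smaller volumes would be enough to account for mean differences in the reported range.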

  8. I have been working with such MRI stuff for nearly 8 years now. The biggest effect on the volume of any brain structure I saw was the re-scan effect, i.e. the change in volume between scans of the same subject on the same scanner within a few days. Brain volume data is very noisy. Not to mention the extreme differences among different MR scanners, the huge effect of small changes of the pre-processing pipeline, … So personally, I flush those studies down the toilet and would advise you to do the same.

  9. ADHD is diagnosed using merely a checklist – no objective tests are used for diagnosis, because there are no brain differences between people who have this condition and normal people at the time of the diagnosis (this point is valid for all DSM disorders, not just for ADHD). Now, being labeled as a “person with a disorder” and being told that “these are long-term conditions” (which is often the case for all mental disorders) leads to a great deal of psychological stress for that person, and this stress gradually results in structural changes in the brain. See the following studies (published in Nature journals) that have shown changes in brain structure as a result of psychological stress. (There are more studies like this – I am listing only two due to limited space.)

    Davidson, R. J., and McEwen, B. S. (2012). Social influences on neuroplasticity: stress and interventions to promote well-being. Nature Neuroscience 15(5): 689–695.

    Popoli, M., et al. (2012). The stressed synapse: the impact of stress and glucocorticoids on glutamate transmission. Nature Reviews Neuroscience 13(1): 22–37.

    So the structural changes that this ADHD study detects, several years after the ADHD “diagnosis,” are a result of psychological stress on the brain.
    This misleading study should be retracted, because it is based on an incorrect cause-and-effect conclusion, in addition to having the limitations listed above.

    • Yep, as I said above: “there is a reason psychiatrists are called shrinks”. Unfortunately such studies written by people who do not understand rudimentary aspects of science such as “the need to compare different explanations for an observed correlation” are the norm in medical research. There is no chance it will be retracted, at least not for that reason.

  10. The claim that maturation rates are delayed in ADHD brains is absolutely not novel. This has been confirmed over decades of research.

    What is new is that this delay seems to be neuroprotective across the lifespan! This means that the delay may be helpful later in life!

    As an aside – ADHD is the second most heritable neurodevelopmental disorder, and the genetics are related to diminished expression of dopamine receptors in the caudate part of the brain.

    The graphs regress out total brain volume and other effects across site, etc., and this is good practice. Total brain volume differs across humans, and one wants to correct for this when making these types of comparisons.

    The term “mega-analysis” is actually correct here. Why? Because NONE of the raw data was shared. This is mostly due to privacy concerns. Each site shared only the statistics already extracted using a program called FreeSurfer.

    Essentially, the critiques from Mad in America are mostly broad strokes criticisms of psychiatry – and criticisms of how the press chose to interpret the findings of the paper.

    PS – I’m not Hoogman or anyone – but I am a scientist in this domain
