What explains my lack of openness toward this research claim? Maybe my cortex is just too damn thick and wrinkled

Diana Senechal writes:

Yesterday Cari Romm reported that researchers had found a relation between personality traits and cortex shape: “People who scored higher on openness tended to have thinner and smoother cortices, while those who scored high on neuroticism had cortices that were thicker and more wrinkled.”

I [Senechal] looked up the study itself and read:

These findings demonstrate that anatomical variability in prefrontal cortices is linked to individual differences in the socio-cognitive dispositions described by the FFM.

At the end of the Statistical Methods section, the authors state:

To control for multiple comparisons in the SBM analysis, cluster correction was completed using Monte Carlo simulation (vertex-wise cluster forming threshold of P < 0.05) at a cluster-wise P (CWP) value of 0.05.

In the discussion, they return to this issue:

In terms of potential shortcomings, it can be surmised that a relatively large number of statistical tests was performed. This could have increased the probability of type I errors, although the use of a large sample size and state-of-art methods to correct for multiple comparisons should have mitigated against this problem.

Are they overly confident that they corrected for multiple comparisons?

My reply: To paraphrase Flavor Flav, 0.05 is a joke. If there are multiple potential comparisons to be considered, I think the researchers should study all of them and analyze them using a multilevel model. That makes a lot more sense than picking just one comparison and trying to correct for things. Why should you care about just one comparison? It strains credulity to think that something as multifaceted as “scoring high on openness” would just show up in one particular dimension. I mean, sure, you never know: useful discoveries could come from this approach. But I’m skeptical, as the statistical method seems like such a poor match to the problem being studied.
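To make the partial-pooling idea concrete, here is a minimal sketch with invented numbers (the region count, true effect sizes, and noise level are all made up for illustration; this is empirical-Bayes shrinkage, the simplest stand-in for a full multilevel model):

```python
import random
import statistics

random.seed(42)

# Hypothetical setup: raw effect estimates for many brain regions, each
# measured with sampling noise. The true effects are mostly tiny.
n_regions = 50
true_effects = [random.gauss(0.0, 0.02) for _ in range(n_regions)]
se = 0.05  # assumed per-region standard error of the raw estimate
raw = [mu + random.gauss(0.0, se) for mu in true_effects]

# Partial pooling: shrink each raw estimate toward the grand mean by a
# factor based on the estimated between-region variance (method of moments).
grand_mean = statistics.fmean(raw)
between_var = max(statistics.variance(raw) - se**2, 0.0)
shrink = between_var / (between_var + se**2)
pooled = [grand_mean + shrink * (r - grand_mean) for r in raw]

# The pooled estimates are less extreme than the raw ones, so the single
# most "significant" region no longer stands out by noise alone.
print(max(abs(r) for r in raw), max(abs(p) for p in pooled))
```

The point of the sketch: instead of testing each region separately and then correcting, all the estimates are modeled together, and the noisiest extremes get pulled in automatically.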

P.S. Senechal supplied the above picture. I have no idea how thick and wrinkly is the cortex of this cat.


  1. Garnett says:

    Many investigators are obligated by their funding agencies to measure a gazillion things under a particular grant. That leaves them with the contradictory problem of tons of measurements and little funded time to analyze the data.

    Massive tables of p-values are awfully appealing under these circumstances, and you can always fall back on “We’re not saying that personality is _caused_ by topography of the pre-frontal cortex, but they are linked!” (whatever that means)

  2. Jeff Walker says:

    A general question that might be useful for teaching: what of significance has been discovered by large-scale data mining? I don’t mean observational studies generally, but data mining (of the sort done in gene-association studies or these kinds of neuro-association studies)?

    • Jeff Walker says:

      And by “discovered” I mean that there have been numerous follow-up studies, hopefully experimental, that confirm the initial findings from the data mining.

      • Dylan Vrana says:

        Speaking specifically of gene-association studies, most major ones do replicate findings in independent samples. In terms of experimental work, the discovery of the link between complement component 4 and schizophrenia (which highlighted the role of synaptic pruning and immune function in schizophrenia) came out of the PGC gene-association study and was later replicated in the lab. Apolipoprotein E is consistently replicated as a risk factor for Alzheimer’s, though I’m not aware of the status of experimental work on it.

        In addition, much of the signal from gene association studies comes from aggregating the results. If, for example, your hits are in genes associated with immune function, that tells you something about the disorder being studied even if you haven’t thoroughly replicated each hit.
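        To illustrate the aggregation point, here is a small sketch of a pathway-enrichment calculation with invented counts (the gene totals and hit counts are hypothetical; the standard tool here is a hypergeometric tail probability):

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n hits are drawn without replacement from N genes,
    of which K belong to the pathway of interest."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

# Hypothetical numbers: 20,000 genes, 500 annotated as immune-related,
# 100 association hits, 15 of which land in immune genes. Under the null
# we would expect only 100 * 500 / 20000 = 2.5 immune hits.
p = hypergeom_sf(15, 20000, 500, 100)
print(f"enrichment p-value: {p:.2e}")
```

Even if no single hit is firmly replicated, an excess of hits concentrated in one functional category is itself informative about the disorder.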

  3. Paul says:

    Here, the number of comparisons refers to the number of vertices, not individual hypotheses. It is common (bad) practice to estimate one model for each vertex independently and perform some sort of correction on the resulting p-maps. This is the standard approach in this field (I worked for nearly seven years with this sort of data). If this does not yield the desired results, you can limit the analysis to a set of vertices instead of the whole brain, because, you know what, the effect may only be visible in the frontal lobe. Yeah, the garden of forking paths all over again. Oh, and if this does not help, just choose a different procedure for the correction step …

    Some hierarchical Bayesian approaches exist that are able to analyze all vertices jointly by considering the spatial relationship between vertices. In this way shrinkage can be applied and the results are more trustworthy.
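    A crude sketch of the spatial-shrinkage idea, on a made-up one-dimensional strip of vertices (the sizes, noise level, and smoothing weight are all invented; the iterative neighbor-averaging here is a simple stand-in for the conditional-autoregressive priors used in full hierarchical Bayesian models):

```python
import random

random.seed(1)

# Hypothetical 1-D strip of vertices; the true effect is spatially smooth,
# nonzero only in one contiguous patch.
n = 200
true = [0.3 if 80 <= i < 120 else 0.0 for i in range(n)]
noisy = [t + random.gauss(0.0, 0.2) for t in true]

# Spatial shrinkage: iteratively pull each vertex toward the mean of its
# neighbors while staying anchored to its own noisy measurement.
w = 0.5  # weight on the neighborhood mean; an assumed tuning constant
est = noisy[:]
for _ in range(20):
    new = est[:]
    for i in range(1, n - 1):
        neigh = 0.5 * (est[i - 1] + est[i + 1])
        new[i] = (1 - w) * noisy[i] + w * neigh
    est = new

mse_raw = sum((a - b) ** 2 for a, b in zip(noisy, true)) / n
mse_smooth = sum((a - b) ** 2 for a, b in zip(est, true)) / n
print(mse_raw, mse_smooth)
```

Because neighboring vertices borrow strength from each other, the spatially-informed estimates recover the underlying pattern with lower error than the vertex-by-vertex estimates.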

    However, in my opinion this is just the tip of the iceberg. The results strongly depend on the preprocessing pipeline, which consists of a series of complex transformations from different toolboxes (affine and non-linear registration, normalization, smoothing, …). In one of these toolboxes (FreeSurfer) the user even needs to check the results and correct them by hand. And then there is the effect of the scanner, so noisy …

    I also find it quite ridiculous that, in contrast to the rather complex preprocessing pipeline, the final vertex-wise analysis is always performed by a simple linear model. They always try to break it down to the simplest thing possible. If they could, they would even use a simple correlation coefficient.

    If I learned anything during the past years, it is not to trust results from MRI studies.

  4. Thank you, Andrew, for posting my question and cat picture and responding to both! And thanks to Paul and others for the comments. If I understand correctly, some of the problems lie (1) in relating a multifaceted dimension of personality to a feature of the cortex–that is, oversimplifying the personality dimension itself; (2) in applying a simple linear model to vertex-wise analysis–that is, oversimplifying the relation; and (3) in accounting (or failing to account) for the many possible inaccuracies of measurement, including those of the scanner.

    Here’s the full abstract of the study:

    “The five-factor model (FFM) is a widely used taxonomy of human personality; yet its neuroanatomical basis remains unclear. This is partly because past associations between gray-matter volume and FFM were driven by different surface-based morphometry (SBM) indices (i.e. cortical thickness, surface area, cortical folding or any combination of them). To overcome this limitation, we used FreeSurfer to study how variability in SBM measures was related to the FFM in n = 507 participants from the Human Connectome Project.

    “Neuroticism was associated with thicker cortex and smaller area and folding in prefrontal–temporal regions. Extraversion was linked to thicker precuneus and smaller superior temporal cortex area. Openness was linked to thinner cortex and greater area and folding in prefrontal–parietal regions. Agreeableness was correlated to thinner prefrontal cortex and smaller fusiform gyrus area. Conscientiousness was associated with thicker cortex and smaller area and folding in prefrontal regions. These findings demonstrate that anatomical variability in prefrontal cortices is linked to individual differences in the socio-cognitive dispositions described by the FFM. Cortical thickness and surface area/folding were inversely related to each other as a function of different FFM traits (neuroticism, extraversion and conscientiousness vs openness), which may reflect brain maturational effects that predispose or protect against psychiatric disorders.”

    • Martha (Smith) says:

      Oh that weasel word “linked”!

      • David P says:

        In conjunction with the more definitive word “demonstrate.”

        Webster’s: “Demonstrate – 1: to show clearly 2a: to prove or make clear by reasoning or evidence b: to illustrate and explain esp. with many examples”.

        • Martha (Smith) says:

          In context, “suggest” would be better than “demonstrate” — better yet, “suggest that anatomical variability in prefrontal cortices might be related to …”– in other words be really upfront about the uncertainty.

    • Paul says:

      (2) The vertex-wise analysis is one problem. However, the biggest problem in this kind of analysis is the large amount of researcher degrees of freedom: there are so many black boxes involved that you can show nearly anything you want.

      (3) Yes. For example, if you scan the same subject over and over again on the same scanner you will obtain quite different images. Those studies usually use only one image of each subject, so the data is largely corrupted by noise.

  5. Anne Pier Salverda says:

    Diana, now I get the picture. Voxelcat!
