Verified by Psychology Today


How Not to Do Personality Neuroscience: Brain Structure and the Big Five

No, dear, extroversion does not make your brain look big.

One of the oddest things to me about neuroscience is the infrequency with which neuroscientists criticize one another. The outrage that followed the 2009 publication of Vul et al., "Puzzlingly high correlations in fMRI studies of emotion, personality and social cognition," a.k.a. "Voodoo correlations in social neuroscience," illustrates the issue. (See, e.g. for a brief account of the dust-up.) Coming to neuroscience from fields where direct (and sometimes quite trenchant) criticism is fundamental to progress, I found the degree of offense caused by this paper difficult to fathom: it appears to make a sound methodological critique, albeit in a rather flamboyant way. The reaction suggests the existence of a set of social norms rather specific to the neuroscience community.

It also raises the question: If the community rarely drives progress with critique, what mechanism does it use? The neuroscientists I have talked to say that bad papers (and bad ideas more generally) don't need to be criticized (which is rude) because they will simply go away when nobody cites them. This is interesting on many levels, not least because of the degree of faith it implies in the unanimity of post-publication evaluation (which somehow failed to obtain during peer-review). There's a nice topic there for an aspiring sociologist of science.

But, since I'm not a sociologist of science, why do I bring this up? Because the 2010 publication of DeYoung et al., "Testing predictions from personality neuroscience: Brain structure and the big five," marked the beginning of my very small personal test of the effectiveness of the "just let it die" neuroscience-without-criticism approach to bad ideas. Just over a year and 46 citations in many prominent journals later, I think we can safely say the results are in. It's time to be critical.

The study is an attempt, as they put it, to "support our biologically based, explanatory model of the Big Five and demonstrate the potential of personality neuroscience (i.e., the systematic study of individual differences in personality using neuroscience methods) as a discipline." The Big Five, as you probably know, represent one prominent attempt to quantify individual differences using five factors: Extraversion, Neuroticism, Agreeableness, Openness/Intellect, and Conscientiousness. The authors analyzed the central traits defining each of these five factors and hypothesized that individual variation in each would be reflected in the anatomical volume of brain regions that have been associated with the same or similar traits. Here is a representative paragraph in their analysis:

"Conscientiousness appears to reflect the ability and tendency of individuals to inhibit or constrain impulses in order to follow rules or pursue nonimmediate goals. This trait is linked to both academic and occupational success, as well as to behavior that promotes health and longevity (Ozer & Benet-Martinez, 2006). Conscientiousness is characterized by traits such as industriousness, orderliness, and self-discipline, as opposed to impulsivity, distractibility, and disorganization. Conscientiousness is likely to be associated with functions of PFC, which is thought to be responsible for much of the human ability to plan and follow complex rules (Bunge & Zelazo, 2006; Miller & Cohen, 2001). Functional neuroimaging studies have linked trait impulsivity to both dorsal and ventral regions of lateral PFC (Asahi, Okamato, Akado, Yamawaki, & Yokota, 2004; Brown, Manuck, Flory, & Hariri, 2006). We, therefore, hypothesized that Conscientiousness would be associated with structural variation in lateral PFC." (p. 821)

To test these hypotheses, the authors gave participants a standard personality inventory, and then performed a structural MRI of their brains. They then chose a reference subject who was near the group average on the various personality measures and compared everyone else's brain to that reference. This procedure is something like systematically squeezing a balloon in various places until it matches the size and shape of another similar but somewhat different balloon. Some regions of the balloon (and of the participants' brains) need to be contracted and some expanded to get the shapes to match. These regional expansions and contractions are the source of the data determining which regions were larger and which smaller, and the degree of these volume differences was then correlated with the degree of difference in the personality measures.
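To make the logic of that last step concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the data are simulated, and the names `simulate_study`, `trait`, and `expansion`, along with every number in the simulation, are hypothetical choices made for illustration. The sketch only shows the bare idea of correlating a per-person expansion factor for one region (relative to a reference brain) with a per-person trait score.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def simulate_study(n_subjects=100, seed=0):
    """Simulate one trait score and one regional expansion factor per subject.

    Everything here is an assumption for illustration: the sample size,
    the weak built-in dependence of expansion on the trait, and the noise.
    """
    rng = random.Random(seed)
    # Hypothetical standardized trait scores (e.g., one Big Five factor).
    trait = [rng.gauss(0, 1) for _ in range(n_subjects)]
    # Hypothetical local expansion relative to the reference brain:
    # 1.0 means "same size as the reference"; above 1.0 means the region
    # had to be contracted to match, i.e. it was larger than the reference.
    expansion = [1.0 + 0.02 * t + rng.gauss(0, 0.05) for t in trait]
    return trait, expansion


trait, expansion = simulate_study()
r = pearson_r(trait, expansion)
print(f"correlation between trait score and regional expansion: r = {r:.3f}")
```

Note what the number that comes out does and does not tell you. It says that two columns of figures move together to some degree. It says nothing about what the extra volume is made of, or what it is doing, which is where the trouble starts.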

Consistent with their hypotheses, the authors did indeed find regions that correlated with four of the five personality factors: "Extraversion, Neuroticism, Agreeableness, and Conscientiousness. Extraversion covaried with the volume of the medial orbitofrontal cortex, a brain region involved in processing reward information. Neuroticism covaried with the volume of brain regions associated with threat, punishment, and negative affect. Agreeableness covaried with volume in regions that process information about the intentions and mental states of other individuals. Conscientiousness covaried with volume in the lateral prefrontal cortex, a region involved in planning and the voluntary control of behavior."

So, what's wrong with this? Shouldn't we let the data speak for themselves here, and conclude that the authors may indeed have taken "an important step toward the integration of individual differences research in psychology and neuroscience?" No, because data never speak for themselves. They are always placed inside an interpretive frame, and when that frame is inadequate, no interpretation can be valid. That is to say, without an adequate interpretive framework, we can't know the meaning of even the most statistically and methodologically sound finding. That appears to be the case here.

It is worth noting, first of all, that the authors have literally re-adopted the phrenological framework whereby having a bigger local brain region denotes more local neural power and more resources devoted to whatever it is that the region controls. However reasonable that idea may have been when the brain was understood by analogy with muscles, there is little good reason to resurrect this framework now.

Although the authors do not, of course, make the connection to phrenology by name, they do explicitly adopt the bigger-is-more assumption: "A greater-than-average volume of a specific brain structure may signify greater-than-average power to carry out specific functions associated with that structure, on the assumption that larger populations of neurons can produce larger outputs and can, therefore, be more influential than smaller populations of neurons." (p. 822) And this is exactly where the trouble begins, for there is little good evidence for this assumption.

First, although it is true ceteris paribus that a larger area of the brain will contain more neurons than a smaller area, there are many factors that complicate this general rule. Grey matter density varies across the brain and moreover changes at different rates in different areas at different stages of lifetime development, so different-sized regions may, in fact, contain the same number of neurons at a given moment, and different numbers at some other time. (Moreover, the different types of neurons, and the different styles of neural layering, will affect local density.)

Second, although it seems reasonable to suppose that more neurons imply greater processing power, the relation of function to structure is more subtle than that. Although function can be refined by neurogenesis (adding more neurons), it can also be refined both by adding and by pruning neural connections to existing neurons. Depending on the function to be performed, increasing the population or density of neurons can be beneficial, or it can be harmful. There really is no single good general rule here.

Third, even when a region contains more neurons because of its size in a given individual, this does not mean that this individual uses more neurons to perform the functions that have been associated with that region in studies analyzing group averages. An individual with more neural resources in a given area may use the "excess" for different purposes; without good functional studies of these individuals, it is impossible to know.

Fourth, even if an individual is devoting more neurons to a given task, this does not imply that these neurons will be more "influential" in that individual's brain (and thus have a greater influence on behavior). How "influential" a given neural population is will depend far more on the details of its synchrony and connectivity (the ratio of inter- to intra-cluster connections, for instance). Besides, an individual may devote more resources to a problem when it is relatively more difficult, thus indicating relative weakness rather than relative strength. There is suggestive evidence in the literature that demonstrates the complexity of this equation. London cab drivers appear to have a greater volume of grey matter in a particular part of their hippocampus (when compared, for instance, to London bus drivers). On the other hand, there is a large body of literature that suggests that experts often devote fewer neural resources to problems than novices do.

In general, there just isn't much good evidence for an overarching size = power equation. And although the authors do cite supporting evidence for the perspective, they appear to have misunderstood what that evidence shows. For instance, they cite a study that showed increased grey matter density (based on voxel-by-voxel comparisons of the amount of grey matter present) in the elderly after learning to juggle; but this study does not show changes in regional volume. Similarly, they cite a study establishing a positive correlation between overall brain volume and IQ. But nothing about that study supports the idea that local anatomical differences correlate with specific behavioral and functional differences. (Moreover, since early childhood nutrition positively correlates with bodily growth, with increased SES, with increased access to educational opportunities, etc., the general correlation between brain volume and performance on IQ tests is not that surprising, but for reasons having nothing to do with the hypotheses under consideration here.)

Unfortunately, it is not just the phrenological structural hypothesis that size = power that is flawed; the phrenological functional assumption that region = function is equally flawed. Consider the following passage from the discussion section: "Conscientiousness was associated positively with the volume of the middle frontal gyrus in left lateral PFC. The region of association was large, stretching from close to the frontal pole to the posterior region of lateral PFC. The middle frontal gyrus is crucially involved in maintaining information in working memory and in the execution of planned action. . . . Our results may, therefore, reflect the association of Conscientiousness with effective self-regulation at multiple levels of complexity, which would be in keeping with this trait's importance as a predictor of academic and occupational performance, health, and longevity." (p. 826)

It's quite true that the middle frontal gyrus is involved in those processes. But it is also involved in attention; in counting; in semantic and episodic retrieval; in rhyme generation; in music cognition; in processing color words; and in conflict detection, just to name a few of the things that activate this structure. Given the functional complexity of individual brain regions, to interpret the correlation between a complex personality factor like Conscientiousness and the volume of this region with reference to a selective subset of the tasks the region supports is simply to make up post-hoc just-so stories.

That concocting just-so stories is in fact what the authors are up to can be seen by comparing their discussion of the left middle frontal gyrus (lMFG) / Conscientiousness relation with their discussion of the left superior temporal sulcus (lSTS) / Agreeableness relationship: "Agreeableness was associated with reduced volume in the posterior left superior temporal sulcus and with increased volume in posterior cingulate cortex. The superior temporal sulcus is involved in the interpretation of other individuals' actions and intentions on the basis of biological motion (Pelphrey & Morris, 2006), a process that may be more efficient in individuals who score higher in Agreeableness." (p. 826)

Did you catch that sleight-of-hand? The increased volume of lMFG is indicative of the greater self-regulation typical of Conscientiousness, while the reduced volume of lSTS is indicative of the greater ability to interpret others exhibited in Agreeableness. When both positive and negative relationships between the variables of interest equally support a hypothesis, this is a sign that the authors don't yet really have a scientific handle on their area of inquiry. Just as disturbing is the fact that the authors do not find many of the relationships they do predict and find many relationships they didn't predict, and yet still count this as a scientific success.

Moreover, my colleague John Campbell, a specialist in individual differences, notes some cause for concern on the personality side of the ledger. It is, for instance, somewhat unusual to norm one's measures to a single individual. Presumably what the authors want is a neural explanation for the observed differences in personality (or a personality explanation for the observed differences in regional brain volume), but what we have here appears to be at best a series of potential explanations for deviations from the characteristics of a single individual. A subtle difference, maybe, but when combined with the other issues outlined above, it underscores the difficulty of making scientific sense of the reported results.

This brings us finally to the astonishing oversimplification inherent in a perspective that would even expect 1:1 mappings between personality factors as behaviorally complex as, say, Neuroticism and brain regions as functionally complex as ours seem to be. Many brain regions are involved in each cognitive and behavioral process, and each brain region is involved in many cognitive and behavioral processes. Cognition is supported by the real-time assembly of functional coalitions in the brain, and the same must be true for personality factors. The complex, relatively stable sets of traits and behavioral tendencies that we call Agreeableness or Extraversion come about only through the concerted activities of multiple networks of multiple regions.

And yes, of course, that makes the science hard. But even if 1:1 mappings were preferable to the authors for pedagogical reasons, it would at least have been an acknowledgment of the real complexity here had the authors analyzed the functional profiles of individual brain regions and tried to find all-things-considered best hypothetical matches between the range of functions a given region supports, and the range of traits comprising a given factor. But instead of trying to manage this complexity with deliberately chosen provisional abstractions, the authors simply ignore it. That makes the science easier, but only because it's no longer good science.