I recently received an email from an expert witness service that contained the following request for an expert witness:
"Our client, representing a party in a patent matter, is looking for an expert in clinical psychology. The expert will need to discuss personalities, and the history of personality types and tests. In particular, the expert will need to discuss the state of personality testing and typing in the late 1990s, whether and how much subjectivity was involved in determining the boundaries of different types, where various people would fall, etc. The ideal expert will have previous experience working on patent matters."
As fate would have it, this request deals precisely with a topic I had planned to blog about—how personality psychologists place people into categories based on their scores on self-report personality tests. As the request from the expert witness service indicates, there is some subjectivity involved when deciding whether someone falls into a type category or not. The process is not totally subjective, but there are some arbitrary rules involved in the process, and readers might be interested in learning about them.
The challenge of assigning persons to type categories based on self-report personality inventory scores is that these scores usually range continuously from very low to very high without any breaks in the range of scores. For example, the graph below shows how many people in a group of almost 620,000 scored at different levels of an Extraversion scale, where scores can range from 24 to 120. (These results come from a study I recently published: Johnson, 2014.)
The distribution of scores in this sample forms the familiar bell-shaped curve, with most persons near the average score for the group (82), with steadily decreasing numbers of persons toward the low and high ends of the scale. And you will see that there is no break in the distribution of scores. This means that there is no obvious point where a score would be high enough to call a person an "Extravert" or low enough to call a person an "Introvert."
The absence of any clear boundary separating introverts from extraverts is why most personality psychologists use the language of personality traits rather than personality types. Let's say a person scored 92 on this Extraversion trait scale. A 92 is clearly above the average score of 82, but is it high enough to type the person as an extravert? Most personality psychologists will not even try to answer this question. Instead, they will use some sort of statistical language to describe how high this Extraversion trait score is. One such statistical term is a percentile score. A percentile score indicates the percentage of persons in a comparison group who score lower than a given score on the test. If you count how many people in this sample of almost 620,000 scored lower than 92, you would find that they make up 75 percent of the sample. So for this Extraversion scale and this group of persons, a 92 represents a percentile score of 75. (That result might be different if you were being compared only to males or females in the group, and/or to people of roughly the same age.)
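For readers who like to see the arithmetic, here is a minimal Python sketch of the two usual ways to get a percentile score. The function names and the short list of toy scores are made up for illustration and are not the study's data. The second function assumes the scores follow a perfect normal curve with the mean of 82 and the standard deviation of 14 reported for this sample, which is only an approximation.

```python
from statistics import NormalDist


def percentile_from_sample(scores, value):
    """Percentage of scores in the comparison group that fall below `value`."""
    below = sum(1 for s in scores if s < value)
    return 100.0 * below / len(scores)


def percentile_from_normal(value, mean, sd):
    """Percentile of `value` if scores followed an ideal normal curve (an assumption)."""
    return 100.0 * NormalDist(mu=mean, sigma=sd).cdf(value)


# Hypothetical toy scores, NOT the real sample of almost 620,000 people.
toy_scores = [60, 70, 75, 80, 82, 85, 90, 95]
print(percentile_from_sample(toy_scores, 92))

# Normal-curve approximation using the sample's reported mean and SD.
print(round(percentile_from_normal(92, mean=82, sd=14), 1))
```

Under the normal-curve assumption a score of 92 lands at about the 76th percentile, close to the 75 obtained by actually counting people in the sample. The small gap is what one would expect when real scores only approximately follow the bell curve.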
Although percentile scores are intuitively meaningful for most professional psychologists and non-psychologists alike, sometimes we want more interpretation of the significance of a score beyond the fact that it is higher than 75 percent of other scores. Specifically, sometimes we want to know if a score is considered "low," "average," or "high." In my own work, I constructed a narrative report program that prints different descriptions based on low, average, or high scores. For the extraversion trait, this means three possible report descriptions—low scores indicate the person is relatively introverted; high scores, relatively extraverted; and average scores, somewhere in between.
But the low/average/high classification is just as problematic as the two-category classification of introverts and extraverts. There are no clear boundaries in the graph that suggest what constitutes an "average" score. Psychologists who want to define such boundaries usually employ a very common rule-of-thumb, which is to define as "average" any score within one-half standard deviation of the mean (average) score. (Standard deviation is a statistic that describes how bunched up or spread out a set of scores is. There are plenty of sites on the Internet that explain the concept, like this video, and how to calculate it, like this page.)
For the sample of roughly 620,000 persons, the standard deviation of Extraversion scores is 14, so half of one standard deviation is 7. The rule of thumb would, therefore, suggest classifying scores between 75 and 89 as "average," as these scores lie within 7 points below and 7 points above the mean of 82. Scores of 74 or lower would be classified as "low" (introverted) and scores of 90 or higher would be classified as "high" (extraverted). With this scheme, that person who scored 92 would indeed be classified as an extravert.
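The one-half standard deviation rule can be written out as a short Python sketch. The function name `classify_half_sd` is hypothetical, and the default mean of 82 and standard deviation of 14 are simply the figures reported for this sample.

```python
def classify_half_sd(score, mean=82.0, sd=14.0):
    """Label a score low, average, or high using the one-half SD rule of thumb."""
    lower = mean - 0.5 * sd  # 75 for this sample
    upper = mean + 0.5 * sd  # 89 for this sample
    if score < lower:
        return "low"
    if score > upper:
        return "high"
    return "average"


for score in (74, 75, 89, 90, 92):
    print(score, classify_half_sd(score))
```

Note how a single point separates "average" from "high": 89 and 90 receive different labels even though the two people are practically indistinguishable on the trait.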
But what is the objective basis for defining "average" as plus-or-minus one-half standard deviation from the mean? There isn't any, and I have never been able to track down where this rule-of-thumb originated. In fact, the one-half standard deviation convention is not the only rule-of-thumb out there. Some schemes (see this chart) regard scores within one full standard deviation of the mean as average, and the next standard deviation out as low-average or high-average. With this scheme, that person who scored 92 would be classified as simply average—neither an introvert nor an extravert.
For a distribution of scores that perfectly displays the normal, bell-shaped distribution, this alternative scheme will place about 68 percent of people in the simply average category. If we also include the low-average and high-average people with the simply average group, this means that we are defining just over 95 percent of people as average. In contrast, the one-half standard deviation rule places about 30 percent of people in the low category, 40 percent in the average category, and 30 percent in the high category.
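These percentages can be checked against an ideal normal curve with a few lines of Python. This is a sketch under the assumption of a perfectly normal distribution; `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


half_sd = share_within(0.5)
one_sd = share_within(1.0)
two_sd = share_within(2.0)

print(f"within 0.5 SD: {100 * half_sd:.1f}%")
print(f"each tail beyond 0.5 SD: {100 * (1 - half_sd) / 2:.1f}%")
print(f"within 1 SD: {100 * one_sd:.1f}%")
print(f"within 2 SD: {100 * two_sd:.1f}%")
```

For an ideal bell curve the one-half standard deviation rule yields roughly 31 percent low, 38 percent average, and 31 percent high, which the round figures of 30, 40, and 30 approximate. The one- and two-standard-deviation bands come out at roughly 68 and just over 95 percent.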
So which rule should we be using if we want to classify people as low, average, or high on a personality trait? Do we really want to say that 95 percent of the population is neither introverted nor extraverted, but just somewhere in between, with only 2.5 percent introverts and 2.5 percent extraverts? Would it be more accurate to say that about 30 percent of the population is relatively introverted, 30 percent relatively extraverted, and 40 percent neither particularly introverted nor extraverted? How can we answer this question?
The lack of an objective rationale for choosing cutting points for low, average, and high is only one problem in moving from trait scores to type categories. Another problem is that we might be mistaken in assuming that our rules for defining low, average, and high scores on a self-report introversion-extraversion questionnaire will necessarily correspond to how people who know us well would describe our degree of introversion-extraversion. One would think that researchers would have investigated whether, say, the one-half standard deviation rule for self-report questionnaires actually corresponds to how other people perceive someone's personality traits, or whether there might be a better rule. But as far as I can tell, the only research on this topic is a small study I conducted seven years ago (Johnson, 2009).
In that study, I had participants complete a 300-item inventory that provided scores on the Big Five personality traits plus six facets of each of the Big Five. These participants provided me contact information for acquaintances who knew them well. I contacted these acquaintances and had them rate the participant's standing on each of the 35 traits with a percentile scale. That way, I was able to see how well the one-half standard deviation rule placed individuals into the low (less than 30th percentile), average (30th to 70th percentile), and high (greater than 70th percentile) categories, as perceived by acquaintances. It turned out that the one-half standard deviation rule, used by so many psychologists to classify people as low, average, or high, correctly classified only 41 percent of the cases. But a statistical procedure called Optimal Data Analysis correctly classified 62.2 percent of the cases, and was superior to the one-half standard deviation rule for 32 of the 35 traits.
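The idea of a "correct classification" rate can be illustrated with a small Python sketch. Everything in it is hypothetical: the function names, the five toy self-report scores, and the five toy acquaintance ratings are invented for illustration and are not the study's data or code. It only shows how agreement between the one-half standard deviation rule and the 30th/70th percentile categories might be tallied; it does not implement Optimal Data Analysis.

```python
def category_from_score(score, mean, sd):
    """Category of a self-report score under the one-half SD rule."""
    if score < mean - 0.5 * sd:
        return "low"
    if score > mean + 0.5 * sd:
        return "high"
    return "average"


def category_from_percentile(percentile):
    """Category of an acquaintance rating: below 30 low, above 70 high."""
    if percentile < 30:
        return "low"
    if percentile > 70:
        return "high"
    return "average"


def hit_rate(self_scores, acquaintance_percentiles, mean, sd):
    """Share of cases where the two categories agree."""
    hits = sum(
        category_from_score(s, mean, sd) == category_from_percentile(p)
        for s, p in zip(self_scores, acquaintance_percentiles)
    )
    return hits / len(self_scores)


# Hypothetical toy data for five people, for illustration only.
toy_self_scores = [60, 80, 92, 100, 70]
toy_acquaintance_percentiles = [20, 50, 65, 90, 40]

rate = hit_rate(toy_self_scores, toy_acquaintance_percentiles, mean=82, sd=14)
print(f"{100 * rate:.0f}% of toy cases classified the same way")
```

In this toy example the two mismatches are people whose self-report score falls just outside the "average" band while the acquaintance rating falls inside it, the kind of boundary case that makes discrete categories fragile.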
If neither 41 percent nor 62.2 percent sounds like very accurate classification to you, keep in mind that there will always be measurement error for both the questionnaire scores and the acquaintance ratings, and that there will always be a number of cases on the two boundary points that could have gone either way (e.g., a 30th percentile rating could easily imply either a low degree or a low-average degree of the trait). In other words, the measurement of personality is not perfectly precise, so it is risky to use scores to place people into discrete categories. Consequently, the next time you encounter a personality assessment in which the only result is classification into a discrete category, you might want to take the result with a grain of salt.
Johnson, J. A. (2009, July). Calibrating personality self-report scores to acquaintance ratings. Poster presented at the first stand-alone conference of the Association for Research in Personality, Evanston, IL.
Johnson, J. A. (2014). Measuring thirty facets of the five factor model with a 120-item public domain inventory: Development of the IPIP-NEO-120. Journal of Research in Personality, 51, 78-89. DOI: 10.1016/j.jrp.2014.05.003