Can We Estimate Our Own Ability to Reason?

The Dunning-Kruger effect isn’t the only snag for self-report reasoning scales.

Key points

  • Self-report measures of reasoning rely on people's perceptions of their own reasoning abilities, while behavioral tests show how they perform.
  • Research reveals a discrepancy between self-report and behavioral measures of reasoning.
  • To understand people's reasoning ability, both self-reports and behavioral measures are needed.

Suppose you ask me how often I tend to overcome faulty intuitions and biases by stopping to reflect on my initial impulse. I probably think that I'm relatively reflective. I've been studying philosophy for over a decade and philosophers tend to be reflective (Livengood et al., 2010). Also, I've been doing cognitive science for nearly a decade and cognitive scientists are probably familiar with the evidence suggesting that we should question our gut in at least some circumstances (e.g., Scherer et al., 2017; Tversky & Kahneman, 1983).

But suppose you also give me a reflection test with questions that lure me toward intuitively appealing answers that—upon reflection—I can realize are incorrect (e.g., Frederick, 2005). That may reveal that I am not as reflective as I believe. This would make sense. We often overestimate ourselves (Zell et al., 2020).

However, there is more: Some have found that the less skilled people are, the less they seem to realize it (Kruger & Dunning, 1999). So if I am a less reflective reasoner, then I may be even more likely to overestimate my reflective tendencies than more reflective reasoners.

Self-Report vs. Behavior

This discrepancy between self-report and behavioral measures of reasoning raises questions about the trustworthiness of self-reported reasoning ability. After all, suppose we measure people's reasoning ability the way we measure many personality traits (e.g., by merely asking people to rate their agreement with statements like, "I prefer complex to simple problems" and "Thinking is not my idea of fun"; Cacioppo & Petty, 1982). Then we may be systematically overrating poor reasoners, who are more likely to overestimate their own reasoning ability.

However, there may be a bigger problem than the possibility that poor reasoners are less aware of their actual reasoning performance. The bigger problem may be that self-reported habits can be a poor predictor of actual habits more generally (e.g., Parry et al., 2021).

Some Evidence

Various studies have found this kind of disconnect between self-reported and behavioral measures of reflective reasoning.

  • Mirja Peret and Petra Rekat (2020) had high school students and professionals (a) report their agreement with sentences like the ones above (a.k.a. the "Need for Cognition" or NFC scale) and (b) complete a series of financial decision-making tasks. Alas, they found "no difference in the NFC scores among participants that solved [financial decision-making tasks] correctly and those who [did] not" (ibid.).
  • Gärtner et al. (2021) had people complete the NFC items as well as various tests of cognitive control, working memory, and other executive functions. They "found no conclusive evidence that NFC was related to any executive function measure [but instead] obtained ... moderate evidence for the null hypothesis" (ibid.).
  • Newman and colleagues (2020, Experiment 3) found that people who scored themselves higher in "need for cognition" were more (not less) "susceptible to the illusory truth effect": the bias toward believing things that are repeated—even if they are false. (Then again, De Keersmaecker et al. (2019) found that even behavioral measures of reasoning didn’t seem to predict less susceptibility to the illusory truth effect, suggesting that the illusory truth effect may not be as related to reasoning ability as we might have expected.)
  • Coutinho et al. (2021) had people score themselves on how unreflectively they reason by reporting their agreement with statements like, "The first idea is often the best one" and "When I need to form an opinion about an issue, I completely rely on my intuition"—a.k.a., the Faith In Intuition scale (Epstein et al., 1994). Coutinho and colleagues also had people complete a behavioral test of whether people stop to reconsider initial intuitions. Not only did they find a Dunning-Kruger effect, they also found that self-reported "faith in intuition" was "not related to intuitive responding" on a behavioral test, replicating earlier work (Pennycook et al., 2015).


  1. One potential takeaway is that measuring reasoning via self-report is inferior to measuring reasoning via actual behavior. After all, worse reasoners have been more likely to overestimate their reasoning skills and some self-scored measures of reasoning seemed to be unrelated to actual reasoning performance.
  2. Another possible takeaway is that self-reported and/or behavioral measures of reasoning do not measure what we think they do. Perhaps one's self-reported habits measure idealized perceptions of oneself rather than one's actual tendencies (e.g., Brown-Iannuzzi et al., 2019). Or perhaps some behavioral reflective reasoning tests measure more than just reflection (e.g., Byrd & Conway, 2019). These possibilities could help explain why we find less overlap than expected when we compare behavioral measures of reasoning with self-report measures of reasoning.

Of course, these possibilities are not mutually exclusive. The point is just that if you need to assess someone's reasoning, then you might want to triangulate with both self-report and behavioral measures—especially if you are evaluating your own reasoning ability.
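
For readers who want to try this kind of triangulation on their own data, here is a minimal sketch in Python. It is not the method used in any of the studies cited above. The function names are hypothetical, and it assumes each person has one self-report score and one behavioral score. It puts both scores on a common scale, estimates how strongly they agree, and flags who rates themselves higher than their behavior suggests.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize scores to mean 0 and standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("scores must vary to be standardized")
    return [(v - m) / s for v in values]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys):
        raise ValueError("both measures need one score per person")
    zx, zy = z_scores(xs), z_scores(ys)
    return mean(a * b for a, b in zip(zx, zy))


def overestimation_gaps(self_report, behavioral):
    """Standardized self-report minus standardized behavioral score.

    A positive gap means a person rates themselves higher, relative to
    the group, than their test performance suggests.
    """
    if len(self_report) != len(behavioral):
        raise ValueError("both measures need one score per person")
    zs, zb = z_scores(self_report), z_scores(behavioral)
    return [s - b for s, b in zip(zs, zb)]


if __name__ == "__main__":
    # Hypothetical toy scores for illustration only, not real data.
    self_report = [4, 4, 5, 3, 4, 5]  # e.g., agreement ratings on a 1-5 scale
    behavioral = [0, 1, 3, 1, 2, 3]   # e.g., correct answers on a reflection test
    print("correlation:", round(pearson_r(self_report, behavioral), 2))
    print("gaps:", [round(g, 2) for g in overestimation_gaps(self_report, behavioral)])
```

A correlation near zero would echo the disconnect described above, and a pattern of larger positive gaps among the lowest behavioral scorers would resemble a Dunning-Kruger effect. Because the gaps are relative to the group, they say who overestimates more than others, not whether the group as a whole overestimates.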


