Can AI Score Higher Than Humans on Emotional Intelligence?
LLMs outperformed people on tests of emotional intelligence.
Updated June 6, 2025 | Reviewed by Kaja Perina
Can a machine process emotional information better than humans? Artificial intelligence (AI) large language models (LLMs) are becoming more sophisticated, and researchers worldwide continue to probe their capabilities. A new study by researchers at the University of Geneva (UNIGE) and the University of Bern (UniBE) shows that six LLMs outperformed humans on emotional intelligence (EI) tests.
“These findings have major implications for the use of LLMs in social agents as well as for the assessment of socio-emotional skills,” wrote corresponding author Katja Schlegel, in collaboration with co-authors Nils Sommer and Marcello Mortillaro.
What Is Emotional Intelligence?
The term “emotional intelligence” was coined by American psychologists John D. Mayer and Peter Salovey in 1990 with the publication of their work in Imagination, Cognition and Personality. Mayer and Salovey defined emotional intelligence as “a set of skills hypothesized to contribute to the accurate appraisal and expression of emotion in oneself and in others, the effective regulation of emotion in self and others, and the use of feelings to motivate, plan, and achieve in one's life.”
Half a decade later, the concept of emotional intelligence was popularized worldwide by American psychologist, science journalist, and former Psychology Today contributor Daniel Goleman through his 1995 New York Times bestseller Emotional Intelligence: Why It Can Matter More Than IQ, which was translated into 40 languages.
Why Does Emotional Intelligence Matter?
Emotional intelligence may contribute to the quality of a person’s life in such areas as personal and professional relationships, family ties, education, employment, mental health, and many others.
The researchers behind this new study point out that emotional intelligence is frequently associated with more favorable outcomes in the workplace and in other areas of an individual’s life.
“Emotions are crucial for forming and maintaining social bonds and effectively communicating them is vital for achieving positive outcomes in individuals and groups,” wrote the paper's authors.
How to Measure Emotional Intelligence in AI?
Given the importance of emotional intelligence, the researchers set out to evaluate the emotional intelligence abilities of AI large language models. But how? In humans, emotional intelligence is currently evaluated using either self-report questionnaires or ability-based tests. The tests used in this study present scenarios with multiple-choice answers. With that format, the team evaluated six different LLMs on five different emotional intelligence tests.
The six LLMs tested for this study were ChatGPT-o1, ChatGPT-4, Claude 3.5 Haiku, Gemini 1.5 Flash, DeepSeek V3, and Copilot 365. The five emotional intelligence tests were the Situational Test of Emotion Management (STEM), the Situational Test of Emotion Understanding (STEU), the Emotion Regulation and Emotion Management subtests of the Geneva Emotional Competence Test (GECo), and the Geneva EMOtion Knowledge Test (GEMOK-Blends). These tests measure not only the test-takers' understanding of the causes and consequences of emotions but also their understanding of the best way to manage one’s own emotions and the emotions of others.
The AI large language models scored significantly higher than the human participants in the tests’ original validation studies: an average accuracy of 81 percent for the LLMs versus 56 percent for the humans.
“These results contribute to the growing body of evidence that LLMs like ChatGPT are proficient—at least on par with, or even superior to, many humans—in socio-emotional tasks traditionally considered accessible only to humans, including Theory of Mind, describing emotions of fictional characters, and expressing empathic concern,” the researchers wrote.
To further evaluate the LLMs' capabilities, the research team used ChatGPT-4 to generate entirely new scenarios for all five of the emotional intelligence tests, covering both work and personal-life domains.
Here is one example of a new scenario with multiple-choice answers generated by ChatGPT-4 for the Situational Test of Emotion Management.
Emotion: Disgust, Domain: Personal-Life
Dave's neighbor's dog keeps leaving messes in his yard. What action would be the most effective for Dave?
a) Return the mess to his neighbor's yard.
b) Confront his neighbor and express his concern about the issue.
c) Ignore it and clean up the mess himself.
d) Report the neighbor to the police.
ChatGPT-4 marked choice b as the correct answer for its generated emotional intelligence test question. If you decided not to retaliate by flinging the dog’s mess into the neighbor’s yard, not to stay silent and clean it up yourself, and not to escalate the situation with a call to the police, then you chose the optimal response for emotional intelligence.
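For readers curious how an accuracy figure on a test like this could be computed, here is a minimal sketch in Python. It assumes each item has a single keyed answer and that accuracy is simply the share of responses matching the key; the study's actual scoring procedure may differ. The names Item and accuracy are hypothetical, and the four responses are invented for illustration only, not data from the study.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One multiple-choice scenario with a single keyed answer (hypothetical type)."""
    prompt: str
    options: dict[str, str]
    key: str  # lowercase letter of the option treated as correct


def accuracy(items: list[Item], responses: list[str]) -> float:
    """Return the share of responses that match each item's keyed answer."""
    if len(items) != len(responses):
        raise ValueError("need exactly one response per item")
    if not items:
        raise ValueError("need at least one item")
    correct = sum(
        response.strip().lower() == item.key
        for item, response in zip(items, responses)
    )
    return correct / len(items)


# The sample item quoted in the article (keyed answer: option b).
dog_item = Item(
    prompt=(
        "Dave's neighbor's dog keeps leaving messes in his yard. "
        "What action would be the most effective for Dave?"
    ),
    options={
        "a": "Return the mess to his neighbor's yard.",
        "b": "Confront his neighbor and express his concern about the issue.",
        "c": "Ignore it and clean up the mess himself.",
        "d": "Report the neighbor to the police.",
    },
    key="b",
)

# Invented responses from four imaginary test-takers, for illustration only.
hypothetical_responses = ["b", "c", "B", "d"]
score = accuracy([dog_item] * 4, hypothetical_responses)
print(f"{score:.0%}")  # two of the four responses match the key
```

Averaging such per-test scores across several tests would give an overall figure comparable in kind to the percentages reported above, under the same single-key assumption.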
The scientists then gave the ChatGPT-4-generated versions of all five emotional intelligence tests to 467 human participants. Test difficulty was statistically equivalent between the standard emotional intelligence tests and the ChatGPT-generated tests.
“These findings suggest that LLMs can generate responses that are consistent with accurate knowledge about human emotions and their regulation,” the researchers concluded.
Copyright © 2025 Cami Rosso. All rights reserved.