AI Can Make Moral Judgments, but Should It?
Evaluating a new artificial intelligence tool for moral reasoning.
Posted November 26, 2021 | Reviewed by Davia Sills
- Are machines ready to undertake the (up until now) uniquely human enterprise of ethics?
- Delphi is a natural language processing model trained on a large dataset of human language samples to make ethical evaluations.
- It seems that humans can create machines that can make moral decisions, but the question of whether we should remains.
This post was written by Melanie McGrath and Melissa Wheeler, Ph.D.
Around the world, governments and citizens are increasingly attending to the ethical implications of our growing development and use of artificial intelligence (AI) systems. AI is already incorporated into data governance, recruitment processes, parole decisions, health diagnoses, and nearly every facet of our social media experience. As the scope of machine activity expands and our reliance on it deepens, so does the potential for both profound benefit and equally profound risk.
Ethical risks of AI
Frameworks to manage the ethical risks presented by AI frequently focus on the external imposition of principles or safeguards, such as data privacy, identifiable chains of accountability for algorithmic processes, and ensuring the programming or training of systems does not introduce discriminatory social biases. Some computer scientists, however, are taking a different approach and attempting to develop intelligent systems that emulate human moral reasoning and decision-making. Such efforts present a meaty question for moral psychologists: Are machines ready to undertake the (up until now) uniquely human enterprise of ethics?
An impressive recent effort towards the creation of ethical AI is the Delphi model of moral reasoning. Delphi is a natural language processing model trained on a large dataset of human language samples to make ethical evaluations. It can respond to free-form questions (“Should I yell at my co-worker?”), evaluate assertions of moral judgment (“Women cannot be scientists”), or select the morally preferable of two options (“Going to bed earlier than my roommate” or “Not being honest to my parents about university”). Initial testing of Delphi’s judgments against those of human annotators suggested greater than 90 percent accuracy. The model also shows considerable social and contextual nuance, distinguishing between the ethical acceptability of, for example, “running a red light” versus “running a red light in an emergency.” But is Delphi ready for real-world moral reasoning? Not quite, it would seem.
When Delphi’s creators opened it up to the world, questions from interested parties online and the ensuing answers to those questions raised a few eyebrows. For example, when asked if it was permissible to “do genocide if it makes me very, very happy,” the answer was, “It’s OK.” And the model was less than successful when it came to stemming human biases: “Men are smarter than women.” Answer? “It’s expected.”
Other computer scientists have weighed in, signposting where Delphi might fall down as a system of moral reasoning. Psychological scientists, especially those focusing on the psychology of morality, are unlikely to be surprised that early efforts to create moral machines have fallen short of the mark. Variation in human morality within and between individuals, cultures, and times is vast. True ethical decision-making requires an understanding that ethics are dynamic, and the distinction between moral principle and social convention is often fine.
It’s all relative
Swathes of literature in the psychological sciences attest to the fact that what is considered a moral violation is context-dependent, constantly shifting, and evolving. The well-documented findings of Moral Foundations Theory indicate that liberals and conservatives emphasize different domains of morality, with liberals almost exclusively focused on notions of harm and fairness and conservatives endorsing an expanded set of moral concerns that includes loyalty and respect for hierarchy and tradition. Our own research shows that people differ in their understanding of what is harmful and who is harmed. There is considerable variation, for example, in evaluations of what constitutes prejudice and bullying, particularly at the margins. Any artificial intelligence trained on particular subsets of human language is therefore likely to reflect the moral priorities of a limited range of demographics.
Morality versus etiquette
Another outcome of the dynamism of human morality identified by Elliot Turiel is that at any point in time, some social norms will reflect ethical principles of acceptable behavior, whilst others are better considered etiquette—standards of normal behavior that ensure the smooth functioning of society. Delphi, and likely any model of machine ethics to follow, struggles with this distinction, labeling examples like “wearing a pajama to a party” or “feeding your cat using forks” as moral wrongs. Without any clear means of distinguishing the moral from the conventional, it is difficult to accurately classify an AI model as a system of distinctly moral reasoning.
Moral intuitions (over reason)
Perhaps the most fundamental issue for machine ethics from the perspective of moral psychologists, however, is its necessary conceptualization of moral evaluation as an act of pure reason and logic. While the notion of moral evaluation as an outcome of reason (human or machine) was dominant in early moral psychology, social psychology has largely moved on from this view. Modern psychological accounts of morality tend to focus on intuition and social-situational influences. Jonathan Haidt’s social intuitionist model, for example, proposes that moral decision-making is automatic and based on intuitive, emotional reactions. Given that social psychology no longer embraces the notion of moral evaluation as the outcome of dispassionate reasoning, this raises the question of whether we can truly hope for machine ethics to approximate human ethical decision-making.
The conversation regarding ethical AI is only just beginning, and there is significant scope for psychological researchers and business ethicists to contribute to the understanding of not only whether we can create machines that can make moral decisions but whether we should. While artificial intelligence may be able to be trained to handle judgments regarding social norms or conventions, we may find that moral and ethical judgment continues to be led by humans in collaboration with their computationally powerful creations.
Dawson, D., Schleiger, E., Horton, J., McLaughlin, J., Robinson, C., Quezada, G., Scowcroft, J., & Hajkowicz, S. (2019). Artificial intelligence: Australia’s ethics framework. Data61 CSIRO, Australia.
Graham, J., Haidt, J., & Nosek, B. A. (2009). Liberals and conservatives rely on different sets of moral foundations. Journal of Personality and Social Psychology, 96(5), 1029. https://doi.org/10.1037/a0015141
Haslam, N., Dakin, B. C., Fabiano, F., McGrath, M. J., Rhee, J., Vylomova, E., Weaving, M., & Wheeler, M. A. (2020). Harm inflation: Making sense of concept creep. European Review of Social Psychology, 31(1), 254–286. https://doi.org/10.1080/10463283.2020.1796080
Jiang, L., Hwang, J. D., Bhagavatula, C., Bras, R. L., Forbes, M., Borchardt, J., Liang, J., Etzioni, O., Sap, M., & Choi, Y. (2021). Delphi: Towards machine ethics and norms. arXiv preprint arXiv:2110.07574.
Knight, W. (2021, October 28). This program can give AI a sense of ethics - sometimes. Wired. https://www.wired.com/story/program-give-ai-ethics-sometimes/
McGrath, M. J., & Haslam, N. (2020). Development and validation of the Harm Concept Breadth Scale: Assessing individual differences in harm inflation. PLoS One, 15(8), e0237732. https://doi.org/10.1371/journal.pone.0237732
Talat, Z., Blix, H., Valvoda, J., Ganesh, M. I., Cotterell, R., & Williams, A. (2021). A word on machine ethics: A response to Jiang et al. (2021). arXiv preprint arXiv:2111.04158.