A new study from Georgia State University’s Psychology Department has found that artificial intelligence (AI) can outperform humans in making moral judgments. The study, led by Associate Professor Eyal Aharoni and published in the Nature Portfolio journal Scientific Reports, stemmed from the team’s curiosity about how language models address ethical questions.
The research was conceived in the style of the Turing test, the classic measure of a machine’s capacity to mimic human intelligence; Aharoni wanted to see whether the same test could apply to moral decision-making. The context for such a study is the growing relevance of AI in fields like law, where it is increasingly used for tasks such as drafting legal opinions.
The research team conducted the study using OpenAI’s GPT-4 model: undergraduate students and GPT-4 were asked the same set of ten moral and ethical questions. The answers were then graded on criteria such as virtuousness, intelligence, and trustworthiness by a group of adults who were not aware that a machine had generated some of them.
Results showed that the AI-generated responses consistently received higher ratings in most areas, including virtue, intelligence, and trustworthiness, and participants agreed with the AI responses more often than with the human ones. Notably, many participants correctly identified the AI-generated responses, apparently because they were seen as superior to the human answers.
However, the study also identified potential pitfalls for AI’s role in moral decision-making. The length of the responses, which the study did not control for, may have been a giveaway in identifying the AI-generated answers. Furthermore, the AI’s judgments may carry inherent biases stemming from the data used to train it, which could produce different judgments depending on the socio-cultural context.
Nevertheless, the results provide food for thought about the potential of AI in moral reasoning and decision-making. As AI continues to be adopted across fields, it is critical to understand how it functions and what risks it carries, and this study offers a useful starting point.
Participants in the study often perceived the AI responses as more rational and less emotional than the human ones. This belief in the objectivity of machines raises questions about AI’s true ‘moral compass,’ given the biases that can stem from training data and the fact that AI responses can vary greatly depending on how they are prompted.
Despite these concerns and the need for further research, the study suggests that AI can produce moral judgments compelling enough to pass a Turing-test-style scenario. That strong performance points to the potential of machines in areas requiring moral judgment, even with their inherent limitations and risks.