Large Language Models (LLMs) such as ChatGPT and Gemini can answer complex queries, but they often produce false or unsupported information, a failure commonly known as “hallucination”. This undermines their reliability, with potentially serious repercussions in critical fields such as law and medicine. A specific subset of hallucinations, known as “confabulations”, consists of arbitrary and incorrect answers: the model responds differently each time it is asked the same question.
To mitigate confabulations, researchers from the OATML group at the University of Oxford have proposed a statistical approach built on entropy-based uncertainty estimators. Unlike previous solutions, the method factors in the meaning of responses, not just their exact wording. It does so by measuring “semantic entropy”, the degree of uncertainty over the meaning of generated answers. The result is an overall improvement in the reliability of LLMs: users are alerted when answers are likely to be unreliable, which gives them more confidence in the outputs they do rely on.
The method works by sampling several answers to the same question, clustering them by meaning, and then measuring the entropy across these clusters; high entropy signals a likely confabulated response. The technique has been applied across multiple domains, including general knowledge, trivia, and healthcare queries, and has demonstrated substantial improvements in detecting unreliable answers. It can also improve the accuracy of LLM outputs by declining to answer questions that would trigger high-entropy (likely confabulated) replies.
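To make the clustering and entropy step concrete, here is a minimal Python sketch of the discrete variant of the idea. Everything in it is illustrative: the `entails` stand-in, the sample answers, and all function names are assumptions, not the authors’ code. The paper itself judges bidirectional entailment with an NLI model and can also weight clusters by the model’s own sequence probabilities rather than by sample counts.

```python
import math
from collections import Counter


def entails(a: str, b: str) -> bool:
    """Stand-in for an entailment check between two answers.

    The paper uses an NLI model (e.g., DeBERTa fine-tuned on MNLI) here;
    normalized string equality is used only so the sketch runs end to end.
    """
    return a.strip().lower() == b.strip().lower()


def semantic_clusters(answers: list[str]) -> list[int]:
    """Assign each sampled answer to a meaning cluster.

    Two answers share a cluster when each entails the other (bidirectional
    entailment), i.e., they say the same thing in different words.
    """
    cluster_ids: list[int] = []
    representatives: list[str] = []
    for ans in answers:
        for cid, rep in enumerate(representatives):
            if entails(ans, rep) and entails(rep, ans):
                cluster_ids.append(cid)
                break
        else:
            representatives.append(ans)
            cluster_ids.append(len(representatives) - 1)
    return cluster_ids


def discrete_semantic_entropy(answers: list[str]) -> float:
    """Estimate semantic entropy from a batch of sampled answers.

    Cluster probabilities are approximated by the fraction of samples in
    each cluster; the entropy of that distribution is the score.
    """
    counts = Counter(semantic_clusters(answers))
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Example: five sampled answers to "Where is the Eiffel Tower?"
samples = ["Paris", "It's in Paris.", "paris", "Rome", "Berlin"]
print(discrete_semantic_entropy(samples))  # higher value -> more likely confabulation
```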
The use of “semantic entropy” is an innovation in the realm of LLM reliability. The metric is not focused merely on lexical differences, as traditional entropy measures are. Instead, it evaluates the variation in meaning across the different responses a model generates, providing a more robust mechanism for detecting false or misleading answers. In the authors’ evaluations it also outperforms other methods, such as naive entropy and supervised embedding regression.
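In rough terms (the notation below is mine, not the paper’s), semantic entropy replaces the distribution over exact output strings with a distribution over meaning clusters:

\[
\mathrm{SE}(x) \;=\; -\sum_{c} p(c \mid x)\,\log p(c \mid x),
\qquad
p(c \mid x) \;=\; \sum_{s \in c} p(s \mid x),
\]

where each cluster \(c\) groups generated sequences \(s\) that mutually entail one another given the question \(x\). Naive entropy is computed over the sequences \(s\) directly, so harmless paraphrases of the same answer inflate it; semantic entropy only rises when the sampled answers genuinely disagree in meaning.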
The study also shows that semantic entropy can be applied to longer generations, such as biographical paragraphs. By breaking these into individual factual claims and assessing the consistency of re-sampled answers about each claim, confabulations can be detected effectively (see the sketch below). This implies that LLMs often do recognize their own knowledge gaps, but traditional methods fail to surface that signal; semantic entropy therefore offers a more reliable way to evaluate and manage LLM outputs, especially in complex, open-ended tasks.
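As a rough sketch of how that decomposition might be wired together, the snippet below reuses `discrete_semantic_entropy` from the earlier example. Both `extract_claims` and `sample_answers_about` are hypothetical stand-ins for LLM calls, assumed here purely for illustration and not part of the paper or its code release.

```python
def extract_claims(paragraph: str) -> list[str]:
    """Hypothetical helper: split a generated biography into factual claims.

    In the paper this decomposition is done by prompting an LLM; splitting on
    sentence boundaries is only a stand-in that keeps the sketch runnable.
    """
    return [s.strip() for s in paragraph.split(".") if s.strip()]


def sample_answers_about(claim: str, n: int = 5) -> list[str]:
    """Hypothetical helper: re-ask the model n times about a single claim.

    A real implementation would pose a question whose answer is the claim and
    sample the LLM at temperature > 0; this stub just echoes the claim.
    """
    return [claim] * n


def paragraph_confabulation_scores(paragraph: str) -> list[float]:
    """Score each claim by the semantic entropy of its re-sampled answers.

    Claims whose answers scatter across many meaning clusters (high entropy)
    are flagged as likely confabulations.
    """
    return [
        discrete_semantic_entropy(sample_answers_about(claim))
        for claim in extract_claims(paragraph)
    ]
```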
This new approach to the confabulation problem is laid out in a published research paper, and the accompanying code has been shared on GitHub. The researchers deserve credit for a significant advance in improving the reliability and performance of LLMs.