This paper examines uncertainty quantification in large language models (LLMs), aiming to identify queries for which the uncertainty in a model's responses is large. The study considers both epistemic and aleatoric uncertainty: epistemic uncertainty arises from inadequate knowledge or data about the ground truth, while aleatoric uncertainty stems from irreducible randomness in the prediction problem, such as a question that admits several valid answers. Distinguishing the two is key to improving the accuracy and dependability of LLM responses, particularly for detecting and reducing hallucinations, the plausible-sounding but incorrect responses these models sometimes generate.
Several existing methods aim to identify hallucinations in LLMs, each with its own limitations. One common approach scores the probability of the greedy response, i.e., the likelihood the model assigns to its most probable answer. Another is semantic entropy, which measures the entropy of the distribution over semantically distinct responses. A third, self-verification, has the model assess its own answers to estimate uncertainty. A rough sketch of the first two baselines is given below.
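As an illustration only, the sketch below scores the greedy response by its total log-probability and computes semantic entropy over clusters of sampled answers. The inputs `log_probs`, `responses`, `probs`, and the helper `same_meaning` are assumptions standing in for the model's token log-probabilities, a set of sampled answers with their probabilities, and a semantic-equivalence check (e.g., an NLI-based comparison); this is not the papers' exact formulation of either baseline.

```python
# Hedged sketch of two common uncertainty baselines for LLM answers.
import math

def greedy_response_probability(log_probs):
    """Confidence score: probability the model assigns to its greedy (top) response.

    `log_probs` is assumed to be the per-token log-probabilities of the greedy decode.
    """
    return math.exp(sum(log_probs))

def semantic_entropy(responses, probs, same_meaning):
    """Entropy over semantic clusters of sampled responses.

    `responses` are sampled answers, `probs` their probabilities, and
    `same_meaning(a, b)` decides whether two answers are semantically equivalent.
    """
    cluster_probs = []   # accumulated probability mass per cluster
    reps = []            # one representative answer per cluster
    for r, p in zip(responses, probs):
        for i, rep in enumerate(reps):
            if same_meaning(r, rep):
                cluster_probs[i] += p
                break
        else:
            reps.append(r)
            cluster_probs.append(p)
    total = sum(cluster_probs)
    return -sum((c / total) * math.log(c / total) for c in cluster_probs if c > 0)
```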
Despite their utility, these methods have significant drawbacks. The probability of the greedy response is sensitive to the size of the label set and performs poorly when a query admits many possible answers. Semantic entropy and self-verification, for their part, do not account for the full distribution of responses, so they struggle to separate epistemic from aleatoric uncertainty and yield only limited uncertainty assessments.
To address these limitations, the paper proposes a different strategy: constructing a joint distribution over multiple responses to a single query via iterative prompting. The LLM first answers the query, then answers it again with its earlier responses included in the prompt, and so on. From this joint distribution the authors derive an information-theoretic measure of epistemic uncertainty: the mutual information between the responses, a quantity that is unaffected by aleatoric uncertainty. They also propose a finite-sample estimator of this mutual information and show that, despite the potentially infinite support of LLM outputs, its estimation error is negligible in practice.
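A minimal sketch of how the iterative-prompting construction and a plug-in mutual-information score could look in code, under stated assumptions: `ask_llm` is a hypothetical wrapper around an LLM call, and the prompt template, number of chains, and chain length are illustrative choices. The paper's actual estimator and prompting format may differ.

```python
# Hedged sketch: sample chains of answers by feeding earlier answers back into the
# prompt, then estimate a mutual-information-style score from the resulting pairs.
from collections import Counter
import math

def iterative_samples(ask_llm, query, num_chains=20, chain_length=2):
    """Sample chains of responses, including prior answers in each follow-up prompt."""
    chains = []
    for _ in range(num_chains):
        prior, chain = [], []
        for _ in range(chain_length):
            prompt = query
            if prior:
                prompt += "\nPreviously given answers: " + "; ".join(prior)
            answer = ask_llm(prompt)
            chain.append(answer)
            prior.append(answer)
        chains.append(tuple(chain))
    return chains

def mutual_information_score(chains):
    """Plug-in estimate of I(Y1; Y2) from sampled (first answer, second answer) pairs.

    A large value means the second answer depends strongly on what the model was told
    it answered before -- a signature of epistemic uncertainty; purely aleatoric
    randomness leaves the answers (approximately) independent.
    """
    pairs = Counter((c[0], c[1]) for c in chains)
    first = Counter(c[0] for c in chains)
    second = Counter(c[1] for c in chains)
    n = sum(pairs.values())
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        p_a, p_b = first[a] / n, second[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi
```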
The paper also presents a hallucination-detection algorithm built on this mutual-information score. With a threshold chosen through a calibration procedure, the method outperforms conventional entropy-based approaches, especially on datasets that mix single-label and multi-label queries. It maintains high recall while reducing errors, thereby improving the reliability of LLM outputs.
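One simple way such a threshold could be set and applied is sketched below: calibrate on a small labeled set by picking the largest threshold that still achieves a target recall on known hallucinations, then flag any response whose mutual-information score exceeds it. The function names and this specific calibration rule are assumptions for illustration, not the paper's exact procedure.

```python
# Hedged sketch of threshold calibration and hallucination flagging.
def calibrate_threshold(scores, is_hallucination, target_recall=0.9):
    """Return the largest score threshold whose recall on labeled hallucinations
    meets `target_recall`.  `scores` are MI scores on a calibration set;
    `is_hallucination` are the corresponding 0/1 labels."""
    candidates = sorted(set(scores), reverse=True)
    total_pos = sum(is_hallucination)
    best = candidates[-1]
    for t in candidates:  # scan from strictest threshold to loosest
        flagged = [h for s, h in zip(scores, is_hallucination) if s >= t]
        recall = sum(flagged) / max(total_pos, 1)
        if recall >= target_recall:
            best = t
            break
    return best

def flag_hallucination(mi_score, threshold):
    """Flag (or abstain on) a response whose epistemic-uncertainty score is high."""
    return mi_score >= threshold
```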
The study advances the understanding of uncertainty in LLMs by cleanly distinguishing epistemic from aleatoric uncertainty. The iterative-prompting scheme and the mutual-information-based metric offer a more complete view of model confidence, improving hallucination detection and the overall accuracy of responses. The approach addresses a significant limitation of existing techniques and provides a practical, effective tool for real-world applications of LLMs.