Large Language Models (LLMs) can produce strong answers and even acknowledge their mistakes. However, they often fall back on coarse, unreliable confidence estimates for questions they have not seen before, so it is crucial to develop ways to elicit reliable confidence estimates from them. Traditionally, both training-based and prompting-based approaches have been used, but these often yield confidence estimates that are coarse-grained or poorly calibrated.
Recently, a group of researchers from Purdue University, the University of Illinois Urbana-Champaign, the University of Southern California, and The Hong Kong University of Science and Technology introduced a novel framework called SaySelf, designed to improve the precision and reliability of the confidence estimates produced by LLMs.
Unlike traditional methods, SaySelf enables LLMs to produce self-reflective rationales that explain their confidence estimates and point out where the model lacks knowledge. The researchers use an off-the-shelf LLM such as GPT-4 to build a tailored dataset on which the model is then fine-tuned. For each query, they sample multiple reasoning chains, representing the LLM's step-by-step thought process, group these chains into clusters based on semantic similarity, and retain one representative example from each cluster (a sketch of this clustering step follows).
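To make the clustering step concrete, here is a minimal sketch, assuming sentence embeddings and agglomerative clustering as the similarity mechanism. The embedding model name, the distance threshold, and the example chains are illustrative assumptions, not details taken from the SaySelf paper.

```python
# Sketch: group sampled reasoning chains by semantic similarity, keep one per cluster.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering


def select_representative_chains(chains, distance_threshold=0.3):
    """Cluster reasoning chains by cosine similarity and keep one representative each."""
    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
    embeddings = encoder.encode(chains, normalize_embeddings=True)

    # Threshold-based agglomerative clustering (no fixed number of clusters).
    clustering = AgglomerativeClustering(
        n_clusters=None,
        distance_threshold=distance_threshold,
        metric="cosine",
        linkage="average",
    ).fit(embeddings)

    # Retain the first chain encountered in each cluster as its representative.
    representatives = {}
    for chain, label in zip(chains, clustering.labels_):
        representatives.setdefault(label, chain)
    return list(representatives.values())


# Example usage: several reasoning chains sampled from the LLM for a single query.
chains = [
    "The Eiffel Tower is in Paris, so the answer is France.",
    "Paris hosts the Eiffel Tower; therefore the country is France.",
    "I believe the tower is in Berlin, so the answer is Germany.",
]
print(select_representative_chains(chains))
```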
Then, from a first-person perspective, GPT-4 examines the representative chains drawn from each cluster and summarizes, in plain natural language, where the model is uncertain about specific knowledge. To keep confidence estimates accurate, the researchers calibrate the LLM's confidence estimate for each response using reinforcement learning, with a reward that favors accurate, high-confidence predictions and penalizes overconfidence in incorrect answers (a sketch of this reward idea follows). The researchers tested SaySelf on several knowledge-intensive question-answering tasks, including complex medical diagnoses and legal case analyses.
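The calibration reward can be illustrated with a minimal sketch: reward correct answers in proportion to the stated confidence, and penalize confident mistakes. The exact shape and scaling of SaySelf's reward are defined in the paper; the function below is only an assumed, simplified stand-in.

```python
# Sketch of a calibration-style reward: confident correct answers earn the most,
# confident wrong answers are penalized hardest. Confidence is assumed to be in [0, 1].
def calibration_reward(is_correct: bool, confidence: float) -> float:
    if is_correct:
        return confidence   # higher confidence on a correct answer -> larger reward
    return -confidence      # higher confidence on a wrong answer -> larger penalty


# Example: a confident correct answer vs. an equally confident mistake.
print(calibration_reward(True, 0.9))   # prints  0.9
print(calibration_reward(False, 0.9))  # prints -0.9
```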
The results of the study showed that SaySelf maintains task performance while significantly reducing confidence calibration error. The self-reflective rationales further improved calibration and successfully captured the model's internal uncertainty, which could have a significant impact on both scholarly research and practical applications. For instance, LLM alignment can benefit greatly from transparent confidence statements accompanied by explanations, and interaction quality can be improved by using these self-reflective rationales to trigger follow-up actions such as calling external tools or asking clarifying questions.
Beyond the SaySelf training procedure itself, the researchers anticipate further progress in training methods, such as active learning algorithms that improve LLMs' learning outcomes through their interactions with users.