The accuracy of Large Language Models (LLMs) such as OpenAI’s GPT (Generative Pre-trained Transformer) is vital, particularly when producing content that must be factually correct, such as educational material or news reports. Despite their capabilities, however, LLMs often generate plausible but incorrect information, a phenomenon known as “hallucination.”
Google AI researchers have proposed a solution to this problem with AGREE (Adaptation for GRounding EnhancEment), a learning-based framework that enables LLMs to ground their responses in retrieved evidence, a process known as grounding. This improves the accuracy of responses and allows LLMs to provide accurate citations.
Existing methods for controlling hallucination in LLMs rely primarily on post-hoc citing and prompting-based grounding. Post-hoc citing adds citations after the responses have been generated, often using Natural Language Inference (NLI) models. However, this approach depends heavily on the knowledge already encoded in the LLM and struggles with facts outside its training data. Prompting-based grounding leverages LLMs’ instruction-following and in-context learning abilities, but it is often ineffective, especially in real-world settings that demand high accuracy.
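To make the post-hoc idea concrete, here is a minimal sketch of citing after generation: each claim in an already-produced answer is checked against retrieved passages and a citation is attached to the best-supporting one. The helper `nli_entailment_score` is a hypothetical stand-in for a real NLI model; the lexical-overlap placeholder exists only to keep the example runnable and is not how the paper or production systems score entailment.

```python
def nli_entailment_score(premise: str, hypothesis: str) -> float:
    # Placeholder: a real system would query an NLI model here.
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = set(hypothesis.lower().split())
    return len(premise_tokens & hypothesis_tokens) / max(len(hypothesis_tokens), 1)

def add_posthoc_citations(claims: list[str], passages: list[str],
                          threshold: float = 0.5) -> list[str]:
    cited = []
    for claim in claims:
        # Pick the passage that best supports the claim, if any.
        scores = [nli_entailment_score(p, claim) for p in passages]
        best = max(range(len(passages)), key=lambda i: scores[i])
        if scores[best] >= threshold:
            cited.append(f"{claim} [{best + 1}]")   # cite the supporting passage
        else:
            cited.append(claim)                     # no support found: left uncited
    return cited

claims = ["The Eiffel Tower is in Paris."]
passages = ["The Eiffel Tower is a landmark in Paris, France."]
print(add_posthoc_citations(claims, passages))
```

Because the citations are bolted on after the fact, nothing in this loop can change what the model originally claimed, which is the limitation AGREE targets.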
AGREE addresses these issues by integrating learning-based adaptation with Test-Time Adaptation (TTA). During training, the model is fine-tuned on synthetic data built from unlabeled questions, teaching the LLM to self-ground its claims by adding citations to its responses. At test time, AGREE uses an iterative inference strategy in which the LLM seeks additional information based on its self-generated citations, allowing it to refine its responses.
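The test-time loop can be sketched roughly as follows. The helper names (`retrieve`, `generate_grounded_answer`, `find_unsupported_claims`) are assumptions for illustration, not interfaces from the AGREE paper; the point is the cycle of answering, checking which claims lack cited support, fetching more evidence for those claims, and answering again.

```python
def iterative_grounded_inference(question, retrieve, generate_grounded_answer,
                                 find_unsupported_claims, max_rounds=3):
    """Hypothetical sketch of iterative, citation-driven test-time refinement."""
    passages = retrieve(question)
    answer = generate_grounded_answer(question, passages)
    for _ in range(max_rounds):
        unsupported = find_unsupported_claims(answer, passages)
        if not unsupported:
            break  # every claim is backed by a cited passage
        # Use the ungrounded claims themselves as queries for further evidence.
        for claim in unsupported:
            passages.extend(retrieve(claim))
        answer = generate_grounded_answer(question, passages)
    return answer, passages
```

The design choice here is that the model's own citations (or the lack of them) drive what gets retrieved next, rather than retrieving once up front.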
At the training stage, the Google AI researchers collect synthetic data from unlabeled queries, retrieve relevant passages from reliable sources, and fine-tune a base LLM to ground its claims in them. An NLI model is used during fine-tuning to evaluate each claim and add citations where warranted. Experiments show that AGREE improves both grounding and citation precision compared with baseline methods, achieving relative improvements of over 30% in grounding quality.
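A rough sketch of that data-construction pipeline is below. The helpers `retrieve`, `base_llm_answer`, and `label_citations_with_nli` are assumed names, not the paper's actual components; each dictionary produced would serve as one fine-tuning example teaching the LLM to emit citation-annotated answers.

```python
def build_synthetic_dataset(unlabeled_queries, retrieve, base_llm_answer,
                            label_citations_with_nli):
    """Hypothetical sketch: turn unlabeled queries into grounding supervision."""
    dataset = []
    for query in unlabeled_queries:
        passages = retrieve(query)                    # evidence from reliable sources
        draft = base_llm_answer(query, passages)      # base LLM's initial answer
        cited = label_citations_with_nli(draft, passages)  # NLI marks supported claims
        dataset.append({"query": query,
                        "passages": passages,
                        "target": cited})             # supervision: answer with citations
    return dataset
```

In this framing, the NLI model supplies the citation labels automatically, so no human annotation of grounding is needed to create the tuning set.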
AGREE also proves effective on out-of-domain data, showing that it generalizes across different types of queries and demonstrating its robustness. The use of TTA improves both answer correctness and grounding.
In essence, AGREE mitigates hallucination in LLMs by improving their verifiability. It enables LLMs to ground their responses and provide accurate citations, making them more reliable, especially in fields requiring a high degree of factual accuracy. By combining learning-based adaptation with TTA, AGREE appears to outperform existing approaches and to generalize across dataset types. Its potential to yield more reliable language models for real-world applications that demand high factual accuracy is therefore promising.