Large Language Models (LLMs) are the latest development in artificial intelligence and have gained significant popularity due to their ability to answer questions, complete code, and summarize long texts, drawing heavily on Natural Language Processing (NLP) and Natural Language Generation (NLG). However, these models sometimes produce content that is false or misleading, commonly referred to as “hallucinations”.
To address this issue, recent research focuses on the automatic detection of these hallucinations. The research team proposed a detailed taxonomy of six distinct types of hallucination and developed automated systems for detecting and editing them. Unlike previous systems, which often reduced factual errors to a binary true/false judgment, this approach aims to identify a wider range of hallucinations, including entity-level contradictions and the fabrication of nonexistent entities.
The team’s goals are to accurately detect hallucinated spans, distinguish between types of errors, and suggest corrections. To overcome these challenges, they created a taxonomy that separates factual errors into six kinds and introduced a new task, benchmark, and model.
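To make span-level, typed annotation concrete, here is a minimal Python sketch of one way such a taxonomy and its annotations could be represented. The category names, fields, and example are illustrative assumptions based on the error types mentioned above, not the paper’s actual schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class HallucinationType(Enum):
    # Assumed category names, loosely following the error types mentioned above;
    # not necessarily the paper's exact labels.
    ENTITY = "entity"                # a wrong entity contradicted by evidence
    RELATION = "relation"            # a wrong relation between otherwise correct entities
    CONTRADICTORY = "contradictory"  # a whole statement contradicted by evidence
    INVENTED = "invented"            # a fabricated, nonexistent entity or concept
    SUBJECTIVE = "subjective"        # an opinion presented as fact
    UNVERIFIABLE = "unverifiable"    # a claim no available evidence can confirm or refute


@dataclass
class HallucinationSpan:
    """A typed error span inside a model response, with an optional suggested edit."""
    start: int                            # character offset where the span begins
    end: int                              # character offset where the span ends (exclusive)
    error_type: HallucinationType
    suggested_edit: Optional[str] = None  # replacement text, or None to simply delete the span


# Toy example: annotate two hallucinated spans in a single response.
response = "Marie Curie was born in 1870 and invented the telephone."
year = response.index("1870")
claim = response.index("invented the telephone")
annotations = [
    HallucinationSpan(year, year + len("1870"), HallucinationType.ENTITY, suggested_edit="1867"),
    HallucinationSpan(claim, claim + len("invented the telephone"), HallucinationType.CONTRADICTORY),
]
```

Representing errors as typed spans rather than a single true/false label is what lets a detector report which part of a response is wrong and what kind of error it is.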
Their new benchmark, which collects human judgments on outputs from two language models, ChatGPT and Llama2-Chat 70B, across multiple domains, revealed that 60% and 75% of these models’ outputs, respectively, contained hallucinations. They found an average of 1.9 and 3.4 hallucinations per response and discovered that a significant proportion fell into previously under-examined categories such as fabricated concepts or unverifiable statements.
To address these issues, they trained FAVA, a retrieval-augmented language model, to identify and correct these hallucinations. Both automated and human evaluations on the benchmark indicated that FAVA outperformed ChatGPT at detecting fine-grained hallucinations, and FAVA’s suggested edits improved the factuality of the generated text, leading to a 5-10% FActScore improvement.
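As a rough illustration of how a retrieval-augmented detect-and-edit pass and a FActScore-style factuality check fit together, here is a minimal Python sketch. The function names, the tag format, and the scoring logic are assumptions made for illustration; they are not FAVA’s or FActScore’s actual implementation. In the paper’s setting the editor is a trained language model and the retriever fetches relevant documents; here both are left abstract.

```python
import re
from typing import Callable, List

# Hypothetical building blocks; these callables stand in for components that a
# real system would implement with a retriever and a language model.
Retriever = Callable[[str], List[str]]          # query text -> evidence passages
Editor = Callable[[str, List[str]], str]        # (response, evidence) -> response with tagged edits
FactChecker = Callable[[str, List[str]], bool]  # (atomic fact, evidence) -> is it supported?


def detect_and_edit(response: str, retrieve: Retriever, edit: Editor) -> str:
    """Sketch of a retrieval-augmented pass: fetch evidence, then have an editor
    model mark hallucinated spans and propose grounded replacements."""
    evidence = retrieve(response)
    tagged = edit(response, evidence)
    # Assumed tag format: text to remove wrapped in <delete>...</delete>,
    # suggested insertions wrapped in <mark>...</mark>.
    cleaned = re.sub(r"<delete>.*?</delete>", "", tagged, flags=re.DOTALL)
    cleaned = re.sub(r"</?mark>", "", cleaned)
    return cleaned


def factuality_score(facts: List[str], evidence: List[str], supported: FactChecker) -> float:
    """FActScore-style metric: the fraction of atomic facts supported by evidence."""
    if not facts:
        return 0.0
    return sum(supported(fact, evidence) for fact in facts) / len(facts)
```

Under this framing, the reported 5-10% gain corresponds to the edited responses having a higher fraction of evidence-supported atomic facts than the originals.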
This study highlights the need for continued progress in automatic hallucination detection for language-model-generated text. The post closes by encouraging interested readers to read the full research paper, follow the team on social media, and join their community for further updates.