Recent developments in Artificial Intelligence (AI), particularly in generative AI, have demonstrated the ability of Large Language Models (LLMs) to generate human-like text in response to prompts. These models perform well on tasks such as answering questions and summarizing long documents. However, even when provided with reference materials, they can produce factual errors, which can have serious consequences in document-grounded question answering for sectors like banking or healthcare.
To address this issue, researchers have introduced GENAUDIT, a tool designed to fact-check responses produced by LLMs in document-grounded tasks. GENAUDIT works by suggesting edits to the LLM's output: it flags claims that are not supported by the reference document and recommends revisions or deletions where needed. It also presents evidence from the reference text to back up the factual claims the LLM does make.
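The announcement does not prescribe an API, but the core loop is straightforward to picture: check each claim in the output against the reference and return a verdict plus supporting evidence. The sketch below illustrates that loop in Python, standing in a simple lexical-overlap score for GENAUDIT's learned checker; every name here is hypothetical and this is not the tool's actual interface.

```python
# Illustrative sketch only: GENAUDIT uses trained models, not lexical overlap.
# All function and variable names here are hypothetical.

def sentence_overlap(claim: str, sentence: str) -> float:
    """Fraction of the claim's words that also appear in a reference sentence."""
    claim_words = set(claim.lower().split())
    sent_words = set(sentence.lower().split())
    return len(claim_words & sent_words) / max(len(claim_words), 1)

def check_claims(reference: str, output_sentences: list[str], threshold: float = 0.5):
    """Flag output sentences weakly grounded in the reference and attach evidence."""
    ref_sentences = [s.strip() for s in reference.split(".") if s.strip()]
    results = []
    for claim in output_sentences:
        scores = [(sentence_overlap(claim, s), s) for s in ref_sentences]
        best_score, best_evidence = max(scores)
        results.append({
            "claim": claim,
            # A real checker would use a trained entailment model here.
            "supported": best_score >= threshold,
            "evidence": best_evidence,
        })
    return results

reference_doc = "The drug was approved in 2019. Trials showed a 12% reduction in symptoms."
summary = ["The drug was approved in 2019", "The drug also improves sleep quality"]
for result in check_claims(reference_doc, summary):
    print(result)  # the second claim has no support in the reference and is flagged
```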
Building GENAUDIT involved training models for these specific tasks: detecting unsupported claims, proposing suitable edits, and extracting evidence from the reference text to support factual assertions. GENAUDIT also provides an interactive interface that lets users inspect, and then approve or reject, the suggested edits and the supporting evidence.
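The announcement does not specify the training data format, but a fact-checking model fine-tuned for these tasks is typically trained on (reference, claim) pairs labeled with a verdict, a revision, and evidence pointers. The schema below is an assumption for illustration, not GENAUDIT's actual format.

```python
# Hypothetical training example for a fact-checking backend model.
# Field names and the serialization format are assumptions, not GENAUDIT's schema.
example = {
    "reference": "The drug was approved in 2019. Trials showed a 12% reduction in symptoms.",
    "claim": "Trials showed a 40% reduction in symptoms.",
    "label": "unsupported",
    "revision": "Trials showed a 12% reduction in symptoms.",
    "evidence_sentence_ids": [1],  # 0-indexed sentence(s) in the reference
}

# One plausible way to serialize this into an input/target pair for fine-tuning:
model_input = f"Reference: {example['reference']}\nClaim: {example['claim']}"
model_target = (
    f"label: {example['label']} | revision: {example['revision']} "
    f"| evidence: {example['evidence_sentence_ids']}"
)
print(model_input)
print(model_target)
```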
GENAUDIT's performance was evaluated extensively by human annotators, who measured how accurately it identifies errors in LLM outputs on document summarization. The evaluation showed that GENAUDIT reliably detects errors in the outputs of eight different LLMs across multiple domains.
The team also proposed a method to improve GENAUDIT's error detection. The approach raises error recall, so that the system catches the majority of errors, while keeping the accompanying loss in precision small.
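The announcement does not detail the mechanism, but a common way to realize this kind of trade-off is to tune the checker's decision threshold: lower it until recall meets a target, then measure what that costs in precision. The sketch below illustrates the idea on toy data; it is not necessarily the authors' exact decoding-time technique.

```python
# Toy illustration of trading precision for error recall via threshold tuning.
# Scores are the checker's probability that a claim is erroneous; labels are ground truth.
scores = [0.92, 0.85, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1,    1,    1,    0,    1,    0,    0,    0]  # 1 = actual error

def precision_recall(threshold: float):
    predicted = [s >= threshold for s in scores]
    tp = sum(p and l for p, l in zip(predicted, labels))
    fp = sum(p and not l for p, l in zip(predicted, labels))
    fn = sum((not p) and l for p, l in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

# Lower the threshold until recall reaches the target, accepting some precision loss.
target_recall = 0.95
for threshold in sorted(set(scores), reverse=True):
    p, r = precision_recall(threshold)
    if r >= target_recall:
        print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
        break
```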
The team's primary contributions are: the introduction of GENAUDIT as a reliable tool for fact-checking LLM outputs; the evaluation and release of fine-tuned LLMs that serve as its backend fact-checking models; an evaluation of GENAUDIT's effectiveness at catching errors in summaries generated by different LLMs; and a decoding-time technique that balances overall accuracy against improved error detection.
In conclusion, GENAUDIT is a promising tool that can significantly improve fact-checking in document-grounded tasks, thereby making LLM-generated information more reliable in critical applications.
According to the announcement, GENAUDIT is available for installation from PyPI, and its code, a tutorial, and sample outputs are available on GitHub. The project marks a notable step forward in machine learning, promoting accuracy and reliability in fact-checking and information generation.