Researchers from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) have designed a new type of game to improve how artificial intelligence (AI) systems understand and produce text. This “consensus game” brings together two parts of an AI system: the part that generates sentences and the part that evaluates them. The approach significantly improved the models’ ability to answer questions accurately.

Large language models are traditionally queried in one of two ways: generating an answer directly from the model, or using the model to score predefined answers. The two modes often produce divergent results. To reconcile them, the researchers developed an innovative, training-free, game-theoretic method. The process, likened to a game of clues and signals, uses a decoding algorithm called “equilibrium ranking.”
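To make the divergence concrete, here is a minimal sketch (not the authors’ code; the candidates and probabilities are invented) showing how the two querying modes can disagree over the same set of candidate answers:

```python
import numpy as np

# Toy illustration: two standard ways to query a language model about
# the same question can pick different winners. All numbers are made up.
candidates = ["Paris", "Lyon", "Marseille"]

# Generative query: log P(answer | question), i.e. sample or score
# answers directly from the model.
gen_logp = np.log(np.array([0.48, 0.42, 0.10]))

# Discriminative query: log P(correct | question, answer), e.g. prompt
# the same model to judge each predefined candidate.
disc_logp = np.log(np.array([0.35, 0.55, 0.10]))

print("generative pick:    ", candidates[int(np.argmax(gen_logp))])   # Paris
print("discriminative pick:", candidates[int(np.argmax(disc_logp))])  # Lyon
```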

When the algorithm was tested on tasks like reading comprehension, commonsense reasoning, and mathematical problem-solving, model performance improved significantly. Applied to the LLaMA-7B model, the method even surpassed results from larger models.

The consensus game was partly inspired by “Diplomacy,” a strategic board game in which players negotiate alliances, betray friends, and conquer territories without rolling dice, relying instead on skill, strategy, and language.

The consensus game drives the system toward an equilibrium that balances accuracy and fidelity. The generative and discriminative components iteratively adjust their predictions until they agree on a plausible answer that stays consistent with their initial beliefs, bridging the gap between the two querying methods.
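The sketch below illustrates that iteration under strong simplifying assumptions: a fixed candidate set, soft best-response updates with a pull back toward each player’s initial beliefs, and made-up numbers. It is a toy stand-in for, not a reproduction of, the paper’s equilibrium-ranking dynamics:

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def consensus_rank(gen0, disc0, steps=500, lam=0.1):
    """Toy equilibrium-style consensus over a fixed candidate set.

    gen0:  (2, n) initial generator distribution over n candidates,
           one row per intended correctness v in {0: incorrect, 1: correct}.
    disc0: (n, 2) initial discriminator distribution over v per candidate.
    lam:   strength of the pull back toward the initial beliefs.
    Returns one consensus score per candidate (higher is better).
    """
    log_g0, log_d0 = np.log(gen0), np.log(disc0)
    g, d = gen0.copy(), disc0.copy()
    q_g = np.zeros_like(log_g0)  # generator's running-average payoffs
    q_d = np.zeros_like(log_d0)  # discriminator's running-average payoffs
    for t in range(1, steps + 1):
        # Each player's payoff is the log-chance the other player agrees.
        q_g += (np.log(d.T) - q_g) / t   # generator observes log d(v | y)
        q_d += (np.log(g.T) - q_d) / t   # discriminator observes log g(y | v)
        # Soft best response, regularized toward the initial policies so the
        # equilibrium cannot drift arbitrarily far from the model's beliefs.
        g = softmax((q_g + lam * log_g0) / (1.0 + lam), axis=1)
        d = softmax((q_d + lam * log_d0) / (1.0 + lam), axis=1)
    # Rank by joint agreement that a candidate is correct (v = 1).
    return g[1] * d[:, 1]

candidates = ["Paris", "Lyon", "Marseille"]
gen0 = np.array([[0.10, 0.30, 0.60],   # v = 0: model asked for a wrong answer
                 [0.48, 0.42, 0.10]])  # v = 1: model asked for the right answer
disc0 = np.array([[0.65, 0.35],        # P(v | "Paris")
                  [0.45, 0.55],        # P(v | "Lyon")
                  [0.90, 0.10]])       # P(v | "Marseille")
print(candidates[int(np.argmax(consensus_rank(gen0, disc0)))])
```

Ranking by the product of the two equilibrium policies captures the intuition that the best answer is the one both components endorse after negotiating.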

Implementing this approach, especially for question-answering tasks, presents real computational challenges: for every question, the generative and discriminative parts must be run to consensus over the full set of candidate responses.

An intriguing result was the system’s struggle with math word problems: it could not generate plausible incorrect answers, a critical ingredient in the process of identifying the right one. The researchers hope to develop methods that yield more factual and consistent answers across a variety of tasks and significantly improve the performance of the base model.

The findings received a best-paper award at the NeurIPS R0-FoMo Workshop in December 2023, recognition of the promise seen in the research. Athul Paul Jacob, an MIT PhD student, developed and tested the model with collaborators Yikang Shen, Gabriele Farina, and Jacob Andreas. The team aims to integrate the method into AI training to achieve more reliable AI systems in the future.
