Large Language Models (LLMs) have transformed the landscape of Artificial Intelligence, yet their potential in mathematical reasoning remains underexplored. A group of researchers from the University of Hong Kong and Microsoft has proposed an approach named ‘CoT-Influx’ to bridge this gap, aiming to enhance the mathematical reasoning capabilities of smaller LLMs such as LLaMA.
Despite advances in the field, LLMs still struggle with Chain-of-Thought (CoT) prompting, which is crucial for math reasoning. Recent work has made significant progress with CoT-based training data, but existing prompt retrieval methods suffer from persistent issues such as token redundancy, and they yield sub-optimal math reasoning performance.
CoT-Influx presents a potential solution to these challenges. Using a ‘coarse-to-fine’ pruning mechanism, the method first selects the most useful CoT examples and then trims unimportant tokens within them, so that more effective examples fit inside the model’s fixed context window. In effect, the approach makes room for additional useful CoT input without adding computational overhead.
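To make the coarse-to-fine idea concrete, here is a minimal, hypothetical Python sketch of what such a two-stage pruner might look like. The scoring functions (`score_shot`, `score_token`) and the whitespace tokenization are invented placeholders standing in for the learned pruner described in the paper, not the authors’ implementation.

```python
from typing import List


def score_shot(example: str) -> float:
    """Placeholder for a learned shot-level usefulness score."""
    words = example.split()
    return len(set(words)) / max(len(words), 1)


def score_token(token: str) -> float:
    """Placeholder for a learned token-level importance score."""
    return 0.0 if token.lower() in {"the", "a", "an", "so", "then"} else 1.0


def coarse_to_fine_prune(shots: List[str], token_budget: int) -> List[str]:
    """Two-stage pruning: rank whole CoT shots, then drop low-value tokens."""
    # Coarse stage: order candidate CoT examples by estimated usefulness.
    ranked = sorted(shots, key=score_shot, reverse=True)
    pruned, used = [], 0
    for shot in ranked:
        # Fine stage: keep only tokens the scorer marks as important.
        tokens = [t for t in shot.split() if score_token(t) > 0.5]
        if used + len(tokens) > token_budget:
            break  # stop once the context-window budget is exhausted
        pruned.append(" ".join(tokens))
        used += len(tokens)
    return pruned
```

Because each surviving shot is shorter after token-level pruning, more shots fit under the same budget, which is the source of the “more room for useful CoT examples” claim.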
The method is trained on MRD3, a purpose-built math reasoning dataset consisting of problems spanning a wide range of difficulty levels. Leveraging this dataset, the researchers trained a pruner to precisely select useful CoT examples and trim irrelevant tokens while respecting the original context constraints.
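Continuing the sketch above, the pruned examples would then be packed into a few-shot prompt that stays within the context limit. Everything here is illustrative rather than from the paper: the example shots, the 256-token answer reserve, and the `build_prompt` helper are all assumptions.

```python
# Hypothetical usage of coarse_to_fine_prune from the sketch above.
EXAMPLE_SHOTS = [
    "Q: Tom has 3 apples and buys 2 more. A: 3 + 2 = 5. The answer is 5.",
    "Q: A train covers 60 miles in 1.5 hours. A: 60 / 1.5 = 40 mph. The answer is 40.",
]


def build_prompt(question: str, token_budget: int = 2048) -> str:
    # Reserve part of the budget for the question and the model's answer.
    shots = coarse_to_fine_prune(EXAMPLE_SHOTS, token_budget - 256)
    return "\n\n".join(shots + [f"Q: {question}\nA:"])


print(build_prompt("Sara reads 12 pages a day. How many pages does she read in a week?"))
```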
Applied to several LLaMA models across five math datasets, CoT-Influx delivered notable accuracy gains. Most strikingly, LLaMA2-70B with CoT-Influx outperformed GPT-3.5 and larger models on the GSM8K dataset by an impressive 2.5%, and it also achieved the best performance on other datasets such as AddSub and MultiArith.
In essence, CoT-Influx offers a promising way to enhance the mathematical reasoning capabilities of LLMs. With an efficient pruning mechanism and better use of the available token budget, models can significantly improve their accuracy on challenging math datasets. This development opens new avenues for applying LLMs to intricate mathematical problems and has promising implications for future research in AI reasoning and learning efficiency.
Read more about the research in the original paper.