The development of large language models (LLMs) has significantly expanded the field of computational linguistics, moving beyond traditional natural language processing to include a wide variety of general tasks. These models have the potential to revolutionize numerous industries by automating and improving tasks that were once thought to be exclusive to humans. However, one significant…
Progress in large language models (LLMs) has been remarkable, with prompting strategies like Chain-of-Thought and Tree-of-Thoughts augmenting their reasoning capabilities. These advances make complex behaviors more accessible through instruction prompting. Reinforcement Learning from Human Feedback (RLHF) also aligns LLM behavior more closely with human preferences, which further underscores this progress.
In…
The growing use of large language models (LLMs) such as GPT-3, OPT, and BLOOM in digital interfaces has highlighted the need to optimize the infrastructure they run on. LLMs are known for their enormous size and the considerable computational resources they require, which makes them difficult to deploy and manage efficiently.
Researchers from various institutions, including Microsoft Research and…
Large Language Models (LLMs) are increasingly used for tasks related to Natural Language Processing (NLP) and Natural Language Generation (NLG). However, how well LLMs understand structured data such as tables needs further exploration. Addressing this need, Microsoft researchers have developed a benchmark dubbed Structural Understanding Capabilities (SUC) to assess how well LLMs can comprehend…
Researchers at the Korea Advanced Institute of Science and Technology (KAIST) have created a unique benchmark system known as INSTRUCTIR to improve the fine-tuning of Large Language Models (LLMs). The goal is to enhance these models' response to individual user preferences and instructions across a variety of generative tasks.
Traditionally, retrieval systems have struggled to…
Large language models (LLMs) such as OpenAI's GPT series have had significant impacts across various industries since their development, with their ability to generate contextually rich and coherent text outputs. However, despite their potential, there is a significant issue with the precision of these models when utilizing external tools. There is a need for improvement…
Artificial intelligence relies heavily on the relationship between visual and textual data, using it to comprehend and create content that bridges the two modalities. Vision-Language Models (VLMs), which are trained on datasets of paired images and text, are leading innovations in this area. These models leverage image-text datasets to boost progress in tasks ranging from improving…
Researchers from Shenzhen Research Institute of Big Data and The Chinese University of Hong Kong, Shenzhen, have introduced Apollo, a suite of multilingual medical language models, set to transform the accessibility of medical AI across linguistic boundaries. This is a crucial development in a global healthcare landscape where the availability of medical information in local…
Large language models (LLMs) like GPT-4 enable autonomous agents to carry out complex tasks in a variety of environments with unprecedented accuracy. However, these agents still struggle to learn from failures, which is where the Exploration-based Trajectory Optimization (ETO) method comes in. This training method, introduced by the Allen Institute for AI; Peking University's…
The field of large language models (LLMs) has witnessed significant advances thanks to the introduction of State Space Models (SSMs). With their lower computational footprint, SSMs are seen as a welcome alternative to Transformer-based architectures. The recent development of DenseSSM represents a significant milestone in this regard. Designed by a team of researchers at Huawei's Noah's Ark Lab,…
The rapid development of Large Language Models (LLMs) has seen billion- or trillion-parameter models achieve impressive performance across multiple fields. However, their sheer scale makes deployment difficult because of the demanding hardware requirements. Current research has focused on scaling models to improve performance, following established scaling laws. This, however, emphasizes the…
Large Language Models (LLMs) such as GPT-4 and Llama-2, while highly capable, require fine-tuning with specific data tailored to various business requirements. This process can expose the models to safety threats, most notably the Fine-tuning based Jailbreak Attack (FJAttack). The introduction of even a small number of harmful examples during the fine-tuning phase can drastically…