The development of Large Language Models (LLMs) such as GPT and LLaMA has revolutionized natural language processing (NLP). These models have found use in a broad range of applications, driving growing demand for custom LLMs among both individuals and corporations. However, developing such models is resource-intensive, posing a significant challenge for potential users.
To…
Salesforce AI researchers have developed a new approach to enhancing text-embedding models for a variety of natural language processing (NLP) tasks. While current models have set high standards, there is still room for improvement, particularly in retrieval, clustering, classification, and semantic textual similarity.
The new model, named…
The intersection of artificial intelligence (AI) and music has become an essential field of study, with Large Language Models (LLMs) playing a significant role in generating sequences. Skywork AI PTE. LTD. and Hong Kong University of Science and Technology have developed ChatMusician, a text-based LLM, to tackle the issue of understanding and generating music.
ChatMusician shows…
The BigCode project has developed StarCoder2, the second iteration of an advanced large language model designed to revolutionize software development. Built through a collaboration of more than 30 leading universities and institutions, StarCoder2 uses machine learning to optimize code generation, making it easier to fix bugs and automate routine coding tasks.
Training StarCoder2 on…
Researchers from the University of Oxford and University College London have developed Craftax, a reinforcement learning (RL) benchmark that unifies effective parallelization, compilation, and the elimination of CPU-to-GPU transfers in RL experiments. This research seeks to address the limitations researchers face when using tools such as MiniHack and Crafter due to their prolonged…
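The design goals named above (parallelization, compilation, and keeping all data on the accelerator) are characteristic of JAX-based RL pipelines. The sketch below illustrates that pattern with a hypothetical toy environment; it is not Craftax's actual API.

```python
import jax
import jax.numpy as jnp

# Toy environment step (an assumption for illustration, not Craftax's API).
# Because the function is pure, it can be jit-compiled and vmapped, and the
# state arrays never leave the accelerator between steps.
def env_step(state, action):
    new_state = state + action          # trivial dynamics
    reward = -jnp.abs(new_state)        # reward: negative distance from zero
    return new_state, reward

# vmap vectorizes the step across many parallel environments;
# jit compiles the whole batched step into a single accelerator kernel.
batched_step = jax.jit(jax.vmap(env_step))

n_envs = 1024
states = jnp.zeros(n_envs)
actions = jnp.ones(n_envs)

states, rewards = batched_step(states, actions)
print(states.shape, rewards.shape)  # (1024,) (1024,)
```

Running thousands of such environments in lockstep on the GPU, rather than stepping them one by one on the CPU, is what removes the transfer bottleneck the article alludes to.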
A language model's performance hinges on its efficiency and its ability to recall information, capabilities in high demand as artificial intelligence continues to grapple with the intricacies of human language. Researchers from Stanford University, Purdue University, and the University at Buffalo have developed Based, an architecture that differs significantly from traditional approaches. Its aim is to…
IBM Research has unveiled "SimPlan", an innovative method designed to enhance the planning capabilities of large language models (LLMs), which traditionally struggle to map out action sequences toward an optimal outcome. SimPlan combines the linguistic skills of LLMs with the structured approach of classical planning algorithms, addressing…
A group of researchers from the Sea AI Lab and Singapore University of Technology and Design have developed Sailor, a sophisticated collection of language models designed to ease language translation in linguistically diverse regions such as Southeast Asia. The models distinguish themselves by accurately addressing the nuances of languages such as Indonesian, Thai,…
In a collaborative effort, researchers from Microsoft Research Asia, Zhejiang University, College of William & Mary, and Tsinghua University introduced a novel artificial intelligence method called DiLightNet. This method aims to solve the fine-grained lighting control issue present in text-driven diffusion-based image generation. While current text-driven generative models can produce images from text prompts, they…
In a world filled with complexity and unpredictability, making informed decisions often proves difficult. Conventional strategies and human expertise often fall short, especially in sectors such as business, finance, and agriculture that involve high stakes and uncertainty. Enter DeLLMa, a Decision-making Large Language Model Assistant developed by researchers from the University of Southern…
Introducing Gen4Gen: A Partially Automated Process for Creating Datasets Utilizing Generative Models
Text-to-image diffusion models are arguably among the greatest advancements in Artificial Intelligence (AI). However, personalizing these models with diverse concepts has proven challenging, largely due to mismatches between the simplified text descriptions in pre-training datasets and the complexity of real-world scenes.
One significant hurdle in the field is the absence of…
Researchers from the School of Computer Science and Engineering at Beihang University in Beijing, China, and Microsoft have developed ResLoRA, an improved framework for Low-rank Adaptation (LoRA). Improving LoRA matters because fine-tuning Large Language Models (LLMs) on specific datasets incurs high costs due to their…
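For context, the low-rank update at the heart of LoRA can be sketched in a few lines of NumPy. This is an illustration of the general idea only, not ResLoRA's implementation (its residual paths are not shown), and the dimensions are arbitrary.

```python
import numpy as np

# LoRA idea: freeze the pretrained weight W and learn a low-rank update B @ A,
# so only r * (d_in + d_out) parameters are trained instead of d_in * d_out.
rng = np.random.default_rng(0)
d_in, d_out, r = 512, 512, 8            # illustrative sizes, not from the paper

W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, initialized to zero

def lora_forward(x):
    # Frozen path plus the low-rank adaptation path.
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
# With B initialized to zero, the adapted model starts identical to the base model.
assert np.allclose(lora_forward(x), W @ x)

full = W.size
lora = A.size + B.size
print(f"trainable params: {lora} of {full} ({lora / full:.2%})")
```

Here only about 3% of the weight's parameters are trainable, which is the cost reduction the article refers to; ResLoRA's contribution is to improve how such adapters train, not to change this basic parameterization.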