DeepMind researchers have developed an open vision-language model called PaliGemma, blending the strengths of the PaLI vision-language model series with the Gemma family of language models. The model merges a 400-million-parameter SigLIP vision model with a 2-billion-parameter Gemma language model, creating a compact, sub-3B vision-language model that can compete with larger predecessors such as PaLI-X,…
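The composition described above (patch tokens from a SigLIP-style vision encoder projected into the language model's embedding space and prepended to the text tokens) can be illustrated with a minimal sketch. This is not the actual PaliGemma code: the class, the toy dimensions, and the use of a plain non-causal Transformer stack as a stand-in for the Gemma decoder are all assumptions for illustration.

```python
import torch
import torch.nn as nn

class ToyVisionLanguageModel(nn.Module):
    """Minimal sketch of a PaliGemma-style composition: patch tokens from a
    vision encoder are linearly projected into the language model's embedding
    space and prepended to the text tokens before decoding."""

    def __init__(self, vision_dim=256, lm_dim=512, vocab_size=32000):
        super().__init__()
        # Toy stand-ins; the real model pairs a 400M SigLIP encoder with a 2B Gemma decoder.
        self.vision_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=vision_dim, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.projector = nn.Linear(vision_dim, lm_dim)   # map image tokens into LM space
        self.text_embed = nn.Embedding(vocab_size, lm_dim)
        self.decoder = nn.TransformerEncoder(            # non-causal toy stand-in for Gemma
            nn.TransformerEncoderLayer(d_model=lm_dim, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.lm_head = nn.Linear(lm_dim, vocab_size)

    def forward(self, image_patches, text_ids):
        img_tokens = self.projector(self.vision_encoder(image_patches))
        txt_tokens = self.text_embed(text_ids)
        seq = torch.cat([img_tokens, txt_tokens], dim=1)  # image tokens come first
        return self.lm_head(self.decoder(seq))

model = ToyVisionLanguageModel()
patches = torch.randn(1, 196, 256)          # e.g. a 14x14 grid of patch embeddings
prompt = torch.randint(0, 32000, (1, 16))   # prompt token ids
print(model(patches, prompt).shape)         # torch.Size([1, 212, 32000])
```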
Artificial intelligence research often examines whether next-token prediction, the convention for AI language models, can replicate aspects of human intelligence such as planning and reasoning. Despite its extensive use, this method may have inherent limitations on tasks that require foresight and decision-making. This matters because overcoming it could allow the development of…
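For context, next-token prediction trains a model to predict each token from the tokens before it. A minimal sketch of that objective follows; the shapes and values are illustrative and not taken from the research described here.

```python
import torch
import torch.nn.functional as F

# Toy next-token prediction objective: the model's output at position t is
# scored against the token at position t + 1.
vocab_size = 100
logits = torch.randn(2, 8, vocab_size)          # (batch, seq_len, vocab), from a model
tokens = torch.randint(0, vocab_size, (2, 8))   # the input token ids

loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predictions for positions 0 .. T-2
    tokens[:, 1:].reshape(-1),               # targets: each position's next token
)
print(loss.item())
```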
FlashAttention-3, the newest addition to the FlashAttention series, was created to address fundamental bottlenecks in the attention layer of Transformer architectures. This layer is particularly important to the performance of large language models (LLMs) and to applications that need long-context processing.
The earlier entries in the series, FlashAttention and FlashAttention-2, have already reshaped how attention mechanisms run on GPUs…
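For reference, the computation these kernels accelerate is standard scaled dot-product attention. A naive version is sketched below with arbitrary example shapes; FlashAttention-style kernels produce the same result exactly but tile the computation so the full N x N score matrix is never materialized in GPU memory.

```python
import math
import torch

def naive_attention(q, k, v):
    """Reference scaled dot-product attention. FlashAttention computes the
    same output, but tiles the work so the full (N, N) score matrix never
    has to be held in GPU memory."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # (..., N, N)
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 4, 128, 64)  # (batch, heads, seq_len, head_dim)
print(naive_attention(q, k, v).shape)   # torch.Size([1, 4, 128, 64])
```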
MIT researchers have developed a computational approach that uses limited data to predict which protein mutations would enhance a protein's performance. The researchers used their model to create optimized versions of proteins derived from two naturally occurring structures. One of these was green fluorescent protein (GFP), a molecule used to track cellular processes within the…
The MIT Stephen A. Schwarzman College of Computing recently celebrated the completion of its new Vassar Street building. The dedication ceremony brought together members of the MIT community, distinguished guests, and supporters, who reflected on the transformative gift from Stephen A. Schwarzman that initiated the biggest change to MIT's institutional structure in over 70 years.…
This week in artificial intelligence news, OpenAI, despite its closed nature, suffered a data breach and drew considerable criticism. The cybersecurity incident is not surprising given how global competition in the AI field continues to intensify. In corporate restructuring, Microsoft withdrew from its observer role on OpenAI’s board and Apple declined an offer for a…
MagiCode, an autonomous AI software engineering tool, is designed to bridge a gap left by currently available AI coding assistants. Most AI coding tools focus on only certain aspects of software development, which can lead to ineffective code because users are constrained in expressing their overall and specific needs…
Advances in artificial intelligence (AI) have significantly evolved voice interaction technology, with the primary goal of making interaction between humans and machines more intuitive and human-like. Recent developments have achieved high-precision speech recognition, emotion detection, and natural speech generation. Despite these advances, voice interaction still needs improvement in latency, multilingual support, and…
Large language models (LLMs) are pivotal for numerous applications, including chatbots and data analysis, chiefly because they can efficiently process large volumes of text. The progression of AI technology has amplified the need for high-quality training data, which is critical to how these models function and improve.
A major challenge in AI development is guaranteeing…
Recent research into pretrained language models (PLMs) aims to align the models with human values, avoiding harmful behaviors while maximizing efficiency and applicability. Two significant methods used for alignment are supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). RLHF, notably, generalizes the reward model to new prompt-response pairs. However, this approach often faces…
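A reward model in this setting is typically a scalar-output network trained on human preference data and then applied to new prompt-response pairs. Below is a rough, hypothetical sketch of that idea; the architecture, names, and shapes are assumptions for illustration, not taken from the research described.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyRewardModel(nn.Module):
    """Sketch of an RLHF-style reward model: encode a prompt-response pair
    and map the pooled representation to a single scalar reward."""

    def __init__(self, vocab_size=32000, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.value_head = nn.Linear(dim, 1)  # one scalar score per sequence

    def forward(self, token_ids):
        hidden = self.encoder(self.embed(token_ids))
        return self.value_head(hidden.mean(dim=1)).squeeze(-1)

rm = ToyRewardModel()
chosen = torch.randint(0, 32000, (1, 64))    # concatenated prompt + preferred response
rejected = torch.randint(0, 32000, (1, 64))  # concatenated prompt + dispreferred response

# Preference training typically maximizes the score margin between the pair:
loss = -F.logsigmoid(rm(chosen) - rm(rejected)).mean()
print(loss.item())
```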