Large language models (LLMs) are essential to natural language processing (NLP), but training them demands significant computational resources and time. This presents a key challenge for both LLM research and application: how to train these huge models efficiently without compromising their performance.
Several approaches have been developed to address this issue.…
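The excerpt cuts off before naming those approaches, but parameter-efficient fine-tuning is one widely used family. Purely as an illustration (not drawn from the article), here is a minimal LoRA-style sketch in PyTorch: the pretrained layer is frozen and a small trainable low-rank update is added, so only a fraction of the parameters need gradients. The class and variable names are hypothetical.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pretrained linear layer plus a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pretrained weights
        # Low-rank factors: A projects down to `rank`, B projects back up
        self.lora_a = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(rank, base.out_features))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pretrained output plus the scaled low-rank correction
        return self.base(x) + (x @ self.lora_a @ self.lora_b) * self.scaling

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(4, 768))             # only the A/B factors train
```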
Large Language Models (LLMs) have proven highly competent at generating and understanding natural language, thanks to the vast amounts of data they are trained on. Predominantly, these models are trained on general-purpose corpora, like Wikipedia or CommonCrawl, which span a broad array of text. However, they can struggle in specialized domains, owing to…
Large Language Models (LLMs) are typically trained on large swaths of data and demonstrate effective natural language understanding and generation. Unfortunately, they often fail to perform well in specialized domains due to shifts in vocabulary and context. To address this deficit, researchers from NASA and IBM have collaborated to develop a model that covers multidisciplinary…
The MIT administration issued an open call for papers on generative AI, attracting 75 proposals, more than expected. Following this, MIT President Sally Kornbluth and Provost Cynthia Barnhart issued a second call for proposals, which drew 53 submissions. Now, 16 of these submissions have been chosen by the faculty committee to receive exploratory funding for detailed…
A study conducted by the University of Essex and published in Communications Biology utilized artificial intelligence to shed light on the longstanding debate around the theory of evolution. While Charles Darwin believed sexual selection was responsible for the diverse appearances of males in a species, Alfred Russel Wallace contended that natural selection influenced both sexes…
Peptides are involved in various biological processes and are instrumental in the development of new therapies. Understanding their conformations, i.e., the way they fold into their specific three-dimensional structures, is critical for their functional exploration. Despite advancements in modeling protein structures, such as with Google DeepMind's AlphaFold, the dynamic conformations of peptides remain challenging…
The advancement of deep generative models has brought new challenges in denoising, specifically blind denoising, where the noise level and covariance are unknown. To tackle this issue, a research team from École Polytechnique, Institut Polytechnique de Paris, and the Flatiron Institute developed a novel method called Gibbs Diffusion (GDiff).
The GDiff approach is a fresh…
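The excerpt truncates before detailing GDiff, but the name suggests a Gibbs sampler that alternates between denoising under a diffusion prior and inferring the noise parameters. Below is a minimal sketch of that general pattern only, not the paper's actual algorithm; `denoise_with_diffusion` and `sample_noise_params` are hypothetical stand-ins for its components.

```python
def gibbs_blind_denoise(y, denoise_with_diffusion, sample_noise_params,
                        n_iters=50, init_noise_params=None):
    """Blind denoising via Gibbs sampling (generic sketch, not the paper's code).

    Alternates between:
      1. sampling the clean signal x given the current noise parameters,
         using a pretrained diffusion model as the signal prior;
      2. sampling the noise parameters given the implied residual y - x.
    """
    phi = init_noise_params                  # unknown noise level/covariance
    samples = []
    for _ in range(n_iters):
        # Step 1: posterior sample of the signal under the diffusion prior
        x = denoise_with_diffusion(y, noise_params=phi)
        # Step 2: update the noise parameters to explain the residual
        phi = sample_noise_params(y - x)
        samples.append((x, phi))
    return samples                           # draws over (signal, noise params)
```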
Last summer, Massachusetts Institute of Technology (MIT) President Sally Kornbluth and Provost Cynthia Barnhart called on the academic community to propose effective strategies, policy proposals, and initiatives for the expansive realm of generative artificial intelligence (AI). They were met with an overwhelming response: 75 submissions. After reviewing them, the committee selected 27 proposals…
The Massachusetts Institute of Technology (MIT) launched a call for papers to examine generative AI and formulate suggestions on its applications. The initial call was widely acclaimed and received 75 submissions, 27 of which were selected for seed funding. Seeing the enthusiasm, MIT President Sally Kornbluth and Provost Cynthia Barnhart announced a second call for proposals,…
Concept-based learning (CBL) is a machine learning technique that uses high-level concepts derived from raw features to make predictions, enhancing both model interpretability and efficiency. Among the various forms of CBL, the concept bottleneck model (CBM) has gained prominence. It compresses input features into a lower-dimensional concept space, capturing the essential data and discarding…
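To make the bottleneck idea concrete, here is a minimal PyTorch sketch of a concept bottleneck model: the label head sees only a small vector of concept scores, which is what makes the predictions inspectable. All names and dimensions are illustrative, not from the source.

```python
import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    """Inputs -> concept scores -> label; the head sees only the concepts."""

    def __init__(self, in_dim: int, n_concepts: int, n_classes: int):
        super().__init__()
        self.concept_net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, n_concepts)
        )
        self.label_head = nn.Linear(n_concepts, n_classes)

    def forward(self, x):
        concepts = torch.sigmoid(self.concept_net(x))   # interpretable units
        return self.label_head(concepts), concepts

model = ConceptBottleneckModel(in_dim=64, n_concepts=10, n_classes=3)
logits, concepts = model(torch.randn(8, 64))
# Training typically combines a task loss on `logits` with a concept loss
# on `concepts` when concept annotations are available.
```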
Large Language Models (LLMs) like GPT-3.5 Turbo and Mistral 7B often struggle to maintain accuracy when retrieving information from the middle of long input contexts, a phenomenon referred to as "lost in the middle". This significantly hampers their effectiveness in tasks that require processing and reasoning over long passages, such as multi-document question answering (MDQA) and flexible…
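A simple way to observe the effect (a sketch of the general probing idea, not the cited evaluation setup) is to embed one key fact at varying depths of a long, distracting context and query the model at each depth; `ask_llm` below is a hypothetical wrapper for whatever model is under test.

```python
def build_probe(depth: float, n_fillers: int = 200) -> str:
    """Place one key fact at a relative depth inside a long, distracting context."""
    needle = "The access code for the vault is 7421."
    fillers = [f"Background sentence {i} with unrelated detail."
               for i in range(n_fillers)]
    fillers.insert(int(depth * n_fillers), needle)
    return " ".join(fillers) + "\nQuestion: What is the access code for the vault?"

for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
    prompt = build_probe(depth)
    # answer = ask_llm(prompt)   # ask_llm: hypothetical model wrapper
    # Scoring answers across depths typically shows an accuracy dip near 0.5.
```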
Scientists from The Hong Kong University of Science and Technology and the University of Illinois Urbana-Champaign have presented ScaleBiO, a novel bilevel optimization (BO) method that can scale to 34B-parameter large language models (LLMs) on data reweighting tasks. The method relies on a memory-efficient training technique called LISA and runs on eight A40 GPUs.
BO is attracting…
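The excerpt ends before explaining BO, but the general shape of bilevel data reweighting can be sketched with a one-step-unrolled toy in PyTorch: an upper level learns weights over data sources, while a lower level trains model parameters on the weighted loss. This is a caricature for intuition only; ScaleBiO's actual algorithm (and LISA) are more sophisticated and memory-efficient, and all names below are illustrative.

```python
import torch

torch.manual_seed(0)

theta = torch.zeros(16, requires_grad=True)     # model parameters (lower level)
w_logits = torch.zeros(2, requires_grad=True)   # source weights (upper level)
opt_w = torch.optim.SGD([w_logits], lr=0.5)
inner_lr = 0.1

def batch(source):
    x = torch.randn(64, 16)
    noise = 0.01 if source == 0 else 2.0        # source 1 is much noisier
    y = x @ torch.ones(16) + noise * torch.randn(64)
    return x, y

x_val, y_val = batch(0)                         # held-out data from the clean source

for step in range(200):
    weights = torch.softmax(w_logits, dim=0)
    # Lower level: one differentiable gradient step on the weighted loss
    train_loss = 0.0
    for i in range(2):
        x, y = batch(i)
        train_loss = train_loss + weights[i] * ((x @ theta - y) ** 2).mean()
    (grad_theta,) = torch.autograd.grad(train_loss, theta, create_graph=True)
    theta_next = theta - inner_lr * grad_theta
    # Upper level: validation loss after the unrolled step updates the weights
    val_loss = ((x_val @ theta_next - y_val) ** 2).mean()
    opt_w.zero_grad()
    val_loss.backward()
    opt_w.step()
    theta = theta_next.detach().requires_grad_(True)  # commit the inner step

print(torch.softmax(w_logits, dim=0))           # should favor the clean source
```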