Researchers from MIT and the MIT-IBM Watson AI Lab have introduced an efficient method to train machine-learning models to identify specific actions in videos by making use of the video's automatically generated transcripts. The method, known as spatio-temporal grounding, helps the model intricately understand the video by dissecting it and analysing it through the lens…
Federated learning is a way to train models collaboratively using data from multiple clients while maintaining data privacy. Yet this privacy can be compromised by gradient inversion attacks, which reconstruct original data from shared gradients. To address this threat and specifically tackle the challenge of text recovery, researchers from INSAIT, Sofia University, ETH Zurich, and LogicStar.ai…
Fine-tuning large language models is a common challenge for many developers and researchers in the AI field. It is a critical process in adapting models to specific tasks or enhancing their performance. But it often necessitates significant computational resources and time. Conventional solutions, such as adjusting all model weights, are resource-intensive, requiring substantial memory and…
NVIDIA, a leader in artificial intelligence (AI) and graphics processing units (GPUs), has recently launched NV-Embed, an advanced embedding model built on a large language model (LLM) architecture. NV-Embed is set to transform the field of natural language processing (NLP) and has already demonstrated high performance on the Massive Text Embedding Benchmark (MTEB). Its…
Causal models play a vital role in establishing the cause-and-effect associations between variables in complex systems, though they struggle to estimate probabilities associated with multiple interventions and conditions. Two main types of causal models have been the focus of AI research: functional causal models and causal Bayesian networks (CBNs).
Functional causal models make it…
Large language models (LLMs) have rapidly improved over time, proving their prowess in text generation, summarization, translation, and question-answering tasks. These advancements have led researchers to explore their potential in reasoning and planning tasks.
Despite this growth, evaluating the effectiveness of LLMs in these complex tasks remains a challenge. It's difficult to assess if any performance…
During long, continuous dialogues, the large language models that drive chatbots such as ChatGPT can struggle to cope, often leading to a decline in performance. Now, a team of researchers from MIT and elsewhere believe they have found a solution to this issue, which ensures chatbots can continue a conversation without crashing or…
A few years ago, MIT researchers created an innovative cryptographic ID tag several times smaller and much cheaper than the traditional radio-frequency identification (RFID) tags commonly used to authenticate products. Despite the significant improvements in size, cost, and security this new ID tag brought, it shared a major security vulnerability with RFID tags, where a counterfeiter could…
Researchers from MIT, Brigham and Women’s Hospital, and Duke University have developed a multipronged strategy to identify which transporter proteins different drugs use to pass through the gastrointestinal (GI) tract. This could not only improve patient treatment by revealing which drugs might interact unfavorably with each other, but also enhance the development of new drugs by informing…
In 2010, Media Lab students Karthik Dinakar SM ’12, PhD ’17, and Birago Jones SM ’12 wanted to develop a tool to assist content moderation teams at companies like Twitter and YouTube. The project prompted excitement, earning them a demo at a White House cyberbullying summit. When Dinakar struggled to create a working demo, Jones…
A team of researchers from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and Google Research have developed an image-to-image diffusion model called Alchemist, which allows users to modify the material properties of objects in photos. The system adjusts aspects such as roughness, metallicity, innate color (albedo), and transparency, and can be applied to…