
Machine learning

Improving the Interpretability and Efficiency of Neural Networks by Integrating Wavelets with Kolmogorov-Arnold Networks (Wav-KAN)

Recent advances in artificial intelligence (AI) have produced systems capable of making complex decisions, but the opacity of how they reach those decisions poses a potential risk to their use in daily life and the economy. Because understanding AI models is crucial to avoiding algorithmic bias, model renovation aims to make them more interpretable. Kolmogorov-Arnold Networks (KANs)…


A novel method has been developed to allow AI chatbots to engage in conversation all day without experiencing any system breakdowns.

A team of researchers from MIT and other institutions has developed a method to keep AI language models, such as the chatbot ChatGPT, from deteriorating during continuous dialogue. Named StreamingLLM, the solution modifies the model's key-value cache, which acts as a conversation memory. Conventionally, when the cache overflows, the…


This small, secure identification tag has the ability to verify nearly anything.

MIT researchers have developed a small, affordable, and secure cryptographic ID tag that improves upon traditional radio frequency identification (RFID) tags by using terahertz waves, which have shorter wavelengths and higher frequencies than radio waves. Traditional RFID tags are often defeated by counterfeiters who remove them from genuine items and reattach them to fake ones; the…


The new method pinpoints medications that should not be combined.

Researchers from MIT, Brigham and Women’s Hospital, and Duke University have developed a strategy to identify how different drugs are transported through the digestive tract. This multipronged strategy combines tissue models and machine learning algorithms to determine which transporters help various drugs pass through the digestive tract. This is an important…


Bringing artificial intelligence to people with problems to solve.

In 2010, Media Lab students Karthik Dinakar SM ’12, PhD ’17, and Birago Jones SM ’12 developed a tool intended to assist content moderation teams for platforms such as Twitter and YouTube. The tool aimed to flag harmful content, with a key focus on posts that could be linked to cyberbullying. The project was warmly received,…


DIAMOND (DIffusion As a Model Of eNvironment Dreams): A Training Method for Reinforcement Learning Agents within a Diffusion-Based World Model.

Reinforcement learning (RL) involves learning to make decisions through interaction with an environment and has been used effectively in games, robotics, and autonomous systems. RL agents aim to maximize their cumulative reward, improving their performance by continually adapting to new data. However, RL agents are sample-inefficient, which impedes practical application by necessitating comprehensive…


Revealing the Hidden Linearity in Transformer Decoders: Fresh Perspectives for Effective Pruning and Improved Efficiency

Researchers from various institutions have recently unveiled a unique linear property of transformer decoders in natural language processing models such as GPT, LLaMA, OPT, and BLOOM. This discovery could have significant implications for future advancements in the field. These researchers discovered that there is a nearly perfect linear relationship in the embedding transformations between sequential…


Researchers from MIT have proposed a change to the Transformer architecture known as Cross-Layer Attention (CLA), which shrinks the key-value (KV) cache by sharing KV activations across layers.

Managing large language models (LLMs) often entails dealing with issues related to the size of key-value (KV) cache, given that it scales with both the sequence length and the batch size. While techniques have been employed to reduce the KV cache size, such as Multi-Query Attention (MQA) and Grouped-Query Attention (GQA), they have only managed…
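To make the scaling concrete, here is a rough sketch of the usual back-of-the-envelope KV cache estimate: keys and values are stored per layer, each with one entry per batch element, sequence position, KV head, and head dimension. The function name, the model shape, and the `sharing_factor` parameter (how many adjacent layers share one set of KV activations) are illustrative assumptions, not figures from the research.

```python
def kv_cache_bytes(batch_size, seq_len, n_layers, n_kv_heads, head_dim,
                   bytes_per_value=2, sharing_factor=1):
    """Estimate KV cache size in bytes.

    sharing_factor is a hypothetical knob: the number of adjacent layers that
    share one set of key/value activations (1 means no cross-layer sharing).
    """
    # Layers that actually store their own keys and values (ceiling division).
    kv_layers = -(-n_layers // sharing_factor)
    # The factor 2 accounts for storing both keys and values.
    return (2 * kv_layers * batch_size * seq_len
            * n_kv_heads * head_dim * bytes_per_value)


# Hypothetical model shape, chosen only for illustration.
shape = dict(n_layers=32, n_kv_heads=32, head_dim=128)

base = kv_cache_bytes(batch_size=1, seq_len=4096, **shape)
doubled_batch = kv_cache_bytes(batch_size=2, seq_len=4096, **shape)
shared = kv_cache_bytes(batch_size=1, seq_len=4096, sharing_factor=2, **shape)

print(f"baseline:        {base / 2**30:.2f} GiB")
print(f"batch doubled:   {doubled_batch / 2**30:.2f} GiB")
print(f"2-layer sharing: {shared / 2**30:.2f} GiB")
```

In this estimate, reducing `n_kv_heads` corresponds to the MQA and GQA style of saving, while raising `sharing_factor` reduces the number of layers that hold their own cache.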


Researchers from MIT propose Cross-Layer Attention (CLA), a modification of the Transformer architecture that decreases the size of the key-value (KV) cache by sharing KV activations across layers.

MIT researchers have developed a method known as Cross-Layer Attention (CLA) to alleviate the memory footprint bottleneck of the key-value (KV) cache in large language models (LLMs). As more applications demand longer input sequences, the KV cache's memory requirements limit batch sizes and necessitate costly offloading techniques. Additionally, persistently storing and retrieving KV caches to…


A novel method allows AI chatbots to engage in conversations throughout the day without experiencing a system crash.

Researchers from MIT have devised a method called StreamingLLM, which enables chatbots to maintain long, uninterrupted dialogues without crashing or slowing down. It modifies the key-value cache at the core of many large language models, which serves as a conversation memory, so that the first few data points always remain in the cache. The method facilitates a…
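The idea of keeping the earliest entries in the cache while older ones are evicted can be sketched with a toy cache. This is an illustration only, not the StreamingLLM implementation: the class name `SinkCache`, the `n_sink` and `window` parameters, the sliding-window eviction of everything else, and the use of integer token ids in place of real key/value tensors are all assumptions.

```python
from collections import deque


class SinkCache:
    """Toy cache: pins the first few entries, keeps a sliding window of the rest."""

    def __init__(self, n_sink=4, window=8):
        self.n_sink = n_sink
        self.sink = []                      # earliest entries, never evicted
        self.recent = deque(maxlen=window)  # drops its oldest entry when full

    def append(self, entry):
        # Fill the pinned region first, then fall back to the sliding window.
        if len(self.sink) < self.n_sink:
            self.sink.append(entry)
        else:
            self.recent.append(entry)

    def contents(self):
        return self.sink + list(self.recent)


cache = SinkCache(n_sink=2, window=3)
for token_id in range(10):
    cache.append(token_id)

print(cache.contents())  # [0, 1, 7, 8, 9]
```

The cache never grows beyond `n_sink + window` entries, and the first `n_sink` entries survive no matter how long the stream runs.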


This compact, secure identification tag can verify virtually anything.

In a bid to tackle the problem of item counterfeiting, researchers at MIT have taken a significant step forward in developing a tiny, cheap, and secure cryptographic ID tag. The tag, which uses terahertz waves and is notably smaller, less expensive, and more secure than conventional radio frequency identification (RFID) tags, was initially found to have…


The latest model recognizes medications that are not safe to consume concurrently.

Researchers from MIT, Duke University, and Brigham and Women’s Hospital have designed an innovative strategy to identify the specific transporters that different drugs utilize. The study could potentially improve patient treatment as it uncovered that certain common drugs can interfere with each other if they rely on the same transporter. The process is based upon…
