MIT researchers have developed a technique for improving the accuracy of uncertainty estimates in machine-learning models. This is especially important in situations where these models are used for critical tasks such as diagnosing diseases from medical imaging or filtering job applications. The new method works more efficiently and is scalable enough to apply to large…
Artificial intelligence (AI) models, and large language models (LLMs) in particular, are not as robust at performing tasks in unfamiliar scenarios as is often assumed, according to a study by researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL).
The researchers focused on the performance of models like GPT-4 and Claude when handling “default tasks,”…
Ensuring the safety of large language models (LLMs) is vital given their widespread use across various sectors. Despite efforts to secure these systems through approaches such as reinforcement learning from human feedback (RLHF) and inference-time controls, vulnerabilities persist. Adversarial attacks have, in certain instances, been able to circumvent such defenses, raising the…
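As a rough illustration of what an inference-time control can look like, the hedged sketch below wraps a model call with a simple output filter before the response reaches the user. The `generate()` stub and the blocklist are hypothetical placeholders for illustration only, not the defenses or attacks discussed in the work above.

```python
# Toy illustration of an inference-time control: a lightweight filter that
# screens a model's draft response before it is returned to the user.
# The blocklist and the wrapped generate() function are hypothetical
# placeholders, not part of any real system mentioned above.

BLOCKED_TOPICS = ("synthesize the toxin", "bypass the alarm")  # hypothetical examples

def generate(prompt: str) -> str:
    """Stand-in for an LLM call; replace with a real client in practice."""
    return f"Draft answer to: {prompt}"

def guarded_generate(prompt: str) -> str:
    draft = generate(prompt)
    # Reject drafts that match any blocked pattern; real guards would use
    # a trained classifier rather than simple substring matching.
    if any(topic in draft.lower() for topic in BLOCKED_TOPICS):
        return "I can't help with that request."
    return draft

if __name__ == "__main__":
    print(guarded_generate("How do I reset my router?"))
```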
Artificial Intelligence (AI) and Machine Learning (ML) are transforming the field of cybersecurity by enhancing both defensive and offensive capabilities. On the defensive side, they help systems detect and respond to cyber threats more effectively. AI and ML algorithms can process vast datasets, which makes them well suited to identifying patterns and anomalies. These techniques have…
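A minimal sketch of this kind of anomaly detection is shown below, using scikit-learn's IsolationForest over synthetic "network telemetry". The two features (bytes sent, connection count) and the generated data are hypothetical stand-ins for real logs, chosen only to illustrate the pattern-and-anomaly idea.

```python
# Minimal sketch of ML-based anomaly detection on network telemetry,
# using scikit-learn's IsolationForest. The synthetic "traffic" features
# (bytes sent, connection count) are hypothetical stand-ins for real logs.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(loc=[500, 20], scale=[50, 5], size=(1000, 2))     # typical sessions
attacks = rng.normal(loc=[5000, 200], scale=[500, 20], size=(10, 2))  # bursty outliers
traffic = np.vstack([normal, attacks])

detector = IsolationForest(contamination=0.01, random_state=0).fit(traffic)
flags = detector.predict(traffic)          # -1 marks suspected anomalies
print(f"flagged {np.sum(flags == -1)} of {len(traffic)} sessions")
```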
A group of researchers from Stanford University, UC San Diego, UC Berkeley, and Meta AI has proposed a new class of sequence modeling layers that blend the expressive hidden state of self-attention mechanisms with the linear complexity of Recurrent Neural Networks (RNNs). These layers are called Test-Time Training (TTT) layers.
Self-attention mechanisms excel at processing extended…
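To make the TTT idea concrete, here is a hedged numpy sketch under simplifying assumptions: the layer's hidden state is the weight matrix of a small linear inner model, and processing each token means taking one gradient step on a self-supervised reconstruction loss. The fixed random projections and squared-error inner loss are simplified placeholders, not the exact formulation proposed by the authors.

```python
# Minimal numpy sketch of the idea behind a Test-Time Training (TTT) layer:
# the hidden state is itself the weight matrix of a small inner model, and
# "updating the hidden state" means one gradient step on a self-supervised
# loss per incoming token, giving linear cost in sequence length.
import numpy as np

rng = np.random.default_rng(0)
d = 16                                        # token embedding dimension
theta_K = rng.normal(size=(d, d)) / d**0.5    # fixed "key" projection (stand-in for learned weights)
theta_V = rng.normal(size=(d, d)) / d**0.5    # fixed "value" projection
theta_Q = rng.normal(size=(d, d)) / d**0.5    # fixed "query" projection

def ttt_layer(tokens, lr=0.1):
    """Process a sequence in O(length): one inner-model gradient step per token."""
    W = np.zeros((d, d))                      # hidden state = inner model's weights
    outputs = []
    for x in tokens:
        k, v, q = theta_K @ x, theta_V @ x, theta_Q @ x
        err = W @ k - v                       # self-supervised reconstruction error
        W -= lr * np.outer(err, k)            # grad of ||W k - v||^2 is 2*err*k^T (constant folded into lr)
        outputs.append(W @ q)                 # read out with the updated inner model
    return np.stack(outputs)

seq = rng.normal(size=(32, d))                # a toy sequence of 32 token embeddings
print(ttt_layer(seq).shape)                   # (32, 16)
```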
Complex software development tasks are often deferred by engineers, degrading user experience and driving up business costs. Fume, a startup that uses Artificial Intelligence (AI), aims to address these issues efficiently, including Sentry errors, bugs, and feature requests.
Fume is known for its…
Large Language Models (LLMs) are advanced Artificial Intelligence tools designed to understand, interpret, and respond to human language in a way that resembles natural human conversation. They are currently used in areas such as customer service, mental health, and healthcare because of their ability to interact directly with humans. However, recently, researchers from the National…
Data curation, particularly high-quality and efficient curation, is crucial to the performance of large-scale pretraining in vision, language, and multimodal learning. Current approaches often depend on manual curation, which is expensive and difficult to scale. Model-based data curation, which selects high-quality examples based on features of the model being trained, offers a way to address these scalability issues.…
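The hedged sketch below shows the general shape of model-based curation: score candidate examples with a reference model and keep only the top fraction. The `reference_score` callable is a hypothetical placeholder (for instance, the negative loss of a small pretrained model); only the selection logic is being illustrated, not the specific method in the work above.

```python
# Sketch of model-based data curation: rank candidate training examples by a
# quality score from a reference model, then keep the top fraction.
# reference_score is a hypothetical placeholder for a real scoring model.
from typing import Callable, Sequence

def curate(examples: Sequence[str],
           reference_score: Callable[[str], float],
           keep_fraction: float = 0.3) -> list[str]:
    """Keep the highest-scoring fraction of examples according to the reference model."""
    ranked = sorted(examples, key=reference_score, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]

# Toy usage: score by length as a stand-in for a real quality/learnability signal.
corpus = ["short noisy snippet", "a longer, well-formed paragraph about physics", "x"]
print(curate(corpus, reference_score=lambda text: float(len(text))))
```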
Researchers at the University of Cambridge are using artificial intelligence (AI) tools to combat antibiotic resistance. Led by Professor Stephen Baker, the team developed a machine learning tool that can distinguish resistant bacteria from susceptible ones. The tool uses microscopy images to identify bacteria that are resistant to common antibiotics such as…
Adversarial attacks, deliberately crafted inputs that trick machine learning (ML) models into making incorrect predictions, have presented a significant challenge to the safety and dependability of critical machine learning applications. Neural networks, a widely used class of ML models, are especially susceptible to adversarial attacks. These attacks are especially concerning in applications such as facial recognition systems,…
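For a concrete sense of how such attacks work, the sketch below uses the fast gradient sign method (FGSM), a classic attack in which the input is perturbed in the direction of the loss gradient's sign. The tiny untrained linear "classifier" and random input are stand-ins for a real model and image, so this is an illustration of the mechanism, not of the specific attacks referenced above.

```python
# Minimal PyTorch sketch of the fast gradient sign method (FGSM):
# perturb the input in the direction of the sign of the loss gradient.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(28 * 28, 10)       # toy classifier over flattened 28x28 images
x = torch.rand(1, 28 * 28, requires_grad=True)
y = torch.tensor([3])                      # arbitrary "true" label

loss = F.cross_entropy(model(x), y)
loss.backward()

epsilon = 0.05                             # perturbation budget
x_adv = (x + epsilon * x.grad.sign()).clamp(0.0, 1.0).detach()

# With a trained model, the adversarial prediction typically differs from the clean one.
print("clean prediction:", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```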
Controllable Learning (CL) is increasingly recognized as a vital element of reliable machine learning: it ensures that learning models meet specified objectives and can adapt to changing requirements without retraining. This article examines the methods and applications of CL, focusing on its implementation within Information Retrieval (IR) systems, as demonstrated by researchers…
Large language models (LLMs) have demonstrated impressive performances across various tasks, with their reasoning capabilities playing a significant role in their development. However, the specific elements driving their improvement are not yet fully understood. Current strategies to enhance reasoning focus on enlarging model size and expanding the context length via methods such as chain of…
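One of the context-based strategies alluded to above is chain-of-thought prompting, where the prompt asks the model to write out intermediate reasoning steps before its final answer. The sketch below is a hedged illustration of that prompting pattern; the `call_llm` function is a hypothetical stand-in for a real model client, not an actual API.

```python
# Toy illustration of chain-of-thought prompting: the prompt includes a worked
# example with explicit reasoning steps, then asks the model to reason step by
# step on the new question. call_llm is a hypothetical placeholder.
def build_cot_prompt(question: str) -> str:
    return (
        "Q: A shop sells pens at 3 for $2. How much do 12 pens cost?\n"
        "A: 12 pens is 4 groups of 3 pens. Each group costs $2, so 4 * $2 = $8. The answer is $8.\n"
        f"Q: {question}\n"
        "A: Let's think step by step."
    )

def call_llm(prompt: str) -> str:
    """Placeholder for an actual LLM call."""
    return "<model response>"

if __name__ == "__main__":
    print(call_llm(build_cot_prompt("A train travels 60 km in 45 minutes. What is its speed in km/h?")))
```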