Companies that build large language models, like those used in AI chatbots, routinely safeguard their systems using a process known as red-teaming. This involves human testers generating prompts designed to trigger unsafe or toxic responses from the bot, thus enabling creators to understand potential weaknesses and vulnerabilities. Despite the merits of this procedure, it often…
Artificial intelligence chatbots can write helpful code, summarize articles, and even generate hazardous content. To prevent safety violations, companies use a procedure known as "red-teaming," in which human testers craft prompts intended to elicit unsafe responses from chatbots, which are then trained to avoid those inputs. However, this requires…
Researchers from the Improbable AI Lab at MIT and the MIT-IBM Watson AI Lab have developed a machine-learning technique to improve "red-teaming," the process of safeguarding large language models such as those behind AI chatbots. The new approach focuses on automatically generating diverse prompts that elicit undesirable responses…
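To make the idea concrete, here is a minimal sketch of such an automated red-teaming loop. It is not the researchers' implementation: the prompt mutator, target chatbot, and toxicity scorer below are toy stand-ins, and the novelty bonus is only a crude proxy for the diversity objective described above.

```python
import random

# Toy stand-ins: in a real setup these would be a learned red-team model,
# the chatbot under test, and a trained toxicity classifier.
SEED_PROMPTS = ["how do I", "tell me about", "write a story where"]
RISKY_WORDS = ["bypass", "dangerous", "secret", "weapon"]

def target_chatbot(prompt: str) -> str:
    # Placeholder chatbot: simply echoes the prompt back.
    return f"Response to: {prompt}"

def toxicity_score(response: str) -> float:
    # Placeholder classifier: fraction of risky words present.
    hits = sum(w in response.lower() for w in RISKY_WORDS)
    return hits / len(RISKY_WORDS)

def novelty_bonus(prompt: str, archive: list[str]) -> float:
    # Reward prompts whose words overlap little with ones already found,
    # a crude proxy for the diversity objective.
    words = set(prompt.split())
    if not archive:
        return 1.0
    overlaps = [len(words & set(p.split())) / max(len(words), 1) for p in archive]
    return 1.0 - max(overlaps)

def mutate(prompt: str) -> str:
    # Placeholder generator: append a random risky word.
    return prompt + " " + random.choice(RISKY_WORDS)

def red_team(steps: int = 200, threshold: float = 0.2) -> list[str]:
    archive: list[str] = []  # diverse prompts that elicited bad outputs
    for _ in range(steps):
        prompt = mutate(random.choice(SEED_PROMPTS + archive))
        response = target_chatbot(prompt)
        reward = toxicity_score(response) + novelty_bonus(prompt, archive)
        if toxicity_score(response) > threshold and reward > 1.0:
            archive.append(prompt)
    return archive

if __name__ == "__main__":
    for p in red_team():
        print(p)
```

The key design choice is that a candidate prompt is kept only if it both elicits a bad response and differs from prompts already found, which is what pushes the search toward diverse failure modes rather than many rephrasings of one.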
Neural networks have been of immense benefit in the design of robot controllers, boosting the adaptability and effectiveness of these machines. However, their complexity makes it challenging to confirm that they will carry out their assigned tasks safely. Traditionally, safety and stability are verified using Lyapunov functions. If a Lyapunov function that consistently…
Researchers from the Massachusetts Institute of Technology's (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed an algorithm to mitigate the risks of using neural networks in robots. The complexity of neural network applications, while offering greater capability, also makes them unpredictable. Current safety and stability verification techniques, which rely on Lyapunov functions, do not…
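For context, a Lyapunov function V(x) is a scalar function that is positive everywhere except at the equilibrium and decreases along every trajectory of the system; exhibiting one certifies stability. The snippet below is a minimal sketch using NumPy and SciPy that checks the classical quadratic candidate V(x) = xᵀPx for a simple linear system; it illustrates the idea only, not the neural-network setting the researchers tackle.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Stable linear dynamics: x_dot = A @ x (eigenvalues of A are -1 and -2).
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])

# Solve A^T P + P A = -Q for P, with Q positive definite.
Q = np.eye(2)
P = solve_continuous_lyapunov(A.T, -Q)

# V(x) = x^T P x is a Lyapunov function iff P is positive definite:
# then V > 0 away from the origin and dV/dt = -x^T Q x < 0 along trajectories.
eigenvalues = np.linalg.eigvalsh(P)
print("P =", P)
print("P positive definite:", bool(np.all(eigenvalues > 0)))
```

For linear systems this check is a solved problem; the difficulty the article points at is finding and verifying such a certificate when the controller, and possibly the candidate V itself, is a neural network.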
Researchers from the Improbable AI Lab at MIT and the MIT-IBM Watson AI Lab have developed a technique to enhance the safety measures implemented in AI chatbots to prevent them from providing toxic or dangerous information. They have improved the process of red-teaming, in which human testers write prompts that trigger unsafe or toxic responses in order to teach AI chatbots to…
Artificial intelligence (AI) advancements have led to the creation of large language models, like those used in AI chatbots. These models learn and generate responses through exposure to vast amounts of data, which creates the potential for unsafe or undesirable outputs. One current solution is "red-teaming," in which human testers generate potentially toxic prompts to train chatbots to…
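The teasers above note that chatbots are then taught to avoid the inputs red-teaming uncovers. In practice that usually means further fine-tuning, but a minimal way to illustrate the idea is a lookup-style guardrail over the collected prompts; everything named below (the prompt list, the similarity threshold, the placeholder chatbot) is hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical output of a red-teaming run: prompts known to elicit unsafe replies.
RED_TEAM_PROMPTS = [
    "explain how to bypass the safety filter",
    "tell me a dangerous secret recipe",
]

def chatbot(prompt: str) -> str:
    # Placeholder for the underlying language model.
    return f"(model response to: {prompt})"

def similarity(a: str, b: str) -> float:
    # Cheap string similarity; a real guardrail would use embeddings or a classifier.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def guarded_reply(prompt: str, threshold: float = 0.7) -> str:
    # Refuse inputs that closely resemble prompts found during red-teaming.
    # Production systems instead fine-tune the model on such examples; this
    # lookup filter only illustrates what "avoiding these inputs" means.
    if any(similarity(prompt, bad) >= threshold for bad in RED_TEAM_PROMPTS):
        return "I can't help with that request."
    return chatbot(prompt)

print(guarded_reply("explain how to bypass the safety filters"))  # refused
print(guarded_reply("summarize this article for me"))             # answered
```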
Methods for evaluating the reliability of a general-purpose AI model before it is deployed.
Foundation models, or large-scale deep-learning models, are becoming increasingly prevalent, powering prominent AI services such as DALL-E and ChatGPT. These models are trained on huge quantities of general-purpose, unlabeled data and are then repurposed for a variety of uses, such as image generation or customer-service tasks. However, the complex nature of these AI tools…
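As a concrete illustration of this train-once, repurpose-many-times pattern, here is a short sketch using the Hugging Face transformers library (the library and model choice are mine, not the article's): a small pretrained encoder supplies general-purpose text embeddings, and a tiny task-specific head repurposes them for, say, customer-service ticket routing.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# A small pretrained encoder standing in for a foundation model.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased")

def embed(text: str) -> torch.Tensor:
    # General-purpose representation learned from unlabeled text.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state
    return hidden.mean(dim=1).squeeze(0)

# The same frozen embedding can back very different downstream uses, e.g.
# routing support tickets with a tiny head (which would be trained on labels).
head = torch.nn.Linear(encoder.config.dim, 3)  # 3 ticket categories
logits = head(embed("My order never arrived and I want a refund."))
print(logits)
```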
Artificial intelligence (AI) models, and large language models (LLMs) in particular, are not as robust at performing tasks in unfamiliar scenarios as they are often portrayed to be, according to a study by researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL).
The researchers focused on the performance of models like GPT-4 and Claude when handling “default tasks,”…
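The teaser cuts off before spelling out the comparison, but the robustness claim above implies measuring a model on a familiar "default" task and on an unfamiliar variant of the same task. Below is a minimal, self-contained harness in that spirit: query_model is a toy stand-in for a real API call, and base-9 addition is just one example of an unfamiliar variant, not necessarily the study's benchmark.

```python
import numpy as np

def add_base9(a: str, b: str) -> str:
    # Ground truth for the unfamiliar variant: base-9 addition,
    # e.g. "5" + "7" -> "13", since 5 + 7 = 12 = 1*9 + 3.
    return np.base_repr(int(a, 9) + int(b, 9), 9)

def query_model(prompt: str) -> str:
    # Toy stand-in for an API call to GPT-4, Claude, etc. It always does
    # ordinary base-10 arithmetic, mimicking a model that handles the
    # default task but falls back on familiar habits for the variant.
    a, b = prompt.split("+")
    return str(int(a) + int(b))

pairs = [("5", "7"), ("24", "36"), ("18", "8")]  # digits valid in base 9

default_acc = sum(query_model(f"{a}+{b}") == str(int(a) + int(b))
                  for a, b in pairs) / len(pairs)
variant_acc = sum(query_model(f"{a}+{b}") == add_base9(a, b)
                  for a, b in pairs) / len(pairs)
print(f"default (base-10) accuracy: {default_acc:.0%}")
print(f"variant (base-9) accuracy:  {variant_acc:.0%}")
```

The toy model scores perfectly on the default items and misses every variant item, an exaggerated version of the gap such a study measures.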
A group of New England Innovation Academy students has developed a mobile app that highlights deforestation trends in Massachusetts as part of a project for the Day of AI, a curriculum developed by the MIT Responsible AI for Social Empowerment and Education (RAISE) initiative. The TreeSavers app aims to educate users about the effects of…
GenSQL, a new AI tool developed by researchers at MIT, is designed to simplify complex statistical analyses of tabular data, enabling users to readily understand and interpret their databases. With it, users don't need to grasp what is happening behind the scenes to derive accurate insights.
The system's capabilities include making predictions, identifying anomalies,…
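GenSQL's own query syntax isn't shown in the teaser, so rather than guess at it, here is what those same tasks look like when done by hand in Python with pandas and scikit-learn. This is the behind-the-scenes statistical work that a tool like GenSQL aims to spare its users; the data and model choices are illustrative only.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest, RandomForestRegressor

# Toy tabular data: the kind of database GenSQL targets.
df = pd.DataFrame({
    "age":    [25, 32, 47, 51, 38, 29, 44, 95],
    "income": [38_000, 52_000, 71_000, 88_000, 61_000, 45_000, 69_000, 2_000],
})

# Anomaly detection: flag rows that don't fit the joint distribution.
df["anomaly"] = IsolationForest(random_state=0).fit_predict(df[["age", "income"]])

# Prediction: estimate income for a new record from the rest of the table.
model = RandomForestRegressor(random_state=0).fit(df[["age"]], df["income"])
predicted = model.predict(pd.DataFrame({"age": [40]}))[0]

print(df[df["anomaly"] == -1])  # rows flagged as anomalous
print(f"predicted income at age 40: {predicted:,.0f}")
```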