Defense Advanced Research Projects Agency (DARPA) Archives

An improved, speedier method to prevent an AI chatbot from providing harmful responses.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 22, 2024

While artificial intelligence (AI) chatbots like ChatGPT are capable of a variety of tasks, concerns have been raised about their potential to generate unsafe or inappropriate responses. To mitigate these risks, AI labs use a safeguarding method called "red-teaming". In this process, human testers aim to elicit undesirable responses from the AI, informing its development…

An improved, quicker method to stop an AI chatbot from providing harmful responses.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 22, 2024

Artificial intelligence (AI) chatbots like OpenAI's ChatGPT can perform tasks ranging from generating code to writing article summaries. However, they can also provide information that could be harmful. To prevent this, developers use a process called red-teaming, in which human testers write prompts to uncover unsafe responses from the model. Nevertheless, this…

A more efficient and improved method to prevent AI chatbots from producing harmful responses.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 21, 2024

AI chatbots like ChatGPT, trained on vast amounts of text from billions of websites, can produce a broad range of output, including harmful or toxic material and even leaked personal information. To maintain safety standards, large language models typically undergo a process known as red-teaming, in which human testers use prompts to elicit and manage unsafe outputs.…

An improved and speedier method to stop an AI chatbot from providing harmful responses.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 21, 2024

AI chatbots pose unique safety risks—while they can write computer programs or provide useful summaries of articles, they can also potentially generate harmful or even illegal instructions, including how to build a bomb. To address such risks, companies typically use a process called red-teaming. Human testers aim to generate unsafe or toxic content from AI…

A quicker, more efficient method to safeguard against an AI chatbot providing harmful or inappropriate responses.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 20, 2024

To counter unsafe responses from chatbots, companies often use a process called red-teaming, in which human testers write prompts designed to elicit such responses so the artificial intelligence (AI) can be trained to avoid them. However, since it is impossible for human testers to cover every potential toxic prompt, MIT researchers developed a technique utilizing…

An improved, quicker method to keep AI chatbots from delivering harmful responses.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 20, 2024

Large language models powering AI chatbots can generate harmful content because of their exposure to countless websites, putting users at risk if the AI describes illegal activities, gives illicit instructions, or leaks personal information. To mitigate such threats, companies developing AI use a procedure known as red-teaming, in which human testers compose prompts aimed…

An improved, quicker method to stop an AI chatbot from providing harmful replies.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 19, 2024

Artificial intelligence (AI) chatbots like ChatGPT, capable of generating computer code, summarizing articles, and potentially even providing instructions for dangerous or illegal activities, pose unique safety challenges. To mitigate this risk, companies use a safeguarding process known as red-teaming, where human testers attempt to prompt inappropriate or unsafe responses from AI models. This process is…

A quicker and more effective method to stop an AI chatbot from providing harmful replies.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 19, 2024

Companies that build large language models, like those used in AI chatbots, routinely safeguard their systems using a process known as red-teaming. This involves human testers generating prompts designed to trigger unsafe or toxic responses from the bot, thus enabling creators to understand potential weaknesses and vulnerabilities. Despite the merits of this procedure, it often…

An improved and quicker method to guard against an AI chatbot providing harmful responses.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 18, 2024

Artificial intelligence chatbots can write helpful code, summarize articles, and even create hazardous content. To prevent safety violations like these, companies employed a procedure known as "red-teaming," in which human testers crafted prompts intended to elicit unsafe responses from chatbots, which were then taught to avoid such responses. However, this required…

An improved, quicker method to prevent an AI chatbot from providing harmful responses.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 18, 2024

Researchers from the Improbable AI Lab at MIT and the MIT-IBM Watson AI Lab have developed a new technique that uses machine learning to improve "red-teaming," a process for safeguarding large language models such as those powering AI chatbots. The new approach focuses on the automatic generation of diverse prompts that result in undesirable responses…

An improved, more efficient method to prevent an AI chatbot from producing harmful responses.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 17, 2024

Researchers from the Improbable AI Lab at MIT and the MIT-IBM Watson AI Lab have developed a technique to enhance the safety measures that keep AI chatbots from providing toxic or dangerous information. They have improved the process of red-teaming, in which human testers trigger unsafe or dangerous content to teach an AI chatbot to…

An improved, quicker method to prevent an AI chatbot from delivering harmful replies.

Algorithms, Artificial Intelligence, Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Defense Advanced Research Projects Agency (DARPA), Electrical Engineering & Computer Science (eecs), Human-computer interaction, Machine learning, MIT Schwarzman College of Computing, MIT-IBM Watson AI Lab, Research, School of Engineering, Uncategorized

July 17, 2024

All Categories

Artificial Intelligence(2794)

Computer science and technology(559)

Data(164)

Electrical Engineering & Computer Science (eecs)(430)

Machine learning(1188)

News(748)

Research(613)

School of Engineering(648)
