
AI transcription devices can create damaging illusions.

Artificial intelligence (AI) transcription tools have become remarkably accurate and have reshaped industries ranging from medicine, where they help maintain critical patient records, to office work, where they transcribe meeting minutes. But they are not infallible, and a recent study reveals troubling errors. The research indicates that advanced AI transcribers such as OpenAI’s Whisper don’t merely produce meaningless or random text when they err. They “hallucinate” entirely new phrases, often with disturbing connotations.

Researchers from Cornell University, the University of Washington, New York University, and the University of Virginia found that Whisper, despite being one of the most advanced tools available, hallucinated just over 1% of the time. More alarmingly, around 38% of those hallucinations contained explicitly harmful content, such as promoting violence, fabricating inaccurate associations, or insinuating false authority. The problem worsens when transcribing the speech of people with aphasia, a language disorder that makes it difficult to find the right words: their longer pauses leave extended silences that the AI tends to “fill” with hallucinations.
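Whisper’s open-source Python package does expose per-segment metadata that can hint at this failure mode. The minimal sketch below (the file name and the 0.6 threshold are illustrative assumptions, not values from the study) flags segments where the model itself judged speech unlikely yet still emitted text, one plausible symptom of an invented “pause-filler”:

```python
import whisper

model = whisper.load_model("base")
# "interview.wav" is a hypothetical input file.
result = model.transcribe("interview.wav")

for seg in result["segments"]:
    # no_speech_prob is Whisper's own estimate that the segment contains no speech.
    # Non-empty text paired with a high value may indicate text generated over silence.
    if seg["no_speech_prob"] > 0.6 and seg["text"].strip():
        print(f"Review {seg['start']:.1f}-{seg['end']:.1f}s: {seg['text']!r} "
              f"(no_speech_prob={seg['no_speech_prob']:.2f})")
```

Such a check cannot catch every hallucination, but it illustrates the kind of user-facing signal the researchers argue should accompany these tools.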

The researchers classified harmful hallucinations into three main categories: Perpetuation of Violence, Inaccurate Associations, and False Authority. For instance, Whisper added violent narratives or sexual innuendo to harmless scenes, introduced false information such as incorrect names or fictional relationships, and impersonated authoritative figures issuing potentially deceptive directives.

These errors can be catastrophic if they end up in critical documentation such as witness statements, phone call records, or medical records. Consider a plausible real-world scenario: a hiring system that analyzes AI-transcribed video interviews to identify the most suitable candidate. Where the interviewee pauses, Whisper may hallucinate a troubling phrase like “terror knife,” unfairly hurting the candidate’s odds of being hired.
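One mitigation such a pipeline could try, suggested here as an assumption rather than a recommendation from the study, is stripping long silent stretches before transcription, since the study ties many hallucinations to extended pauses the model tries to fill. A rough sketch using the open-source `whisper` and `librosa` packages (file name and the 30 dB silence threshold are hypothetical):

```python
import numpy as np
import librosa
import whisper

# Load the interview audio at Whisper's expected 16 kHz sample rate.
audio = whisper.load_audio("interview.wav")  # hypothetical file name

# Keep only the non-silent regions, so long pauses never reach the model.
intervals = librosa.effects.split(audio, top_db=30)
voiced = np.concatenate([audio[start:end] for start, end in intervals])

model = whisper.load_model("base")
# condition_on_previous_text=False limits how far a single error can propagate
# across segments; it does not eliminate hallucinations.
result = model.transcribe(voiced, condition_on_previous_text=False)
print(result["text"])
```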

The implications of even minor, infrequent hallucinations in transcriptions are severe. While OpenAI has not explained why Whisper behaves this way, it has made improvements that reduce these harmful hallucinations. Nevertheless, the researchers recommend that OpenAI inform users about the risk of hallucinations and investigate why problematic transcriptions are produced. They also urge that the technology better accommodate groups such as people with speech and language impairments, who are currently underserved by these tools.
