Recently, an AI-generated robocall mimicking Joe Biden urged New Hampshire residents not to vote. Meanwhile, "spear-phishing" campaigns – phishing attacks that target specific people or groups – are using audio deepfakes to extract money. Far less attention, however, has been paid to how audio deepfakes could positively impact society. Postdoctoral fellow Nauman Dawalatabad explores that question in a Q&A on the topic.

One area where audio deepfakes could help is in obscuring the identity of people speaking, because speech reveals sensitive information like age, gender, health status, and even potential future health conditions. Dawalatabad cites research indicating that certain health conditions such as dementia can be detected from speech patterns, which highlights the need for ways to keep such data private. He points out that the task of anonymizing speakers isn’t just a technical challenge, but a moral obligation to preserve individual privacy.
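To make the anonymization idea concrete, here is a minimal sketch of one naive technique: shifting a speaker's pitch by resampling. This is purely illustrative and is not Dawalatabad's method – real anonymization systems use far more sophisticated voice-conversion models – but it shows the basic idea of transforming audio to obscure speaker-identifying traits. The `pitch_shift` function and the synthetic sine-wave "voice" are illustrative assumptions, not part of the original Q&A.

```python
import math

def pitch_shift(samples, factor):
    """Crudely shift pitch by resampling with linear interpolation.

    factor > 1 raises pitch, factor < 1 lowers it. This is a toy:
    it also changes duration, and a real anonymizer would preserve
    timing and prosody while altering speaker identity.
    """
    n = int(len(samples) / factor)
    out = []
    for i in range(n):
        pos = i * factor           # fractional position in the source
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        # Linear interpolation between neighboring samples.
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

# A 100 Hz sine tone standing in for a voice, at an 8 kHz sample rate.
sr = 8000
tone = [math.sin(2 * math.pi * 100 * t / sr) for t in range(sr)]
shifted = pitch_shift(tone, 1.5)   # raise pitch by 50%
```

Even this crude transform hints at the privacy trade-off Dawalatabad describes: the more you distort the signal to hide identity, the more you risk degrading the health-relevant speech patterns researchers want to study.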

Dawalatabad also touches on how to combat audio deepfake-related spear-phishing attacks, which can spread misinformation, enable identity theft, infringe on privacy, and maliciously alter content. While anyone can create such fake audio, countermeasures are being developed. Two prominent methods for detecting fake audio are artifact detection and liveness detection: the former identifies anomalies in the audio, while the latter focuses on natural elements of speech that AI models struggle to emulate. Companies like Pindrop are developing such solutions.
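As a toy illustration of the artifact-detection idea – flagging audio whose statistics look anomalous – here is a sketch that measures frame-to-frame energy variation and flags implausibly uniform audio. The heuristic, threshold, and function names are illustrative assumptions; production detectors (such as Pindrop's) use learned models over far richer features.

```python
import math

def frame_energies(samples, frame_len=256):
    """RMS energy of each non-overlapping frame."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    return [math.sqrt(sum(x * x for x in f) / len(f)) for f in frames]

def looks_synthetic(samples, threshold=0.01):
    """Toy artifact check: flag audio whose energy contour is
    implausibly flat. Natural speech rises and falls; a perfectly
    steady signal is one (crude) statistical anomaly."""
    e = frame_energies(samples)
    mean = sum(e) / len(e)
    var = sum((x - mean) ** 2 for x in e) / len(e)
    return var < threshold

sr = 8000
# A perfectly steady 220 Hz tone: unnaturally uniform energy.
steady = [math.sin(2 * math.pi * 220 * t / sr) for t in range(sr)]
# The same tone with a slow 2 Hz amplitude envelope, mimicking the
# energy variation of natural speech.
modulated = [(0.5 + 0.5 * math.sin(2 * math.pi * 2 * t / sr)) * s
             for t, s in enumerate(steady)]
```

Real artifact detectors work on the same principle at a much higher level of sophistication, learning subtle generation artifacts (spectral, phase, or prosodic) that a simple energy statistic cannot capture.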

Dawalatabad acknowledges the potential for misuse of audio deepfakes but insists their positive impacts shouldn't be overlooked, suggesting they could benefit fields like entertainment, healthcare, and education. He mentions an ongoing project of his to anonymize voices in health-related interviews, which would allow crucial medical data to be shared while maintaining privacy. The technology could also restore the voices of people with speech impairments, improving their communication and quality of life.

He is optimistic about the potential of AI audio models, especially in the realm of psychoacoustics – the study of how humans perceive sound. He also believes innovations in augmented and virtual reality will enhance the audio experience, while advocating for further development of the technology to combat potential misuse. Despite the risks, Dawalatabad sees a positive trajectory for research in AI audio models, given their potential to transform many sectors.
