OpenAI suggests that releasing the Voice Engine could be fraught with risks.

OpenAI has announced that it has conducted small-scale testing of its new Voice Engine technology, a product capable of cloning a human voice from a single 15-second audio sample. The resulting voice can be used to convert text inputs into natural-sounding speech with an emotive, realistic character. However, despite the promising applications of the technology, OpenAI has expressed safety concerns that may impede its full release.

The tests carried out by select partners showcased potential use cases for the Voice Engine technology. Learning company ‘Age of Learning’ utilized the technology for reading assistance, voiceover content creation and personalized verbal responses. Translation company ‘HeyGen’ used the Voice Engine to translate videos whilst maintaining the native accent of the speaker. ‘Dimagi’, a company that trains health workers in remote locations, used the technology to provide training and feedback in underrepresented languages. Furthermore, the technology was also tested to assist non-verbal individuals in using alternative communication devices, helping them to choose voices that best represent them, as well as helping those with speech impairments due to cancer or neurological conditions regain their voice.

Though not the first of its kind, OpenAI’s Voice Engine represents state-of-the-art technology and possibly outperforms its competitors, such as ElevenLabs. The software’s capacity to generate lifelike voices with authentic inflection and emotive qualities is highlighted as a key feature.

However, OpenAI has stressed that there are significant safety considerations associated with large-scale deployment of this technology. The ability to convincingly replicate someone’s voice poses risks, especially during an election year, as evidenced by instances of fake calls and manipulated videos. For the test phase, OpenAI ensured clear restrictions were in place, demanding explicit and informed consent from the original speaker and disallowing participants from creating products to clone voices. Other security measures implemented included audio watermarking and proactive monitoring of the technology’s usage.
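OpenAI has not disclosed how its watermarking works, but the general idea behind audio watermarking can be illustrated with a minimal spread-spectrum sketch: a keyed, inaudibly quiet pseudorandom pattern is mixed into the samples, and a detector holding the same key checks for it by correlation. All names and parameters here are illustrative assumptions, not OpenAI’s implementation.

```python
import math
import random


def watermark(samples, key, strength=0.01):
    """Embed a keyed pseudorandom +/-1 pattern at low amplitude.

    This is an illustrative sketch, not OpenAI's actual scheme:
    the pattern is far quieter than the audio, so it is hard to
    hear, but a detector that knows `key` can find it.
    """
    rng = random.Random(key)
    pattern = [rng.choice((-1.0, 1.0)) for _ in samples]
    return [s + strength * p for s, p in zip(samples, pattern)]


def detect(samples, key):
    """Correlate the audio against the keyed pattern.

    For watermarked audio the score is close to `strength`;
    for unrelated audio it hovers near zero.
    """
    rng = random.Random(key)
    pattern = [rng.choice((-1.0, 1.0)) for _ in samples]
    return sum(s * p for s, p in zip(samples, pattern)) / len(samples)


# One second of a 440 Hz tone at 8 kHz as stand-in "speech".
audio = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
marked = watermark(audio, key=42)
```

Real-world schemes are far more robust (surviving compression, resampling and clipping), but the principle is the same: only a party holding the key can reliably check whether a clip was machine-generated.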

Notwithstanding OpenAI’s safety precautions, there are concerns within the AI industry about the potential misuse of this technology. For this reason, it is currently unlikely that OpenAI will release Voice Engine to the wider public. OpenAI has advocated that institutions such as banks phase out voice authentication as a security measure, and it considers that more safeguards are needed to identify AI-generated audio and video content.

The announcement by OpenAI has thus spurred discussions around the ethical implications and potential misuse of voice cloning technology. Even if OpenAI refrains from releasing Voice Engine to the public, other companies will inevitably bring similar tools to market. Consequently, the industry must grapple with new challenges around trust and authenticity in the age of AI.
