OpenVoice is an instant voice cloning AI library developed by researchers at MIT, MyShell.ai, and Tsinghua University. Given only a short audio sample from a reference speaker, it can replicate that speaker's voice and generate speech in multiple languages. It also gives flexible control over style elements such as emotion, accent, rhythm, pauses, and intonation, which helps produce speech that fits its context and supports dynamic conversations. In addition, OpenVoice offers zero-shot cross-lingual voice cloning: it can clone the tone color of the reference speaker even when the language of the reference speaker or of the generated speech is absent from the training dataset. OpenVoice is also efficient, costing tens of times less than commercially available APIs.
OpenVoice's technical approach is to decouple the components of a voice as much as possible, so that language, tone color, and other voice features are generated independently. Tone color cloning is handled by a tone color converter that is structurally similar to flow-based TTS methods but has different functionality and training objectives. The base speaker TTS model is trained on audio samples from English, Chinese, and Japanese speakers and can vary accent, language, and emotion.
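To make the decoupling idea concrete, here is a toy Python sketch of the two-stage flow described above. It is not the OpenVoice API: every name in it (Speech, base_speaker_tts, extract_tone_color, tone_color_converter, clone_voice) is hypothetical, and plain strings stand in for audio and embeddings. It only illustrates that style and language are set in one step while tone color is swapped in a separate step.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Speech:
    """Toy stand-in for generated audio; all fields are illustrative labels."""
    text: str
    language: str
    emotion: str
    accent: str
    tone_color: str  # the speaker's timbre identity


def base_speaker_tts(text, language="English", emotion="neutral", accent="default"):
    # Stage 1 (hypothetical): the base speaker model controls language and
    # style, and always speaks with the base speaker's own tone color.
    return Speech(text, language, emotion, accent, tone_color="base_speaker")


def extract_tone_color(reference_clip_id):
    # Hypothetical placeholder for computing a tone color representation
    # from a short reference clip.
    return f"tone_color_of::{reference_clip_id}"


def tone_color_converter(speech, target_tone_color):
    # Stage 2 (hypothetical): replace only the tone color and leave the
    # language and style fields untouched.
    return replace(speech, tone_color=target_tone_color)


def clone_voice(text, reference_clip_id, **style):
    base = base_speaker_tts(text, **style)
    target = extract_tone_color(reference_clip_id)
    return tone_color_converter(base, target)


if __name__ == "__main__":
    print(clone_voice("Hello there", "alice.wav", emotion="cheerful"))
```

Because the converter touches only the tone_color field, the style chosen in the first stage survives the conversion unchanged, which is the property the decoupled design is meant to provide.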
OpenVoice offers granular control over voice styles, including emotion, accent, rhythm, pauses, and intonation, while accurately cloning the tone color of the reference speaker. Its central design principle is to separate the cloning of tone color from the control of other voice styles and language, and this separation is what makes the system versatile.
Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.