NVIDIA Developer

NVIDIA introduces three new Riva TTS models that advance multilingual speech synthesis and voice cloning. Magpie TTS Multilingual supports four languages with a streaming encoder-decoder architecture, Magpie TTS Zeroshot clones a voice from a 5-second sample, and Magpie TTS Flow targets studio applications and needs only a 3-second voice sample. All three models use preference alignment and classifier-free guidance to improve text adherence and reduce audio artifacts. Compared with open-source alternatives, they achieve lower character error rates and better naturalness while requiring less training data.
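Character error rate (CER) is the metric behind the text-adherence claim above. For TTS it is commonly measured by transcribing the synthesized audio with a speech recognizer and comparing that transcript to the input text; the summary does not say how NVIDIA computed it, so the following is only a minimal sketch of the standard definition (character-level edit distance divided by reference length). The function names `edit_distance` and `character_error_rate` are illustrative and are not part of Riva.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete r from the reference
                curr[j - 1] + 1,         # insert h into the reference
                prev[j - 1] + (r != h),  # substitute (free when characters match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical input text and ASR transcript of the synthesized audio.
    text = "hello world"
    transcript = "hallo world"
    print(f"CER: {character_error_rate(text, transcript):.3f}")
```

Lower is better: a CER of 0 means the transcript of the synthesized speech matches the input text exactly. Real evaluations usually normalize case and punctuation first, which this sketch leaves out.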

Enhancing Multilingual Human-Like Speech and Voice Cloning with NVIDIA Riva TTS

Get started with NVIDIA Riva Magpie TTS models