Dia is a 1.6B-parameter text-to-speech model developed by Nari Labs that generates ultra-realistic dialogue in a single pass. It supports audio conditioning for emotion and tone control and can produce nonverbal cues such as laughter or coughing directly from the transcript. Pretrained model checkpoints and inference code are available on Hugging Face, and a demo page compares it with ElevenLabs Studio and Sesame CSM-1B. The model currently runs on GPUs, with CPU support planned, and ships with detailed installation instructions as well as ethical and legal guidelines for its use.
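As a rough sketch of how inference looks, the snippet below follows the API shown in the project's README (`Dia.from_pretrained`, `generate`, `save_audio`); the exact install method, speaker-tag format, and output filename here are assumptions, and running it requires the `dia` package from the Nari Labs repository plus a CUDA GPU.

```python
# Dialogue is written with speaker tags ([S1], [S2]); nonverbal cues
# such as (laughs) can be embedded directly in the text.
text = "[S1] Dia generates full dialogue in one pass. [S2] Really? (laughs)"


def synthesize(out_path: str = "dialogue.mp3") -> str:
    """Generate audio for `text` and write it to `out_path`.

    Assumes the `dia` package is installed from the Nari Labs repo
    and a CUDA GPU is available (CPU support is listed as future work).
    """
    from dia.model import Dia  # deferred import: heavy, GPU-backed dependency

    model = Dia.from_pretrained("nari-labs/Dia-1.6B")
    audio = model.generate(text)        # synthesize the whole dialogue at once
    model.save_audio(out_path, audio)   # write the waveform to disk
    return out_path
```

Keeping the transcript in the tagged `[S1]`/`[S2]` format is what lets the model switch voices within a single generation rather than stitching per-speaker clips together.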
Table of contents
- ⚡️ Quickstart
- ⚙️ Usage
- 💻 Hardware and Inference Speed
- 🪪 License
- ⚠️ Disclaimer
- 🔭 TODO / Future Work
- 🤝 Contributing
- 🤗 Acknowledgements