Our mission is to build and democratize artificial general intelligence through open science.

Hacker News is a community-driven platform for sharing and discussing technology news, startups, and programming-related topics. Through user submissions and comments, Hacker News offers insights into emerging technology trends, industry developments, and entrepreneurial ventures. Readers can participate in discussions, share their insights, and stay informed about the latest advancements in technology and innovation.


Pocket TTS is a 100M-parameter text-to-speech model with voice cloning that runs in real time on CPUs. Larger LLM-based TTS models require GPUs, and smaller specialized models are limited to fixed voices; Pocket TTS bridges that gap by modeling continuous audio latents instead of discrete tokens. It achieves the lowest Word Error Rate on the reported benchmarks while maintaining high audio quality and speaker similarity. Key innovations include a neural audio codec based on continuous latents, a Masked Autoregressive framework with Lagrangian Self-Distillation, and techniques such as Head Batch Multiplier, Gaussian Temperature Sampling, and Latent Classifier-Free Guidance. Trained on 88k hours of public English datasets, it is open-sourced under the MIT license and needs only 5 seconds of audio for voice cloning.
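For readers unfamiliar with the Word Error Rate metric mentioned above: it is conventionally the word-level edit distance (substitutions, insertions, and deletions) between a reference transcript and the transcript recognized from the synthesized audio, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the evaluation code used for Pocket TTS. The function name `word_error_rate` is hypothetical, and the lowercasing and whitespace tokenization are assumptions; the text normalization used in the actual benchmarks is not stated here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: both transcripts are lowercased and split on whitespace.
    Real evaluations usually apply heavier normalization (punctuation,
    numerals), which is not reproduced here.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One inserted word against a two-word reference.
    print(word_error_rate("hello world", "hello there world"))
```

A lower value is better: 0.0 means the recognized transcript matches the reference exactly, and values above 1.0 are possible when the hypothesis contains many extra words.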

Pocket TTS: A high quality TTS that gives your CPU a voice