ExecuTorch, PyTorch's native inference platform, now supports on-device voice workloads across CPU, GPU, and NPU on Linux, macOS, Windows, Android, and iOS. Reference implementations are provided for five voice models, including Voxtral Realtime (streaming transcription, ~4B params) and Parakeet TDT (offline transcription, 0.6B params).

From pytorch.org
Table of contents
TL;DR
Voice on the Edge Today
Design Principles
Voice Models in Practice
Sample Applications
Adoption Case Study in production: LM Studio
Get Involved
Acknowledgement
