Speed of Sound is a new open-source Linux desktop app that uses OpenAI's Whisper model (specifically the tiny variant) to enable voice-to-text typing in any focused text field. It works across GNOME and KDE on both X11 and Wayland via the XDG Desktop Portal, supports multiple languages, and processes audio locally with no data leaving the device. Users can trigger recording with a keyboard shortcut, and the app allows custom vocabulary and optional LLM-based text polishing. Additional Whisper models can be downloaded in-app for better accuracy. It's available on Flathub, Snap Store, and GitHub as AppImage, Deb, and RPM packages.

4m read time, from omgubuntu.co.uk
