Vosk is an offline, open-source speech recognition toolkit that supports 20+ languages with small models of about 50 MB each. It provides continuous large-vocabulary transcription through a zero-latency streaming API, offers bindings for multiple programming languages (Python, Java, Node.js, C#, and others), and scales from small devices such as the Raspberry Pi to large clusters. Typical applications include chatbots, virtual assistants, subtitle generation, and lecture transcription.
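As a rough illustration of the streaming API from Python, the sketch below feeds audio to a recognizer in chunks and collects partial and final results. It assumes the `vosk` package is installed, that a model has been downloaded and unpacked locally, and that the input is a 16-bit mono PCM WAV file. The helper names `transcribe_stream`, `read_chunks`, and `main`, and the chunk size of 4000 frames, are illustrative choices and not part of Vosk itself.

```python
import json
import sys
import wave


def transcribe_stream(recognizer, chunks):
    """Feed raw PCM chunks to a Vosk-style recognizer.

    Yields ("partial", text) while an utterance is still in progress and
    ("final", text) when the recognizer reports the end of an utterance.
    """
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            # An utterance has ended; Result() returns JSON with a "text" field.
            yield "final", json.loads(recognizer.Result()).get("text", "")
        else:
            # Still mid-utterance; PartialResult() returns JSON with "partial".
            yield "partial", json.loads(recognizer.PartialResult()).get("partial", "")
    # Flush whatever audio is still buffered once the stream is exhausted.
    yield "final", json.loads(recognizer.FinalResult()).get("text", "")


def read_chunks(path, frames=4000):
    """Yield successive blocks of raw frames from a WAV file."""
    with wave.open(path, "rb") as wf:
        while True:
            data = wf.readframes(frames)
            if not data:
                break
            yield data


def main(model_dir, wav_path):
    # Imported here so the helpers above can be used without Vosk installed.
    from vosk import Model, KaldiRecognizer

    with wave.open(wav_path, "rb") as wf:
        sample_rate = wf.getframerate()
    recognizer = KaldiRecognizer(Model(model_dir), sample_rate)
    for kind, text in transcribe_stream(recognizer, read_chunks(wav_path)):
        print(kind, text)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (paths are placeholders): python transcribe.py MODEL_DIR AUDIO.wav
    main(sys.argv[1], sys.argv[2])
```

Because results are yielded as audio arrives, the same loop could in principle be driven by a microphone or network stream instead of a file, as long as the chunks are raw PCM at the sample rate given to the recognizer.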

Source: github.com