Vosk is an offline, open-source speech recognition toolkit that supports 20+ languages with compact 50 MB models. It provides continuous large-vocabulary transcription through a zero-latency streaming API, offers bindings for multiple programming languages (Python, Java, Node.js, C#, and others), and scales from small devices such as the Raspberry Pi to large clusters. The toolkit suits applications such as chatbots, virtual assistants, subtitle generation, and lecture transcription.
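As a rough illustration of the streaming API, the sketch below feeds audio to a recognizer chunk by chunk and collects each finalized utterance. It assumes the `vosk` Python package is installed, that a model has been downloaded to a local directory, and that the input is a mono 16-bit PCM WAV file. The helper names `transcribe_chunks` and `transcribe_wav` and the 4000-frame chunk size are illustrative choices, not part of Vosk.

```python
import json
from typing import Iterable, Iterator, List


def transcribe_chunks(recognizer, chunks: Iterable[bytes]) -> Iterator[str]:
    """Feed raw PCM chunks to a recognizer and yield each finalized utterance.

    `recognizer` is expected to behave like vosk.KaldiRecognizer:
    AcceptWaveform(bytes) returns a truthy value at the end of an utterance,
    and Result() / FinalResult() return JSON strings with a "text" field.
    """
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                yield text
    # Flush whatever audio is still buffered once the stream ends.
    text = json.loads(recognizer.FinalResult()).get("text", "")
    if text:
        yield text


def transcribe_wav(wav_path: str, model_path: str) -> List[str]:
    """Transcribe a mono 16-bit PCM WAV file (hypothetical helper)."""
    import wave

    # Imported lazily so transcribe_chunks stays usable without vosk installed.
    from vosk import KaldiRecognizer, Model

    with wave.open(wav_path, "rb") as wf:
        recognizer = KaldiRecognizer(Model(model_path), wf.getframerate())

        def read_chunks() -> Iterator[bytes]:
            while True:
                data = wf.readframes(4000)  # chunk size is an arbitrary choice
                if not data:
                    break
                yield data

        return list(transcribe_chunks(recognizer, read_chunks()))
```

Calling `transcribe_wav("speech.wav", "model")` with placeholder paths would return a list of recognized utterances. Because `transcribe_chunks` accepts any iterable of byte chunks, the same loop could also be fed from a microphone or network stream.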