Cohere has released Transcribe, a 2B-parameter open-source automatic speech recognition (ASR) model built on a Conformer encoder-decoder architecture. It currently ranks #1 on the Hugging Face Open ASR Leaderboard with an average word error rate (WER) of 5.42%, ahead of Whisper Large v3, ElevenLabs Scribe v2, and others. The model supports 14 languages, is licensed under Apache 2.0, and can be downloaded from Hugging Face or accessed through Cohere's API and its Model Vault managed inference platform. It is designed for enterprise use cases such as meeting transcription, speech analytics, and real-time customer support agents.
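
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not the leaderboard's scoring code: the function name `word_error_rate` is hypothetical, and it assumes plain whitespace tokenization with no text normalization, which real evaluations usually apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are split on whitespace and compared exactly;
    no casing or punctuation normalization is applied here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

On this definition, a 5.42% WER corresponds to roughly one word-level error for every 18 reference words, averaged over the evaluation data.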

5m read time. From cohere.com