Best of Speech Recognition: March 2025

  1.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 1y

    Building a Real-time Voice RAG Agent

    Real-time voice interactions are becoming increasingly popular. This post is a step-by-step guide to building a real-time Voice RAG Agent. It uses AssemblyAI for speech-to-text transcription, LlamaIndex for answering questions from documents, and Cartesia for speech synthesis. A video and open-source code accompany the post.
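The three stages named in this summary can be sketched as a single conversational turn. This is a minimal illustration, not the post's code: `voice_rag_turn` and the `fake_*` stand-ins are hypothetical names, and no real AssemblyAI, LlamaIndex, or Cartesia calls are made. Each stage is passed in as a plain callable so that a real client could be substituted.

```python
from typing import Callable


def voice_rag_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],   # speech-to-text stage (hypothetical interface)
    answer: Callable[[str], str],         # document-grounded answering stage
    synthesize: Callable[[str], bytes],   # text-to-speech stage
) -> bytes:
    """Run one turn: spoken question in, spoken answer out."""
    question = transcribe(audio)
    reply = answer(question)
    return synthesize(reply)


# Stand-in stages for illustration only; they treat text bytes as "audio".
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_answer(question: str) -> str:
    return f"Answer to: {question}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")
```

Keeping the stages as injected callables means each provider can be swapped or mocked without touching the turn logic.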

  2.
    Article
    Deepgram · 1y

    Introducing Nova-3 Medical: The Future of AI-Powered Medical Transcription

    Nova-3 Medical is Deepgram's latest speech-to-text model for medical transcription. According to Deepgram, it accurately captures clinical details such as medication names and diagnostic terms while filtering out irrelevant noise. It is HIPAA-compliant and supports customization through Keyterm Prompting. Deepgram reports that the model outperforms competing models on Word Error Rate (WER) and Keyterm Error Rate (KER), integrates with existing healthcare infrastructure, and is optimized for performance and scale.
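For context on the WER metric mentioned above, the standard definition is the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. The sketch below assumes lowercase, whitespace-split tokens; `word_error_rate` is an illustrative name, and this is not Deepgram's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "the patient takes met forming daily" against the reference "the patient takes metformin daily" counts one substitution and one insertion over five reference words, giving a WER of 0.4.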