Best of Speech Recognition - June 2024

  1.
    Article
    Hacker News · 2y

    niedev/RTranslator: RTranslator is the world's first open source real-time translation app.

    RTranslator is a nearly open-source, offline real-time translation app for Android. It translates conversations through Bluetooth headsets and phones, and keeps them private by running the AI models directly on the device. The app supports multiple languages and offers several modes, including conversation and walkie-talkie. A device with at least 6 GB of RAM is needed for good performance. The app is free and requires no configuration; it uses Meta's NLLB for translation and OpenAI's Whisper for speech recognition.

  2.
    Article
    GoPenAI · 2y

    Revolutionizing Document Interaction: An AI-Powered PDF Voice-2-Voice Chatbot Using LlamaIndex, Langchain, Azure AI Speech and Google Audio

    Experience a breakthrough in document interaction with an AI-driven PDF voice-2-voice chatbot. Utilizing LlamaIndex, Langchain, Azure AI Speech, and Google Audio, this advanced system allows for seamless verbal interactions with PDF documents, enhancing accessibility and usability. Learn about its evolution from text-based dialogue to voice-enabled functionalities and explore the technical components, including dependencies, document handling, and user interaction features.