OpenAI launches gpt-realtime, a new speech-to-speech AI model designed for enterprise applications with more natural, expressive voices and improved instruction-following capabilities. The model operates through the newly available Realtime API, featuring enhanced function calling, image recognition, and SIP support for contact center use cases. Despite competition from ElevenLabs, Soundhound, and others in the crowded voice AI market, OpenAI positions its solution with better benchmarking scores and reduced pricing at $32 per million audio input tokens.
Sort: