Play 3.0 mini is a new lightweight, reliable, and cost-efficient multilingual text-to-speech model that supports more than 30 languages. With a mean latency of 189 milliseconds, it is the company's fastest model yet, and it streams text in and audio out over an HTTP REST API, a WebSocket API, or SDKs. The model also improves audio quality, reliability, and naturalness of speech, offers state-of-the-art voice cloning, and comes with reduced pricing across business tiers.
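To make the latency figure concrete, here is a minimal Python sketch of how a mean latency could be measured for a streaming text-to-speech response. It assumes "latency" means time until the first audio chunk arrives, which the summary does not state. `fake_tts_stream` is a hypothetical stand-in that simulates a delay; it is not the Play API, and all function names here are illustrative.

```python
import time
from statistics import mean
from typing import Callable, Iterable, Iterator


def time_to_first_chunk(
    stream: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return the seconds elapsed until the stream yields its first chunk."""
    start = clock()
    for _chunk in stream:
        # Stop at the first chunk: only the initial wait is being measured.
        return clock() - start
    raise ValueError("stream produced no audio")


def fake_tts_stream(delay_s: float, chunks: int = 3) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS response (not a real API)."""
    time.sleep(delay_s)  # simulated wait before the first audio bytes arrive
    for _ in range(chunks):
        yield b"\x00" * 320  # placeholder audio bytes


def mean_latency_ms(delays_s: list[float]) -> float:
    """Average time-to-first-chunk, in milliseconds, over simulated requests."""
    samples = [time_to_first_chunk(fake_tts_stream(d)) for d in delays_s]
    return mean(samples) * 1000.0
```

In a real measurement, the simulated generator would be replaced by an iterator over the chunks of an actual streaming response, and the mean would be taken over many requests.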

5m read time. From play.ht
Table of contents
Play 3.0 mini is our fastest, most conversational speech model yet
Play 3.0 mini supports 30+ languages across any voice
Play 3.0 mini is more accurate
Play 3.0 mini reads alphanumeric sequences more naturally
Play 3.0 mini achieves the best voice similarity for voice cloning
Websockets API Support
Play 3.0 mini is a cost efficient model
