Sanas, a Speech AI company, built a real-time multilingual video calling app in 3 months using Expo and React Native. The app supports 25+ languages with sub-2-second translation latency, preserving each speaker's natural voice. The team leveraged Expo SDK 54, EAS Build, EAS Update, and modules like expo-audio, expo-camera, and custom native modules for LLM streaming inference and on-device voice cloning. A single persistent WebSocket multiplexes transcription, translation, and TTS synthesis to minimize latency. The architecture splits computation between the device and an edge server running a fine-tuned multilingual LLM, with WebRTC handling real-time bi-directional communication.
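
The idea of multiplexing transcription, translation, and TTS over one persistent socket can be sketched roughly as follows. This is a minimal illustration, not Sanas's implementation: the JSON envelope, the channel names, and the `SocketLike` and `ChannelMultiplexer` types are all assumptions made for the example, since the source does not describe the wire format. Each frame is tagged with its channel, and a single listener on the shared connection routes incoming frames to per-channel handlers.

```typescript
// Hypothetical channel names: the source names these three stages,
// but not how they are identified on the wire.
type Channel = "transcription" | "translation" | "tts";

// Assumed JSON envelope: every frame carries the channel it belongs to.
interface Frame {
  channel: Channel;
  payload: string;
}

// Minimal socket surface this sketch relies on (hypothetical).
// A real WebSocket would be wrapped in a thin adapter exposing this shape.
interface SocketLike {
  send(data: string): void;
  onmessage: ((event: { data: string }) => void) | null;
}

type Handler = (payload: string) => void;

// Returns the parsed frame, or null if the text is not valid JSON.
function parseFrame(raw: string): Frame | null {
  try {
    return JSON.parse(raw) as Frame;
  } catch {
    return null;
  }
}

class ChannelMultiplexer {
  private socket: SocketLike;
  private handlers = new Map<Channel, Handler[]>();

  constructor(socket: SocketLike) {
    this.socket = socket;
    // One listener on the shared socket dispatches every incoming frame.
    this.socket.onmessage = (event) => this.dispatch(event.data);
  }

  // Register a handler for frames arriving on one logical channel.
  on(channel: Channel, handler: Handler): void {
    const list = this.handlers.get(channel) ?? [];
    list.push(handler);
    this.handlers.set(channel, list);
  }

  // Tag the outgoing payload with its channel and send it on the shared socket.
  send(channel: Channel, payload: string): void {
    const frame: Frame = { channel, payload };
    this.socket.send(JSON.stringify(frame));
  }

  private dispatch(raw: string): void {
    const frame = parseFrame(raw);
    if (frame === null) return; // Ignore malformed frames in this sketch.
    for (const handler of this.handlers.get(frame.channel) ?? []) {
      handler(frame.payload);
    }
  }
}
```

Sharing one connection avoids paying a separate handshake for each stage, which fits the latency goal described above. A production version would likely carry binary audio frames rather than JSON strings and would need reconnection and backpressure handling, none of which is shown here.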

6m read time | From expo.dev
Table of contents
The challenge of real-time, low latency voice translation
Why Sanas chose Expo
Sanas application architecture at a Glance
Example: Streaming translation in real time
The ROI of building with Expo
What’s next for Sanas?
