Expo Blog

Sanas, a speech AI company, built a real-time multilingual video calling app in 3 months using Expo and React Native. The app supports 25+ languages with sub-2-second translation latency while preserving each speaker's natural voice. The team used Expo SDK 54, EAS Build, and EAS Update, along with modules such as expo-audio and expo-camera and custom native modules for LLM streaming inference and on-device voice cloning. A single persistent WebSocket multiplexes transcription, translation, and TTS synthesis to minimize latency. The architecture splits computation between the device and an edge server running a fine-tuned multilingual LLM, with WebRTC handling real-time bidirectional communication.

How Sanas built a real-time video translation app in 3 months using Expo

The challenge of real-time, low-latency voice translation

Sanas application architecture at a glance

Example: Streaming translation in real time
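
The summary above describes a single persistent WebSocket that multiplexes transcription, translation, and TTS synthesis. The sketch below illustrates one way that kind of multiplexing could look in TypeScript; it is not Sanas's implementation. The `Frame` envelope, the channel names, the `Transport` interface, and the `Multiplexer` class are all hypothetical, and payloads are assumed to be strings (for example text, or base64-encoded audio).

```typescript
// Hypothetical sketch: every type and name below is an assumption for illustration,
// not the actual Sanas protocol.

// Logical streams that share one connection.
type Channel = "transcription" | "translation" | "tts";

// Envelope wrapped around every message so the receiver can route it.
interface Frame {
  channel: Channel;
  utteranceId: string; // ties partial results to one spoken utterance
  seq: number;         // ordering of chunks within an utterance
  payload: string;     // assumed: text, or base64-encoded audio for TTS
}

type Handler = (frame: Frame) => void;

// Minimal transport surface. A thin adapter around a WebSocket could
// implement it; a plain object can stand in for it in tests.
interface Transport {
  send(data: string): void;
  onmessage: ((event: { data: string }) => void) | null;
}

class Multiplexer {
  private transport: Transport;
  private handlers = new Map<Channel, Handler[]>();

  constructor(transport: Transport) {
    this.transport = transport;
    // All inbound traffic arrives on one socket and is routed by channel.
    this.transport.onmessage = (event) => this.dispatch(event.data);
  }

  // Register a listener for one logical channel.
  on(channel: Channel, handler: Handler): void {
    const list = this.handlers.get(channel) ?? [];
    list.push(handler);
    this.handlers.set(channel, list);
  }

  // Serialize a frame and send it over the shared connection.
  send(frame: Frame): void {
    this.transport.send(JSON.stringify(frame));
  }

  // Parse an inbound message and hand it to the matching listeners.
  private dispatch(raw: string): void {
    let frame: Frame;
    try {
      frame = JSON.parse(raw) as Frame;
    } catch {
      return; // ignore malformed messages in this sketch
    }
    for (const handler of this.handlers.get(frame.channel) ?? []) {
      handler(frame);
    }
  }
}
```

Tagging each message with a channel and an utterance ID lets partial transcripts, translated text, and synthesized audio interleave on one connection without a separate handshake per stream, which is the latency argument the summary makes for a single socket. A production version would likely also need reconnection, backpressure, and binary frames for audio, none of which are shown here.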