LiteRT is Google's cross-platform framework for on-device AI that abstracts NPU acceleration across mobile, desktop, and IoT hardware. It provides a unified API covering CPU, GPU, and NPU targets, eliminating the need for vendor-specific SDK integrations. Real-world deployments include Google Meet's Ultra-HD segmentation, Epic Games' real-time MetaHuman facial animation at 30 FPS, and Argmax's on-device speech transcription, which achieves a speedup of more than 2x over GPU. The Google AI Edge Gallery app now supports NPU benchmarking for select Gemma models. LiteRT also extends to industrial edge hardware (Qualcomm Dragonwing IQ8, Arduino VENTUNO Q) and to AI PCs via Intel OpenVINO integration, and the AI Edge Portal offers benchmark data across 100+ devices to guide deployment decisions.
Table of contents
Translating NPU performance into meaningful experiences
Scaling performance across the hardware spectrum
Get started with your NPU journey
Acknowledgements