Google AI Edge Portal now supports benchmarking and debugging on-device LLMs across a physical lab of over 120 Android devices. Developers can measure key metrics like initialization time, prefill speed, decode speed, and peak memory usage for LiteRT-LM format models on CPU and GPU backends. A newly integrated Model Explorer tool enables graph visualization, side-by-side model comparison, and per-layer analysis to help identify and fix conversion, quantization, and optimization issues. The feature is currently in private preview for allowlisted Google Cloud customers at no charge.

4m read time, from cloud.google.com
