Google has launched Gemini 3.1 Flash-Lite, the fastest and most cost-efficient model in its Gemini 3 series, aimed at high-volume developer workloads. It is priced at $0.25 per 1M input tokens and $1.50 per 1M output tokens, and it delivers a 2.5x faster Time to First Answer Token and 45% higher output speed than Gemini 2.5 Flash. The model achieves an Elo score of 1432 on Arena.ai and scores 86.9% on GPQA Diamond and 76.8% on MMMU Pro. Configurable thinking levels in Google AI Studio and Vertex AI let developers use it for work ranging from high-volume translation and content moderation to more complex reasoning such as UI generation. It is available in preview via the Gemini API.
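To make the pricing concrete, here is a minimal sketch of a cost estimate based only on the two per-token rates quoted above. The function name `estimate_cost_usd` and the example token counts are hypothetical, and the sketch assumes flat pricing with no tiers, caching discounts, or other charges, which the summary does not address.

```python
# Rates quoted in the summary above (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate spend in USD, assuming flat per-token pricing (an assumption)."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10M input tokens and 2M output tokens.
    cost = estimate_cost_usd(10_000_000, 2_000_000)
    print(f"${cost:.2f}")  # prints "$5.50"
```

At these rates output tokens cost six times as much as input tokens, so for output-heavy workloads the output volume is the larger share of the bill.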

2m read time, from blog.google