Gemini 3.1 Flash-Lite is our fastest and most cost-efficient Gemini 3 series model yet.

DM provides a diverse range of content spanning technology, business, and culture, offering articles, interviews, and analysis for readers interested in staying updated with the latest trends and developments across various industries. Readers can learn about emerging technologies, industry insights, and  perspectives from experts in different fields.

DeepMind

Google has launched Gemini 3.1 Flash-Lite, its fastest and most cost-efficient model in the Gemini 3 series, targeting high-volume developer workloads. Priced at $0.25/1M input tokens and $1.50/1M output tokens, it delivers 2.5x faster Time to First Answer Token and 45% higher output speed compared to Gemini 2.5 Flash. The model achieves an Elo score of 1432 on Arena.ai and scores 86.9% on GPQA Diamond and 76.8% on MMMU Pro benchmarks. It includes configurable thinking levels in Google AI Studio and Vertex AI, making it suitable for tasks ranging from high-volume translation and content moderation to complex reasoning tasks like UI generation. It is available in preview via the Gemini API.

Gemini 3.1 Flash-Lite: Built for intelligence at scale

<p>Gemini 3.1 Flash-Lite clearly targets the sweet spot between speed, cost, and scalability.</p>