Google has released the stable version of Gemini 2.5 Flash-Lite, the fastest and most cost-efficient model in the 2.5 family, priced at $0.10 per 1M input tokens and $0.40 per 1M output tokens. The model offers lower latency than previous versions, a 1 million-token context window, controllable thinking budgets, and native tool support, including Grounding with Google Search and Code Execution. Early adopters report concrete gains: Satlyt achieved a 45% reduction in latency and a 30% decrease in power consumption, while HeyGen uses the model to translate videos into more than 180 languages. Compared to 2.0 Flash-Lite, the model shows improved quality across coding, math, science, reasoning, and multimodal understanding benchmarks.
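The published per-token rates translate directly into request costs. A minimal sketch of the arithmetic (the function name is illustrative, not part of any Google SDK):

```python
def estimate_flash_lite_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD at the announced Gemini 2.5 Flash-Lite rates."""
    INPUT_PRICE_PER_M = 0.10   # $ per 1M input tokens (from the announcement)
    OUTPUT_PRICE_PER_M = 0.40  # $ per 1M output tokens
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a request consuming 200K input tokens and producing 10K output tokens
print(f"${estimate_flash_lite_cost(200_000, 10_000):.4f}")  # → $0.0240
```

Note that output tokens cost 4x input tokens, so output-heavy workloads (such as HeyGen's video translation) dominate the bill even at modest output lengths.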