A recently released Google AI model scores worse on certain safety tests than its predecessor, according to the company's internal benchmarking.

TechCrunch

Google's recently released Gemini 2.5 Flash AI model performs worse on safety benchmarks than its predecessor, Gemini 2.0 Flash. The newer model is more likely to generate content that violates Google's safety guidelines, in part because it follows instructions more faithfully, including instructions that are themselves problematic. Google attributes some of the regression to false positives, but acknowledges that the model does produce violating content when explicitly prompted to. The company's safety reporting has also been criticized as insufficiently transparent, with calls for more detailed disclosure of how its models are tested.

One of Google’s recent Gemini AI models scores worse on safety