Windsurf has released its Arena Mode leaderboard results, and the rankings are surprising because speed counts as much as quality. Unlike traditional web-based arenas, this in-IDE evaluation lets users vote for a faster model when it finishes first with good-enough results. Key findings: Gemini 3 Flash and Grok Code Fast beat Gemini 3 Pro, Claude Haiku 4.5 beats GPT 5.2, and Windsurf's own SWE 1.5 outperforms Claude Haiku. The methodology penalizes inefficient thinking tokens and reflects real-world IDE usage patterns across 40,000 votes; no model achieved a win rate above 80%.
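To make the "win rate" figure concrete, here is a minimal sketch of how a per-model win rate could be computed from pairwise votes. It is an illustration under stated assumptions, not Windsurf's actual scoring code: the `(winner, loser)` vote format, the `win_rates` function, and the model names are all hypothetical, and the sample votes are made up.

```python
from collections import defaultdict

def win_rates(votes):
    """Return each model's share of head-to-head votes it won.

    `votes` is an iterable of (winner, loser) name pairs. This format is
    an assumption for illustration; the real vote schema is not public here.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for winner, loser in votes:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    return {model: wins[model] / games[model] for model in games}

# Made-up votes, for illustration only.
sample_votes = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_a"),
    ("model_c", "model_b"),
]
rates = win_rates(sample_votes)
```

Under this simple definition, a fast model that finishes first with an acceptable answer collects the vote just as a slower, more thorough one would, which is one way speed can end up weighing as heavily as quality in the rankings.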