Android Bench has a leaderboard update: check it out!

Android Bench, Google's benchmark for evaluating LLMs on real-world Android development tasks, has updated its leaderboard with new models. The latest results feature a tie at the top, with two models achieving the same high score within comparable confidence intervals. The benchmark evaluates LLMs against real codebases with real dependencies, using unit and instrumentation tests for deterministic scoring. Results are available at developer.android.com/bench, and the GitHub repo is open for reproducing results and community feedback.

1m watch time
