Android Bench has a leaderboard update. Check it out!

Android Bench, a benchmark for how well LLMs handle real-world Android development, has refreshed its leaderboard with new models. It runs LLMs against real codebases with real dependencies and scores them deterministically using unit and instrumentation tests. The latest update shows two models tied at the top. Results are available at developer.android.com/bench, and the GitHub repo is open for reproducing results and for community feedback.

