An open-weights Chinese model just beat Claude, GPT-5.5, and Gemini in a programming challenge

The AI Coding Contest Day 12 pitted 10 major language models against each other in a Word Gem Puzzle — a sliding-tile letter game on grids up to 30×30. Kimi K2.6, an open-weights model from Chinese startup Moonshot AI, won outright with 22 match points and a 7-1-0 record. Xiaomi's MiMo V2-Pro came second, while GPT-5.5, GLM 5.1, and Claude Opus 4.7 placed third through fifth. The contest revealed that models capable of active tile-sliding outperformed static word-scanners on larger boards. Notably, Muse scored −15,309 by claiming every short word despite heavy scoring penalties. The author argues this result reflects a narrowing capability gap between open-weights Chinese models and Western frontier labs, with Kimi K2.6 scoring 54 on the Artificial Analysis Intelligence Index versus GPT-5.5's 60 and Claude's 57.

#llm

#ai-coding

May 03 • 7m read time • From thinkpol.ca

Table of contents

The challenge

What I saw

What surprised me

The bigger picture
