Z.ai has released GLM-5.1, an open-source coding model designed for long-running agentic software engineering tasks. Where many models degrade after a few turns, Z.ai claims GLM-5.1 sustains performance over hundreds of iterations: in one example, it kept improving a vector database optimization task across 600+ iterations and 6,000 tool calls, reaching 6x the throughput of a single short session. The model scores 58.4 on SWE-Bench Pro, reportedly outperforming GPT-5.4, Opus 4.6, and Gemini 3.1 Pro on that benchmark. It is released under the MIT License with weights available for local deployment, and it targets enterprises in regulated sectors where data governance and cost control matter. Analysts note the shift from prompt-based tools to agents that can be assigned multi-hour tasks, while cautioning that public benchmarks still don't fully reflect messy real-world codebases.

4m read time. From infoworld.com