In this video, I share my first impressions of GLM-5.1 after getting early access from Z AI. I'll cover how it improves on long-running and agentic tasks, where it regresses in general chat, and why it might be one of the best cheap open models for coding right now.

--
Key Takeaways:

🚀 GLM-5.1 is mostly a post-train update to GLM-5, and it is noticeably better at long-running and agentic tasks.  
🤖 The model is much more coding-focused now, sometimes using code or HTML even when a simple text answer would be better.  
🛠️ Instruction following, debugging, and staying focused on the main objective are all much better than before.  
⚡ GLM-5.1 feels snappier than GLM-5 because it does less unnecessary reasoning on simple tasks.  
📉 General chat performance seems weaker now, especially for math and non-agentic use cases.  
🏆 In my tests, it ranks 5th on the overall leaderboard and 2nd on the agentic leaderboard, which is very impressive.  
💸 For the price, GLM-5.1 feels extremely competitive and could be a serious alternative to models like Codex and Opus for coding workflows.

AICodeKing

GLM-5.1, a post-training update to GLM-5 from Z AI, was tested with early access. In testing, the model shows significant improvements over its predecessor in agentic tasks, instruction following, debugging, and planning. It performs comparably to Claude Opus and outperforms Codex on agentic tasks, ranking 2nd on the agentic leaderboard and 5th on the overall leaderboard. However, it has regressed in general chat and non-agentic use cases such as math, and it often generates unnecessary code or HTML even for simple questions. It works best through agentic frameworks like OpenClaw or Kilo CLI, where it completed complex multi-step coding tasks including a movie tracker app, a terminal calculator in Go, and a Kanban app in Svelte. The model is notably cost-effective for its performance level.

GLM-5.1 (Fully Tested): THE BEST OPEN / AGENTIC MODEL IS HERE! This is CRAZY!