Kimi K2 Thinking: Best Agentic Reasoning LLM is here, beats GPT-5, Sonnet 4.5

Moonshot AI has released Kimi K2 Thinking, an open-source LLM that uses test-time scaling to sustain extended reasoning chains with up to 300 tool calls per session. Rather than scaling parameter count, as traditional models do, K2 scales the number of reasoning steps, staying coherent across long chains while interleaving web search, code execution, and documentation reading. Through this agentic reasoning it scores 44.9% on Humanity's Last Exam and 71.3% on SWE-Bench Verified. It uses INT4 quantization-aware training for efficiency and offers a Heavy Mode that runs eight reasoning trajectories in parallel. K2 represents a shift from word prediction to sustained, tool-augmented cognition.
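
To make the two mechanisms in the summary concrete, here is a minimal Python sketch of a tool-calling loop with a fixed call budget, plus a "heavy" wrapper that runs several trajectories in parallel. It is an illustration under stated assumptions, not Moonshot AI's implementation or API: `run_trajectory`, `heavy_mode`, `make_toy_policy`, and the `search` tool are hypothetical names, the model is replaced by a toy policy function, and combining trajectories by majority vote is an assumption (the summary says only that eight trajectories run in parallel, not how their results are merged).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

MAX_TOOL_CALLS = 300   # per-session tool-call budget quoted in the summary
NUM_TRAJECTORIES = 8   # Heavy Mode width quoted in the summary

# A step is either ("tool", tool_name, argument) or ("answer", text, "").
Step = Tuple[str, str, str]
Policy = Callable[[List[str]], Step]   # stand-in for the model (hypothetical)
Tools = Dict[str, Callable[[str], str]]


def run_trajectory(policy: Policy, tools: Tools, question: str,
                   max_tool_calls: int = MAX_TOOL_CALLS) -> str:
    """Alternate reasoning steps and tool calls until an answer or the budget is hit."""
    history = [question]
    calls = 0
    while True:
        kind, name, arg = policy(history)
        if kind == "answer":
            return name
        if calls >= max_tool_calls:
            return "budget exhausted"
        history.append(tools[name](arg))   # feed the tool result back into context
        calls += 1


def heavy_mode(policies: List[Policy], tools: Tools, question: str) -> str:
    """Run one trajectory per policy in parallel; majority vote is an assumed merge rule."""
    with ThreadPoolExecutor(max_workers=len(policies)) as pool:
        answers = list(pool.map(lambda p: run_trajectory(p, tools, question), policies))
    return Counter(answers).most_common(1)[0][0]


def make_toy_policy(calls_before_answer: int, answer: str) -> Policy:
    """Toy policy: call the 'search' tool a fixed number of times, then answer."""
    def policy(history: List[str]) -> Step:
        # history holds the question plus one entry per tool result so far
        if len(history) - 1 < calls_before_answer:
            return ("tool", "search", f"query {len(history)}")
        return ("answer", answer, "")
    return policy


if __name__ == "__main__":
    tools: Tools = {"search": lambda q: f"result for {q}"}
    policies = [make_toy_policy(i, "41" if i % 4 == 0 else "42")
                for i in range(NUM_TRAJECTORIES)]
    print(heavy_mode(policies, tools, "What is 6 * 7?"))
```

In this toy setup six of the eight trajectories answer "42", so the vote returns that value; a real system would replace the policy with model calls and could use a different aggregation step.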

3m read time, from medium.com